cephfs-top: addition of sort feature and limit option #48111

Merged
merged 2 commits into from Nov 4, 2022

neesingh-rh
Contributor

@neesingh-rh neesingh-rh commented Sep 15, 2022

This PR intends to add two new features to cephfs-top:

  • sort-by field value
  • limit the number of clients to be displayed

Fixes: https://tracker.ceph.com/issues/55121
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
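
A minimal sketch of how the two features combine, assuming per-client metrics are available as a plain dictionary. The function name `top_clients`, the sample client IDs and values, and the descending sort order are illustrative assumptions, not the code in this PR.

```python
from typing import Dict, List, Optional

# Hypothetical per-client metrics keyed by column abbreviation.
ClientMetrics = Dict[str, float]


def top_clients(clients: Dict[str, ClientMetrics],
                sort_field: Optional[str] = None,
                limit: Optional[int] = None) -> List[str]:
    """Return client IDs, optionally sorted by one field and truncated."""
    ids = list(clients)
    if sort_field is not None:
        # Assumed order: largest value first; clients lacking the field go last.
        ids.sort(key=lambda cid: clients[cid].get(sort_field, float("-inf")),
                 reverse=True)
    if limit is not None:
        ids = ids[:limit]
    return ids


# Made-up sample data for illustration only.
clients = {
    "client.4305": {"rsp": 12.5, "wsp": 3.0},
    "client.4310": {"rsp": 80.0, "wsp": 0.5},
    "client.4322": {"rsp": 41.2, "wsp": 9.9},
}
```

With this sample data, `top_clients(clients, "rsp", 2)` keeps only the two clients with the highest `rsp` value, and calling it with no sort field and no limit leaves the list untouched.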

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

Contributor

@anthonyeleven anthonyeleven left a comment

Docs lgtm

@neesingh-rh neesingh-rh requested a review from a team September 16, 2022 05:50
Contributor

@vshankar vshankar left a comment

@neesingh-rh I've been playing with this for the last couple of days. Works really well 👍

Really nice work!

@vshankar
Contributor

@joscollin please review this change.

@joscollin
Member

@vshankar Looks good at the first look. Will review more tomorrow.
Just FYI, rbd-top has a smart way of sort-by.

@vshankar
Contributor

> @vshankar Looks good at the first look. Will review more tomorrow. Just FYI, rbd-top has a smart way of sort-by.

Yeah, I know that - the sort limit field can be specified when registering the query with the OSD. That's supported by the MDS too :)

@vshankar
Contributor

> @vshankar Looks good at the first look. Will review more tomorrow. Just FYI, rbd-top has a smart way of sort-by.

@joscollin Is this good to go?

@vshankar
Contributor

jenkins test make check

@joscollin
Member

> > @vshankar Looks good at the first look. Will review more tomorrow. Just FYI, rbd-top has a smart way of sort-by.
>
> @joscollin Is this good to go?

Will review it again the next working day.

doc/cephfs/cephfs-top.rst (resolved)
Member

@joscollin joscollin left a comment

Some limit menu issues:

  1. Launch the limit menu, resize the window multiple times, the menu is gone.
  2. Launch the limit menu, input a number and then resize the window, the menu is gone.
  3. Launch the limit menu, press any random keys, the menu is gone.
  4. Why is the entered number not displayed on the screen?
  5. Launch the limit menu, pressing 'q' doesn't immediately go back to the previous screen. It needs multiple q presses it seems.

Is there a need for the 'Fields Management' menu screen, as it is just a redirect to another selection screen? Could we do the same from the home page?
Like:
Press 'm' to select a filesystem, 's' for sort, 'l' to limit the clients and 'q' to quit

@vshankar @gregsfortytwo
Sort is fine, but do we really need a limit as it could be scrolled when #48090 is merged? Looks like once the limit is set, it cannot be reset. We need to exit cephfs-top and launch again.

@joscollin
Member

I think it would be better to have something like a 'Reset to default' key on the home screen itself, so that the user can reset to the default home screen view anytime, without going into the Sort screen and selecting 'default'. This new key would reset the current sort & limit settings.

"rsp= READ_IO_SPEED", "wtio= WRITE_IO_SIZES", "waio= WRITE_AVG_IO_SIZES",
"wsp= WRITE_IO_SPEED", "rlatavg= AVG_READ_LATENCY", "rlatsd= STDEV_READ_LATENCY",
"wlatavg= AVG_WRITE_LATENCY", "wlatsd= STDEV_WRITE_LATENCY", "mlatavg= AVG_METADATA_LATENCY",
"mlatsd= STDEV_METADATA_LATENCY", "Default"]
Member

Instead of maintaining field_menu, could it be created from MAIN_WINDOW_TOP_LINE_METRICS, as we did in refresh_top_line_and_build_coord ? Then it would be easier to maintain the code.

Contributor Author

I first tried using that for easier maintenance, but it wasn't successful. Currently we use the field names stored in field_menu to match the chosen field, and the display structure of the field menu follows the way it is done in top(1). If I use MAIN_WINDOW_TOP_LINE_METRICS, first, I would have to call its respective functions (e.g. self.items() and self.avg_items()) again and again. Second, MAIN_WINDOW_TOP_LINE_METRICS has some fields that are not displayed directly as fields in the cephfs-top display (their avg and std are used instead), but for us the order is very important. That's why it looked better and easier to do it this way.
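
To illustrate the point about matching the chosen field against an ordered, hand-maintained menu, here is a rough sketch. The entry strings are copied from the snippet under review (the real list is longer), while `split_entry` and `selected_field` are hypothetical helper names that do not appear in the PR.

```python
from typing import List, Optional, Tuple

# A few entries in the "abbr= DESCRIPTION" form seen in the reviewed snippet.
FIELD_MENU: List[str] = [
    "rsp= READ_IO_SPEED", "wsp= WRITE_IO_SPEED",
    "rlatavg= AVG_READ_LATENCY", "Default",
]


def split_entry(entry: str) -> Tuple[str, Optional[str]]:
    """Split 'abbr= DESCRIPTION' into its parts; 'Default' has no description."""
    abbr, sep, desc = entry.partition("=")
    return abbr.strip(), (desc.strip() if sep else None)


def selected_field(index: int) -> Optional[str]:
    """Map a menu position to a column abbreviation, or None for 'Default'."""
    abbr, desc = split_entry(FIELD_MENU[index])
    return abbr if desc is not None else None
```

Because the menu is an ordinary list, its display order is exactly the order written in the source, which is the property the comment above says matters.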

@vshankar
Contributor

> Some limit menu issues:
>
> 1. Launch the limit menu, resize the window multiple times, the menu is gone.
> 2. Launch the limit menu, input a number and then resize the window, the menu is gone.
> 3. Launch the limit menu, press any random keys, the menu is gone.
> 4. Why is the entered number not displayed on the screen?
> 5. Launch the limit menu, pressing 'q' doesn't immediately go back to the previous screen. It needs multiple q presses it seems.
>
> Is there a need for the 'Fields Management' menu screen, as it is just a redirect to another selection screen? Could we do the same from the home page? Like: Press 'm' to select a filesystem, 's' for sort, 'l' to limit the clients and 'q' to quit
>
> @vshankar @gregsfortytwo Sort is fine, but do we really need a limit as it could be scrolled when #48090 is merged? Looks like once the limit is set, it cannot be reset. We need to exit cephfs-top and launch again.

I still find the limit functionality useful - think of using it with sort and limiting the display to the top two clients that are consuming bandwidth. As for resetting the limit, I think that's a bug and should be fixed.

@vshankar
Contributor

@neesingh-rh

cephfs-top:318:101: E501 line too long (110 > 100 characters)
cephfs-top:321:101: E501 line too long (103 > 100 characters)
cephfs-top:322:101: E501 line too long (115 > 100 characters)
cephfs-top:401:101: E501 line too long (117 > 100 characters)
cephfs-top:402:21: E128 continuation line under-indented for visual indent
cephfs-top:772:101: E501 line too long (160 > 100 characters)
cephfs-top:776:21: E117 over-indented
cephfs-top:777:101: E501 line too long (145 > 100 characters)
cephfs-top:904:101: E501 line too long (109 > 100 characters)
ERROR: InvocationError for command /home/jenkins-build/build/workspace/ceph-pull-requests/src/tools/cephfs/top/.tox/py3/bin/flake8 --ignore=W503 --max-line-length=100 cephfs-top (exited with code 1)

@neesingh-rh neesingh-rh force-pushed the feature_55121 branch 2 times, most recently from 7188102 to 56a554b Compare October 19, 2022 15:21
@neesingh-rh
Contributor Author

neesingh-rh commented Oct 19, 2022

> Some limit menu issues:
>
> 1. Launch the limit menu, resize the window multiple times, the menu is gone.
> 2. Launch the limit menu, input a number and then resize the window, the menu is gone.
> 3. Launch the limit menu, press any random keys, the menu is gone.

The above issues are resolved; RESIZE was also being accepted as input earlier.

> 4. Why is the entered number not displayed on the screen?

Added this functionality.

> 5. Launch the limit menu, pressing 'q' doesn't immediately go back to the previous screen. It needs multiple q presses it seems.

Fixed. Earlier it worked only by pressing Enter immediately after pressing 'q'.

> Is there a need for the 'Fields Management' menu screen, as it is just a redirect to another selection screen? Could we do the same from the home page? Like: Press 'm' to select a filesystem, 's' for sort, 'l' to limit the clients and 'q' to quit

I don't want to remove Fields Management. It keeps things separated and easy to handle, even when new functionality is added later, and it follows the top(1) structure too.

> @vshankar @gregsfortytwo Sort is fine, but do we really need a limit as it could be scrolled when #48090 is merged? Looks like once the limit is set, it cannot be reset. We need to exit cephfs-top and launch again.

It wasn't a bug earlier: we could set the limit value as many times as we wanted, and the "d" key was there to reset it. Now I have also added BACKSPACE, to change the limit value while typing, and I have currently set the maximum number of digits allowed to four.
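
The input handling described in this comment could look roughly like the sketch below. It is written as a pure function over key codes so it can be read without a terminal; in a curses program the codes would come from something like `stdscr.getch()`. The name `read_limit`, the choice to ignore unknown keys, and the exact set of Enter and Backspace codes are assumptions, not the PR's implementation.

```python
import curses
from typing import Iterable, Optional

MAX_DIGITS = 4  # the comment above caps the input at four digits


def read_limit(keys: Iterable[int]) -> Optional[int]:
    """Consume key codes until Enter or 'q'; return the entered limit or None."""
    digits = ""
    for key in keys:
        if key == curses.KEY_RESIZE:
            continue  # a window resize is not user input; keep menu and buffer
        if key == ord("q"):
            return None  # cancel and go back without changing the limit
        if key in (curses.KEY_BACKSPACE, 127, 8):
            digits = digits[:-1]  # drop the last typed digit, if any
        elif key in (curses.KEY_ENTER, 10, 13):
            if digits:
                return int(digits)  # Enter on an empty field is ignored
        elif ord("0") <= key <= ord("9") and len(digits) < MAX_DIGITS:
            digits += chr(key)  # a real menu would also echo this on screen
        # any other key is ignored, so stray key presses do not close the menu
    return None
```

Treating `KEY_RESIZE` as a non-event is what keeps the menu alive across window resizes, which is the root cause named for issues 1 to 3.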

@neesingh-rh
Contributor Author

> I think it would be better to have something like a 'Reset to default' key on the home screen itself, so that the user can reset to the default home screen view anytime, without going into the Sort screen and selecting 'default'. This new key would reset the current sort & limit settings.

I think it's better to have the default "d" key on the FIELDS MANAGEMENT screen, so that it resets only the FIELDS MANAGEMENT values. It's good to keep things modularized. I have added a "d" key on the FIELDS MANAGEMENT screen; it resets both the sort and the limit value to the default.

@neesingh-rh
Contributor Author

> > Some limit menu issues:
> >
> > 1. Launch the limit menu, resize the window multiple times, the menu is gone.
> > 2. Launch the limit menu, input a number and then resize the window, the menu is gone.
> > 3. Launch the limit menu, press any random keys, the menu is gone.
> > 4. Why is the entered number not displayed on the screen?
> > 5. Launch the limit menu, pressing 'q' doesn't immediately go back to the previous screen. It needs multiple q presses it seems.
> >
> > Is there a need for the 'Fields Management' menu screen, as it is just a redirect to another selection screen? Could we do the same from the home page? Like: Press 'm' to select a filesystem, 's' for sort, 'l' to limit the clients and 'q' to quit
> > @vshankar @gregsfortytwo Sort is fine, but do we really need a limit as it could be scrolled when #48090 is merged? Looks like once the limit is set, it cannot be reset. We need to exit cephfs-top and launch again.
>
> I still find the limit functionality useful - think of using it with sort and limiting the display to the top two clients that are consuming bandwidth. As for resetting the limit, I think that's a bug and should be fixed.

+1. Scrolling lets the user view all the clients easily, but if the user wants to analyze metrics for just the top 50 or 60 clients among thousands, they will have that option.

@neesingh-rh
Contributor Author

> @neesingh-rh
>
> cephfs-top:318:101: E501 line too long (110 > 100 characters)
> cephfs-top:321:101: E501 line too long (103 > 100 characters)
> cephfs-top:322:101: E501 line too long (115 > 100 characters)
> cephfs-top:401:101: E501 line too long (117 > 100 characters)
> cephfs-top:402:21: E128 continuation line under-indented for visual indent
> cephfs-top:772:101: E501 line too long (160 > 100 characters)
> cephfs-top:776:21: E117 over-indented
> cephfs-top:777:101: E501 line too long (145 > 100 characters)
> cephfs-top:904:101: E501 line too long (109 > 100 characters)
> ERROR: InvocationError for command /home/jenkins-build/build/workspace/ceph-pull-requests/src/tools/cephfs/top/.tox/py3/bin/flake8 --ignore=W503 --max-line-length=100 cephfs-top (exited with code 1)

Fixed the flake8 errors. Thanks.
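
For readers unfamiliar with these codes, the sketch below shows one common way to wrap long statements so that E501 and E128 do not fire under `--max-line-length=100`. The function and list names are hypothetical; this is not the code that was changed in the PR.

```python
from typing import List


# Hanging indent: nothing follows the opening bracket, and the continuation
# lines are indented further than the body, so E128 does not apply.
def format_row(
        client_id: str, read_speed: float, write_speed: float,
        avg_read_latency: float) -> str:
    return "{:<14}{:>10.2f}{:>10.2f}{:>10.2f}".format(
        client_id, read_speed, write_speed, avg_read_latency)


# Long literals can be split across lines the same way to stay under the limit.
FIELDS: List[str] = [
    "rsp= READ_IO_SPEED", "wsp= WRITE_IO_SPEED",
    "rlatavg= AVG_READ_LATENCY", "rlatsd= STDEV_READ_LATENCY",
]
```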

@vshankar
Contributor

LGTM. @joscollin Need your approval for merge.

Member

@joscollin joscollin left a comment

Looks really nice now. I have a few comments:

  1. Do we really need a press "d" for default option inside the l menu? As the default d is already there on the home screen, the user comes inside only for a custom value.
  2. I think we certainly don't need to filter to 0 clients. Please avoid that.
  3. Doc screenshot update?
  4. In the l menu, q quits only when the field is blank. I think we should enable q always, in case the user changes their mind.
  5. On the home screen, change to Press 'd' for default setting.
  6. The HELP is getting bigger every time. How about shortening it a little bit, like: COMMANDS: m - select a filesystem | s - sort menu | l - limit number of clients | d - default setting | q - quit ?

Will check the code tomorrow.

@neesingh-rh
Contributor Author

neesingh-rh commented Oct 31, 2022

> Looks really nice now. I have a few comments:
>
> 1. Do we really need a `press "d" for default` option inside the `l` menu? As the default d is already there on the home screen, the user comes inside only for a custom value.

Let's keep it the existing way. If the user wants the default value only for the limit and not for the sort value, they can do so explicitly.

> 2. I think we certainly don't need to filter to `0` clients. Please avoid that.

Done.

> 3. Doc screenshot update?

Added.

> 4. In the `l` menu, `q` quits only when the field is blank. I think we should enable `q` always, in case the user changes their mind.

Enabled 'q' while giving input too.

> 5. On the home screen, change to `Press 'd' for default setting`.

Updated.

> 6. The `HELP` is getting bigger every time. How about shortening it a little bit, like: `COMMANDS: m - select a filesystem | s - sort menu | l - limit number of clients | d - default setting | q - quit` ?

Done.

> Will check the code tomorrow.
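
Two of the items above lend themselves to a short sketch: building the shortened help line from a single table of key bindings, and rejecting a limit of 0. The binding text is taken from the review comment; `help_line` and `validate_limit` are hypothetical names, and this is not necessarily how the PR implements either change.

```python
from typing import Dict, Optional

# Key bindings as worded in the review comment.
COMMANDS: Dict[str, str] = {
    "m": "select a filesystem",
    "s": "sort menu",
    "l": "limit number of clients",
    "d": "default setting",
    "q": "quit",
}


def help_line(commands: Dict[str, str]) -> str:
    """Render the one-line help text from the bindings table."""
    return "COMMANDS: " + " | ".join(f"{k} - {v}" for k, v in commands.items())


def validate_limit(text: str) -> Optional[int]:
    """Return the limit as an int, or None for empty, non-numeric, or zero input."""
    if not (text.isascii() and text.isdigit()):
        return None
    value = int(text)
    return value if value > 0 else None
```

Keeping the bindings in one dictionary means a newly added key shows up in the help line without editing a separate string.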

@joscollin
Member

> > Looks really nice now. I have a few comments:
> >
> > 1. Do we really need a `press "d" for default` option inside the `l` menu? As the default d is already there on the home screen, the user comes inside only for a custom value.
>
> Let's keep it the existing way. If the user wants the default value only for the limit and not for the sort value, they can do so explicitly.

Then mention that feature in the doc under d : Default.

@neesingh-rh neesingh-rh force-pushed the feature_55121 branch 3 times, most recently from f6b7c6a to dce6c07 Compare November 2, 2022 07:15
src/tools/cephfs/top/cephfs-top (resolved)
src/tools/cephfs/top/cephfs-top (outdated, resolved)
@joscollin
Member

@neesingh-rh Fix the tox failures too.

This commit intends to add:
- sort-by field value feature to cephfs-top.
- feature to limit number of clients displayed

Fixes: https://tracker.ceph.com/issues/55121
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
Fixes: https://tracker.ceph.com/issues/55121
Signed-off-by: Neeraj Pratap Singh <neesingh@redhat.com>
Member

@joscollin joscollin left a comment

QA doesn't seem necessary (it's just a selftest). Let's go for merge.

@joscollin
Member

@vshankar

vstart_runner test succeeded:

2022-11-04 07:04:17,323.323 INFO:tasks.cephfs.fuse_mount:Found client admin socket at /home/jcollin/workspace/ceph/build/asok/client.0.1664051.asok
2022-11-04 07:04:17,323.323 DEBUG:__main__:"sudo" was omitted from the following cmd args before execution and logging using function overriding; check vstart_runner.py for more details.
2022-11-04 07:04:17,323.323 DEBUG:__main__:> sudo ./bin/ceph --admin-daemon /home/jcollin/workspace/ceph/build/asok/client.0.1664051.asok status
2022-11-04 07:04:17,425.425 DEBUG:__main__:> stat --file-system '--printf=%T
' -- /tmp/tmpj430z5eb/mnt.0
2022-11-04 07:04:17,427.427 INFO:tasks.cephfs.fuse_mount:ceph-fuse is mounted on /tmp/tmpj430z5eb/mnt.0
2022-11-04 07:04:17,427.427 DEBUG:__main__:> sudo chmod 1777 /tmp/tmpj430z5eb/mnt.0
2022-11-04 07:04:17,436.436 DEBUG:__main__:> ./bin/ceph mgr module enable stats
2022-11-04 07:04:18,679.679 DEBUG:__main__:> /home/jcollin/workspace/ceph/src/tools/cephfs/top/cephfs-top --cluster=hpec --id=admin --selftest
cluster hpec does not exist
2022-11-04 07:04:18,755.755 DEBUG:__main__:> ./bin/ceph mgr module disable stats
2022-11-04 07:04:19,798.798 DEBUG:__main__:> set -ex
dd if=/proc/self/mounts of=/dev/stdout
2022-11-04 07:04:19,799.799 DEBUG:__main__:Discovered MDS IDs: dict_keys(['a', 'b', 'c'])
2022-11-04 07:04:19,799.799 DEBUG:__main__:> ./bin/ceph osd blocklist ls
2022-11-04 07:04:20,039.039 INFO:tasks.cephfs.fuse_mount:Running fusermount -u on local...
2022-11-04 07:04:20,039.039 DEBUG:__main__:> sudo fusermount -u /tmp/tmpj430z5eb/mnt.0
2022-11-04 07:04:20,048.048 INFO:tasks.cephfs.mount:Cleaning up mount local
2022-11-04 07:04:20,048.048 DEBUG:__main__:> rmdir -- /tmp/tmpj430z5eb/mnt.0
2022-11-04 07:04:20,049.049 DEBUG:__main__:> ./bin/ceph fs dump --format=json
2022-11-04 07:04:20,307.307 DEBUG:__main__:Discovered MDS IDs: dict_keys(['a', 'b', 'c'])
2022-11-04 07:04:20,307.307 DEBUG:__main__:> ./bin/ceph fs dump --format=json
2022-11-04 07:04:20,579.579 DEBUG:__main__:> ./bin/ceph osd dump --format=json
2022-11-04 07:04:20,851.851 INFO:tasks.cephfs.filesystem:Destroying file system cephfs and related pools
2022-11-04 07:04:20,851.851 DEBUG:__main__:> ./bin/ceph fs dump --format=json
2022-11-04 07:04:21,115.115 DEBUG:__main__:> ./bin/ceph fs dump --format=json
2022-11-04 07:04:21,372.372 DEBUG:__main__:> ./bin/ceph osd dump --format=json
2022-11-04 07:04:21,606.606 DEBUG:__main__:> ./bin/ceph fs fail cephfs
2022-11-04 07:04:22,984.984 DEBUG:__main__:> ./bin/ceph fs rm cephfs --yes-i-really-mean-it
2022-11-04 07:04:24,008.008 DEBUG:__main__:> ./bin/ceph osd pool rm cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it
2022-11-04 07:04:25,144.144 DEBUG:__main__:> ./bin/ceph osd pool rm cephfs_data cephfs_data --yes-i-really-really-mean-it
2022-11-04 07:04:26,283.283 DEBUG:__main__:> ./bin/ceph config rm mon mon_allow_pool_delete
2022-11-04 07:04:26,552.552 DEBUG:__main__:> ./bin/ceph log 'Ended test tasks.cephfs.test_fstop.TestFSTop.test_fstop_non_existent_cluster'
2022-11-04 07:04:27,229.229 INFO:__main__:Stopped test: test_fstop_non_existent_cluster (tasks.cephfs.test_fstop.TestFSTop) in 28.025532s
2022-11-04 07:04:27,229.229 INFO:__main__:test_fstop (tasks.cephfs.test_fstop.TestFSTop) ... ok
2022-11-04 07:04:27,229.229 INFO:__main__:test_fstop_non_existent_cluster (tasks.cephfs.test_fstop.TestFSTop) ... ok
2022-11-04 07:04:27,229.229 INFO:__main__:
2022-11-04 07:04:27,229.229 INFO:__main__:----------------------------------------------------------------------
2022-11-04 07:04:27,229.229 INFO:__main__:Ran 2 tests in 71.518s
2022-11-04 07:04:27,230.230 INFO:__main__:
2022-11-04 07:04:27,230.230 INFO:__main__:
2022-11-04 07:04:27,230.230 DEBUG:__main__:> ip netns list
2022-11-04 07:04:27,231.231 DEBUG:__main__:> sudo ip link delete ceph-brx
Cannot find device "ceph-brx"
2022-11-04 07:04:27,243.243 INFO:__main__:OK
2022-11-04 07:04:27,243.243 INFO:__main__:

Labels
cephfs Ceph File System documentation
4 participants