Allow 'zpool events' filtering by pool name #6762
Codecov Report
@@ Coverage Diff @@
## master #6762 +/- ##
==========================================
- Coverage 75.25% 75.07% -0.19%
==========================================
Files 297 297
Lines 94374 94392 +18
==========================================
- Hits 71021 70862 -159
- Misses 23353 23530 +177
Thanks for the PR @loli10K
One more thing: the zpool manpage indicates that multiple pool names can be provided, but your implementation supports only a single pool, so either the manpage or the implementation should be updated to make them consistent.
	log_must zpool clear $TESTPOOL
done

# 4. Verify 'zpool events -f' successfully recorded these new events
Event processing may not occur immediately; you might need a sleep (or something equivalent) above this. See the previous comment about zpool clear and synchronous behavior.
done
# Again, this is racy: we may be counting some events less than expected, so
# we wait a bit to allow the kernel module to process every new event.
sleep 3
There may not be a race here like you think: zpool clear may be performed synchronously. You should confirm whether that is the case.
If zpool clear is in fact asynchronous, can we do something smarter here? Perhaps a while loop that compares the number of events seen so far against the expected count, bounded by some number of retries. You could even abstract it out into a function (named something like zpool_events_settle) and reuse it in other test cases.
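A rough sketch of what such a helper might look like (the name zpool_events_settle follows the suggestion above; the retry budget and polling details are assumptions, not the PR's actual code):

```shell
# Poll the event count until it matches what we expect, bounded by a
# retry budget, instead of a fixed sleep.
function zpool_events_settle
{
	typeset -i expected=$1
	typeset -i retries=${2:-10}
	typeset -i i=0 count=0

	while (( i < retries )); do
		count=$(zpool events -H | wc -l)
		(( count == expected )) && return 0
		sleep 1
		(( i += 1 ))
	done
	return 1	# never settled on the expected count
}
```

A caller would then do something like `zpool_events_settle $EVENTS_NUM || log_fail "events did not settle"` in place of the bare sleep.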
I actually already implemented a new function wait_events in zpool_events.kshlib (code not pushed yet), but zpool_events_settle is objectively a better name.
There seems to be a race condition here, though zpool_events_poolname.ksh is the script that is now triggering it: http://build.zfsonlinux.org/builders/Ubuntu%2014.04%20i686%20%28TEST%29/builds/4383/steps/shell_8/logs/log
Test: /usr/share/zfs/zfs-tests/tests/functional/cli_root/zpool_events/zpool_events_poolname (run as root) [00:04] [FAIL]
19:05:02.26 ASSERTION: 'zpool events poolname' should only display events from poolname.
19:05:02.27 SUCCESS: truncate -s 268435456 /mnt/newpool.dat
19:05:02.39 SUCCESS: zpool create newpool /mnt/newpool.dat
19:05:02.40 cleared 28 events
19:05:02.41 SUCCESS: zpool events -c
19:05:02.45 SUCCESS: zpool clear testpool
19:05:02.48 SUCCESS: zpool clear testpool
19:05:02.52 SUCCESS: zpool clear testpool
19:05:02.55 SUCCESS: zpool clear testpool
19:05:02.58 SUCCESS: zpool clear testpool
19:05:02.61 SUCCESS: zpool clear testpool
19:05:02.64 SUCCESS: zpool clear testpool
19:05:02.68 SUCCESS: zpool clear testpool
19:05:02.76 SUCCESS: zpool clear testpool
19:05:02.79 SUCCESS: zpool clear testpool
19:05:02.85 SUCCESS: zpool clear testpool
19:05:02.88 SUCCESS: zpool clear testpool
19:05:02.90 SUCCESS: zpool clear testpool
19:05:02.94 SUCCESS: zpool clear testpool
19:05:03.06 SUCCESS: zpool clear testpool
19:05:03.08 SUCCESS: zpool clear newpool
19:05:03.11 SUCCESS: zpool clear newpool
19:05:03.14 SUCCESS: zpool clear newpool
19:05:03.17 SUCCESS: zpool clear newpool
19:05:03.19 SUCCESS: zpool clear newpool
19:05:06.23 NOTE: waited 3 seconds
19:05:06.24 Oct 19 2017 19:05:02.401430000 sysevent.fs.zfs.history_event
19:05:06.24 Oct 19 2017 19:05:02.445430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.477430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.509430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.541430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.573430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.605430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.633430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.669430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.753430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.781430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.837430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.869430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.897430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.937430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.057430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.077430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.105430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.133430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.161430000 sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.185430000 sysevent.fs.zfs.config_sync
19:05:06.27 Unexpected events (newpool): 6 != 5
19:05:06.27 NOTE: Performing test-fail callback (/usr/share/zfs/zfs-tests/callbacks/zfs_dbgmsg.ksh)
Though zpool clear is indeed synchronous, the fact that we can see a "sysevent.fs.zfs.history_event" (from the zpool create newpool /mnt/newpool.dat) after zpool events -c seems to confirm that event delivery is asynchronous.
Ok, thanks for looking into that. This is going to pose a problem for your count-based tests, isn't it? You could get around that for zpool_events_poolname.ksh by simply not caring about the number of events; just ensure that the zpool events pool output is from pool. Maybe something like the following will work in your case?
zpool events -v newpool | grep "pool =" | grep -v "pool = \"newpool\""
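To illustrate that pipeline on made-up sample output (the event lines below are invented for the sketch): any line that survives both greps names a pool other than newpool, so a non-empty result indicates leakage from another pool.

```shell
# Sample of what 'zpool events -v newpool' verbose output might contain;
# the 'pool = "..."' lines are the ones the filter inspects.
events='	pool = "newpool"
	pool = "newpool"
	pool = "testpool"'

# Keep only pool-name lines, then drop the ones naming newpool; anything
# left over came from the wrong pool.
leaked=$(printf '%s\n' "$events" | grep "pool =" | grep -v 'pool = "newpool"')
```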
This race could theoretically affect other tests where we assume 0 events after zpool events -c (zpool_events_follow, zpool_events_clear), though it is much more difficult to trigger there because the stray events must come from previous test cases.
I'm going to try your proposed solution locally and force-push when the test seems reliable.
if [[ "$EVENTS_LOG" != "$EVENTS_NUM" ]]; then
	log_fail "Unexpected number of events: $EVENTS_LOG != $EVENTS_NUM"
else
	log_note "Successfully recorded $EVENTS_LOG events"
You don't need to put the log_note in an else. If you put it above the log_pass, it will still behave as you expect. I believe log_fail calls exit.
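A minimal sketch of that restructuring (log_fail and log_note are stubbed here on the assumption, stated above, that log_fail terminates the script; the real helpers come from the ZTS libraries):

```shell
# Stubs approximating the ZTS helpers; the real log_fail exits the test.
log_fail() { echo "FAIL: $*" >&2; exit 1; }
log_note() { echo "NOTE: $*"; }

EVENTS_LOG=10 EVENTS_NUM=10

# No else branch needed: execution only reaches log_note when the
# counts match, because log_fail never returns.
if [[ "$EVENTS_LOG" != "$EVENTS_NUM" ]]; then
	log_fail "Unexpected number of events: $EVENTS_LOG != $EVENTS_NUM"
fi
log_note "Successfully recorded $EVENTS_LOG events"
```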
if [[ "$EVENTS_BYNAME" != "$EVENTS_TESTPOOL" ]]; then
	log_fail "Unexpected events: $EVENTS_BYNAME != $EVENTS_TESTPOOL"
else
	log_note "Successfully recorded $EVENTS_BYNAME events"
ditto
if [[ "$EVENTS_BYNAME" != "$EVENTS_NEWPOOL" ]]; then
	log_fail "Unexpected events: $EVENTS_BYNAME != $EVENTS_NEWPOOL"
else
	log_note "Successfully recorded $EVENTS_BYNAME events"
ditto
{
	typeset command="$1"

	$command > /dev/null &
log_must eval can be used to ensure that the command is logged.
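For example (a sketch only: log_must is stubbed below; the real helper comes from the ZTS libraries and also asserts the command's exit status):

```shell
# Stub approximating ZTS log_must: record the command, then run it.
log_must() { echo "SUCCESS: $*" >&2; "$@"; }

typeset command="echo hello"

# With 'log_must eval' the command string shows up in the test log
# while still running in the background, as the original code intended.
log_must eval "$command > /dev/null" &
wait $!
rc=$?
```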
function cleanup
{
	poolexists $NEWPOOL && destroy_pool $NEWPOOL
destroy_pool checks if the pool exists. poolexists $NEWPOOL can be removed.
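In other words, the cleanup function can shrink to a single call (sketch; destroy_pool and NEWPOOL are the ZTS names used in the test under review):

```shell
# destroy_pool already no-ops when the pool is absent, so the
# explicit existence check is redundant.
function cleanup
{
	destroy_pool $NEWPOOL
}
```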
	log_must zpool clear $NEWPOOL
done

# 4. Verify 'zpool events poolname' successfully display events
May need a delay or check here to determine if all events were processed. See previous comment about zpool clear and synchronous behavior.
log_assert "'zpool events -c' should successfully clear events."

EVENTS_NUM="$(random 15)"
Can random return 0? Also, I'm not sure we would want a random value for reproducibility reasons. It may be worth defining EVENTS_NUM in zpool_events.cfg.
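A sketch of the suggested alternative (EVENTS_NUM is the name from the test under review; pinning it in zpool_events.cfg is the reviewer's proposal, and the guard is only relevant if a randomized value is kept):

```shell
# Fixed, reproducible event count, as it might appear in
# zpool_events.cfg instead of being drawn from 'random'.
EVENTS_NUM=10

# Defensive lower bound in case a randomized value is kept: a count of
# 0 would make the clear/count assertions vacuous.
(( EVENTS_NUM > 0 )) || EVENTS_NUM=1
```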
if [[ "$CLEAR_OUTPUT" != "cleared $EVENTS_NUM events" ]]; then
	log_fail "Failed to clear $EVENTS_NUM events: $CLEAR_OUTPUT"
else
	log_note "Successfully cleared $EVENTS_NUM events"
Don't need an else clause here.
@dinatale2 you're right,
Supporting a single pool is fine with me, but adding support for multiple pools doesn't look like it would be too much work. Take a look at the

@behlendorf I wonder if

@loli10K good point. That would definitely not work, so then let's limit it to a single pool rather than special case
Force-pushed 03a70b1 to 1e2e90c
then
	log_fail "Failed to sync pool $pool (snapshot $i)"
	return 1
fi
Thanks for looking into this, I've been seeing these occasional failures too.
Since I just reproduced this same failure on a local ubuntu-14.04.4-server-i386 builder, I'm dropping this change from this PR and will continue working on it "offline"; I'll open a new PR if/when I'm able to fix the underlying issue.
# STRATEGY:
# 1. Clear all ZFS events
# 2. Generate some new ZFS events
# 3. Verify 'zpool events -c' successfully clear new events
nit: s/clear/clears
else
	i=$((i+1))
fi
sleep 1
I'd suggest either pulling the sleep 1 into the else block or removing the else case entirely. That way it's clear that both of these things must happen together, which is true now but not obvious at a glance.
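Something like this shape, roughly (a control-flow sketch only, with made-up variable names, not the PR's actual loop):

```shell
i=0
timeout=3
settled=0
while (( i < timeout )); do
	if (( settled )); then
		break
	else
		# Bumping the counter and sleeping belong together: both are
		# part of the "not settled yet, try again" path.
		(( i += 1 ))
		sleep 0		# 'sleep 1' in the real test; 0 keeps this sketch fast
	fi
done
```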
while [[ $i -lt $timeout ]]; do
	count=$(zpool events -H | wc -l)
	if [[ $count -eq $eventnum ]]; then
The -eq here makes it possible you'll overshoot and skip over the expected event count. Should this fail immediately in that case rather than wait for the timeout?
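One way to fail fast on an overshoot (sketch only: log_fail is stubbed, and the for loop over fixed counts simulates repeated polls of 'zpool events -H | wc -l'):

```shell
log_fail() { echo "FAIL: $*" >&2; exit 1; }	# stub of the ZTS helper

eventnum=5
result=""
for count in 3 4 5; do		# simulated event counts across three polls
	if (( count == eventnum )); then
		result="settled"
		break
	elif (( count > eventnum )); then
		# More events than expected can never settle back down to the
		# target, so bail out now instead of waiting for the timeout.
		log_fail "too many events: $count > $eventnum"
	fi
done
```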
Force-pushed 73ae1e3 to c4ff96e
Looks good. @loli10K when you're happy with this too go ahead and remove the "work in progress" tag.
I was reading more carefully the last comment in #3285 before dropping the WIP tag and
Unfortunately this implementation doesn't address it; maybe
This would require minimal changes to the manpage/tests and
EDIT: update pushed.
@loli10K you may need to push again, I don't see where the buildbot received your change.
Additionally add four new tests:
* zpool_events_clear: verify 'zpool events -c' functionality
* zpool_events_cliargs: verify command line options and arguments
* zpool_events_follow: verify 'zpool events -f'
* zpool_events_poolname: verify events filtering by pool name

Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Additionally add four new tests:
* zpool_events_clear: verify 'zpool events -c' functionality
* zpool_events_cliargs: verify command line options and arguments
* zpool_events_follow: verify 'zpool events -f'
* zpool_events_poolname: verify events filtering by pool name

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes openzfs#3285
Closes openzfs#6762
Description

This PR allows displaying zpool events for a specific pool only.

Additionally add four new tests:
* zpool_events_clear: verify 'zpool events -c' functionality
* zpool_events_cliargs: verify command line options and arguments
* zpool_events_follow: verify 'zpool events -f'
* zpool_events_poolname: verify events filtering by pool name
Motivation and Context
Fix #3285
How Has This Been Tested?
Tests added to the ZTS
Types of changes
Checklist:
Signed-off-by.