
Conversation

@loli10K
Contributor

@loli10K loli10K commented Oct 15, 2017

Description

This PR makes it possible to display zpool events for a specific pool only.

Additionally add four new tests:

  • zpool_events_clear: verify 'zpool events -c' functionality
  • zpool_events_cliargs: verify command line options and arguments
  • zpool_events_follow: verify 'zpool events -f'
  • zpool_events_poolname: verify events filtering by pool name

Motivation and Context

Fix #3285

How Has This Been Tested?

Tests added to the ZTS

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

@loli10K loli10K added the Status: Work in Progress Not yet ready for general review label Oct 15, 2017
@codecov

codecov bot commented Oct 16, 2017

Codecov Report

Merging #6762 into master will decrease coverage by 0.18%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master    #6762      +/-   ##
==========================================
- Coverage   75.25%   75.07%   -0.19%     
==========================================
  Files         297      297              
  Lines       94374    94392      +18     
==========================================
- Hits        71021    70862     -159     
- Misses      23353    23530     +177
Flag Coverage Δ
#kernel 74.72% <ø> (-0.06%) ⬇️
#user 67.3% <100%> (-0.26%) ⬇️
Impacted Files Coverage Δ
cmd/zpool/zpool_main.c 76.52% <100%> (+0.42%) ⬆️
module/zfs/vdev_missing.c 60% <0%> (-30%) ⬇️
cmd/zed/agents/zfs_agents.c 74.21% <0%> (-15.63%) ⬇️
cmd/zed/agents/fmd_serd.c 70.29% <0%> (-9.91%) ⬇️
cmd/zed/agents/fmd_api.c 80.42% <0%> (-6.05%) ⬇️
cmd/zed/agents/zfs_diagnosis.c 68.03% <0%> (-5.58%) ⬇️
module/zfs/lzjb.c 96.29% <0%> (-3.71%) ⬇️
cmd/zed/agents/zfs_retire.c 77.01% <0%> (-2.49%) ⬇️
cmd/zed/zed_disk_event.c 81.15% <0%> (-2.18%) ⬇️
module/zfs/mmp.c 95.7% <0%> (-1.85%) ⬇️
... and 41 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 64b8c58...e10e851. Read the comment docs.

Contributor

@dinatale2 dinatale2 left a comment

Thanks for the PR @loli10K

One more thing: the zpool manpage indicates that multiple pool names can be provided, but your implementation supports providing only one pool. Either the manpage or the implementation should be updated so the two are consistent.

log_must zpool clear $TESTPOOL
done

# 4. Verify 'zpool events -f' successfully recorded these new events
Contributor

Event processing may not occur immediately; you might need a sleep (or something equivalent) above this. See the previous comment about zpool clear and synchronous behavior.

done
# Again, this is racy: we may count fewer events than expected, so we
# wait a bit to allow the kernel module to process every new event.
sleep 3
Contributor

There may not be a race here like you think; zpool clear may be performed synchronously. You should confirm whether that is the case.

If zpool clear is in fact asynchronous, can we do something smarter here? Maybe some form of while loop to determine the number of events seen so far compared to what is expected? You can even bound the loop by some number of retries? You'd even be able to abstract it out into a function (can name it something like zpool_events_settle) and use it in other test cases.
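A generic version of the suggested helper might look like the sketch below. The name settle_count and the command-string parameter are illustrative, not part of the PR; in the ZTS the polled command would be something like zpool events -H | wc -l.

```shell
# Hypothetical sketch of the suggested "settle" helper. The polled command
# is passed in as a string so the function stays generic and testable.
settle_count() {
    cmd=$1
    expected=$2
    retries=${3:-10}
    i=0
    while [ "$i" -lt "$retries" ]; do
        count=$(eval "$cmd")
        # Stop as soon as at least the expected number of events is seen.
        [ "$count" -ge "$expected" ] && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}
```

In the test cases it could then be called as, e.g., settle_count "zpool events -H | wc -l" "$EVENTS_NUM" before asserting on the event count.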

Contributor Author

I actually already implemented a new function wait_events in zpool_events.kshlib (code not pushed yet), but zpool_events_settle is objectively a better name.

Contributor Author

There seems to be a race condition here; zpool_events_poolname.ksh is the script that is now triggering it: http://build.zfsonlinux.org/builders/Ubuntu%2014.04%20i686%20%28TEST%29/builds/4383/steps/shell_8/logs/log

Test: /usr/share/zfs/zfs-tests/tests/functional/cli_root/zpool_events/zpool_events_poolname (run as root) [00:04] [FAIL]
19:05:02.26 ASSERTION: 'zpool events poolname' should only display events from poolname.
19:05:02.27 SUCCESS: truncate -s 268435456 /mnt/newpool.dat
19:05:02.39 SUCCESS: zpool create newpool /mnt/newpool.dat
19:05:02.40 cleared 28 events
19:05:02.41 SUCCESS: zpool events -c
19:05:02.45 SUCCESS: zpool clear testpool
19:05:02.48 SUCCESS: zpool clear testpool
19:05:02.52 SUCCESS: zpool clear testpool
19:05:02.55 SUCCESS: zpool clear testpool
19:05:02.58 SUCCESS: zpool clear testpool
19:05:02.61 SUCCESS: zpool clear testpool
19:05:02.64 SUCCESS: zpool clear testpool
19:05:02.68 SUCCESS: zpool clear testpool
19:05:02.76 SUCCESS: zpool clear testpool
19:05:02.79 SUCCESS: zpool clear testpool
19:05:02.85 SUCCESS: zpool clear testpool
19:05:02.88 SUCCESS: zpool clear testpool
19:05:02.90 SUCCESS: zpool clear testpool
19:05:02.94 SUCCESS: zpool clear testpool
19:05:03.06 SUCCESS: zpool clear testpool
19:05:03.08 SUCCESS: zpool clear newpool
19:05:03.11 SUCCESS: zpool clear newpool
19:05:03.14 SUCCESS: zpool clear newpool
19:05:03.17 SUCCESS: zpool clear newpool
19:05:03.19 SUCCESS: zpool clear newpool
19:05:06.23 NOTE: waited 3 seconds
19:05:06.24 Oct 19 2017 19:05:02.401430000	sysevent.fs.zfs.history_event
19:05:06.24 Oct 19 2017 19:05:02.445430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.477430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.509430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.541430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.573430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.605430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.633430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.669430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.753430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.781430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.837430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.869430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.897430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:02.937430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.057430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.077430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.105430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.133430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.161430000	sysevent.fs.zfs.config_sync
19:05:06.24 Oct 19 2017 19:05:03.185430000	sysevent.fs.zfs.config_sync
19:05:06.27 Unexpected events (newpool): 6 != 5
19:05:06.27 NOTE: Performing test-fail callback (/usr/share/zfs/zfs-tests/callbacks/zfs_dbgmsg.ksh)

Though zpool clear is indeed synchronous, the fact that we can see a "sysevent.fs.zfs.history_event" (which comes from the zpool create newpool /mnt/newpool.dat) after zpool events -c seems to confirm that event delivery is asynchronous.

Contributor

Ok, thanks for looking into that. This is going to pose a problem for your count-based tests, isn't it? You could get around that for zpool_events_poolname.ksh by simply not caring about the number of events: just ensure that the zpool events pool output is from pool. Maybe something like the following will work in your case?

zpool events -v newpool | grep "pool =" | grep -v "pool = \"newpool\""
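For illustration, here is a standalone version of that check against canned output; the events variable stands in for the output of zpool events -v newpool, so the fragment runs without a ZFS install.

```shell
# Canned sample standing in for the output of 'zpool events -v newpool';
# the pool name and the events variable are illustrative.
events='pool = "newpool"
pool = "newpool"'

# The check passes only when every "pool =" line names newpool: the final
# grep -v succeeds (exit 0) exactly when some event belongs to another pool.
if echo "$events" | grep "pool =" | grep -qv 'pool = "newpool"'; then
    result=FAIL
else
    result=PASS
fi
echo "$result: events filtered by pool name"
```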

Contributor Author

This race could theoretically affect other tests where we assume 0 events after zpool events -c (zpool_events_follow, zpool_events_clear), though it is much more difficult to trigger there because the other events must come from previous test cases.

I'm going to try your proposed solution locally and force-push when the test seems reliable.

if [[ "$EVENTS_LOG" != "$EVENTS_NUM" ]]; then
log_fail "Unexpected number of events: $EVENTS_LOG != $EVENTS_NUM"
else
log_note "Successfully recorded $EVENTS_LOG events"
Contributor

You don't need to put the log_note in an else. If you put it above the log_pass, it will still behave as you expect. I believe log_fail calls exit.
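As a sketch of that restructuring, with the ZTS helpers stubbed out so the fragment runs standalone (log_fail and log_note normally come from the test library, and log_fail terminates the test):

```shell
# Stubs standing in for the ZTS helpers; in the real suite these come from
# the shared test library.
log_fail() { echo "FAIL: $*" >&2; exit 1; }
log_note() { echo "NOTE: $*"; }

# Illustrative values for the two counters being compared.
EVENTS_LOG=5
EVENTS_NUM=5

# No else branch needed: log_fail exits, so execution only reaches
# log_note on the success path.
if [ "$EVENTS_LOG" != "$EVENTS_NUM" ]; then
    log_fail "Unexpected number of events: $EVENTS_LOG != $EVENTS_NUM"
fi
log_note "Successfully recorded $EVENTS_LOG events"
```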

if [[ "$EVENTS_BYNAME" != "$EVENTS_TESTPOOL" ]]; then
log_fail "Unexpected events: $EVENTS_BYNAME != $EVENTS_TESTPOOL"
else
log_note "Successfully recorded $EVENTS_BYNAME events"
Contributor

ditto

if [[ "$EVENTS_BYNAME" != "$EVENTS_NEWPOOL" ]]; then
log_fail "Unexpected events: $EVENTS_BYNAME != $EVENTS_NEWPOOL"
else
log_note "Successfully recorded $EVENTS_BYNAME events"
Contributor

ditto

{
typeset command="$1"

$command > /dev/null &
Contributor

log_must eval can be used to ensure that the command is logged.
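A sketch of that suggestion, with log_must stubbed so the fragment runs standalone (in the ZTS, log_must comes from the shared test library and records the command in the test log):

```shell
# Stub standing in for the ZTS log_must helper: log the command, then run it.
log_must() { echo "Executing: $*"; "$@" || return 1; }

# Illustrative background command.
command="sleep 1"

# Wrapping the background invocation in 'log_must eval' records the full
# command line, including the redirection and '&', before launching it.
log_must eval "$command > /dev/null &"
```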


function cleanup
{
poolexists $NEWPOOL && destroy_pool $NEWPOOL
Contributor

destroy_pool checks if the pool exists. poolexists $NEWPOOL can be removed.

log_must zpool clear $NEWPOOL
done

# 4. Verify 'zpool events poolname' successfully display events
Contributor

A delay or check may be needed here to determine whether all events were processed. See the previous comment about zpool clear and synchronous behavior.


log_assert "'zpool events -c' should successfully clear events."

EVENTS_NUM="$(random 15)"
Contributor

Can random return 0? Also, I'm not sure we would want a random value for reproducibility reasons. It may be worth defining EVENTS_NUM in zpool_events.cfg.
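For example, a fixed count could live in zpool_events.cfg; the value below is illustrative, not from the PR.

```shell
# zpool_events.cfg: use a fixed event count instead of a random one so
# test failures are reproducible across runs. The value 10 is illustrative.
export EVENTS_NUM=10
```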

if [[ "$CLEAR_OUTPUT" != "cleared $EVENTS_NUM events" ]]; then
log_fail "Failed to clear $EVENTS_NUM events: $CLEAR_OUTPUT"
else
log_note "Successfully cleared $EVENTS_NUM events"
Contributor

Don't need an else clause here.

@loli10K
Contributor Author

loli10K commented Oct 17, 2017

One more thing, the zpool manpage indicates that multiple pool names can be provided. Your implementation supports only one pool being provided. So the manpage or implementation should be made to be consistent.

@dinatale2 you're right, the zpool synopsis suggests only one pool, but the detailed zpool events description seems to support multiple pools. I would personally prefer to support only one pool, mostly because supporting multiple pools would complicate the code quite a bit when it would probably be easier to just use some shell wizardry to string multiple zpool events $poolN invocations together.

@behlendorf
Contributor

supporting multiple pools would complicate the code quite a bit when it would be probably easier to just use some shell wizardry to string multiple zpool events $poolN together.

Supporting a single pool is fine with me, but adding support for multiple pools doesn't look like it would be too much work. Take a look at the zpool_do_sync function, which supports multiple pools, for an example of how this can be done.

@loli10K
Contributor Author

loli10K commented Oct 17, 2017

@behlendorf I wonder if for_each_pool() would work in follow mode (zpool events -f $pool1 $pool2)?

@behlendorf
Contributor

@loli10K good point. That would definitely not work, so then let's limit it to a single pool rather than special case -f.

@loli10K loli10K force-pushed the issue-3285 branch 4 times, most recently from 03a70b1 to 1e2e90c, on October 19, 2017 18:05
then
log_fail "Failed to sync pool $pool (snapshot $i)"
return 1
fi
Contributor

Thanks for looking into this, I've been seeing these occasional failures too.

Contributor Author

Since I just reproduced this same failure on a local ubuntu-14.04.4-server-i386 builder, I'm dropping this change from this PR and will continue working on it "offline"; I'm going to open a new PR if/when I'm able to fix the underlying issue.

# STRATEGY:
# 1. Clear all ZFS events
# 2. Generate some new ZFS events
# 3. Verify 'zpool events -c' successfully clear new events
Contributor

nit: s/clear/clears

else
i=$((i+1))
fi
sleep 1
Contributor

I'd suggest either pulling the sleep 1 in to the else block or removing the else case entirely. This way it's clear that both of these things must happen together. Which is true now, but not obvious at a glance.


while [[ $i -lt $timeout ]]; do
count=$(zpool events -H | wc -l)
if [[ $count -eq $eventnum ]]; then
Contributor

The -eq here makes it possible you'll overshoot and skip over the expected event count. Should this fail immediately in that case rather than wait for the timeout?
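Folding both review comments into the loop might look like the sketch below; the name wait_events and the counter-command parameter are stand-ins (in the suite the command would be zpool events -H | wc -l), not the code that was merged.

```shell
# Hypothetical rework of the wait loop: sleep only when we keep waiting,
# and fail fast when the observed count overshoots the expected one instead
# of spinning until the timeout.
wait_events() {
    counter_cmd=$1
    eventnum=$2
    timeout=$3
    i=0
    while [ "$i" -lt "$timeout" ]; do
        count=$(eval "$counter_cmd")
        if [ "$count" -eq "$eventnum" ]; then
            return 0
        elif [ "$count" -gt "$eventnum" ]; then
            return 1    # overshot the expected count: waiting cannot help
        else
            i=$((i + 1))
            sleep 1
        fi
    done
    return 1
}
```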

@loli10K loli10K force-pushed the issue-3285 branch 5 times, most recently from 73ae1e3 to c4ff96e, on October 23, 2017 17:39
Contributor

@behlendorf behlendorf left a comment

Looks good. @loli10K when you're happy with this too go ahead and remove the "work in progress" tag.

@loli10K
Contributor Author

loli10K commented Oct 24, 2017

I was reading the last comment in #3285 more carefully before dropping the WIP tag, and

you can't use -f -H -c together.. though the command runs, it will not do what you expect.

Unfortunately this implementation doesn't address it; maybe zpool events -c should not accept any other argument or parameter, so neither of these would be allowed:

  • zpool events -cf
  • zpool events -c onlythispool

This would require minimal changes to the manpage/tests and zpool_do_events().

EDIT: update pushed.

@loli10K loli10K removed the Status: Work in Progress Not yet ready for general review label Oct 24, 2017
@dinatale2
Contributor

@loli10K you may need to push again, I don’t see where the buildbot received your change.

Additionally add four new tests:

 * zpool_events_clear: verify 'zpool events -c' functionality
 * zpool_events_cliargs: verify command line options and arguments
 * zpool_events_follow: verify 'zpool events -f'
 * zpool_events_poolname: verify events filtering by pool name

Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
@behlendorf behlendorf merged commit 88f9c93 into openzfs:master Oct 26, 2017
@loli10K loli10K deleted the issue-3285 branch November 1, 2017 09:07
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this pull request Nov 4, 2017
Additionally add four new tests:

 * zpool_events_clear: verify 'zpool events -c' functionality
 * zpool_events_cliargs: verify command line options and arguments
 * zpool_events_follow: verify 'zpool events -f'
 * zpool_events_poolname: verify events filtering by pool name

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes openzfs#3285
Closes openzfs#6762
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this pull request Nov 6, 2017
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this pull request Jan 29, 2018
Nasf-Fan pushed a commit to Nasf-Fan/zfs that referenced this pull request Feb 13, 2018
@loli10K loli10K restored the issue-3285 branch November 1, 2018 15:40
FransUrbo pushed a commit to FransUrbo/zfs that referenced this pull request Apr 28, 2019