statechange event not always generated when VDEV state changes #9437

Open
cvoltz opened this issue Oct 9, 2019 · 5 comments
Labels
Bot: Not Stale (Override for the stale bot) · Type: Defect (Incorrect behavior, e.g. crash, hang)

Comments

@cvoltz
Contributor

cvoltz commented Oct 9, 2019

System information

Type Version/Name
Distribution Name CentOS
Distribution Version 7.6
Linux Kernel 3.10.0-957.10.1
Architecture x86_64
ZFS Version 0.7.13-1
SPL Version 0.7.13-1

Problem

When a drive in a pool is FAULTED (e.g., due to I/O errors) or is taken OFFLINE (e.g., by running the zpool offline command), the resource.fs.zfs.statechange event is generated with vdev_state set appropriately. If the drive is brought back online (e.g., by running the zpool online command), the resource.fs.zfs.statechange event is generated with vdev_state set to ONLINE. However, if the drive is instead replaced using the zpool replace command, no resource.fs.zfs.statechange event is generated.

Lustre 2.11 added the ZEDLET statechange-lustre.sh, which updates the obdfilter.*.degraded property for a target when the pool's state changes: it sets the degraded property when the pool is DEGRADED and clears it when the pool is ONLINE. Since ZFS does not always generate the state change event, the target's degraded property is sometimes left set even after the pool returns to ONLINE, which reduces Lustre filesystem performance.
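
For context, ZED runs a ZEDLET with details of the triggering event in its environment. The snippet below is a rough, hypothetical sketch of the logic described above, not the actual statechange-lustre.sh; it assumes ZED exports the pool name as ZEVENT_POOL for statechange events and that lctl is available on the server:

    #!/bin/sh
    # Hypothetical sketch; the real ZEDLET differs in detail.
    [ -n "${ZEVENT_POOL}" ] || exit 1

    # Re-read the current pool health rather than relying only on the event payload.
    health=$(zpool list -H -o health "${ZEVENT_POOL}") || exit 1

    case "${health}" in
        DEGRADED)
            # The real script restricts this to the targets backed by this pool.
            lctl set_param obdfilter.*.degraded=1
            ;;
        ONLINE)
            lctl set_param obdfilter.*.degraded=0
            ;;
    esac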

See https://jira.whamcloud.com/browse/LU-12836 for more information (including output from zpool events -v).

Steps to reproduce

  1. Create a pool:
    pool=ost04
    zpool create $pool \
      -o ashift=12 \
      -o cachefile=none \
      -O canmount=off \
      -O recordsize=1024K \
      -f \
      raidz2 /dev/mapper/d8000_sep500C0FF03C1AC73E_bay0{41..50}-0
  2. Select a drive to fail from the pool:
    bad_drive=d8000_sep500C0FF03C1AC73E_bay050-0
  3. Select an unused drive to use as the replacement drive:
    spare_drive=d8000_sep500C0FF03C1AC73E_bay101-0
  4. Verify the pool is ONLINE:
    zpool list -H -o name,health $pool
  5. Wipe the replacement drive so it looks like an unused drive:
    wipefs --all --force /dev/mapper/$spare_drive
  6. Clear the event history (just to make it easier to see what events were generated by the test):
    zpool events -c
  7. Simulate a drive failure for the selected drive by taking it offline:
    zpool offline $pool $bad_drive
  8. Wait for the pool to become DEGRADED:
    ruby -r timeout <<'EOF'
    # Poll until the pool reports DEGRADED, giving up after 45 seconds.
    Timeout::timeout(45) do
      loop do
        print '.'
        break if `zpool status` =~ /DEGRADED/
        sleep 1  # avoid busy-polling zpool status
      end
    end
    EOF
  9. Replace the "failed" drive:
    zpool replace $pool $bad_drive $spare_drive
  10. Wait for resilvering to finish:
    ruby -r timeout <<'EOF'
    # Poll until resilvering has finished, giving up after 45 seconds.
    Timeout::timeout(45) do
      loop do
        print '.'
        break if `zpool events` =~ /sysevent.fs.zfs.resilver_finish/
        sleep 1  # avoid busy-polling zpool events
      end
    end
    EOF
  11. Verify the pool is ONLINE:
    zpool list -H -o name,health $pool
  12. Dump the event history:
    zpool events -v
    and notice that it only contains the state change event for the drive going OFFLINE and is missing the corresponding state change event for the drive coming back ONLINE (a filtered check is sketched after the event dump). The output should have included an event like this:
    Oct  8 2019 09:29:58.726502922 resource.fs.zfs.statechange
            version = 0x0
            class = "resource.fs.zfs.statechange"
            pool = "ost04"
            pool_guid = 0x8159dca79b3945a4
            pool_state = 0x0
            pool_context = 0x0
            vdev_guid = 0x4b6ac5c4c8d5cb1a
            vdev_state = "ONLINE" (0x7)
            vdev_path = "/dev/mapper/d8000_sep500C0FF03C1AC73E_bay101-0"
            vdev_devid = "dm-uuid-mpath-35000c500a63e36f7"
            vdev_laststate = "OFFLINE" (0x2)
            time = 0x5d9c9d66 0x2b4d8e0a
            eid = 0x7c
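
    If the full dump is noisy, the check can be narrowed with a grep filter (a sketch using standard grep; the context line count just needs to cover the event payload):

    # One line per event; after "zpool replace" only the OFFLINE transition
    # appears, with no matching ONLINE statechange event.
    zpool events | grep statechange

    # Or show the full payload of each statechange event.
    zpool events -v | grep -A 12 'class = "resource.fs.zfs.statechange"'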
    

Changing the
    zpool replace $pool $bad_drive $spare_drive
command to
    zpool online $pool $bad_drive
results in the resource.fs.zfs.statechange event being generated when the pool goes ONLINE.

The Lustre issue includes the test-degraded-drive script which can be used for testing.

While we are looking at this specific scenario, we should investigate whether there are any other scenarios where the pool could change to ONLINE but not generate a corresponding state change event.
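
One simple way to look for such cases (a sketch using only the commands already shown above) is to follow the event stream in one terminal while exercising the different paths back to ONLINE in another, and compare which transitions emit a statechange event:

    # Terminal 1: follow events as they are generated.
    zpool events -f

    # Terminal 2: exercise the different recovery paths and watch terminal 1.
    zpool offline $pool $bad_drive
    zpool online $pool $bad_drive                 # statechange (ONLINE) is emitted
    zpool offline $pool $bad_drive
    zpool replace $pool $bad_drive $spare_drive   # no statechange after resilvering (this issue)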

@ofaaland
Contributor

ofaaland commented Oct 9, 2019

@tonyhutter can you take a look at this? Thanks

behlendorf added the Type: Defect label on Oct 9, 2019
@cvoltz
Contributor Author

cvoltz commented Oct 17, 2019

I have a fix for this. I'll generate a PR for it as soon as I am finished running the ZFS test suite on it.

cvoltz added a commit to cvoltz/zfs that referenced this issue Oct 24, 2019
When the "zpool online" command is used bring a faulted or
offline drive back online, a resource.fs.zfs.statechange event
is generated. When the "zpool replace" command is used to bring
a faulted or offline drive back online, a statechange event is
not generated. Add the missing statechange event after
resilvering has finished. The new sequence of events looks like
this:
    sysevent.fs.zfs.vdev_attach
    sysevent.fs.zfs.resilver_start
    sysevent.fs.zfs.history_event (scan setup)
    sysevent.fs.zfs.history_event (scan done)
    sysevent.fs.zfs.resilver_finish
    sysevent.fs.zfs.config_sync
  + resource.fs.zfs.statechange
    sysevent.fs.zfs.vdev_remove
    sysevent.fs.zfs.history_event (vdev attach)
    sysevent.fs.zfs.config_sync
    sysevent.fs.zfs.history_event (detach)

Signed-off-by: Christopher Voltz <christopher.voltz@hpe.com>
External-issue: LU-12836
Closes openzfs#9437
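
With a fix along those lines, re-running the reproduction above and filtering the event history should show the ONLINE statechange immediately after resilver_finish (a quick check, assuming standard grep):

    zpool events | grep -E 'resilver_finish|statechange'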
@stale

stale bot commented Oct 16, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

The stale bot added the Status: Stale (No recent activity for issue) label on Oct 16, 2020
The stale bot closed this as completed on Jan 14, 2021
@behlendorf
Contributor

I'm reopening this since this hasn't yet been addressed to my knowledge.

behlendorf reopened this on Jan 15, 2021
The stale bot removed the Status: Stale label on Jan 15, 2021
@stale

stale bot commented Jan 15, 2022

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

The stale bot added the Status: Stale label on Jan 15, 2022
behlendorf added the Bot: Not Stale label and removed the Status: Stale label on Jan 15, 2022