-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zinject: inject device errors into ioctls #16061
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never liked type IOCTL, since it is meaningless. I only hope that potential users of zinject know that in ZFS it means flush.
Agreed. I wasn't too bothered because its "only" zinject, which you're already not supposed to use without a lot of knowledge and confidence. I do have designs to change it to be just "flush" (much like "trim"), but it never really mattered too much to me and I was going to tackle it after this unit of work (this is an early commit in a much larger series). I could take a swing at renaming it first, if you'd prefer? I know you weren't asking for that, but you're not wrong, and maybe it'd be better to clean it up now before making it more visible? |
My only worry is that renaming may be pretty invasive, possibly complicating some merges, so I am torn. |
Adds 'ioctl' as a valid IO type for device error injection, so we can simulate a flush error (which OpenZFS currently ignores, but that's by the by). To support this, adding ZIO_STAGE_VDEV_IO_DONE to ZIO_IOCTL_PIPELINE, since that's where device error injection happens. This needs a small exclusion to avoid the vdev_queue, since flushes are not queued, and I'm assuming that the various failure responses are still reasonable for flush failures (probes, media change, etc). This seems reasonable to me, as a flush failure is not unlike a write failure in this regard, however this may be too aggressive or subtle to assume in just this change. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
ee21245
to
a572999
Compare
Rough version here: https://github.com/robn/zfs/commits/zio-ioctl-flush-rename/ (+158/-201) |
LGTM. Just the concept "TRIM ops and bytes are reported to user space as ZIO_TYPE_FLUSH." is weird. Also we could scrap io_cmd field and zio_ioctl() function after that. |
Thanks for that. Opened #16064. Since its not really related to this PR, I'm happy to continue with this one as a separate thing. Whichever one lands first, I'll make the appropriate adjustment to the other. |
This looks good. Let's land this first and you can fix up the names in #16064. |
Before openzfs#16061 zio_vdev_io_done() was not used for FLUSH requests. Addition of it triggers reprobe each TXG for vdevs not supporting them. Since those errors are often expected, they are normally handled by individual vdev drivers and should be ignored here. Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc.
Before #16061 zio_vdev_io_done() was not used for FLUSH requests. Addition of it triggers reprobe each TXG for vdevs not supporting them. Since those errors are often expected, they are normally handled by individual vdev drivers and should be ignored here. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16110
Motivation and Context
I'm working on flush error responses. Being able to inject flush errors is very useful!
Description
Adds 'ioctl' as a valid IO type for device error injection, so we can simulate a flush error (which OpenZFS currently ignores, but that's by the by).
To support this, adding
ZIO_STAGE_VDEV_IO_DONE
toZIO_IOCTL_PIPELINE
, since that's where device error injection happens. This needs a small exclusion to avoid the vdev_queue, since flushes are not queued, and I'm assuming that the various failure responses are still reasonable for flush failures (probes, media change, etc). This seems reasonable to me, as a flush failure is not unlike a write failure in this regard, however this may be too aggressive or subtle to assume in just this change.How Has This Been Tested?
OpenZFS currently ignores flush failures, so I added some logging just to show non-zero returns from ioctl IOs internally. It fired when a fault was injected, and were silent otherwise.
Light sanity checking suggests that when there's no fault injected, everything else is working fine.
Test suite will have to get the rest.
Types of changes
Checklist:
Signed-off-by
.