arc: Drop an incorrect assert #12246
Unfortunately, there was an overzealous assertion that was (in pretty specific circumstances) false, causing failure. Let's not, and say we did.

Closes: openzfs#9897
Closes: openzfs#12020
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
The assertion is based on the understanding that anonymous HDRs do not have DVAs, which I think is one of the defining characteristics of anonymous HDRs. During correct operation, how do we come to have an anonymous HDR with a DVA?
I believe we can come to have an anonymous buffer here with a non-empty header in the event that it was overridden after being sync'd out. If you look a few lines down in
@rincebrain I'm still not understanding how we get into this situation. @grwilson and I have been looking at the code trying to see what could be happening here. We noticed some interesting code paths that are taken when using dedup. When you reproduce the problem, are you using dedup (or have you ever used it on this storage pool)? (Specifically: for dedup'ed-away writes, if the matching block is already in the ARC, then arc_write_done() seems to set b_dva but not call arc_access() (because
No DDT entries are present on any pool on my original system, according to On the testbed I was playing with this on, no, never dedup enabled, ever.
@grwilson and @ahrens, can we come to a consensus on this proposed change? As @behlendorf has pointed out, there is an explicit clear of the header just below this ASSERT, with a comment indicating that the DVA could be non-null in some circumstances. (BTW, both the code and the ASSERT came from George in the same change set.) It seems that either the code or the ASSERT has to be in error here.
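The tension described in this comment can be illustrated with a small standalone C sketch. This is not the code in arc.c: `toy_dva_t`, `toy_hdr_t`, `toy_dva_is_empty()` and `toy_release_anon()` are hypothetical names invented for illustration, and the real header and DVA types are considerably more involved. The sketch only shows the shape of the argument: if the release path clears the DVA unconditionally, then a non-empty DVA is a case the code already handles, and asserting that it is empty immediately beforehand contradicts that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a block address; not the real dva_t. */
typedef struct toy_dva {
	uint64_t word[2];
} toy_dva_t;

/* Hypothetical stand-in for an ARC buffer header. */
typedef struct toy_hdr {
	bool anonymous;
	toy_dva_t dva;
} toy_hdr_t;

static bool
toy_dva_is_empty(const toy_dva_t *dva)
{
	return (dva->word[0] == 0 && dva->word[1] == 0);
}

/*
 * Release path for an anonymous header in this toy model.
 * The caller must pass an anonymous header; that is the only
 * precondition checked here.
 */
static void
toy_release_anon(toy_hdr_t *hdr)
{
	assert(hdr->anonymous);

	/*
	 * No assertion that the DVA is empty: the clear below exists
	 * precisely because the DVA may still be set at this point.
	 */
	memset(&hdr->dva, 0, sizeof (hdr->dva));
}
```

With this shape, a header that arrives with a stale DVA and one that arrives with an empty DVA both leave `toy_release_anon()` in the same state, which is the behavior the explicit clear implies was intended.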
Just a follow-up: I have also seen this on FreeBSD 14.0-CURRENT (amd64/x86_64), and the fix is trivial. Just comment out line 6538 of arc.c and life goes on just fine. Please bear in mind that we only ever see this panic when running a full debug kernel with witness options and so on. Even more rare and strange is that I can trigger the panic repeatedly when I run QEMU on the FreeBSD target system.
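The observation that only debug kernels panic is consistent with assertions of this kind being compiled out of non-debug builds. A minimal sketch of that pattern follows, assuming a hypothetical `TOY_DEBUG` build flag and hypothetical `TOY_VERIFY`/`TOY_ASSERT` macros; to keep the example observable, a failed check bumps a counter instead of panicking, which is not what a real kernel assertion does.

```c
#include <stdbool.h>

/* Hypothetical counter: records failed checks instead of panicking. */
static int toy_failed_checks = 0;

/* Always evaluated, in every build. */
#define	TOY_VERIFY(cond) \
	do { if (!(cond)) toy_failed_checks++; } while (0)

/* Evaluated only when TOY_DEBUG is defined; otherwise compiled out. */
#ifdef TOY_DEBUG
#define	TOY_ASSERT(cond)	TOY_VERIFY(cond)
#else
#define	TOY_ASSERT(cond)	((void)0)
#endif

/*
 * Runs the debug-only check against the given condition and
 * returns how many checks failed during this call.
 */
static int
toy_release(bool dva_is_empty)
{
	(void) dva_is_empty;	/* unused when TOY_ASSERT compiles out */
	toy_failed_checks = 0;
	TOY_ASSERT(dva_is_empty);
	return (toy_failed_checks);
}
```

Built without `-DTOY_DEBUG`, `toy_release(false)` records no failure, so a false condition passes silently; built with it, the same call records one. That matches the report that production kernels run past the condition while debug kernels stop on it.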
I was able to go back through our internal repos to find the origin of the assertion, and after looking at various commits, it is clear to me that the assertion was added in error. The code change proposed here is correct.
Unfortunately, there was an overzealous assertion that was (in pretty specific circumstances) false, causing failure. This assertion was added in error, so we're removing it.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes openzfs#9897
Closes openzfs#12020
Closes openzfs#12246
Motivation and Context
#9897, #12020
Description
Dropped an assert that held in most cases, but not all, during correct operation.
How Has This Been Tested?
I had two reliable reproducers on my Debian buster systems - both of them, without this change, would trip this assert in minutes; with this change, they never did in over an hour of load each, and no obvious bad behavior occurred either. (The two reproducers, just in case this regresses, were "mild/moderate IO from inside a chroot on a zpool over NFS from a sparc64 client (rebuilding mandb, in particular, was quite good at it)" and "disk IO on a file on a zpool using kvm and virtio-blk/virtio-scsi storage". (Some of these constraints may be unnecessarily specific, but that is what I used.))