Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

abd_iter_page: rework to handle multipage scatterlists #16108

Merged
merged 1 commit into from
Apr 19, 2024

Conversation

robn
Copy link
Contributor

@robn robn commented Apr 17, 2024

Motivation and Context

@bwatkinson reported assertion trips and memory corruption after rebasing #10018 onto recent master. He traced the problem to scatterlists received from userspace carrying multiple contiguous non-compound pages, and asked me about it.

Description

Previously, abd_iter_page() would assume that every scatterlist would contain a single page (compound or no), because that's all we ever create in abd_alloc_chunks(). However, scatterlists can contain multiple pages of arbitrary provenance, and if we get one of those, we'd get all the math wrong.

This reworks things to handle multiple pages in a scatterlist, by properly finding the right page within it for the given offset, and understanding better where the end of the page is and not crossing it.

How Has This Been Tested?

Full ZTS run completed successfully against kernel 6.1.76.

Compile checked, sanity tested and "known troublesome" tests (ie zpool_reopen_003_pos) passed on 3.10-EL7, 4.14.336, 5.10.214, 6.1.83 and 6.8.2.

Using @bwatkinson's rebase of #10018 to master, the -T direct test suite asserts out early without this change (specifically the dio_aligned_block test), and all tests pass with it. See robn/directio-test.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@robn
Copy link
Contributor Author

robn commented Apr 17, 2024

@behlendorf @tonyhutter fwiw, assuming this passes muster, I'm softly against taking it to 2.2.4 at the last minute. While the scatterlist page iterator is not quite right, I'm pretty sure that it works properly for the only kinds of scatterlists it can currently receive, which should all be compound, and this code is off by default on 2.2 anyway. On the other hand, we know how many weird and wild Linux variants are out there, and we wanted this on 2.2 so we could ask people already hitting peculiar cases to try it, so one could argue they're already in a weird world. So I could go either way, but my default position is to err towards caution. I won't argue if you wanted to go the other way.

@robn robn force-pushed the abd-iter-page-scatterlist-multipage branch 2 times, most recently from 83197b8 to 553c0d7 Compare April 18, 2024 00:01
@tonyhutter
Copy link
Contributor

 * Disable compound pages before kernel 4.5. See comment below for why.

Just crudely looking this over - is this something that only affects < 4.5 kernels? Only asking since CentOS 7 is going to go EOL in June 2024...

@robn
Copy link
Contributor Author

robn commented Apr 18, 2024

Just crudely looking this over - is this something that only affects < 4.5 kernels? Only asking since CentOS 7 is going to go EOL in June 2024...

@tonyhutter No, all kernels. The comments and defines are just moving some of the logic from the existing 4.5 check up, because I need the newer ABD_PAGE_SIZE macro. The explainer is down in the comment; I tried moving it up but it didn't entirely make sense out of context.

You're right though, its not easy to follow though. I'll rework the commentary (code stays the same).

@robn robn force-pushed the abd-iter-page-scatterlist-multipage branch from 553c0d7 to a9068d9 Compare April 18, 2024 01:27
@robn
Copy link
Contributor Author

robn commented Apr 18, 2024

I'll rework the commentary (code stays the same).

Pushed update, lmk!

Previously, abd_iter_page() would assume that every scatterlist would
contain a single page (compound or no), because that's all we ever
create in abd_alloc_chunks(). However, scatterlists can contain multiple
pages of arbitrary provenance, and if we get one of those, we'd get all
the math wrong.

This reworks things to handle multiple pages in a scatterlist, by
properly finding the right page within it for the given offset, and
understanding better where the end of the page is and not crossing it.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reported-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
@robn robn force-pushed the abd-iter-page-scatterlist-multipage branch from a9068d9 to e4ae602 Compare April 18, 2024 05:29
Copy link
Contributor

@bwatkinson bwatkinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm this PR does fix the issue I was encountering in the Direct I/O PR (#10018).

@behlendorf behlendorf added the Status: Accepted Ready to integrate (reviewed, tested) label Apr 19, 2024
@behlendorf behlendorf merged commit f4f1561 into openzfs:master Apr 19, 2024
23 of 25 checks passed
robn added a commit to robn/zfs that referenced this pull request Jul 17, 2024
Previously, abd_iter_page() would assume that every scatterlist would
contain a single page (compound or no), because that's all we ever
create in abd_alloc_chunks(). However, scatterlists can contain multiple
pages of arbitrary provenance, and if we get one of those, we'd get all
the math wrong.

This reworks things to handle multiple pages in a scatterlist, by
properly finding the right page within it for the given offset, and
understanding better where the end of the page is and not crossing it.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reported-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes openzfs#16108
(cherry picked from commit f4f1561)
robn added a commit to robn/zfs that referenced this pull request Jul 17, 2024
Previously, abd_iter_page() would assume that every scatterlist would
contain a single page (compound or no), because that's all we ever
create in abd_alloc_chunks(). However, scatterlists can contain multiple
pages of arbitrary provenance, and if we get one of those, we'd get all
the math wrong.

This reworks things to handle multiple pages in a scatterlist, by
properly finding the right page within it for the given offset, and
understanding better where the end of the page is and not crossing it.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reported-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes openzfs#16108
(cherry picked from commit f4f1561)
@robn robn mentioned this pull request Jul 17, 2024
13 tasks
robn added a commit to robn/zfs that referenced this pull request Jul 17, 2024
Previously, abd_iter_page() would assume that every scatterlist would
contain a single page (compound or no), because that's all we ever
create in abd_alloc_chunks(). However, scatterlists can contain multiple
pages of arbitrary provenance, and if we get one of those, we'd get all
the math wrong.

This reworks things to handle multiple pages in a scatterlist, by
properly finding the right page within it for the given offset, and
understanding better where the end of the page is and not crossing it.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reported-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes openzfs#16108
(cherry picked from commit f4f1561)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants