Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement allocation size ranges and use for gang leaves #17111

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

pcd1193182
Copy link
Contributor

Motivation and Context

This PR pulls out the dynamic allocation size changes from #17004 into their own PR. It also removes the de-ganging logic, which isn't necessary and is difficult to implement without some pretty unpleasant-looking code.

Description

We teach the metaslab code how to allocate a range of sizes, not just a single target size, and use that in the gang block code to try to get it to allocate fewer leaves when possible.

How Has This Been Tested?

Manual testing + the new gang block test.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@amotin amotin added the Status: Revision Needed Changes are required for the PR to be accepted label Mar 3, 2025
@github-actions github-actions bot removed the Status: Revision Needed Changes are required for the PR to be accepted label Mar 3, 2025
Paul Dagnelie added 7 commits March 25, 2025 15:40
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
@pcd1193182 pcd1193182 added the Status: Code Review Needed Ready for review and testing label Mar 28, 2025
align = size & -size;
cursor = &msp->ms_lbas[highbit64(align) - 1];
offset = metaslab_block_picker(rt, cursor, size,
size, metaslab_df_max_search, found_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that asking here for only minimum size we will just get what we asked, while there may be more. It goes out of the scope of this PR, but I have strong wish to do something with this alignment mechanism. I have difficulties to believe in its efficiency in its current state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the cursor setup in the df_allocator isn't great. The ndf allocator seems more logical, but I don't know if it's actually working yet. And in general, it would be nice if we could do a better job of separating allocations into different metaslabs by size. I've got a couple ideas I'm playing around with on this front, but they're major rewrites of the allocation code.

I'll switch this to ask for the full range instead

*
* Note that this function will not take into account the effect of gang
* headers, which also modify the ASIZE of the DVAs. It is purely a reverse of
* the psize_to_asize function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't even guess why would RAIDZ code care about gang headers to mention it here.
Also an empty line is missing before the comment.

Copy link
Contributor Author

@pcd1193182 pcd1193182 Mar 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is there to warn any future users who might want this function for other things to not use this without considering the possibility of gang headers. A function labelled "asize to psize" could otherwise fool some future developer, and I'd rather not introduce a potential landmine if it's easily avoided. I'm happy to move this to the default impl rather than the raidz one, though.

Comment on lines +4135 to +4138
if (zio->io_flags & ZIO_FLAG_PREALLOCATED) {
ASSERT3U(zio->io_child_type, ==, ZIO_CHILD_GANG);
return (zio);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember we discussed these ZIO changes, and same as before I am unhappy about adding more crutches there. For the case when we have space allocated, etc, there is zio_rewrite() call. If it misses something, it could be cleaner to add that instead. Bypassing pipeline stages after getting into them IMHO a hack.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember your issues, but as I said in the previous discussion, it's worse if you try to use rewrites. I tried it that way originally, and it was more invasive to make the necessary changes to the rewrite path than it was to just have the PREALLOCATED flag.

I have an idea for a slightly different way to implement the PREALLOCATED flag, though, let me try that and see what you think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants