
Per Pool Dirty Data Settings #15949

Open
Haravikk opened this issue Mar 1, 2024 · 0 comments
Labels
Type: Feature Feature request or new feature


Haravikk commented Mar 1, 2024

Describe the feature you would like to see added to OpenZFS

I would like to see zpool properties added to allow individual zpools to be given a smaller share of the ZFS dirty data buffer (tunable.zfs_dirty_data_max).

My proposal would be for the following zpool properties to be added:

  • dirty_data_max – set to a size in bytes, megabytes, etc., giving the maximum amount of data the pool may have within the global dirty data buffer. The global limit still caps the buffer as normal, so a value larger than the global size has no effect.
  • dirty_data_max_percent – allows the size to instead be set as a percentage of the global dirty data buffer size, so with tunable.zfs_dirty_data_max=4294967296 (4 GiB) and dirty_data_max_percent=50 the maximum size would be 2 GiB.

In short, the amount of dirty data (pending writes) a pool may have is the smallest of tunable.zfs_dirty_data_max, dirty_data_max and dirty_data_max_percent.

The behaviour of I/O throttling (via tunable.async_write_min_dirty_pct etc.) would also be adjusted to use the smaller of these sizes, so a pool with a smaller limit may be throttled while other pools remain unaffected (so long as their limits remain based on the global size).

How will this feature improve OpenZFS?

This change would allow dirty data buffering to be configured on a per-pool basis on setups with multiple pools, rather than the current behaviour of the buffer being shared globally with no way to control it (except to increase the global size).

Sharing the buffer between multiple pools becomes problematic when one pool has slower write performance than another: heavy write activity to the slower pool can fill the buffer and throttle write performance of the faster pool(s) as well. Increasing the size of the buffer helps, but only masks the underlying problem, as a sufficiently large burst of writes to the slow pool will still fill the buffer and throttle every other pool.

By allowing a slower pool to be given a stricter limit, the rest of the buffer remains free, which should maintain performance for any other pool(s).

Additional Context

This same problem could be solved in some cases via #11982 (writeback cache): providing a true writeback cache device to a slow pool would allow it to complete writes faster (and thus clear the dirty data buffer sooner). That is the better solution when the slow pool's performance itself needs to improve, but it requires additional hardware of an appropriate size and speed to absorb surges in write activity.

Per-pool dirty data settings allow the same problem to be solved without additional hardware, in the sense of maintaining the performance of other pools, by letting a slow pool be configured to throttle separately: only the slow pool slows down until its write backlog clears.
