[v21.11.x] storage/cloud_storage: make read buffer sizes configurable #3436

Merged
jcsp merged 6 commits into redpanda-data:v21.11.x from backport-issue-3412
Jan 11, 2022

jcsp commented Jan 11, 2022

Cover letter

Backport of #3421

Release notes

Improvements

  • Memory utilization on systems with a large number of partitions can now be tuned using the configuration properties storage_read_buffer_size (default 128KiB) and storage_read_readahead_count (default 10). Both may be lowered to more conservative values, such as storage_read_buffer_size=16KiB and storage_read_readahead_count=1, to reduce per-partition memory overhead and improve stability when the number of partitions is large (e.g. more than 10000).
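
As a rough illustration of why these two properties matter, the sketch below estimates read-buffer memory under a simplified model. The model is an assumption, not something this PR states: one busy segment reader per partition, each holding `readahead_count` buffers of `buffer_size` bytes. The helper name `reader_buffer_bytes` is hypothetical.

```python
KIB = 1024


def reader_buffer_bytes(buffer_size: int, readahead_count: int, readers: int) -> int:
    """Estimate total read-buffer memory in bytes.

    Assumed model (not taken from the Redpanda source): every reader is busy
    and holds `readahead_count` buffers of `buffer_size` bytes each.
    """
    return buffer_size * readahead_count * readers


partitions = 10_000  # assumed: one busy reader per partition

# Defaults from the release note: 128KiB buffers, readahead of 10.
defaults = reader_buffer_bytes(128 * KIB, 10, partitions)

# Conservative values from the release note: 16KiB buffers, readahead of 1.
conservative = reader_buffer_bytes(16 * KIB, 1, partitions)

print(f"defaults:     {defaults / 1024**3:.2f} GiB")      # 12.21 GiB
print(f"conservative: {conservative / 1024**2:.2f} MiB")  # 156.25 MiB
print(f"reduction:    {defaults // conservative}x")       # 80x
```

Under these assumptions the conservative settings cut the estimated buffer footprint by a factor of 80. Real usage depends on how many readers are actually active at once.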

Anyone constructing a reader should be mindful of
the buffer size in use.  128_KiB is only small if
you don't have tens of thousands of whatever
you're constructing.

Signed-off-by: John Spray <jcs@vectorized.io>
(cherry picked from commit 78bd658)

There was a buried default of '10' here which
in some cases is a 10x factor on the memory
consumption of segment readers busy with
reads (e.g. on recovery after startup...)

Signed-off-by: John Spray <jcs@vectorized.io>
(cherry picked from commit dd8a805)

These parameters are the dominant factor in redpanda
memory consumption when the number of partitions
is large (and by extension the number of segment
readers is large).

Signed-off-by: John Spray <jcs@vectorized.io>
(cherry picked from commit f21031a)

Fixes: redpanda-data#3412

Signed-off-by: John Spray <jcs@vectorized.io>
(cherry picked from commit 05e8c99)

These are only used in tests now.  Give tests their
own hardcoded values and remove the ones in storage::
to prevent confusion in future.

Signed-off-by: John Spray <jcs@vectorized.io>
(cherry picked from commit 7c62d6e)

...and remove hardcoded defaults.  Cloud storage
might want to use different values to the global
setting in future, but for the moment they were
the same & can always be split out later.

Signed-off-by: John Spray <jcs@vectorized.io>
(cherry picked from commit e5ba17b)

jcsp commented Jan 11, 2022

Backporting because:

  • Enables better stress testing of SI memory load, by getting bloated input buffers out of the way
  • We believe certain users may start pushing partition counts higher soon

jcsp merged commit d8635f4 into redpanda-data:v21.11.x on Jan 11, 2022
jcsp deleted the backport-issue-3412 branch on January 11, 2022 19:49