
kafka/protocol: use frag vec for offset fetch #15918

Merged
merged 1 commit into redpanda-data:dev from group-fetch-large-alloc on Jan 3, 2024

Conversation

@rockwotj (Contributor) commented Jan 2, 2024

We've seen oversized allocations with this field when group offset fetches are used.
This switches partitions to a fragmented vector.

Fixes: #15909
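
For context, the sketch below is an illustrative, hypothetical stand-in (not the actual Redpanda fragmented_vector, whose names and details differ) showing the idea the fix relies on: storing elements in small fixed-size fragments so that no single allocation grows with the number of partitions. The 1KiB default matches the small_fragment_vector fragment size mentioned in the review discussion below.

// Illustrative sketch only, not the actual Redpanda fragmented_vector.
// Elements live in fixed-size fragments, so each allocation stays near
// fragment_bytes no matter how many partitions are appended.
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

template <typename T, std::size_t fragment_bytes = 1024>
class fragmented_vector_sketch {
    static constexpr std::size_t elems_per_frag
      = std::max<std::size_t>(1, fragment_bytes / sizeof(T));
    // Each inner vector is capped at elems_per_frag elements. The outer
    // vector still grows, but only by one small vector header per fragment.
    std::vector<std::vector<T>> _frags;
    std::size_t _size = 0;

public:
    void push_back(T v) {
        if (_frags.empty() || _frags.back().size() == elems_per_frag) {
            _frags.emplace_back();
            _frags.back().reserve(elems_per_frag);
        }
        _frags.back().push_back(std::move(v));
        ++_size;
    }
    T& operator[](std::size_t i) {
        return _frags[i / elems_per_frag][i % elems_per_frag];
    }
    std::size_t size() const { return _size; }
};

A plain std::vector<offset_fetch_response_partition> holding thousands of entries needs one contiguous buffer of the full size, which is the oversized allocation this PR addresses; a fragmented layout trades that for many small allocations and slightly more expensive indexing.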

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x
  • v23.1.x

Release Notes

Bug Fixes

  • Prevent oversized allocations when group fetching from many partitions.

We've seen oversized allocations in the many-partitions test with group
fetches; switch to using a fragmented vector to prevent that.

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>
@@ -62,7 +63,8 @@ struct offset_fetch_response final {
         data.error_code = error_code::none;
         if (topics) {
             for (auto& topic : *topics) {
-                std::vector<offset_fetch_response_partition> partitions;
+                small_fragment_vector<offset_fetch_response_partition>
+                  partitions;
Member commented:
question: I'm probably missing some context here, but do we have a sense of what the "usual" size of this vector might be? As in, whether the small_fragment_vector fragment size (1KiB I believe) is the best/safest choice.

@rockwotj (Contributor, Author) commented:

Yeah, large_fragment_vector is the other option, which is close to 128KiB and probably fine? I'm not really sure, to be honest. I'll switch.

@rockwotj (Contributor, Author) commented:

fetchable_partition_response is also small? So I guess it's expected to stay small because it's only data from this shard I believe?

Member commented:

Ya, I don't really know. I assume you could use fragmented_vector with any arbitrary fragment size if you were so inclined (rather than one of the convenience typedefs), but it's not clear to me whether there's any special reason to do so.

Certainly the 1KiB fragment size should prevent bad_alloc in all cases, so I don't want to hold this up. Just curious mostly.
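
To put rough numbers on the trade-off being discussed, here is a small, self-contained calculation. The 1KiB and ~128KiB figures come from the comments above; the 64-byte element size and 5000-partition count are made-up values for illustration, not the real sizeof(offset_fetch_response_partition) or a measured workload.

// Back-of-the-envelope comparison of fragment sizes (assumptions noted above).
#include <cstddef>
#include <cstdio>

int main() {
    constexpr std::size_t elem_bytes = 64;         // hypothetical element size
    constexpr std::size_t partitions = 5000;       // hypothetical group fetch size
    constexpr std::size_t total = elem_bytes * partitions;

    constexpr std::size_t small_frag = 1024;       // 1KiB, per small_fragment_vector
    constexpr std::size_t large_frag = 128 * 1024; // ~128KiB, per large_fragment_vector

    std::printf("one contiguous std::vector buffer: %zu bytes\n", total);
    std::printf("small fragments: %zu allocations of <= %zu bytes\n",
                (total + small_frag - 1) / small_frag, small_frag);
    std::printf("large fragments: %zu allocations of <= %zu bytes\n",
                (total + large_frag - 1) / large_frag, large_frag);
    return 0;
}

Either choice bounds the size of any individual allocation; the difference is only how many allocations are made, which is why the 1KiB fragment size is a safe default here even if it is not tuned.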

@oleiman (Member) left a comment:

lgtm

@rockwotj merged commit 10fca00 into redpanda-data:dev on Jan 3, 2024
22 checks passed
@rockwotj deleted the group-fetch-large-alloc branch on January 3, 2024 at 14:33
@vbotbuildovich (Collaborator):

/backport v23.3.x

@vbotbuildovich (Collaborator):

/backport v23.2.x

@vbotbuildovich (Collaborator):

/backport v23.1.x

@vbotbuildovich (Collaborator):

Failed to create a backport PR to v23.1.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-15918-v23.1.x-847 remotes/upstream/v23.1.x
git cherry-pick -x 7e9111efe99120b6bcad95f8e4eafefa3bba737f

Workflow run logs.

Development

Successfully merging this pull request may close these issues:

  • Oversized allocation: 327680 bytes in kafka::group::handle_offset_fetch
3 participants