Fixed node_hash_map caused oversized allocations in cluster module #17423

Merged: 3 commits merged into redpanda-data:dev on Mar 27, 2024

@mmaslankaprv mmaslankaprv commented Mar 27, 2024

Replaced absl::node_hash_map with the newly introduced chunked_hash_map to prevent the oversized allocations needed for the indexing array.
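To give a rough sense of why the indexing array gets large, here is a back-of-the-envelope sketch. It assumes one pointer-sized slot plus one control byte per bucket in a single contiguous allocation, a power-of-two bucket count, and a maximum load factor of 7/8. The 128 KiB threshold and all names below are hypothetical and only serve the illustration; none of them are taken from this PR.

```cpp
#include <cstddef>

// Assumed layout, for illustration only: one pointer-sized slot plus one
// control byte per bucket, all in a single contiguous allocation.
constexpr std::size_t bytes_per_bucket = sizeof(void*) + 1;

// Hypothetical size above which an allocation counts as "oversized".
constexpr std::size_t oversized_threshold = 128 * 1024;

// Smallest power-of-two bucket count that keeps the load factor at or
// below 7/8 (an assumed maximum load factor).
constexpr std::size_t bucket_count_for(std::size_t elements) {
    std::size_t buckets = 1;
    while (buckets * 7 / 8 < elements) {
        buckets *= 2;
    }
    return buckets;
}

// Estimated size in bytes of the contiguous index allocation.
constexpr std::size_t index_bytes_for(std::size_t elements) {
    return bucket_count_for(elements) * bytes_per_bucket;
}

// True when the estimated index allocation exceeds the assumed threshold.
constexpr bool index_is_oversized(std::size_t elements) {
    return index_bytes_for(elements) > oversized_threshold;
}
```

Under these assumptions the index for a thousand entries stays small, while the index for a hundred thousand entries needs well over the assumed threshold in one contiguous block, even though each element itself lives in its own node.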

Fixes: #16960
Fixes: #16959
Fixes: #16951

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@StephanDollberg StephanDollberg left a comment

Looks fine from what I can tell.

Reminder that we lose pointer stability with this change, so it's worth checking whether anything relies on it.
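For context on the pointer-stability point, here is a minimal sketch using only the standard library. `std::unordered_map` stands in for a node-based map and a plain `std::vector` of pairs stands in for flat, array-backed storage. Neither is `chunked_hash_map`, and both function names are hypothetical. Addresses are compared as integers so the stale pointer is never dereferenced.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Node-based map: every element lives in its own heap node, so a rehash
// rearranges buckets but never moves the elements themselves.
inline bool node_map_address_survives_growth() {
    std::unordered_map<int, std::string> m;
    m.emplace(0, "first");
    const auto before = reinterpret_cast<std::uintptr_t>(&m.at(0));
    for (int i = 1; i < 10000; ++i) {
        m.emplace(i, "filler"); // forces several rehashes
    }
    const auto after = reinterpret_cast<std::uintptr_t>(&m.at(0));
    return before == after;
}

// Flat storage: elements live inline in one growable array, so growing
// the array moves every element to a new buffer.
inline bool flat_storage_address_survives_growth() {
    std::vector<std::pair<int, std::string>> flat;
    flat.reserve(1);
    flat.emplace_back(0, "first");
    const auto before
      = reinterpret_cast<std::uintptr_t>(&flat.front().second);
    const std::size_t old_capacity = flat.capacity();
    int next_key = 1;
    // Append until exactly one reallocation has happened.
    while (flat.capacity() == old_capacity) {
        flat.emplace_back(next_key++, "filler");
    }
    const auto after
      = reinterpret_cast<std::uintptr_t>(&flat.front().second);
    return before == after;
}
```

Any caller that keeps a pointer or reference to a mapped value across an insertion is safe with the first kind of container and broken with the second, which is the kind of use that would need to be checked.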

@vbotbuildovich

new failures in https://buildkite.com/redpanda/redpanda/builds/46871#018e7f8e-df46-41e6-8570-d2f18f4137a1:

"rptest.tests.controller_snapshot_test.ControllerSnapshotTest.test_join_restart_catch_up"

Replaced the `node_hash_map` data structures that were used in `topic_table`
to store topic- and partition-specific data with chunked hash maps to
avoid a large allocation for the index table.

Fixes: redpanda-data#16960
Signed-off-by: Michał Maślanka <michal@redpanda.com>

Fixes: redpanda-data#16959
Signed-off-by: Michał Maślanka <michal@redpanda.com>

Fixes: redpanda-data#16951
Signed-off-by: Michał Maślanka <michal@redpanda.com>

dotnwat commented Mar 27, 2024

Reminder that we lose pointer stability with this change, so it's worth checking whether anything relies on it.

This is something I don't know how to address in a sane way. I can't think of a way to verify this without auditing all the uses of the container.

@StephanDollberg

This is something I don't know how to address in a sane way. I can't think of a way to verify this without auditing all the uses of the container.

Yeah, not sure. I am pretty sure you can write some form of clang-tidy analyzer (IIRC that's what Google did with Abseil), but it's certainly not easy.

Another possibility is to use debug iterators. We actually get those implicitly, since we use chunked_vector under the hood, and I think this helped Michal find an issue.
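As a rough illustration of the debug-iterator idea: one common way to catch stale references at runtime is a generation counter that the container bumps on every mutation and that each handle remembers. The sketch below is a hypothetical `checked_vector` written for this example only. Whether chunked_vector implements its checks this way is not stated in this thread.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical container, for illustration only.
template <typename T>
class checked_vector {
public:
    // A handle remembers the container generation it was created in.
    class handle {
    public:
        handle(const checked_vector* owner, std::size_t index)
          : _owner(owner)
          , _index(index)
          , _generation(owner->_generation) {}

        // True while the container has not been modified since creation.
        bool valid() const { return _generation == _owner->_generation; }

        // Checked access: fails loudly instead of reading moved storage.
        const T& get() const {
            if (!valid()) {
                throw std::logic_error(
                  "handle used after container was modified");
            }
            return _owner->_data[_index];
        }

    private:
        const checked_vector* _owner;
        std::size_t _index;
        std::size_t _generation;
    };

    // Every mutation bumps the generation, invalidating older handles.
    handle push_back(T value) {
        _data.push_back(std::move(value));
        ++_generation;
        return handle(this, _data.size() - 1);
    }

    std::size_t size() const { return _data.size(); }

private:
    std::vector<T> _data;
    std::size_t _generation = 0;
};
```

With a check like this enabled in debug builds, a test that holds a handle across an insertion fails immediately at the point of use, rather than silently reading freed or moved memory.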

@dotnwat dotnwat merged commit 8ee4914 into redpanda-data:dev Mar 27, 2024
17 checks passed
@vbotbuildovich

/backport v23.3.x

@vbotbuildovich

Failed to create a backport PR to v23.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-17423-v23.3.x-3 remotes/upstream/v23.3.x
git cherry-pick -x 88ab0b0c1d1526ffabe4c579cee5b27f1a4be3cd 1f0428e02b2eb4f47f032a6dc55714b0b7cc84be ca9522b43a4b4fbd1e0e49c930e84cc9aff4bb39
