Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable large alloc warnings above 128KiB by default #16828

Merged
merged 2 commits into from
Mar 2, 2024

Conversation

StephanDollberg
Copy link
Member

We already have the reactor stall warnings on in "prod" and arguably the
large allocs are more important as they actually might result in a
crash.

Further the large alloc threshold automatically increases after each
warning so it's unlikely for this to be very spamy.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x
  • v23.1.x

Release Notes

Improvements

  • Large allocations are now logged by default (similar to reactor stalls)

We already have the reactor stall warnings on in "prod" and arguably the
large allocs are more important as they actually might result in a
crash.

Further the large alloc threshold automatically increases after each
warning so it's unlikely for this to be very spamy.
Remove the passing of the large alloc warning threshold in DT.

We now set it to 128KiB by default in the binary so we don't want to
override it 256KiB in DT.

Note this doesn't affect the actual BadLogLines check as that has a
separate limit of 400KiB for marking log lines as hard failures.
@StephanDollberg StephanDollberg force-pushed the stephan/enable-large-alloc-warning-by-default branch from bd37c88 to a9003cc Compare March 1, 2024 12:33
@redpanda-data redpanda-data deleted a comment from vbotbuildovich Mar 1, 2024
@redpanda-data redpanda-data deleted a comment from vbotbuildovich Mar 1, 2024
@@ -158,7 +158,7 @@ node_config::node_config() noexcept
"memory_allocation_warning_threshold",
"Enables log messages for allocations greater than the given size.",
{.visibility = visibility::tunable},
std::nullopt)
128_KiB + 1) // 128 KiB is the largest allowed allocation size
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the warning triggers > or >= the threshold but I guess this works OK either way since acutal 128K + 1 allocations would seem to be extremely rare.

@@ -3432,12 +3432,6 @@ def write_node_conf_file(self,
except:
cur_ver = None

# This node property isn't available on versions of RP older than 23.2.
if cur_ver and cur_ver >= (23, 2, 0):
memory_allocation_warning_threshold_bytes = 256 * 1024 # 256 KiB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stephan or @ballard26 do you know why this was set to 256K in the first place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally scraping all the test results to find oversized allocations. And to avoid too much log spam I set it to 512K initially, then lowered it to 256K, with the eventual goal of lowering it to 128K once the larger oversized allocations were fixed.

@piyushredpanda piyushredpanda merged commit b47aa4c into dev Mar 2, 2024
17 checks passed
@piyushredpanda piyushredpanda deleted the stephan/enable-large-alloc-warning-by-default branch March 2, 2024 00:00
@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants