
Adjust maximum value for memory_share_for_fetch in MemoryStressTest.test_fetch_with_many_partitions #11533

Merged: 1 commit merged into redpanda-data:dev on Jun 22, 2023

Conversation

@dlex (Contributor) commented on Jun 19, 2023

The failing case is an OOM when the memory reserved for fetch is 80% of Kafka memory; apparently some of the knobs in memory control do not accurately reflect what is actually happening with allocations. For this specific test, the setting should be lowered gradually until this crash is gone, and that value should become the highest recommended setting for now.

Getting it down to 0.7.

Fixes #11458
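
To illustrate the shape of the change, here is a minimal, hypothetical sketch in ducktape's parametrization style; the matrix decorator is real ducktape API, but the class body, the comments, and the exact value list are assumptions for illustration, not the actual test source:

from ducktape.mark import matrix

class MemoryStressTest:
    # Hypothetical parametrization: 0.8 was the previous upper bound for
    # memory_share_for_fetch; after this change the highest exercised value
    # is 0.7, since 0.8 still drives the broker out of memory.
    @matrix(memory_share_for_fetch=[0.4, 0.7])
    def test_fetch_with_many_partitions(self, memory_share_for_fetch):
        # Configure the cluster so that up to `memory_share_for_fetch` of the
        # Kafka memory pool may be reserved for fetch, then run the
        # many-partitions fetch workload and assert no broker crashes.
        ...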

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

  • none

The test still fails at 0.8; getting it down to 0.7
@dlex self-assigned this on Jun 20, 2023
@dlex marked this pull request as ready for review on Jun 20, 2023, 14:41
@dlex (Contributor, Author) commented on Jun 20, 2023

@michael-redpanda (Contributor) left a comment

This seems fine. My only question is whether there was a reason 0.8 was selected in the past. Are we just putting a band-aid over an actual problem by reducing this?

@dlex (Contributor, Author) commented on Jun 21, 2023

@michael-redpanda 0.8 was the maximum that avoided OOM, but apparently that is not actually the case (see this comment). Regarding the band-aid: the memory semaphore solution is itself a band-aid, see this.
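
To make the "memory semaphore" reference concrete: conceptually, a fixed share of the Kafka memory pool is set aside for fetch, and fetch requests must reserve units from that pool before allocating buffers. Below is a purely illustrative Python sketch of that accounting idea, an assumption added for explanation only, since the actual mechanism lives in the broker's C++ code and behaves differently in detail:

import threading

class FetchMemorySemaphore:
    """Conceptual model: cap in-flight fetch buffer usage at a share of memory."""

    def __init__(self, total_memory_bytes, memory_share_for_fetch):
        # With memory_share_for_fetch = 0.7, at most 70% of the memory pool
        # can be held by in-flight fetches at any one time.
        self._available = int(total_memory_bytes * memory_share_for_fetch)
        self._cond = threading.Condition()

    def acquire(self, nbytes):
        # Block until `nbytes` can be reserved; the real broker instead trims
        # or delays fetches, but the accounting idea is the same.
        with self._cond:
            while self._available < nbytes:
                self._cond.wait()
            self._available -= nbytes

    def release(self, nbytes):
        # Return the reservation once the fetch response has been sent.
        with self._cond:
            self._available += nbytes
            self._cond.notify_all()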

@dlex (Contributor, Author) commented on Jun 21, 2023

All CI failures are unrelated to this change.

@piyushredpanda merged commit 49c85d4 into redpanda-data:dev on Jun 22, 2023
18 checks passed
@vbotbuildovich (Collaborator): /backport v23.1.x

@vbotbuildovich (Collaborator): /backport v22.3.x

@vbotbuildovich (Collaborator): Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-11533-v23.1.x-557 remotes/upstream/v23.1.x
git cherry-pick -x 51b3333a0bedb061307e422a0790bed0ef67f3d0

Workflow run logs.

@vbotbuildovich (Collaborator): Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-11533-v22.3.x-438 remotes/upstream/v22.3.x
git cherry-pick -x 51b3333a0bedb061307e422a0790bed0ef67f3d0

Workflow run logs.

Successfully merging this pull request may close these issues.

CI Failure (Redpanda process unexpectedly stopped) in MemoryStressTest.test_fetch_with_many_partitions