Reconsider bg_worker runtime size and dedicated heartbeat tasks #4128

Closed
sunng87 opened this issue Jun 10, 2024 · 0 comments · Fixed by #4129
Labels
C-enhancement Category Enhancements

Comments


sunng87 commented Jun 10, 2024

What type of enhancement is this?

Refactor

What does the enhancement do?

In a resource-constrained environment, we saw bg_worker tasks (flush) consume all available CPU, with the result that:

  1. Heartbeat tasks are blocked, which puts the node into a read-only state (see "Region 265098266411008(61723, 0) is in ReadOnly state, expect: Writable" #4122).
  2. The k8s HTTP liveness probe fails with a timeout on simple requests such as curling /health.

The proposal is to:

  1. Change the default size of bg_worker to half the number of CPUs, so that even in the worst case it cannot take every CPU away from other tasks.
  2. Move critical tasks such as heartbeat to a dedicated runtime, so they are not affected when bg_worker fills up with CPU-intensive tasks.
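
The two points above could be sketched roughly as follows. This is a minimal, std-only illustration and not the project's actual implementation: `default_bg_worker_size`, `detected_cpus`, and `spawn_heartbeat` are hypothetical names, and a plain OS thread stands in for the dedicated runtime. The floor of one worker on single-CPU machines is also an assumption, since the proposal does not say how to round.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical helper: size the bg_worker pool to half the CPUs,
/// but never below one worker (assumed rounding rule).
pub fn default_bg_worker_size(cpus: usize) -> usize {
    (cpus / 2).max(1)
}

/// Number of CPUs visible to the process, falling back to 1
/// if the platform cannot report it.
pub fn detected_cpus() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Hypothetical stand-in for a dedicated heartbeat runtime: the loop runs
/// on its own named thread, so it never queues behind bg_worker tasks.
/// `beats` counts ticks in place of a real heartbeat request.
pub fn spawn_heartbeat(
    interval: Duration,
    beats: Arc<AtomicU64>,
    stop: Arc<AtomicBool>,
) -> JoinHandle<()> {
    thread::Builder::new()
        .name("heartbeat".to_string())
        .spawn(move || {
            while !stop.load(Ordering::Relaxed) {
                // A real implementation would send the heartbeat here.
                beats.fetch_add(1, Ordering::Relaxed);
                thread::sleep(interval);
            }
        })
        .expect("failed to spawn heartbeat thread")
}
```

With this split, a saturated worker pool can no longer starve the heartbeat of a thread to run on, and capping the pool at half the CPUs leaves headroom for the OS scheduler to actually run it alongside the /health handler.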

Implementation challenges

No response
