Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid race in LocalHealthMonitor w.r.t null values #105065

Closed
nielsbauman opened this issue Feb 2, 2024 · 1 comment · Fixed by #105674
Closed

Avoid race in LocalHealthMonitor w.r.t null values #105065

nielsbauman opened this issue Feb 2, 2024 · 1 comment · Fixed by #105674
Labels

Comments

@nielsbauman
Copy link
Contributor

Follow up on this thread.

There is a race condition in the LocalHealthMonitor regarding null values (i.e. initial/reset values):

  • Last reported value was null
  • there is new a health node
  • New run starts the update process and records previous value null
  • health node/master change
  • new reported value is null
  • Run sees null and considers that it's ok to update the value....

Can probably relatively easily be fixed by using some form of ref counting/versioning of the lastReportedValue.

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

elasticsearchmachine pushed a commit that referenced this issue Mar 8, 2024
Update the implementation of the `LocalHealthMonitor` to ensure there is
always only 1 `UpdateHealthInfoCacheAction` request in-flight, and that
only 1 monitoring thread can update the "internal state" (i.e. the
`lastReportedValue` in the `HealthTracker`s) at the same time. By
enforcing this, we're avoiding potential concurrency issues. This PR
also includes several tests that aim to validate the behaviour of the
`LocalHealthMonitor` in specific multi-threaded situations.

Resolves #105065
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants