Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ingester remains in the LEAVING state if starting() terminates #6923

Merged
merged 4 commits into from
Dec 13, 2023

Conversation

duricanikolic
Copy link
Contributor

What this PR does

If starting the ingester service (Ingester.starting()) fails for any reason, the corresponding ingester might be in the ring in an inconsistent state. For example, if an ingester receives the SIGTERM while opening TSDBs (replaying WAL), the TSDB opening intentionally doesn't stop, although the context is cancelled (to reduce the likelihood of on-disk corruptions). Nevertheless, the ring lifecycler gets started, and the ingester switches from LEAVING to ACTIVE state in the ring, although the Ingester.starting() function was interrupted due to the context cancellation. The ingester, although terminated, remains in the ACTIVE state because Ingester.stopping() wasn’t called after the failed Ingester.starting().

This PR fixes the problem explained above, by explicitly stopping ingester's lifecycler, in case of an error during Ingester.startint().

Which issue(s) this PR fixes or relates to

Fixes #6753

Checklist

  • Tests updated.
  • [na] Documentation added.
  • [na] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@duricanikolic duricanikolic self-assigned this Dec 13, 2023
@duricanikolic duricanikolic requested a review from a team as a code owner December 13, 2023 10:05
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the test is still doing too much.

Why do we need to start and stop ingester first? We can setup the ring to initial state that we want to test.

I think we should also add a scenario, where there is no ring entry before ingester starts, and it fails on first start.

@pstibrany
Copy link
Member

Why do we need to start and stop ingester first? We can setup the ring to initial state that we want to test.

Thinking about it, setting up the ring may end up being same amount of code :(

@duricanikolic
Copy link
Contributor Author

I think we should also add a scenario, where there is no ring entry before ingester starts, and it fails on first start.

@pstibrany I can add this scenario, but this will additionally increase the size of this source.

@duricanikolic
Copy link
Contributor Author

Why do we need to start and stop ingester first? We can setup the ring to initial state that we want to test.

Thinking about it, setting up the ring may end up being same amount of code :(

We can move all the auxiliary structs as well as methods used for tests' intialization in a separate source (e.g., ingester_util_test.go), in order to reduce the size of this source.

Copy link
Member

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix lgtm.

Signed-off-by: Yuri Nikolic <durica.nikolic@grafana.com>
@duricanikolic duricanikolic merged commit 1c97e32 into main Dec 13, 2023
28 checks passed
@duricanikolic duricanikolic deleted the yuri/ingester-consistent-state branch December 13, 2023 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ingester left in a inconsistent state in the ring if terminated while starting()
2 participants