This repository has been archived by the owner on Jul 19, 2023. It is now read-only.

Loss of data on restart #688

Closed
kolesnikovae opened this issue May 12, 2023 · 3 comments
Labels: area/database, kind/bug (Something isn't working)

Comments

@kolesnikovae
Contributor

kolesnikovae commented May 12, 2023

A restart may result in losing the head block:
image

Log from one of the ingesters (a cluster of 4 ingesters with the default replication factor). Apparently, the shutdown didn't finish gracefully:
image

Interestingly, none of the other ingesters finished shutdown either. There are no panics or explicit errors in the log; however, most likely this indicates a problem with log collection.

It seems that under some circumstances a block is not picked up after the restart, and the head is getting polluted with chunks:

    / $ ls -la /data/1218/head | wc -l
    276
    / $ du -sh /data/1218/head
    49.9G   /data/1218/head

@kolesnikovae
Contributor Author

Also see grafana/pyroscope#2076

@cyriltovena
Collaborator

Possibly we need to increase the termination grace period:

    deployment.mixin.spec.template.spec.withTerminationGracePeriodSeconds(4800)

@kolesnikovae
Contributor Author

The termination grace period was added in #837. Closing for now.

@kolesnikovae added the kind/bug (Something isn't working) and area/database labels and removed the area/database label on Jul 13, 2023