To Reproduce
We are still working on reproducing this, but these are the likely steps:
1. Ingest OOO samples that are not too old (a few minutes old) alongside in-order data. Ideally, some in-order series should have no OOO samples, so that the gaps are easy to see.
2. Restart the ingesters right after a compaction cycle.
3. The in-order series should now have some gaps.
Expected behavior
No gaps/loss after a restart.
Environment
Mimir r190
Describe the bug
EDIT: The data gaps happen in the in-order data, not in the OOO data.
We initialize the head block with minValidTime set to the last block's maxt:
https://github.com/grafana/mimir-prometheus/blob/1446b53d874c0309d8f99749ced5e1c0637cf245/tsdb/db.go#L839-L847
This means that during the WAL replay, all samples before minValidTime are discarded.
This is fine when there is only in-order data. But when there is OOO data, all of it is compacted after the in-order head's compaction, so a sample in the OOO head could be just a minute old. Compacting it produces a block whose data is only a minute old.
So if you happen to restart right after compaction, then instead of getting an hour of data back from the WAL replay, you might be left with just a few samples in the series, because the out-of-order block became the last block and moved minValidTime forward.
This is a blocker for 2.2; we will work on a fix today. cc @colega