While debugging downsampling last week, I discovered at least one block that contained duplicated and out-of-order data.
For about 10% of its series, a sequence of 3 chunks was repeated about 180 times, followed by two chunks covering times before that sequence. I cannot explain how this could have happened. In principle, the original block with those 3 chunks might not have been GC'd and then been accidentally re-compacted over several iterations, but that alone should not get us to 180 repetitions.
As far as I can tell, no data loss occurred and this should be fully recoverable.
Even though the duplication is severe, the total data blow-up is fairly minimal AFAICT.
As we have no explanation of what caused this, the best ways to address it seem to be:

- Add handling to our normal reads and downsampling that accounts for the issue, so it is not user-facing.
- Add verification logic to our compactor that aborts if it detects such a case again. This way we detect the issue right away and have a chance to debug it properly, rather than only after several more compaction iterations (a sketch of such a check follows this list).
- Add a `thanos bucket check` command that walks existing blocks and detects the issue. It could also be extended to re-write affected blocks properly.
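For illustration, here is a minimal Go sketch of the per-series verification the compactor or a `thanos bucket check` command could run. The `ChunkMeta` struct and `verifyChunkOrder` function are hypothetical stand-ins, not existing Thanos APIs; the only assumption is that each chunk's index entry carries an inclusive `[MinTime, MaxTime]` range, as in Prometheus TSDB.

```go
package main

import "fmt"

// ChunkMeta is a hypothetical stand-in for per-chunk index metadata;
// in Prometheus TSDB each chunk covers an inclusive [MinTime, MaxTime] range.
type ChunkMeta struct {
	MinTime, MaxTime int64
}

// verifyChunkOrder returns an error for the first chunk whose time range
// overlaps with or precedes the previous chunk, i.e. the duplicated /
// out-of-order condition described above.
func verifyChunkOrder(chunks []ChunkMeta) error {
	for i := 1; i < len(chunks); i++ {
		prev, cur := chunks[i-1], chunks[i]
		if cur.MinTime <= prev.MaxTime {
			return fmt.Errorf("chunk %d [%d, %d] overlaps or is out of order with chunk %d [%d, %d]",
				i, cur.MinTime, cur.MaxTime, i-1, prev.MinTime, prev.MaxTime)
		}
	}
	return nil
}

func main() {
	// The pathological pattern from this report: a 3-chunk sequence repeated,
	// followed by chunks for times before that sequence.
	broken := []ChunkMeta{
		{0, 100}, {101, 200}, {201, 300},
		{0, 100}, {101, 200}, {201, 300}, // duplicated sequence
		{-200, -100}, // out-of-order chunk at the end
	}
	if err := verifyChunkOrder(broken); err != nil {
		fmt.Println("detected:", err) // compactor would abort here
	}
}
```

On a clean block the condition never fires, so a check like this should be cheap enough to run on every compaction.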