
Sidecar and Compact: Out of order block uploads lead to overlaps and Compact critical errors #2753

Closed

gburek-fastly opened this issue Jun 10, 2020 · 5 comments

@gburek-fastly
Thanos, Prometheus and Golang version used:

thanos, version 0.12.2 (branch: HEAD, revision: 52e10c6)
build user: root@c1a6cf60f03d
build date: 20200430-16:24:03
go version: go1.13.6

prometheus, version 2.18.1 (branch: HEAD, revision: ecee9c8abfd118f139014cb1b174b08db3f342cf)
build user: root@2117a9e64a7e
build date: 20200507-16:51:47
go version: go1.14.2

Object Storage Provider:

Google Cloud Storage

What happened:

Compact ran into an overlapped block error:

Jun 05 08:09:29 caller=compact.go:460 msg="critical error detected; halting" err="compaction: group 0@15585302070356120162: pre compaction overlap check: overlaps found while gathering blocks. [mint: 1591185600000, maxt: 1591192800000, range: 2h0m0s, blocks: 2]: <ulid: 01EA1R7GMDZP9XBPQBYBJPES3W, mint: 1591171200000, maxt: 1591200000000, range: 8h0m0s>, <ulid: 01E9XB4ECJ3NPQ08PBS7XHAXZ6, mint: 1591185600000, maxt: 1591192800000, range: 2h0m0s>"

Looking at the source system, it appears that it was isolated from the internet for several days due to an upstream network failure. Once the network was accessible again, it began uploading blocks, but failed on 01E9XB4ECJ3NPQ08PBS7XHAXZ6 at 07:52:19. That block was later uploaded at 08:05:51:

Jun 03 11:00:12 caller=shipper.go:347 msg="upload new block" id=01E9WXCZWJA9TXDEY0JJ6KACNR
Jun 03 13:00:41 caller=sidecar.go:340 err="iter local block metas: check exists: Get https://storage.googleapis.com/storage/v1/b/<bucket>/o/01E9X48Q4J2X91Y6WPDW6ST3T7%2Fmeta.json?alt=json&prettyPrint=false&projection=full: oauth2: cannot fetch token: Post https://oauth2.googleapis.com/token: dial tcp: i/o timeout" uploaded=0
...
Jun 05 07:52:19 caller=sidecar.go:340 err="iter local block metas: check exists: Get https://storage.googleapis.com/storage/v1/b/<bucket>/o/01E9X48Q4J2X91Y6WPDW6ST3T7%2Fmeta.json?alt=json&prettyPrint=false&projection=full: oauth2: cannot fetch token: Post https://oauth2.googleapis.com/token: dial tcp 172.217.20.106:443: connect: network is unreachable" uploaded=0
Jun 05 07:52:42 caller=shipper.go:347 msg="upload new block" id=01E9X48Q4J2X91Y6WPDW6ST3T7
Jun 05 07:53:10 caller=shipper.go:347 msg="upload new block" id=01E9XB4ECJ3NPQ08PBS7XHAXZ6
Jun 05 07:59:06  caller=shipper.go:313 msg="shipping failed" block=01E9XB4ECJ3NPQ08PBS7XHAXZ6 err="upload chunks: upload file /data/thanos/upload/01E9XB4ECJ3NPQ08PBS7XHAXZ6/chunks/000001 as 01E9XB4ECJ3NPQ08PBS7XHAXZ6/chunks/000001: Post https://storage.googleapis.com/upload/storage/v1/b/<bucket>/o?alt=json&prettyPrint=false&projection=full&uploadType=resumable&upload_id=AAANsUljcCy9pr7Ob_I23MocUc_Iob8dhIv_3X1HwW2PrySiehs-3uXBZLRwAMbkwDh84HqPsxmlgnTopL
Jun 05 07:59:06 caller=shipper.go:347 msg="upload new block" id=01E9XJ05MJNCDZ3WX87WKB85P1
...
Jun 05 08:05:51 caller=sidecar.go:340 err="failed to sync 1 blocks" uploaded=21
Jun 05 08:05:51 caller=shipper.go:347 msg="upload new block" id=01E9XB4ECJ3NPQ08PBS7XHAXZ6

In between the first upload failure of 01E9XB4ECJ3NPQ08PBS7XHAXZ6 and the retry, compact compacted the blocks that were present:

Jun 05 08:05:34 caller=compact.go:633 group="0@{replica=\"oob-osl6508\", site=\"osl65\"}" groupKey=0@15585302070356120162 msg="compaction available and planned; downloading blocks" plan="[/data/compact/0@15585302070356120162/01E9WXCZWJA9TXDEY0JJ6KACNR /data/compact/0@15585302070356120162/01E9X48Q4J2X91Y6WPDW6ST3T7 /data/compact/0@15585302070356120162/01E9XJ05MJNCDZ3WX87WKB85P1]"
Jun 05 08:05:48 caller=fetcher.go:451 component=block.BaseFetcher msg="successfully synchronized block metadata" duration=25.221939741s cached=27921 returned=27921 partial=3
Jun 05 08:05:49 caller=compact.go:692 group="0@{replica=\"oob-osl6508\", site=\"osl65\"}" groupKey=0@15585302070356120162 msg="downloaded and verified blocks; compacting blocks" plan="[/data/compact/0@15585302070356120162/01E9WXCZWJA9TXDEY0JJ6KACNR /data/compact/0@15585302070356120162/01E9X48Q4J2X91Y6WPDW6ST3T7 /data/compact/0@15585302070356120162/01E9XJ05MJNCDZ3WX87WKB85P1]" duration=15.079950156s
Jun 05 08:05:57 caller=compact.go:441 msg="compact blocks" count=3 mint=1591171200000 maxt=1591200000000 ulid=01EA1R7GMDZP9XBPQBYBJPES3W sources="[01E9WXCZWJA9TXDEY0JJ6KACNR 01E9X48Q4J2X91Y6WPDW6ST3T7 01E9XJ05MJNCDZ3WX87WKB85P1]" duration=7.489969252s
Jun 05 08:05:57 caller=compact.go:722 group="0@{replica=\"oob-osl6508\", site=\"osl65\"}" groupKey=0@15585302070356120162 msg="compacted blocks" new=01EA1R7GMDZP9XBPQBYBJPES3W blocks="[/data/compact/0@15585302070356120162/01E9WXCZWJA9TXDEY0JJ6KACNR /data/compact/0@15585302070356120162/01E9X48Q4J2X91Y6WPDW6ST3T7 /data/compact/0@15585302070356120162/01E9XJ05MJNCDZ3WX87WKB85P1]" duration=7.529746624s overlapping_blocks=false
Jun 05 08:06:06 caller=compact.go:764 group="0@{replica=\"oob-osl6508\", site=\"osl65\"}" groupKey=0@15585302070356120162 msg="uploaded block" result_block=01EA1R7GMDZP9XBPQBYBJPES3W duration=8.934421357s

Compact is configured with --consistency-delay=2h --delete-delay=6h

It appears that a transient upload error for one block, among many that were several days old, resulted in a race between the uploading sidecar and compact: a compacted block was formed before all of the blocks that should have been included in it were successfully uploaded.

This overlap issue seems to be at odds with the documented expectation in https://thanos.io/operating/troubleshooting.md/#overlaps

What you expected to happen:

I would expect compact to wait for consistency-delay after a block is uploaded to the object store before making it available for compaction. It appears to rely on the ULID to calculate a block's age rather than on the object creation time of meta.json, which is problematic when the block's age does not correspond to its age in the object store.
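A minimal sketch of what I mean, assuming the consistency-delay check compares the ULID timestamp to the current time (the function name here is hypothetical, not the actual Thanos filter):

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"

	"github.com/oklog/ulid"
)

// olderThanConsistencyDelay is a hypothetical, simplified version of the age
// check: it derives the block's age from the ULID timestamp (i.e. when the
// block was cut on the source Prometheus), not from when meta.json appeared
// in the bucket.
func olderThanConsistencyDelay(id ulid.ULID, delay time.Duration) bool {
	ageMs := ulid.Now() - id.Time()
	return ageMs >= uint64(delay.Milliseconds())
}

func main() {
	// A block cut two days ago but uploaded only seconds ago passes the
	// check immediately, which is the race described above.
	created := time.Now().Add(-48 * time.Hour)
	id := ulid.MustNew(ulid.Timestamp(created), rand.Reader)
	fmt.Println(olderThanConsistencyDelay(id, 2*time.Hour)) // true
}
```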

How to reproduce it (as minimally and precisely as possible):

Set up thanos compact to run against an objstore bucket. Set up a prometheus install with several hours of uncompacted blocks already present. Move one block out of the prometheus data directory. Start thanos sidecar so that it uploads all existing blocks. Watch compact compact the several-hours-old blocks. Move the reserved block back into the prometheus data directory and restart thanos sidecar. The block will then be uploaded and compact will halt.

@bwplotka (Member)

Interestingly, I think we upload blocks in the order they appear on the filesystem, so oldest first, but we might want to double-check. 👍

We will look into this this week, thanks. If you have any further pointers on how to "reliably" reproduce overlaps, let us know:

Set up thanos compact to run against an objstore bucket. Set up a prometheus install with several hours of uncompacted blocks already present. Move one block out of the prometheus data directory. Start thanos sidecar so that it uploads all existing blocks. Watch compact compact the several hours old blocks. Move reserved block back into the prometheus directory and restart thanos sidecar. It should then upload and compact will halt.

This sounds like an amazing pointer; it will help us a lot.

@bwplotka (Member)

As per

`return metas[i].BlockMeta.MinTime < metas[j].BlockMeta.MinTime`

the upload is done from the oldest block to the newest.
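For context, a minimal sketch of that ordering with placeholder types (the real shipper uses its own block metadata type; only the comparison above is taken from the source):

```go
package main

import (
	"fmt"
	"sort"
)

// blockMeta is a stand-in for the shipper's block metadata; only the fields
// needed to show the ordering are included.
type blockMeta struct {
	ULID    string
	MinTime int64
}

func main() {
	metas := []blockMeta{
		{ULID: "01E9XB4ECJ3NPQ08PBS7XHAXZ6", MinTime: 1591185600000},
		{ULID: "01E9WXCZWJA9TXDEY0JJ6KACNR", MinTime: 1591171200000},
	}
	// Same comparison as the quoted line: sort by MinTime so blocks are
	// shipped oldest-first.
	sort.Slice(metas, func(i, j int) bool {
		return metas[i].MinTime < metas[j].MinTime
	})
	fmt.Println(metas[0].ULID) // the oldest block goes first
}
```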

I improved readability a bit here: #2764

I would expect compact to wait for consistency-delay after a block was uploaded to the object store to make it available for compaction. It appears to rely on the ULID to calculate age rather than the object creation time of metadata.json, which is problematic when the block age does not correspond to its age in the object store

Yes, that's the plan; I think there is an issue about it already.

It appears that a transient upload error of one block among many, that were several days old resulted in a race between an uploading sidecar and compact where a compacted block was formed before all blocks that should have been included in it were successfully uploaded.

Ah, this is actually the root cause indeed.

It's here:

// though this one failed. It will be retried on second Sync iteration.

This is because if a single block cannot be uploaded, we still want at least a partial upload. I can agree this is not the best decision, as the failure here is only a transport error. I think we should retry with backoff but never move on to other blocks. 👍
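A hypothetical sketch of that behaviour (names, retry counts, and backoff values are illustrative, not the actual shipper code):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// syncOldestFirst retries a failing block upload with exponential backoff
// and, if it still fails, aborts the whole sync pass instead of moving on to
// newer blocks, so a newer block can never land in the bucket before an
// older one.
func syncOldestFirst(ctx context.Context, ids []string, upload func(context.Context, string) error) error {
	for _, id := range ids { // ids assumed sorted oldest-first
		backoff := time.Second
		var err error
		for attempt := 0; attempt < 5; attempt++ {
			if err = upload(ctx, id); err == nil {
				break
			}
			select {
			case <-time.After(backoff):
				backoff *= 2
			case <-ctx.Done():
				return ctx.Err()
			}
		}
		if err != nil {
			// Stop here; this block and all newer ones wait for the next Sync.
			return fmt.Errorf("upload %s: %w", id, err)
		}
	}
	return nil
}

func main() {
	err := syncOldestFirst(context.Background(),
		[]string{"01E9XB4ECJ3NPQ08PBS7XHAXZ6", "01E9XJ05MJNCDZ3WX87WKB85P1"},
		func(ctx context.Context, id string) error { return nil })
	fmt.Println(err) // <nil>
}
```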

Will fix on master

@bwplotka (Member)

In the meantime you can enable vertical compaction, which works well so far (although it is experimental).

As a hack, you can enable it now in 0.12.2 with --deduplication.replica-label=some-not-existing-label.
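For example (the data dir and bucket config file name are placeholders; the label only needs to be one that no block actually has):

```
thanos compact \
  --data-dir=/data/compact \
  --objstore.config-file=bucket.yml \
  --consistency-delay=2h \
  --delete-delay=6h \
  --deduplication.replica-label=some-not-existing-label
```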

@bwplotka (Member)

Fixed #2765

@bwplotka (Member)

bwplotka commented Jun 15, 2020

It should be part of v0.13.0-rc.2

In terms of ULID for partial uploads, we have this ticket: #2470
