sidecar: Upload blocks from oldest #1670

Closed
bwplotka opened this issue Oct 21, 2019 · 2 comments · Fixed by #1679

Comments

@bwplotka
Member

bwplotka commented Oct 21, 2019

Ref: https://cloud-native.slack.com/archives/CK5RSSC10/p1571650833206600

During a long network partition, the sidecar can end up uploading blocks in the wrong order, which can cause the compactor to miss a block during compaction. The result is overlapping blocks, and the compactor halts in that case. The read path still works.

AC:

  • Sort local blocks based on min/max time
  • Optional: Add a repair to the verifier that uses vertical compaction. This would allow manual repair in those cases (the alternative is to slightly change the meta of the overlapped block OR remove it)
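
A minimal sketch of the first AC item, sorting local blocks so the oldest is uploaded first. The `blockMeta` type and `sortOldestFirst` helper are hypothetical names used only for illustration, not the actual shipper code; the sketch assumes each block exposes its min/max time as millisecond timestamps.

```go
package main

import (
	"fmt"
	"sort"
)

// blockMeta is a hypothetical, trimmed-down stand-in for a block's metadata.
// Only the fields needed for ordering are kept.
type blockMeta struct {
	ULID    string
	MinTime int64 // assumed: milliseconds since epoch
	MaxTime int64 // assumed: milliseconds since epoch
}

// sortOldestFirst orders blocks by MinTime and breaks ties by MaxTime,
// so that the oldest block is the first candidate for upload.
func sortOldestFirst(metas []blockMeta) {
	sort.SliceStable(metas, func(i, j int) bool {
		if metas[i].MinTime != metas[j].MinTime {
			return metas[i].MinTime < metas[j].MinTime
		}
		return metas[i].MaxTime < metas[j].MaxTime
	})
}

func main() {
	// Blocks as they might be found on disk, in arbitrary order.
	metas := []blockMeta{
		{ULID: "newest", MinTime: 7200000, MaxTime: 10800000},
		{ULID: "oldest", MinTime: 0, MaxTime: 3600000},
		{ULID: "middle", MinTime: 3600000, MaxTime: 7200000},
	}

	sortOldestFirst(metas)

	for _, m := range metas {
		fmt.Printf("upload %s [%d, %d]\n", m.ULID, m.MinTime, m.MaxTime)
	}
}
```

Sorting on the sidecar side does not by itself repair blocks that were already uploaded out of order; that is what the optional verifier repair would be for.
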
@d-ulyanov
Contributor

Just to add an example:
In our case we have 2 blocks, block №1 (level=2) and block №2 (level=1), and the compactor failed with an error saying that these blocks overlap.

Time ranges:
Block №1:

17-10-2019 00:00:00
17-10-2019 08:00:00

Block №2:

17-10-2019 04:56:24
17-10-2019 06:00:00

Definitely, block №2 should have been included in block №1, but it was not.
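
To make the overlap concrete, here is a small sketch that checks the two time ranges above. The `timeRange` type and the `mustRange`, `overlaps` and `contains` helpers are hypothetical names for illustration only, the timestamps are assumed to be UTC, and the ranges are treated as half-open intervals, which is an assumption about how block boundaries are interpreted.

```go
package main

import (
	"fmt"
	"time"
)

// layout matches the "DD-MM-YYYY hh:mm:ss" timestamps quoted in this comment.
const layout = "02-01-2006 15:04:05"

// timeRange is a hypothetical stand-in for a block's min/max time.
type timeRange struct {
	Min, Max time.Time
}

// mustRange parses two timestamps (assumed UTC) and panics on bad input.
func mustRange(from, to string) timeRange {
	lo, err := time.Parse(layout, from)
	if err != nil {
		panic(err)
	}
	hi, err := time.Parse(layout, to)
	if err != nil {
		panic(err)
	}
	return timeRange{Min: lo, Max: hi}
}

// overlaps reports whether two half-open ranges [Min, Max) share any instant.
func overlaps(a, b timeRange) bool {
	return a.Min.Before(b.Max) && b.Min.Before(a.Max)
}

// contains reports whether inner lies entirely within outer.
func contains(outer, inner timeRange) bool {
	return !inner.Min.Before(outer.Min) && !inner.Max.After(outer.Max)
}

func main() {
	block1 := mustRange("17-10-2019 00:00:00", "17-10-2019 08:00:00")
	block2 := mustRange("17-10-2019 04:56:24", "17-10-2019 06:00:00")

	fmt.Println("blocks overlap:", overlaps(block1, block2))
	fmt.Println("block 2 lies inside block 1:", contains(block1, block2))
}
```

Both checks report true for these ranges: block №2 falls entirely inside the window that block №1 already covers.
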

Root cause on the compactor side:
Block №1 was compacted at 11:47:01, while block №2 was uploaded at 14:36:12.

The underlying cause is probably related to errors like this one (they went on for about 10 minutes):

"level=warn ts=2019-10-17T08:50:25.558160695Z caller=shipper.go:333 msg="updating meta file failed" err="write /prometheus/thanos.shipper.json.tmp: no space left on device""

@obiesmans
Copy link
Contributor

I'd like to work on this.

obiesmans pushed a commit to obiesmans/thanos that referenced this issue Oct 23, 2019
Fixes thanos-io#1670

Signed-off-by: Olivier Biesmans <olivier.biesmans@blablacar.com>
obiesmans pushed a commit to obiesmans/thanos that referenced this issue Oct 24, 2019
Fixes thanos-io#1670

Signed-off-by: Olivier Biesmans <olivier.biesmans@blablacar.com>
bwplotka pushed a commit that referenced this issue Oct 24, 2019
Fixes #1670

Signed-off-by: Olivier Biesmans <olivier.biesmans@blablacar.com>
GiedriusS pushed a commit that referenced this issue Oct 28, 2019
Fixes #1670

Signed-off-by: Olivier Biesmans <olivier.biesmans@blablacar.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>