compact: failure on empty block #869

Closed
abursavich opened this issue Feb 26, 2019 · 4 comments

Comments

@abursavich
Contributor

abursavich commented Feb 26, 2019

Thanos, Prometheus and Golang version used

improbable/thanos:v0.3.1
quay.io/prometheus/prometheus:v2.7.1

What happened

Thanos compact is crash looping.

What you expected to happen

Compaction completes successfully.

How to reproduce it (as minimally and precisely as possible):

I'm not sure how the state of blocks in S3 was produced, but I think I understand what is happening.

Given a set of empty blocks to compact, tsdb's LeveledCompactor produces no output block: it marks all of the input blocks deletable, writes nothing to disk, and returns an empty ULID with a nil error.

// Compactor provides compaction against an underlying storage
// of time series data.
type Compactor interface {
	//...
   
	// Compact runs compaction against the provided directories. Must
	// only be called concurrently with results of Plan().
	// Can optionally pass a list of already open blocks,
	// to avoid having to reopen them.
	// When resulting Block has 0 samples
	//  * No block is written.
	//  * The source dirs are marked Deletable.
	//  * Returns empty ulid.ULID{}.
	Compact(dest string, dirs []string, open []*Block) (ulid.ULID, error)
}

Thanos does not handle this case: it expects a directory named after the returned (empty) ULID to exist, and exits when it doesn't.

error executing compaction: compaction failed: compaction: failed to finalize the block /data/compact/0@{prometheus="monitoring/default",prometheus_replica="prometheus-default-0"}/00000000000000000000000000: read new meta: open /data/compact/0@{prometheus="monitoring/default",prometheus_replica="prometheus-default-0"}/00000000000000000000000000/meta.json: no such file or directory

Full logs to relevant components

Logs

...
{"block":"01D4KE5VJX8MHRXKNXRT4QB1C2","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.582029611Z"}
{"block":"01D4KG1PJXBWFJ1SA7NP72TV8R","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.618553643Z"}
{"block":"01D4KG20QN7S9K0VMQPMFS7V2K","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.640079676Z"}
{"block":"01D4KN1JNSKEQR143DWGNHAKCY","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.676418268Z"}
{"block":"01D4KN1JNSKEQR143DWGNHAKCY","caller":"compact.go:194","level":"debug","msg":"block is too fresh for now","ts":"2019-02-26T01:21:38.694373934Z"}
{"block":"01D4KN1JTJFZH897Q1KB3187SC","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.694406388Z"}
{"block":"01D4KN1JTJFZH897Q1KB3187SC","caller":"compact.go:194","level":"debug","msg":"block is too fresh for now","ts":"2019-02-26T01:21:38.789104732Z"}
{"caller":"compact.go:827","level":"info","msg":"start of GC","ts":"2019-02-26T01:21:38.789165195Z"}
{"blocks":"[/data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QR3YED74WDJZ8ZV5YW1P /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QSWTA6EH0SN6R9V36270 /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QTRA6B16JPXSBV9SVNKK /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QVJ3P7QRYN1J3V89DFHA]","caller":"compact.go:721","compactionGroup":"0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}","duration":"580.824228ms","level":"debug","msg":"downloaded and verified blocks","ts":"2019-02-26T01:21:39.465482905Z"}
{"caller":"compact.go:384","count":4,"duration":"31.131698ms","level":"info","msg":"compact blocks resulted in empty block","sources":"[01D3M2QR3YED74WDJZ8ZV5YW1P 01D3M2QSWTA6EH0SN6R9V36270 01D3M2QTRA6B16JPXSBV9SVNKK 01D3M2QVJ3P7QRYN1J3V89DFHA]","ts":"2019-02-26T01:21:39.496691527Z"}
{"blocks":"[/data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QR3YED74WDJZ8ZV5YW1P /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QSWTA6EH0SN6R9V36270 /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QTRA6B16JPXSBV9SVNKK /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QVJ3P7QRYN1J3V89DFHA]","caller":"compact.go:730","compactionGroup":"0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}","duration":"31.254751ms","level":"debug","msg":"compacted blocks","ts":"2019-02-26T01:21:39.496812357Z"}
{"caller":"main.go:181","err":"error executing compaction: compaction failed: compaction: failed to finalize the block /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/00000000000000000000000000: read new meta: open /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/00000000000000000000000000/meta.json: no such file or directory","level":"error","msg":"running command failed","ts":"2019-02-26T01:21:39.496912774Z"}

@abursavich
Contributor Author

I've looked up the blocks from the log in S3 and none of them have any chunks, just small (~400B) meta.json and (~150KB) index files.

@abursavich
Contributor Author

I purged all the empty blocks in my S3 bucket (after backing them up in case they would be useful) and now compact is happy again.
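
In case it helps anyone doing a similar cleanup, below is a minimal sketch of one way to flag candidate empty blocks from their meta.json. It assumes the usual Prometheus TSDB meta.json layout with a stats object holding numSamples, numSeries and numChunks. Whether numSamples was actually zero in the affected blocks is not stated in this thread, so treat that check as an assumption. The blockMeta type, the isEmptyBlock helper and the sample documents are made up for the example, and listing or deleting objects in the bucket is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// blockMeta holds only the meta.json fields this sketch reads.
// Field names are assumed to follow the Prometheus TSDB block format.
type blockMeta struct {
	ULID  string `json:"ulid"`
	Stats struct {
		NumSamples uint64 `json:"numSamples"`
		NumSeries  uint64 `json:"numSeries"`
		NumChunks  uint64 `json:"numChunks"`
	} `json:"stats"`
}

// isEmptyBlock reports whether the given meta.json content describes a
// block with zero samples. A missing stats object decodes to zeros and is
// therefore also reported as empty.
func isEmptyBlock(raw []byte) (bool, error) {
	var m blockMeta
	if err := json.Unmarshal(raw, &m); err != nil {
		return false, fmt.Errorf("parse meta.json: %w", err)
	}
	return m.Stats.NumSamples == 0, nil
}

func main() {
	// Illustrative documents only, not taken from a real bucket.
	docs := []string{
		`{"ulid":"EMPTY-BLOCK-PLACEHOLDER","stats":{},"version":1}`,
		`{"ulid":"FULL-BLOCK-PLACEHOLDER","stats":{"numSamples":120,"numSeries":3,"numChunks":3},"version":1}`,
	}
	for _, d := range docs {
		empty, err := isEmptyBlock([]byte(d))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("empty:", empty)
	}
}
```

Backing the flagged blocks up before removing them, as described above, seems like a sensible precaution.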

@rafaeljesus

Yeah, we had to create a script to remove all empty blocks from the bucket. It worked after that, although another issue then popped up: out-of-order label set ^^

@GiedriusS
Member

GiedriusS commented Mar 14, 2019

Closing as this should be fixed by #904. Please shout if I am wrong :)
