-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify compactor and respect resolutions #195
Conversation
Note that this does not yet create downsamplings on the fly during compaction. But running sequences of |
Rebased, PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Maybe two minor questions and some "eureka" moments (: LGTM.
return errors.Wrap(err, "build compaction groups") | ||
} | ||
for _, g := range groups { | ||
os.RemoveAll(dataDir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is that here? (: We don't need any cache anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... ahh are we caching only meta json in memory and download them + blocks on compaction only.
That makes sense, cool 👍
// groupKey returns a unique identifier for the group the block belongs to. It considers | ||
// the downsampling resolution and the block's labels. | ||
func groupKey(meta *block.Meta) string { | ||
return fmt.Sprintf("%d@%s", meta.Thanos.Downsample.Resolution, labels.FromMap(meta.Thanos.Labels)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mhm.. you are not hashing it. Are we sure order will not be different every call to labels.FromMap
? (: Maybe hash is more safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, checked that method. Output is sorted... OK 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, that's the general contract for all constructors from the labels package – and really for anywhere we deal with labels.
} | ||
return res | ||
sort.Slice(res, func(i, j int) bool { | ||
return res[i].Key() < res[j].Key() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want this order?
1m@block1
1m@block2
1m@block3
5m@block1
5m@block2
...
Maybe we should put labels in the front of groupKey
? Then we group by labels first.
I don't know which one is better ;p Just checking if there are any rationales behind that order (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mh, no strict rationale I guess. But it likely makes sense to treat lower resolution blocks with higher priority since the most recent ones are queried the most.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool
pkg/compact/compact.go
Outdated
return g, nil | ||
} | ||
|
||
// Key returns an identifier for the group. | ||
func (cg *Group) Key() string { | ||
return fmt.Sprintf("%s@%d", cg.labels, cg.resolution) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, cannot we use the same function/form as in groupKey
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched form but left it as 2 functions. Just wasn't practical to consolidate.
8357d16
to
4af3535
Compare
This extends the compactor to consider downsample resolutions to the compaction grouping. With this it is safe to run the compactor against a bucket with data of different resolutions.
Furthermore #183 is likely to be cased by semantical races in the compactor. The PR drastically simplifies the logic and gets rid of any disk caching of meta.json files and just rebuilds groupings on demand. It is very unlikely that this has any noticable performance impact.