
Enable sharding compact instances by prometheus shard #1245

Closed

Conversation

mattrco
Contributor

@mattrco mattrco commented Jun 11, 2019

We have some very large Prometheus shards which take a long time to compact and downsample (especially if there's catching up to do).

This PR adds two new flags, shard.include and shard.exclude, which specify Prometheus shards to filter blocks by. Filtering is done when the syncer builds its slice of metadata about the blocks in the bucket.

Each flag can be specified multiple times (but specifying both at once is not valid; I thought allowing that would over-complicate things and increase the chance of configuration mistakes).

@bwplotka I think we chatted about this briefly last week :)

I'm currently testing this on a production-like workload.
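
As a rough sketch of the filtering idea described above (the "prometheus" external-label key and the BlockMeta type are illustrative assumptions, not necessarily what this PR uses):

    // Sketch only: drop block metadata whose assumed "prometheus" external
    // label does not pass the include/exclude sets built from the new flags.
    type BlockMeta struct {
        ULID           string
        ExternalLabels map[string]string
    }

    func filterByShard(metas []BlockMeta, include, exclude map[string]struct{}) []BlockMeta {
        var out []BlockMeta
        for _, m := range metas {
            shard := m.ExternalLabels["prometheus"] // assumed label key naming the shard
            if len(include) > 0 {
                // --shard.include: keep only the listed shards.
                if _, ok := include[shard]; !ok {
                    continue
                }
            } else if _, ok := exclude[shard]; ok {
                // --shard.exclude: skip the listed shards.
                continue
            }
            out = append(out, m)
        }
        return out
    }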

Member

@bwplotka bwplotka left a comment


Thanks for this! I love the idea ❤️ but I would propose a different flag type and name. See the comment below.

Let me know what your thoughts are (:

@@ -110,6 +110,10 @@ func registerCompact(m map[string]setupFunc, app *kingpin.Application, name stri
compactionConcurrency := cmd.Flag("compact.concurrency", "Number of goroutines to use when compacting groups.").
Default("1").Int()

includeShards := cmd.Flag("shard.include", "Prometheus shard to compact. May be specified multiple times. Cannot be used with --shard.exclude.").Strings()

@bwplotka commented on the diff:

I think something like this is great, but I have 2 suggestions:

  • I would not tie it to Prometheus. There can be many sources of blocks, including Thanos receiver, so it should be agnostic to Prometheus. It should be agnostic to sharding as well; again, that unnecessarily ties things together. I would literally match on the external labels of source blocks, as that is what we already do on the compactor and it's tied to TSDB blocks. So e.g. a repeated block-external-label-matcher flag would be nice?
  • I would not reinvent the wheel in terms of how to specify this "string". In particular, this form does not specify what exactly we include/exclude on. Pod name? Replica name? External labels? UIDs? (: I would stick to matchers, because that is what we do: we query the blocks like metrics (where each block is a single series). We can even reuse the internal labels.Matcher code from github.com/prometheus/tsdb@v0.8.0/labels/selector.go. So a matcher flag could take literally {cluster=~"monzo-super-cluster-(east|west)", zone="abc"}. What do you think?

This way we don't need include/exclude either (:
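
As a rough sketch of the matcher idea (plain regexps from the standard library stand in for tsdb's labels.Matcher here; patterns are fully anchored, as Prometheus matchers are):

    // Sketch: a block is selected only if its external labels satisfy every
    // matcher, e.g. the equivalent of
    // {cluster=~"monzo-super-cluster-(east|west)", zone="abc"}.
    type matcher struct {
        name string
        re   *regexp.Regexp // import "regexp"
    }

    func newMatcher(name, pattern string) (matcher, error) {
        re, err := regexp.Compile("^(?:" + pattern + ")$") // anchor the pattern
        return matcher{name: name, re: re}, err
    }

    func blockSelected(extLabels map[string]string, matchers []matcher) bool {
        for _, m := range matchers {
            if !m.re.MatchString(extLabels[m.name]) {
                return false
            }
        }
        return true
    }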

Contributor Author


This sounds good - I'm happy to not tie it to the Prometheus shard and to reuse the matcher code 👍

@brancz
Member

brancz commented Jun 12, 2019

What's the use of doing functional sharding here? How about we do consistent hashing over the sets of external labels instead?

@bwplotka
Member

Essentially you have a single bucket with too many sources and the compactor is too slow to cope with the work. (:

@bwplotka
Member

bwplotka commented Jun 12, 2019

@brancz what would be the benefit of consistent hashing vs. manual specification? I think manual is also nice if you want control over what to do first, or maybe one compactor being bigger than another for a different number of sources / bigger sources.

@brancz
Member

brancz commented Jun 12, 2019

Consistent hashing so that the local cache needs minimal change, but generally just use 0, 1, 2, 3, ... out of x shards, so we can build autoscaling mechanisms around this.

@bwplotka
Member

bwplotka commented Jun 12, 2019

Sure, but that will block custom deployments, as I mentioned:

if you want control over what to do first, or maybe one compactor being bigger than another for a different number of sources / bigger sources.

Especially since users usually have maybe 1 to 10 sources (100 does happen, of course), and not equal ones, so I don't see consistent hashing being that useful for most users. Plus you can have consistent hashing done by configuration management as well, but not vice versa (a custom deployment is not possible when we are tied to consistent hashing). Happy to discuss though (:

We might want both as well, but again, it's some added complexity.

@brancz
Member

brancz commented Jun 12, 2019

We might want both as well

I’d be ok with that. In that case this sounds a lot like we want relabeling...

@mattrco
Contributor Author

mattrco commented Jun 13, 2019

I like the suggestion of consistent hashing too, although as @bwplotka points out there's a slightly different use case for each kind of sharding.

I'm happy to go ahead with this method and we could potentially replace it or add a consistent hashing scheme later on.

@bwplotka
Member

bwplotka commented Jun 13, 2019

@brancz looks like we all agree to have both.

In that case, regarding your comment:

In that case this sounds a lot like we want relabeling...

How does this affect the idea I proposed here (matchers)?
#1245 (review)

I think instead of matchers, we literally do relabelling in the form of:

  1. As input we have the list of external label sets (one can be the empty set) that we see from blocks in object storage.
  2. We do relabelling. We need some form of target label, like __address__, that points to what we actually end up doing (if not dropped). I guess __external_labels__?
  3. We obviously dedup on __external_labels__ as well.

I think I like this, although it's complex for someone new to it - I think that's fair for this feature though. Thoughts @mattrco?

cc @brian-brazil it would be nice to have your view on this (:

@brian-brazil
Contributor

The hashmod action is how this is done in Prometheus, so I'd suggest looking at that with keep/drop rather than coming up with a new way of hashing.
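
For reference, a sketch of that with standard Prometheus relabel_config semantics (the source label names are illustrative): hashmod assigns each external-label set to one of N shards, and keep retains only the sets assigned to this compactor instance:

    # Sketch: this compactor instance handles shard 0 of 3.
    - source_labels: [cluster, replica]
      modulus: 3
      target_label: __tmp_shard
      action: hashmod
    - source_labels: [__tmp_shard]
      regex: "0"
      action: keep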

@brancz
Member

brancz commented Jun 13, 2019

Yeah, the consistent hashing mechanism would really just be so that the local disk doesn't have to be re-populated entirely when re-sharding. I'd be perfectly happy to start with the hashmod approach that relabeling offers for sharding. We need to think a bit about vertical compaction here though, so I would say the selection should apply to the result block instead of the input blocks. So to me it looks like we should pass the potential external labels of the resulting compacted block into a relabeling config and, based on the config, decide whether to keep it or not.

We do relabelling. We need some form of target label, like __address__, that points to what we actually end up doing (if not dropped). I guess __external_labels__?

Relabeling itself doesn't care about any special labels. It's just an input of a set of labels and an output of a set of labels, and if the output set is empty, it's dropped. So I don't think this is needed.

The only slightly weird thing is that relabeling would allow re-writing labels, instead of just keep/drop, which may not be a desirable feature.

@brian-brazil
Contributor

The only slightly weird thing is that relabeling would allow re-writing labels, instead of just keep/drop, which may not be a desirable feature.

Only if you use those output labels, rather than merely checking for existence.

@brancz
Member

brancz commented Jun 13, 2019

Fair. I think relabeling is a good fit when done that way.

@bwplotka
Member

Relabeling itself doesn't care about any special labels. It's just an input of a set of labels and an output of a set of labels, and if the output set is empty, it's dropped. So I don't think this is needed.
The only slightly weird thing is that relabeling would allow re-writing labels, instead of just keep/drop, which may not be a desirable feature.

Yeah, you are right. But indeed we have some non-obvious cases here, so we need to agree on clear semantics. What happens if:

  • input a=1,b=2, output a=2 (because, well, you can rewrite). What do we do? Does it mean keep? Does it mean keep everything that matches a=2? (I think the former.)
  • empty input and therefore empty output (yes - that can be the case; although it's caught by the newest Thanos sidecar, in theory a source can have no external labels). I think it's fair to drop it.

@brancz
Member

brancz commented Jun 14, 2019

Correct. Relabeling would just be used for selection: if the set is non-empty after relabeling is done, then it's kept.
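
In code, that selection rule could look roughly like this (a sketch, assuming the compactor reuses Prometheus's pkg/labels and pkg/relabel packages; the function name is made up):

    // Sketch: keep a block only if its external labels survive relabeling
    // with a non-empty result. Assumes:
    //   "github.com/prometheus/prometheus/pkg/labels"
    //   "github.com/prometheus/prometheus/pkg/relabel"
    func selectBlock(extLabels labels.Labels, cfgs []*relabel.Config) bool {
        return len(relabel.Process(extLabels, cfgs...)) > 0
    }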

@bwplotka
Member

@jojohappy I think this is preferred for all components, including store: #1059

@bwplotka
Member

The only problem I can see (or at least something to agree on) is how to provide such configuration... It has to be some YAML, I guess.

@jojohappy
Member

@mattrco @brancz @bwplotka I like the relabeling too. IMO, we can do it like the objstore config file and re-use the relabeling code from Prometheus, as Loki does, for example.
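
For instance, such a file could simply hold a list of Prometheus relabel_config entries applied to each block's external labels (the flag name --selector.relabel-config-file and the label values here are hypothetical):

    # Hypothetical --selector.relabel-config-file contents: keep only matching blocks.
    - action: keep
      source_labels: [cluster]
      regex: monzo-super-cluster-(east|west)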

@bwplotka
Member

Fair, let's aim for this then. Are you happy to adjust store, @jojohappy?

@jojohappy
Member

Are you happy to adjust store?

Yes, I am.

@Lord-Y

Lord-Y commented Sep 5, 2019

Any news about fixing the conflicts?

@bwplotka
Member

bwplotka commented Sep 6, 2019

It's not only conflicts; we also need to design and agree on a consistent approach. We are working on this (:

@mattrco are you happy to close this in the meantime to stop confusion? I will start a PR with the design; it would also be awesome if you joined us in making this, if you want (:

@bwplotka bwplotka closed this Sep 6, 2019
@bwplotka
Member

bwplotka commented Sep 7, 2019

#1501
