Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Count active metrics that match given query within each tenant #37

Closed
colega opened this issue Jul 28, 2021 · 8 comments · Fixed by #42
Closed

Count active metrics that match given query within each tenant #37

colega opened this issue Jul 28, 2021 · 8 comments · Fixed by #42
Assignees
Labels
enhancement New feature or request

Comments

@colega
Copy link
Contributor

colega commented Jul 28, 2021

Is your feature request related to a problem? Please describe.
We need a gauge metric that provides the number of active series within each tenant that match a pre-configured query.
I.e., we want to know how many active metrics match {foo="bar",...}

Describe the solution you'd like
We want to add a configuration to the Ingester, called active_matching_series, which is a slice of structs:

type ActiveMatchingSeriesConfig struct {
	Name    string `yaml:"name"`
	Matcher string `yaml:"matcher"`
}

Same configuration can be provided by using multiple ingester.active-matching-series flags, each one with a value built as <name>:<matcher>, for example: --ingester.active-matching-series='foobar:{foo="bar"}'

A new gauge would be defined in ingesterMetrics as:

		// Not registered automatically, but only if activeSeriesEnabled is true.
		activeMatchingSeriesPerUser: prometheus.NewGaugeVec(prometheus.GaugeOpts{
			Name: "cortex_ingester_active_matching_series",
			Help: "Number of currently active series matching a pre-configured matcher per user.",
		}, []string{"user", "matcher"}),

When the ingester is instantiated, we'll instantiate an implementation of this interface:

type ActiveSeriesMatcher interface {
	// Matches provide a slice of bools indicating whether the set of labels.Labels provided matches or not each one of the matchers.
	// The length of the returned slice is the same as the length of the Labels() slice, and matchers are applied in same order.
	Matches(labels.Labels) []bool
	// Labels provides the values of the `matcher` label for each matcher.
	Labels() []string
}

We would modify the activeSeriesStripe to contain a slice of counters, one for each of the matchers:

// activeSeriesStripe holds a subset of the series timestamps for a single tenant.
type activeSeriesStripe struct {
	// ...
	active         int   // Number of active entries in this stripe. Only decreased during purge or clear.
	activeMatching []int // Number of active entries matching the pre-configured matchers in this stripe. Only decreased during purge or clear. Has the same length as the number of matchers provided.
}

And we'll add a matches []bool to the activeSeriesEntry struct, which is the most critical part as this is where we'll consume memory. Given n the number of matches, we would consume 8*3 (slice ptr, cap & len) + n bytes more per each metric, plus the slice pointer pressure on the garbage collector. With 1M series per ingester, this would be 24MB per ingester plus one megabyte per each of the matchers provided, which isn't a big deal.

This slice would be filled once, when creating the series, using the previous interface, as ActiveSeries would hold an instance of it. We'll use this bool mask to increase the counters in activeSeriesStripe.activeMatching just like we do with active, that's O(n*m) with n series and m matchers.

Then we'll modify the signature of func (c *ActiveSeries) Active() int to func (c *ActiveSeries) Active() (int, []int) so it would return the number of series matching each one of the matchers too. Counting those would be O(m) with m the number of matchers, as each one of the stripes would provide it's own count.

Describe alternatives you've considered

  • Just put metrics in different tenants: this is not how we want the product to work.
  • Use uint64 bitmask instead of []byte to save memory: the amount of memory saved would be neglible but we'd be limited to just 64 matchers. We don't expect more than those, but since it's cheap to remove the limitation, we can just go with the bool slice.

Questions
Do we need to modify the userState? I don't have enough context on that one yet (maybe I'll see better once I start implementing this).

@colega colega added the enhancement New feature or request label Jul 28, 2021
@colega colega self-assigned this Jul 28, 2021
@gouthamve
Copy link
Member

Can you use an exporter that just does a count count({foo="bar"}) and exposes it as a metric?

@colega
Copy link
Contributor Author

colega commented Jul 28, 2021

How can we do that for all tenants?

@colega
Copy link
Contributor Author

colega commented Jul 28, 2021

Plus, we would be potentially opening the TSDBs for tenants who don't have any active metrics, right?

@jesusvazquez
Copy link
Member

Using an exporter would require a deployment per cluster right? And then the process itself would be querying for cluster data to then store it back as a metric. So we thought we might just generate the metric from service itself and save us the overhead of the exporter.

@pracucci
Copy link
Collaborator

I'm not sure about the exporter. I see two issues:

  1. We need to do it for every tenant. The exporter needs the updated list of tenants first and then, for each tenant, run the count query.
  2. The query looks very expensive. We need to count active series (over the last 20m) and continuously running such query, for every tenant.

Because of this, I'm in favour of adding the support directly into Mimir.

Now going to review the proposal.

@pracucci
Copy link
Collaborator

The proposal LGTM, but I've some small feedback. Thanks!

We need a gauge metric that provides the number of active series within each tenant that match a pre-configured query.

I would say not a "pre-configured query", but "label matchers". I know it's just wording, but with "query" we mean a different thing.

active_matching_series

I would suggest to call it active_series_custom_trackers because they're custom trackers for active series.

Same configuration can be provided by using multiple ingester.active-matching-series flags

We learned the hard way it's not easy to handle multiple flags (eg. you can't do it in jsonnet with our config setup, but you need to inject custom args in the container, which is annoying). I would avoid it. You could use a semicolon-separated string.

Name: "cortex_ingester_active_matching_series",

Based on my feedback above, I would call the metric cortex_ingester_active_series_custom_tracker and I would call the label name instead of matcher (since I guess we'll export the tracker name, not the label matchers).

@colega
Copy link
Contributor Author

colega commented Jul 29, 2021

Thanks, I already started implementing in #42, still pending tests and apply this feedback.

I was also considering exporting the matcher itself as a label, apart from the name, since it would provide context to the querier without adding extra cardinality to the metric, WDYT?
I.e.: cortex_ingester_active_series_custom_tracker{name="foobar", matcher="{foo=\"bar\"}"}

@pracucci
Copy link
Collaborator

I was also considering exporting the matcher itself as a label

I have mixed feelings. Looks an anti-pattern to me.

colega added a commit that referenced this issue Sep 2, 2021
Added doc-generator.ExamplerConfig interface that can be implemented to
provide an optional comment and yaml example for the configuration.
Rendered that in the documentation section.

Ref: #42 (comment)
Ref: #37

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
pracucci pushed a commit that referenced this issue Sep 16, 2021
* Provide example config for active_series_custom_trackers

Added doc-generator.ExamplerConfig interface that can be implemented to
provide an optional comment and yaml example for the configuration.
Rendered that in the documentation section.

Ref: #42 (comment)
Ref: #37

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Print the example before the flag

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Indent the example

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants