Count active metrics that match given query within each tenant #37

colega · 2021-07-28T09:37:10Z

Is your feature request related to a problem? Please describe.
We need a gauge metric that provides the number of active series within each tenant that match a pre-configured query.
I.e., we want to know how many active metrics match {foo="bar",...}

Describe the solution you'd like
We want to add a configuration to the Ingester, called active_matching_series, which is a slice of structs:

type ActiveMatchingSeriesConfig struct {
	Name    string `yaml:"name"`
	Matcher string `yaml:"matcher"`
}

Same configuration can be provided by using multiple ingester.active-matching-series flags, each one with a value built as <name>:<matcher>, for example: --ingester.active-matching-series='foobar:{foo="bar"}'

A new gauge would be defined in ingesterMetrics as:

		// Not registered automatically, but only if activeSeriesEnabled is true.
		activeMatchingSeriesPerUser: prometheus.NewGaugeVec(prometheus.GaugeOpts{
			Name: "cortex_ingester_active_matching_series",
			Help: "Number of currently active series matching a pre-configured matcher per user.",
		}, []string{"user", "matcher"}),

When the ingester is instantiated, we'll instantiate an implementation of this interface:

type ActiveSeriesMatcher interface {
	// Matches provide a slice of bools indicating whether the set of labels.Labels provided matches or not each one of the matchers.
	// The length of the returned slice is the same as the length of the Labels() slice, and matchers are applied in same order.
	Matches(labels.Labels) []bool
	// Labels provides the values of the `matcher` label for each matcher.
	Labels() []string
}

We would modify the activeSeriesStripe to contain a slice of counters, one for each of the matchers:

// activeSeriesStripe holds a subset of the series timestamps for a single tenant.
type activeSeriesStripe struct {
	// ...
	active         int   // Number of active entries in this stripe. Only decreased during purge or clear.
	activeMatching []int // Number of active entries matching the pre-configured matchers in this stripe. Only decreased during purge or clear. Has the same length as the number of matchers provided.
}

And we'll add a matches []bool to the activeSeriesEntry struct, which is the most critical part as this is where we'll consume memory. Given n the number of matches, we would consume 8*3 (slice ptr, cap & len) + n bytes more per each metric, plus the slice pointer pressure on the garbage collector. With 1M series per ingester, this would be 24MB per ingester plus one megabyte per each of the matchers provided, which isn't a big deal.

This slice would be filled once, when creating the series, using the previous interface, as ActiveSeries would hold an instance of it. We'll use this bool mask to increase the counters in activeSeriesStripe.activeMatching just like we do with active, that's O(n*m) with n series and m matchers.

Then we'll modify the signature of func (c *ActiveSeries) Active() int to func (c *ActiveSeries) Active() (int, []int) so it would return the number of series matching each one of the matchers too. Counting those would be O(m) with m the number of matchers, as each one of the stripes would provide it's own count.

Describe alternatives you've considered

Just put metrics in different tenants: this is not how we want the product to work.
Use uint64 bitmask instead of []byte to save memory: the amount of memory saved would be neglible but we'd be limited to just 64 matchers. We don't expect more than those, but since it's cheap to remove the limitation, we can just go with the bool slice.

Questions
Do we need to modify the userState? I don't have enough context on that one yet (maybe I'll see better once I start implementing this).

The text was updated successfully, but these errors were encountered:

gouthamve · 2021-07-28T09:49:49Z

Can you use an exporter that just does a count count({foo="bar"}) and exposes it as a metric?

colega · 2021-07-28T10:43:21Z

How can we do that for all tenants?

colega · 2021-07-28T10:48:14Z

Plus, we would be potentially opening the TSDBs for tenants who don't have any active metrics, right?

jesusvazquez · 2021-07-29T13:05:42Z

Using an exporter would require a deployment per cluster right? And then the process itself would be querying for cluster data to then store it back as a metric. So we thought we might just generate the metric from service itself and save us the overhead of the exporter.

pracucci · 2021-07-29T13:14:07Z

I'm not sure about the exporter. I see two issues:

We need to do it for every tenant. The exporter needs the updated list of tenants first and then, for each tenant, run the count query.
The query looks very expensive. We need to count active series (over the last 20m) and continuously running such query, for every tenant.

Because of this, I'm in favour of adding the support directly into Mimir.

Now going to review the proposal.

pracucci · 2021-07-29T13:24:08Z

The proposal LGTM, but I've some small feedback. Thanks!

We need a gauge metric that provides the number of active series within each tenant that match a pre-configured query.

I would say not a "pre-configured query", but "label matchers". I know it's just wording, but with "query" we mean a different thing.

active_matching_series

I would suggest to call it active_series_custom_trackers because they're custom trackers for active series.

Same configuration can be provided by using multiple ingester.active-matching-series flags

We learned the hard way it's not easy to handle multiple flags (eg. you can't do it in jsonnet with our config setup, but you need to inject custom args in the container, which is annoying). I would avoid it. You could use a semicolon-separated string.

Name: "cortex_ingester_active_matching_series",

Based on my feedback above, I would call the metric cortex_ingester_active_series_custom_tracker and I would call the label name instead of matcher (since I guess we'll export the tracker name, not the label matchers).

colega · 2021-07-29T14:57:56Z

Thanks, I already started implementing in #42, still pending tests and apply this feedback.

I was also considering exporting the matcher itself as a label, apart from the name, since it would provide context to the querier without adding extra cardinality to the metric, WDYT?
I.e.: cortex_ingester_active_series_custom_tracker{name="foobar", matcher="{foo=\"bar\"}"}

pracucci · 2021-07-29T15:15:42Z

I was also considering exporting the matcher itself as a label

I have mixed feelings. Looks an anti-pattern to me.

Added doc-generator.ExamplerConfig interface that can be implemented to provide an optional comment and yaml example for the configuration. Rendered that in the documentation section. Ref: #42 (comment) Ref: #37 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Provide example config for active_series_custom_trackers Added doc-generator.ExamplerConfig interface that can be implemented to provide an optional comment and yaml example for the configuration. Rendered that in the documentation section. Ref: #42 (comment) Ref: #37 Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Print the example before the flag Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com> * Indent the example Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

colega added the enhancement New feature or request label Jul 28, 2021

colega self-assigned this Jul 28, 2021

colega mentioned this issue Jul 29, 2021

Count active series that match specific label matchers #42

Merged

3 tasks

pstibrany closed this as completed in #42 Aug 25, 2021

colega mentioned this issue Sep 2, 2021

Provide example config for active_series_custom_trackers #198

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Count active metrics that match given query within each tenant #37

Count active metrics that match given query within each tenant #37

colega commented Jul 28, 2021

gouthamve commented Jul 28, 2021

colega commented Jul 28, 2021

colega commented Jul 28, 2021

jesusvazquez commented Jul 29, 2021

pracucci commented Jul 29, 2021

pracucci commented Jul 29, 2021

colega commented Jul 29, 2021

pracucci commented Jul 29, 2021

Count active metrics that match given query within each tenant #37

Count active metrics that match given query within each tenant #37

Comments

colega commented Jul 28, 2021

gouthamve commented Jul 28, 2021

colega commented Jul 28, 2021

colega commented Jul 28, 2021

jesusvazquez commented Jul 29, 2021

pracucci commented Jul 29, 2021

pracucci commented Jul 29, 2021

colega commented Jul 29, 2021

pracucci commented Jul 29, 2021