Optimised /api/v1/series for blocks storage #2976

Contributor

@pracucci pracucci commented Aug 4, 2020

What this PR does:
While working on #2794, I compared our implementation for fetching series with the Prometheus one and noticed that the Prometheus one is simpler. I did the refactoring and wrote a benchmark, and it turned out to perform better too.

benchmark                                           old ns/op     new ns/op     delta
Benchmark_Ingester_v2MetricsForLabelMatchers-12     250057594     226011946     -9.62%

benchmark                                           old allocs     new allocs     delta
Benchmark_Ingester_v2MetricsForLabelMatchers-12     1104071        800196         -27.52%

benchmark                                           old bytes     new bytes     delta
Benchmark_Ingester_v2MetricsForLabelMatchers-12     62724828      51382654      -18.08%

Which issue(s) this PR fixes:
N/A

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci pracucci requested a review from pstibrany August 4, 2020 13:22
@pracucci pracucci force-pushed the refactor-series-api-endpoint-blocks-storage branch from 90cc4fb to 0ee87df Compare August 5, 2020 08:35
@@ -695,37 +692,24 @@ func (i *Ingester) v2MetricsForLabelMatchers(ctx context.Context, req *client.Me
}

seriesSet := q.Select(false, nil, matchers...)
Contributor

With Select(false, ...) the returned series sets are not sorted, so merging them later doesn't work properly. This can be shown by using a modified request in the benchmark/test:

	req := &client.MetricsForLabelMatchersRequest{
		StartTimestampMs: now,
		EndTimestampMs:   now,
		MatchersSet: []*client.LabelMatchers{
			{Matchers: []*client.LabelMatcher{
				{Type: client.REGEX_MATCH, Name: model.MetricNameLabel, Value: "test.*"},
			}},
			{Matchers: []*client.LabelMatcher{
				{Type: client.REGEX_MATCH, Name: model.MetricNameLabel, Value: "test.*0"}, // ending with 0
			}},
		},
	}

This should still return exactly numSeries series, but it returns more than that because the duplicates are not removed.

Can you rerun benchmarks with sorting, and see whether it is still an improvement or not?
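
To illustrate why sorting matters here, below is a minimal sketch of a merge that only drops a duplicate when it directly follows an equal element. It is a simplified stand-in, not the actual Prometheus or Cortex merge code: `mergeDedup` and the plain string series names are hypothetical, chosen only to mirror the two matchers in the request above.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeDedup is a hypothetical, simplified two-way merge. It assumes both
// inputs are sorted and drops an element only when it equals the element
// emitted immediately before it.
func mergeDedup(a, b []string) []string {
	out := make([]string, 0, len(a)+len(b))
	emit := func(s string) {
		if len(out) == 0 || out[len(out)-1] != s {
			out = append(out, s)
		}
	}

	i, j := 0, 0
	for i < len(a) && j < len(b) {
		if a[i] <= b[j] {
			emit(a[i])
			i++
		} else {
			emit(b[j])
			j++
		}
	}
	// Drain whichever input still has elements left.
	for ; i < len(a); i++ {
		emit(a[i])
	}
	for ; j < len(b); j++ {
		emit(b[j])
	}
	return out
}

func main() {
	all := []string{"test_2", "test_0", "test_1"} // stands in for "test.*", unsorted
	zero := []string{"test_0"}                    // stands in for "test.*0"

	// Unsorted input: the two copies of "test_0" are never adjacent.
	fmt.Println(len(mergeDedup(all, zero))) // 4

	// Sorted input: the duplicate is adjacent and gets dropped.
	sort.Strings(all)
	fmt.Println(len(mergeDedup(all, zero))) // 3
}
```

With unsorted input the sketch returns four entries for three distinct series, which is the same kind of over-counting the modified request exposes.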

Contributor Author

Wow, thanks for spotting it! I copied the logic from Prometheus:
https://github.com/prometheus/prometheus/blob/b521612042ec87103088b45b7b5dfee3bb8dc732/web/api/v1/api.go#L604-L608

Either the Prometheus /series API endpoint doesn't guarantee any deduplication, or there's a bug in Prometheus too. I will open an issue there.

Back to us, I modified the logic to use .Select(true, ...) (sorted) for now and re-ran the benchmark (updated PR description). There's still some benefit, though less than before.

Contributor Author

Opened this issue in Prometheus: prometheus/prometheus#7801. I will follow up on this PR accordingly if any change is required.

Signed-off-by: Marco Pracucci <marco@pracucci.com>
…ber of series

Signed-off-by: Marco Pracucci <marco@pracucci.com>
@pracucci pracucci force-pushed the refactor-series-api-endpoint-blocks-storage branch from 0ee87df to c0376e6 Compare August 14, 2020 13:18
Contributor

@pstibrany pstibrany left a comment

👍 Still looks like a nice improvement to me.

@pracucci pracucci merged commit 50e5584 into cortexproject:master Aug 14, 2020
@pracucci pracucci deleted the refactor-series-api-endpoint-blocks-storage branch August 14, 2020 13:49