
[query] Increase perf for temporal functions #2049

Merged: 19 commits merged into master from arnikola/temporal on Dec 10, 2019

Conversation

@arnikola (Collaborator) commented Nov 25, 2019

This PR contains improvements across the board for queries, largely focused on temporal queries:

  • All endpoints support gzipped responses for smaller payloads.
  • The Prometheus remote read endpoint now uses a specialized path, allowing some memory usage optimizations; queries using this path get parsed directly into the Prometheus proto format rather than a transitional format.
  • UnconsolidatedSeries no longer does consolidation. Shocker! (Before, it was aligning points to steps and then instantly unrolling them within temporal functions, allocating two unused data structures.)
  • Temporal functions are greatly simplified by removing the logic that allowed split blocks to be worked on. This is likely to be re-added in the future once some of the surrounding logic is simplified; the existing implementation was unoptimized and quite confusing.
  • Temporal functions now compare timestamps as int64 values rather than time.Time values.
  • Temporal functions now batch series into groups sized by CPU count and process them concurrently (see the sketch below).
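For illustration, here is a rough sketch of the batching approach in that last bullet, assuming a hypothetical processBatch callback (this is not the PR's actual API, just the shape of the idea):

package main

import (
	"fmt"
	"runtime"
	"sync"
)

// processSeriesConcurrently splits series indices into batches sized by CPU
// count and processes each batch on its own goroutine.
func processSeriesConcurrently(seriesCount int, processBatch func(indices []int)) {
	concurrency := runtime.NumCPU()
	var wg sync.WaitGroup
	for worker := 0; worker < concurrency; worker++ {
		// Round-robin assignment: worker w takes series w, w+concurrency, ...
		var indices []int
		for i := worker; i < seriesCount; i += concurrency {
			indices = append(indices, i)
		}
		if len(indices) == 0 {
			continue
		}
		wg.Add(1)
		go func(batch []int) {
			defer wg.Done()
			processBatch(batch)
		}(indices)
	}
	wg.Wait()
}

func main() {
	processSeriesConcurrently(10, func(indices []int) {
		fmt.Println("processing series", indices)
	})
}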

@arnikola (Collaborator, author) commented Nov 27, 2019

FWIW, benchmark results for just the encoded series iterator improvements (top is existing, bottom is new):

BenchmarkNextIteration/10_series-12         	    1375	    773605 ns/op	  358427 B/op	    1400 allocs/op
BenchmarkNextIteration/100_series-12        	     398	   3082631 ns/op	 3483303 B/op	   10402 allocs/op
BenchmarkNextIteration/200_series-12        	     240	   4988956 ns/op	 6900974 B/op	   20404 allocs/op
BenchmarkNextIteration/500_series-12        	     100	  10027566 ns/op	17243642 B/op	   50409 allocs/op
BenchmarkNextIteration/1000_series-12       	      62	  17959736 ns/op	34472746 B/op	  100415 allocs/op
BenchmarkNextIteration/2000_series-12       	      36	  32262755 ns/op	68930819 B/op	  200426 allocs/op

BenchmarkNextIteration/10_series-12         	    1764	    689199 ns/op	   14422 B/op	     300 allocs/op
BenchmarkNextIteration/100_series-12        	     723	   1640192 ns/op	   14448 B/op	     301 allocs/op
BenchmarkNextIteration/200_series-12        	     534	   2217000 ns/op	   14464 B/op	     301 allocs/op
BenchmarkNextIteration/500_series-12        	     352	   3378426 ns/op	   14502 B/op	     302 allocs/op
BenchmarkNextIteration/1000_series-12       	     261	   4601541 ns/op	   14547 B/op	     303 allocs/op
BenchmarkNextIteration/2000_series-12       	     183	   6483285 ns/op	   14610 B/op	     304 allocs/op

@@ -174,6 +175,14 @@ func (s *mockStorage) Fetch(
return s.fetchResult.results[idx], s.fetchResult.err
}

func (s *mockStorage) FetchProm(
arnikola (author):

This file should be replaced by mocks when possible

codecov bot commented Nov 27, 2019

Codecov Report

Merging #2049 into master will increase coverage by 3.8%.
The diff coverage is 37.7%.


@@           Coverage Diff            @@
##           master   #2049     +/-   ##
========================================
+ Coverage    62.6%   66.4%   +3.8%     
========================================
  Files         996    1009     +13     
  Lines       86851   86931     +80     
========================================
+ Hits        54386   57742   +3356     
+ Misses      28192   25326   -2866     
+ Partials     4273    3863    -410
Flag Coverage Δ
#aggregator 61.1% <ø> (-2.2%) ⬇️
#cluster 71.2% <ø> (-14.4%) ⬇️
#collector 43.4% <ø> (-5.4%) ⬇️
#dbnode 69.2% <ø> (+2.6%) ⬆️
#m3em 73.2% <ø> (+17.4%) ⬆️
#m3ninx 65.5% <ø> (+18.6%) ⬆️
#m3nsch 63.6% <ø> (-36.4%) ⬇️
#metrics 51.9% <ø> (-48.1%) ⬇️
#msg 74.9% <ø> (-25.1%) ⬇️
#query 47.1% <37.7%> (-22.5%) ⬇️
#x 76.1% <ø> (-0.3%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e54ccf...e62c84b. Read the comment docs.

batch.Iter,
m,
c.processor,
seriesMeta,
arnikola (author):

This needs to be resultSeriesMeta

arnikola marked this pull request as ready for review November 27, 2019 22:06
// PopulateColumns sets all columns to the given row size.
func (cb ColumnBlockBuilder) PopulateColumns(size int) {
for i := range cb.block.columns {
cb.block.columns[i] = column{Values: make([]float64, size)}
Reviewer (Collaborator):

Worth working out the full size and doing a block alloc upfront?

i.e. all := make([]float64, size*len(cb.block.columns)) and then subdividing it for each column? That would reduce the number of individual allocations; we've seen good speedups from bulk allocs.
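A minimal sketch of that suggestion, with the types assumed to mirror the snippet above (not the repo's exact definitions): one backing slice is allocated and each column takes a capped sub-slice of it.

// Assumed shapes, mirroring the snippet above.
type column struct{ Values []float64 }
type columnBlock struct{ columns []column }
type ColumnBlockBuilder struct{ block *columnBlock }

// PopulateColumns sets all columns to the given row size using a single
// bulk allocation subdivided across columns.
func (cb ColumnBlockBuilder) PopulateColumns(size int) {
	all := make([]float64, size*len(cb.block.columns))
	for i := range cb.block.columns {
		// The three-index slice caps each column's capacity so a later append
		// on one column can't spill into its neighbour's backing memory.
		cb.block.columns[i] = column{Values: all[i*size : (i+1)*size : (i+1)*size]}
	}
}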

arnikola (author):

Neat trick!

@@ -181,7 +182,9 @@ func (h *Handler) RegisterRoutes() error {
// Wrap requests with response time logging as well as panic recovery.
var (
wrapped = func(n http.Handler) http.Handler {
return logging.WithResponseTimeAndPanicErrorLogging(n, h.instrumentOpts)
return httputil.CompressionHandler{
Reviewer (Collaborator):

nit: Want to add a comment that this does compression under the hood for all wrapped routes?

Reviewer (Collaborator):

Actually, also: do we want to do this in applyMiddleware(...)? That means that every single route will get it rather than only those using wrapped(...) here.

arnikola (author):

Will do. It actually doesn't force compression on for all requests; rather, it only returns compressed responses when the request has an Accept-Encoding: gzip (or Accept-Encoding: deflate) header.
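To illustrate the behaviour being described (a sketch only, not the actual httputil.CompressionHandler): the wrapper inspects Accept-Encoding and gzips the response only when the client asked for it.

package middleware

import (
	"compress/gzip"
	"io"
	"net/http"
	"strings"
)

type gzipResponseWriter struct {
	http.ResponseWriter
	zw io.Writer
}

func (w gzipResponseWriter) Write(b []byte) (int, error) { return w.zw.Write(b) }

// WithGzip compresses responses only for clients that advertise gzip support;
// everyone else gets the plain response.
func WithGzip(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			next.ServeHTTP(w, r)
			return
		}
		w.Header().Set("Content-Encoding", "gzip")
		zw := gzip.NewWriter(w)
		defer zw.Close()
		next.ServeHTTP(gzipResponseWriter{ResponseWriter: w, zw: zw}, r)
	})
}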

@notbdu (Contributor) left a comment:

Left a few comments but otherwise LGTM. Although I don't have a ton of context here haha.

len(values), len(cols))
}

rows := len(cols[0].Values)
Reviewer (Contributor):

Q: Is the number of rows always the same for every column?

arnikola (author):

Yeah, this thing is a bit awkward (I've been meaning to update it), but basically it has a row per series; since the series are expected to be in steps by this point, it's safe to say they're the same size here.

Been meaning to refactor this file (and a bunch of other workflows).

@@ -357,13 +357,13 @@ func buildUnconsolidatedSeriesBlock(ctrl *gomock.Controller,
it.EXPECT().Err().Return(nil).AnyTimes()
it.EXPECT().Next().Return(true)
it.EXPECT().Next().Return(false)
vals := make([]ts.Datapoints, numSteps)
vals := make(ts.Datapoints, numSteps)
Reviewer (Contributor):

Is ts.Datapoints a slice? Do we need to initialize as vals := make(ts.Datapoints, 0, numSteps)?

arnikola (author):

Yeah, ts.Datapoints is just an alias for []ts.Datapoint; updated to use append properly
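For context, the length vs. capacity distinction being fixed, sketched with assumed field names (ts.Datapoints is just []ts.Datapoint per the comment above):

type Datapoint struct {
	TimestampNanos int64
	Value          float64
}
type Datapoints []Datapoint

func buildValues(numSteps int) Datapoints {
	// make(Datapoints, numSteps) would create numSteps zero-valued entries,
	// so appends would land after them and double the slice's length.
	// With a zero length and a capacity, appends fill from index 0 without
	// reallocating until numSteps entries have been added.
	vals := make(Datapoints, 0, numSteps)
	for i := 0; i < numSteps; i++ {
		vals = append(vals, Datapoint{TimestampNanos: int64(i), Value: float64(i)})
	}
	return vals
}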

) (storage.PromResult, error) {
stores := filterStores(s.stores, s.fetchFilter, query)
// Optimization for the single store case
if len(stores) == 1 {
Reviewer (Contributor):

Q: Does this save a lot of perf when we don't enter the loop below?

arnikola (author):

Honestly, probably not a huge amount; was more following existing functions

wg.Add(len(stores))
resultMeta := block.NewResultMetadata()
for _, store := range stores {
store := store
Reviewer (Contributor):

Q: Do we normally use this pattern or pass the variable into the fn as a param?

arnikola (author):

I think we generally use this approach, haven't seen many places where we've passed it by param

Reviewer (Collaborator):

Yeah, we usually avoid passing by param. Even though it may seem "safer", we use worker pools in a lot of places where you can't pass the variable as a param to the fn (the pool only accepts a func() {}), so we've found it's better to get used to doing it one way all the time and not forget to take a copy for the inner lambda.
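The capture pattern in question, as a small self-contained sketch (plain goroutines stand in for the worker pool):

package main

import (
	"fmt"
	"sync"
)

func main() {
	stores := []string{"store-a", "store-b", "store-c"}
	var wg sync.WaitGroup
	for _, store := range stores {
		store := store // shadow the loop variable so each goroutine gets its own copy
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Without the shadowing line above (pre Go 1.22 semantics), every
			// goroutine could observe whatever value `store` held when it ran.
			fmt.Println("fetching from", store)
		}()
	}
	wg.Wait()
}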

return nil, err
}

samples := make([]prompb.Sample, 0, initRawFetchAllocSize)
Reviewer (Contributor):

Q: Would pooling stuff like samples and labels make a noticeable diff to perf?

arnikola (author), Dec 3, 2019:

Because we can't really predict the sizes of the slices we need here, we can run into problems where a certain query needs to create huge slices, which get released to the pool but never shrunk back. We could potentially use a pool that has a chance to release and re-allocate smaller slices periodically, but I'm not super sure how big an impact that would have.
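For reference, one common way to bound that problem, sketched with a hypothetical sync.Pool (not what this PR does): oversized slices are simply dropped on Put, so one huge query can't pin memory in the pool forever.

package pool

import "sync"

const (
	defaultSampleCap = 1024  // assumed starting capacity
	maxPooledCap     = 65536 // slices larger than this are left for the GC
)

var samplePool = sync.Pool{
	New: func() interface{} {
		s := make([]float64, 0, defaultSampleCap)
		return &s
	},
}

func getSamples() *[]float64 { return samplePool.Get().(*[]float64) }

func putSamples(s *[]float64) {
	if cap(*s) > maxPooledCap {
		return // don't pool unusually large slices; let them be collected
	}
	*s = (*s)[:0]
	samplePool.Put(s)
}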

if it.datapoints == nil {
it.datapoints = make(ts.Datapoints, 0, initBlockReplicaLength)
} else {
it.datapoints = it.datapoints[:0]
Reviewer (Contributor):

nit: Should this be done in some sort of iterator Reset() function?

arnikola (author):

I think Reset() is usually used when we want to change the underlying data in an Iterator; here, we're only filling up the data that's returned from Current, and the underlying data stays the same.

// metas := make([]SeriesMeta, count)
// for i := range metas {
// metas[i] = SeriesMeta{Name: []byte(fmt.Sprint(i))}
// }
Reviewer (Collaborator):

nit: Remove these commented-out lines?

arnikola (author):

Nice, may do it in followup

continue
}

filtered = append(filtered, l)
Reviewer (Collaborator):

nit: We could make this a little more performant by:

  1. starting a for loop that compares each label until we hit one we need to skip
  2. if we find a label we need to skip, copying everything up to that element into "filtered", then continuing to copy as we go
  3. if we get to the end and none needed to be skipped, just taking a reference to the original slice

This would avoid memcpying the results back into the original for any results that don't have the label.

But maybe an overoptimization... so only do it if you feel like it's worthwhile. I also realize this only happens if the filter is specified at all, so maybe not worth doing for now.
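A sketch of the copy-on-write filtering described above (Label and shouldSkip are stand-ins for the actual types and filter):

type Label struct {
	Name, Value []byte
}

// filterLabels returns the original slice untouched when no label matches the
// filter, and only starts copying once the first label to drop is found.
func filterLabels(labels []Label, shouldSkip func(Label) bool) []Label {
	for i, l := range labels {
		if !shouldSkip(l) {
			continue
		}
		// First label to drop: copy everything before it, then keep appending
		// only the labels we want from here on.
		filtered := make([]Label, i, len(labels)-1)
		copy(filtered, labels[:i])
		for _, rest := range labels[i+1:] {
			if !shouldSkip(rest) {
				filtered = append(filtered, rest)
			}
		}
		return filtered
	}
	// Nothing needed skipping; take a reference to the original slice.
	return labels
}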

arnikola (author):

I'll keep that pattern in mind, but this path is likely cold enough that it doesn't really matter much given the extra complexity. If we see this showing up in traces, happy to revisit though.

batch, err := bl.MultiSeriesIter(concurrency)
if err != nil {
// NB: do not have to set the iterator error here, since not all
// contained blocks necessarily allow mutli series iteration.
Reviewer (Collaborator):

nit: Typo "mutli" -> "multi"

arnikola (author):

Nice catch, may handle that in the cleanup PR if that's ok

vals := make([]ts.Datapoints, numSteps)
for i := range vals {
vals := make(ts.Datapoints, 0, numSteps)
// for i := range vals {
Reviewer (Collaborator):

nit: Remove?

arnikola (author):

👍 removed in cleanup

func (b *ucEmptyBlock) MultiSeriesIter(
concurrency int,
) ([]UnconsolidatedSeriesIterBatch, error) {
batch := make([]UnconsolidatedSeriesIterBatch, concurrency)
Reviewer (Collaborator):

nit: Would it make sense to use append for consistency?

arnikola (author):

For cases like this where we're initializing the entire slice to a default value I think this reads a little clearer than doing:

batch := make([]UnconsolidatedSeriesIterBatch, 0, concurrency)
for i := 0; i < concurrency; i++ {

What do you think?

for i, vals := range s.datapoints {
values[i] = consolidationFunc(vals)
for i, v := range s.datapoints {
values[i] = v.Value
Reviewer (Collaborator):

Hm, are we not performing the consolidation function anymore?

Why is consolidationFunc passed in here but not used anywhere?

arnikola (author):

Ah yeah, so this particular file is not long for this repo... I was intending to delete this (and have in the cleanup), and it's (now) on a dead codepath.

go func() {
err := buildBlockBatch(loopIndex, batch.Iter, builder, m, p, &mu)
mu.Lock()
// NB: this no-ops if the error is nil.
Reviewer (Collaborator):

We could skip the mu.Lock() if the error is nil, however. Is that worth it?

arnikola (author):

Sounds good, may handle that in the cleanup PR if that's ok
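For reference, the shape of that suggestion (a sketch based on the snippet above; recordErr stands in for whatever the real goroutine does while holding the lock):

go func() {
	err := buildBlockBatch(loopIndex, batch.Iter, builder, m, p, &mu)
	if err == nil {
		return // nothing to record, so the mutex never needs to be taken
	}
	mu.Lock()
	recordErr(err) // hypothetical: accumulate the error under the lock
	mu.Unlock()
}()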

mu.Lock()
// NB: this sets the values internally, so no need to worry about keeping
// a reference to underlying `values`.
err := builder.SetRow(idx, values, blockMeta.seriesMeta[idx])
Reviewer (Collaborator):

Is the lock required if SetRow(...) only touches that row? (Which, looking at SetRow, it looks like it does.)

I know it could be potentially dangerous, although it could avoid a lot of locking across batches when processing series after series.

Reviewer (Collaborator):

Maybe just best to keep the locks..

arnikola (author):

Yeah, I think it's safer to keep the locks here; applying the actual query is a much heavier process than writing the rows, so any increase is likely negligible.

@@ -341,7 +363,8 @@ func getIndices(
l = i
}

if !ts.After(rBound) {
if ts <= rBound {
// if !ts.After(rBound) {
Reviewer (Collaborator):

nit: Remove the commented-out line?

arnikola (author):

Removed in cleanup 👍


for i, v := range sink.Values {
fmt.Println(i, v)
fmt.Println(" ", tt.expected[i])
Reviewer (Collaborator):

Still need these printlns?

arnikola (author):

Removed in cleanup 👍

// linearRegression performs a least-square linear regression analysis on the
// provided datapoints. It returns the slope, and the intercept value at the
// provided time.
// Uses this algorithm: https://en.wikipedia.org/wiki/Simple_linear_regression.
func linearRegression(
dps ts.Datapoints,
interceptTime time.Time,
interceptTime int64,
Reviewer (Collaborator):

nit: Can we use xtime.UnixNano instead of a pure int64 here and in other places? Feels like it will be safer and easier to grok/read too. There are a lot of To/From helpers on that type as well.

arnikola (author):

Good call, changed all of these in the upcoming PR
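For reference, the least-squares computation described in the doc comment above, sketched with int64 nanosecond timestamps (illustrative only, not necessarily the repo's exact code; the Datapoint fields are assumed):

type Datapoint struct {
	TimestampNanos int64
	Value          float64
}

// linearRegression returns the slope (per nanosecond) and the value of the
// fitted line at interceptTime, using ordinary least squares.
func linearRegression(dps []Datapoint, interceptTime int64) (slope, intercept float64) {
	var sumX, sumY, sumXY, sumXX float64
	n := float64(len(dps))
	for _, dp := range dps {
		// Offset x by interceptTime so the intercept is the value at that time.
		x := float64(dp.TimestampNanos - interceptTime)
		sumX += x
		sumY += dp.Value
		sumXY += x * dp.Value
		sumXX += x * x
	}
	covXY := sumXY - sumX*sumY/n // n times the covariance of x and y
	varX := sumXX - sumX*sumX/n  // n times the variance of x; real code should guard varX == 0
	slope = covXY / varX
	intercept = sumY/n - slope*sumX/n
	return slope, intercept
}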

@@ -139,20 +143,18 @@ func standardRateFunc(
counterCorrection float64
firstVal, lastValue float64
firstIdx, lastIdx int
firstTS, lastTS time.Time
firstTS, lastTS int64

lBound time.Time,
rBound time.Time,
lBound int64,
rBound int64,

@@ -221,7 +223,8 @@ func irateFunc(
datapoints ts.Datapoints,
isRate bool,
_ bool,
timeSpec transform.TimeSpec,
_ int64,
_ int64,

arnikola (author):

Good call, changed all of these in the upcoming PR

}

result, _, err := accumulator.FinalResultWithAttrs()
defer accumulator.Close()
Reviewer (Collaborator):

Why not defer accumulator.Close() before the call to FinalResultWithAttrs, since it will run anyway? (It seems more normal to do a defer foo.Close() just after you have access to something rather than a few more lines down.)

arnikola (author):

Good call, changed in cleanup 👍

identTag := identTags.Current()
labels = append(labels, prompb.Label{
Name: identTag.Name.Bytes(),
Value: identTag.Value.Bytes(),
Reviewer (Collaborator):

Are these identTags guaranteed not to be closed until we've returned/used the labels? Are they not pooled at all? If they were pooled, I could see taking raw refs to the bytes causing problems (since the tags could be returned to the pool while you hang onto the bytes via the prompb.Labels).

arnikola (author):

Good catch, will update

if err != nil {
// Return the first error that is encountered.
select {
case errorCh <- err:
Reviewer (Collaborator):

Hm, it seems racey here that errorCh might already be closed?

This is because stopped() is not read and no lock is held to guarantee that errorCh is still open before enqueuing.

Unfortunately, even in a select like this, writing to a closed channel will panic:
https://play.golang.org/p/XmRlQbixatB

You can use a cancellable lifetime here if you like:
https://github.com/m3db/m3/blob/master/src/x/resource/lifetime.go#L28:6

As used here:
https://github.com/m3db/m3/blob/master/src/dbnode/storage/index/block.go#L832-L840
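One way to make the send safe, as a rough sketch (not the linked lifetime helper): never close errorCh itself, and signal shutdown through a separate done channel that the select can observe.

// errorSink delivers at most one error and is safe to use after stop().
type errorSink struct {
	errorCh chan error
	done    chan struct{}
}

func newErrorSink() *errorSink {
	return &errorSink{
		errorCh: make(chan error, 1),
		done:    make(chan struct{}),
	}
}

// send never panics: errorCh is only drained, never closed, and shutdown is
// signalled by closing done instead.
func (s *errorSink) send(err error) {
	select {
	case s.errorCh <- err:
	case <-s.done:
	default:
		// an error has already been enqueued; drop this one
	}
}

func (s *errorSink) stop() { close(s.done) }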

@@ -85,8 +92,10 @@ func (it *encodedSeriesIterUnconsolidated) Next() bool {
return false
}

alignedValues := values.AlignToBoundsNoWriteForward(it.meta.Bounds, it.lookbackDuration)
it.series = block.NewUnconsolidatedSeries(alignedValues, it.seriesMeta[it.idx])
it.series = block.NewUnconsolidatedSeries(
Reviewer (Collaborator):

Do we not need to AlignToBoundsNoWriteForward anymore?

arnikola (author):

No; in fact I've managed to remove it in the cleanup

arnikola (author):

For more context, this was only being used in temporal functions, which immediately unrolled it back into a flat list :P

I've since split the consolidated/unconsolidated distinction so that StepIterator = consolidated and SeriesIterator = unconsolidated, since unconsolidated steps or consolidated series don't make a lot of sense.

)

type encodedSeriesIterUnconsolidated struct {
idx int
lookbackDuration time.Duration
err error
meta block.Metadata
datapoints ts.Datapoints
alignedValues []ts.Datapoints
Reviewer (Collaborator):

Is alignedValues being used anywhere?

arnikola (author):

Removed in cleanup 👍

arnikola merged commit 8cbe5a8 into master Dec 10, 2019
arnikola added a commit that referenced this pull request Dec 10, 2019
arnikola added a commit that referenced this pull request Dec 10, 2019
robskillington deleted the arnikola/temporal branch January 28, 2020 16:44