improvement(query): performance improvement for sorted merge iterator #17596

foobar · 2020-04-03T07:18:51Z

The sorted merge iterator performs CPU-intensive comparisons to sort the
points from multiple inputs. Typical queries like SELECT * FROM m GROUP BY *
perform poorly because of these point comparisons, even though in many
cases the slow path is not needed.

This patch adds a shortcut. If each input has a single and unique
series we can just return the points input by input.
The detection of the shortcut introduces slight overhead but the gains
are significant in many slow queries.
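
A minimal sketch of the shortcut described above, under simplified assumptions. The names `point`, `seriesKey`, `canUseFastPath` and `mergeFast` are hypothetical and are not the identifiers used in query/iterator.gen.go.tmpl; inputs are modelled as plain slices, and only ascending order is handled. The idea it illustrates: when every input carries exactly one series and no two inputs share a series, ordering the inputs once by series and draining them one after another gives the same result as comparing points one by one.

```go
package main

import (
	"fmt"
	"sort"
)

// point is a hypothetical, simplified stand-in for the iterator's point type.
type point struct {
	Name string // measurement name
	Tags string // stand-in for the tag set ID
	Time int64
}

// seriesKey identifies the series a point belongs to.
func seriesKey(p point) string { return p.Name + "\x00" + p.Tags }

// canUseFastPath reports whether every input holds exactly one series and
// no series appears in more than one input.
func canUseFastPath(inputs [][]point) bool {
	seen := make(map[string]bool, len(inputs))
	for _, in := range inputs {
		if len(in) == 0 {
			continue
		}
		key := seriesKey(in[0])
		for _, p := range in[1:] {
			if seriesKey(p) != key {
				return false // this input mixes several series
			}
		}
		if seen[key] {
			return false // the same series shows up in two inputs
		}
		seen[key] = true
	}
	return true
}

// mergeFast orders the inputs by (Name, Tags) of their first point and emits
// them input by input. It assumes canUseFastPath(inputs) is true and that
// each input is already sorted by time.
func mergeFast(inputs [][]point) []point {
	nonEmpty := make([][]point, 0, len(inputs))
	for _, in := range inputs {
		if len(in) > 0 {
			nonEmpty = append(nonEmpty, in)
		}
	}
	sort.Slice(nonEmpty, func(i, j int) bool {
		a, b := nonEmpty[i][0], nonEmpty[j][0]
		if a.Name != b.Name {
			return a.Name < b.Name
		}
		return a.Tags < b.Tags
	})
	var out []point
	for _, in := range nonEmpty {
		out = append(out, in...)
	}
	return out
}

func main() {
	inputs := [][]point{
		{{"m", "host=b", 1}, {"m", "host=b", 2}},
		{{"m", "host=a", 1}, {"m", "host=a", 2}},
	}
	if canUseFastPath(inputs) {
		for _, p := range mergeFast(inputs) {
			fmt.Println(p.Name, p.Tags, p.Time)
		}
	}
}
```

In this sketch the detection cost grows with the number of points, which is a simplification for illustration only; the thread describes the real check as adding only slight overhead.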

See #8304

foobar · 2020-04-03T07:23:41Z

@jsternberg @jacobmarble
could you take a look?

e-dard

Hi @foobar thanks for this PR. Very cool!

There are a few things of note:

I have a couple of small suggestions. The main one is to remove the use of +1, and just use 1 instead, but I also made a simplification to the code and a possible short circuit.
I don't know if we will put this work into a 1.7 release. Can you change the base branch to master-1.x? If accepted, we can manage the backport to the 1.8 branch.
I would like to see some benchmarks showing the changes in performance for this change. I'm mainly concerned about whether the extra work to detect if the fast condition will be used will be significantly detrimental to other use-cases (queries).
Can you add some test coverage for detectFast in the form of unit tests?

@stuartcarnie would you please review this too.

Thanks again @foobar for the contribution!

query/iterator.gen.go.tmpl

e-dard · 2020-04-20T07:46:43Z

@foobar hi, I'm happy with what's here so far. However, I would still like to see some benchmarking and testing of the new code path. Further, @stuartcarnie may have his own suggestions. Thanks

foobar · 2020-04-20T13:05:55Z

Thanks for your comments! @e-dard

> I have a couple of small suggestions. The main one is to remove the use of +1, and just use 1 instead, but I also made a simplification to the code and a possible short circuit.

I reworked the code based on your suggestion.

> I don't know if we will put this work into a 1.7 release. Can you change the base branch to master-1.x? If accepted, we can manage the backport to the 1.8 branch.

done

> I would like to see some benchmarks showing the changes in performance for this change. I'm mainly concerned about whether the extra work to detect if the fast condition will be used will be significantly detrimental to other use-cases (queries).

benchmark test added and here is the result on my machine:

go test -bench=BenchmarkSorted ./query/...
goos: linux
goarch: amd64
pkg: github.com/influxdata/influxdb/query
BenchmarkSortedMergeIterator_Fast-32                         166           6738013 ns/op
BenchmarkSortedMergeIterator_NotFast-32                       19          61152186 ns/op
BenchmarkSortedMergeIterator_FastCheckOverhead-32        1718110               700 ns/op

> Can you add some test coverage for detectFast in the form of unit tests?

added test cases

stuartcarnie

Overall, this is a great improvement 🥇

May I suggest the following small change, which may be a little easier for a future maintainer to understand. It is also likely to be a little more efficient and avoids the use of strings.Compare, which is not recommended by the Go documentation and the source itself:

	var less func(i, j int) bool
	if h.opt.Ascending {
		less = func(i, j int) bool {
			x, y := s[i].point, s[j].point
			if x.Name != y.Name {
				return x.Name < y.Name
			}

			if x.Tags.ID() != y.Tags.ID() {
				return x.Tags.ID() < y.Tags.ID()
			}

			hasDup = true
			return false
		}
	} else {
		less = func(i, j int) bool {
			x, y := s[i].point, s[j].point
			if x.Name != y.Name {
				return x.Name > y.Name
			}

			if x.Tags.ID() != y.Tags.ID() {
				return x.Tags.ID() > y.Tags.ID()
			}

			hasDup = true
			return false
		}
	}
	sort.Slice(s, less)
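
For readers who want to try the suggestion outside the iterator template, here is a self-contained sketch of the same comparison. It is a variation, not the reviewed code: `item` and `sortItems` are hypothetical names, `TagsID` stands in for `x.Tags.ID()`, and the direction check is folded into one closure instead of two. As in the snippet above, duplicate detection relies on the sort comparing the two equal items with each other at some point.

```go
package main

import (
	"fmt"
	"sort"
)

// item is a hypothetical stand-in for the head point of a heap item.
// TagsID plays the role of x.Tags.ID() in the snippet above.
type item struct {
	Name   string
	TagsID string
}

// sortItems sorts s in place by (Name, TagsID) in the requested direction and
// reports whether any two compared items were equal on both fields.
func sortItems(s []item, ascending bool) (hasDup bool) {
	sort.Slice(s, func(i, j int) bool {
		x, y := s[i], s[j]
		if x.Name != y.Name {
			if ascending {
				return x.Name < y.Name
			}
			return x.Name > y.Name
		}
		if x.TagsID != y.TagsID {
			if ascending {
				return x.TagsID < y.TagsID
			}
			return x.TagsID > y.TagsID
		}
		// Same name and same tags: the two inputs share a series.
		hasDup = true
		return false
	})
	return hasDup
}

func main() {
	unique := []item{{"m", "host=b"}, {"m", "host=a"}, {"cpu", "host=z"}}
	fmt.Println(sortItems(unique, true), unique)

	shared := []item{{"m", "host=a"}, {"x", "host=y"}, {"m", "host=a"}}
	fmt.Println(sortItems(shared, true), shared)
}
```

Plain `<` and `>` on strings are used here, matching the suggestion to avoid strings.Compare.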

As a follow on to #17596, performance of all merge operations can be improved by removing allocations when comparing tags.

`benchstat` results:

```
name                    old time/op    new time/op    delta
SortedMergeIterator-16    32.4ms ± 2%     5.2ms ± 3%  -83.81%  (p=0.000 n=10+10)

name                    old alloc/op   new alloc/op   delta
SortedMergeIterator-16    36.5MB ± 0%     5.8MB ± 0%  -84.20%  (p=0.000 n=10+9)

name                    old allocs/op  new allocs/op  delta
SortedMergeIterator-16      420k ± 0%       60k ± 0%  -85.71%  (p=0.000 n=9+10)
```

foobar · 2020-04-23T03:25:21Z

Thanks @stuartcarnie for reviewing this PR.

> May I suggest the following small change, which may be a little easier for a future maintainer to understand. It is also likely to be a little more efficient and avoids the use of strings.Compare, which is not recommended by the Go documentation and the source itself

Judging from the source, it shouldn't make a significant difference in terms of performance; for readability, the current code is shorter with @e-dard's comments applied. I'm fine with either.
@e-dard, your thoughts?

foobar · 2020-05-04T14:37:56Z

@e-dard @stuartcarnie @rickspencer3 any other comments?

e-dard · 2020-05-06T13:11:41Z

@stuartcarnie can you run with this? I've got a bunch of other reviews backed up :-)

stuartcarnie · 2020-05-06T13:17:06Z

I am ok, as-is. Thanks again, @foobar!

@dgnorton if you are happy, you are welcome to merge it

foobar · 2020-05-11T12:25:50Z

Hi @dgnorton, have you had a chance to look at this?

stuartcarnie

@dgnorton everything looks good to me

foobar · 2020-05-15T02:18:45Z

@dgnorton could you get it merged?

ayang64 · 2020-05-18T19:36:41Z

@foobar, @stuartcarnie

Please forgive me if I'm off base -- I haven't tested this thoroughly but I think the entire less func could be simplified to something like this:

less := func(i, j int) bool {
	x, y := s[i].point, s[j].point
	hasDup = hasDup || x.Name == y.Name
	return ((x.Name < y.Name) || (x.Tags.ID() < y.Tags.ID())) && h.opt.Ascending
}

this should be a bit faster. i haven't tested or benchmarked it though.

ayang64 · 2020-05-18T19:45:33Z

query/iterator.gen.go

+	s := make([]*floatSortedMergeHeapItem, len(h.items))
+	copy(s, h.items)
+
+	less := func(i, j int) bool {


please investigate if something like the following would work as a comparator:

less := func(i, j int) bool {
	x, y := s[i].point, s[j].point
	hasDup = hasDup || x.Name == y.Name
	return ((x.Name < y.Name) || (x.Tags.ID() < y.Tags.ID())) && h.opt.Ascending
}

i think that should be at least as fast.

stuartcarnie · 2020-05-19

@ayang64 the less function is required to sort in ascending or descending order depending on h.opt.Ascending, and it is unclear to me whether the short version above achieves that. Also, your proposed version sets hasDup to true whenever x.Name == y.Name, whereas the original function sets hasDup to true if and only if both the Name and Tags are equal, which is a required property.

I didn't see any obvious performance issues with the existing code, and the overall improvement is significant.
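
The two concerns raised in this reply can be checked with a small standalone program. This is only a sketch: `item` and `shortLess` are hypothetical names, `TagsID` stands in for `x.Tags.ID()`, and the parentheses of the one-line comparator are balanced in the way that seems intended, which is an assumption.

```go
package main

import "fmt"

// item is a hypothetical stand-in for the head point of a heap item.
type item struct{ Name, TagsID string }

// shortLess mirrors the proposed one-line comparator. hasDup is passed as a
// pointer here because the sketch has no enclosing closure to capture it.
func shortLess(x, y item, ascending bool, hasDup *bool) bool {
	*hasDup = *hasDup || x.Name == y.Name
	return (x.Name < y.Name || x.TagsID < y.TagsID) && ascending
}

func main() {
	var hasDup bool
	a, b := item{"cpu", "host=a"}, item{"cpu", "host=b"}

	// Descending: the result is always false, so the items are never reordered.
	fmt.Println(shortLess(a, b, false, &hasDup), shortLess(b, a, false, &hasDup)) // false false

	// Same name but different tags is already flagged as a duplicate.
	fmt.Println(hasDup) // true

	// Ascending: each item can compare as "less" than the other, so the
	// comparator does not define a consistent order.
	p, q := item{"b", "host=a"}, item{"a", "host=b"}
	fmt.Println(shortLess(p, q, true, &hasDup), shortLess(q, p, true, &hasDup)) // true true
}
```

The last case arises because the tag comparison is evaluated even when the names differ, which the two-step comparison in the merged code avoids.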

foobar · 2020-06-11T09:40:43Z

Hi @timhallinflux, could this make the 1.8.1 milestone?

See Stuart's response

foobar force-pushed the optimize-sorted-merge-iterator branch from b8c6606 to 22f297a Compare April 3, 2020 07:20

foobar changed the title ~~feat(query): Performance improvement for sorted merge iterator~~ feat(query): performance improvement for sorted merge iterator Apr 3, 2020

foobar changed the title ~~feat(query): performance improvement for sorted merge iterator~~ improvement(query): performance improvement for sorted merge iterator Apr 3, 2020

rbetts requested review from rickspencer3 and e-dard April 16, 2020 21:46

timhallinflux added 1.x area/performance labels Apr 16, 2020

e-dard previously requested changes Apr 17, 2020

e-dard requested a review from stuartcarnie April 17, 2020 11:52

foobar changed the base branch from 1.7 to master-1.x April 20, 2020 03:21

foobar changed the base branch from master-1.x to 1.7 April 20, 2020 03:36

foobar force-pushed the optimize-sorted-merge-iterator branch from 22f297a to 9eefd9f Compare April 20, 2020 04:01

foobar changed the base branch from 1.7 to master-1.x April 20, 2020 04:02

foobar force-pushed the optimize-sorted-merge-iterator branch from 9eefd9f to 7aa6fde Compare April 20, 2020 04:09

foobar force-pushed the optimize-sorted-merge-iterator branch from 78db9df to c7d360d Compare April 20, 2020 11:35

foobar requested a review from e-dard April 20, 2020 13:06

foobar force-pushed the optimize-sorted-merge-iterator branch from 1cd5a47 to af8e66c Compare April 20, 2020 13:06

stuartcarnie requested changes Apr 22, 2020

stuartcarnie mentioned this pull request Apr 22, 2020

feat(query): Improve query performance #17828

Closed

foobar requested a review from stuartcarnie April 28, 2020 09:53

stuartcarnie approved these changes May 11, 2020

ayang64 previously requested changes May 18, 2020

e-dard removed their request for review May 22, 2020 09:13

foobar requested review from ayang64 and e-dard June 11, 2020 06:27

jacobmarble requested a review from dgnorton June 22, 2020 22:07

dgnorton approved these changes Jun 23, 2020

dgnorton merged commit 78a05d1 into influxdata:master-1.x Jun 23, 2020

jacobmarble mentioned this pull request Jun 23, 2020

improvement(query): performance improvement for sorted merge iterator #17733

Closed

foobar deleted the optimize-sorted-merge-iterator branch June 24, 2020 03:51
