Optimize SHOW SERIES #6533

Merged
merged 1 commit into from
May 3, 2016

Conversation

benbjohnson
Contributor

@benbjohnson benbjohnson commented May 2, 2016

Overview

This commit changes the SeriesIterator to use a heap for sorting and uses a floatFastDedupeIterator to avoid point encoding during deduplication.
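
For readers unfamiliar with heap-based sorting across several sorted inputs, here is a minimal sketch of the general technique using the standard `container/heap` package. It is not the code in this PR: the `cursor`, `cursorHeap`, and `mergeSorted` names are hypothetical, and each input slice stands in for one shard's already-sorted series keys.

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor is a hypothetical stand-in for one shard's sorted series-key iterator.
type cursor struct {
	keys []string // sorted ascending
	pos  int      // index of the current key
}

// cursorHeap is a min-heap of cursors ordered by their current key.
type cursorHeap []*cursor

func (h cursorHeap) Len() int            { return len(h) }
func (h cursorHeap) Less(i, j int) bool  { return h[i].keys[h[i].pos] < h[j].keys[h[j].pos] }
func (h cursorHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() interface{} {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeSorted merges sorted inputs into one sorted slice and drops
// adjacent duplicates, which is enough because the output is sorted.
func mergeSorted(inputs ...[]string) []string {
	h := &cursorHeap{}
	for _, in := range inputs {
		if len(in) > 0 {
			*h = append(*h, &cursor{keys: in})
		}
	}
	heap.Init(h)

	var out []string
	for h.Len() > 0 {
		c := (*h)[0] // cursor holding the smallest current key
		key := c.keys[c.pos]
		if len(out) == 0 || out[len(out)-1] != key {
			out = append(out, key)
		}
		c.pos++
		if c.pos < len(c.keys) {
			heap.Fix(h, 0) // the cursor advanced, so restore heap order
		} else {
			heap.Pop(h) // the cursor is exhausted, so remove it
		}
	}
	return out
}

func main() {
	a := []string{"cpu,host=a", "cpu,host=c"}
	b := []string{"cpu,host=b", "cpu,host=c", "mem,host=a"}
	fmt.Println(mergeSorted(a, b))
}
```

Each step costs O(log k) for k inputs, so the heap's bookkeeping grows with the number of shards being merged.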

Required for all non-trivial PRs
  • Rebased/mergeable
  • Tests pass
  • CHANGELOG.md updated
  • Sign CLA (if not already signed)

@mention-bot

By analyzing the blame information on this pull request, we identified @e-dard and @mark-rushakoff to be potential reviewers

@jwilder
Contributor

jwilder commented May 2, 2016

[ERROR] run: Command 'go test -v -parallel 1 -timeout 480s ./...' failed with error: # github.com/influxdata/influxdb/cmd/influxd/run
cmd/influxd/run/server_bench_test.go:10:2: cannot find package "github.com/pkg/profile" in any of:
    /usr/local/go/src/github.com/pkg/profile (from $GOROOT)
    /root/go/src/github.com/pkg/profile (from $GOPATH)
FAIL    github.com/influxdata/influxdb/cmd/influxd/run [setup failed]

@jwilder
Contributor

jwilder commented May 2, 2016

I see about a 25% improvement locally. Number of shards still significantly impacts performance though.

LGTM though.

@jwilder jwilder added this to the 0.13.0 milestone May 2, 2016
@benbjohnson
Contributor Author

@jwilder I'm removing the heap because I don't think it's helping. I'm doing some tests on the large shard set to see if deferring the series fetching will help. I'll push up the changes in a bit once I get some numbers.

@benbjohnson benbjohnson force-pushed the optimize-show-series branch 2 times, most recently from 5e223e8 to 25d364b Compare May 2, 2016 21:25
@benbjohnson benbjohnson changed the title Optimize SHOW SERIES (WIP) Optimize SHOW SERIES May 2, 2016
@benbjohnson
Contributor Author

@jwilder I was able to get the 140 shards w/ 1M series down to 5m and memory usage is around 6GB in the latest commit (25d364b). All tests passing. Ready for review.

dst = append(dst, m.seriesByID[id].Key)
}
return dst
}
Contributor Author

This function was added because we were previously locking and unlocking for every single series access. This batches it all together and allows us to reuse the buffer.
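
To illustrate the difference, here is a minimal sketch that contrasts a per-series lookup with a batched one. The `measurement` and `series` types are simplified, hypothetical stand-ins, and `seriesKey` and `appendSeriesKeys` are illustrative names rather than the functions in this diff. The nil check is also an addition for the sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// series and measurement are simplified, hypothetical stand-ins
// for the real index types.
type series struct{ Key string }

type measurement struct {
	mu         sync.RWMutex
	seriesByID map[uint64]*series
}

// seriesKey looks up a single key and pays for a lock/unlock on every call.
func (m *measurement) seriesKey(id uint64) string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	if s := m.seriesByID[id]; s != nil {
		return s.Key
	}
	return ""
}

// appendSeriesKeys takes the read lock once for the whole batch and appends
// into dst, so the caller can reuse one buffer across calls.
func (m *measurement) appendSeriesKeys(dst []string, ids []uint64) []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	for _, id := range ids {
		// Unknown IDs are skipped here; that is a choice made for this sketch.
		if s := m.seriesByID[id]; s != nil {
			dst = append(dst, s.Key)
		}
	}
	return dst
}

func main() {
	m := &measurement{seriesByID: map[uint64]*series{
		1: {Key: "cpu,host=a"},
		2: {Key: "cpu,host=b"},
	}}

	buf := make([]string, 0, 16)

	// buf[:0] keeps the backing array, so later batches do not allocate
	// as long as they fit in the existing capacity.
	buf = m.appendSeriesKeys(buf[:0], []uint64{1, 2})
	fmt.Println(buf)

	buf = m.appendSeriesKeys(buf[:0], []uint64{2})
	fmt.Println(buf)
}
```

The trade-off is that the read lock is held for the duration of the batch, so very large batches delay writers longer than individual lookups would.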

@benbjohnson benbjohnson force-pushed the optimize-show-series branch 2 times, most recently from d90c0f0 to badc884 Compare May 2, 2016 21:54
@jwilder
Contributor

jwilder commented May 2, 2016

👍 on green.

This commit changes the `SeriesIterator` to process one measurement
at a time and uses a `floatFastDedupeIterator` to avoid point
encoding during deduplication.
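
One way deduplication can avoid encoding each point is to key the "seen" set on a small comparable struct instead of a serialized copy of the point. The sketch below shows that idea only. The `point` and `dedupeKey` types and the `dedupePoints` function are hypothetical, and this is an assumption about the approach, not the `floatFastDedupeIterator` implementation.

```go
package main

import "fmt"

// point is a hypothetical, pared-down point carrying a measurement name
// and one auxiliary value (for SHOW SERIES, the series key).
type point struct {
	Name string
	Aux  string
}

// dedupeKey is a comparable struct, so it can serve as a map key directly
// and no point has to be serialized to bytes first.
type dedupeKey struct {
	name string
	aux  string
}

// dedupePoints returns the first occurrence of each (Name, Aux) pair,
// preserving input order.
func dedupePoints(points []point) []point {
	seen := make(map[dedupeKey]struct{}, len(points))
	var out []point
	for _, p := range points {
		k := dedupeKey{name: p.Name, aux: p.Aux}
		if _, ok := seen[k]; ok {
			continue // duplicate, already emitted
		}
		seen[k] = struct{}{}
		out = append(out, p)
	}
	return out
}

func main() {
	in := []point{
		{Name: "cpu", Aux: "cpu,host=a"},
		{Name: "cpu", Aux: "cpu,host=a"},
		{Name: "mem", Aux: "mem,host=a"},
	}
	fmt.Println(len(dedupePoints(in)))
}
```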
@benbjohnson benbjohnson merged commit 417df18 into influxdata:master May 3, 2016
@benbjohnson benbjohnson deleted the optimize-show-series branch May 3, 2016 15:15
@jwilder jwilder mentioned this pull request May 3, 2016

3 participants