inmem startup improvements #9164

stuartcarnie · 2017-11-27T21:15:18Z

- ~~only call ParseTags when necessary~~
  - due to introduction of SeriesFile, this optimization is no longer available
- remove dependency on inmem.Series in tsdb test package
- Measurement and Series are no longer exported. Their use is restricted to the inmem package
- improve Measurement and Series types by exporting immutable fields and removing unnecessary APIs and locks

~~Reduced startup time from 28s to 17s. Overall improvement including #9162 reduces startup from 46s to 17s for 1MM series across 14 shards.~~

Required for all non-trivial PRs

- Rebased/mergeable
- Tests pass
- CHANGELOG.md updated

stuartcarnie · 2017-11-27T21:21:35Z

models/points.go

 	walkTags(buf, func(key, value []byte) bool {
-		tags = append(tags, NewTag(key, value))
+		tags[p].Key = key


@jwilder this change removes a runtime.typedmemmove
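As a rough illustration of the change discussed above, the following sketch contrasts appending a freshly built struct with writing fields into a pre-sized slice. The `Tag` type, the `pairs` input, and both function names are hypothetical stand-ins, and the `Value` assignment is assumed since the diff shows only the `Key` line. Whether the compiler emits `runtime.typedmemmove` for the append form depends on the Go version, so treat this as a sketch rather than a measurement.

```go
package main

import "fmt"

// Tag is a hypothetical stand-in for a key/value tag type.
type Tag struct {
	Key   []byte
	Value []byte
}

// appendTags builds each Tag as a value and appends it to the slice.
func appendTags(pairs [][2][]byte) []Tag {
	tags := make([]Tag, 0, len(pairs))
	for _, kv := range pairs {
		tags = append(tags, Tag{Key: kv[0], Value: kv[1]})
	}
	return tags
}

// fillTags sizes the slice up front and writes each field in place,
// so no whole-struct copy into the slice is needed.
func fillTags(pairs [][2][]byte) []Tag {
	tags := make([]Tag, len(pairs))
	for p, kv := range pairs {
		tags[p].Key = kv[0]
		tags[p].Value = kv[1]
	}
	return tags
}

func main() {
	pairs := [][2][]byte{
		{[]byte("host"), []byte("a")},
		{[]byte("region"), []byte("west")},
	}
	for _, t := range fillTags(pairs) {
		fmt.Printf("%s=%s\n", t.Key, t.Value)
	}
}
```

Both functions produce the same slice; the difference is only in how each element is written.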

stuartcarnie · 2017-11-27T21:23:02Z

models/points.go

@@ -281,8 +281,8 @@ func ParseKeyBytes(buf []byte) ([]byte, Tags) {
 	return buf[:i], tags
 }

-func ParseTags(buf []byte) (Tags, error) {
-	return parseTags(buf), nil
+func ParseTags(buf []byte) Tags {


the error return value was always nil, so this removes a stack push / pop for every ParseTags call

jwilder

LGTM. @benbjohnson @e-dard might want to take a look. Not sure if this will affect their current dev branch.

jwilder · 2017-11-27T21:40:20Z

tsdb/index/inmem/meta.go

+type measurement struct {
+	Database  string
+	Name      string `json:"name,omitempty"`
+	NameBytes []byte // cached version as []byte


Do we need both a string and a byte version of name? Wonder if we can just have Name []byte.

👍 I'll take a look and adjust if Name is no longer needed

Name is still used in a number of places, however I pushed an additional commit that replaces []byte(Measurement.Name) with Measurement.NameBytes to prevent additional allocations

Cool. Can remove the other places some other time.
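A minimal sketch of the cached-bytes idea from this thread, assuming a trimmed-down `measurement` type. The constructor `newMeasurement` and the helper `appendKey` are hypothetical names introduced only for illustration; the point is that the string-to-bytes conversion happens once instead of at every use.

```go
package main

import "fmt"

// measurement is a hypothetical, reduced version of the type under review.
type measurement struct {
	Name      string
	NameBytes []byte // cached []byte form of Name; callers must not modify it
}

// newMeasurement converts the name to bytes exactly once.
func newMeasurement(name string) *measurement {
	return &measurement{Name: name, NameBytes: []byte(name)}
}

// appendKey (hypothetical) builds "name,suffix" into dst using the cached
// bytes, avoiding a []byte(m.Name) conversion on every call.
func appendKey(dst []byte, m *measurement, suffix string) []byte {
	dst = append(dst, m.NameBytes...)
	dst = append(dst, ',')
	return append(dst, suffix...)
}

func main() {
	m := newMeasurement("cpu")
	fmt.Printf("%s\n", appendKey(nil, m, "host=a"))
}
```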

e-dard

LGTM 👍 I just have a couple of small suggestions/questions.

Can we hold off merging this until Ben and I get the dev branch er-tsi-index-part merged in?

e-dard · 2017-11-28T12:02:33Z

tsdb/engine/tsm1/engine.go

+)
+
+func BlockTypeToInfluxQLDataType(typ byte) influxql.DataType {
+	if int(typ) < len(blockToFieldType) {


As we're on an über optimisation drive here, you could also have:

	fieldTypesN byte = len(blockToFieldType)
	...
	...
	...
	func BlockTypeToInfluxQLDataType(typ byte) influxql.DataType {
		if typ < fieldTypesN {

I tried that, but alas the bounds check elimination (BCE) optimization can't be used. We can only declare FieldTypesN as a var and the compiler cannot prove this value doesn't change outside the BlockTypeToInfluxQLDataType function.

Interesting. 👍
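The pattern in the diff can be sketched as follows, assuming the lookup table is a fixed-size array so that its length is known at compile time. `DataType`, its constants, and the table contents are hypothetical stand-ins for the influxql types, not the actual values in the codebase.

```go
package main

import "fmt"

// DataType is a hypothetical stand-in for influxql.DataType.
type DataType int

const (
	Unknown DataType = iota
	Float
	Integer
	Boolean
	String
)

// blockToFieldType is an array, not a slice, so len(blockToFieldType)
// is a compile-time constant (contents are assumed for illustration).
var blockToFieldType = [4]DataType{Float, Integer, Boolean, String}

// blockTypeToDataType guards the index with the array length. Because the
// guard compares against a known constant, the compiler has what it needs
// to prove the index expression below is in range.
func blockTypeToDataType(typ byte) DataType {
	if int(typ) < len(blockToFieldType) {
		return blockToFieldType[typ]
	}
	return Unknown
}

func main() {
	fmt.Println(blockTypeToDataType(1), blockTypeToDataType(200))
}
```

To see which bounds checks remain in a build, the compiler can be asked to report them with `go build -gcflags="-d=ssa/check_bce/debug=1"`.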

e-dard · 2017-11-28T12:04:02Z

tsdb/engine/tsm1/engine_test.go

+	t := makeBlockTypeSlice(100)
+	for i := 0; i < b.N; i++ {
+		for j := 0; j < len(t); j++ {
+			tsm1.BlockTypeToInfluxQLDataType(t[j])


Are you sure this can't get optimised away by the compiler? That used to be a problem so I always got into the habit of defining a var result influxql.Unknown somewhere outside the benchmark.

Good catch – if the function was not inlined, it would have made no difference. We can deduce the inner loop overhead is thus about 45ns :)

Original benchmark was about 212ns, less loop overhead is 167ns. I added an assignment and it jumped from 45ns to 70ns. So about 30ns vs 167ns or about a 5.5x improvement.

Still a good improvement.
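The habit described in this thread, keeping the result alive so an inlined call cannot be discarded, is commonly written with a package-level sink. The sketch below is self-contained and uses a hypothetical `classify` function in place of the real one; it runs the benchmark through `testing.Benchmark` so it works outside `go test`.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so the compiler cannot prove the result is unused.
var sink int

// classify is a hypothetical, trivially inlinable function under test.
func classify(b byte) int {
	if b < 4 {
		return int(b) + 1
	}
	return 0
}

// benchClassify accumulates results locally and publishes them to sink
// once, after the timed loop.
func benchClassify(b *testing.B) {
	types := make([]byte, 100)
	for i := range types {
		types[i] = byte(i % 8)
	}
	var r int
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		for j := 0; j < len(types); j++ {
			r += classify(types[j])
		}
	}
	sink = r
}

func main() {
	fmt.Println(testing.Benchmark(benchClassify))
}
```

Comparing this against a variant that drops the result gives a rough estimate of the loop overhead, which is how the figures above were derived.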

e-dard · 2017-11-28T12:26:22Z

tsdb/index/inmem/meta.go

+
+	mu       sync.RWMutex
+	shardIDs map[uint64]struct{} // shards that have this series defined
+	deleted  bool


Can we move deleted down to the bottom? There will be 7 bytes of padding after deleted, and if it gets left here then we'll end up with a larger than necessary struct size if we add anything that's less than 8 bytes underneath.
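The padding concern can be checked with `unsafe.Sizeof`. The two structs below are hypothetical and much smaller than the real type; the `id` and `small` fields are assumptions added to show what happens when a sub-word field is added later.

```go
package main

import (
	"fmt"
	"unsafe"
)

// flagInMiddle keeps the bool between two word-aligned fields, so the bool
// is followed by padding, and a small field added at the end pads again.
type flagInMiddle struct {
	shardIDs map[uint64]struct{}
	deleted  bool
	id       uint64
	small    uint8 // hypothetical later addition
}

// flagAtBottom groups the small fields at the end so they share one word.
type flagAtBottom struct {
	shardIDs map[uint64]struct{}
	id       uint64
	deleted  bool
	small    uint8 // hypothetical later addition
}

// sizes reports the size in bytes of each layout on the current platform.
func sizes() (middle, bottom uintptr) {
	return unsafe.Sizeof(flagInMiddle{}), unsafe.Sizeof(flagAtBottom{})
}

func main() {
	middle, bottom := sizes()
	fmt.Println("bool in middle:", middle, "bool at bottom:", bottom)
}
```

On a typical 64-bit platform the first layout occupies 32 bytes and the second 24; exact numbers vary by architecture, but the grouped layout should not be larger.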

e-dard · 2017-11-28T12:32:02Z

tsdb/index/inmem/meta.go

+	t.mu.Unlock()
+}
+
+// StoreByte stores ids under the value key.


It's not clear to me why we need this new method.

To complement the LoadByte as there is a drive to move to []byte keys vs string keys. This change also removes a call to runtime.slicebytetostring at the call site in measurement#AddSeries. The cast in StoreByte is for the map key, so we benefit from the compiler optimization that it skips the alloc.

I get the push to []byte; it's something that's been slowly chipped away at over time but seems never ending 😄

> The cast in StoreByte is for the map key, so we benefit from the compiler optimization that it skips the alloc.

That was kind of the underlying point of my confusion, I guess. We can't benefit from that particular compiler optimisation here because we need to alloc in order to guarantee that the map key is stable, don't we? I'm pretty sure that particular optimisation would only work on the read side, wouldn't it?

With Store(value string, ids seriesIDs) I would assume there would be a runtime.slicebytetostring on the string(value) at the call site but then there would be no further allocations would there? Or maybe there would be another allocation when we did t.valueIDs[value] = ids in Store?

Indeed you are correct. Whilst maps won't allocate if a space exists to store a new key in a bucket, the assignment does indeed cause an allocation every time. It seems the compiler is not smart enough to delay the allocation if the key does not exist.
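The read-side versus write-side difference can be explored with a small program. This is a sketch with hypothetical `store` and `load` helpers, not the methods under review; it prints allocation counts from `testing.AllocsPerRun` rather than asserting them, since the exact numbers depend on the Go version.

```go
package main

import (
	"fmt"
	"testing"
)

type seriesIDs []uint64

// store converts the key and assigns. The map retains the string, so the
// string must own a copy of the bytes that later writes to key cannot change.
func store(m map[string]seriesIDs, key []byte, ids seriesIDs) {
	m[string(key)] = ids
}

// load converts only for the duration of the lookup, which is the case the
// compiler is able to handle without copying the bytes.
func load(m map[string]seriesIDs, key []byte) seriesIDs {
	return m[string(key)]
}

func main() {
	m := make(map[string]seriesIDs)
	key := []byte("region=us-west-with-a-value-long-enough-to-need-the-heap")
	store(m, key, seriesIDs{1})

	reads := testing.AllocsPerRun(1000, func() { _ = load(m, key) })
	writes := testing.AllocsPerRun(1000, func() { store(m, key, nil) })
	fmt.Println("allocs per lookup:", reads)
	fmt.Println("allocs per store: ", writes)
}
```

The stability requirement is easy to confirm: mutating `key` after `store` does not change the key held by the map.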

However, that leads to another issue with this logic anyhow. The *measurement#AddSeries method calls tagKeyValue#LoadByte to fetch the seriesIDs slice and later overwrites it with the StoreByte method. This is a race condition: if two concurrent processes mutated the slice, one would overwrite the other on the StoreByte call. We know this won't happen, as it is only mutated via *measurement#AddSeries, which takes an exclusive lock on the measurement. We should just remove all this complicated locking logic :)

If you're referring to the lock on TagKeyValue, it was added specifically to fix an existing race (I forget where).
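For context on the load-then-store concern raised in this thread, one way to make the update atomic is to perform the read, modify, and write under a single lock acquisition. The sketch below uses a hypothetical, simplified `tagKeyValue` with an `Append` method that does not exist in the PR; it is one possible shape, not the approach the authors chose.

```go
package main

import (
	"fmt"
	"sync"
)

type seriesIDs []uint64

// tagKeyValue is a hypothetical, simplified stand-in for the reviewed type.
type tagKeyValue struct {
	mu       sync.RWMutex
	valueIDs map[string]seriesIDs
}

// Load returns the ids stored under value.
func (t *tagKeyValue) Load(value string) seriesIDs {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.valueIDs[value]
}

// Append (hypothetical) reads, extends, and writes back under one write
// lock, so two concurrent callers cannot overwrite each other's update.
func (t *tagKeyValue) Append(value string, id uint64) {
	t.mu.Lock()
	t.valueIDs[value] = append(t.valueIDs[value], id)
	t.mu.Unlock()
}

// appendConcurrently adds n ids under the same value from n goroutines.
func appendConcurrently(t *tagKeyValue, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id uint64) {
			defer wg.Done()
			t.Append("west", id)
		}(uint64(i))
	}
	wg.Wait()
}

func main() {
	t := &tagKeyValue{valueIDs: make(map[string]seriesIDs)}
	appendConcurrently(t, 100)
	fmt.Println(len(t.Load("west")))
}
```

With separate Load and Store calls, each individually locked, the same workload could lose updates unless an outer lock serializes the callers, which is the situation described for AddSeries.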

stuartcarnie · 2017-11-28T13:35:05Z

Will hold off until @e-dard merges the er-tsi-index-part

* only call ParseTags when necessary
* remove dependency on inmem.Series in tsdb test package
* Measurement and Series are no longer exported. Their use is restricted to the inmem package
* improve Measurement and Series types by exporting immutable fields and removing unnecessary APIs and locks

Reduced startup time from 28s to 17s. Overall improvement including #9162 reduces startup from 46s to 17s for 1MM series across 14 shards.

stuartcarnie · 2017-12-29T20:37:42Z

This PR was merged as it still had a number of useful improvements, including cleaning up some inmem APIs

ghost assigned stuartcarnie Nov 27, 2017

ghost added the review label Nov 27, 2017

stuartcarnie requested a review from jwilder November 27, 2017 21:15

stuartcarnie commented Nov 27, 2017

jwilder approved these changes Nov 27, 2017

e-dard approved these changes Nov 28, 2017

stuartcarnie force-pushed the sgc-inmem-startup branch 2 times, most recently from 440c8b6 to be616a8 Compare November 28, 2017 18:25

stuartcarnie added 4 commits December 29, 2017 07:58

prefer NameBytes

98aa368

updates per PR review comments

455013a

ensure correctly aligned for 32-bit architecture

638caf3

stuartcarnie force-pushed the sgc-inmem-startup branch from be616a8 to e48b9d1 Compare December 29, 2017 17:57

updates after TSI / series file merge

ed207b5

stuartcarnie force-pushed the sgc-inmem-startup branch from e48b9d1 to ed207b5 Compare December 29, 2017 17:58

stuartcarnie merged commit 80f1120 into master Dec 29, 2017

ghost removed the review label Dec 29, 2017

stuartcarnie deleted the sgc-inmem-startup branch December 29, 2017 20:33
