Improve tsm1 cache performance #7228
Conversation
Force-pushed 5bcecb9 to 97d7015
Reduce cache lock contention by widening the cache lock scope in WriteMulti. While this sounds counter-intuitive, the previous locking was:

* 1 x Read Lock to read the size
* 1 x Read Lock per values
* 1 x Write Lock per values on race
* 1 x Write Lock to update the size

We now have:

* 1 x Write Lock

This also reduces contention on the entries' Values lock, as we now hold the global cache lock.

Move the calculation of the added size to before taking the lock, as it takes time and doesn't need the lock.

This also fixes a race in WriteMulti caused by the lock not being held across the entire operation, which could leave the cache size with an invalid value if Snapshot ran between the addition of the values and the size update.

Fix the cache benchmark, which was benchmarking the creation of the cache rather than its operation, and add a parallel test for a more real-world scenario (this could still be improved).

Add a fast path, newEntryValues, for the new-entry case, which avoids taking the values lock and the other calculations.

Drop the lock before performing the sort in Cache.Keys().
Force-pushed 97d7015 to ab61ed0

Rebased to eliminate conflict.
```go
newSize := c.size + uint64(totalSz)
if c.maxSize > 0 && newSize+c.snapshotSize > c.maxSize {
	c.mu.RUnlock()
	c.mu.Lock()
```
Why did you switch this from a read-only lock to a read-write lock?
In general we want to use RLock whenever possible.
Because taking the lock multiple times was causing significant slowdown due to lock contention. Also, the way this was structured was racy, with the potential for c.size to become completely invalid.
```go
var a []string
for k, _ := range c.store {
	a = append(a, k)
}
c.mu.RUnlock()
```
Eliminating defers is always good :-)
@stevenh This is working well in my tests. Can you rebase?
I'll rebase and merge.