
internal/cache: reduce mutex contention #2774

Merged (2 commits) on Jul 28, 2023

Conversation

@jbowens (Collaborator) commented Jul 26, 2023

internal/cache: reduce mutex contention from EvictFile

We've observed block cache mutex contention originating from EvictFile. When
evicting a file with a significant count of blocks currently held within the
block cache, EvictFile may hold the block cache shard mutex for an extended
duration, increasing latency for requests that must access the same shard.

This commit mitigates this contention with two approaches. EvictFile now
periodically drops the mutex to provide these other requests with an
opportunity to make progress. Additionally, rather than free()-ing manually
managed memory back to the memory allocator while holding the mutex, the free
is deferred until the shard mutex has been dropped.

Informs #1997.
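
The shape of the mitigation, as a minimal sketch (the shard, value, evictNextBlock, and blocksPerBatch names here are illustrative stand-ins, not Pebble's actual internals):

    package cache

    import (
        "runtime"
        "sync"
    )

    // blocksPerBatch bounds how long the shard mutex is held per pass.
    const blocksPerBatch = 16

    type value struct{ buf []byte }

    // release is a stand-in for returning manually managed memory to
    // the allocator.
    func (v *value) release() { v.buf = nil }

    type shard struct {
        mu sync.Mutex
        // Block and file maps elided.
    }

    // evictNextBlock is a stand-in for unlinking one of the file's
    // blocks from the shard; it returns nil once no blocks remain.
    func (s *shard) evictNextBlock() *value { return nil }

    // evictFile evicts a file's blocks in small batches, dropping the
    // shard mutex between batches and freeing only while unlocked.
    func (s *shard) evictFile() {
        for {
            var evicted []*value
            s.mu.Lock()
            for i := 0; i < blocksPerBatch; i++ {
                v := s.evictNextBlock()
                if v == nil {
                    break
                }
                evicted = append(evicted, v)
            }
            done := len(evicted) < blocksPerBatch
            s.mu.Unlock()
            // Free manually managed memory outside the critical
            // section so waiting readers are not blocked on free().
            for _, v := range evicted {
                v.release()
            }
            if done {
                return
            }
            // Yield so goroutines blocked on the mutex can run
            // before we reacquire it.
            runtime.Gosched()
        }
    }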

internal/cache: increase shard count to 4x CPUs

Increase the block cache shard count to reduce mutex contention.
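
A rough sketch of the sizing this describes (the 1 MiB clamp threshold is an assumption for illustration, not the actual constant in clockpro.go):

    package cache

    import "runtime"

    // shardCount derives the shard count from the CPU count, clamping
    // to 4 when the per-shard size would be too small (as in tests
    // that run small caches on large-CPU machines).
    func shardCount(cacheSize int64) int {
        m := 4 * runtime.GOMAXPROCS(0)
        if cacheSize/int64(m) < 1<<20 {
            m = 4
        }
        return m
    }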

@cockroach-teamcity (Member)

This change is Reviewable

@jbowens force-pushed the evictfile-dropmu branch 2 times, most recently from fd83bba to 84588ca, on July 26, 2023 at 21:48.
@jbowens requested review from sumeerbhola and a team on July 27, 2023 at 15:32.
@RaduBerinde (Member) left a comment

:lgtm: Just some nits.

Reviewable status: 0 of 1 files reviewed, 6 unresolved discussions (waiting on @jbowens and @sumeerbhola)


internal/cache/clockpro.go line 180 at r1 (raw file):

		c.countTest--
		if oldV := c.metaDel(e); oldV != nil {
			oldV.release()

oldV should always be nil here, right? (That would be the case above.)
Okay to keep so it's idiomatic, but perhaps add a comment.


internal/cache/clockpro.go line 259 at r1 (raw file):

	fkey := key{fileKey{id, fileNum}, 0}
	for c.evictFileRun(fkey) {
	}

maybe call runtime.Gosched() here so other goroutines get more of a chance to acquire the mutex?
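
Concretely, the suggestion amounts to something like this (a sketch extending the snippet above; evictFileRun returns true while blocks remain):

	fkey := key{fileKey{id, fileNum}, 0}
	for c.evictFileRun(fkey) {
		// Yield between batches so goroutines blocked on the shard
		// mutex get a chance to acquire it before the next pass.
		runtime.Gosched()
	}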


internal/cache/clockpro.go line 264 at r1 (raw file):

			break
		}
	}

Wouldn't it be a lot simpler to just add a

mu.Unlock()
runtime.Gosched()
mu.Lock()

in here every 5 blocks?


internal/cache/clockpro.go line 399 at r1 (raw file):

// Remove the entry from the cache. This removes the entry from the blocks map,
// the files map, and ensures that hand{Hot,Cold,Test} are not pointing at the
// entry. Returns the deleted value that must be released.

It can also return nil. Or we could make release tolerate nil, and then we could just call metaDel().release() in many cases.
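
A sketch of the nil-tolerant variant being proposed (illustrative, assuming a pointer-receiver Value type):

	// release tolerates a nil receiver, so callers need not check the
	// return value of metaDel before releasing it.
	func (v *Value) release() {
		if v == nil {
			return
		}
		// ... return the manually managed buffer to the allocator ...
	}

	// Callers can then chain the calls unconditionally:
	c.metaDel(e).release()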


internal/cache/clockpro.go line 405 at r1 (raw file):

	}
	// Remove the pointer to the value.
	deletedValue, e.val = e.val, nil

[nit] I find this harder to read than two separate statements
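
That is, splitting the tuple assignment into two statements:

	// Remove the pointer to the value.
	deletedValue := e.val
	e.val = nil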


internal/cache/clockpro.go line 710 at r2 (raw file):

	// In tests, we can use large CPU machines with small cache sizes and have
	// many caches in existence at a time. If sharding into m shards would
	// produce small of shards, constrain the number of shards to 4.

[nit] would produce shards that are too small

@jbowens (Collaborator, Author) left a comment
TFTR!

Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on @RaduBerinde and @sumeerbhola)


internal/cache/clockpro.go line 259 at r1 (raw file):

Previously, RaduBerinde wrote…

maybe call runtime.Gosched() here so other goroutines get more of a chance to acquire the mutex?

Good call, done.


internal/cache/clockpro.go line 264 at r1 (raw file):

Previously, RaduBerinde wrote…

Wouldn't it be a lot simpler to just add a

mu.Unlock()
runtime.Gosched()
mu.Lock()

in here every 5 blocks?

Freeing the memory while the mutex is dropped, and adding a defer to handle invariant-violation panics while Free-ing memory, mean it's not that simple. But I agree that performing the frees and consistency checks in the defer was confusing. I've moved it inline here, but it's still not very simple :/


internal/cache/clockpro.go line 399 at r1 (raw file):

Previously, RaduBerinde wrote…

Can also return nil.. Or we could make release tolerate nil and then we can just call metaDel().release() in many cases

I like it, done.

@RaduBerinde (Member)

internal/cache/clockpro.go line 264 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Freeing the memory while the mutex is dropped, and adding a defer to handle invariant-violation panics while Free-ing memory, mean it's not that simple. But I agree that performing the frees and consistency checks in the defer was confusing. I've moved it inline here, but it's still not very simple :/

Ah, right. Another option would be for the function to return the slice and have the caller release the values. That would keep the entire function locked. We could pass in the preallocated slice and the function would fill it up to its capacity.

Also, I don't think we need to checkConsistency in the first return false case, so we could just move it before the second return false.
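
A sketch of that alternative (Value, evictNextBlock, and blocksPerMutexAcquisition are stand-in names): evictFileRun stays locked for its whole body, appends evicted values into a caller-supplied slice up to its capacity, and the caller releases them after the mutex is dropped.

	// evictFileRun holds the shard mutex for its entire body. It fills
	// toRelease (up to its capacity) with evicted values and reports
	// whether more blocks remain for this file.
	func (c *shard) evictFileRun(fkey key, toRelease []*Value) ([]*Value, bool) {
		c.mu.Lock()
		defer c.mu.Unlock()
		for len(toRelease) < cap(toRelease) {
			v := c.evictNextBlock(fkey) // stand-in; nil when none remain
			if v == nil {
				return toRelease, false
			}
			toRelease = append(toRelease, v)
		}
		return toRelease, true
	}

	// The caller owns the frees, performing them outside the critical
	// section between runs.
	func (c *shard) EvictFile(fkey key) {
		buf := make([]*Value, 0, blocksPerMutexAcquisition)
		for more := true; more; {
			buf, more = c.evictFileRun(fkey, buf[:0])
			for _, v := range buf {
				v.release()
			}
			runtime.Gosched()
		}
	}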

@sumeerbhola (Collaborator) left a comment
:lgtm:

Reviewed 3 of 3 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @jbowens and @RaduBerinde)


internal/cache/clockpro.go line 264 at r3 (raw file):

	// If most of the file's blocks are held in the block cache, evicting all
	// the blocks may take a while. We don't want to block the entire cache
	// shard, forcing concurrent readers until we're finished. We drop the mutex

... readers to wait until ...


internal/cache/clockpro.go line 277 at r3 (raw file):

	obsoleteValues := obsoleteValuesAlloc[:0]
	defer func() {
		if !moreRemaining {

maybe add a comment
// Don't want to pay the cost of checkConsistency for each batch corresponding to blocksPerMutexAcquisition, so do it only when this was the last batch processed.


internal/cache/clockpro.go line 289 at r3 (raw file):

	if blocks == nil {
		return false
	}

This code could use a few comments. Something like:

// b is the current head of the doubly linked list, and n is the entry after b.


internal/cache/clockpro.go line 293 at r3 (raw file):

		n = b.fileLink.next
		obsoleteValues = append(obsoleteValues, c.metaEvict(b))
		if b == n {

// b == n represents the case where b was the last entry remaining in the doubly linked list, which is why it pointed at itself. So no more entries left.
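
With both suggested comments in place, the loop reads roughly like this (a sketch; names follow the snippets above):

	// b is the current head of the file's circular doubly linked list
	// of blocks; n is the entry after b.
	b := blocks
	for {
		n := b.fileLink.next
		obsoleteValues = append(obsoleteValues, c.metaEvict(b))
		if b == n {
			// b == n means b was the last entry remaining in the
			// list (it pointed at itself), so there is nothing left
			// to evict for this file.
			break
		}
		b = n
	}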


internal/cache/clockpro.go line 704 at r4 (raw file):

	// In tests we can use large CPU machines with small cache sizes and have
	// many caches in existence at a time. If sharding into m shards would
	// produce too small of shards, constrain the number of shards to 4.

too small shards ...

@jbowens (Collaborator, Author) left a comment


TFTRs!

Reviewable status: 2 of 3 files reviewed, 8 unresolved discussions (waiting on @RaduBerinde and @sumeerbhola)


internal/cache/clockpro.go line 289 at r3 (raw file):

Previously, sumeerbhola wrote…

This code could use a few comments. Something like:

// b is the current head of the doubly linked list, and n is the entry after b.

Done.


internal/cache/clockpro.go line 293 at r3 (raw file):

Previously, sumeerbhola wrote…

// b == n represents the case where b was the last entry remaining in the doubly linked list, which is why it pointed at itself. So no more entries left.

Done.


internal/cache/clockpro.go line 704 at r4 (raw file):

Previously, sumeerbhola wrote…

too small shards ...

Done.

@jbowens merged commit ce43e65 into cockroachdb:master on Jul 28, 2023. 9 checks passed.
@jbowens deleted the evictfile-dropmu branch on July 28, 2023 at 15:32.