[store] use hasCache to minimize pendingRef pool #7672

max-hoffman · 2024-04-01T19:34:59Z

Expand the `ChunkStore` interface so the nodeStore can check which chunks were recently accessed. Skip adding a child ref to the pendingRef list when it is already present in `nbs.hasCache`. For TPC-C this appears to reduce the pending ref count by another ~80%.
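A minimal sketch of the idea described above, not the code in this PR: before a child ref is added to the pending set, it is checked against a cache of addresses already known to be in the store. `Hash`, `hasCache`, and `addPendingRefs` are hypothetical stand-ins chosen for illustration; the real types in dolt's store packages may differ.

```go
package main

import "fmt"

// Hash is a hypothetical stand-in for a 20-byte chunk address.
type Hash [20]byte

// hasCache is a hypothetical stand-in for the set of addresses
// already known to be present in the store.
type hasCache map[Hash]struct{}

// Has reports whether h is in the cache.
func (c hasCache) Has(h Hash) bool {
	_, ok := c[h]
	return ok
}

// addPendingRefs adds each child ref to pending unless the cache already
// contains it. It returns how many refs were skipped.
func addPendingRefs(pending map[Hash]struct{}, children []Hash, cache hasCache) int {
	skipped := 0
	for _, h := range children {
		if cache.Has(h) {
			skipped++
			continue
		}
		pending[h] = struct{}{}
	}
	return skipped
}

func main() {
	a, b, c := Hash{1}, Hash{2}, Hash{3}
	cache := hasCache{a: {}, b: {}}
	pending := map[Hash]struct{}{}
	skipped := addPendingRefs(pending, []Hash{a, b, c}, cache)
	fmt.Println(len(pending), skipped) // prints "1 2"
}
```

Every ref skipped here is one fewer address that has to be checked for existence later, which is where the reduced pending ref count would come from.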

max-hoffman · 2024-04-01T19:35:06Z

#benchmark

github-actions · 2024-04-01T19:35:27Z

@max-hoffman workflow run: https://github.com/dolthub/dolt/actions/runs/8512757441

coffeegoddd · 2024-04-01T19:56:38Z

@max-hoffman DOLT

test_name	from_latency_median	to_latency_median	is_faster
tpcc-scale-factor-1	118.92	118.92	0

from_test_name	from_server_name	from_server_version	from_tps	to_test_name	to_server_name	to_server_version	to_tps	is_faster
tpcc-scale-factor-1	dolt	`053e6ca`	17.63	tpcc-scale-factor-1	dolt	`0fa773c`	17.71	0

coffeegoddd · 2024-04-01T20:06:08Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`0fa773c`	ok	5937457

version	total_tests
`0fa773c`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-04-01T20:57:12Z

@max-hoffman DOLT

read_tests	from_latency_median	to_latency_median
covering_index_scan	2.97	2.91
groupby_scan	17.95	17.63
index_join	5.09	5.09
index_join_scan	2.26	2.22
index_scan	54.83	53.85
oltp_point_select	0.49	0.49
oltp_read_only	8.28	8.13
select_random_points	0.78	0.78
select_random_ranges	0.95	0.94
table_scan	54.83	54.83
types_table_scan	158.63	158.63

write_tests	from_latency_median	to_latency_median
oltp_delete_insert	6.91	6.79
oltp_insert	3.43	3.36
oltp_read_write	15.83	15.55
oltp_update_index	3.55	3.43
oltp_update_non_index	3.49	3.36
oltp_write_only	7.84	7.56
types_delete_insert	7.7	7.43

reltuk

I know this was just testing perf impact...but for iterating towards landing it, I definitely don't think landing this with a new interface on the ChunkStore is the right direction.

Instead, I would think something like `getAddrsCb` becomes something like `func(chunk []byte, cb func(h hash.Hash))`, and then the NomsBlockStore implementation uses that to actually build the hashset that it wants, and in the process filters by the concrete hasCache that was passed into `tableSet.append` or whatever.
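A rough sketch of the callback shape suggested above, under stated assumptions: the walker only enumerates addresses, and the store decides what to keep. The chunk encoding (concatenated 20-byte hashes), `WalkAddrs`, `walkConcatenated`, and the `store` type are all hypothetical and exist only to make the example runnable.

```go
package main

import "fmt"

const hashLen = 20

// Hash is a hypothetical stand-in for hash.Hash.
type Hash [hashLen]byte

// WalkAddrs calls cb once for every child address found in chunk.
type WalkAddrs func(chunk []byte, cb func(h Hash))

// walkConcatenated assumes, for illustration only, that a chunk is a
// plain concatenation of 20-byte child addresses.
func walkConcatenated(chunk []byte, cb func(h Hash)) {
	for i := 0; i+hashLen <= len(chunk); i += hashLen {
		var h Hash
		copy(h[:], chunk[i:i+hashLen])
		cb(h)
	}
}

// store is a hypothetical block store that owns both the has-cache
// and the pending ref set.
type store struct {
	hasCache map[Hash]struct{}
	pending  map[Hash]struct{}
}

// put builds the pending set itself, filtering by its own has-cache,
// so no cache accessor has to be exposed on a wider interface.
func (s *store) put(chunk []byte, walk WalkAddrs) {
	walk(chunk, func(h Hash) {
		if _, ok := s.hasCache[h]; ok {
			return
		}
		s.pending[h] = struct{}{}
	})
}

func main() {
	var a, b Hash
	a[0], b[0] = 1, 2
	chunk := append(append([]byte{}, a[:]...), b[:]...)

	s := &store{
		hasCache: map[Hash]struct{}{a: {}},
		pending:  map[Hash]struct{}{},
	}
	s.put(chunk, walkConcatenated)
	fmt.Println(len(s.pending)) // prints "1"
}
```

The point of this shape is that the filtering stays inside the concrete store, and the caller supplies only the knowledge of how to find addresses in a chunk.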

reltuk · 2024-04-02T17:46:07Z

If the perf impact of the interface change is bad, even `getAddrsCb` as `func(chunk []byte, insertIntoThis hash.HashSet, filterByThis Haser)` with `type Haser interface { Has(hash.Hash) bool }`, or `func(chunk []byte, insertIntoThis hash.HashSet, filterByThis lru...)` or something, could be preferable...
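For comparison, a sketch of the `Haser` variant from this comment, which avoids a per-address callback by passing the destination set and the filter directly. As before, `Hash`, `HashSet`, the chunk encoding, and `getAddrs` are hypothetical simplifications, not dolt's actual `hash` package.

```go
package main

import "fmt"

const hashLen = 20

// Hash is a hypothetical stand-in for hash.Hash.
type Hash [hashLen]byte

// HashSet is a hypothetical stand-in for hash.HashSet.
type HashSet map[Hash]struct{}

// Insert adds h to the set.
func (s HashSet) Insert(h Hash) { s[h] = struct{}{} }

// Has reports whether h is in the set.
func (s HashSet) Has(h Hash) bool {
	_, ok := s[h]
	return ok
}

// Haser is the narrow filter interface from the comment above.
type Haser interface {
	Has(Hash) bool
}

// getAddrs inserts every child address of chunk into insertIntoThis,
// except those that filterByThis already has. The chunk is assumed,
// for illustration only, to be concatenated 20-byte addresses.
func getAddrs(chunk []byte, insertIntoThis HashSet, filterByThis Haser) {
	for i := 0; i+hashLen <= len(chunk); i += hashLen {
		var h Hash
		copy(h[:], chunk[i:i+hashLen])
		if filterByThis != nil && filterByThis.Has(h) {
			continue
		}
		insertIntoThis.Insert(h)
	}
}

func main() {
	var a, b Hash
	a[0], b[0] = 1, 2
	chunk := append(append([]byte{}, a[:]...), b[:]...)

	cache := HashSet{a: {}} // any type with Has(Hash) bool would work here
	pending := HashSet{}
	getAddrs(chunk, pending, cache)
	fmt.Println(len(pending), pending.Has(b)) // prints "1 true"
}
```

Because `Haser` has a single method, an LRU-backed cache or a plain set can serve as the filter without the walker knowing which one it was given.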

max-hoffman · 2024-04-02T18:54:12Z

#benchmark

github-actions · 2024-04-02T18:54:38Z

@max-hoffman workflow run: https://github.com/dolthub/dolt/actions/runs/8527963737

coffeegoddd · 2024-04-02T19:16:08Z

@max-hoffman DOLT

test_name	from_latency_median	to_latency_median	is_faster
tpcc-scale-factor-1	121.08	118.92	0

from_test_name	from_server_name	from_server_version	from_tps	to_test_name	to_server_name	to_server_version	to_tps	is_faster
tpcc-scale-factor-1	dolt	`d6aa1e6`	17.91	tpcc-scale-factor-1	dolt	`c832396`	18.11	0

coffeegoddd · 2024-04-02T19:38:08Z

@max-hoffman DOLT

comparing_percentages
100.000000 to 100.000000

version	result	total
`212f07f`	ok	5937457

version	total_tests
`212f07f`	5937457

correctness_percentage
100.0

coffeegoddd · 2024-04-02T20:18:30Z

@max-hoffman DOLT

read_tests	from_latency_median	to_latency_median
covering_index_scan	3.02	2.97
groupby_scan	17.32	17.63
index_join	5.09	5.09
index_join_scan	2.26	2.22
index_scan	54.83	54.83
oltp_point_select	0.49	0.49
oltp_read_only	8.13	8.13
select_random_points	0.78	0.78
select_random_ranges	0.94	0.94
table_scan	54.83	54.83
types_table_scan	161.51	158.63

write_tests	from_latency_median	to_latency_median
oltp_delete_insert	6.91	6.67
oltp_insert	3.43	3.3
oltp_read_write	15.83	15.55
oltp_update_index	3.49	3.43
oltp_update_non_index	3.49	3.36
oltp_write_only	7.7	7.56
types_delete_insert	7.7	7.43

reltuk

LGTM!

[store] use hasCache to minimize pendingRef pool

0fa773c

coffeegoddd added the correctness_approved label Apr 1, 2024

reltuk reviewed Apr 2, 2024

better interfaces

c832396

max-hoffman added 2 commits April 2, 2024 12:00

fmt

2643db6

vet

212f07f

reltuk approved these changes Apr 2, 2024

max-hoffman merged commit 864f962 into main Apr 2, 2024
20 checks passed

max-hoffman deleted the max/pending-ref-has-cache-check branch April 2, 2024 21:54

BrewTestBot mentioned this pull request Apr 3, 2024

dolt 1.35.7 Homebrew/homebrew-core#167847

Merged
