Cache fragmented range tombstone list for mutable memtables #10547

cbi42 · 2022-08-21T07:11:53Z

Summary: Each read from memtable used to read and fragment all the range tombstones into a FragmentedRangeTombstoneList. #10380 improved the inefficient here by caching a FragmentedRangeTombstoneList with each immutable memtable. This PR extends the caching to mutable memtables. The fragmented range tombstone can be constructed in either read (This PR) or write path (#10584). With both implementation, each DeleteRange() will invalidate the cache, and the difference is where the cache is re-constructed.CoreLocalArray is used to store the cache with each memtable so that multi-threaded reads can be efficient. More specifically, each core will have a shared_ptr to a shared_ptr pointing to the current cache. Each read thread will only update the reference count in its core-local shared_ptr, and this is only needed when reading from mutable memtables.

The choice between write path and read path is not an easy one: they are both improvement compared to no caching in the current implementation, but they favor different operations and could cause regression in the other operation (read vs write). The write path caching in (#10584) leads to a cleaner implementation, but I chose the read path caching here to avoid significant regression in write performance when there is a considerable amount of range tombstones in a single memtable (the number from the benchmark below suggests >1000 with concurrent writers). Note that even though the fragmented range tombstone list is only constructed in DeleteRange() operations, it could block other writes from proceeding, and hence affects overall write performance.

Test plan:

TestGet() in stress test is updated in Update TestGet() to verify against expected state #10553 to compare Get() result against expected state: ./db_stress_branch --readpercent=57 --prefixpercent=4 --writepercent=25 -delpercent=5 --iterpercent=5 --delrangepercent=4
Perf benchmark: tested read and write performance where a memtable has 0, 1, 10, 100 and 1000 range tombstones.

./db_bench --benchmarks=fillrandom,readrandom --writes_per_range_tombstone=200 --max_write_buffer_number=100 --min_write_buffer_number_to_merge=100 --writes=200000 --reads=100000 --disable_auto_compactions --max_num_range_tombstones=1000

Write perf regressed since the cost of constructing fragmented range tombstone list is shifted from every read to a single write. 6cbe5d8 is included in the last column as a reference to see performance impact on multi-thread reads if CoreLocalArray is not used.

micros/op averaged over 5 runs: first 4 columns are for fillrandom, last 4 columns are for readrandom.

	fillrandom main	write path caching	read path caching	memtable V3 (#10308)	readrandom main	write path caching	read path caching	memtable V3
0	6.35	6.15	5.82	6.12	2.24	2.26	2.03	2.07
1	5.99	5.88	5.77	6.28	2.65	2.27	2.24	2.5
10	6.15	6.02	5.92	5.95	5.15	2.61	2.31	2.53
100	5.95	5.78	5.88	6.23	28.31	2.34	2.45	2.94
100 25 threads	52.01	45.85	46.18	47.52	35.97	3.34	3.34	3.56
1000	6.0	7.07	5.98	6.08	333.18	2.86	2.7	3.6
1000 25 threads	52.6	148.86	79.06	45.52	473.49	3.66	3.48	4.38

Benchmark performance ofreadwhilewriting from Optionally issue DeleteRange in *whilewriting benchmarks #10552, 100 range tombstones are written: ./db_bench --benchmarks=readwhilewriting --writes_per_range_tombstone=500 --max_write_buffer_number=100 --min_write_buffer_number_to_merge=100 --writes=100000 --reads=500000 --disable_auto_compactions --max_num_range_tombstones=10000 --finish_after_writes

readrandom micros/op:

	main	write path caching	read path caching	memtable V3
single thread	48.28	1.55	1.52	1.96
25 threads	64.3	2.55	2.67	2.64

facebook-github-bot · 2022-08-21T15:39:24Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cbi42 · 2022-08-21T19:21:14Z

~~TODO: the stress test change should be a separate PR.~~

facebook-github-bot · 2022-08-22T21:25:26Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-08-22T21:32:36Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-08-22T21:37:34Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-08-22T23:18:31Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-08-22T23:27:04Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary: Optionally issue DeleteRange in `*whilewriting` benchmarks. This happens in `BGWriter` and uses similar logic as in `DoWrite` to issue DeleteRange operations. I added this when I was benchmarking #10547, but this should be an independent PR. Pull Request resolved: #10552 Test Plan: ran some benchmarks with various delete range options, e.g. `./db_bench --benchmarks=readwhilewriting --writes_per_range_tombstone=100 --writes=200000 --reads=1000000 --disable_auto_compactions --max_num_range_tombstones=10000` Reviewed By: ajkr Differential Revision: D38927020 Pulled By: cbi42 fbshipit-source-id: 31ee20cb8127f7173f0816ea0cc2a204ec02aad6

facebook-github-bot · 2022-08-26T18:07:25Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-08-26T18:08:30Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-08-26T20:46:35Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-09-04T04:13:38Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

ajkr

LGTM, thanks!

db/range_tombstone_fragmenter.h

db/memtable.cc

facebook-github-bot · 2022-09-13T20:18:05Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-09-13T20:20:33Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-09-13T20:21:21Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-09-13T20:22:00Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2022-09-13T22:59:14Z

@cbi42 has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot · 2022-09-13T22:59:46Z

@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary: fix a data race introduced in #10547 (P5295241720), first reported by pdillinger. The race is between the `std::atomic_load_explicit` in NewRangeTombstoneIteratorInternal and the `std::atomic_store_explicit` in MemTable::Add() that operate on `cached_range_tombstone_`. P5295241720 shows that `atomic_store_explicit` initializes some mutex which `atomic_load_explicit` could be trying to call `lock()` on at the same time. This fix moves the initialization to memtable constructor. Pull Request resolved: #10680 Test Plan: `USE_CLANG=1 COMPILE_WITH_TSAN=1 make -j24 whitebox_crash_test` Reviewed By: ajkr Differential Revision: D39528696 Pulled By: cbi42 fbshipit-source-id: ee740841044438e18ad2b8ea567444dd542dd8e2

…acebook#10547)" This reverts commit f291eef.

facebook-github-bot added the CLA Signed label Aug 21, 2022

cbi42 changed the title ~~Cache tombstone read path~~ Cache fragmented range tombstone list for live memtables Aug 21, 2022

cbi42 mentioned this pull request Aug 21, 2022

Fragment memtable range tombstone in the write path #10380

Closed

cbi42 added the WIP Work in progress label Aug 22, 2022

cbi42 mentioned this pull request Aug 22, 2022

Optionally issue DeleteRange in *whilewriting benchmarks #10552

Closed

cbi42 force-pushed the cache-tombstone-read-path branch from 3489062 to 41cccc8 Compare August 22, 2022 21:25

cbi42 force-pushed the cache-tombstone-read-path branch from 41cccc8 to 6bbe600 Compare August 22, 2022 21:32

cbi42 force-pushed the cache-tombstone-read-path branch from 6bbe600 to d8d5fc4 Compare August 22, 2022 21:37

cbi42 changed the title ~~Cache fragmented range tombstone list for live memtables~~ Cache fragmented range tombstone list for mutable memtables Aug 22, 2022

cbi42 removed the WIP Work in progress label Aug 22, 2022

cbi42 force-pushed the cache-tombstone-read-path branch from d8d5fc4 to 7740896 Compare August 22, 2022 23:18

cbi42 requested a review from ajkr August 23, 2022 03:00

cbi42 mentioned this pull request Aug 26, 2022

[Demo only] Fragment mutable memtable in write path #10584

Closed

cbi42 force-pushed the cache-tombstone-read-path branch from 7740896 to 9968f98 Compare August 26, 2022 18:07

cbi42 force-pushed the cache-tombstone-read-path branch from 9968f98 to 1907503 Compare August 26, 2022 18:08

cbi42 mentioned this pull request Aug 29, 2022

reduce DeleteRange's negative impact on read by fragmenting range tombstones in the write path #10308

Open

cbi42 mentioned this pull request Sep 12, 2022

Regression of getting iterator after a deletion by range #10662

Closed

ajkr approved these changes Sep 13, 2022

View reviewed changes

db/range_tombstone_fragmenter.h Outdated Show resolved Hide resolved

db/memtable.cc Outdated Show resolved Hide resolved

db/memtable.cc Show resolved Hide resolved

cbi42 added 6 commits September 13, 2022 13:20

Cache live memtable range tombstone

60b00e2

removed unused variable

c1c8fcb

Fix for failed CI

7a455ce

lock when concurrent

3d983d3

update doc

3ba9070

shared_ptr aliasing

af3b15e

cbi42 force-pushed the cache-tombstone-read-path branch from d4ca7e8 to af3b15e Compare September 13, 2022 20:21

HISTORY.md

43e0252

facebook-github-bot closed this in f291eef Sep 14, 2022

cbi42 mentioned this pull request Sep 14, 2022

Fix data race in accessing cached_range_tombstone_ #10680

Closed

ajkr added a commit to ajkr/rocksdb that referenced this pull request Sep 19, 2022

Revert "Cache fragmented range tombstone list for mutable memtables (f…

2fe0ee1

…acebook#10547)" This reverts commit f291eef.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cache fragmented range tombstone list for mutable memtables #10547

Cache fragmented range tombstone list for mutable memtables #10547

cbi42 commented Aug 21, 2022 •

edited

Loading

facebook-github-bot commented Aug 21, 2022

cbi42 commented Aug 21, 2022 •

edited

Loading

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 26, 2022

facebook-github-bot commented Aug 26, 2022

facebook-github-bot commented Aug 26, 2022

facebook-github-bot commented Sep 4, 2022

ajkr left a comment

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

Cache fragmented range tombstone list for mutable memtables #10547

Cache fragmented range tombstone list for mutable memtables #10547

Conversation

cbi42 commented Aug 21, 2022 • edited Loading

facebook-github-bot commented Aug 21, 2022

cbi42 commented Aug 21, 2022 • edited Loading

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 22, 2022

facebook-github-bot commented Aug 26, 2022

facebook-github-bot commented Aug 26, 2022

facebook-github-bot commented Aug 26, 2022

facebook-github-bot commented Sep 4, 2022

ajkr left a comment

Choose a reason for hiding this comment

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

facebook-github-bot commented Sep 13, 2022

cbi42 commented Aug 21, 2022 •

edited

Loading

cbi42 commented Aug 21, 2022 •

edited

Loading