Skip to content

Use disk based key value store for deduplication#10572

Open
raghavgautam wants to merge 14 commits intoapache:masterfrom
raghavgautam:master
Open

Use disk based key value store for deduplication#10572
raghavgautam wants to merge 14 commits intoapache:masterfrom
raghavgautam:master

Conversation

@raghavgautam
Copy link
Copy Markdown
Contributor

@raghavgautam raghavgautam commented Apr 6, 2023

This patch addresses memory problem as discussed in #10571
It uses rocksdb as an on disk key-value store instead of ConcurrentHashMap to reduce memory requirement

The memory requirement is number_of_keys * 10 bits + 64 MB for each column family. So, for 1 billion keys, it will use <1.5 GB RAM.
Source:

For testing, I was able to ingest at 8-10K/sec, with 250 million records already loaded on my M1 mac with 10 cores and 64 GB RAM.

@raghavgautam raghavgautam changed the title #performance Use disk based key value store for deduplication Apr 6, 2023
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 6, 2023

Codecov Report

Merging #10572 (f52555f) into master (f6c6d14) will decrease coverage by 56.52%.
The diff coverage is 0.00%.

@@              Coverage Diff              @@
##             master   #10572       +/-   ##
=============================================
- Coverage     70.35%   13.83%   -56.52%     
+ Complexity     6464      439     -6025     
=============================================
  Files          2103     2052       -51     
  Lines        112769   110896     -1873     
  Branches      16981    16795      -186     
=============================================
- Hits          79341    15348    -63993     
- Misses        27877    94295    +66418     
+ Partials       5551     1253     -4298     
Flag Coverage Δ
integration1 ?
integration2 ?
unittests1 ?
unittests2 13.83% <0.00%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ata/manager/realtime/RealtimeTableDataManager.java 0.00% <ø> (-67.42%) ⬇️
...nt/local/dedup/ConcurrentHashMapKeyValueStore.java 0.00% <0.00%> (ø)
...ent/local/dedup/PartitionDedupMetadataManager.java 0.00% <0.00%> (-66.00%) ⬇️
...segment/local/dedup/TableDedupMetadataManager.java 0.00% <0.00%> (-100.00%) ⬇️
...org/apache/pinot/spi/config/table/DedupConfig.java 0.00% <0.00%> (-83.34%) ⬇️

... and 1668 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@yupeng9
Copy link
Copy Markdown
Contributor

yupeng9 commented Apr 7, 2023

It'll be good to move this rockdb dependency to a plugin subfolder so Pinot core does not depend on this lib

@Jackie-Jiang
Copy link
Copy Markdown
Contributor

Currently the deduplication is handled using the same way as upsert as a short term solution (not production ready). We have done a lot of bugfixes to the upsert implementation, but not actively maintain the dedup implementation.

My suggestion would be to redesign dedup from scratch since it is not the same as upsert (no need to maintain valid docs, no need to track segment etc.), and TTL (dedup window) should be a must have for dedup. If we have proper TTL, the key size should be much smaller.

After that if we still need disk based KV store we can introduce that as a plug-in. We don't want to introduce RocksDB dependency in default distribution

@yupeng9
Copy link
Copy Markdown
Contributor

yupeng9 commented Apr 12, 2023

Currently the deduplication is handled using the same way as upsert as a short term solution (not production ready). We have done a lot of bugfixes to the upsert implementation, but not actively maintain the dedup implementation.

My suggestion would be to redesign dedup from scratch since it is not the same as upsert (no need to maintain valid docs, no need to track segment etc.), and TTL (dedup window) should be a must have for dedup. If we have proper TTL, the key size should be much smaller.

After that if we still need disk based KV store we can introduce that as a plug-in. We don't want to introduce RocksDB dependency in default distribution

I think the redesign of dedup makes sense but it'll be a much larger and involved effort, and unlikely for @raghavgautam to take. I think it makes sense to start a separate effort of dedup V2, but in the meantime it's fine for the community to add features and make improvements over V1, assuming it'll take some time for V2 to be ready.

For depedency on KV store, I left a similar comment that we'd better move this to a plugin for simpler dependencies of the core.

@Jackie-Jiang
Copy link
Copy Markdown
Contributor

I think the redesign of dedup makes sense but it'll be a much larger and involved effort, and unlikely for @raghavgautam to take. I think it makes sense to start a separate effort of dedup V2, but in the meantime it's fine for the community to add features and make improvements over V1, assuming it'll take some time for V2 to be ready.

My concern for the current dedup implementation is that it is not maintained properly (not production ready), and the bugfixes for upsert is not added to dedup. If we decide to keep V1, we want to first apply all the changes to upsert to dedup as well.

@yupeng9
Copy link
Copy Markdown
Contributor

yupeng9 commented Apr 12, 2023

I think the redesign of dedup makes sense but it'll be a much larger and involved effort, and unlikely for @raghavgautam to take. I think it makes sense to start a separate effort of dedup V2, but in the meantime it's fine for the community to add features and make improvements over V1, assuming it'll take some time for V2 to be ready.

My concern for the current dedup implementation is that it is not maintained properly (not production ready), and the bugfixes for upsert is not added to dedup. If we decide to keep V1, we want to first apply all the changes to upsert to dedup as well.

This sounds like a bigger problem that dedup is not ready for the community but we released it. I don't see any warning in Pinot doc...

I think we either maintain V1, or not advocate its use (e.g. mark it as experimental, or remove it). @kishoreg @icefury71 wdyt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants