
Initial implementation of a new lockless hashtable mpmc #262

Merged
danielealbano merged 81 commits into main from new-lockless-hashtable-mpmc on Dec 5, 2022

Conversation

danielealbano (Owner) commented Nov 30, 2022

This PR implements a new lockless multi-producer, multi-consumer parallel hashtable which is capable of achieving amazing numbers (included at the end of the summary)!

This new hashtable works very differently from the previous one, although the goal remains the same: spread thread contention as much as possible and reduce how often the data in memory is changed.

The old approach relied on a combination of memory fencing and userspace spinlocks: GET operations were lockless, while SET and DEL operations used userspace spinlocks, with one lock every 14 buckets. The results were great, as a hashtable with 10000 buckets was controlled by about 714 locks (10000 / 14 ≈ 714), but the downside was that writing a single value from a single thread caused a lot of memory changes, which also impacted the cores' caches.

The new approach relies on a combination of techniques and components:

  • an epoch operation queue: every time an operation is started on the hashtable it is pushed to the queue and marked as completed once done, guaranteeing that the data read by the thread after pushing the operation will never be deleted until the operation itself has completed (a minimal sketch of this mechanism follows the list)
  • an epoch garbage collector, which works together with the epoch operation queue: every time a bucket is deleted, the associated key-value data is not freed immediately but instead staged for deletion, and is only freed once all the operations started before the staging have completed
  • a 3-pass approach to insert new buckets, which guarantees that if multiple threads are fighting to insert the same key, one of them will eventually succeed
    • this becomes a 4-pass approach when the key is new and the insertion happens during an upsize
  • the key-value data is not stored inside the hashtable itself but as a pointer to an external structure; this makes a bucket larger (16 bytes) but dramatically speeds up copying and makes it possible to use 128-bit atomic operations to update a bucket (see the second sketch after the list)
  • pointer tagging, to store status flags in the key-value pointer itself
  • a transaction spinlock, as in the current hashtable, for the single/multi-key transactions and the Read-Modify-Write operations
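
To make the epoch mechanism more concrete, here is a minimal sketch of epoch-based reclamation following the semantics described above; every name in it (operation_begin, stage_for_deletion, can_reclaim, and so on) is hypothetical and does not mirror the actual epoch_operation_queue / epoch_gc code in this PR:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical, simplified sketch of epoch-based reclamation; names and
// layout do not mirror the actual epoch_operation_queue / epoch_gc code.

#define MAX_THREADS 64

static _Atomic uint64_t global_epoch = 1;

// Per-thread slot holding the epoch at which the current operation started;
// 0 means "no operation in flight".
static _Atomic uint64_t thread_op_epoch[MAX_THREADS];

typedef struct staged_item {
    void *key_value;            // key-value data detached from the hashtable
    uint64_t retired_at_epoch;  // epoch at which it was staged for deletion
    struct staged_item *next;
} staged_item_t;

// Called right before a GET/SET/DEL touches any bucket: this is the
// "push the operation to the queue" step.
static inline void operation_begin(int thread_id) {
    atomic_store(&thread_op_epoch[thread_id], atomic_load(&global_epoch));
}

// Called once the operation is complete.
static inline void operation_end(int thread_id) {
    atomic_store(&thread_op_epoch[thread_id], 0);
}

// Instead of freeing the key-value data right away, stage it for deletion.
static inline void stage_for_deletion(staged_item_t **list, void *key_value) {
    staged_item_t *item = malloc(sizeof(*item));
    item->key_value = key_value;
    item->retired_at_epoch = atomic_fetch_add(&global_epoch, 1);
    item->next = *list;
    *list = item;
}

// The garbage collector may free an item only when every in-flight operation
// started after the item was retired; operations that began earlier might
// still hold a pointer to it.
static inline bool can_reclaim(const staged_item_t *item) {
    for (int i = 0; i < MAX_THREADS; i++) {
        uint64_t e = atomic_load(&thread_op_epoch[i]);
        if (e != 0 && e <= item->retired_at_epoch) {
            return false;
        }
    }
    return true;
}
```

And here is a rough sketch of what a 16-byte bucket pointing to an external key-value structure could look like, updated with a single 128-bit compare-and-swap and using the low pointer bits for status tags; again, the layout, tag values and helpers are illustrative assumptions rather than the real hashtable_mpmc definitions:

```c
#include <stdbool.h>
#include <stdint.h>

// Illustrative sketch only: the field names, tag values and helpers below are
// assumptions, not the real hashtable_mpmc bucket layout.

// Low bits of an aligned pointer are always zero, so they can carry status
// flags ("pointer tagging") without making the bucket any larger.
#define KV_TAG_TEMPORARY ((uintptr_t)0x1)
#define KV_TAG_DELETED   ((uintptr_t)0x2)
#define KV_TAG_MASK      ((uintptr_t)0x3)

// The key-value data lives outside the hashtable, in its own allocation.
typedef struct key_value {
    char *key;
    void *value;
} key_value_t;

// 16-byte bucket: a hash plus a tagged pointer to the external key_value_t.
// Because the bucket is exactly 16 bytes and 16-byte aligned, it can be
// replaced with a single 128-bit compare-and-swap.
typedef struct bucket {
    uint64_t hash;
    uintptr_t tagged_kv;   // pointer to key_value_t | status tags
} __attribute__((aligned(16))) bucket_t;

static inline key_value_t *bucket_kv_pointer(const bucket_t *b) {
    return (key_value_t *)(b->tagged_kv & ~KV_TAG_MASK);
}

static inline bool bucket_kv_is_deleted(const bucket_t *b) {
    return (b->tagged_kv & KV_TAG_DELETED) != 0;
}

// Atomically swap the whole bucket (hash + tagged pointer) in one shot.
// On x86-64 this maps to cmpxchg16b (gcc/clang, typically with -mcx16).
static inline bool bucket_cas(bucket_t *b, bucket_t expected, bucket_t desired) {
    return __atomic_compare_exchange(b, &expected, &desired, false,
                                     __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
}
```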

The hashtable will also implement some extra optimizations to box the upper-level data (the storagedb_entry_index) into the lower-level data (the key-value itself), using a combination of a union and, in the header, some defines to create a typed version of the hashtable (a rough illustration follows below).
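
As a rough illustration of that union-plus-defines pattern (only storagedb_entry_index comes from the paragraph above; the other identifiers are hypothetical and simplified), a header could expose a typed wrapper along these lines:

```c
#include <stdbool.h>
#include <stdint.h>

// Rough illustration of the "boxing via union + defines" idea; apart from
// storagedb_entry_index (named in the description above) every identifier
// here is hypothetical and simplified.

typedef struct storagedb_entry_index storagedb_entry_index_t;

// The generic hashtable only sees an opaque value slot...
typedef union hashtable_mpmc_value {
    void *ptr;
    storagedb_entry_index_t *entry_index; // ...but the upper layer can box its
    uintptr_t uintptr;                    // own type into that same slot.
} hashtable_mpmc_value_t;

// Hypothetical generic entry points of the untyped hashtable.
bool hashtable_mpmc_set(void *ht, const char *key, hashtable_mpmc_value_t value);
bool hashtable_mpmc_get(void *ht, const char *key, hashtable_mpmc_value_t *value);

// A define in the header can then generate a typed wrapper, so callers never
// touch the union directly.
#define HASHTABLE_MPMC_TYPED(name, type)                                      \
    static inline bool name##_set(void *ht, const char *key, type *v) {      \
        hashtable_mpmc_value_t boxed = { .ptr = v };                         \
        return hashtable_mpmc_set(ht, key, boxed);                           \
    }                                                                         \
    static inline bool name##_get(void *ht, const char *key, type **v) {     \
        hashtable_mpmc_value_t boxed;                                         \
        if (!hashtable_mpmc_get(ht, key, &boxed))                            \
            return false;                                                    \
        *v = (type *)boxed.ptr;                                              \
        return true;                                                         \
    }

// Expands to storagedb_index_set()/storagedb_index_get(), which take
// storagedb_entry_index_t* directly instead of the generic union.
HASHTABLE_MPMC_TYPED(storagedb_index, storagedb_entry_index_t)
```

The idea is that the storage layer gets a type-safe API while the hashtable itself stays completely generic.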

This new hashtable will provide much better performance when batching operations or operating in a cluster.

The PR includes new tests and new benchmarks, which were used to generate the numbers reported at the end of this summary.

The PR is dramatically large:

  • 74 commits
  • 18 files changed
  • 4465 new lines
  • only 71 deletions

Talking about numbers, here are some:

| Operation | Threads | V1 - Million Op/s | V2 - Million Op/s |
|-----------|---------|-------------------|-------------------|
| INSERT    | 1       | 3.22191           | 5.02507           |
| INSERT    | 2       | 6.26603           | 9.74552           |
| INSERT    | 4       | 11.73655          | 19.20719          |
| INSERT    | 8       | 24.03989          | 38.92227          |
| INSERT    | 16      | 42.58111          | 72.36672          |
| INSERT    | 32      | 69.45982          | 129.38364         |
| INSERT    | 64      | 109.68172         | 197.67271         |
| UPDATE    | 1       | 3.18695           | 5.55167           |
| UPDATE    | 2       | 6.16648           | 11.06459          |
| UPDATE    | 4       | 11.53689          | 21.79368          |
| UPDATE    | 8       | 22.8819           | 43.98683          |
| UPDATE    | 16      | 37.42774          | 83.61501          |
| UPDATE    | 32      | 63.96726          | 143.30408         |
| UPDATE    | 64      | 89.5212           | 236.87745         |

Depending on the operation and thread count, the V2 hashtable is roughly 1.5 to 2.6 times faster than the V1 (the current implementation), with the gap widening as the thread count grows.

The hardware used for benchmarking was an EPYC 7502P (32 cores, 64 hardware threads, default BIOS settings) with 256 GB of DDR4-3200 RDIMM memory.

Closes #103

@danielealbano danielealbano added the enhancement New feature or request label Nov 30, 2022
@danielealbano danielealbano added this to the v0.2 milestone Nov 30, 2022
@danielealbano danielealbano self-assigned this Nov 30, 2022
@danielealbano danielealbano added this to In Progress in cachegrand via automation Nov 30, 2022

codecov bot commented Dec 1, 2022

Codecov Report

Base: 82.34% // Head: 82.74% // Increases project coverage by +0.41% 🎉

Coverage data is based on head (a602326) compared to base (48ba7eb).
Patch coverage: 91.43% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #262      +/-   ##
==========================================
+ Coverage   82.34%   82.74%   +0.41%     
==========================================
  Files         157      158       +1     
  Lines        9795    10240     +445     
==========================================
+ Hits         8065     8473     +408     
- Misses       1730     1767      +37     
| Flag      | Coverage Δ                   |
|-----------|------------------------------|
| unittests | 82.74% <91.43%> (+0.41%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|----------------|------------|
| ...rc/data_structures/hashtable/spsc/hashtable_spsc.h | 100.00% <ø> (ø) |
| ...rc/data_structures/hashtable_mpmc/hashtable_mpmc.c | 91.03% <91.03%> (ø) |
| ...rc/data_structures/hashtable/spsc/hashtable_spsc.c | 94.51% <100.00%> (-0.06%) ⬇️ |
| src/epoch_gc.c | 99.03% <100.00%> (+0.05%) ⬆️ |
| src/random.c | 100.00% <100.00%> (ø) |
| src/xalloc.c | 97.14% <0.00%> (+1.43%) ⬆️ |
| src/spinlock.h | 94.44% <0.00%> (+5.56%) ⬆️ |


@danielealbano danielealbano changed the title from "Initial implementation of a new new lockless hashtable mpmc" to "Initial implementation of a new lockless hashtable mpmc" Dec 5, 2022
@danielealbano danielealbano merged commit 7f32454 into main Dec 5, 2022
cachegrand automation moved this from In Progress to Completed Dec 5, 2022
@danielealbano danielealbano deleted the new-lockless-hashtable-mpmc branch December 5, 2022 00:00
@danielealbano danielealbano moved this from Completed to Ready for Work in cachegrand Dec 5, 2022
@danielealbano danielealbano moved this from Ready for Work to Completed in cachegrand Dec 5, 2022