Lock free hash table #117

Draft
wants to merge 15 commits into
base: main
Collaborator

@lyrm lyrm commented Dec 20, 2023

Following the algorithm implemented in PR #62, this PR implements a hash table with better types and numerous optimizations to the algorithm. It aims to provide a fully resizable hash table (both shrinkable and growable) with an API as close as possible to that of the stdlib hash table.

Algorithm in brief

The general idea of the algorithm is the same as that of the hash set in the previous PR (see also chapter 13.3 of The Art of Multiprocessor Programming, 2nd edition): a single sorted linked list contains all the keys. The hash table itself is an array of buckets that simply serve as shortcuts (i.e. pointers) to some special nodes in the sorted linked list. Because the list is sorted, a find call only needs to traverse the nodes between two of these special nodes to determine whether a key is in the hash table. If the hash table is grown properly, find therefore runs in O(1), as expected for a hash table.

The key advantage of the algorithm is that growing or shrinking the table does not require moving nodes from one bucket to another: it only requires adding (or removing) some "special" shortcut nodes. All the nodes that actually contain data are sorted in such a way that they never need to be moved and stay untouched during a grow/shrink operation.
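As a rough illustration of why no data node moves, here is a sequential sketch of the split-ordering scheme from the cited book chapter. It is not the code of this PR: the names `reverse_bits`, `split_order`, and `runs` are hypothetical, the 8-bit width is chosen only to keep the numbers readable, and whether the PR encodes keys exactly this way is an assumption.

```ocaml
(* Illustrative only: 8-bit hashes, sequential code, hypothetical names. *)
let width = 8

(* Reverse the lowest [width] bits of [x]. *)
let reverse_bits x =
  let rec go i acc x =
    if i = width then acc
    else go (i + 1) ((acc lsl 1) lor (x land 1)) (x lsr 1)
  in
  go 0 0 x

(* The list order: hashes sorted by their bit-reversed value. *)
let split_order hashes =
  List.sort (fun a b -> Int.compare (reverse_bits a) (reverse_bits b)) hashes

(* Group the ordered list into consecutive runs, one per bucket, for a
   table of [size] buckets ([size] must be a power of two). *)
let runs ~size hashes =
  List.fold_right
    (fun h acc ->
      let b = h land (size - 1) in
      match acc with
      | (b', hs) :: rest when b' = b -> (b, h :: hs) :: rest
      | _ -> (b, [ h ]) :: acc)
    (split_order hashes) []

let hashes = [ 0; 1; 2; 3; 4; 5; 6; 7 ]
let with_two_buckets = runs ~size:2 hashes
let with_four_buckets = runs ~size:4 hashes
```

With two buckets the list splits into two runs, and with four buckets each run is cut in half. The underlying order of the hashes is identical in both cases, so growing only needs new shortcut nodes at the cut points.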

Implemented changes to the algorithm

All the main changes to the algorithm are/will be mentioned (and sometimes even explained!) in this gist.

Current status

  • sorted linked list with multiple bindings
    • implemented operations (same semantics as the stdlib hash table): add, remove, replace, find_all, find_opt/find
    • tests with stm for these functions
    • tests with dscheck for remove and add
  • empty nodes are managed as expected (they replace the dummy nodes of the original algorithm): they can be added and removed (with add_empty and try_remove ~empty=false) and have no impact on the size.
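For reference, the stdlib semantics for multiple bindings that the list operations are meant to match can be seen with the stdlib `Hashtbl` itself. This sketch only uses the stdlib, not the module from this PR, and the `demo` function name is hypothetical.

```ocaml
(* Stdlib reference behaviour for several bindings of the same key. *)
let demo () =
  let t : (string, int) Hashtbl.t = Hashtbl.create 8 in
  Hashtbl.add t "k" 1;
  Hashtbl.add t "k" 2;
  (* [add] shadows the previous binding; [find_all] lists the most
     recent binding first. *)
  let all = Hashtbl.find_all t "k" in
  (* [replace] only replaces the current (most recent) binding. *)
  Hashtbl.replace t "k" 3;
  let after_replace = Hashtbl.find_all t "k" in
  (* [remove] drops the current binding and restores the previous one. *)
  Hashtbl.remove t "k";
  let after_remove = Hashtbl.find_opt t "k" in
  (all, after_replace, after_remove)
```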

Work in progress

The linked list has most of the needed functionality. Next step: the hash table!

@lyrm lyrm marked this pull request as draft December 20, 2023 17:37
@lyrm lyrm force-pushed the lf_hstbl branch 3 times, most recently from 4bc227c to 0ffb2c9 Compare December 22, 2023 15:31
Collaborator Author

lyrm commented Dec 22, 2023

@polytypic I pushed my solution to make the counter work. If you could have a look, that would be great :)

module Sint = Set.Make (struct
type t = int

let compare = compare
Contributor

Set.Make (Int) would be preferable. The compare here refers to the polymorphic compare, which is much slower than Int.compare.
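A minimal sketch of the suggested form, assuming OCaml 4.08 or later (where the stdlib `Int` module is available). The `Sint` name comes from the snippet under review; the value `s` is only there for illustration.

```ocaml
(* [Int] already provides [type t = int] and a specialised [compare]. *)
module Sint = Set.Make (Int)

let s = Sint.of_list [ 3; 1; 2; 1 ]
```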

let compare = compare
end)

let xor a b = match (a, b) with true, false | false, true -> true | _ -> false
Contributor

You could also use let xor a b = a != b.
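A quick check that the two definitions agree on every input. The names `xor_match`, `xor_neq`, and `agree` are hypothetical. Note that `!=` is physical inequality in OCaml; it is safe here because booleans are immediate values, and the structural `<>` would give the same result.

```ocaml
(* The definition from the diff. *)
let xor_match a b =
  match (a, b) with true, false | false, true -> true | _ -> false

(* The suggested alternative, restricted to booleans. *)
let xor_neq (a : bool) (b : bool) = a != b

(* Compare both definitions on all four input pairs. *)
let agree =
  List.for_all
    (fun (a, b) -> xor_match a b = xor_neq a b)
    [ (false, false); (false, true); (true, false); (true, true) ]
```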

@lyrm lyrm mentioned this pull request Jan 24, 2024
12 tasks
@Sudha247 Sudha247 added this to the 1.0 milestone Jan 29, 2024
@lyrm lyrm mentioned this pull request Feb 1, 2024
8 tasks

3 participants