Lock free hash table #117

lyrm · 2023-12-20T17:36:35Z

Following the algorithm implemented in PR #62, this PR implements a hash table with better types and numerous optimizations to the algorithm. It aims to provide a fully resizable hash table (both shrinkable and growable) and an API as close as possible to that of the stdlib hash table.

Algorithm in brief

The general idea of the algorithm is the same as that of the hash set in the previous PR (see also chapter 13.3 of The Art of Multiprocessor Programming, 2nd edition): a single sorted linked list contains all the keys. The hash table itself is an array of buckets that simply serve as shortcuts (i.e. pointers) to some special nodes in the sorted linked list. Because the list is sorted, a find call only needs to traverse the nodes between two of these special nodes to determine whether a key is in the hash table. As long as the table is grown so that the number of nodes per bucket stays bounded, find runs in O(1) time on average, as one expects of a hash table.
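
The lookup described above can be sketched sequentially as follows. This is only an illustration and not the code of this PR: the node type, the names `Shortcut`, `Binding`, `find_from` and `find`, and the hand-built list are all hypothetical, and there is no concurrency here. The list is assumed to be already sorted so that every binding of a bucket sits between that bucket's shortcut and the next one.

```ocaml
(* A sequential model of "one linked list plus an array of shortcuts".
   All names here are hypothetical, chosen for illustration only. *)
type 'v node =
  | Nil
  | Shortcut of { next : 'v node } (* special node: carries no data *)
  | Binding of { key : int; value : 'v; next : 'v node }

(* Walk from a bucket's shortcut until the next shortcut or the end of
   the list: only the nodes between two special nodes are visited. *)
let find_from shortcut key =
  let rec walk = function
    | Nil | Shortcut _ -> None
    | Binding { key = k; value; next } ->
        if k = key then Some value else walk next
  in
  match shortcut with Shortcut { next } -> walk next | _ -> None

(* The bucket array only tells us where to start walking.
   Assumes the number of buckets is a power of two. *)
let find buckets key =
  find_from buckets.(key land (Array.length buckets - 1)) key

(* Hand-built list for two buckets: S0 -> 4 -> 2 -> S1 -> 1 -> 3 *)
let s1 =
  Shortcut
    {
      next =
        Binding
          {
            key = 1;
            value = "one";
            next = Binding { key = 3; value = "three"; next = Nil };
          };
    }

let s0 =
  Shortcut
    {
      next =
        Binding
          {
            key = 4;
            value = "four";
            next = Binding { key = 2; value = "two"; next = s1 };
          };
    }

let buckets = [| s0; s1 |]
```

With this list, `find buckets 2` starts at `s0` and stops as soon as it finds key 2, while `find buckets 6` gives up when it reaches `s1` without ever visiting the odd keys.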

The key advantage of the algorithm is that growing or shrinking the table does not require moving nodes from one bucket to another: it only requires adding (or removing) some "special" shortcut nodes. All the nodes that actually contain data are sorted in such a way that they never need to be moved, and they stay untouched during a grow/shrink operation.
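
To see why no data node has to move, here is a small sketch assuming the bit-reversal ("split-ordering") scheme from the book chapter cited above. The exact ordering used in this PR may differ in details, and the 8-bit width and the names `reverse_bits`, `split_order` and `bucket_runs` are assumptions made for the example.

```ocaml
(* Reverse the lowest [width] bits of [x]. *)
let reverse_bits ~width x =
  let rec go i acc x =
    if i = width then acc
    else go (i + 1) ((acc lsl 1) lor (x land 1)) (x lsr 1)
  in
  go 0 0 x

(* Hypothetical fixed hash width, kept small for readability. *)
let width = 8

(* Order hashes by their bit-reversed value: this is the order in
   which the data nodes would appear in the single linked list. *)
let split_order hashes =
  List.sort
    (fun a b ->
      Int.compare (reverse_bits ~width a) (reverse_bits ~width b))
    hashes

(* Bucket index of each node while walking the list, for a table of
   [size] buckets (a power of two), with consecutive repeats merged. *)
let bucket_runs ~size ordered =
  let rec runs = function
    | a :: (b :: _ as rest) when a = b -> runs rest
    | a :: rest -> a :: runs rest
    | [] -> []
  in
  runs (List.map (fun h -> h land (size - 1)) ordered)

let ordered = split_order (List.init 16 Fun.id)
let runs4 = bucket_runs ~size:4 ordered
let runs8 = bucket_runs ~size:8 ordered
```

The same `ordered` list serves both table sizes. With 4 buckets the walk meets the buckets as `[0; 2; 1; 3]`, and with 8 buckets as `[0; 4; 2; 6; 1; 5; 3; 7]`: each old bucket is split in place, so growing only needs a new shortcut inserted in the middle of each old run.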

Implemented changes to the algorithm

All main changes to the algorithm are/will be mentioned (and sometimes even explained!) in this gist.

Current status

- sorted linked list with multiple bindings
  - implemented operations (same semantics as the stdlib hash table): add, remove, replace, find_all, find_opt/find
  - tests with stm for these functions
  - tests with dscheck for remove and add
- empty nodes are managed as expected (they replace the dummy nodes of the original algorithm): they can be added and removed (with add_empty and try_remove ~empty=false) and have no impact on the size.

Work in progress

The linked list has most of the needed functionality. Next step: hash table!

lyrm · 2023-12-22T15:31:11Z

@polytypic I pushed my solution to make the counter work. If you could have a look, that would be great :)

polytypic · 2024-01-03T22:51:28Z

test/hmap/llist_dscheck.ml

+module Sint = Set.Make (struct
+  type t = int
+
+  let compare = compare


Set.Make (Int) would be preferable. The compare here refers to the polymorphic compare, which is much slower than Int.compare.
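
A minimal sketch of the suggested replacement, assuming OCaml 4.08 or later (where the stdlib `Int` module is available); the sample values are made up for illustration:

```ocaml
(* [Int] already provides [type t = int] and a monomorphic [compare],
   so it can be passed to [Set.Make] directly. *)
module Sint = Set.Make (Int)

let s = Sint.of_list [ 3; 1; 2; 3 ]

(* Elements come back sorted and without duplicates. *)
let elements = Sint.elements s
```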

polytypic · 2024-01-03T22:55:36Z

test/hmap/llist_dscheck.ml

+  let compare = compare
+end)
+
+let xor a b = match (a, b) with true, false | false, true -> true | _ -> false


You could also use let xor a b = a != b.
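
A quick check that the shorter form agrees with the pattern-matching one on all four inputs. The names `xor_match`, `xor_neq` and `xor_structural` are only for this comparison. Note that `!=` is physical inequality in OCaml; this is safe here because booleans are immediate values, and the structural `<>` gives the same result.

```ocaml
(* Version from the diff. *)
let xor_match a b =
  match (a, b) with true, false | false, true -> true | _ -> false

(* Suggested version: physical inequality, fine for immediate bools. *)
let xor_neq (a : bool) (b : bool) = a != b

(* Same idea with structural inequality. *)
let xor_structural (a : bool) (b : bool) = a <> b

(* All three agree on every pair of booleans. *)
let agree =
  List.for_all
    (fun (a, b) ->
      xor_match a b = xor_neq a b && xor_match a b = xor_structural a b)
    [ (false, false); (false, true); (true, false); (true, true) ]
```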

lyrm marked this pull request as draft December 20, 2023 17:37

lyrm force-pushed the lf_hstbl branch 3 times, most recently from 4bc227c to 0ffb2c9 Compare December 22, 2023 15:31

polytypic reviewed Jan 3, 2024

lyrm force-pushed the lf_hstbl branch from 0ffb2c9 to f830a52 Compare January 16, 2024 17:11

lyrm mentioned this pull request Jan 24, 2024

Saturn 1.0 : progress tracking #123

Open

12 tasks

Sudha247 added this to the 1.0 milestone Jan 29, 2024

lyrm mentioned this pull request Feb 1, 2024

Lock free hash table #62

Closed

8 tasks

lyrm added 15 commits February 20, 2024 14:17

Linked list for hashmap.

9e69b72

Stm test for length function.

1f34642

linked list supporting multiple bindings

07777ff

Preparation for hashtable.

46ce726

Add empty node management (that can be used as shortcuts by the hashtable).

8e86c74

First draft for a non resizable hashtable.

70711b4

Hashtable with linked list code inlined (keeping both implementations for benchmarking).

a831146

Resizable hashtable (WIP)

8404d2f

Debug non resizable htbl : Size was non properly managed when collision with bucket node

e7c0239

Debugging (Size issue with bucket)

42686d1

Benchmarks for hshtbl.

2cdc91a

Bench update

7f922a5

Clean up : remove linked list module (as the code as been inlined in the hshbl code.

6882723

Duplicate test for resizeable htbl.

ea9de3a

Factorize mem/find_all/find_opt under one function.

816bb66

lyrm force-pushed the lf_hstbl branch from f830a52 to 816bb66 Compare February 20, 2024 16:18
