Merge pull request #77 from moka-rs/import-cht

Import cht source files for better integration
moka-rs · Feb 1, 2022 · d4d9d9c
2 parents 3ce0b4f + a9671c3
Showing 15 changed files with 2,711 additions and 7 deletions.
diff --git a/.vscode/settings.json b/.vscode/settings.json
@@ -29,6 +29,7 @@
         "Ohad",
         "peekable",
         "preds",
+        "repr",
         "reqwest",
         "runtimes",
         "rustdoc",
@@ -41,6 +42,7 @@
         "thiserror",
         "toolchain",
         "trybuild",
+        "Uninit",
         "unsync",
         "Upsert",
         "usize"

diff --git a/Cargo.toml b/Cargo.toml
@@ -36,8 +36,8 @@ atomic64 = []
 # borrow violations found by Miri.
 # https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-channel/CHANGELOG.md#version-052
 crossbeam-channel = "0.5.2"
+crossbeam-epoch = "0.8.2"
 crossbeam-utils = "0.8"
-moka-cht = "0.4.2"
 num_cpus = "1.13"
 once_cell = "1.7"
 parking_lot = "0.11"

diff --git a/README.md b/README.md
@@ -492,6 +492,20 @@ This name would imply the following facts and hopes:
 [moka-pot-wikipedia]: https://en.wikipedia.org/wiki/Moka_pot
 
 
+## Credits
+
+### cht
+
+The source files of the concurrent hash table under the `moka::cht` module were
+copied from [cht v0.4.1][cht-v041] and modified by us. We did so for better
+integration between Moka and cht.
+
+cht is authored by Gregory Meyer, and its v0.4.1 and earlier versions are licensed
+under the MIT license.
+
+[cht-v041]: https://github.com/Gregory-Meyer/cht/tree/v0.4.1
+
+
 ## License
 
 Moka is distributed under either of

diff --git a/src/cht.rs b/src/cht.rs
@@ -0,0 +1,81 @@
+//! Lock-free hash tables.
+//!
+//! The hash tables in this crate are built on open addressing and boxed
+//! buckets. At the core of each table are bucket arrays, each of which
+//! consists of a vector of atomic
+//! pointers to buckets, an atomic pointer to the next bucket array, and an
+//! epoch number. In the context of this crate, an atomic pointer is a nullable
+//! pointer that is accessed and manipulated using atomic memory operations.
+//! Each bucket consists of a key and a possibly-uninitialized value.
+//!
+//! The key insight into making the hash table resizable is to incrementally
+//! copy buckets from the old bucket array to the new bucket array. As buckets
+//! are copied between bucket arrays, their pointers in the old bucket array are
+//! CAS'd with a null pointer that has a sentinel bit set. If the CAS fails,
+//! that thread must read the bucket pointer again and retry copying it into the
+//! new bucket array. If at any time a thread reads a bucket pointer with the
+//! sentinel bit set, that thread knows that a new (larger) bucket array has
+//! been allocated. That thread will then immediately attempt to copy all
+//! buckets to the new bucket array. It is possible to implement an algorithm in
+//! which a subset of buckets are relocated per-thread; such an algorithm has
+//! not been implemented for the sake of simplicity.
+//!
+//! Bucket pointers that have been copied from an old bucket array into a new
+//! bucket array are marked with a borrowed bit. If a thread copies a bucket
+//! from an old bucket array into a new bucket array and then fails to CAS the
+//! bucket pointer in the old bucket array, it attempts to CAS the bucket pointer
+//! in the new bucket array that it previously wrote. If the bucket pointer
+//! in the new bucket array does *not* have the borrowed tag bit set, that
+//! thread knows that the value in the new bucket array was modified more
+//! recently than the value in the old bucket array. To avoid discarding updates
+//! to the new bucket array, a thread will never replace a bucket pointer that
+//! does not have the borrowed tag bit set with one that does. To see why this is
+//! necessary, consider the case where a bucket pointer is copied into the new
+//! array, removed from the new array by a second thread, then copied into the
+//! new array again by a third thread.
+//!
+//! Mutating operations are, at their core, an atomic compare-and-swap (CAS) on
+//! a bucket pointer. Insertions CAS null pointers and bucket pointers with
+//! matching keys, modifications CAS bucket pointers with matching keys, and
+//! removals CAS non-tombstone bucket pointers. Tombstone bucket pointers are
+//! bucket pointers with a tombstone bit set as part of a removal; this
+//! indicates that the bucket's value has been moved from and will be destroyed
+//! if it has not been already.
+//!
+//! As previously mentioned, removing an entry from the hash table results in
+//! that bucket pointer having a tombstone bit set. Insertions cannot
+//! displace a tombstone bucket unless their key compares equal, so once an
+//! entry is inserted into the hash table, the specific index it is assigned to
+//! will only ever hold entries whose keys compare equal. Without this
+//! restriction, resizing operations could result in the old and new bucket
+//! arrays being temporarily inconsistent. Consider the case where one thread,
+//! as part of a resizing operation, copies a bucket into a new bucket array
+//! while another thread removes and replaces that bucket from the old bucket
+//! array. If the new bucket has a non-matching key, what happens to the bucket
+//! that was just copied into the new bucket array?
+//!
+//! Tombstone bucket pointers are typically not copied into new bucket arrays.
+//! The exception is the case where a bucket pointer was copied to the new
+//! bucket array, then the CAS on the old bucket array fails because that bucket has
+//! been replaced with a tombstone. In this case, the tombstone bucket pointer
+//! will be copied over to reflect the update without displacing a key from its
+//! bucket.
+//!
+//! This hash table algorithm was inspired by [a blog post by Jeff Preshing]
+//! that describes the implementation of the Linear hash table in [Junction], a
+//! C++ library of concurrent data structures. Additional inspiration was drawn
+//! from the lock-free hash table described by Cliff Click in [a tech talk] given
+//! at Google in 2007.
+//!
+//! [a blog post by Jeff Preshing]: https://preshing.com/20160222/a-resizable-concurrent-map/
+//! [Junction]: https://github.com/preshing/junction
+//! [a tech talk]: https://youtu.be/HJ-719EGIts
+
+pub(crate) mod map;
+pub(crate) mod segment;
+
+// #[cfg(test)]
+// #[macro_use]
+// pub(crate) mod test_util;
+
+pub(crate) use segment::HashMap as SegmentedHashMap;
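Reviewer note on the module docs in `src/cht.rs`: the tag-bit rules may be easier to follow with a small model. The sketch below is not the imported code. It assumes a bucket pointer can be modeled as a pointer-sized word in an `AtomicUsize` whose low three bits carry the sentinel, tombstone, and borrowed tags. The constant values and the helpers `fake_addr`, `remove`, and `copy_for_grow` are hypothetical names chosen for illustration. It shows a removal as a CAS that sets the tombstone bit, and a resize copy that refuses to overwrite a pointer lacking the borrowed bit, since such a pointer was written to the new array directly and is therefore more recent.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical tag values stored in the low bits of a pointer-sized word.
const SENTINEL_TAG: usize = 0b001; // slot has been relocated to the next array
const TOMBSTONE_TAG: usize = 0b010; // entry has been removed
const BORROWED_TAG: usize = 0b100; // pointer was copied from an older array

/// Stand-in for an 8-byte-aligned bucket address: the low three bits are clear.
fn fake_addr(n: usize) -> usize {
    n << 3
}

fn is_tombstone(word: usize) -> bool {
    word & TOMBSTONE_TAG != 0
}

/// Removal: CAS the current word with the same word plus the tombstone bit.
/// Returns false if the slot is empty, already removed, or already relocated.
fn remove(slot: &AtomicUsize) -> bool {
    let mut current = slot.load(Ordering::Acquire);
    loop {
        if current == 0 || current & (TOMBSTONE_TAG | SENTINEL_TAG) != 0 {
            return false;
        }
        let removed = current | TOMBSTONE_TAG;
        match slot.compare_exchange_weak(current, removed, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => return true,
            Err(actual) => current = actual, // lost a race; retry with the fresh value
        }
    }
}

/// Resize step: publish a borrowed copy of `old_word` in the new array,
/// unless the new slot already holds a pointer that was written to it directly.
fn copy_for_grow(new_slot: &AtomicUsize, old_word: usize) -> bool {
    let borrowed = (old_word & !SENTINEL_TAG) | BORROWED_TAG;
    let mut current = new_slot.load(Ordering::Acquire);
    loop {
        // A non-null pointer without the borrowed bit is more recent: keep it.
        if current != 0 && current & BORROWED_TAG == 0 {
            return false;
        }
        match new_slot.compare_exchange_weak(current, borrowed, Ordering::AcqRel, Ordering::Acquire) {
            Ok(_) => return true,
            Err(actual) => current = actual,
        }
    }
}

fn main() {
    let slot = AtomicUsize::new(fake_addr(1));
    assert!(remove(&slot));
    assert!(is_tombstone(slot.load(Ordering::Acquire)));
    assert!(!remove(&slot)); // a tombstone cannot be removed again

    let new_slot = AtomicUsize::new(0);
    assert!(copy_for_grow(&new_slot, fake_addr(1)));
    // Simulate a fresher write made directly to the new array (no borrowed bit).
    new_slot.store(fake_addr(2), Ordering::Release);
    assert!(!copy_for_grow(&new_slot, fake_addr(1)));
    assert_eq!(new_slot.load(Ordering::Acquire), fake_addr(2));
}
```

Real code would use tagged epoch-protected pointers rather than raw integers, so this model only illustrates the ordering rules, not memory reclamation.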
diff --git a/src/cht/map.rs b/src/cht/map.rs
@@ -0,0 +1,10 @@
+//! A lock-free hash map implemented with bucket pointer arrays, open addressing,
+//! and linear probing.
+
+pub(crate) mod bucket;
+pub(crate) mod bucket_array_ref;
+
+use std::collections::hash_map::RandomState;
+
+/// Default hasher for `HashMap`.
+pub type DefaultHashBuilder = RandomState;
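Reviewer note on `src/cht/map.rs`: for readers unfamiliar with the combination of open addressing, linear probing, and CAS-based insertion named in the module doc, here is a deliberately reduced sketch. It is not the imported implementation. `ProbeSet` is a hypothetical fixed-capacity set of nonzero `usize` keys stored inline in `AtomicUsize` slots, with no boxed buckets, resizing, or removal. It only shows how a slot is claimed with a single compare-and-swap and how probing advances to the next index on a collision, using `RandomState` as the hash builder as the diff does.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};
use std::sync::atomic::{AtomicUsize, Ordering};

const EMPTY: usize = 0; // reserved value meaning "no key stored here"

/// Hypothetical fixed-capacity, lock-free set of nonzero keys.
struct ProbeSet {
    slots: Vec<AtomicUsize>,
    build_hasher: RandomState,
}

impl ProbeSet {
    fn with_capacity(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        Self {
            slots: (0..capacity).map(|_| AtomicUsize::new(EMPTY)).collect(),
            build_hasher: RandomState::new(),
        }
    }

    fn start_index(&self, key: usize) -> usize {
        let mut hasher = self.build_hasher.build_hasher();
        key.hash(&mut hasher);
        (hasher.finish() as usize) & (self.slots.len() - 1)
    }

    /// Returns the index holding `key`, or `None` if every slot is taken.
    fn insert(&self, key: usize) -> Option<usize> {
        assert_ne!(key, EMPTY);
        let mask = self.slots.len() - 1;
        let start = self.start_index(key);
        for offset in 0..self.slots.len() {
            let index = (start + offset) & mask;
            // Try to claim an empty slot with one CAS.
            match self.slots[index].compare_exchange(EMPTY, key, Ordering::AcqRel, Ordering::Acquire) {
                Ok(_) => return Some(index),
                Err(existing) if existing == key => return Some(index), // already present
                Err(_) => continue, // occupied by another key: probe the next slot
            }
        }
        None
    }

    fn contains(&self, key: usize) -> bool {
        let mask = self.slots.len() - 1;
        let start = self.start_index(key);
        for offset in 0..self.slots.len() {
            let stored = self.slots[(start + offset) & mask].load(Ordering::Acquire);
            if stored == EMPTY {
                return false; // an empty slot ends the probe sequence
            }
            if stored == key {
                return true;
            }
        }
        false
    }
}

fn main() {
    let set = ProbeSet::with_capacity(4);
    let first = set.insert(7);
    assert!(first.is_some());
    assert_eq!(set.insert(7), first); // inserting the same key finds the same slot
    assert!(set.contains(7));
    assert!(!set.contains(8));
}
```

Because keys are never removed in this sketch, an empty slot safely terminates a lookup. The module docs describe tombstones, which serve that purpose once removal is supported.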