High performance, compact and ethical C++ hash sets and maps:
- Fast open addressing hash set tuned for small sets.
- Fast open addressing hash map tuned for small maps.
- Fast open addressing hash set with link list tuned for small maps with predictable iteration order.
- Fast open addressing hash map with link list tuned for small maps with predictable iteration order.
These hash sets and maps are open-addressing hashtables similar
to google/dense_hash_map
, but they use tombstone bitmaps to
eliminate the necessity for empty or deleted key sentinels.
These containers implement most of the C++ unordered associative
container requrements thus can be substituted for unordered_map
and typical use cases including C++11 for-loop.
There is currently no equivalent to linked_hash_map
in the STL so it
is perhaps somewhat like an ordered_map
that uses the pos iterator
hint in the first argument of insert to set the position of a new
value inserted into the linked list, while find uses the hashed key
index to locate key value pairs in the hash table.
These maps are designed for a small memory footprint. There is one structure and one heap allocation which is divided between an array of tuples composed of (Key, Value) pairs and a tombstone bitmap. Key is hashed to index into the open addressing array, and collisions are handled by skipping to the next available slot.
Deletions requires the use of tombstones. Sentinel key values occupy key-space, and make containers incompatible with associative container requirements, thus a 2-bit tombstone array is used. 2-bits are used to distinguish available, occupied, and deleted states.
enum bitmap_state {
available = 0, occupied = 1, deleted = 2, recycled = 3
};
linked_hash_map adds (next, prev) indices to the array tuple, (head, tail) indices to the structure.
The follow table shows memory usage for the default 16 slot map (table units are in bytes):
map | struct | malloc |
---|---|---|
hash_set | 40 | 68 |
hash_set | 40 | 132 |
hash_map<u32,u32> | 40 | 132 |
hash_map<u64,u64> | 40 | 260 |
linked_hash_set<u32,i32> | 48 | 198 |
linked_hash_set<u64,i64> | 56 | 388 |
linked_hash_map<u32,u32,i32> | 48 | 260 |
linked_hash_map<u64,u64,i64> | 56 | 516 |
The following table shows the structure size in bytes on x86_64:
map | size |
---|---|
sizeof(hash_map) | 40 |
sizeof(linked_hash_map) | 48 |
sizeof(absl::flat_hash_map) | 48 |
sizeof(std::unordered_map) | 56 |
sizeof(tsl::robin_map) | 80 |
sizeof(google::dense_hash_map) | 88 |
linked_hash_map by default uses 32-bit integers for indices, limiting it to 2^31 entries, however, it can be instantiated with alternative integer types for indices.
cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build