Skip to content

Hashmap

BrandonRobare edited this page Jun 2, 2026 · 1 revision

Hashmap

Folder: 07-hashmap

What it is

A hash table written from scratch as a class template. It maps keys to values in roughly constant time by hashing each key to a bucket. Collisions, where two keys land in the same bucket, are handled by separate chaining: every bucket is a linked list, and colliding keys sit in the same list. This is the strongest data structure in the repository, so the page goes deep.

The pieces

The hash function is a small functor, DefaultHash<T>. It sums the bytes of the key and takes the result modulo the bucket count. The container is parameterized on the key type, the value type, an equality comparator (std::equal_to by default), and the hash functor, which mirrors how std::unordered_map is parameterized.

Inside, the table is a vector<list<Element>>, where Element is pair<const Key, Value>. The outer vector is the bucket array. Each inner list is one bucket's chain.

Bucket structure

flowchart LR
    subgraph BucketArray["elems_ : vector of buckets"]
        b0["bucket 0"]
        b1["bucket 1"]
        b2["bucket 2"]
        bn["bucket n-1"]
    end

    b0 --> e0["(k4, 40)"]
    b1 --> e1a["(k6, 60)"] --> e1b["(k107, 12)"]
    b2 --> empty["empty"]
    bn --> en["(k88, 7)"]

    key["key"] -->|hash mod numBuckets| BucketArray
Loading

The diagram shows a collision in bucket 1: keys k6 and k107 both hashed there, so they share a chain.

How the operations work

find hashes the key to a bucket, then walks that bucket's list with the helper findElement. It returns a pointer to the matching element or nullptr.

insert is a safe insert. It hashes the key, looks in the bucket, and if the key is absent it appends the pair and returns make_pair(pointer, true). If the key is already there it returns make_pair(pointer_to_existing, false), so the caller can tell an insertion from a no-op. This is the same contract std::map::insert offers.

erase removes by key and returns a pair. The pointer is the element that followed the erased one, even when that successor lives in a later bucket, which is why erase scans forward through the bucket array if the current bucket runs out. The boolean reports whether anything was removed.

operator[] reuses insert so the key is looked up only once. It inserts a default-constructed value if the key is missing, then returns a reference to the value either way.

rehash grows the table. When asked for more buckets, it builds a new bucket vector and a new hash functor sized to the new count, then re-hashes every existing element into its new bucket. If the requested size is not larger than the current size it does nothing.

Structure

classDiagram
    class DefaultHash~T~ {
        -size_t numBuckets_
        +hash(T) size_t
        +numBuckets() size_t
    }
    class hashmap~Key, Value, Compare, Hash~ {
        -size_t size_
        -Compare comp_
        -Hash hash_
        -vector~list~Element~~ elems_
        +find(Key) Element*
        +insert(Element) pair~Element*, bool~
        +erase(Key) pair~Element*, bool~
        +operator[](Key) Value&
        +rehash(size_t) void
    }
    hashmap~Key, Value, Compare, Hash~ *-- DefaultHash~T~ : hashes keys with
Loading

Build and run

g++ -std=c++17 07-hashmap/testHashmap.cpp -o testHashmap
./testHashmap

g++ -std=c++17 07-hashmap/testHashmapUpdated.cpp -o testHashmapUpdated
./testHashmapUpdated

testHashmap.cpp reads a few employee numbers and names from standard input; testHashmapUpdated.cpp runs assertions with no input.

Notes

The container skeleton and the default hash came from the course, adapted from Professional C++. My work is the safe insert returning a pair, the erase that returns the next element across bucket boundaries, the single-lookup operator[], and rehash. One caveat carried over from the skeleton: DefaultHash hashes the raw bytes of the key, so it works for integer keys but not for std::string keys, which would need a string-aware hash functor.

Clone this wiki locally