# Hash Tables
- Hash tables are data structures that implement the map ADT.
- Items are stored and retrieved at a location computed by a function of the key.
- Items are stored, but not kept in sorted order.

## Problem A: Address Generation
- Construction of the function h(k), which should:
    - Be simple to calculate.
    - Distribute the elements uniformly across the table.

## Problem B: Collision Resolution
- What strategy to use if two keys map to the same location h(k).

# Hash Functions and Hash Tables
- A hash function h maps keys of a given type to integers in a fixed interval [0, N-1].
- Ex: h(x) = x mod N is a hash function for integer keys.
- A hash table for a given key type consists of
    - Hash function h.
    - Array (called table) of size N.
- When implementing a map with a hash table, the goal is to store item (k, o) at index i = h(k).
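
As a minimal sketch of the two components above, the following stores items at index i = h(k) using h(x) = x mod N. The table size `N = 7` and the helper names `put` and `get` are assumptions made for illustration, and collisions are deliberately not handled here.

```python
N = 7  # table size, chosen arbitrarily for this illustration

def h(k: int) -> int:
    """Hash function for integer keys: maps k into [0, N-1]."""
    return k % N

table = [None] * N  # the array (called table) of size N

def put(k: int, o) -> None:
    """Store item (k, o) at index i = h(k); overwrites on collision."""
    table[h(k)] = (k, o)

def get(k: int):
    """Return the value stored under key k, or None if k is absent."""
    item = table[h(k)]
    if item is not None and item[0] == k:
        return item[1]
    return None

put(10, "a")  # h(10) = 3
put(22, "b")  # h(22) = 1
```

Both `put` and `get` touch a single cell, which is where the O(1) behavior comes from when no two keys share an index.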

∀ key K<sub>i</sub>: h(K<sub>i</sub>) = pos, where pos is an integer giving the position of K<sub>i</sub> in the table.

If we were lucky enough to have h(K<sub>i</sub>) != h(K<sub>g</sub>) for every pair of distinct keys (i != g), there would be no collisions and a search would be very efficient (O(1)).

**Definition**: Load factor of a Hash Table

α = n/N, where n = number of elements stored and N = number of cells in the table.

- The larger the load factor, the larger the chance of collisions.
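
To make the link between load factor and collisions concrete, here is a small sketch. It assumes an idealized hash function that places each key in one of the N cells uniformly and independently, which real hash functions only approximate; the function names are hypothetical.

```python
def load_factor(n: int, N: int) -> float:
    """Load factor alpha = n / N."""
    return n / N

def prob_no_collision(n: int, N: int) -> float:
    """Probability that n keys all land in distinct cells of a table of
    size N, assuming each key is placed uniformly and independently."""
    p = 1.0
    for i in range(n):
        # The (i+1)-th key must avoid the i cells already occupied.
        p *= 1 - i / N
    return p
```

Under this assumption, `prob_no_collision` shrinks as n grows for a fixed N, and it reaches 0 as soon as n exceeds N.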

## Address Generation
- Split the problem into 2 sub-problems:
    - Hash code map: h<sub>1</sub>: keys -> integers
    - Compression map: h<sub>2</sub>: integers -> [0, TableSize - 1]
- The hash function is the composition of the two: h(x) = h<sub>2</sub>(h<sub>1</sub>(x))
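
The composition of a hash code map and a compression map can be sketched as follows. The particular choices here (summing character codes for h<sub>1</sub>, taking a remainder for h<sub>2</sub>, and `TABLE_SIZE = 11`) are assumptions picked only to keep the example short.

```python
TABLE_SIZE = 11  # assumed table size for this illustration

def h1(key: str) -> int:
    """Hash code map: key -> integer (here, the sum of character codes)."""
    return sum(ord(c) for c in key)

def h2(y: int) -> int:
    """Compression map: integer -> [0, TABLE_SIZE - 1]."""
    return y % TABLE_SIZE

def h(key: str) -> int:
    """Full hash function: h(x) = h2(h1(x))."""
    return h2(h1(key))
```

Keeping the two steps separate means the hash code can be reused unchanged if the table is later resized; only the compression map depends on the table size.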

## Hash Code Maps
- Hash codes reinterpret the key as an integer. They need to:
    1. Give the same result for the same key.
    2. Provide good "spread".

- Examples:
    - Memory address: we reinterpret the memory address of the key object as an integer (traditionally the default hash code of Java objects).
    - Integer cast: we reinterpret the bits of the key as an integer.
    - Component sum: we partition the bits of the key into components of fixed length and we sum the components.
    - Polynomial accumulation:
        - We partition the bits of the key into a sequence of components of fixed length a<sub>0</sub>a<sub>1</sub>...a<sub>n-1</sub>.
        - We evaluate the polynomial p(z) = a<sub>0</sub> + a<sub>1</sub>z + a<sub>2</sub>z<sup>2</sup> + ... + a<sub>n-1</sub>z<sup>n-1</sup> at a fixed value z, ignoring overflows.
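
The component sum and polynomial accumulation examples above might look like this for string keys, treating each character code as one component. This is a sketch under assumptions: the value `z = 33` is an arbitrary choice, the 32-bit mask stands in for "ignoring overflows" (Python integers do not overflow on their own), and the polynomial is evaluated with Horner's rule rather than by computing each power of z.

```python
MASK = 0xFFFFFFFF  # keep the low 32 bits to mimic ignoring overflows

def component_sum(key: str) -> int:
    """Sum the components (character codes) of the key."""
    total = 0
    for ch in key:
        total = (total + ord(ch)) & MASK
    return total

def polynomial_hash(key: str, z: int = 33) -> int:
    """Evaluate p(z) = a_0 + a_1*z + ... + a_{n-1}*z^(n-1),
    where a_i is the code of the i-th character."""
    result = 0
    # Horner's rule: start from a_{n-1} and work back to a_0.
    for ch in reversed(key):
        result = (result * z + ord(ch)) & MASK
    return result
```

Because addition ignores order, `component_sum` gives the same code to any two keys made of the same characters, such as "stop" and "pots". The polynomial weights each position differently, so it tells those two keys apart.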

## Compression Maps
- Take the output of the hash code and compress it into the desired range.
- If the result of the hash code was the same, the result of the compression map should be the same.
- Compression maps should maximize "spread" so as to minimize collisions.
- Examples:
    - Division:
        - h<sub>2</sub>(y) = y mod N
        - The size N of the hash table is usually chosen to be a prime number, which helps spread out hash codes that follow a regular pattern (the justification comes from number theory).
    - Multiply, Add and Divide (MAD):
        - h<sub>2</sub>(y) = (ay + b) mod N
        - a and b are nonnegative integers such that a mod N != 0.
        - Otherwise, every integer would map to the same value, b mod N.
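
Both compression maps can be sketched in a few lines. The function names and the sample values of N, a, and b are assumptions for illustration; the last lines show the degenerate case that the condition a mod N != 0 rules out.

```python
def division(y: int, N: int) -> int:
    """Division method: h2(y) = y mod N."""
    return y % N

def mad(y: int, N: int, a: int, b: int) -> int:
    """Multiply, Add and Divide: h2(y) = (a*y + b) mod N."""
    if a % N == 0:
        raise ValueError("a mod N must be nonzero")
    return (a * y + b) % N

# Degenerate case: with a = 13 and N = 13, a*y is always a multiple of N,
# so every hash code collapses to the single value b mod N.
degenerate = {(13 * y + 7) % 13 for y in range(100)}
```

Here `degenerate` contains only the value 7, which is why `mad` rejects such a choice of a.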

## Collision Resolution
- Separate chaining: if a collision occurs, use a list to store multiple keys in the same location.
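
A minimal sketch of separate chaining follows, assuming each table cell holds a Python list of (key, value) pairs. The class name `ChainedHashMap`, its method names, and the default size of 11 are hypothetical; Python's built-in `hash` serves as the hash code map and the division method as the compression map.

```python
class ChainedHashMap:
    """Map implemented as a hash table with separate chaining."""

    def __init__(self, N: int = 11):
        self._N = N                             # number of cells
        self._buckets = [[] for _ in range(N)]  # one list (chain) per cell
        self._n = 0                             # number of stored items

    def _h(self, k) -> int:
        # Hash code map (built-in hash) followed by division compression.
        return hash(k) % self._N

    def put(self, k, o) -> None:
        """Insert item (k, o), replacing the value if k is already present."""
        bucket = self._buckets[self._h(k)]
        for i, (key, _) in enumerate(bucket):
            if key == k:
                bucket[i] = (k, o)
                return
        bucket.append((k, o))  # on a collision the chain simply grows
        self._n += 1

    def get(self, k):
        """Return the value stored under k, or None if k is absent."""
        for key, value in self._buckets[self._h(k)]:
            if key == k:
                return value
        return None

    def load_factor(self) -> float:
        return self._n / self._N


m = ChainedHashMap()
m.put(1, "a")
m.put(12, "b")  # 1 and 12 both map to index 1 when N = 11
```

After these two calls both items sit in the same chain, and `get` finds each one by scanning that chain for a matching key.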