## Exercise 1: Hash functions
### Problem
Suppose $h$ is drawn uniformly at random from a $2$-universal family of hash functions from $[n]$ to $[n^3]$. Show that $h$ is injective with probability at least $1-\dfrac1n$.
### Solution

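```python
from lib.universal_hash import UniversalHashTable

n = 20
nums = list(range(n))  # the keys 0, ..., n-1
num_simulations = 10000
checks = []

for _ in range(num_simulations):
    # A new table per run (assumed to draw a fresh random hash function
    # with range size n^3), into which all n keys are inserted.
    table = UniversalHashTable(m=n ** 3)
    table.insert_all(nums)

    # Record whether the hash function was injective on the keys.
    checks.append(table.is_injective())

# Empirical probability of injectivity, in percent.
success_rate = sum(checks) / num_simulations * 100
print(f"{success_rate:.3f}%")
```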

97.820%
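
The simulation only checks the claim empirically. A sketch of the proof, assuming the usual definition of $2$-universality (for any distinct $x,y\in[n]$, $P[h(x)=h(y)]\le\dfrac1{n^3}$), is a union bound over all pairs of keys:

$$P[h\text{ is not injective}]\le\sum_{\{x,y\}\subseteq[n],\,x\ne y}P[h(x)=h(y)]\le\binom n2\cdot\dfrac1{n^3}=\dfrac{n-1}{2n^2}\le\dfrac1n$$

For $n=20$ the middle expression gives a success probability of at least $1-\dfrac{19}{800}\approx97.6\%$, which is in line with the simulated 97.820%.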


## Exercise 2: The Tidemark algorithm
### Problem
The purpose of the following exercises is to walk you through part of the proof in section 2.3 of the book.
#### Part 1
Describe the indicator variables $X_{r,j}$ and $Y_r$ in your own words.

#### Part 2
Calculate $\mathbb E[X_{r,j}]$ and $\mathbb E[Y_r]$. You may use the fact that $h$ is uniform. Does $\mathbb E[X_{r,j}]$ depend on $j$, on $r$, or both?

#### Part 3
Show that $P[Y_r\ge1]\le\dfrac d{2^r}$.

#### Part 4
Show that for any random variable $X$, $\text{Var}[X]\le\mathbb E[X^2]$.
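
One possible route, assuming the standard identity $\text{Var}[X]=\mathbb E[X^2]-(\mathbb E[X])^2$ is available:

$$\text{Var}[X]=\mathbb E[X^2]-(\mathbb E[X])^2\le\mathbb E[X^2],\quad\text{since }(\mathbb E[X])^2\ge0.$$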

#### Part 5
Show that $\text{Var}[Y_r]\le\dfrac d{2^r}$. You may use linearity of variance (this is applicable since the $X_{r,j}$-variables are $2$-independent).

#### Part 6
Use Chebyshev's inequality to show that $P[Y_r=0]\le\dfrac{2^r}d$.
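
For orientation, the algorithm analysed in these parts can be sketched in code. This is a minimal sketch based on the standard description of Tidemark (track the largest number of trailing zero bits seen among the hashed tokens and output $2^{z+1/2}$), not the book's exact pseudocode. The class name `Tidemark`, the helper `zeros`, and the hash $x\mapsto(ax+b)\bmod p$ with $p=2^{61}-1$ are assumptions made for illustration.

```python
import random


def zeros(x: int) -> int:
    """Number of trailing zero bits in the binary representation of x > 0."""
    return (x & -x).bit_length() - 1


class Tidemark:
    """Illustrative sketch of a Tidemark-style distinct-elements estimator."""

    def __init__(self, seed=None):
        rng = random.Random(seed)
        # Assumption: a linear hash modulo a Mersenne prime stands in for
        # a hash function drawn from a 2-universal family.
        self.p = (1 << 61) - 1
        self.a = rng.randrange(1, self.p)
        self.b = rng.randrange(self.p)
        self.z = 0  # largest trailing-zero count seen so far

    def _hash(self, j: int) -> int:
        return (self.a * j + self.b) % self.p

    def process(self, j: int) -> None:
        h = self._hash(j)
        # Simplification: a hash value of 0 has no finite trailing-zero
        # count and is skipped here.
        if h != 0:
            self.z = max(self.z, zeros(h))

    def estimate(self) -> float:
        return 2 ** (self.z + 0.5)
```

Because the state depends only on the set of hashed tokens, repeating tokens in the stream does not change the estimate; only the number of distinct tokens $d$ matters, which is why the analysis above is phrased in terms of $d$.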

- - -
## Exercise 3: Counting rare elements
### Problem
Paul goes fishing. There are $u$ different fish species $U=\{1,\ldots, u\}$. Paul catches one fish at a time. Let $a_t$ be the fish species he catches at time $t$. Let $c_t[j]=|\{i\le t : a_i=j\}|$ be the number of times he catches a fish of species $j$ up to time $t$. Species $j$ is rare at time $t$ if it appears precisely once in his catch up to time $t$. The rarity $\rho(t)$ of his catch at time $t$ is defined as:
$$\rho(t)=\dfrac{\#\text{ rare species}}{u}$$
#### Part 1
Explain how Paul can calculate $\rho(t)$ precisely, using $2u+\log(m)$ bits of space.
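
One possible approach (a sketch, not necessarily the intended solution) keeps a saturating counter per species with the three states "never seen", "seen once", and "seen at least twice", which fits in 2 bits per species. The function name `exact_rarity` is hypothetical, and a plain Python list is used for readability instead of an actual packed 2-bit array.

```python
def exact_rarity(stream, u):
    """Exact rarity of a stream over species 1..u.

    state[j] is 0 (never seen), 1 (seen once) or 2 (seen at least twice),
    so each entry would fit in 2 bits in a packed representation.
    """
    state = [0] * (u + 1)  # index 0 is unused
    for a in stream:
        if state[a] < 2:   # saturate at 2
            state[a] += 1
    rare = sum(1 for s in state[1:] if s == 1)
    return rare / u
```

For example, on the catch `[1, 2, 2, 3, 3, 3, 4]` with $u=5$, species 1 and 4 are rare, so the rarity is $2/5$.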

#### Part 2
However, Paul wants to store only as many bits as will fit in his tiny suitcase, i.e., $o(u)$, preferably $O(1)$ bits.
Therefore, at the beginning Paul picks $k$ fish species independently and uniformly at random (each species is chosen with probability $1/u$), and maintains the number of times each of these species appears in his catch as he catches one fish after another. Paul outputs the estimate:
$$\hat\rho(t)=\dfrac{\#\text{ rare species in the sample}}{k}$$
Let $c_1(t),\ldots,c_k(t)$ be the value of the counters at time $t$. Show that:
$$P[\hat\rho(t)\ge3\rho(t)]\le\dfrac13$$

Hint: first calculate $P[c_i(t)=1]$.
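
The sampling estimator $\hat\rho(t)$ can be sketched as follows, which may help when checking the bound empirically. This is a minimal sketch under stated assumptions: the function name `sampled_rarity` and the `seed` parameter are hypothetical, and the $k$ species are sampled with replacement so that the draws are independent.

```python
import random


def sampled_rarity(stream, u, k, seed=None):
    """Estimate the rarity of a stream over species 1..u from k sampled species."""
    rng = random.Random(seed)
    # k independent, uniform draws from {1, ..., u} (with replacement).
    sample = [rng.randint(1, u) for _ in range(k)]
    # One counter per distinct sampled species.
    counts = {s: 0 for s in sample}
    for a in stream:
        if a in counts:
            counts[a] += 1
    # A sampled species counts as rare if it was caught exactly once.
    rare_in_sample = sum(1 for s in sample if counts[s] == 1)
    return rare_in_sample / k
```

If every species is caught exactly once, every sampled counter equals 1 and the estimate is 1 regardless of the random choices; if every species is caught twice, the estimate is 0.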

## Exercise 4: Approximate Counting
### Problem
Solve exercise 4.1 from the book.
### Solution

## Exercise 5: Hash families
### Problem
Solve exercise 2.1 from the book.
### Solution