## 1

Here's a counterexample, showing an MST whose total length is clearly larger than that of the convex hull:

![](quasi_circle.png)

## 2

After centering the points, their new mean (their center of mass) is the origin, which means they become scattered around it and the projection planes can better discriminate points that are truly different. If we don't center, the farther the mean is from the origin, the more likely it is that only a few planes ever become relevant. See e.g. this example in 2D, where the blue plane is entirely non-discriminative:

![](planes_and_points.png)
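The effect can be checked numerically with a minimal sketch. It assumes random hyperplanes through the origin (the usual random-projection setup); the cluster location, the spread, and the helper names `splitting_planes` and `center` are made up for illustration.

```python
import random

def splitting_planes(points, normals):
    """Count the hyperplanes through the origin that have points on both sides."""
    count = 0
    for n in normals:
        # Side of the plane = sign of the dot product with the normal vector.
        sides = {sum(p_i * n_i for p_i, n_i in zip(p, n)) >= 0 for p in points}
        if len(sides) == 2:
            count += 1
    return count

def center(points):
    """Subtract the mean so that the center of mass becomes the origin."""
    dim = len(points[0])
    mean = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    return [tuple(p[i] - mean[i] for i in range(dim)) for p in points]

rng = random.Random(0)
# Hypothetical data: a tight 2D cluster far away from the origin.
points = [(10 + rng.gauss(0, 1), 10 + rng.gauss(0, 1)) for _ in range(200)]
normals = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]

raw = splitting_planes(points, normals)
centered = splitting_planes(center(points), normals)
print(f"discriminative planes: {raw} without centering, {centered} with centering")
```

After centering, the projections onto any normal sum to zero, so every plane has points on both sides, whereas without centering most planes see the whole cluster on one side.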

## 3

If we choose

$\boldsymbol{u}^{(1)} := (1, 1, 1, 1, 0), \boldsymbol{u}^{(2)} := (0, 1, 1, 1, 1)$

and

$\boldsymbol{v}^{(1)} := (2, 1, 3, 4, 5), \boldsymbol{v}^{(2)} := (2, 3, 1, 4, 5), \boldsymbol{v}^{(3)} := (2, 3, 4, 1, 5)$,

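then both $\text{MinHash}(\boldsymbol{u}^{(1)})$ and $\text{MinHash}(\boldsymbol{u}^{(2)})$ are equal to $(1, 1, 1)$ (one component per permutation), even though the two sets differ: their Jaccard similarity is only $3/5$.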
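The signatures can be verified with a short sketch. It assumes the common definition in which each permutation $\boldsymbol{v}$ contributes the smallest $v_i$ over the positions where $u_i = 1$; the function names are illustrative only.

```python
def minhash(u, permutations):
    """MinHash signature of the set encoded by the 0/1 vector u.

    Assumed definition: for each permutation v, take the smallest v[i]
    over the positions i where u[i] == 1.
    """
    return tuple(
        min(v_i for u_i, v_i in zip(u, v) if u_i == 1) for v in permutations
    )

def jaccard(a, b):
    """Jaccard similarity of two sets given as 0/1 vectors."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either

u1 = (1, 1, 1, 1, 0)
u2 = (0, 1, 1, 1, 1)
vs = [(2, 1, 3, 4, 5), (2, 3, 1, 4, 5), (2, 3, 4, 1, 5)]

print(minhash(u1, vs), minhash(u2, vs))  # identical signatures
print(jaccard(u1, u2))                   # yet the sets are not identical
```

Every permutation assigns its smallest value to one of the positions 2, 3, 4, which both sets contain, so the differing elements never influence the signature.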

## 4

Parity bits:

$$\begin{cases}
b_1 = 1 \\
b_2 = 0 \\
b_4 = 0 \\
\end{cases}
$$

Data bits:

$$\begin{cases}
b_3 = 0 \\
b_5 = 1 \\
b_6 = 0 \\
b_7 = 0
\end{cases}
$$

The syndrome is:

$$\begin{cases}
0 = b_1 \oplus b_3 \oplus b_5 \oplus b_7 \\
0 = b_2 \oplus b_3 \oplus b_6 \oplus b_7 \\
1 = b_4 \oplus b_5 \oplus b_6 \oplus b_7 \\
\end{cases}
$$

Since the syndrome is not 0, there's at least one error. Let's fix it, assuming a single-bit error: reading the syndrome from bottom to top gives 100 in binary, i.e. 4, so the incorrect bit is $b_4$. Since $b_4$ is a parity bit, the data is uncorrupted, and therefore the result is $(b_3, b_5, b_6, b_7) = (0, 1, 0, 0)$.
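The same decoding steps can be sketched in code. This is a minimal illustration that assumes at most one flipped bit and the parity equations listed above; `hamming74_decode` is a name chosen here, not part of the exercise.

```python
def hamming74_decode(bits):
    """Decode a Hamming(7,4) word given as [b1, ..., b7].

    Returns (error_position, data_bits); error_position is 0 when the
    syndrome is zero. Assumes at most one bit was flipped.
    """
    b = [None] + list(bits)  # pad so that b[i] is the bit called b_i in the text
    s1 = b[1] ^ b[3] ^ b[5] ^ b[7]
    s2 = b[2] ^ b[3] ^ b[6] ^ b[7]
    s4 = b[4] ^ b[5] ^ b[6] ^ b[7]
    # Reading the syndrome from bottom to top: s4 is the most significant bit.
    error_position = 4 * s4 + 2 * s2 + s1
    if error_position:
        b[error_position] ^= 1  # flip the corrupted bit back
    return error_position, (b[3], b[5], b[6], b[7])

received = [1, 0, 0, 0, 1, 0, 0]  # b1..b7 from the exercise
position, data = hamming74_decode(received)
print(position, data)
```

For the received word above, the function reports position 4 and leaves the data bits untouched, matching the manual computation.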

## 5

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

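def recursion(node):
    """Return a mirrored deep copy of the tree rooted at `node`."""
    if node is None:
        return None
    # Node requires a value, so pass it to the constructor.
    newNode = Node(node.value)
    # Swap the children while copying them recursively.
    newNode.left = recursion(node.right)
    newNode.right = recursion(node.left)
    return newNode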
```

## 6

We will only evict if: a) the first choice cell $h(x)$ is occupied, *and* b) the second choice cell $h(x) \oplus f(x)$ is also occupied. If $f$ is guaranteed to be non-zero, then these two cells are distinct.

Therefore, the probability will be $5/10 \text{ (probability that the first cell is occupied)} \cdot 4/9 \text{ (probability that the second cell is occupied)} = 20/90 \approx 22.2\%$.
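As a sanity check, the product can be reproduced by brute-force enumeration. The sketch assumes a table of 10 cells with 5 of them occupied (as the fractions above suggest) and models the two candidate cells as a uniformly random ordered pair of distinct cells.

```python
from fractions import Fraction
from itertools import permutations

cells = 10
occupied = set(range(5))  # which 5 cells are full does not matter by symmetry

# Every ordered pair (first choice, second choice) of distinct cells.
pairs = list(permutations(range(cells), 2))
both_full = sum(1 for first, second in pairs if first in occupied and second in occupied)

p_evict = Fraction(both_full, len(pairs))
print(p_evict, float(p_evict))
```

The enumeration counts $5 \cdot 4 = 20$ fully occupied pairs out of $10 \cdot 9 = 90$, which is the same product as above.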

## 7

Remember that the optimal strategy has oracle-like access to the whole sequence of inputs (candidates), so it can always pick the best candidate with probability 1.

a. If we immediately hire the first candidate, then the probability that they are the best is $1/N$. Therefore, the competitive ratio is $c_r = \frac{1}{1/N} = N$.

b. If we explore the market first (interview and reject the first $N/2$ candidates), there's a 50% chance the best candidate was among that pool, in which case we missed our shot. We therefore succeed with probability at most $0.5$, so the competitive ratio cannot be lower than $\frac{1}{0.5} = 2$.
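Both ratios can be checked exactly for a small $N$ by enumerating all arrival orders. In this sketch, the rule used after the exploration phase (hire the first candidate who beats everyone seen so far, otherwise the last one) is an assumption, since the answer above only bounds the ratio; $N = 6$ and the function names are arbitrary choices.

```python
from fractions import Fraction
from itertools import permutations

def success_probability(n, strategy):
    """Exact probability that `strategy` hires the best of n candidates.

    Candidates are ranks 0..n-1 (n-1 is the best); all orders are equally likely.
    `strategy(order)` returns the index of the hired candidate.
    """
    orders = list(permutations(range(n)))
    wins = sum(1 for order in orders if order[strategy(order)] == n - 1)
    return Fraction(wins, len(orders))

def hire_first(order):
    """Strategy a: hire the first candidate immediately."""
    return 0

def explore_then_hire(order):
    """Strategy b: reject the first half, then hire the first candidate
    who beats all of them (assumed rule); fall back to the last candidate."""
    k = len(order) // 2
    best_seen = max(order[:k])
    for i in range(k, len(order)):
        if order[i] > best_seen:
            return i
    return len(order) - 1

n = 6
p_a = success_probability(n, hire_first)
p_b = success_probability(n, explore_then_hire)
print("a:", p_a, "ratio", 1 / p_a)
print("b:", p_b, "ratio", 1 / p_b)
```

For strategy a the ratio equals $N$, and for strategy b the success probability stays below $1/2$, so its ratio is above 2, consistent with the bound.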