# Week 09

Implement the bootstrapping algorithm from the lectures:

![](bootstrapping_algo.png)

You have two algorithms; which one is better? Both try to predict the sum of two dice: one guesses randomly, the other uses the *maximum likelihood* value:

- $\mathcal A_1$: guess randomly, by rolling two dice of its own
- $\mathcal A_2$: always return the most likely value


Probability of each sum when rolling two dice:

| Sum $k$   | # of outcomes  | Probability $P(k)$   |
|:---------:|:--------------:|:--------------------:|
| 2         | 1              | 1/36                 |
| 3         | 2              | 2/36                 |
| 4         | 3              | 3/36                 |
| 5         | 4              | 4/36                 |
| 6         | 5              | 5/36                 |
| 7         | 6              | 6/36                 |
| 8         | 5              | 5/36                 |
| 9         | 4              | 4/36                 |
| 10        | 3              | 3/36                 |
| 11        | 2              | 2/36                 |
| 12        | 1              | 1/36                 |

So the most likely value is 7, and the second algorithm will always guess 7.
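The table can also be derived by enumerating all 36 equally likely outcomes instead of counting by hand. The following is a minimal sketch; the helper name `two_dice_distribution` and the `sides` parameter are illustrative and not part of the exercise.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution(sides=6):
    """Return {sum: exact probability} for two fair dice with `sides` faces."""
    # Enumerate every ordered pair (a, b) and count how often each sum occurs
    counts = Counter(a + b for a, b in product(range(1, sides + 1), repeat=2))
    total = sides ** 2
    return {k: Fraction(c, total) for k, c in sorted(counts.items())}

dist = two_dice_distribution()
most_likely = max(dist, key=dist.get)  # sum with the highest probability
print(most_likely, dist[most_likely])  # prints: 7 1/6
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the table (6/36 reduces to 1/6).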

In [3]:
import numpy as np
import random
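random.seed(0)     # seeds the dice rolls (random.randint)
np.random.seed(0)  # seeds the bootstrap resampling (np.random.choice)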

N = 1000000

def roll_two_dice():
    return random.randint(1, 6) + random.randint(1, 6)

def generate_dice_tosses(n = N):
    return np.array([roll_two_dice() for _ in range(n)])

def alg_1():
    return roll_two_dice()

def alg_2():
    return 7

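def bootstrap_compare(test_set, b=1000):
    n = len(test_set)  # use the test-set size rather than the global N

    # One prediction per test item from each algorithm
    pred1 = np.array([alg_1() for _ in range(n)])
    pred2 = np.array([alg_2() for _ in range(n)])

    acc1 = np.mean(pred1 == test_set)
    acc2 = np.mean(pred2 == test_set)

    # Per-item difference in correctness (+1, 0 or -1); its mean is acc2 - acc1
    d = (pred2 == test_set).astype(int) - (pred1 == test_set).astype(int)
    delta_obs = np.mean(d)

    # Resample the paired differences with replacement, b times
    count = 0
    for _ in range(b):
        sample = np.random.choice(d, size=n, replace=True)
        # Resamples are centred on delta_obs rather than 0, hence the threshold 2 * delta_obs
        if np.mean(sample) >= 2 * delta_obs:
            count += 1
    p_value = count / b

    return acc1, acc2, delta_obs, p_value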

test = generate_dice_tosses()

acc1, acc2, delta_obs, p_val = bootstrap_compare(test, b=1000)

# Print the results
print(f"Accuracy of alg_1 (random guess):   {acc1:.4f}")
print(f"Accuracy of alg_2 (always guess 7): {acc2:.4f}")
print(f"Observed acc2 − acc1:               {delta_obs:.4f}")
print(f"Bootstrap p-value:                  {p_val:.4f}")

Accuracy of alg_1 (random guess):   0.1134
Accuracy of alg_2 (always guess 7): 0.1672
Observed acc2 − acc1:               0.0538
Bootstrap p-value:                  0.0000


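The algorithm that always guesses 7 is about 5.4 percentage points more accurate on average than the random algorithm (0.1672 vs. 0.1134).

The p-value of 0.0000 means that none of the 1000 bootstrap resamples reached the threshold, so the difference is statistically highly significant ($p < 0.001$).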
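As a sanity check on the measured accuracies, the expected values can be computed exactly. This sketch assumes fair dice and that $\mathcal A_1$ draws its guess independently of the test roll, as `alg_1` does; the variable names are illustrative.

```python
from fractions import Fraction

# P(k) for the sum of two fair dice: (6 - |k - 7|) / 36 for k = 2..12
p = {k: Fraction(6 - abs(k - 7), 36) for k in range(2, 13)}

# A_1 is correct when its own roll and the test roll give the same sum k
expected_acc1 = sum(pk * pk for pk in p.values())
# A_2 is correct whenever the test roll is 7
expected_acc2 = p[7]
expected_delta = expected_acc2 - expected_acc1

print(f"{float(expected_acc1):.4f} {float(expected_acc2):.4f} {float(expected_delta):.4f}")
# prints: 0.1127 0.1667 0.0540
```

These exact values (73/648, 1/6 and 35/648) are close to the measured 0.1134, 0.1672 and 0.0538, which suggests the simulation behaves as intended.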

To do:

- Either derive the most likely value by hand or simulate and count.
- Implement the bootstrapping algorithm and compare the two algorithms.
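For the "simulate and count" option in the first item, one possible sketch is shown here. The function name `simulate_most_likely_sum`, the number of rolls, and the seed are assumptions chosen for illustration.

```python
import random
from collections import Counter

def simulate_most_likely_sum(n_rolls=100_000, seed=0):
    """Roll two dice n_rolls times; return the most frequent sum and its relative frequency."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls))
    value, count = counts.most_common(1)[0]
    return value, count / n_rolls

value, freq = simulate_most_likely_sum()
print(value, round(freq, 3))
```

With this many rolls the most frequent sum should be 7, with a relative frequency near 1/6.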