# Calculation Exercises for 2.5 (Bayes Classification)

---

## Exercise 1

Let a dataset $D$ be given by

$$
D = \{((1,0),c_1),((0,1),c_1),((1,1),c_2),((0,0),c_2),((0,1),c_1)\}
$$

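where $Z_1=Z_2=\{0,1\}$ and the set of classes is $Z = \{c_1, c_2\}$. Note that $((0,1),c_1)$ appears twice in $D$; both occurrences are counted, so $D$ contains five labelled points.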

**Tasks**:

1. Determine $P(c_1 \mid D)$ and $P(c_2 \mid D)$ using the naive Bayes classification approach.
2. Determine the probability distributions of the individual feature values with respect to the two classes.
    - $P(x_1 = 0 \mid c=c_1, D)$
    - $P(x_1 = 0 \mid c=c_2, D)$
    - $P(x_1 = 1 \mid c=c_1, D)$
    - $P(x_1 = 1 \mid c=c_2, D)$
    - $P(x_2 = 0 \mid c=c_1, D)$
    - $P(x_2 = 0 \mid c=c_2, D)$
    - $P(x_2 = 1 \mid c=c_1, D)$
    - $P(x_2 = 1 \mid c=c_2, D)$
3. Now consider the data point $(0,0)$. Determine the probabilities that this data point belongs to $c_1$ and to $c_2$, respectively.
4. According to the calculations above, which class does the new data point belong to?
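
The quantities asked for in these tasks can be written out explicitly. The following is a sketch that assumes plain relative-frequency estimates (which is what the code in this exercise computes); the symbols $n_c$ and $n_{c,i,v}$ are notation introduced here for illustration and do not come from the exercise text. Let $n_c$ be the number of points in $D$ with class $c$, and $n_{c,i,v}$ the number of those points with $x_i = v$. Then

$$
P(c \mid D) = \frac{n_c}{|D|}, \qquad P(x_i = v \mid c, D) = \frac{n_{c,i,v}}{n_c}
$$

and, under the naive independence assumption, the class probability of a point $x = (x_1, x_2)$ is

$$
P(c \mid x, D) = \frac{P(c \mid D)\, P(x_1 \mid c, D)\, P(x_2 \mid c, D)}{\sum_{c' \in Z} P(c' \mid D)\, P(x_1 \mid c', D)\, P(x_2 \mid c', D)}
$$

The denominator is the same for every class, so it changes the probabilities in task 3 but not the class chosen in task 4.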

D = [
    ((1, 0), "c_1"),
    ((0, 1), "c_1"),
    ((1, 1), "c_2"),
    ((0, 0), "c_2"),
    ((0, 1), "c_1"),
]

x_vals = [x for x, _ in D]
labels = [label for _, label in D]  # observed class labels (not predictions)


def probability_of_class(cls: str) -> float:
    """Prior P(cls | D), estimated as the relative frequency of cls in D."""
    return labels.count(cls) / len(labels)


def prob_of_x_given_class(val: int, dim: int, cls: str) -> float:
    """Likelihood P(x_dim = val | c=cls, D); dim is 1-based, as in the task."""
    idx = dim - 1  # convert the 1-based feature index to a 0-based tuple index
    numerator = sum(1 for x, y in zip(x_vals, labels) if x[idx] == val and y == cls)
    denominator = labels.count(cls)

    return numerator / denominator


def prob_of_class_given_x(cls: str, val: tuple[int, int]) -> float:
    """Unnormalized score P(cls | D) * P(val_1 | cls, D) * P(val_2 | cls, D).

    This is proportional to P(cls | val, D); dividing by the sum of the
    scores over all classes yields the actual probability.
    """
    val_1, val_2 = val

    return (
        probability_of_class(cls)
        * prob_of_x_given_class(val_1, 1, cls)
        * prob_of_x_given_class(val_2, 2, cls)
    )


print("--- Task 1 ---")
print(f"P(c_1 | D) = {probability_of_class('c_1')}")
print(f"P(c_2 | D) = {probability_of_class('c_2')}")

print()

print("--- Task 2 ---")
print(f"P(x_1 = 0 | c=c_1, D) = {prob_of_x_given_class(val=0, dim=1, cls='c_1'):0.3f}")
print(f"P(x_1 = 0 | c=c_2, D) = {prob_of_x_given_class(val=0, dim=1, cls='c_2'):0.3f}")
print(f"P(x_1 = 1 | c=c_1, D) = {prob_of_x_given_class(val=1, dim=1, cls='c_1'):0.3f}")
print(f"P(x_1 = 1 | c=c_2, D) = {prob_of_x_given_class(val=1, dim=1, cls='c_2'):0.3f}")
print(f"P(x_2 = 0 | c=c_1, D) = {prob_of_x_given_class(val=0, dim=2, cls='c_1'):0.3f}")
print(f"P(x_2 = 0 | c=c_2, D) = {prob_of_x_given_class(val=0, dim=2, cls='c_2'):0.3f}")
print(f"P(x_2 = 1 | c=c_1, D) = {prob_of_x_given_class(val=1, dim=2, cls='c_1'):0.3f}")
print(f"P(x_2 = 1 | c=c_2, D) = {prob_of_x_given_class(val=1, dim=2, cls='c_2'):0.3f}")

print()

print("--- Task 3 ---")
# prob_of_class_given_x returns the product prior * likelihoods, which is only
# proportional to the class probability; divide by the sum over both classes.
scores = {cls: prob_of_class_given_x(cls, (0, 0)) for cls in ("c_1", "c_2")}
evidence = sum(scores.values())
for cls, score in scores.items():
    print(f"P({cls} | (0,0), D) = {score / evidence:0.3f}  (unnormalized: {score:0.3f})")

print()

print("--- Task 4 ---")
# Pick the class with the larger score instead of hard-coding the answer.
print(f"cls(x) = {max(scores, key=scores.get)}")

--- Task 1 ---
P(c_1 | D) = 0.6
P(c_2 | D) = 0.4

--- Task 2 ---
P(x_1 = 0 | c=c_1, D) = 0.667
P(x_1 = 0 | c=c_2, D) = 0.500
P(x_1 = 1 | c=c_1, D) = 0.333
P(x_1 = 1 | c=c_2, D) = 0.500
P(x_2 = 0 | c=c_1, D) = 0.333
P(x_2 = 0 | c=c_2, D) = 0.500
P(x_2 = 1 | c=c_1, D) = 0.667
P(x_2 = 1 | c=c_2, D) = 0.500

--- Task 3 ---
P(c_1 | (0,0), D) = 0.571  (unnormalized: 0.133)
P(c_2 | (0,0), D) = 0.429  (unnormalized: 0.100)

--- Task 4 ---
cls(x) = c_1
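
As a cross-check on the rounded decimals, the same computation can be redone with exact rational arithmetic. The following is a minimal sketch, not part of the original solution: the helper `exact_posterior` and the two `Counter` tables are hypothetical names introduced here, and the estimates are assumed to be plain relative frequencies without smoothing.

```python
from collections import Counter
from fractions import Fraction

# Same five labelled points as in the exercise.
D = [
    ((1, 0), "c_1"),
    ((0, 1), "c_1"),
    ((1, 1), "c_2"),
    ((0, 0), "c_2"),
    ((0, 1), "c_1"),
]

# Number of points per class.
class_counts = Counter(label for _, label in D)
# feature_counts[(dim, value, label)] = number of points of class `label`
# whose feature at 0-based index `dim` equals `value`.
feature_counts = Counter(
    (dim, value, label) for x, label in D for dim, value in enumerate(x)
)


def exact_posterior(x):
    """Hypothetical helper: normalized naive Bayes posterior as exact fractions."""
    scores = {}
    for label, n_label in class_counts.items():
        score = Fraction(n_label, len(D))  # prior P(label | D)
        for dim, value in enumerate(x):
            # likelihood P(x_dim = value | label, D)
            score *= Fraction(feature_counts[(dim, value, label)], n_label)
        scores[label] = score
    evidence = sum(scores.values())  # normalizing constant over all classes
    return {label: score / evidence for label, score in scores.items()}


posterior = exact_posterior((0, 0))
for label, p in posterior.items():
    print(f"P({label} | (0,0), D) = {p} ≈ {float(p):0.3f}")
print("cls(x) =", max(posterior, key=posterior.get))
```

For the point $(0,0)$ the unnormalized products are $\tfrac{3}{5}\cdot\tfrac{2}{3}\cdot\tfrac{1}{3} = \tfrac{2}{15}$ for $c_1$ and $\tfrac{2}{5}\cdot\tfrac{1}{2}\cdot\tfrac{1}{2} = \tfrac{1}{10}$ for $c_2$, so the normalized probabilities are $\tfrac{4}{7}$ and $\tfrac{3}{7}$ and the point is assigned to $c_1$.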
