# Probability & Odds

| Concept | Formula | Range | Example | Interpretation |
|----------|----------|--------|-----------|----------------|
| **Probability** | $P = \frac{\text{\# successes}}{\text{\# total}}$ | $0 \to 1$ | Rolling a 6 with a fair die:<br> $P(6) = \frac{1}{6} \approx 0.1667 \approx 16.7\%$ | “16.7% chance of success” → measures how likely an event is to occur. |
| **Odds** | $O = \frac{P(\text{success})}{1 - P(\text{success})}$ | $0 \to \infty$ | $\text{Odds}(6) = \frac{1/6}{5/6} = \frac{1}{5} = 0.2$ | “1 success per 5 failures” (the reciprocal, 5/1, gives the failures per success) → measures the ratio between “event occurs” and “event does not occur”. |
| **Log-Odds (Logit)** | $\text{logit}(P) = \ln\!\left(\frac{P}{1 - P}\right)$ | $-\infty \to \infty$ | $\ln(0.2) \approx -1.609$ | Used by logistic regression → the model treats the log-odds as a linear function of the predictors. |
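
The three rows of the table can be reproduced for the die example with a few lines of Python. This is only a minimal sketch: the helper `odds` and the variable names are illustrative, and `Fraction` is used so the intermediate values stay exact.

```python
import math
from fractions import Fraction


def odds(p):
    """Odds in favour of an event with probability p (requires p < 1)."""
    return p / (1 - p)


# Fair six-sided die: one favourable outcome out of six.
p_six = Fraction(1, 6)
odds_six = odds(p_six)                  # (1/6) / (5/6) = 1/5
logit_six = math.log(float(odds_six))   # natural log of the odds

print(p_six, odds_six, round(logit_six, 3))  # 1/6 1/5 -1.609
```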


## Probability
Assume a dataset:
* 10,000 values
* 30% of the values are 1
* 70% are 0

If I pick a value at random from the dataset, how likely am I to pick a 1 versus a 0?

**Step 1: What we know**
* Total number of observations: $N = 10{,}000$
* Fraction of ones (class 1): $30\% = 0.3$
* Fraction of zeros (class 0): $70\% = 0.7$

**Step 2: Express as probabilities**
When you draw a single item at random from the dataset, you’re sampling from a **discrete distribution**.

$P(1) = \frac{\text{\# of ones}}{N} = \frac{0.3 \times 10{,}000}{10{,}000} = 0.3$

$P(0) = \frac{\text{\# of zeros}}{N} = \frac{0.7 \times 10{,}000}{10{,}000} = 0.7$

**Result:**
* Chance of picking a 1 = 30%
* Chance of picking a 0 = 70%
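
The same result can be checked numerically. The sketch below builds a dataset with the stated proportions, computes the probabilities from the counts, and then draws many values at random to see whether the observed frequency lands near 30%. The names `dataset`, `rng`, and `draws`, the seed, and the number of draws are assumptions made for illustration.

```python
import random

N = 10_000
dataset = [1] * 3_000 + [0] * 7_000   # 30% ones, 70% zeros

# Exact probabilities, computed from the counts.
p_one = dataset.count(1) / N    # 0.3
p_zero = dataset.count(0) / N   # 0.7

# Empirical check: draw with replacement many times.
rng = random.Random(42)                  # arbitrary seed, for repeatability
draws = rng.choices(dataset, k=100_000)
freq_one = sum(draws) / len(draws)       # should be close to 0.3

print(p_one, p_zero, freq_one)
```

With this many draws the observed frequency should typically sit within a fraction of a percentage point of 0.3.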

## Odds
Sometimes, especially in logistic regression, we use odds instead of probabilities.

$\text{Odds of 1} = \frac{P(1)}{P(0)} = \frac{0.3}{0.7} \approx 0.4286$

**These odds mean:**
“For every 1 that appears, about 2.33 zeros appear” (because 1 / 0.4286 ≈ 2.33).
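
As a quick check of the arithmetic, the conversion between probability and odds can be written in both directions. The function names `prob_to_odds` and `odds_to_prob` are illustrative, not part of any library.

```python
def prob_to_odds(p):
    """Convert a probability (p < 1) to odds in favour."""
    return p / (1 - p)


def odds_to_prob(o):
    """Convert odds in favour back to a probability."""
    return o / (1 + o)


odds_one = prob_to_odds(0.3)

print(round(odds_one, 4))                # 0.4286
print(round(1 / odds_one, 2))            # 2.33 zeros per one
print(round(odds_to_prob(odds_one), 4))  # 0.3
```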

**And the log-odds would be:**
$\text{logit}(P(1)) = \ln\left(\frac{P(1)}{P(0)}\right) = \ln(0.4286) \approx -0.847$
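
A small sketch of the log-odds calculation, together with its inverse. The inverse (the logistic or sigmoid function) is not covered in these notes and is added here only to show that the log-odds can be mapped back to the original probability; the function names are illustrative.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1 - p))


def sigmoid(z):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1 / (1 + math.exp(-z))


z = logit(0.3)

print(round(z, 3))           # -0.847
print(round(sigmoid(z), 3))  # 0.3
print(logit(0.5))            # 0.0 (even odds)
```

A negative log-odds means the event is less likely than not, zero means even odds, and a positive value means the event is more likely than not.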