# Probability Theory By Example

### Axioms of Probability

**Probabilities**

A _probability function_ maps events to real values in $[0,1]$: ${P: \mathcal{A} \subseteq \mathcal{S} \rightarrow [0,1]}$.

The probability of an event $\mathcal{A}$ in the given sample space $\mathcal{S}$, denoted $P(\mathcal{A})$,
satisfies the three axioms listed after the following example (Kolmogorov, 1933):

Example: Rolling a regular die
- Sample space: $\mathcal{S} = \{1, 2, 3, 4, 5, 6\}$
- Elementary events: $\mathcal{A}_1 = \{1\}$, $\ldots$, $\mathcal{A}_6 = \{6\}$
- Probabilities (fair die): $P(\mathcal{A}_1) = \ldots = P(\mathcal{A}_6) = \frac16$

**Axiom 1**

$P(\mathcal{A}) \geq 0$: Probability of any event $\mathcal{A}$ is a non-negative real number

Example (cont'd): $P(\mathcal{A}_5) = -\frac12$ does not make any sense

**Axiom 2**

$P(\mathcal{S}) = 1$: Probability of the entire sample space is $1$

Example (cont'd): The throw will produce *some* number

**Axiom 3**

$P(\bigcup_{i=1}^{\infty} \mathcal{A}_i) = \sum_{i=1}^{\infty} P(\mathcal{A}_i)$:
Probability that at least one of a countable set of _mutually exclusive_ events occurs is equal
to the sum of their individual probabilities

Mutually exclusive means that no two events can occur together, i.e. $\; \mathcal{A}_i \cap
\mathcal{A}_j = \emptyset\;\;\;\forall i \neq j$

Example (cont'd): 
- Define an event $\mathcal{E} := \mathcal{A}_5 \cup \mathcal{A}_6$
- Since $\mathcal{A}_5$ and $\mathcal{A}_6$ are mutually exclusive, $P(\mathcal{E}) = P(\mathcal{A}_5 \cup \mathcal{A}_6) = P(\mathcal{A}_5) + P(\mathcal{A}_6)$
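
The three axioms can be checked directly for the fair-die example. The following is a minimal sketch, not a general implementation: the names `sample_space`, `prob`, and `P` are chosen here for illustration, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction

# Sample space of a fair six-sided die and the probability of each outcome.
sample_space = {1, 2, 3, 4, 5, 6}
prob = {outcome: Fraction(1, 6) for outcome in sample_space}

def P(event):
    """Probability of an event, i.e. a subset of the sample space."""
    return sum((prob[outcome] for outcome in event), Fraction(0))

# Axiom 1: every probability is non-negative.
axiom_1 = all(P({outcome}) >= 0 for outcome in sample_space)

# Axiom 2: the entire sample space has probability 1.
axiom_2 = P(sample_space) == 1

# Axiom 3: A_5 and A_6 are mutually exclusive, so P(A_5 ∪ A_6) = P(A_5) + P(A_6).
A5, A6 = {5}, {6}
E = A5 | A6
axiom_3 = P(E) == P(A5) + P(A6)

print(axiom_1, axiom_2, axiom_3)  # True True True
print(P(E))                       # 1/3
```

Note that this only verifies additivity for one pair of disjoint events; the axiom itself covers any countable collection.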



### Random Variables (RVs)

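A random variable $X$ is a function that assigns a real number to each outcome in the underlying sample space. A statement such as $X = x$ then describes a subset of the sample space, or simply an event.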

**Discrete case: probability mass functions (pmf)**

$P(X = x)$ is the probability that $X$ takes on the value $x$

Example (cont'd): $P(X = 5) = \frac16$
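
As a rough illustration, the pmf of the fair die can be written as a small function and compared with the relative frequency of a 5 in simulated throws. The function name `pmf`, the seed, and the number of throws are arbitrary choices made for this sketch.

```python
import random
from fractions import Fraction

def pmf(x):
    """pmf of a fair six-sided die: P(X = x)."""
    return Fraction(1, 6) if x in {1, 2, 3, 4, 5, 6} else Fraction(0)

# Exact probability from the pmf.
p_five = pmf(5)
print(p_five)  # 1/6

# A pmf sums to 1 over all values the RV can take.
total = sum(pmf(x) for x in range(1, 7))
print(total)  # 1

# Empirical check: relative frequency of a 5 in simulated throws.
rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n_throws = 100_000
hits = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == 5)
empirical = hits / n_throws
print(empirical)  # should be close to 1/6
```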

**Continuous case: probability density functions (pdf)**

Consider RV $H$ describing the height of a person

- $P(H = 1.87965)$ makes little sense as the height is continuous: the probability of any single exact value is $0$
- $p(H = h)$ is the probability density at $h$

Properties of the density $p(x)$ of a continuous RV $X$ (analogous to the discrete case):

- ${\displaystyle p(x) \geq 0 }$
- ${\displaystyle \int_{-\infty}^{\infty} p(x) \, dx = 1 }$

Computing actual probabilities: $\displaystyle P(X\in(a, b]) = \int _ {a}^{b} p(x) \,d x$

**Example**

$P(X\in(-2, 3]) = \int _ {-2}^{3} p(x) \,d x$

<img src="img/prob_density_2.svg" style="width: 300px;"/>
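
The integral above can be approximated numerically. The notes leave the density $p(x)$ unspecified, so the sketch below assumes a standard normal density purely for illustration; `integrate` is a simple midpoint rule written for this example, and `normal_cdf` uses `math.erf` as an independent check.

```python
import math

def p(x):
    """Standard normal density (an assumption; the text does not fix p)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))

def normal_cdf(x):
    """Closed-form CDF of the standard normal, used only for comparison."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# P(X in (-2, 3]) as the area under the density between -2 and 3.
prob_numeric = integrate(p, -2.0, 3.0)
prob_exact = normal_cdf(3.0) - normal_cdf(-2.0)
print(f"{prob_numeric:.4f}")  # 0.9759
print(f"{prob_exact:.4f}")    # 0.9759

# The density integrates to (approximately) 1; the mass outside [-10, 10] is negligible.
total_mass = integrate(p, -10.0, 10.0)
```

For a different density, only `p` needs to change; the closed-form check applies to the normal case alone.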


### Mean and Variance

**Mean**

Discrete RV: $\displaystyle \mu_X = \text{E}[X] = \sum_i x_i\,p_i$, where $p_i = P(X = x_i)$

Continuous RV: $\displaystyle \mu_X = \text{E}[X] = \int_{-\infty}^\infty x\,p(x) \,d x$

**Variance**

Variance (both discrete and continuous): $\sigma_X^2 = \mathrm{Var}(X) = \text{E}\left[(X-\mu_X)^2\right]$
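
For the fair die from the earlier example, both quantities follow directly from the definitions. This is a small sketch using exact fractions; the helper names `mean` and `variance` are introduced here for illustration.

```python
from fractions import Fraction

# Values and probabilities of a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

def mean(xs, ps):
    """E[X] = sum_i x_i * p_i for a discrete RV."""
    return sum(x * p for x, p in zip(xs, ps))

def variance(xs, ps):
    """Var(X) = E[(X - mu)^2] for a discrete RV."""
    mu = mean(xs, ps)
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))

mu_die = mean(values, probs)
var_die = variance(values, probs)
print(mu_die)   # 7/2
print(var_die)  # 35/12
```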

**Example: Parking the car**

You're given two options

1. Park car in parking garage. Cost: CHF 24
2. Park car around the corner and risk a fine. Cost: CHF 40 if caught. Chance of getting caught: $p = 3/4$

Let $X_1$ and $X_2$ be the costs of options 1 and 2, respectively

Compute expected cost in both cases:

1. $\text{E}[X_1] = 24 \cdot 1 = 24$
2. $\text{E}[X_2] = 40 \cdot \frac34 + 0 \cdot \frac14 = 30$

So: Economically speaking, in expectation it's better to use the parking garage, since its cost of CHF 24 is below the expected cost of risking the fine
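
The parking comparison can be reproduced in a few lines. The sketch below models each option as a list of (cost, probability) pairs, a representation chosen here for illustration, and also reports the variance to show how much the cost of each option fluctuates.

```python
from fractions import Fraction

# Each option is a list of (cost in CHF, probability) pairs.
garage = [(24, Fraction(1))]
street = [(40, Fraction(3, 4)), (0, Fraction(1, 4))]  # caught with p = 3/4

def expected_cost(option):
    """E[X] = sum of cost * probability."""
    return sum(cost * p for cost, p in option)

def cost_variance(option):
    """Var(X) = E[(X - E[X])^2]."""
    mu = expected_cost(option)
    return sum((cost - mu) ** 2 * p for cost, p in option)

print(expected_cost(garage), cost_variance(garage))  # 24 0
print(expected_cost(street), cost_variance(street))  # 30 300

# The garage is cheaper in expectation.
garage_is_better = expected_cost(garage) < expected_cost(street)
```

The garage has zero variance because its cost is certain, while the street option either costs nothing or CHF 40.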
