# Probability theory



## Random experiment

When we toss an unbiased coin, we say that it lands heads up with probability $\frac{1}{2}$ and tails up with probability $\frac{1}{2}$.

Such a coin toss is an example of a **random experiment**, and the set of **outcomes** of this random experiment is the **sample space** $\Omega = \{h, t\}$, where $h$ stands for "heads" and $t$ stands for "tails".

What if we toss a coin twice? We could view the two coin tosses as a single random experiment with the sample space $\Omega = \{hh, ht, th, tt\}$, where $ht$ (for example) denotes "heads on the first toss, tails on the second toss".

What if, instead of tossing a coin, we roll a die? The sample space for this random experiment is $\Omega = \{1, 2, 3, 4, 5, 6\}$.
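
The three sample spaces above can also be written out in code. The following is a minimal sketch, assuming we encode the coin outcomes as the strings `"h"` and `"t"`; the variable names `coin`, `two_tosses` and `die` are our own and are chosen only for illustration.

```python
from itertools import product

# Sample space for a single coin toss: "h" for heads, "t" for tails.
coin = ["h", "t"]

# Sample space for two consecutive tosses: every ordered pair of outcomes.
two_tosses = ["".join(pair) for pair in product(coin, repeat=2)]

# Sample space for one roll of a six-sided die.
die = list(range(1, 7))

print(two_tosses)  # ['hh', 'ht', 'th', 'tt']
print(die)         # [1, 2, 3, 4, 5, 6]
```

Using `product` makes it easy to extend the sketch to three or more tosses by changing `repeat`.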

## Events

An **event**, then, is a subset of the sample space. In our example of the two consecutive coin tosses, getting heads on all coin tosses is an event:
$$A = \text{"getting heads on all coin tosses"} = \{hh\} \subseteq \{hh, ht, th, tt\} = \Omega.$$

Getting distinct results on the two coin tosses is also an event:
$$D = \{ht, th\} \subseteq \{hh, ht, th, tt\} = \Omega.$$
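
Since events are subsets of the sample space, Python's built-in `set` type is one natural way to represent them. This is only a sketch: the names `omega`, `A` and `D` mirror the symbols above, and the string encoding of outcomes is an assumption made for illustration.

```python
# Sample space for two consecutive coin tosses.
omega = {"hh", "ht", "th", "tt"}

# A: "getting heads on all coin tosses".
A = {"hh"}

# D: "getting distinct results on the two coin tosses",
# built by filtering the sample space.
D = {w for w in omega if w[0] != w[1]}

# The operator <= tests whether the left set is a subset of the right set.
print(A <= omega)  # True
print(D <= omega)  # True
print(sorted(D))   # ['ht', 'th']
```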

We can simulate a coin toss in Python as follows:

In [29]:
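import numpy as np
np.random.seed(42)       # fix the seed so that the result is reproducible
np.random.randint(0, 2)  # random integer from {0, 1}; the upper bound 2 is excluded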

0

(Let's say 0 is heads and 1 is tails.)

Similarly, in our roll-of-a-die example, the following are all events:
$$S = \text{"six shows up"} = \{6\} \subseteq \{1, 2, 3, 4, 5, 6\} = \Omega,$$
$$E = \text{"even number shows up"} = \{2, 4, 6\} \subseteq \{1, 2, 3, 4, 5, 6\} = \Omega,$$
$$O = \text{"odd number shows up"} = \{1, 3, 5\} \subseteq \{1, 2, 3, 4, 5, 6\} = \Omega.$$
The empty set, $\emptyset = \{\}$, represents the **impossible event**, whereas the sample space $\Omega$ itself represents the **certain event**: one of the numbers $1, 2, 3, 4, 5, 6$ always occurs when a die is rolled, so $\Omega$ always occurs.

We can simulate the roll of a die in Python as follows:

In [17]:
np.random.randint(1, 7)  # random integer from {1, ..., 6}; the upper bound 7 is excluded

4

If we get 4, say, $S$ has not occurred, since $4 \notin S$; $E$ has occurred, since $4 \in E$; $O$ has not occurred, since $4 \notin O$.
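
The check "has this event occurred?" is just a set membership test. Here is a small sketch, assuming the events are stored as Python sets in a dictionary; the helper `occurred` and the dictionary `events` are hypothetical names introduced for this example.

```python
omega = {1, 2, 3, 4, 5, 6}

# The events from the text, plus the impossible and the certain event.
events = {
    "S": {6},
    "E": {2, 4, 6},
    "O": {1, 3, 5},
    "impossible": set(),
    "certain": omega,
}

def occurred(event, outcome):
    """An event occurs exactly when the observed outcome belongs to it."""
    return outcome in event

outcome = 4
results = {name: occurred(event, outcome) for name, event in events.items()}
print(results)
# {'S': False, 'E': True, 'O': False, 'impossible': False, 'certain': True}
```

Whatever the outcome, the impossible event never occurs and the certain event always does.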

When all outcomes are equally likely, and the sample space is finite, the probability of an event $A$ is given by $$\mathbb{P}(A) = \frac{|A|}{|\Omega|},$$ where $|\cdot|$ denotes the number of elements in a given set.

Thus, the probability of the event $E$, "even number shows up", is equal to $$\mathbb{P}(E) = \frac{|E|}{|\Omega|} = \frac{3}{6} = \frac{1}{2}.$$
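
The counting formula can be evaluated exactly with Python's standard `fractions` module. This is a sketch under the stated assumptions only (finite sample space, equally likely outcomes); the function name `probability` is ours.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Return |event| / |sample_space| as an exact fraction.

    Valid only when the sample space is finite and all outcomes
    are equally likely.
    """
    event, sample_space = set(event), set(sample_space)
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
print(probability({2, 4, 6}, die))  # 1/2
print(probability({6}, die))        # 1/6
print(probability(set(), die))      # 0
print(probability(die, die))        # 1
```

`Fraction` reduces $\frac{3}{6}$ to $\frac{1}{2}$ automatically and avoids floating-point rounding.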

If NumPy's pseudorandom number generator is decent enough, the fraction of even outcomes among many simulated die rolls should be close to this number:

In [21]:
outcomes = np.random.randint(1, 7, 100)  # 100 simulated die rolls
len([x for x in outcomes if x % 2 == 0]) / len(outcomes)  # fraction of even outcomes

0.49

Here we have used 100 simulated "rolls". If we used 1,000,000, say, we would expect to get even closer to $\frac{1}{2}$:

In [25]:
outcomes = np.random.randint(1, 7, 1000000)  # 1,000,000 simulated die rolls
len([x for x in outcomes if x % 2 == 0]) / len(outcomes)  # fraction of even outcomes

0.500907
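
To see how the estimate behaves as the number of rolls grows, we can repeat the experiment for several sample sizes. The sketch below is one possible way to do this; it uses NumPy's `default_rng` generator instead of the `np.random.randint` calls above (our choice, not a requirement), and the helper name `even_frequency`, the seed and the sample sizes are arbitrary.

```python
import numpy as np

def even_frequency(n_rolls, rng):
    """Fraction of even outcomes among n_rolls simulated die rolls."""
    rolls = rng.integers(1, 7, size=n_rolls)  # upper bound 7 is excluded
    return np.mean(rolls % 2 == 0)

rng = np.random.default_rng(0)  # seeded for reproducibility

for n in (100, 10_000, 1_000_000):
    freq = even_frequency(n, rng)
    # Print the estimate and its distance from the exact value 1/2.
    print(n, freq, abs(freq - 0.5))
```

The distance from $\frac{1}{2}$ typically shrinks as the number of rolls grows, although any individual run may deviate from this pattern.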