# Probability

Open in Google Colab: [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/febse/stat2025/blob/main/03-Probability-Class.ipynb)

In [2]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

## Goats, Cars, and Probability

https://www.mathwarehouse.com/monty-hall-simulation-online/

## 10 Bottles

<div style="font-size: 48px;">
🍾 🍾 🍾 🍾 🍾 🍾 🍾 🍾 🔴 🔴
</div>

Bottles 1-8: No prize (🍾)  
Bottles 9-10: Prize! (🔴)

In [3]:
beer = pd.DataFrame({
    "Bottle": np.arange(1, 11),
    "Prize": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
})
beer

Unnamed: 0,Bottle,Prize
0,1,0
1,2,0
2,3,0
3,4,0
4,5,0
5,6,0
6,7,0
7,8,0
8,9,1
9,10,1


In [5]:
beer.sample(n=1)

Unnamed: 0,Bottle,Prize
9,10,1


In [35]:
samples = beer.sample(n=10_000, replace=True)
samples["Prize"].sum()

np.int64(1930)
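The count above is the number of prize draws out of 10,000. Dividing it by the number of draws gives a relative frequency that can be compared with the share of prize bottles in the table. The following is a minimal sketch: the `random_state` argument and the variable names `prize_share_simulated` and `prize_share_exact` are assumptions added here for reproducibility and readability.

```python
import numpy as np
import pandas as pd

# Same table as above: bottles 9 and 10 carry a prize
beer = pd.DataFrame({
    "Bottle": np.arange(1, 11),
    "Prize": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
})

# Draw 10,000 bottles with replacement; random_state is an assumption for reproducibility
n_draws = 10_000
samples = beer.sample(n=n_draws, replace=True, random_state=0)

# Relative frequency of drawing a prize bottle in the simulation
prize_share_simulated = samples["Prize"].mean()

# Share of prize bottles in the table itself (2 out of 10)
prize_share_exact = beer["Prize"].mean()

print(prize_share_simulated, prize_share_exact)
```

The two numbers will generally differ a little, and the gap tends to shrink as the number of draws grows.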

## The sample space and probability law of a fair coin toss

You take two identical coins and toss them simultaneously. We want to describe the possible outcomes of this experiment and assign probabilities to each outcome.

- Write down all possible outcomes of two tosses of a fair coin, using H for heads and T for tails.
- Write down a probability law for this experiment.

In [None]:
np.random.choice(['Heads', 'Tails'], size=2)

array(['Tails', 'Heads'], dtype='<U5')

## Definition of Probability

A probability law is a _function_ that assigns a number between 0 and 1 to each event, i.e. to each subset of the sample space. Let $\Omega$ be a non-empty set (the sample space) and
let $A$ and $B$ be subsets of $\Omega$. A probability law $P$ must satisfy the following properties:

1. $P(A) \geq 0$ for all $A \subseteq \Omega$ (all events have non-negative probability).
2. $P(\emptyset) = 0$. (The probability of the empty set is 0).
3. $P(\Omega) = 1$. (The probability of the entire sample space is 1).
4. If $A$ and $B$ are disjoint, i.e. $A \cap B = \emptyset$, then $P(A \cup B) = P(A) + P(B)$.
   
   

The probability of the union of two events $A$ and $B$, which need not be disjoint, is:

$$
P(A \cup B) = P(A) + P(B) - P(A \cap B)
$$

The probability of an event is one minus the probability of the complement of the event.

$$
P(A) = 1 - P(A^c)
$$
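Both rules can be checked on a small sample space with equally likely outcomes. The sketch below uses one roll of a fair six-sided die and two events chosen only for illustration; the helper `prob` and the sets `omega`, `A`, and `B` are assumptions, not part of the definitions above.

```python
from fractions import Fraction

# Sample space of one roll of a fair six-sided die (an illustrative choice)
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes in omega are equally likely."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}   # the roll is even
B = {4, 5, 6}   # the roll is greater than 3

# Union rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
lhs_union = prob(A | B)
rhs_union = prob(A) + prob(B) - prob(A & B)

# Complement rule: P(A) = 1 - P(A^c)
A_complement = omega - A
rhs_complement = 1 - prob(A_complement)

print(lhs_union, rhs_union)      # both sides of the union rule
print(prob(A), rhs_complement)   # both sides of the complement rule
```

Using `Fraction` keeps the probabilities exact, so the two sides of each rule can be compared with `==` without rounding issues.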


:::{#exr-sample-space-coin}
## The Sample Space of Three Coin Flips

Consider an experiment of flipping a coin three times, where each toss results in either a head (H) or a tail (T).

- Write down the sample space of the experiment.
- Write down the set $A$ consisting of all outcomes where the first flip is a head.
- Write down the set $B$ consisting of all outcomes where the number of heads is even.
- Assume that the coin is fair and that all outcomes are equally likely. What are the probabilities of the events $A$ and $B$?
:::

In [None]:
# A coin flipping game

# Here we simulate 10 games of 3 coin flips each (1 for heads, 0 for tails)
# np.random.choice selects values at random from the given list
# The size argument specifies the shape of the result: 10 rows (games) and 3 columns (flips)
# We store the results in the DataFrame flips, one row per game

flips_array = np.random.choice([1, 0], size=[10, 3])
flips = pd.DataFrame(flips_array, columns=['Flip1', 'Flip2', 'Flip3'])
flips

Unnamed: 0,Flip1,Flip2,Flip3
0,1,0,1
1,1,1,0
2,0,1,1
3,1,1,1
4,0,1,1
5,0,0,1
6,1,0,1
7,1,1,1
8,1,0,1
9,1,1,0


In [None]:
# The _number_ of games (out of 10) where the _first_ coin landed on heads (1)

flips["Flip1"].sum()

np.int64(7)

In [None]:
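# The _proportion_ of games where the number of heads is even

# Sum only the three flip columns, so that re-running the cell does not count other columns
flips['TotalHeads'] = flips[['Flip1', 'Flip2', 'Flip3']].sum(axis=1)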
flips['EvenHeads'] = (flips['TotalHeads'] % 2 == 0)
flips['EvenHeads'].mean()

np.float64(0.7)

In [None]:
flips

Unnamed: 0,Flip1,Flip2,Flip3,TotalHeads,EvenHeads
0,1,0,1,2,True
1,1,1,0,2,True
2,0,1,1,2,True
3,1,1,1,3,False
4,0,1,1,2,True
5,0,0,1,1,False
6,1,0,1,2,True
7,1,1,1,3,False
8,1,0,1,2,True
9,1,1,0,2,True
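
With only 10 games, the proportions computed above can be far from what a long run of games would give. A sketch of the same simulation with many more games follows; the seeded generator `rng`, the number of games, and the names `many_flips`, `share_first_heads`, and `share_even_heads` are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator (an assumption, used only to make the run reproducible)
rng = np.random.default_rng(seed=2025)

n_games = 100_000
many_flips = pd.DataFrame(
    rng.choice([1, 0], size=(n_games, 3)),
    columns=["Flip1", "Flip2", "Flip3"],
)

# Proportion of games where the first flip is heads
share_first_heads = many_flips["Flip1"].mean()

# Proportion of games with an even number of heads
total_heads = many_flips[["Flip1", "Flip2", "Flip3"]].sum(axis=1)
share_even_heads = (total_heads % 2 == 0).mean()

print(share_first_heads, share_even_heads)
```

These proportions can be compared with the probabilities worked out by hand in the exercise.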


## The Sample Space of Two Die Rolls

Consider an experiment of rolling a four-sided die twice.

![Four sided die](https://upload.wikimedia.org/wikipedia/commons/8/80/Dald%C3%B8s_dice.jpg){style="width:300px"}

- Write down the sample space of the experiment.
- How many outcomes are in the sample space?
- Write down the set (event) $A$ consisting of all outcomes where the sum of the two rolls is 7.
- Write down the set (event) $B$ consisting of all outcomes where the sum of the two rolls is odd.
- Write down the set (event) $C$ consisting of all outcomes where the first roll is greater than the second roll.
- Write down the set (event) $D$ consisting of all outcomes where the two rolls are the same.

Assume that the die is fair and that all outcomes are equally likely. 
- What are the probabilities of the events $A$ and $B$, i.e. $P(A)$ and $P(B)$?
- What is the probability of the event $A \cap B$?
- What is the probability of the event $A \cup B$?
- What is the probability of the event $A^c$?
- What is the probability of the event $A \cap C$?

In [None]:
# Simulation of 10 games, each consisting of two rolls of a four-sided die

rolls_array = np.random.choice([1, 2, 3, 4], size=[10, 2])
rolls_array

array([[4, 2],
       [2, 2],
       [1, 1],
       [1, 3],
       [3, 1],
       [2, 1],
       [1, 2],
       [4, 4],
       [3, 1],
       [2, 1]])

In [None]:
rolls = pd.DataFrame(rolls_array, columns=['Die1', 'Die2'])
rolls

Unnamed: 0,Die1,Die2
0,4,2
1,2,2
2,1,1
3,1,3
4,3,1
5,2,1
6,1,2
7,4,4
8,3,1
9,2,1


In [None]:
rolls["GameSum"] = rolls["Die1"] + rolls["Die2"]
rolls

Unnamed: 0,Die1,Die2,GameSum
0,4,2,6
1,2,2,4
2,1,1,2
3,1,3,4
4,3,1,4
5,2,1,3
6,1,2,3
7,4,4,8
8,3,1,4
9,2,1,3


In [None]:
# Event: the sum of the two rolls is greater than 7
# (event A in the exercise is "the sum is 7"; for that event, replace > with ==)
# - the column GameSum computed above already holds the sum of the two rolls in each game
# - the > operator checks if the sum is greater than 7 and returns a boolean Series (True and False)

rolls["SumGreater7"] = (rolls["GameSum"] > 7)
rolls

Unnamed: 0,Die1,Die2,GameSum,SumGreater7
0,4,2,6,False
1,2,2,4,False
2,1,1,2,False
3,1,3,4,False
4,3,1,4,False
5,2,1,3,False
6,1,2,3,False
7,4,4,8,True
8,3,1,4,False
9,2,1,3,False


In [None]:
rolls["SumGreater7"].sum()

np.int64(1)

In [38]:
# Event B: sum of the two rolls is odd (the != operator checks for inequality)
# - we start from the column GameSum, which holds the total of the two rolls in each game
# - % is the modulo operator. If the sum is even, the modulo of 2 will be 0, and if it is odd, the modulo will be 1
# - the != operator checks if the modulo is different from zero and returns a boolean array (True and False)
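
One way the steps described in these comments could be written is sketched below. The column name `SumOdd` and the table name `rolls_example` are assumptions, and the table is rebuilt from the ten simulated games printed above so that the block runs on its own.

```python
import pandas as pd

# The ten simulated games shown above, copied here so that the block is self-contained
rolls_example = pd.DataFrame(
    [[4, 2], [2, 2], [1, 1], [1, 3], [3, 1], [2, 1], [1, 2], [4, 4], [3, 1], [2, 1]],
    columns=["Die1", "Die2"],
)
rolls_example["GameSum"] = rolls_example["Die1"] + rolls_example["Die2"]

# The remainder after division by 2 is 1 for odd sums and 0 for even sums
rolls_example["SumOdd"] = rolls_example["GameSum"] % 2 != 0

# Number of games with an odd sum
n_odd = rolls_example["SumOdd"].sum()
print(n_odd)
```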


In [None]:
# Event C: first roll is greater than the second roll.
# - the square brackets after the DataFrame select a column by its name, e.g. rolls["Die1"]
# - the > operator compares the two columns row by row and returns a boolean Series (True and False)

rolls["Die1GreaterDie2"] = rolls["Die1"] > rolls["Die2"]
rolls

Unnamed: 0,Die1,Die2,GameSum,SumGreater7,Die1GreaterDie2
0,4,2,6,False,True
1,2,2,4,False,False
2,1,1,2,False,False
3,1,3,4,False,False
4,3,1,4,False,True
5,2,1,3,False,True
6,1,2,3,False,False
7,4,4,8,True,False
8,3,1,4,False,True
9,2,1,3,False,True


In [None]:
# Event D: the two rolls are the same.
# Exactly as before, but here we check for equality using the == operator

rolls["Die1EqualsDie2"] = rolls["Die1"] == rolls["Die2"]
rolls

Unnamed: 0,Die1,Die2,GameSum,SumGreater7,Die1GreaterDie2,Die1EqualsDie2
0,4,2,6,False,True,False
1,2,2,4,False,False,True
2,1,1,2,False,False,True
3,1,3,4,False,False,False
4,3,1,4,False,True,False
5,2,1,3,False,True,False
6,1,2,3,False,False,False
7,4,4,8,True,False,True
8,3,1,4,False,True,False
9,2,1,3,False,True,False


In [None]:
# The intersection of the events "sum greater than 7" and C (both occurring together)
# - The & operator is used to combine two boolean arrays. It returns a new boolean array where the value is True only if both arrays have True in the same position

rolls["BothEvents"] = rolls["SumGreater7"] & rolls["Die1GreaterDie2"]
rolls

Unnamed: 0,Die1,Die2,GameSum,SumGreater7,Die1GreaterDie2,Die1EqualsDie2,BothEvents
0,4,2,6,False,True,False,False
1,2,2,4,False,False,True,False
2,1,1,2,False,False,True,False
3,1,3,4,False,False,False,False
4,3,1,4,False,True,False,False
5,2,1,3,False,True,False,False
6,1,2,3,False,False,False,False
7,4,4,8,True,False,True,False
8,3,1,4,False,True,False,False
9,2,1,3,False,True,False,False
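
The simulation only approximates probabilities. Because the sample space of two rolls is small, it can also be listed in full and the same boolean operations applied to it. A minimal sketch, assuming equally likely outcomes; the names `space` and `prob` are introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product

import pandas as pd

# All ordered pairs (first roll, second roll) of a four-sided die
faces = [1, 2, 3, 4]
space = pd.DataFrame(list(product(faces, faces)), columns=["Die1", "Die2"])
space["GameSum"] = space["Die1"] + space["Die2"]

# The same two events as in the simulated table
sum_greater_7 = space["GameSum"] > 7
die1_greater_die2 = space["Die1"] > space["Die2"]

def prob(event):
    """Exact probability of a boolean event column under equally likely outcomes."""
    return Fraction(int(event.sum()), len(space))

print(len(space))                               # number of outcomes
print(prob(sum_greater_7))                      # sum greater than 7
print(prob(die1_greater_die2))                  # first roll greater than second
print(prob(sum_greater_7 & die1_greater_die2))  # both events together
```

The other events from the exercise can be handled the same way by changing the comparison that defines the boolean column.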


## De Morgan's Laws

De Morgan's laws describe the relationship between the complement of unions and intersections of sets. They state that:

- The complement of the union of two sets is equal to the intersection of their complements:
  
  $$
  (A \cup B)^c = A^c \cap B^c
  $$

- The complement of the intersection of two sets is equal to the union of their complements:

  $$
  (A \cap B)^c = A^c \cup B^c
  $$
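
Both laws can be checked with Python sets, where `|` is the union, `&` the intersection, and `-` the set difference. The universe `omega` and the events `A` and `B` below are hypothetical and chosen only for this check.

```python
# Illustrative universe and events (hypothetical, chosen only for this check)
omega = set(range(1, 11))
A = {x for x in omega if x % 2 == 0}   # even numbers
B = {x for x in omega if x % 3 == 0}   # multiples of 3

def complement(event):
    """Complement of an event relative to omega."""
    return omega - event

# (A ∪ B)^c = A^c ∩ B^c
first_law_holds = complement(A | B) == (complement(A) & complement(B))

# (A ∩ B)^c = A^c ∪ B^c
second_law_holds = complement(A & B) == (complement(A) | complement(B))

print(first_law_holds, second_law_holds)
```

A check on one example is not a proof, but it is a quick way to catch a mistake when working with complements by hand.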

## Exercise
Assume that a class consists of the following students:

- Students enrolled in the mathematics course: $A = \{Maria, Elena, Simeon, Rada, Steli\}$
- Students enrolled in the statistics course: $B = \{Peter, Pavel, Simeon, Rada\}$

- Write down the complements of $A$ and $B$ and the intersection of $A$ and $B$. 
- Then, write down the union of $A$ and $B$ and its complement. Describe these sets in words.
- Write down the intersection of $A^c$ and $B^c$ and verify that it is the same as the complement of the union of $A$ and $B$.

## Exercise 

A consumer survey indicates that 10 percent of the customers in a store will buy an iPhone, 20 percent will buy an Android phone, and 5 percent will buy both products.

- What is the probability that a randomly selected customer will buy neither an iPhone nor an Android phone?
- What is the probability that a randomly selected customer will buy _exactly_ one of the products?


## Exercises


:::{#exr-racing}
## Racing Cars and Betting

Consider a car race with 6 cars. You believe that the first three cars are equally likely to win, and that each of cars 4, 5, and 6 wins with probability 1/7. You must place a bet on one of the following two events:

- One of cars 1, 2, or 5 wins
- One of cars 3, 5, or 6 wins

Which event would you bet on? Justify your answer. The reward is the same for both events.

:::



:::{#exr-dice-rolls-until-4}
## Dice Rolls Until 2

You take a four-sided die and roll it until you get a 2. Describe the sample space of the experiment.

:::
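
Each run of this experiment produces a sequence of rolls whose length is not known in advance. The simulation sketch below prints a few such runs; the function name `roll_until_two`, the seed, and the choice of five runs are assumptions made for illustration, and the die is taken to be fair.

```python
import numpy as np

# Seeded generator (an assumption, used only to make the run reproducible)
rng = np.random.default_rng(seed=7)

def roll_until_two(rng):
    """Roll a fair four-sided die until a 2 appears and return all rolls."""
    rolls = []
    while True:
        roll = int(rng.integers(1, 5))  # integers from 1 to 4; the upper bound 5 is excluded
        rolls.append(roll)
        if roll == 2:
            return rolls

# Five independent runs of the experiment
runs = [roll_until_two(rng) for _ in range(5)]
for run in runs:
    print(run)
```

Looking at which sequences can appear, and which cannot, is a useful starting point for describing the sample space.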

## Probability Model

You are given a loaded four-sided die. The even numbers are twice as likely as the odd numbers, but
each even number is equally likely and each odd number is equally likely. Construct a probability model for this die.