# <ins>**Probability Theory**</ins>

## **Importance to Data Science**
Probability is the foundation of statistical methods. It plays a key role in data science topics such as probabilistic models (for example, *Bayesian networks* and *Gaussian processes*) and extends into machine learning algorithms. Understanding probability theory is also critical for error analysis and risk assessment. In this document we won't dive deep into any of these subjects; instead we explore the core basics of several probability concepts. You might see these fancier applications in future documents, though. Overall, probability theory provides the mathematical framework and tools for dealing with the inherent uncertainty and variability in data, which makes it indispensable in data science for analysis, modeling, prediction, and decision-making.


****

## **Basic Probability**

### <ins>Definitions</ins>
If you read the previous documents before jumping into this one, this might feel like a vacation, because a lot of concepts in probability are easier to understand than those in linear algebra or calculus. This is my "unbiased" opinion.

First we should define a few recurring terms that you will see throughout the document:
- **Probability**: The likelihood that an event will occur, ranging from 0 (impossible event) to 1 (certain event).
- **Experiment**: An activity with an uncertain outcome, such as rolling a die.
- **Sample Space $(S)$**: The set of all possible outcomes of an experiment. For example, $S = \{1, 2, 3, 4, 5, 6\}$ for a die roll.
- **Event**: A subset of the sample space. For example, rolling an even number: $A = \{2, 4, 6\}$.
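
As a minimal sketch, these definitions map naturally onto Python sets. The variable names `sample_space` and `event_a` are illustrative choices, not part of any library:

```python
# Sample space S for one roll of a six-sided die
sample_space = {1, 2, 3, 4, 5, 6}

# Event A: rolling an even number
event_a = {2, 4, 6}

# An event is a subset of the sample space
print(event_a.issubset(sample_space))  # True

# Outcomes in S that are not in A: rolling an odd number
print(sorted(sample_space - event_a))  # [1, 3, 5]
```

Using sets keeps the code close to the math: subsets, unions, and intersections are all built-in operations.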

### <ins>Calculating Probability</ins>
- When all outcomes are equally likely, the probability of an event $A$ occurring is:
$$
P(A) = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}
$$

This makes intuitive sense if you think about it. Here are a couple of examples:

1. Rolling a die:
    - Sample space: $S = \{1,2,3,4,5,6\}$
    - Event: Rolling a $3$
    - Probability:
    $$
    P(\text{Rolling a 3}) = \frac{1}{6}
    $$

2. Drawing a card from a deck:
    - Sample space: 52 cards
    - Event: Drawing an ace
    - Probability:
    $$
    P(\text{Drawing an ace}) = \frac{4}{52} = \frac{1}{13}
    $$
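
The two examples above can also be computed exactly rather than simulated. This is a small sketch that assumes every outcome is equally likely; the helper `classical_probability` and the rank and suit labels are hypothetical names introduced for illustration:

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Return |A| / |S|, assuming all outcomes in S are equally likely."""
    outcomes = set(sample_space)
    favorable = set(event) & outcomes  # ignore anything outside S
    return Fraction(len(favorable), len(outcomes))

# Example 1: rolling a 3 with a six-sided die
die = range(1, 7)
p_three = classical_probability({3}, die)

# Example 2: drawing an ace from a 52-card deck
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = [(rank, suit) for rank in ranks for suit in suits]
aces = [card for card in deck if card[0] == 'A']
p_ace = classical_probability(aces, deck)

print(p_three)  # 1/6
print(p_ace)    # 1/13
```

`Fraction` keeps the results exact, so $\frac{4}{52}$ is reduced to $\frac{1}{13}$ automatically.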

### <ins>Probability Rules</ins>
There are many more rules than the ones listed in this document, but here are the basic ones you should know:

- **Complementary Events**: The probability of the complement of an event $A$ (denoted as $A^c$) is:
$$
P(A^c) = 1 - P(A)
$$

- **Union of Events**: The probability that either event $A$ or event $B$ occurs (or both):
$$
P(A \cup B) = P(A) + P(B) - P(A \cap B)
$$

- **Intersection of Events**: The probability that both events $A$ and $B$ occur (if $A$ and $B$ are independent):
$$
P(A \cap B) = P(A) \times P(B)
$$
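
To see the three rules in action, here is a minimal sketch that checks them exactly on a single die roll. The events `A`, `B`, and `C` and the helper `prob` are assumptions chosen for illustration. Note that `A` and `B` happen to be independent, while `A` and `C` are not, so the product rule fails for the second pair:

```python
from fractions import Fraction

S = set(range(1, 7))   # sample space of one die roll
A = {2, 4, 6}          # even number
B = {1, 2, 3, 4}       # at most 4
C = {4, 5, 6}          # greater than 3

def prob(event):
    """Exact probability, assuming all six faces are equally likely."""
    return Fraction(len(event), len(S))

# Complement: P(A^c) = 1 - P(A)
print(prob(S - A), 1 - prob(A))                      # 1/2 1/2

# Union: P(A or B) = P(A) + P(B) - P(A and B)
print(prob(A | B), prob(A) + prob(B) - prob(A & B))  # 5/6 5/6

# Intersection, independent events: P(A and B) = P(A) * P(B)
print(prob(A & B), prob(A) * prob(B))                # 1/3 1/3

# Intersection, dependent events: the product rule does not hold
print(prob(A & C), prob(A) * prob(C))                # 1/3 1/4
```

The last line shows why the independence condition matters: multiplying the individual probabilities only gives the intersection when the events do not influence each other.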

### <ins>Python Examples of Basic Probability</ins>
Here are a few examples of how to estimate basic probabilities in Python and how to apply the probability rules.

#### Rolling a Die

In [1]:
import random

# Simulate rolling a die
def roll_die():
    return random.randint(1, 6)

# Number of simulations
num_simulations = 100000
count_3 = 0

for _ in range(num_simulations):
    if roll_die() == 3:
        count_3 += 1

# Estimated probability
prob_3 = count_3 / num_simulations
print(f"Estimated Probability of rolling a 3: {prob_3}")

Estimated Probability of rolling a 3: 0.1671


#### Drawing a Card from a Deck

In [2]:
# Build the deck once: 4 aces and 48 other cards
deck = ['Ace'] * 4 + ['Non-Ace'] * 48

# Simulate drawing a card
def draw_card():
    return random.choice(deck)

# Number of simulations
num_simulations = 100000
count_ace = 0

for _ in range(num_simulations):
    if draw_card() == 'Ace':
        count_ace += 1

# Estimated probability
prob_ace = count_ace / num_simulations
print(f"Estimated Probability of drawing an Ace: {prob_ace}")

Estimated Probability of drawing an Ace: 0.07553


#### Complementary Events

In [3]:
# Probability of drawing a non-Ace card
prob_non_ace = 1 - (4 / 52)
print(f"Probability of drawing a non-Ace card: {prob_non_ace}")

Probability of drawing a non-Ace card: 0.9230769230769231


#### Union of Events (Rolling a 2 or 4)

In [4]:
# Simulate rolling a die
def roll_die():
    return random.randint(1, 6)

# Number of simulations
num_simulations = 100000
count_2_or_4 = 0

for _ in range(num_simulations):
    if roll_die() in [2, 4]:
        count_2_or_4 += 1

# Estimated probability
prob_2_or_4 = count_2_or_4 / num_simulations
print(f"Estimated Probability of rolling a 2 or 4: {prob_2_or_4}")

Estimated Probability of rolling a 2 or 4: 0.33383
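
Rolling a 2 and rolling a 4 cannot happen on the same roll, so the $P(A \cap B)$ term in the union rule is zero in the example above. As a hedged sketch of the general case, the simulation below uses two overlapping events (an even roll, and a roll greater than 3; both are illustrative choices) and compares the directly counted union with the value given by the rule:

```python
import random

def roll_die():
    return random.randint(1, 6)

num_simulations = 100_000
count_even = count_high = count_both = count_either = 0

for _ in range(num_simulations):
    roll = roll_die()
    is_even = roll % 2 == 0   # event A: 2, 4, or 6
    is_high = roll > 3        # event B: 4, 5, or 6
    count_even += is_even
    count_high += is_high
    count_both += is_even and is_high
    count_either += is_even or is_high

p_even = count_even / num_simulations
p_high = count_high / num_simulations
p_both = count_both / num_simulations

# Union counted directly vs. union computed from the rule
p_union_direct = count_either / num_simulations
p_union_rule = p_even + p_high - p_both

print(f"Direct estimate of P(A or B): {p_union_direct}")
print(f"P(A) + P(B) - P(A and B):     {p_union_rule}")
```

Both values should land close to the exact answer of $\frac{4}{6} \approx 0.667$, since the outcomes 2, 4, 5, and 6 satisfy at least one of the two events.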


#### Intersection of Independent Events (Rolling a 2 and flipping a head)

In [5]:
# Simulate flipping a coin (roll_die() is reused from the earlier cell)
def flip_coin():
    return random.choice(['Heads', 'Tails'])

# Number of simulations
num_simulations = 100000
count_2_and_heads = 0

for _ in range(num_simulations):
    if roll_die() == 2 and flip_coin() == 'Heads':
        count_2_and_heads += 1

# Estimated probability
prob_2_and_heads = count_2_and_heads / num_simulations
print(f"Estimated Probability of rolling a 2 and flipping Heads: {prob_2_and_heads}")

Estimated Probability of rolling a 2 and flipping Heads: 0.08533
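
For comparison with the simulated estimate, the exact value can be found by listing every (die, coin) pair. This is a small sketch that assumes a fair die and a fair coin; the variable names are illustrative:

```python
from fractions import Fraction
from itertools import product

die_faces = range(1, 7)
coin_sides = ['Heads', 'Tails']

# Combined sample space: every (die face, coin side) pair
outcomes = list(product(die_faces, coin_sides))

# Favorable outcomes: rolling a 2 and flipping Heads
favorable = [pair for pair in outcomes if pair == (2, 'Heads')]

p_exact = Fraction(len(favorable), len(outcomes))
print(len(outcomes))            # 12
print(p_exact)                  # 1/12
print(f"{float(p_exact):.4f}")  # 0.0833
```

This matches the product rule for independent events, $\frac{1}{6} \times \frac{1}{2} = \frac{1}{12}$, and the simulated estimate should fall close to it.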
