# Introduction to Probability


Probability is a branch of mathematics that deals with calculating the likelihood of an event occurring. It helps us understand and quantify uncertainty.

It provides a mathematical framework for modeling random phenomena and making predictions about outcomes.

## Key Terms in Probability

- **Experiment:** An action or process that leads to one of several possible outcomes.
- **Sample Space (Outcome Space, S or Ω):** The set of all possible outcomes of an experiment.
- **Event:** A specific outcome or set of outcomes of an experiment, that is, a subset of the sample space.
- **Outcome:** A single result from the *sample space*.

## Events


- The sample space **S** is the collection of all possible outcomes of a random experiment.


### Example

Suppose we randomly select a person and ask them, "How many books do you own?" In this case, our **sample space** is:

S = {0,1,2,3,4,5,.....}

We could set a practical upper limit, but some people might own hundreds or even thousands of books. So, we'll leave 
the sample space open to be as accurate as possible. Now, let's define some events:

- Let **A** be the event that a randomly selected person owns no books:


In [24]:
A = {0}

- Let **B** be the event that a person owns at least one book:

In [25]:
B = {1,2,3,4,5,6,7,8,9} # at least one book, so 0 is excluded; truncated at 9 to keep the set finite

- Let **C** be the event that a person owns no more than 10 books:

In [26]:
C = {0,1,2,3,4,5,6,7,8,9,10}

- Let **D** be the event that a person owns an even number of books:

In [27]:
D = {0,2,4,6,8} # even counts; truncated at 8 to keep the set finite

### Basic Set Operations

1. ∅ is the null set or empty set. This set does not contain any elements.

2. $A \cup B$ = union. The union contains all elements of set $A$ and all elements of set $B$.

   <img src="https://raw.githubusercontent.com/KonstantinData/data-science-track/main/data-science-track/01-mathematical-foundations/resources/images/union-image.webp" width="200"/>



In [28]:
H = A | B # Union of A and B
print(H)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}


3. $A \cap B$ = intersection. The intersection contains the elements found in both $A$ and $B$, that is, the elements common to both sets.  
   If $A \cap B = \emptyset$, then $A$ and $B$ are called **mutually exclusive events**.

   <img src="https://raw.githubusercontent.com/KonstantinData/data-science-track/main/data-science-track/01-mathematical-foundations/resources/images/intersection-image.webp" width="200"/>



In [29]:
I = A & B # Intersection of A and B
print(I)


set()


4. $A' = A^c$ = complement. Relative to the set of all possible elements (the sample space), the complement of $A$ contains all elements that do not belong to $A$.

   <img src="https://raw.githubusercontent.com/KonstantinData/data-science-track/main/data-science-track/01-mathematical-foundations/resources/images/complement-image.webp" width="200"/>




In [30]:
Ω = set(range(0,21)) # Sample space truncated at 20 books to keep the set finite

A_complement = Ω - A # All outcomes in Ω that are not in A
print(A_complement)

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}


If $E \cup F \cup G \cup \dots = \Omega$, then $E$, $F$, $G$, and so on are called **exhaustive events**.  
In other words, events are exhaustive when their union is the complete set of all possible outcomes.

---

Now, let's define some composite events.

- **The union of events $C$ and $D$** is the event that a randomly selected person either owns **no more than 10 books** or owns an **even number of books**. That is:

  $$
  C \cup D = \{0, 1, 2, \dots, 10, 12, 14, 16, \dots\}
  $$

- **The intersection of events $A$ and $B$** is the event that a person owns **both no books and at least one book** at the same time. This is impossible, so:

  $$
  A \cap B = \emptyset
  $$

- **The complement of event $D$** is the event that a person owns an **odd number of books**. That is:

  $$
  D^c = \{1, 3, 5, 7, \dots\}
  $$

- If we define events $E_0, E_1, E_2, \dots$ such that:

  $$
  E_0 = \{0\}, \quad E_1 = \{1\}, \quad E_2 = \{2\}, \dots
  $$

  then the events $E_0, E_1, E_2, \dots$ are **exhaustive events**, meaning they cover all possible outcomes in the sample space.
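The composite events above can be checked with Python sets. This is a minimal sketch that assumes the sample space is cut off at 20 books (the same cutoff used for Ω in the complement cell) purely so that every set is finite; the names `omega`, `singletons`, `is_exhaustive`, and `is_mutually_exclusive` are illustrative.

```python
# Truncated sample space: 0..20 books (an assumption so the sets stay finite)
omega = set(range(0, 21))

A = {0}                                  # owns no books
B = omega - A                            # owns at least one book
C = {k for k in omega if k <= 10}        # owns no more than 10 books
D = {k for k in omega if k % 2 == 0}     # owns an even number of books

union_CD = C | D                         # C ∪ D
intersection_AB = A & B                  # A ∩ B
D_complement = omega - D                 # D^c

# Singleton events E_0, E_1, ..., E_20
singletons = [{k} for k in sorted(omega)]

# Exhaustive: the union of all singletons is the whole sample space
is_exhaustive = set().union(*singletons) == omega

# Mutually exclusive: A and B share no outcome
is_mutually_exclusive = A.isdisjoint(B)

print(sorted(union_CD))
print(intersection_AB)
print(sorted(D_complement))
print(is_exhaustive, is_mutually_exclusive)
```

Within the cutoff, `union_CD` contains every count from 0 to 10 plus the even counts above 10, and `D_complement` contains only odd counts.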

## Probabilities of the Events

**Probability**
- a number between 0 and 1
- a number closer to 0 means not likely
- a number closer to 1 means quite likely
- if the probability of an event is exactly 0, then the event can’t occur
- if the probability of an event is exactly 1, then the event will definitely occur

### **Relative Frequency Approach to Probability**  

The **relative frequency approach** estimates probability by **observing** how often an event 
occurs over repeated trials, rather than relying on assumptions.  

---

### **Steps to Calculate Probability Using Relative Frequency**

1. **Perform the experiment multiple times** (n)  
   - Repeat an experiment a large number of times.  
   - Example: Rolling a die 100 times or flipping a coin 500 times.  

2. **Count how often event (A) occurs** N(A)  
   - Observe and record the number of times event (A) happens.  
   - Example: If rolling a die, count how often a "6" appears.  

3. **Calculate probability using the formula**  
   
   P(A) = N(A) / n

   - **P(A)**  → Probability of event (A)  
   - **N(A)**  → Number of times event (A) occurred  
   - **n**     → Total number of trials  

---

### **Example Calculation**
Suppose you flip a coin **1,000 times**, and it lands on **heads** **520 times**. The probability of heads is:  

P(Heads) = N(Heads) / n = 520 / 1000 = 0.52 (52%)

---

### **Key Takeaways**
✔ **This method estimates probability using real data** instead of assumptions.  
✔ **More trials (n) tend to give a more accurate probability estimate.**  
✔ **The formula P(A) = N(A) / n calculates probability from observed outcomes.**  
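The effect of the number of trials can be sketched with a simulated die. The example below assumes a fair die modeled by `random.randint(1, 6)`, so the value being estimated is 1/6; the helper `relative_frequency`, the fixed seed, and the chosen trial counts are illustrative and not part of the method itself.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible (an arbitrary choice)

def relative_frequency(n):
    """Estimate P(rolling a 6) from n simulated rolls of a fair die."""
    count = sum(1 for _ in range(n) if random.randint(1, 6) == 6)
    return count / n

# Estimate the same probability with increasing numbers of trials
estimates = {n: relative_frequency(n) for n in (10, 100, 1_000, 10_000, 100_000)}

for n, p in estimates.items():
    print(f"n = {n:>7}: P(6) ≈ {p:.4f}   |error| = {abs(p - 1/6):.4f}")
```

The error usually shrinks as n grows, although not necessarily at every step, because each estimate is itself random.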



### **Relative Frequency Formula**  

The probability of an event $A$ occurring is estimated using the **relative frequency formula**:

P(A) = N(A) / n

where:
- P(A) → Estimated probability of event $A$  
- N(A) → Number of times event $A$ occurs  
- n → Total number of trials (experiments)


In [92]:
import random

# Define experiment parameters
n = 10000 # Number of trials
N_A = 0 # Counts how often event A (heads) occurs

# Simulate coin flips
for _ in range(n):
    flip = random.choice(["Heads","Tails"])
    if flip == "Heads":
        N_A += 1

# Calculate probability using relative frequency formula
P_A = N_A / n # P(A) = N(A) / n

print(f"Total Trials (n): {n}")
print(f"Occurrences of Heads (N_A): {N_A}")
print(f"Estimated Probability of Heads (P(A)): {P_A:.4f}")

Total Trials (n): 10000
Occurrences of Heads (N_A): 4981
Estimated Probability of Heads (P(A)): 0.4981


In [91]:
import random

n = 52 # Number of draws (with replacement), not the size of the deck
N_A_H = 0 # Hearts drawn
N_A_K = 0 # Kings drawn
N_A_RC = 0 # Red cards drawn
N_A_FC = 0 # Face cards drawn

hearts = {"♥A", "♥2", "♥3", "♥4", "♥5", "♥6", "♥7", "♥8", "♥9", "♥10", "♥J", "♥Q", "♥K"}
diamonds = {"♦A", "♦2", "♦3", "♦4", "♦5", "♦6", "♦7", "♦8", "♦9", "♦10", "♦J", "♦Q", "♦K"}
clubs = {"♣A", "♣2", "♣3", "♣4", "♣5", "♣6", "♣7", "♣8", "♣9", "♣10", "♣J", "♣Q", "♣K"}
spades = {"♠A", "♠2", "♠3", "♠4", "♠5", "♠6", "♠7", "♠8", "♠9", "♠10", "♠J", "♠Q", "♠K"}

red_cards = hearts | diamonds

face_cards = {"♥J", "♥Q", "♥K", "♦J", "♦Q", "♦K", "♣J", "♣Q", "♣K", "♠J", "♠Q", "♠K"}

for _ in range(n):
    # Pick a suit uniformly, then a card within it, so each of the 52 cards is equally likely
    selected_set = random.choice([hearts, diamonds, clubs, spades])

    card = random.choice(list(selected_set))

    if card in hearts:
        N_A_H += 1
    if card.endswith("K"):
        N_A_K += 1
    if card in red_cards:
        N_A_RC += 1
    if card in face_cards:
        N_A_FC += 1

# Relative frequencies: P(A) = N(A) / n
P_A_H = N_A_H / n
P_A_K = N_A_K / n
P_A_RC = N_A_RC / n
P_A_FC = N_A_FC / n

print(f"Total draws (n): {n}")
print(f"Estimated P(heart) = {P_A_H}")
print(f"Estimated P(king) = {P_A_K}")
print(f"Estimated P(red card) = {P_A_RC}")
print(f"Estimated P(face card) = {P_A_FC}")



Total draws (n): 52
Estimated P(heart) = 0.23076923076923078
Estimated P(king) = 0.11538461538461539
Estimated P(red card) = 0.40384615384615385
Estimated P(face card) = 0.23076923076923078


## Classical Probability

Classical probability is used when all outcomes in the sample space are **equally likely**. It is based on theoretical assumptions rather than observations.  

The probability of an event $A$ is calculated as:  

P(A) = N(A) / N(S) 

where:  
- N(A) is the number of outcomes in which event $A$ occurs.  
- N(S) is the total number of possible outcomes in the sample space $S$ (the outcome space, which can also be written **Ω**).  

This approach is commonly used in scenarios where each outcome has the same chance of occurring.  
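As an illustration, the classical formula can be applied to the same four card events used in the card-drawing simulation. This is a sketch that assumes a standard 52-card deck in which every card is equally likely; the helper `classical_probability` is a hypothetical name, and `Fraction` is used only to keep the results exact.

```python
from fractions import Fraction
from itertools import product

suits = ["♥", "♦", "♣", "♠"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]

# Sample space S: all 52 equally likely cards
S = {suit + rank for suit, rank in product(suits, ranks)}

def classical_probability(event, sample_space):
    """P(A) = N(A) / N(S), valid only when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

hearts = {card for card in S if card.startswith("♥")}
kings = {card for card in S if card.endswith("K")}
red_cards = {card for card in S if card[0] in "♥♦"}
face_cards = {card for card in S if card[1:] in {"J", "Q", "K"}}

events = [("heart", hearts), ("king", kings),
          ("red card", red_cards), ("face card", face_cards)]

for name, event in events:
    p = classical_probability(event, S)
    print(f"P({name}) = {p} ≈ {float(p):.4f}")
```

These exact values (1/4, 1/13, 1/2, and 3/13) are what the relative frequencies from the simulation should approach as the number of draws grows.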



### **Key Difference**  
| **Approach** | **Formula** | **Based On** | **Use Case** |
|-------------|------------|-------------|-------------|
| **Classical Probability** | P(A) = N(A) / N(S) | Theoretical assumptions | All outcomes equally likely |
| **Relative Frequency** | P(A) = N(A) / n | Observed data | Real-world experiments |


### **Understanding Probability with Axioms**

Probability helps us measure **how likely** something is to happen. To make it formal, we follow three basic rules called **axioms of probability**:

1. **Probability is always non-negative:**  
   Every event $A$ has a probability that is **never** less than zero:  
   $$
   P(A) \geq 0
   $$
   This means we **can’t have negative probability**.

2. **The probability of the entire sample space is 1:**  
   The **sample space** $S$ includes all possible outcomes. The probability that *some* outcome in $S$ occurs must be 1:  
   $$
   P(S) = 1
   $$
   For example, if you roll a die, you **must** get a number between 1 and 6. The probabilities of the six outcomes (1, 2, 3, 4, 5, and 6) **add up to 1**.

3. **For mutually exclusive events, probabilities add up:**  
   If two events $A_1$ and $A_2$ **cannot happen at the same time** (for example, rolling a 1 **or** rolling a 2 on a die), their combined probability is:  
   $$
   P(A_1 \cup A_2) = P(A_1) + P(A_2)
   $$
   More generally, for **multiple** pairwise mutually exclusive events $A_1, A_2, A_3, \dots$, the probability of at least one of them happening is:  
   $$
   P(A_1 \cup A_2 \cup A_3 \cup \dots) = P(A_1) + P(A_2) + P(A_3) + \dots
   $$

### **Example**
- Suppose you **flip a coin**.  
  - The probability of **getting heads** is $P(H) = 0.5$.
  - The probability of **getting tails** is $P(T) = 0.5$.
  - Since heads and tails **cannot happen at the same time**, the total probability is:  
    $$
    P(H \cup T) = P(H) + P(T) = 0.5 + 0.5 = 1
    $$
  - This follows the **third axiom**.

### **Summary**
- **Probability is never negative** → $P(A) \geq 0$
- **Total probability is 1** → $P(S) = 1$
- **If events can't happen together, their probabilities add up** → $P(A_1 \cup A_2) = P(A_1) + P(A_2)$

## Continuous distributions

A continuous distribution characterizes a random variable whose possible outcomes form a continuous set, typically a bounded interval or the entire real line.

Continuous distributions are described by a probability density function (**PDF**).

This function does not give probabilities directly; instead, the probability that the variable falls within a specific interval is found by integrating the PDF over that interval.

For a continuous random variable $X$ with PDF $f(x)$, the probability that $X$ lies between two values $a$ and $b$ is given by:

$$
P(a \leq X \leq b) = \int_a^b f(x) \, dx
$$


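For example, a **uniform distribution** on the interval $[a, b]$ spreads probability evenly over that interval. Its PDF is defined as: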

$$
f(x) =
\begin{cases}
\frac{1}{b - a}, & \text{if } a \le x \le b \\
0, & \text{otherwise}
\end{cases}
$$


For the specific interval with $a = 0.05432$ and $b = 7.15758$, the length of the interval is:

$$
b - a = 7.15758 - 0.05432 = 7.10326
$$


Therefore, the PDF is:

$$
f(x) =
\begin{cases}
\frac{1}{7.10326}, & \text{if } 0.05432 \le x \le 7.15758 \\
0, & \text{otherwise}
\end{cases}
$$


This ensures that the total probability is:

$$
\int_{0.05432}^{7.15758} \frac{1}{7.10326} \, dx = \frac{7.10326}{7.10326} = 1.
$$
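The same result can be approximated numerically. This is a minimal sketch that integrates the uniform PDF with a midpoint Riemann sum; the helpers `uniform_pdf` and `integrate`, the step count, and the sub-interval from 1 to 3 are assumptions chosen for illustration.

```python
a, b = 0.05432, 7.15758

def uniform_pdf(x, a=a, b=b):
    """PDF of the uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def integrate(f, lo, hi, steps=100_000):
    """Midpoint Riemann sum of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

total = integrate(uniform_pdf, a, b)          # total probability, should be close to 1
p_1_to_3 = integrate(uniform_pdf, 1.0, 3.0)   # P(1 <= X <= 3), a hypothetical sub-interval
exact = (3.0 - 1.0) / (b - a)                 # closed form for a uniform density

print(f"Total probability ≈ {total:.6f}")
print(f"P(1 <= X <= 3) ≈ {p_1_to_3:.6f} (exact: {exact:.6f})")
```

For a uniform density, the probability of a sub-interval is simply its length divided by $b - a$, which is what the numerical integral reproduces.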
