# Introduction to Probability

---

## 1. What is Probability?
**Probability** is a numerical measure of the likelihood of an event occurring. It is used to model and analyze situations involving uncertainty. It seeks to answer the question, _"What is the chance of something happening?"_ A probability is a value between 0 and 1, inclusive:
* 0: Impossible event.
* 1: Certain event.
* 0.5: Event that is equally likely to happen or not happen (e.g., getting heads in a fair coin toss).

Probability is used in many areas of machine learning:
* **Model Uncertainty:** Measuring how reliable a model's predictions are (e.g., the probability that a classification model assigns to a particular class for a given instance).
* **Data Generation:** Generating new data points (e.g., in generative models).
* **Random Processes:** Modeling algorithms that involve randomness (e.g., random forests).
* **Bayesian Inference:** Using Bayes' theorem to update probabilities in light of new data.

### 1.1. Basic Concepts
* **Experiment:** A process with an uncertain outcome (e.g., rolling a die, tossing a coin, observing whether a customer buys a product).
* **Sample Space:** The set of _all_ possible outcomes of an experiment (usually denoted by _S_ or _Ω_).
    * Example: When rolling a die, the sample space is _S_ = {1, 2, 3, 4, 5, 6}
    * Example: When tossing a coin twice, the sample space is _S_ = {HH, HT, TH, TT}
* **Event:** A subset of the sample space, i.e., the set of outcomes we are interested in.
    * Example: The event of rolling an even number on a die: _E_ = {2, 4, 6}
    * Example: The event of getting at least one tail when tossing a coin twice: _E_ = {HT, TH, TT}
* **Probability Function:** A function that assigns to each event (each subset of the sample space) a probability between 0 and 1.
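
The concepts above can be sketched in a few lines of Python. This is a minimal illustration, not a general-purpose library: the names `sample_space`, `at_least_one_tail`, and `probability` are made up for this example, and the probability function assumes that every outcome is equally likely (a fair coin).

```python
from fractions import Fraction
from itertools import product

# Sample space for tossing a coin twice: every ordered pair of H and T.
sample_space = {"".join(pair) for pair in product("HT", repeat=2)}

# Event: at least one tail. An event is just a subset of the sample space.
at_least_one_tail = {outcome for outcome in sample_space if "T" in outcome}


def probability(event, space):
    """Return P(event), assuming all outcomes in `space` are equally likely."""
    return Fraction(len(event & space), len(space))


p = probability(at_least_one_tail, sample_space)
print(sorted(sample_space))
print(sorted(at_least_one_tail))
print(p)
```

Using `Fraction` instead of floating-point numbers keeps the results exact, which makes them easy to compare with hand calculations.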

---

## 2. Probability Rules
When calculating and combining probabilities, we follow some fundamental rules. These rules form the basis of probability theory.

### Rule 1: Non-Negativity Rule
"The probability of any event is always greater than or equal to 0."

$P(E) \ge 0$ (For any event *E*)

### Rule 2: Normalization Rule
"The probability of the sample space is equal to 1." For a discrete sample space, this means that the probabilities of all possible outcomes sum to 1.

$P(S) = 1$ (*S*: sample space)
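
Rules 1 and 2 together describe what a valid probability assignment over a discrete sample space looks like. The sketch below checks both conditions for a dictionary that maps outcomes to probabilities. The helper name `is_valid_distribution` and the example dictionaries are hypothetical and exist only for illustration.

```python
from fractions import Fraction

# Probability assignment for a fair six-sided die (assumed for this example).
die = {face: Fraction(1, 6) for face in range(1, 7)}


def is_valid_distribution(pmf):
    """Check non-negativity (Rule 1) and normalization (Rule 2)."""
    non_negative = all(p >= 0 for p in pmf.values())
    normalized = sum(pmf.values()) == 1
    return non_negative and normalized


valid = is_valid_distribution(die)
# These probabilities sum to 7/6, so normalization fails.
invalid = is_valid_distribution({"H": Fraction(1, 2), "T": Fraction(2, 3)})
print(valid, invalid)
```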

### Rule 3: Addition Rule for Mutually Exclusive Events
"If two events _cannot happen at the same time_ (**mutually exclusive events**), the probability of _either one or the other_ event occurring is the sum of their individual probabilities."

$P(A \cup B) = P(A) + P(B)$ (If *A* and *B* are mutually exclusive)

**Example:**  

The probability of rolling a 1 or a 2 on a fair die: $P(1 \cup 2) = P(1) + P(2) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$
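
As a quick check of this rule, the sketch below computes the probability of the union directly and compares it with the sum of the individual probabilities. It assumes a fair six-sided die; the names `a`, `b`, and `probability` are illustrative.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # outcomes of a fair six-sided die


def probability(event):
    """Return P(event), assuming all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


a = {1}  # rolling a 1
b = {2}  # rolling a 2

# Mutually exclusive events share no outcomes.
mutually_exclusive = a.isdisjoint(b)

p_union = probability(a | b)              # computed directly from the union
p_sum = probability(a) + probability(b)   # computed with the addition rule
print(mutually_exclusive, p_union, p_sum)
```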

### Rule 4: Complement Rule
"The probability of an event **not** happening (its complement) is 1 minus the probability of the event happening."

$P(A^c) = 1 - P(A)$  (*Aᶜ* is the complement of *A* - the event that *A* does not occur)

**Example:**  

The probability of *not* rolling a 6 on a fair die: $P(6^c) = 1 - P(6) = 1 - \frac{1}{6} = \frac{5}{6}$
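
The same result can be reproduced by building the complement as a set difference and comparing it with the complement rule. This is a small sketch that assumes a fair six-sided die; the variable names are illustrative.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # outcomes of a fair six-sided die
six = {6}

# The complement contains every outcome of the sample space not in the event.
not_six = sample_space - six

p_direct = Fraction(len(not_six), len(sample_space))       # count outcomes
p_via_rule = 1 - Fraction(len(six), len(sample_space))     # 1 - P(A)
print(sorted(not_six), p_direct, p_via_rule)
```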

### Rule 5: General Addition Rule
"If two events are **not** mutually exclusive (i.e., they _can_ happen at the same time), the probability of _either one or the other_ (or both) occurring is calculated as:"

*   $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

    $P(A \cap B)$: The probability of *both* *A* and *B* occurring (the probability of their *intersection*). We subtract this to avoid double-counting the outcomes that are in both A and B.

**Example:**  

We draw a random card from a standard deck of playing cards.

* Event *A*: The card is a heart.  $P(A) = \frac{13}{52} = \frac{1}{4}$
* Event *B*: The card is a king. $P(B) = \frac{4}{52} = \frac{1}{13}$
* Event $A \cap B$: The card is *both* a heart *and* a king (the king of hearts). $P(A \cap B) = \frac{1}{52}$
* The probability of drawing a heart *or* a king: $P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$
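
The card example can be verified by enumerating a 52-card deck and comparing the directly counted union with the general addition rule. This is a minimal sketch assuming every card is equally likely to be drawn; the rank and suit labels and the helper `probability` are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 13 ranks x 4 suits = 52 cards

hearts = {card for card in deck if card[1] == "hearts"}  # event A
kings = {card for card in deck if card[0] == "K"}        # event B


def probability(event):
    """Return P(event), assuming each card is equally likely to be drawn."""
    return Fraction(len(event), len(deck))


# Count the union directly, then apply P(A) + P(B) - P(A and B).
p_direct = probability(hearts | kings)
p_rule = probability(hearts) + probability(kings) - probability(hearts & kings)
print(p_direct, p_rule)
```

Without subtracting the intersection, the king of hearts would be counted twice, giving 17/52 instead of 16/52.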

---

## 3. 