# Probability Review
- Total Probability Theorem
- Single Event
- Multiple Events
- Conditional Probabilities
- Bayes Theorem
- Random Variables

## Total Probability Theorem

\begin{aligned}
P(A) = \sum_i{P(A|B_i)P(B_i)},\ i=1,2,...
\end{aligned}
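
As a quick numerical check of the theorem, the sketch below uses two fair dice (an illustrative choice): $A$ is "the sum is 6" and $B_i$ is "the first die shows $i$", so the $B_i$ partition the outcomes. The names `p_b` and `p_a_given_b` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction

faces = range(1, 7)

# P(B_i): the first die shows i (a fair die is assumed)
p_b = {i: Fraction(1, 6) for i in faces}

# P(A | B_i): the sum is 6 given that the first die shows i,
# which happens only when the second die shows 6 - i
p_a_given_b = {
    i: Fraction(sum(1 for j in faces if i + j == 6), 6) for i in faces
}

# Total probability: P(A) = sum_i P(A | B_i) P(B_i)
p_a = sum(p_a_given_b[i] * p_b[i] for i in faces)
print(p_a)  # 5/36
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with a hand calculation.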



## Single Event

### Probability of an event A, when all outcomes are equally likely:

\begin{aligned}
P(A) = \frac{\text{number of favorable outcomes}}{\text{number of possible outcomes}}
\end{aligned}

#### Example 1: given a deck, what is the probability of getting a black card?

\begin{aligned}
P(\text{black card}) = \frac{26}{52} = \frac{1}{2}
\end{aligned}
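
The same count can be reproduced by building the deck explicitly. This is a minimal sketch assuming a standard 52-card deck; the `ranks` and `suits` names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
# Map each suit to its colour (standard deck assumed)
suits = {"clubs": "black", "spades": "black", "hearts": "red", "diamonds": "red"}

# Every (rank, suit) pair is one card: 13 * 4 = 52 cards
deck = list(product(ranks, suits))

# Favorable outcomes: cards whose suit is black
favorable = [card for card in deck if suits[card[1]] == "black"]

p_black = Fraction(len(favorable), len(deck))
print(len(favorable), len(deck), p_black)  # 26 52 1/2
```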

#### Example 2: you throw 2 dice. What is the probability that the sum will be 6? 



In [6]:
number_of_occurrences = 0
# Enumerate every (die_1, die_2) outcome and count those whose sum is 6
for die_1 in range(1, 7):
    for die_2 in range(1, 7):
        if die_1 + die_2 == 6:
            number_of_occurrences += 1

number_of_possible_outcomes = 6 * 6
probability = number_of_occurrences / number_of_possible_outcomes
print(f'The probability that the sum of the two dice will be 6 is {number_of_occurrences} in {number_of_possible_outcomes}:\n{number_of_occurrences}/{number_of_possible_outcomes} = {probability:.2f}')

The probability that the sum of the two dice will be 6 is 5 in 36:
5/36 = 0.14


## Multiple Events

### Probability of an event A __and__ B:

\begin{aligned}
P(\text{A and B}) = P(A)P(B)
\end{aligned}

Two events are independent if the probability of A occurring is the same whether or not B occurs. The product rule above holds only for independent events.

#### Example 1: consider two decks, with one card picked from each. The probability of picking a given card from the second deck is 1/52 regardless of which card was picked from the first deck.

#### Example 2 (__not independent events__): if you throw two six-sided dice, what is the probability that their sum is 6 and the first die is 4?

\begin{aligned}
\begin{gather*}
A = \text{sum is 6}; P(A) = \frac{5}{36} \\
B = \text{first die is 4}; P(B) = \frac{6}{36} \\
P(\text{A and B}) = \frac{1}{36} \rightarrow \{die_1 = 4, die_2 = 2\} \\
P(A)P(B) = \frac{5}{36} \cdot \frac{6}{36} = \frac{5}{216} \\
\end{gather*}
\end{aligned}

Therefore, A and B are not independent because $P(\text{A and B}) \neq P(A)P(B)$.
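
The independence test for these two events can be checked by enumerating all 36 outcomes. This is a sketch under the assumption of two fair dice; `prob`, `sum_is_6`, and `first_is_4` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die_1, die_2) outcomes
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a predicate over (die_1, die_2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def sum_is_6(o):
    return o[0] + o[1] == 6


def first_is_4(o):
    return o[0] == 4


p_a = prob(sum_is_6)
p_b = prob(first_is_4)
p_a_and_b = prob(lambda o: sum_is_6(o) and first_is_4(o))

print(p_a, p_b, p_a_and_b)     # 5/36 1/6 1/36
print(p_a * p_b)               # 5/216
print(p_a_and_b == p_a * p_b)  # False, so A and B are not independent
```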

### Probability of an event A __or__ B:
\begin{aligned}
P(\text{A or B}) = P(A) + P(B) - P(\text{A and B})
\end{aligned}

The event "A or B" covers three cases: $A$ occurs and $B$ does not; $B$ occurs and $A$ does not; both $A$ and $B$ occur.

#### Example: Consider throwing a die and flipping a coin. What is the probability of getting a 6 or a head?

\begin{aligned}
P(\text{6 or head}) = \frac{1}{6} + \frac{1}{2} - (\frac{1}{6} * \frac{1}{2}) = \frac{7}{12}
\end{aligned}

If the problem has more than two events, it is easier to compute the probability of the complement, that is, of getting neither a 6 nor a head, and then subtract this value from 1 to get the probability of a 6 or a head:

\begin{aligned}
\begin{gather*}
\text{(not 6)}\ \text{and}\ \text{(not head)} \\
(1 - \frac{1}{6}) * (1 - \frac{1}{2}) = \frac{5}{12} \\
1 - \frac{5}{12} = \frac{7}{12}
\end{gather*}
\end{aligned}
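
Both routes to $\frac{7}{12}$ can be verified by listing the 12 outcomes of one die and one coin. A minimal sketch, assuming a fair die and a fair coin; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# 6 die faces x 2 coin sides = 12 equally likely outcomes
outcomes = list(product(range(1, 7), ["head", "tail"]))

# Direct count: the die shows 6 or the coin shows head
favorable = [o for o in outcomes if o[0] == 6 or o[1] == "head"]
p_direct = Fraction(len(favorable), len(outcomes))

# Complement: 1 - P(not 6 and not head)
neither = [o for o in outcomes if o[0] != 6 and o[1] != "head"]
p_complement = 1 - Fraction(len(neither), len(outcomes))

print(p_direct, p_complement)  # 7/12 7/12
```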


## Conditional Probabilities

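For any two events with $P(B) > 0$, the conditional probability of A given B is $P(A|B) = \frac{P(\text{A and B})}{P(B)}$; equivalently, $P(\text{A and B}) = P(A)P(B|A)$. When A and B are not independent, $P(A|B)$ differs from $P(A)$. You can read $P(A|B)$ as the size of $A \cap B$ relative to the size of $B$.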

#### Example 1: drawing two cards from a deck, what is the probability of getting one ace of diamonds (event AD) and one black card (event BC)?

Possible outcomes:
- Case A: first you get an AD _and_ then a BC;
- Case B: first you get a BC _and_ then an AD.

First case: 

\begin{aligned}
\begin{gather*}
P(\text{AD and BC}) = P(AD)P(BC | AD) \\ 
P(\text{AD and BC}) = \frac{1}{52} * \frac{26}{51} \\
P(\text{AD and BC}) = \frac{1}{102}
\end{gather*}
\end{aligned}

Notice that $P(BC | AD) = \frac{26}{51}$ because one card (the AD) has already been drawn from the deck, leaving 51 cards, of which 26 are black.

Second case:
\begin{aligned}
\begin{gather*}
P(\text{BC and AD}) = P(BC)P(AD | BC) \\ 
P(\text{BC and AD}) = \frac{26}{52} * \frac{1}{51} \\
P(\text{BC and AD}) = \frac{1}{102}
\end{gather*}
\end{aligned}

To answer the question, the probability of drawing an ace of diamonds and a black card is the probability that case A or case B occurs. Thus, $P(\text{A or B}) = P(A) + P(B) - P(\text{A and B})$, where $P(\text{A and B})=0$ because the two cases are mutually exclusive: the first card cannot be both the ace of diamonds and a black card. The result:

\begin{aligned}
\begin{gather*}
P(\text{A or B}) = \frac{1}{102} + \frac{1}{102} - 0 \\
P(\text{A or B}) = \frac{1}{51}
\end{gather*}
\end{aligned}
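
The result $\frac{1}{51}$ can be confirmed by enumerating every ordered pair of distinct cards. This sketch assumes a standard 52-card deck drawn without replacement; `ACE_OF_DIAMONDS` and `is_black` are names introduced here for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "spades", "hearts", "diamonds"]
deck = list(product(ranks, suits))

ACE_OF_DIAMONDS = ("A", "diamonds")


def is_black(card):
    return card[1] in ("clubs", "spades")


# Ordered draws of two distinct cards: 52 * 51 = 2652 outcomes
draws = list(permutations(deck, 2))

# Case A (AD then black) or case B (black then AD)
favorable = [
    (first, second)
    for first, second in draws
    if (first == ACE_OF_DIAMONDS and is_black(second))
    or (is_black(first) and second == ACE_OF_DIAMONDS)
]

p = Fraction(len(favorable), len(draws))
print(len(favorable), len(draws), p)  # 52 2652 1/51
```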

#### Example 2: a die is thrown and shows an even number. What is the probability that the number is 2?

- Event A: it is the number 2 $\rightarrow P(A)=\frac{1}{6}$
- Event B: even number $\rightarrow P(B)=\frac{3}{6}$
- Since the number 2 is itself even, $A \cap B = A$, and therefore $P(A \cap B) = \frac{1}{6}$

Thus,

\begin{aligned}
\begin{gather*}
P(A|B) = \frac{\frac{1}{6}}{\frac{3}{6}} = \frac{1}{3} \\
\end{gather*}
\end{aligned}
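
The same conditional probability can be computed with Python sets, where the intersection operator `&` plays the role of $\cap$. A minimal sketch assuming a fair six-sided die; `prob` is a hypothetical helper.

```python
from fractions import Fraction

faces = set(range(1, 7))

a = {2}                                # event A: the number is 2
b = {f for f in faces if f % 2 == 0}   # event B: the number is even


def prob(event):
    """Probability of an event (a subset of faces) for a fair die."""
    return Fraction(len(event), len(faces))


# P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(a & b) / prob(b)
print(prob(a & b), prob(b), p_a_given_b)  # 1/6 1/2 1/3
```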


## Bayes Theorem

### Key aspects:
- Quantify uncertainty;
- Update beliefs in light of new evidence;
- "**Posterior** is proportional to the **likelihood** times the **prior**"

### Formula:
\begin{aligned}
\begin{gather*}
P(A_i|B) = \frac{P(B|A_i)P(A_i)}{P(B)},\ i = 1,2,... \\
\text{where}\ P(B)=\sum_j{P(B|A_j)P(A_j)}
\end{gather*}
\end{aligned}

### Example: there are two boxes. The first has two white balls and seven black balls, and the second has five white balls and six black balls. We flip a coin: on a head we take one ball from box 1, otherwise we take one ball from box 2. What is the probability that the coin showed a head, given that a white ball was taken?

Event A: coin is head, $P(A)=\frac{1}{2}$

Event B: white ball was taken, $P(B)=\sum_i{P(B|A_i)P(A_i)}$, $i \in \{\text{head}, \text{tail}\}$

The probability of getting a head given a white ball:

\begin{aligned}
\begin{gather*}
P(A_i|B) = \frac{P(B|A_i)P(A_i)}{\sum_j{P(B|A_j)P(A_j)}},\ i=\text{head},\ j \in \{\text{head}, \text{tail}\} \\
P(A_{head}|B) = \frac{P(B|A_{head})P(A_{head})}{P(B|A_{head})P(A_{head}) + P(B|A_{tail})P(A_{tail})} \\
P(A_{head}|B) = \frac{\frac{2}{9} \cdot \frac{1}{2}}{\frac{2}{9} \cdot \frac{1}{2} + \frac{5}{11} \cdot \frac{1}{2}} = \frac{22}{67}\\
\end{gather*}
\end{aligned}
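
The box example can be reproduced with exact fractions. This is a sketch of the calculation, assuming a fair coin; the dictionary names `prior`, `likelihood`, and `posterior` are chosen here to mirror the terms in the formula.

```python
from fractions import Fraction

# Prior: P(A_i) for each coin side (fair coin assumed)
prior = {"head": Fraction(1, 2), "tail": Fraction(1, 2)}

# Likelihood: P(white | coin side)
# head -> box 1 (2 white of 9), tail -> box 2 (5 white of 11)
likelihood = {"head": Fraction(2, 9), "tail": Fraction(5, 11)}

# Evidence: P(B) = sum_j P(B | A_j) P(A_j)
evidence = sum(likelihood[c] * prior[c] for c in prior)

# Posterior: P(A_i | B) = P(B | A_i) P(A_i) / P(B)
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}

print(evidence)                              # 67/198
print(posterior["head"], posterior["tail"])  # 22/67 45/67
```

The two posteriors sum to 1, which is a useful sanity check on the evidence term.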


## Random Variables

### Definition: a random variable is a function that assigns a real value to each element of the sample space of an experiment.

**Example**: The sample space of throwing two coins is $S=\{HH, HT, TH, TT\}$. A random variable $X$ assigns to each element of this set its number of heads, thus $X(HH)=2, X(HT)=1, X(TH)=1, X(TT)=0$.
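
Since a random variable is just a function on the sample space, the two-coin example translates directly into code. A minimal sketch assuming fair coins, so that all four outcomes are equally likely; the name `pmf` is introduced here for the resulting distribution of $X$.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of two coin throws: HH, HT, TH, TT
sample_space = ["".join(t) for t in product("HT", repeat=2)]


def X(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")


# Distribution of X: how often each value occurs, divided by |S|
counts = Counter(X(o) for o in sample_space)
pmf = {x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)
```

Here $X$ takes the value 1 for two outcomes ($HT$ and $TH$), so $P(X=1)$ is twice $P(X=0)$ or $P(X=2)$.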

**Notation in probability**: Consider two coloured boxes (red and blue), each containing fruit (apples and oranges). The _identity_ of the box is a **random variable**, denoted by $B$. This random variable can take one of two _possible values_: red ($r$) or blue ($b$). Similarly, the _identity_ of the fruit is also a random variable, denoted by $F$. It can take either of the values apple ($a$) or orange ($o$).

If the probability of choosing the red box is $4/10$ and the probability of choosing the blue box is $6/10$, we write these probabilities as $p(B=r) = \frac{4}{10}$ and $p(B=b) = \frac{6}{10}$.

**Conditional Probability**: let $X$ be a discrete or continuous random variable. The conditional probability that $X \in S$ given $X \in V$, with $P(X \in V) > 0$, is:

\begin{aligned}
\begin{gather*}
P(X \in S | X \in V) = \frac{P(X \in S \cap V)}{P(X \in V)}
\end{gather*}
\end{aligned}
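
As an illustration of this formula, take $X$ to be the number of heads in two fair coin throws and ask for $P(X = 2 \mid X \ge 1)$. The sets `S` and `V` and the helper `prob` below are hypothetical choices made for this sketch.

```python
from fractions import Fraction

# Distribution of X = number of heads in two fair coin throws
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def prob(values):
    """P(X in values) for a discrete X described by pmf."""
    return sum(pmf[x] for x in values if x in pmf)


S = {2}      # X equals 2
V = {1, 2}   # X is at least 1

# P(X in S | X in V) = P(X in S ∩ V) / P(X in V)
p = prob(S & V) / prob(V)
print(p)  # 1/3
```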

- Discrete
    - $X$ takes values in a finite or countably infinite list $(x_1, x_2, ...)$
    - The probability mass function is denoted by $P(X=x_i) = p(x_i) = p_i$, $i=1,2,...$
    - $\sum_i{P(X=x_i)} = 1$
    - $P(X=x_i) \ge 0$
- Continuous
    - $f(x) \ge 0, \forall x \in \mathbb{R}$, where $f$ is called the probability density function
    - $\int_{-\infty}^{\infty} f(x)\,dx = 1$
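
Both sets of conditions can be checked numerically. The sketch below uses two illustrative distributions that are assumptions of this example: the number of heads in two fair coin throws for the discrete case, and the standard normal density for the continuous case. The integral is approximated with a simple trapezoidal rule on $[-10, 10]$, outside of which the density is negligible.

```python
import math
from fractions import Fraction

# Discrete case: number of heads in two fair coin throws
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
non_negative = all(p >= 0 for p in pmf.values())
total = sum(pmf.values())
print(non_negative, total)  # True 1


# Continuous case: standard normal probability density function
def f(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def trapezoid(func, a, b, n):
    """Approximate the integral of func over [a, b] with n trapezoids."""
    h = (b - a) / n
    inner = sum(func(a + k * h) for k in range(1, n))
    return h * (0.5 * func(a) + inner + 0.5 * func(b))


area = trapezoid(f, -10.0, 10.0, 20_000)
print(math.isclose(area, 1.0, abs_tol=1e-9))  # True
```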

## References
- Book: **Pattern Recognition and Machine Learning**, by Christopher M. Bishop
- YouTube channel: **[Professor Francisco Rodrigues](https://www.youtube.com/channel/UCpWLePIWw2B0qUkvPjdRzxw)**
- Online platform: **[Brilliant](https://brilliant.org/courses/probability-fundamentals/)**