# Fundamentals
## Intuition
The intuition behind probability is fairly simple: the likelihood of an event is just a fraction, the share of all possible outcomes in which the event happens. For those fractions to add up properly, the outcomes we count need two properties:
1. They are Mutually Exclusive: no two of them can happen at the same time
2. They are Collectively Exhaustive: together they cover everything that can happen
With these two properties in place, we can put math to the fundamentals of probability.
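
As a minimal sketch of these two properties, assume a six-sided die as the sample space. The die, the example events, and the helper names `is_mutually_exclusive` and `is_collectively_exhaustive` are illustrative assumptions, not part of the text above.

```python
from itertools import combinations

# Hypothetical sample space: the faces of a six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}


def is_mutually_exclusive(events):
    """True if no two events share an outcome."""
    return all(a.isdisjoint(b) for a, b in combinations(events, 2))


def is_collectively_exhaustive(events, space):
    """True if the events together cover every outcome in the space."""
    return set().union(*events) == space


odd_even = [{1, 3, 5}, {2, 4, 6}]        # disjoint and covers everything
overlapping = [{1, 2, 3, 4}, {4, 5, 6}]  # covers everything, but 4 is shared

print(is_mutually_exclusive(odd_even), is_collectively_exhaustive(odd_even, sample_space))
print(is_mutually_exclusive(overlapping), is_collectively_exhaustive(overlapping, sample_space))
```

Only the odd/even split satisfies both properties, which is why its two fractions (3/6 and 3/6) add up to exactly 1.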

From here, we can assign probabilities by taking fractions of the sample space. The symbols are the hardest part about probability.

## Explanation

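Let's start with the big picture, then work our way down to the more complicated issues.

- Let $S$ represent the sample space, the set of all possible outcomes
- Let $e_i$ represent a single outcome (an elementary event)
- $S = \{e_1, e_2, \dots, e_n\}$
- An event such as $A$ or $B$ is any subset of $S$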

There are a few important laws to keep in mind:

- Law of Complements: $Pr\{\bar{A}\}=1-Pr\{A\}$
- Addition Rule: $Pr\{A \cup B\} = Pr\{A\} + Pr\{B\} - Pr\{A \cap B \}$
- Conditional Rule: $Pr\{A|B\} = \frac{Pr\{A \cap B\}} {Pr\{B\}}$
- Product Rule: $Pr\{A \cap B\} = Pr\{A\}Pr\{B|A\} = Pr\{B\}Pr\{A|B\}$
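
The four laws above can be checked by plain counting. The sketch below assumes a fair six-sided die and two made-up events; the names `pr` and `pr_given` are hypothetical helpers, and exact fractions are used so the comparisons are not affected by floating-point rounding.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # "roll is even"
B = {4, 5, 6}  # "roll is greater than 3"


def pr(event):
    """Probability of an event as a fraction of equally likely outcomes."""
    return Fraction(len(event), len(S))


def pr_given(event, given):
    """Conditional probability Pr{event | given}, found by counting."""
    return Fraction(len(event & given), len(given))


complement_ok = pr(S - A) == 1 - pr(A)
addition_ok = pr(A | B) == pr(A) + pr(B) - pr(A & B)
conditional_ok = pr_given(A, B) == pr(A & B) / pr(B)
product_ok = pr(A & B) == pr(A) * pr_given(B, A) == pr(B) * pr_given(A, B)

print(complement_ok, addition_ok, conditional_ok, product_ok)
```

All four checks hold for this die, and they would hold for any other choice of `A` and `B` as long as the event being conditioned on is not empty.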

The most important of these are the Conditional Rule and the Product Rule.

Conditional probability is easier to grasp than the symbols suggest. Basically, we can think of it as finding all the cases when $A$ is true AND $B$ is true. That gives us our numerator. Our denominator is all the cases when B is true. 

So conditional probability is a fraction taken within a subset of the sample space $S$: once we know $B$ happened, $B$ acts as the new, smaller sample space. Now, let's start thinking about how we can use this relationship productively.

Let's say we have a conditional probability $Pr(A|B)$. The numerator is the probability of $A \cap B$, while the denominator is the probability of $B$ itself:

$$
\frac{Pr(A \cap B)}{Pr(B)}
$$

What if, instead of $Pr(A|B)$, we conditioned on $A$ and put $Pr(A)$ in the denominator? This would yield $Pr(B|A)$:

$$
\frac{Pr(A \cap B)}{Pr(A)}
$$

Both fractions share the numerator $Pr(A \cap B)$, so multiplying each one by its own denominator gives the same quantity. Setting the two equal and then dividing by $Pr(A)$:

$$
\begin{aligned}
Pr(A) \cdot \frac{Pr(A \cap B)}{Pr(A)} &= Pr(B) \cdot \frac{Pr(A \cap B)}{Pr(B)} \\
Pr(B|A) &= \frac{Pr(B) \cdot \frac{Pr(A \cap B)}{Pr(B)}}{Pr(A)} \\
Pr(B|A) &= \frac{Pr(B) \cdot Pr(A|B)}{Pr(A)}
\end{aligned}
$$

What's remarkable about this formula, known as Bayes' theorem, is that it lets you reverse a conditional probability: you can find $Pr(B|A)$ from $Pr(A|B)$, as long as you also know $Pr(A)$ and $Pr(B)$.
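
As a quick check of the last line of the derivation, here is a minimal sketch on a fair six-sided die. The die, the two events, and the function name `reverse_conditional` are assumptions made for illustration. The result from the formula is compared with the answer found by counting directly.

```python
from fractions import Fraction


def reverse_conditional(p_a_given_b, p_b, p_a):
    """Return Pr(B|A) = Pr(B) * Pr(A|B) / Pr(A)."""
    if p_a == 0:
        raise ValueError("Pr(A) must be positive to condition on A")
    return p_b * p_a_given_b / p_a


# Hypothetical fair die: A = "roll is even", B = "roll is 5 or 6".
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {5, 6}

p_a = Fraction(len(A), len(S))
p_b = Fraction(len(B), len(S))
p_a_given_b = Fraction(len(A & B), len(B))

via_formula = reverse_conditional(p_a_given_b, p_b, p_a)
by_counting = Fraction(len(A & B), len(A))  # cases in A that are also in B

print(via_formula, by_counting)
```

Both routes give 1/3: of the three even faces, only the 6 is also in $B$.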

## Example

Let's set up a simple example. Say there are two separate things that can happen: A and B. A is the event that it rains, and B is the event that I go get tacos. Let's say there is a $40\%$ chance of rain. If it rains, let's say my likelihood of getting tacos is $20\%$. If it's sunny, let's say it's $10\%$.

I think it's actually logically easier to set up an example and work backwards through it. 

### Setup

In [1]:
import numpy as np
import pandas as pd


def game():
    """Simulate one day and return labels for rain (A) and tacos (B)."""
    r1 = np.random.rand()  # decides whether it rains
    r2 = np.random.rand()  # decides whether I get tacos

    prain = 0.40
    ptaco_with_rain = 0.2
    ptaco_with_sun = 0.1

    if r1 < prain:
        # rain
        if r2 < ptaco_with_rain:
            return "A", "B"
        else:
            return "A", "!B"
    else:
        # sun
        if r2 < ptaco_with_sun:
            return "!A", "B"
        else:
            return "!A", "!B"
        

# generate results
n = 15000
results = pd.DataFrame([game() for x in range(n)])

# create results table
xtab = pd.crosstab(index=results[0], columns=results[1])
xtab.loc[:,'Total'] = xtab.sum(axis=1)
xtab.loc['Total',:] = xtab.sum(axis=0)
xtab.loc['Total','Total'] = n

#### Cross-tab: counts

In [2]:
xtab

| | !B | B | Total |
|---|---|---|---|
| !A | 8006.0 | 872.0 | 8878.0 |
| A | 4946.0 | 1176.0 | 6122.0 |
| Total | 12952.0 | 2048.0 | 15000.0 |


#### Cross-tab: %

In [3]:
xtab / n

| | !B | B | Total |
|---|---|---|---|
| !A | 0.533733 | 0.058133 | 0.591867 |
| A | 0.329733 | 0.0784 | 0.408133 |
| Total | 0.863467 | 0.136533 | 1.0 |
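
The simulated proportions can be compared with the exact values implied by the product rule. This is a sketch under stated assumptions: the inputs are read off the simulation code (rain with probability 0.4, tacos with probability 0.2 in the `r1 < prain` branch and 0.1 otherwise), and the variable names are made up for illustration.

```python
from fractions import Fraction

# Assumed inputs, taken from what the simulation code actually applies.
p_a = Fraction(4, 10)              # Pr(A): the r1 < prain branch
p_b_given_a = Fraction(2, 10)      # Pr(B|A): taco threshold in that branch
p_b_given_not_a = Fraction(1, 10)  # Pr(B|!A): taco threshold otherwise

# Product rule: Pr(X and Y) = Pr(X) * Pr(Y|X) for each of the four cells.
expected = {
    ("A", "B"): p_a * p_b_given_a,
    ("A", "!B"): p_a * (1 - p_b_given_a),
    ("!A", "B"): (1 - p_a) * p_b_given_not_a,
    ("!A", "!B"): (1 - p_a) * (1 - p_b_given_not_a),
}

for cell, p in expected.items():
    print(cell, float(p))
```

The exact cell values are 0.08, 0.32, 0.06, and 0.54, which sit close to the simulated 0.0784, 0.329733, 0.058133, and 0.533733 in the table.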


Now, let's go through the definitions, using both counts (#) and proportions (%).

#### Law of Complements

In [4]:
print("# !A:\t\t", xtab.loc["!A","Total"]) 
print("# Total - A:\t", xtab.loc["Total","Total"] - xtab.loc["A","Total"])
print("% !A:\t\t", xtab.loc["!A","Total"] / n) 
print("% Total - A:\t", (xtab.loc["Total","Total"] - xtab.loc["A","Total"]) / n)

# !A:		 8878.0
# Total - A:	 8878.0
% !A:		 0.5918666666666667
% Total - A:	 0.5918666666666667


#### Addition Rule

$Pr\{A \cup B\} = Pr\{A\} + Pr\{B\} - Pr\{A \cap B \}$

In [5]:
aorb = xtab.loc["A","Total"] + xtab.loc["Total","B"] - xtab.loc["A","B"]

print(f'#:\t{xtab.loc["A","Total"]} + {xtab.loc["Total","B"]} - {xtab.loc["A","B"]} = {aorb}')
print(f'%:\t{xtab.loc["A","Total"]/n} + {xtab.loc["Total","B"]/n} - {xtab.loc["A","B"]/n} = {aorb/n}')

#:	6122.0 + 2048.0 - 1176.0 = 6994.0
%:	0.40813333333333335 + 0.13653333333333334 - 0.0784 = 0.46626666666666666


#### Conditional Rule

$Pr\{A|B\} = \frac{Pr\{A \cap B\}} {Pr\{B\}}$

In [6]:
agivenb = xtab.loc["A","B"] / xtab.loc["Total","B"]
# parenthesize so each count becomes a proportion before the division
agivenb_pct = (xtab.loc["A","B"] / n) / (xtab.loc["Total","B"] / n)

print(f'#:\t{agivenb:.8f}')
print(f'%:\t{agivenb_pct:.8f}')

#:	0.57421875
%:	0.57421875


With the conditional rule, the sample size $n$ cancels out of the ratio, so the result is the same whether we start from raw counts or from proportions.

#### Product Rule

$Pr\{A \cap B\} = Pr\{A\}Pr\{B|A\} = Pr\{B\}Pr\{A|B\}$

In [7]:
agivenb = xtab.loc["A","B"] / xtab.loc["Total","B"]
bgivena = xtab.loc["A","B"] / xtab.loc["A","Total"]

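# both products should equal Pr(A and B), i.e. xtab.loc["A","B"] / n
pr_b_times_agivenb = xtab.loc["Total","B"] / n * agivenb
pr_a_times_bgivena = xtab.loc["A","Total"] / n * bgivena

print(f'Pr(B) * Pr(A|B) = {pr_b_times_agivenb}')
print(f'Pr(A) * Pr(B|A) = {pr_a_times_bgivena}')

Pr(B) * Pr(A|B) = 0.0784
Pr(A) * Pr(B|A) = 0.0784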
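
To tie the example back to the rearranged formula from the Explanation section, the sketch below plugs the cross-tab counts into $Pr(B|A) = Pr(B) \cdot Pr(A|B) / Pr(A)$ and compares the result with a direct count. The counts are copied by hand from the table above; the variable names are illustrative.

```python
from fractions import Fraction

# Counts copied from the cross-tab above.
n = 15000
count_a, count_b, count_a_and_b = 6122, 2048, 1176

p_a = Fraction(count_a, n)
p_b = Fraction(count_b, n)
p_a_given_b = Fraction(count_a_and_b, count_b)

# Rearranged formula: Pr(B|A) = Pr(B) * Pr(A|B) / Pr(A)
p_b_given_a = p_b * p_a_given_b / p_a

# Direct count for comparison: Pr(B|A) = #(A and B) / #A
direct = Fraction(count_a_and_b, count_a)

print(float(p_b_given_a), float(direct), p_b_given_a == direct)
```

Both routes give roughly 0.19, the share of simulated rainy days that ended in tacos.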
