# Probability and Bayes' Review

Marginal Distributions: $p(A), p(B)$
<br>
Joint Distribution: $p(A,B)$
<br>
Conditional Distribution: $p(A|B), p(B|A)$
<br>

Conditional distribution formula:
<br>
$p(A|B)=\frac{p(A,B)}{p(B)}$
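As a minimal sketch of the formula above, the snippet below builds a small joint table and derives the marginals and the conditional from it. The table values are made up for illustration, and the names `joint`, `p_A`, `p_B`, and `p_A_given_B` are hypothetical.

```python
import numpy as np

# Hypothetical joint distribution p(A, B): rows index A, columns index B.
joint = np.array([[0.10, 0.20],
                  [0.30, 0.40]])

# Marginals: sum the joint over the other variable.
p_A = joint.sum(axis=1)   # p(A)
p_B = joint.sum(axis=0)   # p(B)

# Conditional p(A|B) = p(A,B) / p(B); each column is divided by its p(B).
p_A_given_B = joint / p_B

# Each column of p(A|B) is a distribution over A, so it sums to 1.
print(p_A_given_B.sum(axis=0))
```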

### Discrete vs Continuous Random Variables
- For a continuous random variable, $p(\cdot)$ is a probability density (NOT a probability), and sums over outcomes become integrals
<br>
Joint: $p(x,y)$
<br>
Marginal: $p(x) = \int p(x,y) dy$

#### Bayes' Rule
$p(x|y) = \frac{p(y|x) p(x)}{\int p(y|x) p(x) dx} $
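One way to see the integral in the denominator at work is to approximate it on a grid. The sketch below assumes a Gaussian prior over $x$ and a Gaussian measurement model for $y$ given $x$; both models, the observed value, and all variable names are illustrative assumptions rather than part of the notes.

```python
import numpy as np
from scipy.stats import norm

# Grid over x; the integral in the denominator becomes a Riemann sum.
x_grid = np.linspace(120, 220, 10001)
dx = x_grid[1] - x_grid[0]

prior = norm.pdf(x_grid, loc=170, scale=7)          # p(x), assumed prior
y_obs = 180                                         # hypothetical observation
likelihood = norm.pdf(y_obs, loc=x_grid, scale=3)   # p(y|x), assumed noise model

numerator = likelihood * prior                      # p(y|x) p(x)
evidence = np.sum(numerator) * dx                   # approximates the integral over x
posterior = numerator / evidence                    # p(x|y) on the grid

posterior_mean = np.sum(x_grid * posterior) * dx
print(np.sum(posterior) * dx, posterior_mean)
```

Dividing by the evidence is what makes the posterior integrate to 1. The posterior mean lands between the prior mean and the observation, closer to the observation because the assumed measurement noise is smaller than the prior spread.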

## Maximum Likelihood Estimation

#### The Bernoulli Distribution 
- Describes a coin toss. But what is the equation?
- Discrete random variable
- PMF (Probability mass function)

$p(x) = \theta ^x (1- \theta)^{1-x} $

- x can only take the values of 0 or 1 (like heads or tails)
- theta is the probability that x=1 (so 1 - theta is the probability that x=0)

What is the likelihood a function of?
- The data are fixed once collected, so the variable is theta

Why is it called a maximum likelihood?
- What value of theta maximizes the likelihood?
- What value of theta makes the data we collected most probable?

To maximize, take the derivative of the likelihood L with respect to theta and set it to zero. Shoutout calculus. It is often easier to take the log of the likelihood before differentiating: the log is monotonic, so it has the same maximizer, and it turns products into sums.
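As a worked sketch of those steps for the Bernoulli case, assume $N$ independent tosses $x_1, \dots, x_N$ (the symbols $N$, $x_i$, and $\hat{\theta}$ are introduced here for illustration):

$$
\begin{aligned}
L(\theta) &= \prod_{i=1}^{N} \theta^{x_i} (1-\theta)^{1-x_i} \\
\log L(\theta) &= \Big(\sum_{i=1}^{N} x_i\Big) \log \theta + \Big(N - \sum_{i=1}^{N} x_i\Big) \log (1-\theta) \\
\frac{d \log L}{d\theta} &= \frac{\sum_{i} x_i}{\theta} - \frac{N - \sum_{i} x_i}{1-\theta} = 0 \\
\hat{\theta} &= \frac{1}{N} \sum_{i=1}^{N} x_i
\end{aligned}
$$

Under these assumptions the maximum likelihood estimate is the fraction of tosses that came up 1.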

The Bernoulli model is a natural fit for click-through and conversion rates:
- Probability that user clicks on a link
- Probability user buys a product
- BINARY EVENTS!

Formulas:
<br>
Click-Through Rate (CTR) = $\frac{\text{number of clicks}}{\text{number of impressions}}$

Conversion Rate = $\frac{\text{number of desired actions}}{\text{number of page visits}}$
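To connect these ratios back to maximum likelihood, the sketch below simulates binary click data and checks that the CTR matches the theta that maximizes the Bernoulli log-likelihood on a grid. The data are simulated, and `true_theta`, the sample size, and the grid resolution are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 10,000 impressions with an assumed true click probability.
true_theta = 0.1
clicks = rng.random(10_000) < true_theta   # True = click, False = no click

n_impressions = clicks.size
n_clicks = clicks.sum()

# CTR = number of clicks / number of impressions (the mean of the 0/1 outcomes).
ctr = n_clicks / n_impressions

# Brute-force check: evaluate the Bernoulli log-likelihood on a grid of theta.
thetas = np.linspace(0.001, 0.999, 999)
log_lik = n_clicks * np.log(thetas) + (n_impressions - n_clicks) * np.log(1 - thetas)
theta_best = thetas[np.argmax(log_lik)]

print(ctr, theta_best)
```

The grid maximizer should agree with the CTR up to the grid spacing of 0.001, which suggests that computing a CTR is the same as computing the Bernoulli maximum likelihood estimate.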

#### The Gaussian Distribution
Also known as the normal distribution

This can be used for ratings or any other outcome that is not binary, such as the time a user spends on a webpage.
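For a Gaussian, the maximum likelihood estimates are the sample mean and the sample standard deviation computed with `ddof=0`. A small sketch, using simulated time-on-page data with made-up parameters, compares those against `scipy.stats.norm.fit`:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical data: time on page in seconds, drawn from an assumed Gaussian.
samples = rng.normal(loc=60, scale=15, size=1000)

# norm.fit returns maximum likelihood estimates of (loc, scale).
mu_hat, sd_hat = norm.fit(samples)

print(mu_hat, samples.mean())   # the two means should agree
print(sd_hat, samples.std())    # the two standard deviations should agree (ddof=0)
```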



# Probability Review In Code

In [3]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
np.random.seed(0)

In [4]:
mu = 170  # mean height in cm
sd = 7    # standard deviation in cm

In [5]:
# generate samples from our distribution
x = norm.rvs(loc=mu, scale=sd, size=100)

In [6]:
x.mean()

170.41865610874137

In [7]:
# NumPy's default is the population variance (ddof=0)
x.var()

49.77550434153163

In [8]:
x.std()

7.055175713016057

In [9]:
# at what height are you in the 95th percentile?
norm.ppf(.95, loc=mu, scale=sd)

181.5139753886603

In [10]:
# if you are 160 cm tall, what percentile are you in? (cdf returns a fraction between 0 and 1)
norm.cdf(160, loc=mu, scale=sd)

0.07656372550983476
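The two cells above use `ppf` and `cdf`, which are inverses of each other: one maps a probability to a height, the other maps a height back to a probability. A quick round-trip sketch with the same `mu` and `sd` (the names `h95` and `p160` are just for illustration):

```python
from scipy.stats import norm

mu, sd = 170, 7  # same parameters as the cells above

# ppf maps a probability to a height; cdf maps it back.
h95 = norm.ppf(0.95, loc=mu, scale=sd)
print(norm.cdf(h95, loc=mu, scale=sd))

# cdf maps a height to a probability; ppf maps it back.
p160 = norm.cdf(160, loc=mu, scale=sd)
print(norm.ppf(p160, loc=mu, scale=sd))
```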

In [11]:
# if you are 180 cm tall, what is the probability that someone is taller than you?
1 - norm.cdf(180, loc=mu, scale=sd)

0.07656372550983481

In [12]:
# sf is the survival function, 1 - cdf, so this should match the cell above
norm.sf(180, loc=mu, scale=sd)

0.07656372550983476
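At 180 cm the two approaches agree to many digits, but far out in the tail they can differ. The sketch below uses a deliberately extreme, hypothetical height to show why `sf` may be preferable to `1 - cdf` for very small tail probabilities.

```python
from scipy.stats import norm

mu, sd = 170, 7

# A hypothetical height far out in the right tail (more than 11 sd above the mean).
height = 250

tail_by_subtraction = 1 - norm.cdf(height, loc=mu, scale=sd)
tail_by_sf = norm.sf(height, loc=mu, scale=sd)

print(tail_by_subtraction)  # cdf rounds to 1.0 in floating point, so this is 0.0
print(tail_by_sf)           # a tiny but nonzero probability
```

Subtracting from 1 loses the tail once the CDF is within floating-point precision of 1, while `sf` computes the tail directly.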