# Bayesian Stats
* Why does prior probability matter? 


### Objectives
* Use pandas to manipulate data and compute probabilities
* Apply Bayes' formula to solve conditional probability problems

### Outline
* Review Bayes' Theorem
* Multi-Armed Bandit
* Example
* Activities

# Bayes' Formula

**P(A|B) = Probability of A, given B.**

![](images/bayes-formula.png)

$$ \large \text{Posterior} = \dfrac{\text{Likelihood} \cdot \text{Prior}}{\text{Evidence}}$$

- posterior: probability it rains, given it is cloudy
- likelihood: probability it is cloudy, given it rains
- prior: probability it rains
- evidence: probability it is cloudy
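
The rain-and-clouds breakdown above can be sketched in a few lines of Python. The helper name `posterior` and all three input probabilities are hypothetical, chosen only to illustrate the arithmetic; they are not measured values.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' formula: P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence <= 0:
        raise ValueError("evidence must be a positive probability")
    return likelihood * prior / evidence

# Hypothetical numbers, for illustration only
p_cloudy_given_rain = 0.9  # likelihood: P(cloudy | rain)
p_rain = 0.2               # prior: P(rain)
p_cloudy = 0.4             # evidence: P(cloudy)

p_rain_given_cloudy = posterior(p_cloudy_given_rain, p_rain, p_cloudy)
print(round(p_rain_given_cloudy, 2))  # 0.45
```

With these assumed inputs, seeing clouds raises the probability of rain from the prior of 0.2 to a posterior of about 0.45, which is one way to see why the prior matters.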

## Exploration - gather more information that might lead to better decisions in the future
## Exploitation - make the best decision given current information 

### Multi-Armed Bandit

![](images/bandit1.png)

![](images/bandit3.png)

https://cxl.com/blog/bandit-tests/
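
One common way to balance exploration and exploitation is an epsilon-greedy strategy. This is a minimal sketch, not necessarily the method used in the linked article: the function name `epsilon_greedy`, the payout rates, and the parameters `epsilon`, `n_rounds`, and `seed` are all assumptions made for illustration.

```python
import random

def epsilon_greedy(true_rates, n_rounds=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit; return (pulls, wins) per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    wins = [0] * n_arms
    for _ in range(n_rounds):
        if rng.random() < epsilon or 0 in pulls:
            # Exploration: try a random arm (also used until every arm has data)
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm with the best observed win rate
            arm = max(range(n_arms), key=lambda i: wins[i] / pulls[i])
        reward = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
    return pulls, wins

# Hypothetical payout rates for three arms (unknown to the algorithm)
pulls, wins = epsilon_greedy([0.05, 0.10, 0.30])
for arm, (n, w) in enumerate(zip(pulls, wins)):
    print(f"arm {arm}: pulled {n} times, observed rate {w / max(n, 1):.3f}")
```

A larger `epsilon` spends more rounds gathering information; a smaller one commits sooner to whichever arm currently looks best.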

# Benefit
![](images/bandit2.jpg)

# Question
A fair die is rolled. What is the probability that it is a 2, given that it is even?

![](images/bayes-formula.png)

In [2]:
# p(2 | even)

# p(even | 2)
p_even_2 = 1

# p(2)
p_2 = 1/6

# p(even)
p_even = 1/2

In [3]:
(p_even_2 * p_2) / p_even

0.3333333333333333


$$P(2 \mid \text{even}) = \frac{P(\text{even} \mid 2)\,P(2)}{P(\text{even})}$$

$$P(2 \mid \text{even}) = \frac{(1.0)(1/6)}{1/2}$$

$$P(2 \mid \text{even}) = 0.333\ldots$$

$$P(2 \mid \text{even}) \approx 33\%$$
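
As a sanity check, the same answer can be obtained by enumerating the six faces and counting directly. This sketch uses the standard-library `fractions` module to keep the result exact; the variable names are illustrative.

```python
from fractions import Fraction

outcomes = range(1, 7)                          # faces of a fair die
even = [o for o in outcomes if o % 2 == 0]      # the condition: 2, 4, 6
two_and_even = [o for o in even if o == 2]      # the event within the condition

# Direct count: favorable outcomes over outcomes satisfying the condition
p_direct = Fraction(len(two_and_even), len(even))

# Bayes' formula with exact fractions: P(even|2) * P(2) / P(even)
p_bayes = Fraction(1) * Fraction(1, 6) / Fraction(1, 2)

print(p_direct, p_bayes)  # 1/3 1/3
```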

### Scenario 1

You are given an array of points and their labels (see below). A point is chosen at random. What is the probability that the point is less than 5, given that its label is 0?

In [4]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt

df = pd.read_csv("bayes_data.csv")
df.head(3)

Unnamed: 0,array,labels
0,8.206975,0
1,5.543411,0
2,6.127242,0


![](images/bayes-formula.png)

In [20]:
df[df['array'] < 5]['labels'].value_counts(normalize = True).loc[0]

0.043478260869565216

In [21]:
# Goal: p(array < 5 | label = 0)

# p(array < 5)
p_less_than_5 = len(df[df['array'] < 5]) / len(df)

# p(label = 0)
p_0 = (df.labels == 0).mean()

# p(label = 0 | array < 5)
p_0_less_than_5 = df[df['array'] < 5]['labels'].value_counts(normalize = True).loc[0]

In [24]:
(p_0_less_than_5 * p_less_than_5) / p_0 

0.04

In [31]:
df[(df['labels'] == 0) & (df['array'] < 5)]

Unnamed: 0,array,labels
14,4.726106,0
43,3.276684,0
63,4.951455,0
87,4.234923,0


In [33]:
len(df[df['labels'] == 0])

100
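
The last two cells confirm the Bayes result by counting: 4 of the 100 points labelled 0 are below 5, which gives 0.04. The pattern can be wrapped in a small helper. This is a sketch on a tiny made-up DataFrame (`toy`), not on `bayes_data.csv`; the helper name `conditional_probability` is hypothetical.

```python
import pandas as pd

def conditional_probability(event, given):
    """P(event | given) for two boolean Series over the same rows."""
    return (event & given).sum() / given.sum()

# Made-up data, for illustration only
toy = pd.DataFrame({
    "array":  [3.0, 4.5, 6.0, 7.5, 2.0, 8.0],
    "labels": [0, 0, 0, 0, 1, 1],
})
less_than_5 = toy["array"] < 5
label_0 = toy["labels"] == 0

# Direct count: 2 of the 4 rows with label 0 are below 5
direct = conditional_probability(less_than_5, label_0)

# Bayes' formula: P(label=0 | array<5) * P(array<5) / P(label=0)
likelihood = conditional_probability(label_0, less_than_5)
prior = less_than_5.mean()
evidence = label_0.mean()
via_bayes = likelihood * prior / evidence

print(round(direct, 6), round(via_bayes, 6))
```

Both routes should agree up to floating-point rounding, which makes the direct count a convenient check on the Bayes calculation.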

### Scenario 2

Using the iris dataset from sklearn, what is the probability that a flower has a sepal length greater than 6.3, given that the flower is an Iris-Versicolour (label=1)?

[Google Colab Solution](https://colab.research.google.com/drive/1Q8kqYwpAmSLwCr_yrdBe3rcGbPdc2fVL?usp=sharing)

In [45]:
import pandas as pd 

from sklearn.datasets import load_iris

iris = load_iris()
data = iris.data
target = iris.target
feature_names = iris.feature_names

df = pd.DataFrame(data, columns = feature_names)
df['target'] = target
df.head()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),target
0,5.1,3.5,1.4,0.2,0
1,4.9,3.0,1.4,0.2,0
2,4.7,3.2,1.3,0.2,0
3,4.6,3.1,1.5,0.2,0
4,5.0,3.6,1.4,0.2,0


In [46]:
# Goal: p(sepal_length > 6.3 | flower = versicolour (1))

# p(sepal_length > 6.3)
p_sl_63 = (df['sepal length (cm)'] > 6.3).mean()

# p(versicolour)
p_versi = df['target'].value_counts(normalize = True).loc[1]

# p(versicolour | sepal_length > 6.3)
p_versi_sl_63 = df[df['sepal length (cm)'] > 6.3]['target'].value_counts(normalize = True).loc[1]

In [47]:
(p_versi_sl_63 * p_sl_63) / p_versi

0.22000000000000006

In [48]:
len(df[(df['sepal length (cm)'] > 6.3) & (df['target'] == 1)]) / len(df[df['target'] == 1])

0.22

## Additional Resource
- https://www.countbayesie.com/blog/2015/2/18/hans-solo-and-bayesian-priors

Given an array of integers, return a new array with each value doubled.

For example:

[1, 2, 3] --> [2, 4, 6]

In [34]:
arr = [1, 2, 3]

In [36]:
[2*x for x in arr]

[2, 4, 6]

In [37]:
a = np.array([1,2,3]) 

In [42]:
%%timeit
b = a*2
b

779 ns ± 36.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [41]:
%%timeit
list(map(lambda x: x * 2, arr))

609 ns ± 18.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [44]:
def multiply_2(val):
    return val * 2

df['array1'] = df['array'].map(multiply_2)
df.head()

Unnamed: 0,array,labels,array1
0,8.206975,0,16.41395
1,5.543411,0,11.086821
2,6.127242,0,12.254483
3,6.583582,0,13.167164
4,8.69053,0,17.38106
