# 0. Introduction
**Before we start:** this environment, which lets us enter text and run code interactively, is called a ***[notebook](https://jupyter.org/)***.

There are two types of cells: *Text* and *Code*. You can add your own cells, and you can edit a text cell by double-clicking on it. Text cells follow the [Markdown rules](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet).

In order to execute (run) a cell, you can use one of the following ways:

0. `Shift + Enter`: executes a cell and moves to the next one.
1. `Ctrl + Enter`: executes a cell but stays on it. This is equivalent to clicking the *run* button to the left of the cell, which appears when you hover the mouse over the `[ ]` icon.
2. Use the `Runtime` tab (at the top of the page), which also gives you more options.

In [None]:
# import libraries we will be using
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sn
import scipy.stats as stats
from sklearn import datasets

In this and subsequent labs, we will be using the libraries imported above among others.

You are strongly encouraged to refer to the API documentation of each library and familiarize yourselves with the available classes and methods.

# 1. Bayes Rule
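Bayes rule is widely used for inference. It gives the posterior probability of A given B in terms of the likelihood of B given A and the prior probabilities of A and B:
\begin{equation}
P(\textrm{A} \mid \textrm{B}) =
\frac{P(\textrm{B} \mid \textrm{A}) \times P(\textrm{A})}{P(\textrm{B})}
\end{equation}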

In [None]:
def bayes_rule(likelihood: float, prior_a: float, prior_b: float) -> float:
  """
  Bayes rule calculation.
  P(A|B) = (P(B|A) * P(A)) / P(B)

  Given the likelihood i.e. P(B|A), the prior probability of A (P(A)) and the 
  prior probability of B (P(B)), the method calculates the posterior probability
  P(A|B).
  Args:
    likelihood: the likelihood, 0 <= P(B|A) <= 1
    prior_a: the prior probability of A, 0 <= P(A) <= 1
    prior_b: the prior probability of B, 0 < P(B) <= 1
  Returns:
    The posterior probability P(A|B)
  """
  assert 0 <= likelihood <= 1, "Probability should be between 0 and 1"
  assert 0 <= prior_a <= 1, "Probability should be between 0 and 1"
  # P(B) is the denominator, so it must be strictly positive
  assert 0 < prior_b <= 1, "P(B) should be greater than 0 and at most 1"

  posterior = likelihood * prior_a / prior_b
  return posterior

## 1.1 Revisit the exercise from the lecture notes!


A bank asks a machine learning engineer to build a system that decides whether loan applications should be approved. Of all loan applications, 0.1% will not be repaid.

The bank installs a system that recommends whether the loan should be approved or not.

The system correctly identifies loans that will not be repaid 99% of the time.

Similarly, the system will correctly identify loans that will be repaid 99% of the time.

What is the probability that a loan will not be repaid given that the system has identified it as such?

In [None]:
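# Naming: "nr" = loan is not repaid, "r" = loan is repaid;
# "neg" = system flags the loan as not repaid, "pos" = system flags it as repaid.
p_nr = 0.001 # P(nr): 0.1% of all loan applications will not be repaid
p_r = 0.999 # P(r) = 1 - p_nr

p_neg_nr = 0.99 # P(neg|nr): correctly identify loans that will not be repaid, 99% of the time
p_pos_nr = 0.01 # P(pos|nr) = 1 - p_neg_nr

p_pos_r = 0.99 # P(pos|r): correctly identify loans that will be repaid, 99% of the time
p_neg_r = 0.01 # P(neg|r) = 1 - p_pos_r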

In [None]:
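# P(neg) via the law of total probability: sum over repaid and not-repaid loans
p_neg = p_neg_r * p_r + p_neg_nr * p_nr
# P(nr|neg): probability a loan is not repaid given that the system flagged it
p_nr_neg = bayes_rule(likelihood=p_neg_nr, prior_a=p_nr, prior_b=p_neg)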
print(p_nr_neg)
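
Written out, the calculation above combines the law of total probability with Bayes rule. The symbols are shorthand introduced for this note only: $F$ is the event that the system flags a loan as one that will not be repaid, $NR$ is the event that the loan is not repaid, and $R$ is the event that it is repaid.

\begin{equation}
\begin{aligned}
P(F) &= P(F \mid R)\,P(R) + P(F \mid NR)\,P(NR) = 0.01 \times 0.999 + 0.99 \times 0.001 = 0.01098 \\
P(NR \mid F) &= \frac{P(F \mid NR)\,P(NR)}{P(F)} = \frac{0.00099}{0.01098} \approx 0.09
\end{aligned}
\end{equation}

So even with a system that is right 99% of the time, only about 9% of flagged loans would actually go unpaid, because loans that are not repaid are so rare to begin with.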

## 1.2 Ex. 1: The Sally Clark Case (aka prosecutor's fallacy)

Sally Clark, a lawyer who lost her first son at 11 weeks and her second at 8 weeks, was convicted in 1999 of murdering both.


The chance of one random infant dying from SIDS was about **1 in 1,300** during this period in Britain.

The estimated odds of a second SIDS death in the same family were much higher, perhaps **1 in 100**, because family members can share a common environmental or genetic propensity for SIDS.

About **30 children out of 650,000** annual births in England, Scotland, and Wales were known to have been murdered by their mothers.

Double murders must be much rarer, estimated to be **10 times less likely**.


What is the probability of SIDS for both children?

What is the probability that a random pair of siblings dies suddenly and unexpectedly but not from SIDS?

What is the probability that the cause of death was SIDS, given their unexplained deaths?

Which scenario is more likely?

(Hint: Probability of death given SIDS is 1)

In [None]:
### your calculations here ###

## 1.3 Ex. 2
Company A supplies 40% of the computers sold and is late 5% of the time. 

Company B supplies 30% of the computers sold and is late 3% of the time. 

Company C supplies another 30% and is late 2.5% of the time. 

A computer arrives late. What is the probability that it came from Company A?


In [None]:
### your calculations here ###

# 2. Gaussian Distribution
The Gaussian distribution, also known as the normal distribution, has probability density function (pdf):
\begin{equation}
f(x) = \frac{1}{\sigma\sqrt{2\pi}} 
  \exp\left( -\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{\!2}\,\right)
\end{equation}
where μ is the mean and σ is the standard deviation.
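
As a sanity check on the formula, the sketch below evaluates the density directly and compares it against `scipy.stats.norm.pdf`. The helper name `gaussian_pdf` and the chosen values of the mean and standard deviation are illustrative assumptions, not part of the lab.

```python
import numpy as np
from scipy import stats


def gaussian_pdf(x, mean=0.0, std=1.0):
    """Evaluate the Gaussian density f(x) for the given mean and standard deviation."""
    z = (x - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2 * np.pi))


# Illustrative parameters (hypothetical): mean 1, standard deviation 2
x = np.linspace(-5, 5, 11)
manual = gaussian_pdf(x, mean=1.0, std=2.0)
reference = stats.norm.pdf(x, loc=1.0, scale=2.0)

# The two should agree up to floating-point error
print(np.allclose(manual, reference))
```

Note that the exponent divides by $2\sigma^2$ as a whole, so the denominator needs its own parentheses when written on a single line of code.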

In [None]:
mean = 0; std = 1; variance = np.square(std)
x = np.arange(-5,5,.01)
# Parenthesise 2*variance so the exponent is -(x - mean)^2 / (2 * sigma^2)
f = np.exp(-np.square(x - mean) / (2 * variance)) / np.sqrt(2 * np.pi * variance)

plt.plot(x,f)
plt.ylabel('gaussian distribution')
plt.show()

Plot a few distributions with different μ and σ.

What do you observe?

In [None]:
### your code here ###

# 3. Independent Random Variables
The sample covariance of two random variables (RVs) $X$ and $Y$ is:
\begin{equation}
  \textrm{Cov}(X,Y)=\frac{\sum_{i=1}^{N}(x_{i}-\bar{x})(y_{i}-\bar{y})}{N-1}
\end{equation}

and Pearson's Correlation Coefficient is:

\begin{equation}
  \textrm{PCC}(X,Y) = \frac{\textrm{Cov}(X,Y)}{\sigma_x \sigma_y}
\end{equation}

Write two functions, one for each formula.
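
One way to check your own implementations is to compare them against NumPy's built-ins. The sketch below uses made-up toy data and the variable names `ref_cov` and `ref_pcc`, which are illustrative only; it is a reference check, not the solution to the exercise.

```python
import numpy as np

# Toy data, made up purely for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# np.cov returns the 2x2 covariance matrix; ddof=1 gives the N-1 denominator
ref_cov = np.cov(x, y, ddof=1)[0, 1]

# np.corrcoef returns the 2x2 correlation matrix
ref_pcc = np.corrcoef(x, y)[0, 1]

print(ref_cov, ref_pcc)
```

If your PCC does not match, check the standard deviations: `np.std` uses `ddof=0` by default, so pass `ddof=1` to stay consistent with the $N-1$ denominator in the covariance.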

In [None]:
def COV(x, y):
  cov = 0
  ### your code here
  return cov

def PCC(x, y):
  pcc = 0
  ### your code here
  return pcc

Below are annual data on the number of people who drowned by falling into a pool and the number of films Nicolas Cage appeared in.

Using matplotlib, plot the data below as time series (hint: check the matplotlib [documentation](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html)).

In [None]:
year = np.arange(1999, 2010)
cage_films = np.array([2, 2, 2, 3, 1, 1, 2, 3, 4, 1, 4])
drownings = np.array([109, 102, 102, 98, 85, 95, 96, 98, 123, 94, 102])

In [None]:
### your code here

Using the methods for COV and PCC, calculate the correlation coefficient between annual drownings in pools and Nic Cage films. Does it look like a strong correlation? What conclusion can be drawn about correlation vs. causation for independent random variables?

In [None]:
### your code here

Modify the COV and PCC methods to handle n-dimensional data.

In [None]:
def COV(x: np.ndarray, y: np.ndarray) -> np.ndarray:
  ### your code here
  return

def PCC(x: np.ndarray, y: np.ndarray) -> np.ndarray:
  ### your code here
  return

For a more complex example, we will use the [iris dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_iris.html#sklearn.datasets.load_iris) from sklearn.

In [None]:
iris_db = datasets.load_iris(as_frame=True)
sn.pairplot(iris_db.data)

Using the COV and PCC methods written above, write your own code to produce the covariance and correlation matrices of the data features, then use the seaborn [heatmap](https://seaborn.pydata.org/generated/seaborn.heatmap.html?highlight=heatmap#seaborn.heatmap) method to visualize the output (hint: each matrix will have shape $num\_feats \times num\_feats$).

In [None]:
### your code here

Let's focus on `petal length` and `petal width`.

Using `numpy`, make a 2D histogram of the two features with `8x8` bins and visualize it using `matplotlib`.

In [None]:
data = iris_db.data[['petal length (cm)', 'petal width (cm)']]
### your code here
plt.legend()
plt.show()

Using the concepts learnt in the lectures and the histogram generated, complete the following:


*   Compute the joint probability matrix
*   Compute the conditional probability matrix
*   Compute the marginal probabilities of the two features (2 vectors of size 8)
*   Fit a Gaussian to each dimension and plot it along with the marginal probability vectors
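
To illustrate how these quantities relate to a table of counts, here is a minimal sketch on a hypothetical 2x3 count matrix (not the iris histogram). All variable names and numbers are made up for illustration.

```python
import numpy as np

# Hypothetical 2x3 table of counts (rows: bins of X, columns: bins of Y)
counts = np.array([[4.0, 2.0, 0.0],
                   [1.0, 3.0, 10.0]])

joint = counts / counts.sum()          # joint probabilities P(X=i, Y=j)
p_x = joint.sum(axis=1)                # marginal P(X=i), summing over Y
p_y = joint.sum(axis=0)                # marginal P(Y=j), summing over X
cond_y_given_x = joint / p_x[:, None]  # P(Y=j | X=i); each row sums to 1

# Independence would require P(X, Y) = P(X) P(Y) in every cell
independent = np.allclose(joint, np.outer(p_x, p_y))
print(independent)
```

With a real histogram some rows or columns may be empty, in which case the corresponding marginal is zero and the conditional probability is undefined, so the division needs extra care there.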





In [None]:
### your code here


Are petal length and width independent? Are they correlated?