# LSE ST451: Bayesian Machine Learning
## Author: Kostas Kalogeropoulos

## Week 1: Bayesian Inference Concepts

Topics covered

 - Introduction to Python, e.g. working with arrays, basic operations and plotting
 - Pseudo-Random numbers
 - Bayesian Inference (Point and Interval Estimation, Forecasting) with Monte Carlo

### Basic operations in Python

In this first session we cover basic math operations. 

We start with an example of basic arithmetic.

In [None]:
a=3
b=2*a
print(b)
print(a*b)

Next we will load the library **NumPy**

In [None]:
import numpy as np

and see how we can handle 1d arrays

In [None]:
a = np.array([0, 1, 2, 3])
print(a,a.ndim, a.shape, len(a))

and 2d arrays

In [None]:
b = np.array([[0, 1, 2], [3, 4, 5]])
print(b,b.ndim, b.shape, len(b))

We continue with some standard commands to create arrays and perform operations on them. The command `np.linspace(start, stop, N)` returns N evenly spaced values in the interval [start, stop], including both endpoints.

In [None]:
x = np.linspace(0, 3, 20)
y = np.linspace(0, 9, 20)
print('x',x)
print('y',y)

Some commands to access 1d array elements and compute sums and averages on them

In [None]:
y[0], y[2], y[-1], y[y>7], y[0:3], np.mean(y), np.sum(y[y>8])

Some commands to create 2d arrays

In [None]:
np.ones((3, 3))

In [None]:
np.zeros((2, 2))

In [None]:
np.eye(3)

In [None]:
a=np.diag(np.array([1, 2, 3, 4]))
print('a',a)

Some operations to access 2d array elements and compute sums and averages on them

In [None]:
a[0,0], a[2,1], a[3,3], a[2], a[:,1], a[a>2], np.sum(a), np.mean(a[3]), np.sum(a[a>1])

### Pseudo-random numbers

Define a sequence $\{s_i\}$ by choosing integers $a$, $b$, $M$ and a starting value (seed) $s_0$, and setting
$$
s_{i+1} = (a\, s_i + b) \bmod M.
$$

Then $U_i = \frac{s_i}{M}$ behaves approximately like a draw from Uniform$(0,1)$.

For large $M$ and suitably chosen $a$ and $b$, the numbers obtained by the algorithm above behave, for practical purposes, like random samples from a Uniform$(0,1)$ distribution, even though they are generated deterministically. They are called pseudo-random numbers.
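
The recursion above can be sketched in a few lines. The constants used here ($a=1664525$, $b=1013904223$, $M=2^{32}$) are a commonly quoted textbook choice and are an assumption for illustration, not values taken from the course; the function name `lcg_uniform` is also hypothetical.

```python
import numpy as np

def lcg_uniform(n, seed=12345, a=1664525, b=1013904223, M=2**32):
    """Return n pseudo-random numbers in [0, 1) from a linear congruential generator."""
    s = seed
    u = np.empty(n)
    for i in range(n):
        s = (a * s + b) % M   # s_{i+1} = (a s_i + b) mod M
        u[i] = s / M          # U_i = s_i / M
    return u

u = lcg_uniform(10000)
print('first 4 numbers', u[:4])
print('min and max', np.min(u), np.max(u))
print('mean', np.mean(u))     # should be close to 0.5
```

Calling `lcg_uniform` twice with the same seed returns exactly the same numbers, which is what makes the sequence pseudo-random rather than random.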

 Given pseudo-random numbers from Uniform($0,1$) we can simulate from other known distributions.
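
One standard way to do this is the inverse transform method: if $U\sim$ Uniform$(0,1)$ and $F$ is a continuous cdf, then $F^{-1}(U)$ has cdf $F$. The sketch below applies it to an Exponential distribution, where $F^{-1}(u) = -\log(1-u)/\text{rate}$. The choice of distribution, the value of `rate` and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded generator, so the run is reproducible
rate = 2.0                          # assumed rate of the Exponential distribution
u = rng.random(100000)              # Uniform(0, 1) draws in [0, 1)
x = -np.log(1 - u) / rate           # inverse cdf of Exponential(rate) applied to u
print('sample mean', np.mean(x), 'exact mean', 1 / rate)
```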

In [None]:
a = np.random.rand(4)       # Uniform(0, 1)
print('4 uniform random numbers',a)
b = np.random.randn(10)      # Normal(0,1)
print('10 normal random numbers',b)  

Note that each of you will get different numbers. Also, if you rerun the cell you will get another set of numbers.

Individually, everything seems to be different, but collectively there is something common in all these numbers: they come from the same distribution.

### Bayesian Inference with Monte Carlo 

We will draw 100 points from a Uniform$(0,1)$ distribution and use the **sample** mean, variance, standard deviation and median to estimate their population counterparts.

In [None]:
n=100
x = np.random.rand(n)
print('mean',np.mean(x),np.sum(x)/n)
print('variance',np.var(x),np.mean(x**2)-np.mean(x)**2)
print('standard deviation',np.std(x))
print('median',np.median(x))

The calculations seem to be in the right direction but there is Monte Carlo error. This can be reduced to an arbitrarily small level by simply drawing more random numbers.
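
To see how quickly the error shrinks, the sketch below estimates the mean of Uniform$(0,1)$, whose exact value is $1/2$, for increasing $n$. It also reports the estimated Monte Carlo standard error, taken here as the sample standard deviation divided by $\sqrt{n}$. The sample sizes and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(451)    # seeded generator, so the run is reproducible
sizes = [100, 10000, 1000000]
estimates, std_errors = [], []
for n in sizes:
    x = rng.random(n)
    estimates.append(np.mean(x))
    # estimated Monte Carlo standard error of the sample mean
    std_errors.append(np.std(x, ddof=1) / np.sqrt(n))
    print('n', n, 'estimate', estimates[-1],
          'abs error', abs(estimates[-1] - 0.5), 'std error', std_errors[-1])
```

Multiplying $n$ by 100 should reduce the standard error by a factor of roughly 10.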

In [None]:
n=100000
x = np.random.rand(n)
print('mean',np.mean(x),np.sum(x)/n)
print('variance',np.var(x),np.mean(x**2)-np.mean(x)**2)
print('standard deviation',np.std(x))
print('median',np.median(x))

By the way, did you know the variance of the Uniform[0,1] distribution?

If you did, well done!

If not, don't worry, you can always find it with Monte Carlo :)
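
For reference, the exact value follows from a short calculation, which can be contrasted with the Monte Carlo output:

$$
\text{Var}(U) = E[U^2] - (E[U])^2 = \int_0^1 u^2\,du - \left(\int_0^1 u\,du\right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12} \approx 0.0833.
$$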

You can also find percentiles, minimums and maximums.

In [None]:
n=10000
x = np.random.rand(n)
print('min and max',np.min(x),np.max(x))
x = np.random.randn(n)
print('95% credible intervals',np.percentile(x,(2.5,97.5)))

If you all want to get the same results, or if you want to make sure you get the same results each time, you can fix the random seed.

In [None]:
np.random.seed(1234)  
np.random.randn(4)
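
Recent versions of NumPy also offer a generator object that carries its own seed, so no global state is changed. A minimal sketch, assuming NumPy 1.17 or later; the seed value is arbitrary, and the numbers will differ from those produced by `np.random.seed(1234)` above because a different algorithm is used.

```python
import numpy as np

rng1 = np.random.default_rng(1234)   # first generator
rng2 = np.random.default_rng(1234)   # second generator with the same seed
a = rng1.standard_normal(4)
b = rng2.standard_normal(4)
print(a)
print(np.array_equal(a, b))          # True: same seed, same numbers
```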

### Plotting in Python with Matplotlib

Finally, time for some plotting. We start with line and scatter plots.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
n = 10
x = np.linspace(0,1,n)
y = 2-3*x
plt.plot(x, y)

In [None]:
plt.plot(x, y,'o')

In [None]:
plt.plot(x, y)
plt.plot(x, y,'o')

and continue with histograms

In [None]:
n=400
y = np.random.randn(n) 
plt.hist(y,density=1)

### Density estimation

We estimate a pdf using a kernel density estimator (KDE):

- Draw a sample of $n$ observations from a density $f$, and call the sample $y$.
- Plot a histogram of $y$.
- Fit a KDE to the sample $y$.
- Obtain the actual density $f$.
- Plot the actual and the estimated pdf.

In [None]:
from scipy import stats
from scipy.stats import norm
from scipy.stats import gamma
from scipy.stats import beta

In [None]:
n=400
y = np.random.randn(n) 
estf = stats.gaussian_kde(y)
x = np.linspace(-4,4,n)
f = norm.pdf(x)
plt.hist(y,density=1)                   #histogram of sample
plt.plot(x,estf(x),label='KDE')         #Kernel Density Estimate
plt.plot(x,f,color='r',label='True Density')  #Density of Normal distribution 
plt.legend()
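
To make explicit what the estimator computes, the sketch below builds a Gaussian KDE by hand: the estimate at a point is the average of normal densities centred at each observation, scaled by a bandwidth `h`. It assumes the default bandwidth of `gaussian_kde` (Scott's rule, which for 1d data is the sample standard deviation times $n^{-1/5}$); under that assumption the two curves should agree up to floating-point error. The names `manual` and `builtin` are illustrative.

```python
import numpy as np
from scipy import stats
from scipy.stats import norm

rng = np.random.default_rng(1)           # seeded generator, so the run is reproducible
y = rng.standard_normal(400)             # sample from the true density
n = len(y)
h = np.std(y, ddof=1) * n ** (-1 / 5)    # Scott's rule bandwidth for 1d data
x = np.linspace(-4, 4, 200)

# Average of Gaussian kernels centred at each observation, evaluated on the grid x
manual = norm.pdf((x[:, None] - y[None, :]) / h).mean(axis=1) / h
builtin = stats.gaussian_kde(y)(x)
print('largest difference', np.max(np.abs(manual - builtin)))
```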

### Activities

Now let's put it all together with some activities relevant to Bayesian Inference

#### Activity 1

Assume that the posterior of the variable of interest $\theta$, $\pi(\theta|y)$, is the Gamma($2,1$) distribution. Answer the questions below using Monte Carlo. Verify by contrasting with the exact answers where possible.

- Provide the Bayes estimate for $\theta$.
- Give a 95\% credible interval for $\theta$.
- Calculate the posterior variance, $E[\theta^3|y]$ and $P(\theta>3|y)$.
- Give a kernel density plot for its pdf. 
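
One possible way to approach the numerical parts of this activity is sketched below. It assumes that the Bayes estimate is the posterior mean (quadratic loss) and that the credible interval is the equal-tailed one; the number of draws and the seed are arbitrary. The exact values come from `scipy.stats.gamma`. The kernel density plot can be produced as in the Density estimation section.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(451)                       # seeded generator
theta = rng.gamma(shape=2.0, scale=1.0, size=200000)   # draws from the Gamma(2,1) posterior
post = gamma(a=2.0, scale=1.0)                         # same distribution, for exact answers

print('posterior mean', np.mean(theta), 'exact', post.mean())
print('95% credible interval', np.percentile(theta, (2.5, 97.5)),
      'exact', post.ppf([0.025, 0.975]))
print('posterior variance', np.var(theta), 'exact', post.var())
print('E[theta^3|y]', np.mean(theta**3), 'exact', post.moment(3))
print('P(theta>3|y)', np.mean(theta > 3), 'exact', post.sf(3))
```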


#### Activity 2

Assume that the posterior of the variable of interest $\theta$, $\pi(\theta|y)$, is the Beta($2,7$) distribution. Answer the questions below using Monte Carlo. Verify by contrasting with the exact answers where possible.

- Provide the Bayes estimate for $\theta$.
- Give a 95\% credible interval for $\theta$.
- Calculate the posterior variance, $E[1/\theta|y]$ and $P(\theta<0.2|y)$.
- Give a kernel density plot for its pdf. 



#### Activity 3 - Bank Cashiers

A bank manager wants to decide whether she should employ additional cashiers at her branch. This is usually determined by the number of customers visiting the branch on an average day.

**Sample:** An experiment is conducted in which the number of customers visiting the branch is recorded on $20$ random days. The counts are shown below:

$103, 115,  94, 102, 108, 108,  92, 113, 109,  89,$

$96, 106, 118, 104, 116, 106, 104, 100,  98, 114$.

1. Split the data into a training set (first 10 observations) and a test set (remaining observations).
2. Assign a Poisson model to the training set and use it to obtain point forecasts for the test set. Use both Frequentist and Bayesian inference and compare their performance.
3. Repeat the previous step but with 95\% prediction intervals rather than point forecasts. 
4. For the frequentist approach one can estimate the Poisson parameter $\lambda$ via the MLE $\hat{\lambda}$, which is the sample mean of the training data. The distribution of the future data can then be taken to be Poisson($\hat{\lambda}$).
5. Since there is no prior information, the Gamma($0.001,0.001$) distribution can be used as a prior. See lecture slides for the posterior.
6. The Mean Squared Error can be used to assess the point forecasts. For the prediction intervals we can just count the number of test observations contained in them.
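
The steps above can be sketched as follows. This is one possible approach, not a model answer. It assumes the Gamma prior is in the shape and rate parameterisation, so that the standard conjugate result gives a Gamma($0.001+\sum y_i$, $0.001+n$) posterior (check this against the lecture slides). It also assumes that the Bayesian forecasts come from Monte Carlo draws of the posterior predictive distribution. The helper names `mse` and `coverage`, the number of draws and the seed are illustrative.

```python
import numpy as np
from scipy.stats import poisson

y = np.array([103, 115, 94, 102, 108, 108, 92, 113, 109, 89,
              96, 106, 118, 104, 116, 106, 104, 100, 98, 114])
train, test = y[:10], y[10:]

# Frequentist: plug the MLE (training sample mean) into the Poisson model
lam_hat = np.mean(train)
freq_point = lam_hat
freq_lo, freq_hi = poisson.ppf([0.025, 0.975], lam_hat)

# Bayesian: Gamma(a0, b0) prior (shape, rate) gives a Gamma(a0 + sum(y), b0 + n) posterior
a0, b0 = 0.001, 0.001
a_post, b_post = a0 + np.sum(train), b0 + len(train)
rng = np.random.default_rng(451)                        # seeded generator
lam = rng.gamma(shape=a_post, scale=1 / b_post, size=100000)
y_new = rng.poisson(lam)                                # posterior predictive draws
bayes_point = np.mean(y_new)
bayes_lo, bayes_hi = np.percentile(y_new, (2.5, 97.5))

def mse(forecast):
    """Mean squared error of a single point forecast over the test set."""
    return np.mean((test - forecast) ** 2)

def coverage(lo, hi):
    """Number of test observations inside the interval [lo, hi]."""
    return np.sum((test >= lo) & (test <= hi))

print('frequentist: point', freq_point, 'MSE', mse(freq_point),
      'interval', (freq_lo, freq_hi), 'covered', coverage(freq_lo, freq_hi))
print('Bayesian: point', bayes_point, 'MSE', mse(bayes_point),
      'interval', (bayes_lo, bayes_hi), 'covered', coverage(bayes_lo, bayes_hi))
```

With such a vague prior, the two sets of forecasts should be close to each other.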