# Neural Computation (Autumn 2020)
# Lab 4: Maximum Likelihood
In this tutorial, we aim to deepen your understanding of:

* Probability density/mass functions
* Joint probability density/mass functions
* Log-likelihood functions
* Maximum likelihood estimation for different distributions, such as
  * the Bernoulli distribution
  * the univariate Gaussian distribution

First, download the `PDF` file *`maximum_likelihood.pdf`* from Canvas, which contains three exercises tailored for this tutorial. Each exercise in the `PDF` is split into a few sub-questions that break the original problem into pieces, each of which is relatively easy to answer. Once you have your answers to these exercises ready, come back to this notebook to finish the following two exercises. Note that Exercise 3 in this notebook is different from Exercise 3 in the `PDF`.

For each of the following exercises, we have randomly sampled three datasets, Dataset 1, 2 and 3, with 10, 100 and 100000 observations respectively. For Exercise 1, each observation $x^{(i)}$, $i \in \{1,...,n\}$, where the sample size $n$ is $10$, $100$ or $100000$, is an independent sample from a Bernoulli distribution with *success probability* $q \in [0,1]$. For Exercise 2, each observation $x^{(i)}$ is an independent sample from a normal distribution $\mathcal{N}(\mu, \sigma^2)$. In both exercises, the same model parameters ($q$ or $(\mu, \sigma^2)$) were used for all three datasets.
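Because the observations are independent, the joint probability of each dataset factorises into a product over observations. As a reminder, written with the symbols above (a standard restatement, not copied from the `PDF`), the two joint distributions are:

$$
p\big(x^{(1)},\dots,x^{(n)} \mid q\big) = \prod_{i=1}^{n} q^{x^{(i)}} (1-q)^{1-x^{(i)}}, \qquad x^{(i)} \in \{0,1\},
$$

$$
p\big(x^{(1)},\dots,x^{(n)} \mid \mu,\sigma^2\big) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x^{(i)}-\mu)^2}{2\sigma^2}\right).
$$

Taking the logarithm of each product turns it into a sum, which is the log-likelihood function that is maximised in the exercises.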


In Exercise 1, you are asked to write code that computes the maximum likelihood estimate of $q$ for each dataset.

In Exercise 2, you are asked to write code that computes the maximum likelihood estimates of $(\mu, \sigma^2)$ for each dataset.

Note that you will need your derivations from Exercises 1 and 2 in the `PDF` file to guide you through the exercises in this notebook.



## Exercise 1: Bernoulli Distribution

### Reading data from a CSV file

We first read the dataset from a comma-separated value (CSV) file and store its observations in an array.

We will use the [genfromtxt](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.genfromtxt.html) function in NumPy to read the dataset from a CSV file located at a URL.

    np.genfromtxt(url, delimiter=None, skip_header=0, usecols=None) 

This function takes several arguments, including: 

* `url`: a string that specifies a file name or a URL for the CSV file,
* `delimiter`: a string used to separate values,
* `skip_header`: an int giving the number of lines to skip at the beginning of the file,
* `usecols`: a sequence indicating which columns to read, with 0 as the first column.
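To see how these arguments interact without touching the network, here is a minimal sketch that parses a small in-memory CSV. The column names and values in `csv_text` are made up for illustration and are not one of the lab files; `io.StringIO` simply stands in for the file.

```python
import io
import numpy as np

# Hypothetical two-column CSV with one header line (illustrative data only).
csv_text = "outcome,weight\n1,0.5\n0,1.5\n1,2.5\n0,3.5\n"

# genfromtxt also accepts a file-like object in place of a file name or URL.
# skip_header=1 drops the header line; usecols=0 keeps only the first column.
outcomes = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1, usecols=0)

print(outcomes.shape)  # a 1-dimensional array with one entry per data row
print(outcomes)
```

Since only one column is selected, the result is a 1-dimensional array of floats.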
            
We can read the data from the file and store them in a variable such as `B10` using the code below. Each such variable is a 1-dimensional NumPy array.

You first need to import the NumPy package, which is used both to load the datasets and to compute the statistics.

In [None]:
import numpy as np

In [None]:
url = "http://www.cs.bham.ac.uk/~duanj/log/Bernoulli_Distrbution/Bernoulli_10.csv"
B10 = np.genfromtxt(url, delimiter=",", skip_header=0, usecols=(0))

`B10` above is a dataset of 10 observations randomly sampled from a Bernoulli distribution with *success probability* $q$. Write your answer below to compute the maximum likelihood estimate of $q$.

In [None]:
q = np.mean(B10)

print(q)

In [None]:
url = "http://www.cs.bham.ac.uk/~duanj/log/Bernoulli_Distrbution/Bernoulli_100.csv"
B100 = np.genfromtxt(url, delimiter=",", skip_header=0, usecols=(0))

`B100` above is a dataset of 100 observations randomly sampled from a Bernoulli distribution with *success probability* $q$. Write your answer below to compute the maximum likelihood estimate of $q$.

In [None]:
q = np.mean(B100)

print(q)

0.49


In [None]:
url = "http://www.cs.bham.ac.uk/~duanj/log/Bernoulli_Distrbution/Bernoulli_100000.csv"
B100000 = np.genfromtxt(url, delimiter=",", skip_header=0, usecols=(0))

`B100000` above is a dataset of 100000 observations randomly sampled from a Bernoulli distribution with *success probability* $q$. Write your answer below to compute the maximum likelihood estimate of $q$.

In [None]:
q = np.mean(B100000)

print(q)

0.49945


### Question: from the above calculations, what pattern have you found?
Answer: 
* The three datasets above were randomly generated from a Bernoulli distribution with *success probability* $q = 0.5$.

* From the solution of Exercise 1, the maximum likelihood estimate of the *success probability* is the mean of all the observations. However, because each sample is finite, this mean is not exactly 0.5 in any of the three cases. Note that as the sample size goes to infinity, the sample mean converges to 0.5.
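The claim that the sample mean maximises the Bernoulli likelihood can also be checked numerically. The sketch below is an illustration under stated assumptions: it uses a synthetic sample drawn with a fixed seed rather than the lab data, and the helper name `bernoulli_log_likelihood` is introduced here only for this example. It evaluates the log-likelihood $\sum_i \big[x^{(i)}\log q + (1-x^{(i)})\log(1-q)\big]$ on a grid of candidate values of $q$ and compares the maximiser with the sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed, chosen only for reproducibility
x = rng.binomial(1, 0.5, size=100)   # synthetic Bernoulli sample, not the lab data

def bernoulli_log_likelihood(q, x):
    """Log-likelihood of i.i.d. Bernoulli observations x for success probability q."""
    k = x.sum()                      # number of successes
    return k * np.log(q) + (len(x) - k) * np.log(1 - q)

# Grid with step 0.001 that excludes 0 and 1, where the logarithm diverges.
grid = np.linspace(0.001, 0.999, 999)
q_grid = grid[np.argmax(bernoulli_log_likelihood(grid, x))]
q_mle = x.mean()                     # closed-form maximum likelihood estimate

print(q_grid, q_mle)
```

The two printed values should agree up to floating-point rounding, because with 100 observations the sample mean is a multiple of 0.01 and therefore lies on the grid.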

## Exercise 2: Univariate Gaussian Distribution

In [None]:
url = "http://www.cs.bham.ac.uk/~duanj/log/Gaussian_Distrbution/Gaussian_10.csv"
G10 = np.genfromtxt(url, delimiter=",", skip_header=0, usecols=(0))

`G10` is a dataset of 10 observations randomly sampled from a normal distribution with parameters $(\mu, \sigma^2)$. Write your answers below to compute the maximum likelihood estimates of $\mu$ and $\sigma^2$, respectively.

In [None]:
mu = np.mean(G10)
sigma_square = np.var(G10)

print(mu)
print(sigma_square)

1.7359903060390747
7.946229149386781


In [None]:
url = "http://www.cs.bham.ac.uk/~duanj/log/Gaussian_Distrbution/Gaussian_100.csv"
G100 = np.genfromtxt(url, delimiter=",", skip_header=0, usecols=(0))

`G100` is a dataset of 100 observations randomly sampled from a normal distribution with parameters $(\mu, \sigma^2)$. Write your answers below to compute the maximum likelihood estimates of $\mu$ and $\sigma^2$, respectively.

In [None]:
mu = np.mean(G100)
sigma_square = np.var(G100)

print(mu)
print(sigma_square)

1.9096529632848913
16.471460626535436


In [None]:
url = "http://www.cs.bham.ac.uk/~duanj/log/Gaussian_Distrbution/Gaussian_100000.csv"
G100000 = np.genfromtxt(url, delimiter=",", skip_header=0, usecols=(0))

`G100000` is a dataset of 100000 observations randomly sampled from a normal distribution with parameters $(\mu, \sigma^2)$. Write your answers below to compute the maximum likelihood estimates of $\mu$ and $\sigma^2$, respectively.

In [None]:
mu = np.mean(G100000)
sigma_square = np.var(G100000)

print(mu)
print(sigma_square)

2.02020308037172
15.958237670611767


### Question: from the above calculations, what have you found?
Answer:

* The three datasets above were randomly generated from a normal distribution with mean 2 and variance 16.

* From the solution of Exercise 2, the maximum likelihood estimate of $\mu$ is the mean of all the observations and the maximum likelihood estimate of $\sigma^2$ is the variance of all the observations. When the sample size is small, the estimation error is large.
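One detail worth checking is which variance `np.var` computes. The maximum likelihood estimate of $\sigma^2$ divides the sum of squared deviations by $n$, which matches NumPy's default `ddof=0`; passing `ddof=1` gives the unbiased estimator that divides by $n-1$ instead. The sketch below illustrates this on a synthetic sample (fixed seed, not the lab data).

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, chosen only for reproducibility
x = rng.normal(2.0, 4.0, size=10)   # synthetic sample; scale is the std, so variance is 16

n = len(x)
mu_mle = x.mean()
var_mle = np.mean((x - mu_mle) ** 2)   # maximum likelihood estimate: divide by n

# np.var uses ddof=0 by default, so it coincides with the MLE.
print(np.isclose(var_mle, np.var(x)))
# ddof=1 rescales by n / (n - 1), which matters most for small n.
print(np.isclose(np.var(x, ddof=1), var_mle * n / (n - 1)))
```

For $n = 10$ the two estimators differ by a factor of $10/9$, while for $n = 100000$ the difference is negligible.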

## Exercise 3: Sampling from Bernoulli and Normal Distributions

Write your code below to sample from a Bernoulli distribution with success probability 0.1 and from a normal distribution with mean 4 and standard deviation 8. For each distribution, show the impact of using different sample sizes, i.e., $n=10$, $100$ and $100000$.

In [None]:
sample_size = [10, 100, 100000]
success_prob = 0.1 
mean = 4 
std = 8
# setting the number of trials to 1 makes the binomial a Bernoulli distribution
for s in sample_size:
    BD = np.random.binomial(1, success_prob, size=s)
    ND = np.random.normal(mean, std, size=s)
    print('for sample size {} of Bernoulli distribution, success probability is {}'.format(s, BD.mean()))
    print('for sample size {} of normal distribution, mean value is {}'.format(s, ND.mean()))
    print('for sample size {}, standard deviation is {}'.format(s, ND.std()))


for sample size 10 of Bernoulli distribution, success probability is 0.0
for sample size 10 of normal distribution, mean value is 2.0760007757034904
for sample size 10, standard deviation is 9.973458367435411
for sample size 100 of Bernoulli distribution, success probability is 0.11
for sample size 100 of normal distribution, mean value is 3.752333156271396
for sample size 100, standard deviation is 7.958633077992727
for sample size 100000 of Bernoulli distribution, success probability is 0.10031
for sample size 100000 of normal distribution, mean value is 3.9702218004060432
for sample size 100000, standard deviation is 7.999950935216351
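The numbers above change on every run because the global random state is not seeded. As an optional variant, the sketch below uses NumPy's `default_rng` generator with a fixed seed (the seed value 2020 is arbitrary and assumed here only for reproducibility) and records the absolute error of each estimate, which makes the effect of the sample size easier to compare.

```python
import numpy as np

rng = np.random.default_rng(2020)   # arbitrary seed, assumed only for reproducibility
success_prob, mean, std = 0.1, 4, 8

errors = {}
for n in [10, 100, 100000]:
    bd = rng.binomial(1, success_prob, size=n)   # one trial per draw gives Bernoulli samples
    nd = rng.normal(mean, std, size=n)
    # absolute errors of the estimated success probability, mean and standard deviation
    errors[n] = (abs(bd.mean() - success_prob),
                 abs(nd.mean() - mean),
                 abs(nd.std() - std))
    print(n, errors[n])
```

For a single run the errors need not shrink monotonically, but for $n = 100000$ they are expected to be small, since the standard error of a sample mean scales as $1/\sqrt{n}$.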
