
# Gaussian Maximum Likelihood

## MLE of a Gaussian $p_{model}(x|w)$

You are given an array of data points called `data`. Your course site plots the negative log-likelihood function for several candidate hypotheses. Write code that finds the optimal parameters of the Gaussian $p_{model}$ (15 points) and explain what it does (10 points). You are free to use any gradient-based optimization method you like.
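For reference, the quantities involved can be written out explicitly. Taking $w = (\mu, \sigma)$ and assuming the $n$ samples $x_1, \dots, x_n$ are independent, the negative log-likelihood and its partial derivatives are sketched below. The second derivative is taken with respect to the variance $\sigma^2$, which for $\sigma > 0$ has the same zero as the derivative with respect to $\sigma$.

$$
\mathrm{NLL}(\mu, \sigma) = \frac{n}{2}\log(2\pi\sigma^2) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2
$$

$$
\frac{\partial\,\mathrm{NLL}}{\partial \mu} = -\frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu),
\qquad
\frac{\partial\,\mathrm{NLL}}{\partial \sigma^2} = \frac{n}{2\sigma^2} - \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2
$$

Setting both derivatives to zero gives $\hat{\mu} = \frac{1}{n}\sum_{i} x_i$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_{i}(x_i - \hat{\mu})^2$, which is the point a gradient-based method should approach.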

In [None]:
import numpy as np

data = [4, 5, 7, 8, 8, 9, 10, 5, 2, 3, 5, 4, 8, 9]

# add your code here

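def neg_log_likelihood(data, mu, sigma):
    # Negative log-likelihood of independent samples under N(mu, sigma**2).
    data = np.asarray(data, dtype=float)  # a plain list does not support `data - mu`
    n = len(data)
    ll = -n/2 * np.log(2*np.pi*sigma**2) - 1/(2*sigma**2) * np.sum((data - mu)**2)
    return -ll

def compute_gradients(data, mu, sigma):
    # Gradients of the negative log-likelihood.
    data = np.asarray(data, dtype=float)
    n = len(data)
    d_mu = -1/sigma**2 * np.sum(data - mu)
    # This expression is the derivative with respect to sigma**2 (the variance).
    # For sigma > 0 it has the same sign and the same zero as the derivative
    # with respect to sigma, so gradient descent on it heads toward the same optimum.
    d_sigma = n/(2*sigma**2) - 1/(2*sigma**4) * np.sum((data - mu)**2)
    return d_mu, d_sigma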

# Gradient descent is used here to estimate the parameters mu and sigma.
# Each epoch steps both parameters against the gradient of the cost function,
# the negative log-likelihood (NLL), which minimizes it.
def grad_descent(data):
    mu, sigma = 0, 1
    n_epochs = 4000
    eta = 0.001
    for epoch in range(n_epochs):
        d_mu, d_sigma = compute_gradients(data, mu, sigma)
        mu -= eta * d_mu
        sigma -= eta * d_sigma
    return mu, sigma


mu, sigma = grad_descent(data)

print("mu:", mu)
print("sigma:", sigma)

mu: 6.2117740882059715
sigma: 2.48647227660039
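
As a sanity check on the gradient-descent estimates, the Gaussian MLE also has a closed form: the sample mean and the population standard deviation (dividing by $n$, not $n - 1$). The sketch below assumes the same `data` list; the names `mu_hat` and `sigma_hat` are illustrative and not part of the assignment.

```python
import numpy as np

data = [4, 5, 7, 8, 8, 9, 10, 5, 2, 3, 5, 4, 8, 9]

# Closed-form maximum likelihood estimates for a Gaussian.
mu_hat = np.mean(data)            # sample mean
sigma_hat = np.std(data, ddof=0)  # population standard deviation (divides by n)

print("closed-form mu:", mu_hat)
print("closed-form sigma:", sigma_hat)
```

For this data the closed form gives a mean of 87/14, about 6.214, and a standard deviation of about 2.425. The gradient-descent run is therefore close on `mu` and has not fully converged on `sigma`; more epochs would be expected to narrow the gap.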


## MLE of a conditional Gaussian $p_{model}(y|x,w)$

You are given a problem that involves the relationship between $x$ and $y$. Estimate the parameters of a $p_{model}$ that fits the dataset $(x, y)$ shown below. You are free to use any gradient-based optimization method you like.


In [None]:
import numpy as np

x = np.array([8, 16, 22, 33, 50, 51])
y = np.array([5, 20, 14, 32, 42, 58])

# add your code here
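def neg_log_likelihood(x, y, mu, sigma):
    # Treats x and y as samples from one Gaussian N(mu, sigma**2);
    # y is not modeled as a function of x.
    # n counts both arrays, and the log-normalizer term is added once per array.
    n = len(x) + len(y)
    ll_x = -n/2 * np.log(2*np.pi*sigma**2) - 1/(2*sigma**2) * np.sum((x - mu)**2)
    ll_y = -n/2 * np.log(2*np.pi*sigma**2) - 1/(2*sigma**2) * np.sum((y - mu)**2)
    return -(ll_x + ll_y)

def compute_gradients(x, y, mu, sigma):
    # Gradients of the negative log-likelihood defined above;
    # d_sigma is the derivative with respect to sigma**2 (the variance).
    x = np.array(x)
    y = np.array(y)
    n = len(x) + len(y)
    d_mu = (-1/sigma**2 * np.sum(x - mu)) + (-1/sigma**2 * np.sum(y - mu))
    d_sigma = (n/(2*sigma**2) - 1/(2*sigma**4) * np.sum((x - mu)**2)) + (n/(2*sigma**2) - 1/(2*sigma**4) * np.sum((y - mu)**2))
    return d_mu, d_sigma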

def grad_descent(x, y):
    mu, sigma = 0, 1
    n_epochs = 8000
    eta = 0.001
    for epoch in range(n_epochs):
        d_mu, d_sigma = compute_gradients(x, y, mu, sigma)
        mu -= eta * d_mu
        sigma -= eta * d_sigma
    return mu, sigma


mu, sigma = grad_descent(x, y)

print("mu:", mu)
print("sigma:", sigma)


mu: 18.72591586115306
sigma: 10.582351326996752
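
The code above fits one Gaussian to the values of `x` and `y` together. For comparison, here is a minimal sketch of a conditional model, assuming (this is not stated in the assignment) a linear mean with constant noise, $y \sim \mathcal{N}(w_0 + w_1 x, \sigma^2)$. Under that assumption the weights that minimize the negative log-likelihood are the ones that minimize the mean squared error, whatever the value of $\sigma$, so the sketch runs gradient descent on the mean squared error with a standardized input and then sets $\sigma$ to the root mean squared residual. The function name `fit_linear_gaussian` and its parameters are hypothetical.

```python
import numpy as np

x = np.array([8, 16, 22, 33, 50, 51])
y = np.array([5, 20, 14, 32, 42, 58])

def fit_linear_gaussian(x, y, eta=0.1, n_epochs=500):
    """Fit y ~ N(w0 + w1*x, sigma**2) by gradient descent (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize x so that a single learning rate suits both weights.
    x_mean, x_std = x.mean(), x.std()
    xs = (x - x_mean) / x_std
    b0, b1 = 0.0, 0.0  # intercept and slope on the standardized input
    for _ in range(n_epochs):
        residual = b0 + b1 * xs - y
        # Gradient of 0.5 * mean(residual**2) with respect to b0 and b1.
        b0 -= eta * residual.mean()
        b1 -= eta * (residual * xs).mean()
    # Map the weights back to the original scale of x.
    w1 = b1 / x_std
    w0 = b0 - b1 * x_mean / x_std
    # MLE of sigma given the weights: root mean squared residual.
    sigma = np.sqrt(np.mean((w0 + w1 * x - y) ** 2))
    return w0, w1, sigma

w0, w1, sigma = fit_linear_gaussian(x, y)
print("w0:", w0)
print("w1:", w1)
print("sigma:", sigma)
```

Standardizing `x` is a design choice rather than a requirement: with the raw inputs, which range up to 51, a step size that is stable for the slope makes the intercept converge very slowly. The fitted weights can be compared against `np.polyfit(x, y, 1)`, which solves the same least-squares problem directly.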
