# Maximum Likelihood Estimation

In [None]:
import numpy as np, pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
from scipy.optimize import minimize
import scipy.stats as stats
import statsmodels.api as sm
from myst_nb import glue

Let's consider the simplest possible scenario, where some force $f_y$ is modelled as a function of velocity $v$ and some hydrodynamic coefficient $\beta$:

$$ f_y = \beta \cdot v  $$ (eq_model)

One physical experiment is carried out where the force $f_y$ is measured at a certain speed $v$. (We also measure that there is no force at rest ($v=0$), which confirms that the model {eq}`eq_model` needs no intercept term.)

In [None]:
# generate data
np.random.seed(42)
N = 10
v = np.linspace(0,5,N)
beta = 3
scale = 1.0
ϵ = np.random.normal(loc = 0.0, scale = scale, size = N)
f_y = beta*v + ϵ
df = pd.DataFrame({"f_y":f_y, "v":v})

In [None]:
n = 5
f_y_sample = f_y[n]
v_sample = v[n]
beta_hat = f_y_sample/v_sample
glue("f_y_sample", f_y_sample, display=False)
glue("v_sample", v_sample, display=False)
glue("beta_hat", beta_hat, display=False)

In [None]:
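# beta = f_y / v is undefined at rest (v = 0), so mask that row instead of dividing by zero
df['beta'] = df['f_y'] / df['v'].replace(0, np.nan)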
glue("tab_experiments", df)

A force $f_y$ of {glue:}`f_y_sample` [N] was measured in this experiment at a speed $v$ of {glue:}`v_sample` [m/s].  
As the model ({eq}`eq_model`) contains only one unknown parameter, $\beta$, this single experiment is enough to determine it:
$$\beta = \frac{f_y}{v} $$ (eq_beta_deterministic)
so that $\beta$ can be estimated as {glue:}`beta_hat`.

If the measurement were perfect and the model described the physics perfectly, this estimate of $\beta$ would be the correct one. To double-check this, several experiments were conducted, as seen in the table below:

{glue:}`tab_experiments`

It can be seen that {eq}`eq_beta_deterministic` gives a different estimate of $\beta$ for each experiment, so there must be measurement errors or model errors (or both) in the data from these experiments.
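
To put a number on this scatter, the per-experiment estimates can be summarised directly. The following is a minimal sketch that regenerates the same synthetic data as above; the names `moving` and `beta_i` are introduced here for illustration only, and the measurement at rest is skipped because $f_y / v$ is undefined at $v=0$.

```python
import numpy as np

# Regenerate the synthetic data used in this notebook (same seed and settings).
np.random.seed(42)
N = 10
v = np.linspace(0, 5, N)
f_y = 3 * v + np.random.normal(loc=0.0, scale=1.0, size=N)

# Hypothetical helper names: keep only the experiments with v > 0,
# since beta = f_y / v cannot be evaluated at rest.
moving = v > 0
beta_i = f_y[moving] / v[moving]

print("min / max of the single-experiment estimates:", beta_i.min(), beta_i.max())
print("mean and sample standard deviation:", beta_i.mean(), beta_i.std(ddof=1))
```

The same absolute noise on $f_y$ is divided by a small $v$ in the low-speed experiments, so those estimates tend to scatter more than the high-speed ones.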

In [None]:
# plot
sns.regplot(data=df, x='v', y='f_y');
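
Before the likelihood is written as code, it may help to spell it out. This is a sketch under the assumption (consistent with how the data was generated above) that each measurement follows {eq}`eq_model` plus independent Gaussian noise; the symbols $\sigma$, $N$ and the index $i$ are introduced here for illustration:

$$ f_{y,i} = \beta \cdot v_i + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma^2) $$

Under that assumption the log-likelihood of all $N$ measurements is

$$ \ln L(\beta, \sigma) = -\frac{N}{2} \ln\left(2 \pi \sigma^2\right) - \frac{1}{2 \sigma^2} \sum_{i=1}^{N} \left(f_{y,i} - \beta \cdot v_i\right)^2 $$

and setting its derivatives with respect to $\beta$ and $\sigma$ to zero gives

$$ \hat{\beta} = \frac{\sum_{i=1}^{N} v_i \, f_{y,i}}{\sum_{i=1}^{N} v_i^2}, \qquad \hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} \left(f_{y,i} - \hat{\beta} \cdot v_i\right)^2 $$

Maximising $\ln L$ is the same as minimising $-\ln L$, which is the quantity a numerical optimiser is usually handed.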

In [None]:
# define the negative log-likelihood function
def MLERegression(params, x, y):
    beta, sd = params[0], params[1]
    yhat = beta*x # predictions
    # next, we flip the Bayesian question
    # compute log-PDF of observed values normally distributed around mean (yhat)
    # with a standard deviation of sd
    negLL = -np.sum( stats.norm.logpdf(y, loc=yhat, scale=sd) )
    # return negative LL
    return negLL

In [None]:
# let’s start with some rough initial guesses for beta and sd, then optimize
guess = np.array([5,2])
results = minimize(MLERegression, guess, args=(v, f_y), method = "Nelder-Mead")

In [None]:
results
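
One way to sanity-check the optimiser output is to compare it with the closed-form maximum likelihood estimates for a no-intercept linear model with Gaussian noise, $\hat{\beta} = \sum v_i f_{y,i} / \sum v_i^2$ and $\hat{\sigma}^2 = \frac{1}{N}\sum (f_{y,i} - \hat{\beta} v_i)^2$. The sketch below is self-contained and regenerates the same synthetic data; `neg_log_likelihood`, `beta_closed` and `sd_closed` are names made up for this illustration, and the guard against non-positive `sd` is an added assumption, not part of the cells above.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

# Regenerate the synthetic data used in this notebook (same seed and settings).
np.random.seed(42)
N = 10
v = np.linspace(0, 5, N)
f_y = 3 * v + np.random.normal(loc=0.0, scale=1.0, size=N)

def neg_log_likelihood(params, x, y):
    """Negative Gaussian log-likelihood of y around beta * x."""
    beta, sd = params
    if sd <= 0:
        # Assumption for this sketch: reject non-physical standard deviations.
        return np.inf
    return -np.sum(stats.norm.logpdf(y, loc=beta * x, scale=sd))

# Numerical maximum likelihood estimate, same starting guess as above.
res = minimize(neg_log_likelihood, np.array([5.0, 2.0]), args=(v, f_y),
               method="Nelder-Mead")

# Closed-form maximum likelihood estimates for comparison.
beta_closed = np.sum(v * f_y) / np.sum(v ** 2)
sd_closed = np.sqrt(np.mean((f_y - beta_closed * v) ** 2))

print("optimiser  :", res.x)
print("closed form:", beta_closed, sd_closed)
```

If the two agree to within the optimiser tolerance, the numerical setup is probably sound and can be reused for models where no closed form is available.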