# Bayesianness

Import the necessary modules.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit, minimize

In [2]:
%matplotlib notebook

Define the straight-line model used for the fit and the negative log-likelihood function of that model.

In [3]:
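def Model(x, m, c, Noise=0):
    # Straight line with an optional additive noise term.
    return m*x + c + Noise

def MinusLogLikelihood(Params, xdata, ydata, σ):
    # Gaussian negative log-likelihood that uses one common standard
    # deviation: the mean of the per-point values in σ.
    m, c = Params[0], Params[1]
    Regression = 0  # running sum of squared residuals
    σSum = 0
    for k in range(len(xdata)):
        Regression += (ydata[k] - Model(xdata[k], m, c))**2
        σSum += σ[k]
    σAverage = σSum/len(xdata)
    return len(xdata)/2*np.log(2*np.pi*σAverage**2) + (2*σAverage**2)**-1*Regression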
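
Written out, the quantity that MinusLogLikelihood returns is the Gaussian negative log-likelihood with a single standard deviation shared by all points. The symbols N (number of points) and σ̄ (the mean of the per-point errors) are notation introduced here for illustration; they correspond to len(xdata) and σAverage in the code.

```latex
-\ln L(m, c) = \frac{N}{2}\,\ln\!\left(2\pi\bar{\sigma}^{2}\right)
             + \frac{1}{2\bar{\sigma}^{2}} \sum_{k=1}^{N} \bigl(y_k - (m x_k + c)\bigr)^{2},
\qquad
\bar{\sigma} = \frac{1}{N}\sum_{k=1}^{N} \sigma_k
```

Because σ̄ does not depend on m or c, minimising this expression is equivalent to an unweighted least-squares fit.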

....

A different form of the likelihood function, which weights each point by its own inverse variance.

In [25]:
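def MinusLogLikelihood2(Params, x, y, yError):
    m, c = Params[0], Params[1]
    model = m*x + c
    # Inverse variance (weight) of each point; note this is 1/σ², not σ.
    InverseVariance = 1/(yError**2)
    # Gaussian negative log-likelihood, dropping the constant N/2*ln(2π).
    return 0.5*np.sum((y-model)**2*InverseVariance - np.log(InverseVariance))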
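
As a rough sanity check on this per-point form, the sketch below compares it with the summed scipy.stats.norm.logpdf. The two should differ only by the constant N/2·ln(2π) that the function drops. The names weighted_nll, x_demo, y_demo and y_error_demo are hypothetical, and the demo data is generated here purely for illustration rather than taken from the notebook.

```python
import numpy as np
from scipy.stats import norm

def weighted_nll(params, x, y, y_error):
    # Same expression as MinusLogLikelihood2, with the weight named explicitly.
    m, c = params
    model = m * x + c
    inverse_variance = 1.0 / y_error**2
    return 0.5 * np.sum((y - model)**2 * inverse_variance - np.log(inverse_variance))

# Hypothetical demo data (not the notebook's data).
rng = np.random.default_rng(0)
x_demo = rng.normal(size=20)
y_error_demo = rng.uniform(0.5, 1.5, size=20)
y_demo = 2 * x_demo + 3 + y_error_demo * rng.normal(size=20)

# Full Gaussian negative log-likelihood from scipy, including the constant term.
reference = -np.sum(norm.logpdf(y_demo, loc=2.0 * x_demo + 3.0, scale=y_error_demo))

# The two forms should differ by N/2 * ln(2π) only.
constant = 0.5 * len(x_demo) * np.log(2 * np.pi)
difference = reference - weighted_nll((2.0, 3.0), x_demo, y_demo, y_error_demo)
print(difference, constant)
```

A constant offset does not move the minimum, so both forms should give the same best-fit m and c.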

In [28]:
MinusLogLikelihood(Params, x, y, yError)

44.82173110027859

This is ~10 times higher than the value from the other log-likelihood function, and it does not agree with the result of the minimisation.

....

Generate normally distributed x values and noise, and use them to build y from the model. The error bars yError are drawn uniformly from [0, 1).

In [4]:
SetSeed = True

if SetSeed:
    Seed = 923114
    np.random.seed(Seed)

x = np.random.randn(20)
m, c = 2, 3
Noise = np.random.randn(20)
y = Model(x, m, c, Noise)
yError = np.random.random(20)
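
In the cell above, the noise has unit standard deviation while yError is drawn independently of it, so the error bars do not describe the actual scatter. A minimal sketch of one alternative, assuming the intent is for each point's noise to have standard deviation yError, is given below. The names with a _demo suffix and the range 0.5 to 1.5 are assumptions made for illustration, and np.random.default_rng produces different numbers from np.random.seed even with the same seed.

```python
import numpy as np

# Hypothetical alternative data generation: noise scaled by the error bars.
rng = np.random.default_rng(923114)
n_points = 20

x_demo = rng.normal(size=n_points)
# Assumed range for the per-point standard deviations, kept away from zero.
y_error_demo = rng.uniform(0.5, 1.5, size=n_points)
# Each noise value is drawn with its own standard deviation.
noise_demo = y_error_demo * rng.normal(size=n_points)
y_demo = 2 * x_demo + 3 + noise_demo

# Residuals divided by the error bars should then be roughly unit normal.
pulls_demo = (y_demo - (2 * x_demo + 3)) / y_error_demo
print(pulls_demo.std())
```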

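This minimises the negative log-likelihood with respect to the parameters m and c. The standard deviation of the noise is not fitted: it is fixed at the average of yError. The result gives the best-fit parameters for this data, with uncertainties taken from the inverse Hessian returned by the minimiser.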

In [29]:
x0 = [0, 0]
Minimising = minimize(MinusLogLikelihood, x0, args=(x, y, yError))

if Minimising.success:
    Params = Minimising.x
    ParamsError = np.sqrt((np.diag(Minimising.hess_inv)))
    print(f'The fitted parameters are m = {round(Params[0], 3)} ± {round(ParamsError[0], 3)}, '
          f'c = {round(Params[1], 3)} ± {round(ParamsError[1], 3)}, Cov = {round(1000*Minimising.hess_inv[0][1], 3)} x 10^-3')
else:
    print('There was an error when minimising this function')

The fitted parameters are m = 2.172 ± 0.109, c = 3.384 ± 0.124, Cov = 1.813 x 10^-3
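
For a straight line with one common σ, the covariance of the parameters has a closed form, σ²(AᵀA)⁻¹, where A is the design matrix with columns x and 1. The sketch below computes it directly, as a possible cross-check on the hess_inv values, which the default BFGS minimiser only approximates. The function analytic_line_fit and the _demo data are hypothetical names introduced here, not part of the notebook.

```python
import numpy as np

def analytic_line_fit(x, y, sigma):
    # Design matrix for y = m*x + c: one column for m, one for c.
    design = np.column_stack([x, np.ones_like(x)])
    normal_matrix = design.T @ design
    # Least-squares solution of the normal equations, ordered [m, c].
    params = np.linalg.solve(normal_matrix, design.T @ y)
    # Inverse Hessian of the negative log-likelihood with common sigma.
    covariance = sigma**2 * np.linalg.inv(normal_matrix)
    return params, covariance

# Hypothetical demo data (not the notebook's data).
rng = np.random.default_rng(1)
x_demo = rng.normal(size=20)
y_demo = 2 * x_demo + 3 + rng.normal(size=20)

params_demo, cov_demo = analytic_line_fit(x_demo, y_demo, sigma=1.0)
errors_demo = np.sqrt(np.diag(cov_demo))
print(params_demo, errors_demo)
```

Passing the notebook's x, y and the average of yError as sigma should, under these assumptions, give values comparable to Params and ParamsError above.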


An array of evenly spaced x values is created to plot the model with the calculated best-fit parameters.

In [6]:
xOutput = np.linspace(min(x), max(x), 1000)
yOutput = Model(xOutput, Params[0], Params[1])

In [7]:
plt.figure()
plt.errorbar(x, y, yerr=yError, fmt='.', label='Normally Distributed Data')
plt.plot(xOutput, yOutput, label='Model for Data')
plt.title('Fitting a Straight Line to Data')
plt.xlabel('x')
plt.ylabel('y = f(x)')
plt.legend()
plt.show()


## It's Covariancin' Time

Checking the covariance using curve_fit

In [8]:
def CurveFit(X, Y):
    # curve_fit fits every argument of Model after x, so Noise is fitted too.
    Parameters, Covariance = curve_fit(Model, X, Y)
    XModel = np.linspace(min(X), max(X), 100)
    YModel = Model(XModel, *Parameters)
    return XModel, YModel, Parameters, Covariance

In [9]:
X, Y, Paramaters, COV = CurveFit(x, y)

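The covariance is quite different from the one obtained by minimising, which could be a problem. It is 3×3 rather than 2×2 because curve_fit also fits the Noise argument of Model. Noise and c are both additive constants, so they cannot be separated, which produces the enormous entries (~10^14) in their rows and columns.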

In [10]:
COV

array([[ 5.41132435e-02, -2.52400227e+05,  2.52400233e+05],
       [-2.52400233e+05,  7.59645135e+14, -7.59645130e+14],
       [ 2.52400233e+05, -7.59645130e+14,  7.59645126e+14]])
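
curve_fit determines the number of fit parameters from the signature of the function it is given, so passing Model directly makes it fit m, c and Noise. One possible way to obtain a 2×2 covariance is to pass a wrapper that exposes only m and c. The sketch below uses a hypothetical wrapper called line_only and demo data generated here for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def Model(x, m, c, Noise=0):
    return m*x + c + Noise

def line_only(x, m, c):
    # Hypothetical wrapper: hides Noise so curve_fit only sees m and c.
    return Model(x, m, c)

# Hypothetical demo data (not the notebook's data).
rng = np.random.default_rng(2)
x_demo = rng.normal(size=20)
y_demo = Model(x_demo, 2, 3, rng.normal(size=20))

params_demo, cov_demo = curve_fit(line_only, x_demo, y_demo)
print(params_demo)
print(cov_demo)
```

Even with the wrapper, the numbers need not match hess_inv exactly: without a sigma argument, curve_fit by default rescales the covariance using the scatter of the residuals, whereas the likelihood above fixes σ at the average of yError.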

In [11]:
plt.figure()
plt.plot(x, y, '.')
plt.plot(X, Y)
plt.show()
