# Thinking about data probabilistically

When we make a measurement, there will always be some uncertainty about that measurement. 
For now, we are not concerned with the source of this uncertainty, but with what it means and how we account for it in our analysis. 
Typically, a value with an uncertainty is written as $\mu \pm \sigma$, where $\mu$ and $\sigma$ are the measurement and uncertainty, respectively.
While there is (unfortunately) no standard for what this means, a common interpretation is that it describes a {term}`normal distribution`, which is centred on $\mu$ with a standard deviation of $\sigma$ ({numref}`normal`). 

```{figure} ./images/normal.png
---
height: 250px
name: normal
---
A normal distribution (blue line), centred on 10.4 with a standard deviation of 1.6.
```
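To make this interpretation concrete, the short sketch below builds the distribution from the caption with `scipy.stats.norm` and evaluates two quantities: the height of the peak and the probability of lying within one standard deviation of the centre. The variable names (`measurement`, `peak_density`, `within_one_sigma`) are illustrative choices, not part of the original text.

```python
from scipy.stats import norm

# Values taken from the figure caption: mu +/- sigma = 10.4 +/- 1.6
mu = 10.4
sigma = 1.6

# Frozen normal distribution representing the measurement
measurement = norm(loc=mu, scale=sigma)

# Probability density at the centre of the distribution (the peak height)
peak_density = measurement.pdf(mu)

# Probability of a value lying within one standard deviation of the centre
within_one_sigma = measurement.cdf(mu + sigma) - measurement.cdf(mu - sigma)

print(f"peak density: {peak_density:.4f}")
print(f"P(mu - sigma < x < mu + sigma): {within_one_sigma:.4f}")
```

Under this interpretation, writing $10.4 \pm 1.6$ says that roughly 68% of the probability lies between 8.8 and 12.0, rather than that the value is guaranteed to fall in that range.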

Consider a mathematical model that produces a model dataset of a single data point, $x = a^2$, where $a$ is the parameter to be optimised.[^1]
We want the best agreement between our mathematical model and the experimental data in {numref}`normal`, i.e., we want the value of $a$ that maximises $p(x)$, so that the model has the highest probability of representing the data. 
We know the functional form for the distribution in {numref}`normal` (later, we will call this the {term}`likelihood`), 

```{math}
:label: normal
p(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp{\bigg[-\frac{1}{2}\Big(\frac{x-\mu}{\sigma}\Big)^2\bigg]},
```

and therefore, to get the best agreement between the model and the data, we need a value of $a$ that maximises this function (or, more commonly, minimises its negative). 
This is achieved using an {term}`optimisation algorithm`; first, however, we need a function that calculates $-p(x)$ from $a$.

In [1]:
from scipy.stats import norm

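# Measured value and its uncertainty
mu = 10.4
sigma = 1.6

def nl(a):
    """Return the negative likelihood, -p(x), for the model x = a ** 2."""
    x = a ** 2  # model prediction for the single data point
    N = norm(loc=mu, scale=sigma)  # distribution describing the measurement
    return -N.pdf(x)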

The above function calculates the negative likelihood of a given $a$ for the data shown in {numref}`normal`.
We can then use the `scipy.optimize.minimize` function to minimise this (other minimisation libraries are available in Python). 

In [2]:
from scipy.optimize import minimize

result = minimize(nl, 2)
print(result)

  message: Optimization terminated successfully.
  success: True
   status: 0
      fun: -0.24933892525083448
        x: [ 3.225e+00]
      nit: 1
      jac: [-6.724e-07]
 hess_inv: [[1]]
     nfev: 24
     njev: 12


We can see that the optimisation was successful and that the optimised value of $a$ is 3.225 (available as `result.x`); this is as expected, since $a^2 = \mu$ gives $a = \sqrt{10.4} \approx 3.225$. The result is shown alongside the data in {numref}`normal_fit`.

```{figure} ./images/normal_fit.png
---
height: 250px
name: normal_fit
---
A normal distribution (blue line), centred on 10.4 with a standard deviation of 1.6, and the maximum likelihood value (red circle).
```
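In practice, it is common to minimise the negative *logarithm* of the likelihood rather than the negative likelihood itself. For the normal distribution this is $\tfrac{1}{2}\big(\tfrac{x-\mu}{\sigma}\big)^2$ plus a constant, which has its minimum at the same value of $a$ but does not flatten towards zero far from the peak. The sketch below is one possible way to do this, using `norm.logpdf`; the names `nll` and `result_log` are illustrative.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

mu = 10.4
sigma = 1.6

def nll(a):
    """Negative log-likelihood for the model x = a ** 2 (illustrative helper)."""
    # minimize passes a length-1 array, so take its single element
    x = np.atleast_1d(a)[0] ** 2
    return -norm.logpdf(x, loc=mu, scale=sigma)

# Same starting guess as used for the negative likelihood
result_log = minimize(nll, 2)

# Compare the optimised parameter with the analytical value sqrt(mu)
print(result_log.x[0], np.sqrt(mu))
```

Note that, because the model is $x = a^2$, the value $a = -\sqrt{\mu}$ gives exactly the same likelihood; which of the two solutions a local optimiser finds depends on the starting guess.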

We can imagine extending this beyond datasets with just a single data point, where each data point is itself a normal distribution. 
We visualise this in {numref}`multid` for five data points: the plot on the left shows the data in a familiar way, while the plot on the right shows the view we would have if we sat in the plane of the screen and looked along the *x*-axis. 

```{figure} ./images/multid.png
---
height: 250px
name: multid
---
Plots with more data points: the left shows the standard way to plot data with some uncertainty, while the right shows the five likelihood functions, one for each data point (note that here the uncertainty in each data point is taken to be the same, i.e., it is {term}`homoscedastic`).
```

If we perform a "straight line fit" for the data in {numref}`multid`, we try to find values for the gradient and intercept of a straight line (our mathematical model) such that the model data maximises, as far as the constraints of the model allow, the likelihood for each data point. 
It is clear in {numref}`multid_fit` that the green model cannot reach the maximum of any individual distribution, but overall, this is the best possible agreement for the distributions. 

```{figure} ./images/multid_fit.png
---
height: 250px
name: multid_fit
---
{numref}`multid` with a fitted linear model (green). 
```
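A straight line fit of this kind can be sketched as follows. The five data points and their common uncertainty are hypothetical values invented for illustration (they are not the data in the figures), and the points are assumed to be independent, so that their log-likelihoods can simply be added together.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Hypothetical data: five points sharing one uncertainty (homoscedastic)
x_data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_data = np.array([2.9, 5.2, 6.8, 9.1, 11.0])
y_sigma = 1.0

def line_nll(params):
    """Negative log-likelihood of a straight line for all points (illustrative)."""
    gradient, intercept = params
    model = gradient * x_data + intercept
    # Assuming independent points, the individual log-likelihoods add
    return -np.sum(norm.logpdf(y_data, loc=model, scale=y_sigma))

# Optimise the gradient and intercept together from a rough starting guess
fit = minimize(line_nll, [1.0, 0.0])

# For comparison: an ordinary least-squares straight line
least_squares = np.polyfit(x_data, y_data, 1)

print(fit.x)
print(least_squares)
```

With normal, homoscedastic uncertainties, maximising the likelihood is equivalent to minimising the sum of squared residuals, so the two approaches should agree to within the optimiser's tolerance.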

We will continue this way of thinking about our experimental data and the fitting of models throughout our investigations of model-dependent analysis.

[^1]: This model has no real physical rationale; it is only a representative example. 