# Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences

### Chapter 6

In [1]:
import sys
import os
sys.path.append(os.path.abspath(".."))

import numpy as np
import scripts.functions as f

%load_ext autoreload
%autoreload 2

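**Least squares** is an estimation method for determining unknown parameters from a set of data. The estimate is the parameter value that minimises the sum of squared residuals, each weighted by its measurement error: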

$$
\chi^2 = \sum^N_{i = 1} \left[\frac{y_i - f(x_i;a)}{\sigma_i} \right]^2
$$

The unknown parameter $a$ enters through the function $f(x;a)$, which predicts the value of $y$ for any $x$. The data are a set of $N$ precisely known values of $x$, $\{x_1, x_2, \ldots, x_N\}$, with a corresponding set of measurements of $y$, $\{y_1, y_2, \ldots, y_N\}$, each measured with uncertainty $\sigma_i$. The estimate of $a$ is the value that gives the smallest $\chi^2$, so it satisfies

$$
\frac{d\chi^2}{da} = 0
$$
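
As a worked illustration of this condition, consider the proportional model $f(x;a) = a x$, the form $y = mx$ used in the problems below. The steps are a sketch of the standard result, written out here for convenience:

$$
\frac{d\chi^2}{da} = -2 \sum^N_{i = 1} \frac{x_i \left(y_i - a x_i\right)}{\sigma_i^2} = 0
\quad \Rightarrow \quad
\hat{a} = \frac{\sum_i x_i y_i / \sigma_i^2}{\sum_i x_i^2 / \sigma_i^2}
$$

Since $\hat{a}$ is linear in the $y_i$, combining the errors on the $y_i$ gives its variance:

$$
V(\hat{a}) = \frac{1}{\sum_i x_i^2 / \sigma_i^2}
$$

When all measurements share the same error, $\sigma_i = \sigma$, these reduce to $\hat{a} = \sum_i x_i y_i / \sum_i x_i^2$ and $V(\hat{a}) = \sigma^2 / \sum_i x_i^2$.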

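For Gaussian-distributed measurements, the log-likelihood $\ln L$ and $\chi^2$ are related by

$$
-2 \ln{L} = \chi^2 + \text{const}
$$

where the constant does not depend on $a$, so minimising $\chi^2$ is equivalent to maximising the likelihood.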
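
The relation between $\chi^2$ and the Gaussian likelihood can be checked numerically. The sketch below is only an illustration: the helper names `chi_square` and `neg2_log_likelihood` and the three data points are made up for this example and do not come from the notebook. It evaluates both quantities for several trial values of $a$ in $f(x;a) = ax$ and shows that they differ only by an additive term that does not depend on $a$.

```python
import numpy as np

def chi_square(y, fx, sigma):
    """Sum of squared residuals, each divided by its error."""
    y, fx = np.asarray(y, float), np.asarray(fx, float)
    return float(np.sum(((y - fx) / sigma) ** 2))

def neg2_log_likelihood(y, fx, sigma):
    """-2 ln L for independent Gaussian measurements with mean fx and width sigma."""
    y, fx = np.asarray(y, float), np.asarray(fx, float)
    sigma = np.broadcast_to(np.asarray(sigma, float), y.shape)
    log_l = np.sum(-0.5 * ((y - fx) / sigma) ** 2
                   - np.log(sigma * np.sqrt(2.0 * np.pi)))
    return float(-2.0 * log_l)

# Illustrative (made-up) data for the model f(x; a) = a * x
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])
sigma = 0.2

# Difference between -2 ln L and chi^2 for several trial values of a
offsets = [
    neg2_log_likelihood(y, a * x, sigma) - chi_square(y, a * x, sigma)
    for a in (1.5, 2.0, 2.5)
]
print(offsets)  # the same value for every a
```

The offset is $2 \sum_i \ln(\sigma_i \sqrt{2\pi})$, which depends only on the errors, so both quantities have their minimum at the same $a$.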

#### Problems

Problem 1:

In [3]:
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d = [0.0, 11, 19, 33, 40, 49, 61]
sigma = 2.0

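# least-squares estimate of the slope (velocity) for the model d = velocity * t
velocity = f.ls_simple(t,d)
# variance of the slope estimate
variance = f.ls_simple_var(t, sigma)

# model prediction at each t and the chi-squared of the fit
fx = [velocity * t_i for t_i in t]
chi2 = f.chi2(d, fx, sigma)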

print(f"{round(velocity, 2)}±{round(np.sqrt(variance), 1)}")
print(round(chi2))

10.1±0.2
3
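
The helpers in `scripts.functions` are not shown in this notebook. The sketch below assumes they implement the standard equal-error formulae for a line through the origin, $\hat{a} = \sum x_i y_i / \sum x_i^2$ and $V(\hat{a}) = \sigma^2 / \sum x_i^2$. The names `fit_slope`, `slope_variance` and `chi_square` are hypothetical stand-ins, not the actual functions. Under that assumption, the code gives the same rounded numbers as the output above.

```python
import numpy as np

def fit_slope(x, y):
    """Least-squares slope for y = a*x with equal errors on y (assumed form)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return float(np.sum(x * y) / np.sum(x ** 2))

def slope_variance(x, sigma):
    """Variance of the slope estimate for a common error sigma on y."""
    x = np.asarray(x, float)
    return float(sigma ** 2 / np.sum(x ** 2))

def chi_square(y, fx, sigma):
    """Sum of squared residuals, each divided by the error."""
    y, fx = np.asarray(y, float), np.asarray(fx, float)
    return float(np.sum(((y - fx) / sigma) ** 2))

# Data from Problem 1
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d = [0.0, 11, 19, 33, 40, 49, 61]
sigma = 2.0

a = fit_slope(t, d)
var_a = slope_variance(t, sigma)
chi2_min = chi_square(d, a * np.asarray(t), sigma)

print(f"{a:.1f} ± {np.sqrt(var_a):.1f}")  # 10.1 ± 0.2
print(round(chi2_min))                    # 3
```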


Problem 2:

In [None]:
t_2 = [1.1, 2.2, 2.9, 4.1, 5.0, 5.8]
d_2 = [10, 20, 30, 40, 50, 60]
sigma_2 = 0.1

# Here sigma is the error on t, so fit t = m * d (y -> t, x -> d); the slope is m = 1 / velocity
m = f.ls_simple(d_2, t_2)
velocity_2 = 1 / m
variance_m = f.ls_simple_var(d_2, sigma_2)
variance_v = variance_m / m**4 # combination of errors: V(v) = (dv/dm)^2 * V(m), with v = 1/m
fx_2 = [d_2i * m for d_2i in d_2] # f(x;a) = m * d, the predicted time for each distance
chi2_2 = f.chi2(t_2, fx_2, sigma_2)

print(f"{round(velocity_2, 1)}±{round(np.sqrt(variance_v), 1)}")
print(round(chi2_2, 2))

10.1±0.1
10.6
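
The error on the velocity comes from the combination of errors applied to $v = 1/m$: since $dv/dm = -1/m^2$, $\sigma_v = \sigma_m / m^2$. The sketch below works through this step with the Problem 2 data. The function `inverse_with_error` is a hypothetical helper, and the slope formulae assume the equal-error fit through the origin, which the notebook's own helpers may or may not use.

```python
import numpy as np

def inverse_with_error(m, sigma_m):
    """Return v = 1/m and its first-order propagated error sigma_m / m**2."""
    v = 1.0 / m
    sigma_v = abs(-1.0 / m ** 2) * sigma_m  # |dv/dm| * sigma_m
    return v, sigma_v

# Data from Problem 2: the error is on the times, the distances are exact
d = np.array([10, 20, 30, 40, 50, 60], dtype=float)
t = np.array([1.1, 2.2, 2.9, 4.1, 5.0, 5.8])
sigma_t = 0.1

# Fit t = m * d (assumed equal-error formulae for a line through the origin)
m = np.sum(d * t) / np.sum(d ** 2)
sigma_m = sigma_t / np.sqrt(np.sum(d ** 2))

v, sigma_v = inverse_with_error(m, sigma_m)
print(f"{v:.1f} ± {sigma_v:.1f}")  # 10.1 ± 0.1
```

A quick consistency check on this propagation is that the relative error is unchanged by taking the inverse: $\sigma_v / v = \sigma_m / m$.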
