# Exercises: Data Analysis with Python

In [None]:
import numpy as np
import matplotlib.pyplot as plt

## Linear Fit with Trial & Error
In this exercise we implement a simple linear fit by trial and error. The idea is to vary the slope and pick the value that gives the smallest sum of squared deviations between the data and the line (the _least-squares_ criterion).
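
Written out, and assuming the model is a straight line through the origin, $y = m x$ (which is what the code in this exercise fits), the quantity to minimise can be expressed as follows. The symbol $S$ is introduced here only for illustration; $n$ is the number of data points.

$$
S(m) = \sum_{i=1}^{n} \left( y_i - m\, x_i \right)^2
$$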

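First, we create some fictional data. The y-values depend linearly on x, with a random perturbation drawn uniformly from [0, noise). Because this perturbation is never negative, the fitted slope will typically come out slightly above `slope`.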

In [None]:
n_points = 10
slope = 2
noise = 1

x = np.arange(n_points)
y = slope * x + noise * np.random.rand(n_points)

Make a graph of y vs. x.

In [None]:
plt.plot(x, y, '.')
plt.xlabel('x')
plt.ylabel('y')

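Now we assume that we do not know the "exact" slope, but can estimate that it lies between 1 and 3. We scan this interval in small steps and keep the slope with the smallest sum of squared deviations.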

In [None]:
def sq_dev(x, y, m):
    """Sum of squared deviations between y and the line m * x."""
    diff = y - m * x
    sq = diff ** 2
    return np.sum(sq)

m_min = 1
m_max = 3

step = 0.001
trials = np.arange(m_min, m_max, step)

# Start from +inf so that the first trial always sets lsq and opt.
lsq = np.inf
opt = None

for t in trials:
    if (s := sq_dev(x, y, t)) < lsq:
        lsq = s
        opt = t

print(f'The best fit is achieved for m = {opt:.3f}')
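
The same scan can also be written without an explicit loop. The following is a minimal sketch, assuming NumPy broadcasting and `np.argmin`; the seeded stand-in data and the names `residuals`, `sums` and `opt_vec` are hypothetical and only serve to make the example self-contained.

```python
import numpy as np

# Stand-in data (assumption): same recipe as above, but seeded for reproducibility.
rng = np.random.default_rng(0)
x = np.arange(10)
y = 2 * x + rng.random(10)

trials = np.arange(1, 3, 0.001)

# Broadcasting: one row of residuals per trial slope,
# so residuals has shape (len(trials), len(x)).
residuals = y[np.newaxis, :] - trials[:, np.newaxis] * x[np.newaxis, :]

# Sum of squared deviations for every trial slope at once.
sums = np.sum(residuals ** 2, axis=1)

# The index of the smallest sum selects the best trial slope.
opt_vec = trials[np.argmin(sums)]
print(f'The best fit is achieved for m = {opt_vec:.3f}')
```

This evaluates every trial slope in one array operation, at the cost of holding a `len(trials)` by `len(x)` array in memory.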

Compare the result with the value obtained from a library fit function, `scipy.optimize.curve_fit`.

In [None]:
from scipy.optimize import curve_fit

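def f(x, m):
    # Model: straight line through the origin with slope m.
    return m * x

# curve_fit returns the optimal parameters and their covariance matrix.
coeff, pcov = curve_fit(f, x, y)
m = coeff[0]

print(f'Best slope according to scipy.optimize.curve_fit: m = {m:.3f}')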

As long as the step size is small enough and the optimum lies inside the scanned interval, the trial-and-error result reproduces that of the (more efficient) `curve_fit` to within about one step.
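
For this one-parameter model the minimum can also be written in closed form: setting the derivative of the sum of squared deviations with respect to m to zero gives m = Σ(x·y) / Σ(x²). The sketch below uses that value as a reference to illustrate how the trial-and-error result approaches it as the step shrinks. The seeded stand-in data, the helper `grid_search` and the chosen step sizes are assumptions made for this illustration, not part of the exercise.

```python
import numpy as np

# Stand-in data (assumption): same recipe as above, but seeded for reproducibility.
rng = np.random.default_rng(0)
x = np.arange(10)
y = 2 * x + rng.random(10)

# Closed-form least-squares slope for the model y = m * x.
m_exact = np.sum(x * y) / np.sum(x ** 2)

def grid_search(x, y, m_min, m_max, step):
    """Return the trial slope with the smallest sum of squared deviations."""
    trials = np.arange(m_min, m_max, step)
    sums = [np.sum((y - t * x) ** 2) for t in trials]
    return trials[int(np.argmin(sums))]

# Repeat the scan with progressively finer steps and record the error.
errors = {}
for step in (0.1, 0.01, 0.001):
    m_grid = grid_search(x, y, 1, 3, step)
    errors[step] = abs(m_grid - m_exact)
    print(f'step = {step:<5}  m = {m_grid:.4f}  |m - m_exact| = {errors[step]:.1e}')
```

Each error should be no larger than the step, since the nearest trial value is at most half a step away from the exact minimum, provided that minimum lies inside the scanned range.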