## Exercise 1: Line of Best Fit

In school, you probably worked with scatterplots and drew lines of best fit that didn't deviate too far from the data points. Drawing the line this way minimizes the **error** of the line of best fit.

It is important to choose the model that produces the smallest error when fitted to your data, because a lower-error fit gives more reliable predictions.

![image.png](../assets/ex6-line-best-fit.png)

There are many error functions to choose from when modelling data. In this exercise, we will look at *Ordinary Least Squares*. As the picture shows, the error is the vertical distance from the line of best fit to a data point. If we square the errors and take the average of the squared errors, we obtain the **Mean Square Error (MSE)** for the model.

For example, given

- a line of best fit with $\hat y_i = mx_i$,
- data points $(-2,5)$, $(0,0)$, $(3,-6)$,

we can compute the MSE:

$$ MSE = \frac{1}{n}\sum_{i=1}^n (y_i - \hat y_i)^2$$

where $n$ is the number of data points, $y_i$ is the $y$ coordinate of the $i$-th data point, and $\hat y_i$ is the output the line of best fit predicts for the corresponding $x_i$. Substituting $\hat y_i = mx_i$ gives:

$$ MSE = \frac{1}{n}\sum_{i=1}^n (y_i - mx_i)^2$$

Let's plug our data points into the error function for the model.

$$ MSE = \frac {1}{3}[(5-(-2)m)^2 + (0-(0)m)^2 + (-6-3m)^2]$$
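As a quick sanity check, the MSE can also be computed directly from the data points instead of from a hand-expanded formula. The sketch below is only an illustration; the names `points` and `mse` are assumptions introduced here and are not part of the exercise.

```python
# Data points from the example above
points = [(-2, 5), (0, 0), (3, -6)]

def mse(m):
    """Mean square error of the line y_hat = m*x over `points`."""
    return sum((y - m * x) ** 2 for x, y in points) / len(points)

# Try a few candidate slopes; a smaller value means a better fit
for m in (-3, -2, -1, 0):
    print(f"m = {m}: MSE = {mse(m):.4f}")
```

Among these four candidates, $m = -2$ gives the smallest error, which suggests the true minimum lies somewhere near that value.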

### Exercise 1.1

First, expand the MSE equation above by hand and simplify it. Then write it as a function in the cell below and use code to find the value of $m$ that minimizes our error function.

Remember to format it as:

```
def f(x):
    return...
```

When defining your function, use `x` instead of $m$.

In [56]:
# exercise 1.1
import math
import numpy as np

# 1- First, check whether f(x) = 0 has a real solution.

def find_Solution_With_Discriminant(a=13/3, b=56/3, c=61/3):
    # Solve the second-degree polynomial a*x^2 + b*x + c = 0.
    # Defaults: (13/3)x^2 + (56/3)x + 61/3 = 0
    # Δ = b^2 - 4ac
    # if Δ < 0: no solution in R
    # if Δ = 0: one (double) solution x = -b/(2a)
    # if Δ > 0: two solutions
    #   x1 = (-b + √Δ)/(2a) and x2 = (-b - √Δ)/(2a)
    delta = b**2 - 4*a*c
    if delta < 0:
        print('no solution in R: f(x) is never 0.')
        x1, x2 = None, None
    elif delta == 0:
        x1 = x2 = -b/(2*a)
    else:
        x1 = (-b + math.sqrt(delta))/(2*a)
        x2 = (-b - math.sqrt(delta))/(2*a)
    return x1, x2

find_Solution_With_Discriminant()  # => prints: no solution in R: f(x) is never 0.



# 2- Now find the x where f(x) is at its minimum, i.e. where the derivative f'(x) is close to 0.
# We scan a grid of x values and report every x with 0 < |f'(x)| < 0.09.

def f(x):
    return (13/3)*x**2 + (56/3)*x + 61/3

def fprime(x):
    # forward-difference approximation of the derivative
    h = 1e-5
    return (f(x+h) - f(x))/h

xRange = np.linspace(-10,0,500)
slopes = fprime(xRange)  # evaluate the derivative once for the whole grid
for i in range(len(slopes)):
    if 0 < abs(slopes[i]) < 0.09:
        print(f"\n The value of x for when f(x) is at its minimum is around: {xRange[i]}")
        print(f'f\'(x) = {slopes[i]}')





no solution in R: f(x) is never 0.

 The value of x for when f(x) is at its minimum is around: -2.1442885771543088
f'(x) = 0.08287566437559235


### Exercise 1.2

1. Use the `linspace` function to create 29 points between -28 and 10 and save the result as `W`.
2. After that, divide every element in W by 13 and save this new result as `W`.

In [21]:
# exercise 1.2
W = np.linspace(-28,10,29)
W = W/13
W

array([-2.15384615, -2.04945055, -1.94505495, -1.84065934, -1.73626374,
       -1.63186813, -1.52747253, -1.42307692, -1.31868132, -1.21428571,
       -1.10989011, -1.00549451, -0.9010989 , -0.7967033 , -0.69230769,
       -0.58791209, -0.48351648, -0.37912088, -0.27472527, -0.17032967,
       -0.06593407,  0.03846154,  0.14285714,  0.24725275,  0.35164835,
        0.45604396,  0.56043956,  0.66483516,  0.76923077])

### Exercise 1.3

Evaluate `f(W)` and `fprime(W)` in the cell below and determine the value in `W` that makes `fprime` equal to 0 (or very close to it!).

Use the loop that you wrote in the previous exercise. That value is the $m$ for which our line of best fit has the smallest error.

NOTE: Make the print statement in your loop as "The value of m that gives the smallest error is..."

In [22]:
# exercise 1.3

def f(x):
    return (13/3)*x**2 + (56/3)*x + 61/3

def fprime(x):
    # forward-difference approximation of the derivative
    h = 1e-5
    return (f(x+h) - f(x))/h


MinimumSolutionsList = []
slopes = fprime(W)  # evaluate the derivative once for every value in W
# Find the value whose slope is closest to zero: the upper limit starts at 0
# and grows by 0.0001 until at least one |fprime| falls below it.
upperLimit = 0
while len(MinimumSolutionsList) == 0:
    for i in range(len(slopes)):
        if abs(slopes[i]) < upperLimit:
            print(f"\nThe value of x is: {W[i]}")
            print(slopes[i])
            MinimumSolutionsList.append(W[i])
            print(f'Note: The upper limit is tested with : {upperLimit} accuracy')

    upperLimit += 0.0001








The value of x is: -2.1538461538461537
4.333315928306546e-05
Note: The upper limit is tested with : 0.0001 accuracy
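
For a line through the origin, setting the derivative of the MSE to zero gives $m = \sum x_i y_i / \sum x_i^2$, which offers an independent check on the grid search. The sketch below is a minimal illustration using exact fractions; the names `points`, `sum_xy`, `sum_xx` and `m_star` are assumptions introduced here.

```python
from fractions import Fraction

# Data points from Exercise 1
points = [(-2, 5), (0, 0), (3, -6)]

# d(MSE)/dm = 0  =>  m = sum(x*y) / sum(x*x) for the model y_hat = m*x
sum_xy = sum(x * y for x, y in points)
sum_xx = sum(x * x for x, y in points)
m_star = Fraction(sum_xy, sum_xx)

print(m_star, float(m_star))
```

The result is $-28/13$, roughly $-2.1538$, which matches the value found by the loop and explains why the grid in Exercise 1.2 starts at $-28$ and is divided by 13.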


## Exercise 2: Multivariate Calculus + Linear Algebra

Up to now, we've looked at functions with respect to one variable, but what if we have more than one variable in our function and we want to take a derivative?

Going back to the error function from the line-of-best-fit exercise, what if I wanted to fit the line

$$\hat y_i = mx_i + b$$

to the points $(-3,7)$, $(-2,5)$ and $(-1,3)$?

This gives the Mean Square Error function:

$$ f(m,b) = \frac{1}{n}\sum_{i=1}^n (y_i - mx_i - b)^2$$
$$f(m,b) = \frac {1}{3}[(7+3m-b)^2 + (5+2m-b)^2 + (3+m-b)^2]$$

Say we want to find the values of $m$ and $b$ that minimize this function. In this case, we take **partial derivatives**: a partial derivative is a derivative with respect to one of the variables while holding the other constant. Taking the partial derivatives of the above function with respect to $m$ and $b$, we get:

$$\frac{\partial f(m,b)}{\partial m} = \frac{2}{3}[(7+3m-b)(3) + (5+2m-b)(2) + (3+m-b)] $$

$$\frac{\partial f(m,b)}{\partial b} = \frac{2}{3}[(7+3m-b)(-1) + (5+2m-b)(-1) + (3+m-b)(-1)] $$

> To better understand how we obtained these derivatives by hand, watch [this video](https://youtu.be/TgIl15Nlg_U) for a more detailed explanation.

Let's clean up the above equations:

$$\frac{\partial f(m,b)}{\partial m} = \frac{2}{3}[34 + 14m - 6b] $$

$$\frac{\partial f(m,b)}{\partial b} = \frac{2}{3}[-15 -6m + 3b] $$
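
One way to gain confidence in the simplified partial derivatives is to compare them with numerical estimates at an arbitrary point. The following is a rough sketch, not the only way to check this; the function names `f_mb`, `df_dm`, `df_db` and `numeric_partials`, and the test point, are assumptions chosen for illustration.

```python
def f_mb(m, b):
    """MSE for the points (-3, 7), (-2, 5), (-1, 3)."""
    return ((7 + 3*m - b)**2 + (5 + 2*m - b)**2 + (3 + m - b)**2) / 3

def df_dm(m, b):
    # simplified partial derivative with respect to m
    return (2/3) * (34 + 14*m - 6*b)

def df_db(m, b):
    # simplified partial derivative with respect to b
    return (2/3) * (-15 - 6*m + 3*b)

def numeric_partials(m, b, h=1e-6):
    """Central-difference estimates of (df/dm, df/db)."""
    dm = (f_mb(m + h, b) - f_mb(m - h, b)) / (2*h)
    db = (f_mb(m, b + h) - f_mb(m, b - h)) / (2*h)
    return dm, db

m0, b0 = 0.5, -1.0  # arbitrary test point
print("numeric: ", numeric_partials(m0, b0))
print("analytic:", (df_dm(m0, b0), df_db(m0, b0)))
```

If the hand derivation is right, the two printed pairs should agree to several decimal places.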

Since we want a minimum, we set the partial derivatives equal to 0. Multiplying both sides by $\frac{3}{2}$, we get a familiar system of equations:

$$34 + 14m - 6b = 0$$

$$-15-6m+3b = 0$$

Converting to matrix form, we get:

$$\begin{bmatrix} 34 & 14 & -6 \\ -15 & -6 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ m \\ b \end{bmatrix} = \begin {bmatrix} 0\\ 0\end{bmatrix} $$

From here, we can use our standard matrix operations to solve for values of $m$ and $b$.

We can rewrite the above equation as:

$$\begin{bmatrix} 14 & -6 \\ -6 & 3 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \begin {bmatrix} -34\\ 15\end{bmatrix} $$

**EXTRA** Try to work out by hand how I converted the first matrix equation into the second.

Use the cell below to write code that will solve the above matrix for the values of $m$ and $b$ that minimize our error.

In [19]:
# exercise 2
from scipy import linalg
Matrix = np.array([[14,-6],[-6,3]])
#let's check the determinant:
print(f'The determinant is {linalg.det(Matrix)} > 0')

Solutions = np.array([-34,15])
m,b = np.linalg.solve(Matrix, Solutions)
print(f'the solution is m={m} and b={b} ')



The determinant is 6.0000000000000036 > 0
the solution is m=-2.0 and b=1.0 
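
To confirm that this solution really minimizes the error, we can plug it back into the MSE. The sketch below is a minimal check under the assumption that the solution is exactly $m=-2$, $b=1$; the names `points` and `mse` are introduced here for illustration.

```python
# Data points from Exercise 2
points = [(-3, 7), (-2, 5), (-1, 3)]

def mse(m, b):
    """Mean square error of the line y_hat = m*x + b over `points`."""
    return sum((y - (m*x + b))**2 for x, y in points) / len(points)

m, b = -2.0, 1.0  # solution reported above
print(mse(m, b))        # error at the solution
print(mse(m + 0.1, b))  # nudging the slope
print(mse(m, b + 0.1))  # nudging the intercept
```

The error at the solution is 0 because the three points happen to lie exactly on the line $y = -2x + 1$. Since a mean of squares can never be negative, this is the smallest possible value, and nudging either parameter gives a larger error.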


In [None]:
# exercise 2 EXTRA
1. Carry out the multiplication symbolically, without knowing $m$ or $b$ yet. The first entry of the vector of unknowns is the known constant 1, so the product is:

$$\begin{bmatrix} 34 & 14 & -6 \\ -15 & -6 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ m \\ b \end{bmatrix} = \begin{bmatrix} 34 + 14m - 6b \\ -15 - 6m + 3b \end{bmatrix}$$

2. We want both of these first-degree polynomials to equal 0:

$$\begin{bmatrix} 34 + 14m - 6b \\ -15 - 6m + 3b \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

3. Move the known constants to the right-hand side of the equals sign:

$$\begin{bmatrix} 14m - 6b \\ -6m + 3b \end{bmatrix} = \begin{bmatrix} -34 \\ 15 \end{bmatrix}$$

4. Factoring the unknowns out into their own vector, the system becomes:

$$\begin{bmatrix} 14 & -6 \\ -6 & 3 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} -34 \\ 15 \end{bmatrix}$$


_Using this method, we can fit more complicated models that have more than one parameter to our data for better results!_