# ___

# [ Machine Learning in Geosciences ]

**Department of Applied Geoinformatics and Cartography, Charles University**

*Lukas Brodsky lukas.brodsky@natur.cuni.cz*



## Gradient Descent - exercises

Task: **fix the gradient descent optimisation** for a different data set. The steps of gradient descent (GD) are the same.


**Sub-tasks**:

1/ Improve the Gradient descent function: store history of loss. 

2/ Plot the loss against the epochs. 

3/ Store the history of the w and b parameters.  

4/ Plot w and b against the epochs  


In [None]:
import os
import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt
import matplotlib
matplotlib.rcParams.update({'font.size': 16})

### Data 

Use feature `f1` from the dataset `data_gd.csv`.


In [None]:
# os.listdir()

In [None]:
# Adjust dir path as needed 
df = pd.read_csv('data_gd.csv', sep=',', header='infer')

In [None]:
x = df['f1']
y = df['y']

In [None]:
# Data plot 
plt.scatter(x, y)

### Task
Build a regression model to predict `y` from the `x` values.

In [None]:
print("We have {} records to train the model.".format(x.size))

### Model: linear regression

 
$$ 
    f(x) = wx + b
$$ 

where `x` is a vector of input features;
`y` is a vector of outputs (targets), also called the response variable;
`b` is the bias term (intercept), often written as `w0`;
`w` is the weight(s) (the slope, i.e. the direction of the linear model).

Find the optimal values of `w` and `b`!



In [None]:
# Prediction model
def model(x, w, b):
    """Linear model
    """
    
    return w*x + b  

### Loss function (MSE)

$$
    Loss = L = \frac{1}{N} \sum_{i=1}^{N}(y_i - (wx_i + b))^2 
$$ 

In [None]:
# Define the loss function 
def loss(x, y, w, b):
    """Loss function (MSE)  
    """
    N = len(x)

    # Initialize loss
    total_error = 0.0 
    for i in range(N):
        total_error += (y[i] - (w*x[i] + b))**2

    return total_error / N 
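
The loop above can also be written with NumPy array operations, which is usually faster on larger data sets. The following is a minimal sketch: the name `loss_vectorized` and the small sample arrays are assumptions made for illustration and are not part of the exercise data.

```python
import numpy as np


def loss_vectorized(x, y, w, b):
    """Mean squared error of the linear model w*x + b (vectorized)."""
    # Converting to arrays makes the function independent of any pandas index
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = y - (w * x + b)
    return float(np.mean(residuals ** 2))


# Hypothetical sample values, generated from y = 2x + 1
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])

print(loss_vectorized(x_demo, y_demo, w=2.0, b=1.0))  # perfect fit -> 0.0
print(loss_vectorized(x_demo, y_demo, w=0.0, b=0.0))  # initial guess -> 21.0
```

Both versions compute the same quantity; the vectorized one simply replaces the explicit sum with `np.mean`.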


### Set parameters

In [None]:
# Learning rate (here alpha) 
alpha = 0.001 

# Number of epochs to iterate 
epochs = 15000
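
Whether a given learning rate works depends on the scale of the feature. For the MSE loss of a linear model, plain gradient descent converges only if `alpha` is below `2 / λ_max`, where `λ_max` is the largest eigenvalue of the Hessian of the loss. The sketch below estimates that bound; the function name `max_stable_learning_rate` and the synthetic features are assumptions for illustration, not values taken from `data_gd.csv`.

```python
import numpy as np


def max_stable_learning_rate(x):
    """Upper bound on alpha for which gradient descent on the MSE converges.

    For L = (1/N) * sum((y - (w*x + b))**2) the Hessian with respect to
    (w, b) is (2/N) * A.T @ A with A = [x, 1]. Gradient descent converges
    for 0 < alpha < 2 / (largest eigenvalue of that Hessian).
    """
    x = np.asarray(x, dtype=float)
    A = np.column_stack([x, np.ones_like(x)])
    hessian = (2.0 / len(x)) * (A.T @ A)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    largest_eigenvalue = np.linalg.eigvalsh(hessian)[-1]
    return float(2.0 / largest_eigenvalue)


# Synthetic features (assumption): the same values on two different scales
rng = np.random.default_rng(0)
x_small = rng.uniform(0.0, 1.0, size=100)
x_large = 300.0 * x_small

bound_small = max_stable_learning_rate(x_small)
bound_large = max_stable_learning_rate(x_large)
print("bound for x in [0, 1]:  ", bound_small)
print("bound for x in [0, 300]:", bound_large)
```

If the loss grows from epoch to epoch, comparing `alpha` with this bound for the actual feature is one way to check whether the learning rate is the cause; rescaling the feature or lowering `alpha` are the usual remedies.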

### Model weights initialisation

In [None]:
# Initialization
w = 0 
b = 0 

### Calculate the gradients


$$
\frac{\partial L}{\partial w}=\frac{1}{N} \sum_{i=1}^{N} 2\,(y_i - (wx_i + b))\cdot(-x_i)
$$ 

$$
\frac{\partial L}{\partial b}=\frac{1}{N} \sum_{i=1}^{N} 2\,(y_i - (wx_i + b))\cdot(-1)
$$ 
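
A quick way to gain confidence in these derivatives is to compare them with a finite-difference approximation of the loss. The sketch below does this in plain Python; the helper names (`mse`, `analytic_gradients`, `numeric_gradients`) and the sample values are hypothetical and serve only as an illustration.

```python
def mse(x, y, w, b):
    """Mean squared error of the linear model w*x + b."""
    n = len(x)
    return sum((yi - (w * xi + b)) ** 2 for xi, yi in zip(x, y)) / n


def analytic_gradients(x, y, w, b):
    """Partial derivatives of the MSE with respect to w and b."""
    n = len(x)
    dl_dw = sum(2 * (yi - (w * xi + b)) * (-xi) for xi, yi in zip(x, y)) / n
    dl_db = sum(2 * (yi - (w * xi + b)) * (-1) for xi, yi in zip(x, y)) / n
    return dl_dw, dl_db


def numeric_gradients(x, y, w, b, eps=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    dl_dw = (mse(x, y, w + eps, b) - mse(x, y, w - eps, b)) / (2 * eps)
    dl_db = (mse(x, y, w, b + eps) - mse(x, y, w, b - eps)) / (2 * eps)
    return dl_dw, dl_db


# Hypothetical sample values, generated from y = 2x + 1
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [1.0, 3.0, 5.0, 7.0]

print(analytic_gradients(x_demo, y_demo, w=0.0, b=0.0))  # (-17.0, -8.0)
print(numeric_gradients(x_demo, y_demo, w=0.0, b=0.0))   # close to the line above
```

Both gradients are negative at `w = b = 0`, so the update rule will increase `w` and `b`, which is the right direction for this sample.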



### Define update function

$$
    w \leftarrow w - \alpha\frac{\partial L}{\partial w}, 
$$

and 

$$
   b \leftarrow b - \alpha\frac{\partial L}{\partial b}. 
$$

In [None]:
# Define the update function

def update(x, y, w, b, alpha):
    """Update function, which returns updated parameters. 
    """
    dr_dw = 0.0
    dr_db = 0.0
    N = len(x)

    for i in range(N):
        dr_dw += -2 * x[i] * (y[i] - (w * x[i] + b))
        dr_db += -2 * (y[i] - (w * x[i] + b))

    # Update w and b
    w = w - (dr_dw/float(N)) * alpha
    b = b - (dr_db/float(N)) * alpha

    return w, b 
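
To see what a single update does, it can help to work through one step on a few numbers. The sketch below is a vectorized variant of the update rule; the name `update_vectorized`, the sample values and the learning rate of 0.1 are assumptions chosen for illustration only.

```python
import numpy as np


def update_vectorized(x, y, w, b, alpha):
    """One gradient descent step for the MSE loss, returning the new (w, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = y - (w * x + b)
    dl_dw = np.mean(-2.0 * x * residuals)
    dl_db = np.mean(-2.0 * residuals)
    return float(w - alpha * dl_dw), float(b - alpha * dl_db)


# Hypothetical sample values, generated from y = 2x + 1
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])

# At w = b = 0 the gradients are -17 and -8, so one step with alpha = 0.1
# moves the parameters to roughly w = 1.7 and b = 0.8
w_new, b_new = update_vectorized(x_demo, y_demo, w=0.0, b=0.0, alpha=0.1)
print(w_new, b_new)
```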

### Gradient descent function

The gradient descent function iterates **over the epochs**: in each epoch we recalculate the partial derivatives using the function above and update `w` and `b`. We **continue the process until convergence**; in the code below this is approximated by a fixed number of epochs.

In [None]:
# Define the gradient descent function

def gradient_descent(x, y, w, b, alpha, epochs):
    """Gradient descent process. 
    """

    counter = 0  # index of the next figure to draw
    for e in range(epochs):
        w, b = update(x, y, w, b, alpha)

        # Log the progress
        if (e == 0) or (e < 3000 and e % 200 == 0) or (e % 3000 == 0):
            print("epoch: ", str(e), "loss: "+str(loss(x, y, w, b)))
            print("w, b: ", w, b)
            print('---')
            # Plot the update 
            plt.figure(counter)
            axes = plt.gca()
            axes.set_xlim([0,300])
            axes.set_ylim([0,50])
            plt.scatter(x, y)
            X_plot = np.linspace(0,300,300)
            plt.plot(X_plot, X_plot*w + b, 'r-')
            counter += 1
    return w, b                

### Run the Gradient Descent procedure

In [None]:
# Run the Gradient Descent algorithm
w, b = gradient_descent(x, y, w, b, alpha, epochs)
print('End of training!')

In [None]:
print(w, b)

#### What is the result of the optimisation procedure?

1/ Improve the Gradient descent function: store history of loss. 

2/ Plot the loss against the epochs. 

3/ Store the history of the w and b parameters.  

4/ Plot w and b against the epochs  

5/ Evaluate the plots 

### Task 1/ Improve the Gradient Descent function, store history of loss. 

In [None]:
pass
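
One possible way to approach the history tasks is to append the loss and the parameters to lists inside the training loop. The following is only a sketch in plain Python: the function name `gradient_descent_with_history`, the dictionary keys, the toy data and the learning rate are all assumptions, and the settings will need adjusting for `data_gd.csv`.

```python
def mse(x, y, w, b):
    """Mean squared error of the linear model w*x + b."""
    n = len(x)
    return sum((yi - (w * xi + b)) ** 2 for xi, yi in zip(x, y)) / n


def update(x, y, w, b, alpha):
    """One gradient descent step for the MSE loss."""
    n = len(x)
    dl_dw = sum(-2 * xi * (yi - (w * xi + b)) for xi, yi in zip(x, y)) / n
    dl_db = sum(-2 * (yi - (w * xi + b)) for xi, yi in zip(x, y)) / n
    return w - alpha * dl_dw, b - alpha * dl_db


def gradient_descent_with_history(x, y, w, b, alpha, epochs):
    """Run gradient descent and record loss, w and b after every epoch."""
    history = {"loss": [], "w": [], "b": []}
    for _ in range(epochs):
        w, b = update(x, y, w, b, alpha)
        history["loss"].append(mse(x, y, w, b))
        history["w"].append(w)
        history["b"].append(b)
    return w, b, history


# Hypothetical toy data generated from y = 2x + 1
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [1.0, 3.0, 5.0, 7.0]

w_fit, b_fit, history = gradient_descent_with_history(
    x_demo, y_demo, w=0.0, b=0.0, alpha=0.05, epochs=2000
)
print(w_fit, b_fit, history["loss"][-1])
```

Each list in `history` has one entry per epoch, so it can be passed directly to `plt.plot` to draw the loss or the parameters against the epoch number.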

### Task 2/ Plot the loss against the epochs. 

In [None]:
pass

### Task 3/ Store the history of the w and b parameters.  

In [None]:
pass 

### Task 4/ Plot w and b against the epochs  

In [None]:
pass