# Simplest possible machine learning example: linear regression

We use linear regression, i.e. fitting a straight line to a set of data points, as an example of how the parameters of a model can be learned gradually.

## Imports (technical detail)

In [None]:
%matplotlib inline
import numpy
import matplotlib.pyplot as plt

## Training data

In [None]:
X = numpy.array(range(1,11))
y = numpy.array([2.5, 3, 6.5, 7.3, 6.9, 7.3, 10.3, 9.9, 10.3, 12.5])

plt.scatter(X, y)

## Supposed relationship between input and output

From the plot above, it seems reasonable to believe that there is a linear relationship between x values (input) and y values (output).
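As a rough numerical check of that visual impression, one can compute the Pearson correlation coefficient of the training data. This is only a sketch: the variable name `r` is introduced here for illustration, and the data is repeated so the snippet runs on its own. A value close to 1 is consistent with a strong positive linear relationship.

```python
import numpy

# Same training data as above, repeated so the snippet is self-contained
X = numpy.array(range(1, 11))
y = numpy.array([2.5, 3, 6.5, 7.3, 6.9, 7.3, 10.3, 9.9, 10.3, 12.5])

# corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is
# the correlation between X and y
r = numpy.corrcoef(X, y)[0, 1]
print(r)
```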

In [None]:
def line(k, m):
    # k is the intercept and m is the slope; X is the global training input
    return k + m*X

## Training the model

We'd like to find out the best values for k and m. This can be done algebraically, but for demonstration purposes we'll do it using machine learning techniques.
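For reference, the algebraic solution can be sketched with NumPy's least-squares solver. The names `A`, `k_exact` and `m_exact` are introduced here for illustration only, and the data is repeated so the snippet runs on its own. As in `line`, `k` is the intercept and `m` is the slope.

```python
import numpy

# Same training data as above, repeated so the snippet is self-contained
X = numpy.array(range(1, 11))
y = numpy.array([2.5, 3, 6.5, 7.3, 6.9, 7.3, 10.3, 9.9, 10.3, 12.5])

# Design matrix: a column of ones (for the intercept k) and a column of X
# (for the slope m)
A = numpy.column_stack([numpy.ones(len(X)), X])

# lstsq returns (solution, residuals, rank, singular values); only the
# solution is needed here
(k_exact, m_exact), *_ = numpy.linalg.lstsq(A, y, rcond=None)
print(k_exact, m_exact)
```

These values are a useful target: gradient descent with a suitable learning rate should approach them as the number of iterations grows.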

We'll use the mean squared error (halved, which simplifies the gradient) as our cost function and minimize it using batch gradient descent.
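Written out, and assuming the notation $n$ for the number of data points, $(x_i, y_i)$ for the individual points and $\alpha$ for the learning rate (symbols introduced here for illustration), the cost and its partial derivatives are:

$$
J(k, m) = \frac{1}{2n} \sum_{i=1}^{n} \left(k + m x_i - y_i\right)^2
$$

$$
\frac{\partial J}{\partial k} = \frac{1}{n} \sum_{i=1}^{n} \left(k + m x_i - y_i\right),
\qquad
\frac{\partial J}{\partial m} = \frac{1}{n} \sum_{i=1}^{n} \left(k + m x_i - y_i\right) x_i
$$

Each gradient descent step then moves both parameters against their gradients, using all data points at once (hence "batch"):

$$
k \leftarrow k - \alpha \frac{\partial J}{\partial k},
\qquad
m \leftarrow m - \alpha \frac{\partial J}{\partial m}
$$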

The model training setup looks like this:

![Model training setup](linear_regression.jpg)

In [None]:
alpha = 0.01 # learning rate

def plot_current_state(k, m, cost_history):
    plt.figure(figsize=(16, 6))
    plt.subplot(1,2,1)
    plt.xlabel("X", fontsize=20, fontweight='bold')
    plt.ylabel("y", fontsize=20, fontweight='bold')
    plt.plot(X, line(k, m), 'r-')
    plt.plot(X, y, 'bo')
    plt.subplot(1,2,2)
    plt.xlim(0, 10)
    plt.xlabel("iteration", fontsize=20, fontweight='bold')
    plt.ylabel("cost", fontsize=20, fontweight='bold')
    plt.plot(range(len(cost_history)), cost_history, 'rx-', markersize=12, linewidth=1)
    plt.show()

def mean_squared_error(predicted_y, actual_y):
    # Half the mean squared error; the factor 0.5 cancels when differentiating
    errors = numpy.subtract(predicted_y, actual_y)
    squared_errors = numpy.multiply(errors, errors)
    return 0.5 * numpy.mean(squared_errors)

def update_parameters_with_gradient_descent(k, m):
    # One batch gradient descent step; both updates use the same errors,
    # computed from the parameter values before the step
    errors = numpy.subtract(line(k, m), y)
    k = k - alpha * numpy.mean(errors)
    m = m - alpha * numpy.mean(numpy.multiply(errors, X))
    return (k, m)

def training_iteration(k, m, cost_history):
    (k, m) = update_parameters_with_gradient_descent(k, m)
    current_cost = mean_squared_error(line(k, m), y)
    
    cost_history.append(current_cost)
    plot_current_state(k, m, cost_history)
    
    return (k, m, cost_history)

In [None]:
k = 0
m = 0
cost = mean_squared_error(line(k, m), y) # cost before any training, for reference

cost_history = [] # cost after each iteration, kept only for plotting

In [None]:
# Each run of this cell performs one gradient descent step and redraws the plots
(k, m, cost_history) = training_iteration(k, m, cost_history)
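To see where the parameters end up without stepping through the cell by hand, the same update rule can be run in a plain loop without plotting. This is a minimal, self-contained sketch: the iteration count `n_iterations` and the name `final_cost` are assumptions made for illustration and are not part of the setup above.

```python
import numpy

# Same training data and learning rate as above, repeated so the snippet
# is self-contained
X = numpy.array(range(1, 11))
y = numpy.array([2.5, 3, 6.5, 7.3, 6.9, 7.3, 10.3, 9.9, 10.3, 12.5])
alpha = 0.01

k, m = 0.0, 0.0
n_iterations = 20000  # assumed value, chosen generously for this learning rate

for _ in range(n_iterations):
    # Errors are computed once per step, so k and m are updated simultaneously
    errors = (k + m * X) - y
    k, m = k - alpha * numpy.mean(errors), m - alpha * numpy.mean(errors * X)

# Halved mean squared error, matching the cost function used above
final_cost = 0.5 * numpy.mean(((k + m * X) - y) ** 2)
print(k, m, final_cost)
```

With this small learning rate the intercept `k` converges much more slowly than the slope `m`, which is why many iterations are needed before both parameters settle.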