# A simple linear regression example in Flux

In this notebook, we walk through a simple linear regression model using [Flux](https://fluxml.ai/), a machine learning library for Julia. This example is based on [this one](https://discourse.julialang.org/t/training-a-simple-linear-model-in-flux/24741/2).

To train an ML model, we need to define the following elements:

* Data set
* Parameters of the model
* Objective function
* Optimization routine

In [1]:
#Load the necessary packages
using Flux, Plots
using Flux: throttle, @epochs

#Backend for Plots
plotly()

┌ Info: For saving to png with the Plotly backend ORCA has to be installed.
└ @ Plots /home/liliana/.julia/packages/Plots/8GUYs/src/backends.jl:363


Plots.PlotlyBackend()

## Data set

Usually, we start by obtaining the data set that we use to train our model and test it. For this example, we'll create some random data and reshape it so we can pass it to our model.

We are creating a simple regression model with one input. Hence, we need to create a data set with one feature and a target variable.

In [2]:
X = rand(100);
Y = 0.5X + rand(100);

To visualize the data we just created, we draw a scatter plot of the feature `X` against the target `Y`.

In [3]:
scatter(X, Y, title = "Data")

The data we just created does not have a strong linear correlation, since the noise is on the same scale as the signal, but it is still useful for trying out Flux.
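As a reference point for the training that follows, the least-squares line for data of this form can be computed in closed form. The sketch below is not part of the original notebook; it regenerates data with the same recipe and uses Julia's backslash operator, and the names `A`, `slope`, and `intercept` are illustrative. Because `rand` draws uniformly from [0, 1), the noise has mean 0.5, so both the slope and the intercept should land near 0.5, up to sampling noise.

```julia
# Same recipe as the notebook: one feature, uniform noise added to 0.5X
X = rand(100)
Y = 0.5X + rand(100)

# Design matrix with a column of ones for the intercept
A = [X ones(length(X))]

# Backslash solves the least-squares problem min ‖A*θ - Y‖²
slope, intercept = A \ Y

println("slope = ", slope, ", intercept = ", intercept)
```

The trained Flux model should move toward these values as the number of epochs grows.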

We prepare the data for feeding it into the model.

In [4]:
#Reshape each vector into a 1×100 matrix (one column per sample)
Xd = reduce(hcat,X)
Yd = reduce(hcat,Y)

data = [(Xd, Yd)];
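Flux layers such as `Dense` expect a matrix whose columns are samples, which is why the vectors are reshaped. The following sketch (plain Julia, not from the original notebook) checks the resulting shape and shows two equivalent ways of building the same 1×100 matrix; `Xd_reshape` and `Xd_permute` are illustrative names.

```julia
X = rand(100)

Xd = reduce(hcat, X)          # concatenates the 100 scalars side by side
@show size(X)                 # (100,)
@show size(Xd)                # (1, 100)

# Equivalent reshapes that avoid the reduction
Xd_reshape = reshape(X, 1, :)
Xd_permute = permutedims(X)

@show Xd == Xd_reshape == Xd_permute
```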

Also, we prepare some test data for validating our model.

In [5]:
X_test = rand(20);
Y_test = 0.5X_test + rand(20);

In [6]:
Xd_test = reduce(hcat,X_test)
Yd_test = reduce(hcat,Y_test);

## Model

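In machine learning, a simple model is defined as $m = \sigma.(W*x + b)$, where $\sigma$ is an activation function. Linear regression is the special case with no activation function, that is, a single neuron whose output is $W*x + b$. In Flux, we can use the [Dense](https://fluxml.ai/Flux.jl/stable/models/layers/#Flux.Dense) layer to define this model; `Dense(1, 1)` has one input, one output, and the identity activation by default: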

In [7]:
m = Dense(1, 1)

Dense(1, 1)
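To make the formula concrete, here is a hand-written sketch of what a one-input, one-output layer with the identity activation computes. It does not call Flux; `W`, `b`, and `affine` are illustrative names, and the weight value is arbitrary.

```julia
# Illustrative parameters: a 1×1 weight matrix and a length-1 bias vector
W = fill(0.7f0, 1, 1)
b = zeros(Float32, 1)

# Affine map with identity activation: the bias is broadcast over every column
affine(x) = W * x .+ b

x = Float32[0.0 0.5 1.0]      # three samples stored as columns (1×3)
y = affine(x)

@show size(y)                 # (1, 3)
@show y
```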

We also need to collect the parameters of the model so we can pass them to the train function we'll use below.

In [8]:
ps = Flux.params(m)

Params([Float32[-0.7137795], Float32[0.0]])

When we create a model, Flux initializes the weights with random values and, as the output above shows, the bias with zeros. At this point we compute predictions with the initial parameter values so that we can compare them with the predictions obtained after training the model.

In [9]:
pred_0 = m(reduce(hcat,X));

## Loss function

For this example, we use the mean squared error (**MSE**) as the loss function. Flux provides many loss functions that we can use out of the box.

In [10]:
loss(x, y) = Flux.Losses.mse(m(x), y)

loss (generic function with 1 method)
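For reference, the mean squared error over $N$ samples is $\frac{1}{N}\sum_i (\hat{y}_i - y_i)^2$. The sketch below writes it out in plain Julia; `my_mse` is an illustrative name, and the function is meant to mirror `Flux.Losses.mse` with its default settings rather than replace it.

```julia
using Statistics: mean

# Mean of the squared element-wise differences
my_mse(y_pred, y) = mean(abs2, y_pred .- y)

y_pred = [1.0 2.0 3.0]
y      = [1.0 2.0 5.0]

@show my_mse(y_pred, y)       # (0 + 0 + 4) / 3
```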

## Optimiser

We need to set the optimization routine (optimiser) that we'll use to train our model. The optimiser updates the parameters so as to minimise the loss function.

In [11]:
#We use Gradient Descent with a learning rate of 0.3
opt = Descent(0.3)

Descent(0.3)
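Written out for this model, with prediction $\hat{y}_i = W x_i + b$, learning rate $\eta = 0.3$, and the MSE loss $L$, a gradient-descent step takes the following form. This is the textbook rule, given here as a sketch rather than as a description of Flux's internals.

$$
\begin{aligned}
L(W, b) &= \frac{1}{N} \sum_{i=1}^{N} (W x_i + b - y_i)^2 \\
\frac{\partial L}{\partial W} &= \frac{2}{N} \sum_{i=1}^{N} (W x_i + b - y_i)\, x_i \\
\frac{\partial L}{\partial b} &= \frac{2}{N} \sum_{i=1}^{N} (W x_i + b - y_i) \\
W &\leftarrow W - \eta \, \frac{\partial L}{\partial W}, \qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
\end{aligned}
$$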

## Train the model

After setting the data, model, loss function, and optimiser, we can finally train our model. In Flux, the `Flux.train!` function executes one pass over the data.

In [12]:
Flux.train!(loss, ps, data, opt)
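Because `data` holds a single `(Xd, Yd)` batch, one call to `Flux.train!` amounts to one gradient-descent update. The sketch below reproduces such an update in plain Julia with hand-derived gradients of the MSE. It does not use Flux, and `W`, `b`, `η`, and `mse_loss` are illustrative names with arbitrary starting values.

```julia
using Statistics: mean

X = rand(100)
Y = 0.5X + rand(100)

# Illustrative starting parameters and the notebook's learning rate
W, b = -0.7, 0.0
η = 0.3

mse_loss(W, b) = mean(abs2, W .* X .+ b .- Y)
loss_before = mse_loss(W, b)

# Gradients of the MSE with respect to W and b
residual = W .* X .+ b .- Y
grad_W = 2 * mean(residual .* X)
grad_b = 2 * mean(residual)

# One gradient-descent update
W -= η * grad_W
b -= η * grad_b

loss_after = mse_loss(W, b)
println("loss before = ", loss_before, ", after = ", loss_after)
```

Flux obtains the gradients by automatic differentiation instead of by hand, which is what lets the same training call work for models where deriving gradients manually would be impractical.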

Flux allows us to create a *callback* function so that we can print information during the training process.

In [13]:
evalcb() = @show(loss(Xd_test, Yd_test))

evalcb (generic function with 1 method)

In [14]:
Flux.train!(loss, ps, data, opt, cb = throttle(evalcb, 2))

loss(Xd_test, Yd_test) = 0.16068115046976908
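`throttle(evalcb, 2)` wraps the callback so that it fires at most once every 2 seconds, which keeps logging cheap when batches are processed quickly. The following is a simplified sketch of that idea in plain Julia, not Flux's implementation; `simple_throttle` is a hypothetical helper.

```julia
# Hypothetical helper: returns a wrapper that runs `f` at most once
# per `timeout` seconds and silently skips calls made in between.
function simple_throttle(f, timeout)
    last_call = -Inf
    return function (args...)
        t = time()
        if t - last_call >= timeout
            last_call = t
            return f(args...)
        end
        return nothing
    end
end

calls = Ref(0)
counted = simple_throttle(() -> (calls[] += 1), 2)

# Five rapid calls: only the first one gets through
for _ in 1:5
    counted()
end
@show calls[]                 # 1
```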


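To train the model for several epochs, we can either put the `Flux.train!` call inside a `for` loop or use the `@epochs` macro. Note that later Flux releases deprecated `@epochs`, so the plain loop is the more portable option.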

In [15]:
n_epochs = 10;

In [16]:
for i in 1:n_epochs
    Flux.train!(loss, ps, data, opt, cb = throttle(evalcb, 2))
end

loss(Xd_test, Yd_test) = 0.14066560026903524
loss(Xd_test, Yd_test) = 0.13385540934739282
loss(Xd_test, Yd_test) = 0.12971208575526966
loss(Xd_test, Yd_test) = 0.1263509508619403
loss(Xd_test, Yd_test) = 0.12333780553390308
loss(Xd_test, Yd_test) = 0.12055608817689482
loss(Xd_test, Yd_test) = 0.11796643695595285
loss(Xd_test, Yd_test) = 0.11554931480655442
loss(Xd_test, Yd_test) = 0.1132908712645185
loss(Xd_test, Yd_test) = 0.11117936982211041


In [17]:
@epochs n_epochs Flux.train!(loss, ps, data, opt, cb = throttle(evalcb, 2))

loss(Xd_test, Yd_test) = 0.10920422656594098
loss(Xd_test, Yd_test) = 0.10735572520126865
loss(Xd_test, Yd_test) = 0.10562490163186551
loss(Xd_test, Yd_test) = 0.10400351727038441
loss(Xd_test, Yd_test) = 0.10248388782183111
loss(Xd_test, Yd_test) = 0.10105896594736372
loss(Xd_test, Yd_test) = 0.09972221055806488
loss(Xd_test, Yd_test) = 0.09846756402337094
loss(Xd_test, Yd_test) = 0.09728942894690235
loss(Xd_test, Yd_test) = 0.09618260996730169


┌ Info: Epoch 1
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 2
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 3
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 4
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 5
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 6
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 7
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 8
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 9
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 10
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114


In [18]:
#Predictions after training
pred_1 = m(reduce(hcat,X));

## Results

Now, we plot the final results to see how good the model is.

In [19]:
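#Scatter the training data, then overlay the model's line before and after training.
#vec turns the 1×100 prediction matrices into vectors so each is drawn as one series.
plot(X, Y, seriestype = :scatter, label = "Data")
plot!(X, vec(pred_0), lc = :green, label = "Before training")
plot!(X, vec(pred_1), label = "After training")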