# A simple linear regression example in Flux

In this notebook, we go through a simple linear regression model using [Flux](https://fluxml.ai/), a machine learning library for Julia. This example is based on [this Julia Discourse thread](https://discourse.julialang.org/t/training-a-simple-linear-model-in-flux/24741/2).

To train an ML model, we need to define the following elements:

* Data set
* Parameters of the model
* Objective function
* Optimization routine

In [1]:
#Load the necessary packages
using Flux, Plots
using Flux: throttle, @epochs

#Backend for Plots
plotly()

┌ Info: For saving to png with the Plotly backend ORCA has to be installed.
└ @ Plots /home/liliana/.julia/packages/Plots/8GUYs/src/backends.jl:363


Plots.PlotlyBackend()

## Data set

Usually, we start by obtaining the data set that we use to train our model and test it. For this example, we'll create some random data and reshape it so we can pass it to our model.

We are creating a simple regression model with one input. Hence, we need to create a data set with one feature and a target variable.

In [3]:
X = rand(100);
Y = 0.5X + rand(100);

To visualize the data we just created, we plot a scatterplot with the feature `X` and target `Y`.

In [4]:
scatter(X, Y, title = "Data")

The noise we added (uniform on [0, 1)) is large compared with the slope of 0.5, so the linear correlation in this data is weak. It is still useful for trying Flux.
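As a reference point for training, the least-squares line for data generated this way can be computed in closed form. The sketch below uses only Julia's standard library. The names `Xs`, `Ys`, `A`, and `coef`, as well as the fixed seed, are illustrative assumptions and not part of the notebook. Because the noise has mean 0.5, the fitted slope and intercept should both land somewhere near 0.5.

```julia
using LinearAlgebra, Random, Statistics

rng = MersenneTwister(42)            # fixed seed, chosen only for reproducibility
Xs = rand(rng, 100)
Ys = 0.5 .* Xs .+ rand(rng, 100)     # same recipe as the notebook's data

# Design matrix with a column of ones for the intercept
A = hcat(Xs, ones(length(Xs)))

# Least-squares solution of A * [slope, intercept] ≈ Ys
coef = A \ Ys
slope, intercept = coef

println("slope = ", slope, ", intercept = ", intercept)
println("correlation = ", cor(Xs, Ys))
```

A model trained by gradient descent on the same data should move toward these two values, which makes them a handy sanity check.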

We prepare the data for feeding it into the model.

In [5]:
#Reshape each vector into a 1×100 matrix (one row per feature, one column per sample)
Xd = reduce(hcat,X)
Yd = reduce(hcat,Y)

data = [(Xd, Yd)];
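Flux layers treat each column of the input as one sample, which is why the vectors are turned into 1×N matrices. The following sketch, using a hypothetical five-element vector `x`, shows the resulting shape and two alternative ways of getting the same layout with functions from Julia's `Base`.

```julia
x = rand(5)                  # a plain vector with 5 elements

a = reduce(hcat, x)          # approach used in the notebook
b = reshape(x, 1, :)         # same layout, without repeated concatenation
c = permutedims(x)           # also yields a 1×5 matrix

println(size(x))             # (5,)
println(size(a))             # (1, 5)
println(a == b == c)         # true
```

For long vectors, `reshape` or `permutedims` may be preferable because they avoid concatenating one element at a time.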

We also prepare some test data for validating our model.

In [6]:
X_test = rand(20);
Y_test = 0.5X_test + rand(20);

In [7]:
Xd_test = reduce(hcat,X_test)
Yd_test = reduce(hcat,Y_test);

## Model

A simple linear regression model is defined as $m(x) = W*x + b$. A dense layer computes $\sigma.(W*x + b)$, so we obtain the linear model by using a single neuron with no activation function (the default, where $\sigma$ is the identity). In Flux, we can use the [Dense](https://fluxml.ai/Flux.jl/stable/models/layers/#Flux.Dense) function to define this model:

In [8]:
m = Dense(1, 1)

Dense(1, 1)
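To make the computation concrete, here is a minimal plain-Julia stand-in for a layer with one input, one output, and identity activation. The names `W`, `b`, and `linear`, and the weight value 0.5, are hypothetical and chosen only for illustration; this is not how Flux implements `Dense` internally.

```julia
# Hypothetical stand-in for a 1-input, 1-output dense layer with identity activation
W = fill(0.5f0, 1, 1)        # 1×1 weight matrix (illustrative value)
b = zeros(Float32, 1)        # bias vector of length 1

linear(x) = W * x .+ b       # the bias is broadcast across the columns (samples)

x = Float32[0.0 1.0 2.0]     # three samples with one feature each (a 1×3 matrix)
println(linear(x))           # each input is multiplied by 0.5
```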

We also need to obtain the parameters of the model so that we can pass them to the training function used below.

In [9]:
ps = Flux.params(m)

Params([Float32[-0.946168], Float32[0.0]])

When we create a model, Flux initializes the weights with random values and the bias with zeros, as the output above shows. Before training, we compute predictions with these initial parameters so that we can compare them with the predictions obtained after training the model.

In [10]:
pred_0 = m(reduce(hcat,X));

## Loss function

For this example, we use the mean squared error (**MSE**), which averages the squared differences between predictions and targets. Flux has many loss functions that we can use out of the box.

In [11]:
loss(x, y) = Flux.Losses.mse(m(x), y)

loss (generic function with 1 method)
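For reference, the mean squared error can be written in one line of plain Julia. The helper name `my_mse` and the sample values are hypothetical; the sketch only illustrates the formula and makes no claim about how `Flux.Losses.mse` is implemented.

```julia
using Statistics

# Mean squared error: average of the squared element-wise differences
my_mse(yhat, y) = mean(abs2, yhat .- y)

yhat = [1.0 2.0 3.0]         # predictions
y    = [1.0 2.0 5.0]         # targets

println(my_mse(yhat, y))     # (0 + 0 + 4) / 3
```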

## Optimiser

We need to set the optimization routine (optimiser) that we'll use to train our model. The optimiser updates the parameters to minimise the loss function.

In [12]:
#We use Gradient Descent with a learning rate of 0.3
opt = Descent(0.3)

Descent(0.3)
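Plain gradient descent moves each parameter a small step against its gradient: `p = p - η * ∇p`, where `η` is the learning rate. The sketch below applies one such step to a line `w*x + b` under the MSE loss. The gradients are written out by hand here, whereas Flux obtains them through automatic differentiation. The function names and the toy data are hypothetical.

```julia
using Statistics

# One gradient-descent step for the model w*x + b under the MSE loss
function descent_step(w, b, x, y, η)
    r = w .* x .+ b .- y              # residuals
    gw = 2 * mean(r .* x)             # derivative of the loss with respect to w
    gb = 2 * mean(r)                  # derivative of the loss with respect to b
    return w - η * gw, b - η * gb     # move against the gradient
end

mse_line(w, b, x, y) = mean(abs2, w .* x .+ b .- y)

x = [0.0, 0.5, 1.0]
y = [0.5, 0.75, 1.0]                  # points on the line y = 0.5x + 0.5

w, b = 0.0, 0.0
before = mse_line(w, b, x, y)
w, b = descent_step(w, b, x, y, 0.3)  # learning rate 0.3, as in the notebook
after = mse_line(w, b, x, y)

println(before, " -> ", after)        # the loss decreases after one step
```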

## Train the model

After setting the data, model, loss function and optimiser, we can finally train our model. In Flux, the `Flux.train!` function runs one pass over `data`. Since our `data` holds a single batch, each call performs one gradient step.

In [13]:
Flux.train!(loss, ps, data, opt)

Flux allows us to create a *callback* function so that we can print information during the training process.

In [14]:
evalcb() = @show(loss(Xd_test, Yd_test))

evalcb (generic function with 1 method)

In [15]:
Flux.train!(loss, ps, data, opt, cb = throttle(evalcb, 2))

loss(Xd_test, Yd_test) = 0.10897509166799908
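Wrapping the callback in `throttle(evalcb, 2)` limits it to at most one call every 2 seconds, which keeps the log readable when training steps are fast. The idea can be sketched in plain Julia as follows. `simple_throttle` is a hypothetical, simplified helper written for illustration, not Flux's implementation.

```julia
# Hypothetical, simplified throttle: run `f` at most once every `timeout` seconds
function simple_throttle(f, timeout)
    last_run = -Inf                       # time of the last accepted call
    return function (args...)
        t = time()
        if t - last_run >= timeout
            last_run = t
            return f(args...)
        end
        return nothing                    # call skipped because it came too soon
    end
end

calls = Ref(0)                            # counts how often `report` really runs
report() = (calls[] += 1)

throttled = simple_throttle(report, 10)   # at most one call per 10 seconds
throttled(); throttled(); throttled()     # three calls in quick succession
println(calls[])                          # 1
```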


To train the model for several epochs, we can either put the `Flux.train!` call inside a *for loop* or use the `@epochs` macro.

In [16]:
n_epochs = 10;

In [17]:
for i in 1:n_epochs
    Flux.train!(loss, ps, data, opt, cb = throttle(evalcb, 2))
end

loss(Xd_test, Yd_test) = 0.1006425324615469
loss(Xd_test, Yd_test) = 0.0962499936421996
loss(Xd_test, Yd_test) = 0.09256173243582275
loss(Xd_test, Yd_test) = 0.08924318457689616
loss(Xd_test, Yd_test) = 0.08622045415482855
loss(Xd_test, Yd_test) = 0.08346178309904281
loss(Xd_test, Yd_test) = 0.08094474006032217
loss(Xd_test, Yd_test) = 0.07864994687894471
loss(Xd_test, Yd_test) = 0.076559759158489
loss(Xd_test, Yd_test) = 0.07465792123426583


In [18]:
@epochs n_epochs Flux.train!(loss, ps, data, opt, cb = throttle(evalcb, 2))

loss(Xd_test, Yd_test) = 0.07292942104970548
loss(Xd_test, Yd_test) = 0.07136037709835193
loss(Xd_test, Yd_test) = 0.0699379751968858
loss(Xd_test, Yd_test) = 0.06865036269286448
loss(Xd_test, Yd_test) = 0.0674865997082997
loss(Xd_test, Yd_test) = 0.06643658957597971
loss(Xd_test, Yd_test) = 0.06549100089508153
loss(Xd_test, Yd_test) = 0.06464120738441338
loss(Xd_test, Yd_test) = 0.06387926362139953
loss(Xd_test, Yd_test) = 0.06319783585071694


┌ Info: Epoch 1
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 2
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 3
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 4
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 5
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 6
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 7
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 8
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 9
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114
┌ Info: Epoch 10
└ @ Main /home/liliana/.julia/packages/Flux/05b38/src/optimise/train.jl:114


In [20]:
#Predictions after training
pred_1 = m(reduce(hcat,X));

## Results

Now, we plot the final results to see how good the model is.

In [21]:
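#vec turns the 1×100 prediction matrices into vectors so each is drawn as a single series
plot(X, Y, seriestype = :scatter, label = "Data")
plot!(X, vec(pred_0), lc = :green, label = "Before training")
plot!(X, vec(pred_1), label = "After training")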
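Alongside the plot, a single number can summarise the fit. One common choice is the coefficient of determination, $R^2$. The sketch below defines it in plain Julia. The helper name `r_squared` and the toy values are hypothetical.

```julia
using Statistics

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares)
function r_squared(y, yhat)
    ss_res = sum(abs2, y .- yhat)
    ss_tot = sum(abs2, y .- mean(y))
    return 1 - ss_res / ss_tot
end

y = [1.0, 2.0, 3.0, 4.0]
println(r_squared(y, y))                          # 1.0 for perfect predictions
println(r_squared(y, fill(mean(y), length(y))))   # 0.0 when always predicting the mean
```

In this notebook it could be applied as `r_squared(vec(Yd_test), vec(m(Xd_test)))`. Because the noise variance (1/12) is four times the variance of the signal `0.5X` (0.25/12), even a well-trained model should only be expected to reach an $R^2$ of roughly 0.2 on this data.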