In [1]:
using Flux
using Plots

First, we create some training data: a single input vector `x` with 5 features and a target vector `y` with 2 outputs.

In [2]:
x = rand(5)
y = rand(2)

2-element Vector{Float64}:
 0.2055343543242527
 0.8547895008102977

In [3]:
# define a simple linear regression model
# (W and b are globals assigned in the next cell; they are looked up when model is called)
model(x) = W*x .+ b

model (generic function with 1 method)

In [4]:
# initialize the weights W (2×5) and biases b (length 2) with random values
W = rand(2, 5)
b = rand(2)

2-element Vector{Float64}:
 0.22772129746254488
 0.005443637383175659

Now we can define a loss function, which measures how well the model performs. We use the loss to determine how changing the weights $W$ and biases $b$ affects performance, and then update them accordingly.

In [5]:
function loss(x, y)
    ŷ = model(x)
    sum((y .- ŷ).^2)
end

loss (generic function with 1 method)

The loss is the sum of squared differences between the predicted output $\hat{y}$ and the desired label $y$. Next we choose an optimizer: the algorithm that uses the gradients of the loss to update the parameters $W, b$ toward lower loss.
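Written out, the loss and the kind of update a plain gradient-descent optimizer applies might look as follows. This is a sketch under the model defined above; the symbol $\eta$ for the learning rate is an assumption introduced here, not notation from the notebook.

$$
L(W, b) = \sum_{i} (y_i - \hat{y}_i)^2, \qquad \hat{y} = W x + b
$$

$$
\frac{\partial L}{\partial W} = -2\,(y - \hat{y})\,x^\top, \qquad \frac{\partial L}{\partial b} = -2\,(y - \hat{y})
$$

$$
W \leftarrow W - \eta\,\frac{\partial L}{\partial W}, \qquad b \leftarrow b - \eta\,\frac{\partial L}{\partial b}
$$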

In [6]:
# classic gradient descent optimizer with learning rate 0.1
opt = Descent(0.1)

Descent(0.1)
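To make the optimizer less of a black box, here is a minimal plain-Julia sketch of a single gradient-descent step with learning rate 0.1. It does not use Flux: the gradients are written out by hand for the sum-of-squares loss, and the names `predict`, `sqloss`, `lr`, `gW` and `gb` are illustrative assumptions, not part of the notebook or of Flux.

```julia
# One hand-written gradient-descent step for the linear model W*x .+ b.
# Assumption: analytic gradients replace Flux's automatic differentiation.
x = rand(5)
y = rand(2)
W = rand(2, 5)
b = rand(2)

predict(W, b, x) = W * x .+ b
sqloss(W, b, x, y) = sum((y .- predict(W, b, x)) .^ 2)

lr = 0.1                          # learning rate, as in Descent(0.1)
before = sqloss(W, b, x, y)

r  = y .- predict(W, b, x)        # residual y - ŷ
gW = -2 .* (r * x')               # gradient of the loss w.r.t. W (2×5)
gb = -2 .* r                      # gradient of the loss w.r.t. b (length 2)

W .-= lr .* gW                    # p <- p - lr * gradient
b .-= lr .* gb

after = sqloss(W, b, x, y)
println("loss before: ", before, "  after: ", after)
```

Comparing `before` and `after` shows whether the step reduced the loss on this sample.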

Now we can train the model. Training computes the gradient of the loss with respect to the parameters for each data point, and the optimizer then uses that gradient to update all the parameters. This can be written as a for loop.

In [7]:
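# NOTE: zip(x, y) pairs x and y element by element, so it yields two scalar
# pairs (x[1], y[1]) and (x[2], y[2]); to train on the single sample as a
# whole, data = [(x, y)] would be used instead
data = zip(x, y)
# collect the trainable parameters (implicit-parameter style)
ps = params(W, b)

for d in data
    gs = Flux.gradient(ps) do 
        # d is a tuple; splat it into loss(x, y)
        loss(d...)
    end
    # apply one optimizer step to W and b using the gradients gs
    Flux.Optimise.update!(opt, ps, gs)
end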

In [15]:
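# create a callback that records the loss on (x, y) after every training step
loss_vector = Vector{Float32}()
callback() = push!(loss_vector, loss(x, y))
# NOTE: model is a plain function, so params(model) contains no parameters and
# this call updates nothing; pass ps (i.e. params(W, b)) to actually train W and b
Flux.train!(loss, params(model), data, opt, cb=callback)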

In [18]:
show(loss_vector)

Float32[3.0550015, 3.0550015]
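The two recorded values are identical, which is what one would expect when no parameters are actually being updated. For comparison, the following plain-Julia sketch records the loss after each step when `W` and `b` are updated. It uses hand-written gradients rather than `Flux.train!`, and the function name `descend!` and its keyword arguments are hypothetical, introduced only for illustration.

```julia
# Repeated gradient-descent steps on one sample, recording the loss each time.
# Assumption: analytic gradients of sum((y .- (W*x .+ b)).^2) replace Flux.
function descend!(W, b, x, y; lr=0.1, steps=5)
    losses = Float64[]
    for _ in 1:steps
        r = y .- (W * x .+ b)            # residual y - ŷ
        W .+= (2 * lr) .* (r * x')       # W <- W - lr * (-2 * r * x')
        b .+= (2 * lr) .* r              # b <- b - lr * (-2 * r)
        push!(losses, sum((y .- (W * x .+ b)) .^ 2))
    end
    return losses
end

losses = descend!(rand(2, 5), rand(2), rand(5), rand(2))
show(losses)
```

Unlike the output above, the entries of `losses` should differ from step to step, because the parameters change between recordings.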