# Machine Learning in Julia: Flux.jl

<img src="https://fluxml.ai/logo.png" width=800>

<img src="flux.png" width=900>

Web page: https://fluxml.ai/

Examples: [Model zoo](https://github.com/FluxML/model-zoo/)

# A single neuron

In [43]:
using Flux
import Flux.Tracker: gradient  # Tracker-based AD (Flux v0.9 and earlier)

In [45]:
model(W, b, x) = σ.(W * x .+ b)  # σ is the logistic sigmoid, applied elementwise

model (generic function with 2 methods)

In [59]:
# a single neuron with 5 inputs and 1 output
W = randn(1, 5) # weights
b = zeros(1)    # biases
x = rand(5)     # input

5-element Array{Float64,1}:
 0.4944243981290406 
 0.366416169296387  
 0.28300735323982606
 0.2613408585166095 
 0.22797402563434477

In [60]:
model(W, b, x)

1-element Array{Float64,1}:
 0.3879253475178959

In [61]:
loss(W, b, x) = Flux.mse(model(W,b,x), 0.5)

loss (generic function with 1 method)

In [62]:
loss(W,b,x)

0.012560727728984398
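
In formula form, assuming `Flux.mse` is the mean of the squared differences (which is consistent with the numbers printed above), the neuron and its loss for this single output can be written as:

$$
\hat{y} = \sigma(Wx + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \text{loss} = (\hat{y} - 0.5)^2
$$

With $\hat{y} \approx 0.3879$ this gives $(0.3879 - 0.5)^2 \approx 0.01256$, the value shown above.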

In [63]:
gradient(loss, W, b, x)

([-0.0263142 -0.0195013 … -0.013909 -0.0121332] (tracked), [-0.0532218] (tracked), [-0.00605705, 0.0602944, -0.0399467, 0.0576423, 0.00620331] (tracked))
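
These numbers can be checked by hand. The following is a minimal sketch in plain Julia (no Flux required), assuming the loss for a single output is the squared error $(\hat{y} - t)^2$ and applying the chain rule through the sigmoid. The values of `x` and `ŷ` are copied from the outputs above; the names `t`, `δ`, `∇W`, and `∇b` are introduced here for illustration only.

```julia
# Values copied from the cell outputs above
x = [0.4944243981290406, 0.366416169296387, 0.28300735323982606,
     0.2613408585166095, 0.22797402563434477]
ŷ = 0.3879253475178959          # model(W, b, x)[1]
t = 0.5                         # target used in the loss

# Chain rule: d(loss)/dz = 2(ŷ - t) * σ'(z), and σ'(z) = ŷ(1 - ŷ), where z = W*x + b
δ = 2 * (ŷ - t) * ŷ * (1 - ŷ)

∇b = [δ]        # gradient with respect to the bias
∇W = δ * x'     # gradient with respect to the 1×5 weight matrix
```

`∇b` and `∇W` agree with the tracked gradients for `b` and `W` to the printed precision. The gradient with respect to `x` would be `W' * δ`, which needs the random `W` that the notebook does not print.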

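Passing every parameter to `gradient` by hand does not scale, since a neural network can have thousands or millions of parameters. Instead, we wrap the parameters with `param` so that they are tracked, run backpropagation once with `back!`, and read each gradient with `grad`.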

In [85]:
using Flux.Tracker: param, back!, grad

W = param(randn(1, 5))
b = param(zeros(1))
x = rand(5)

y = loss(W, b, x)

back!(y) # Automatic differentiation (backpropagation)

grad(W), grad(b)

([0.0461669 0.0725071 … 0.0186868 0.014291], [0.0751374])

We can now use these gradients to update our parameters.

In [86]:
using Flux.Tracker: update!

η = 0.1
for p in (W, b)
  update!(p, -η * grad(p)) # gradient descent
end
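
Repeating this update many times is what actually trains the neuron. The following is a minimal sketch of that loop in plain Julia, with a hand-derived gradient in place of `back!` and `grad` so that it runs without Flux. The starting values and the names `sigmoid`, `neuron`, `sqloss`, and `losses` are hypothetical and chosen only for illustration.

```julia
sigmoid(z) = 1 / (1 + exp(-z))          # logistic function, the same curve as Flux's σ

# Hypothetical fixed values, chosen only to make the run reproducible
W = [0.5 0.5 0.5 0.5 0.5]               # 1×5 weights
b = [0.0]                               # bias
x = [0.1, 0.2, 0.3, 0.4, 0.5]           # input
t = 0.5                                 # target
η = 0.1                                 # learning rate

neuron(W, b, x) = sigmoid.(W * x .+ b)
sqloss(W, b, x) = sum((neuron(W, b, x) .- t) .^ 2)

losses = Float64[]
for step in 1:100
    push!(losses, sqloss(W, b, x))      # record the loss before this step
    ŷ = neuron(W, b, x)
    δ = 2 .* (ŷ .- t) .* ŷ .* (1 .- ŷ)  # ∂loss/∂(W*x + b), derived by hand
    W .-= η .* (δ * x')                 # in-place step, analogous to update!(W, -η * grad(W))
    b .-= η .* δ
end

losses[1], losses[end]                  # the loss shrinks as the steps accumulate
```

The in-place `.-=` updates play the role of `update!` here: they modify the parameter arrays directly instead of rebinding the variables.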

The loop above is plain gradient descent. Flux also ships ready-made optimizers, from [stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) to adaptive methods such as ADAM, which is used below.

# A neural network

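The same ideas scale to a full network. The code below defines a two-layer model, picks an optimizer, and runs `Flux.train!` once over a single batch of random data: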

In [87]:
m = Chain(
    Dense(10, 5),
    Dense(5, 2),
    softmax # normalize each output column so that it sums to 1
)

opt = ADAM(0.01)

data, labels = rand(10, 100), fill(0.5, 2, 100)

loss(x, y) = Flux.mse(m(x), y) # mse already returns a scalar

Flux.train!(loss, params(m), [(data,labels)], opt)
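
To see what the chain computes, here is a minimal plain-Julia sketch of its forward pass. It assumes that `Dense(in, out)` applies the affine map `W * x .+ b` with no activation by default and that `softmax` normalizes each column. The helpers `dense`, `softmax_cols`, and `forward` are hypothetical stand-ins for illustration, not Flux's implementation.

```julia
# Hypothetical stand-in for a Dense layer with the default identity activation
dense(W, b) = x -> W * x .+ b

# Hypothetical stand-in for softmax, applied to each column separately
function softmax_cols(z)
    e = exp.(z .- maximum(z, dims=1))   # subtract the column maximum for numerical stability
    return e ./ sum(e, dims=1)
end

W1, b1 = randn(5, 10), zeros(5)         # weights are (outputs × inputs), as in Dense(10, 5)
W2, b2 = randn(2, 5), zeros(2)          # as in Dense(5, 2)
forward(x) = softmax_cols(dense(W2, b2)(dense(W1, b1)(x)))

data = rand(10, 100)                    # 100 samples with 10 features each
out = forward(data)

size(out)                               # one column of 2 outputs per sample
sum(out, dims=1)                        # every column sums to 1 (up to rounding)
```

Each column of `out` is a probability distribution over the two output neurons, so the constant target of 0.5 for both outputs used above is a valid distribution as well.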