# Machine Learning in Julia: Flux.jl

<img src="https://fluxml.ai/logo.png" width=800>

<img src="flux.png" width=900>

Web page: https://fluxml.ai/

Examples: [Model zoo](https://github.com/FluxML/model-zoo/)

# A single neuron

In [None]:
using Flux

In [None]:
model(W,b,x) = σ.(W * x + b)

In [None]:
# single neuron: 5 inputs, 1 output
W = randn(1, 5) # weights
b = zeros(1)    # biases
x = rand(5)     # input

In [None]:
model(W, b, x)
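
To make the computation explicit, here is a minimal sketch of the same neuron in plain Julia, without Flux. The names `logistic`, `neuron`, `W_demo`, `b_demo` and `x_demo` are illustrative and not part of Flux; the weights are fixed so the result is reproducible.

```julia
# Logistic sigmoid written out by hand; Flux exports an equivalent function as σ.
logistic(z) = 1 / (1 + exp(-z))

# Same shape conventions as the cell above: W is 1×5, b has length 1, x has length 5.
neuron(W, b, x) = logistic.(W * x .+ b)

W_demo = [0.1 -0.2 0.3 0.0 0.5]   # 1×5 weight matrix (fixed values, chosen arbitrarily)
b_demo = zeros(1)                 # one bias for the single output
x_demo = ones(5)                  # input vector

y_demo = neuron(W_demo, b_demo, x_demo)   # W * x = [0.7], so y = [logistic(0.7)]
```

The output is a one-element vector whose entry lies strictly between 0 and 1, which is what the sigmoid guarantees.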

In [None]:
loss(W, b, x) = Flux.mse(model(W,b,x), 0.5)

In [None]:
loss(W,b,x)
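
The loss measures how far the neuron's output is from the target value 0.5. As a rough sketch of what a mean squared error computes (the helper `mean_sq_err` is hypothetical, and Flux's own `mse` may differ in details such as how it averages):

```julia
# Mean squared error between a prediction vector and a target
# (the target may be a scalar or a vector of the same length).
mean_sq_err(yhat, y) = sum((yhat .- y) .^ 2) / length(yhat)

err = mean_sq_err([0.7], 0.5)   # (0.7 - 0.5)^2 ≈ 0.04
```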

In [None]:
import Flux.Tracker: gradient # automatic differentiation (AD)

gradient(loss, W, b, x) # one gradient per argument: (∂loss/∂W, ∂loss/∂b, ∂loss/∂x)
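
For a single neuron the gradient can still be derived by hand, which gives a way to sanity-check what automatic differentiation returns. The sketch below is plain Julia and does not call Flux; `logistic`, `neuron`, `sqerr`, `grad_W` and `fd_grad_W` are illustrative names. It compares the chain-rule gradient with respect to `W` against central finite differences.

```julia
logistic(z) = 1 / (1 + exp(-z))
neuron(W, b, x) = logistic.(W * x .+ b)

# Squared error against the fixed target 0.5
# (with a single output, averaging over outputs changes nothing).
sqerr(W, b, x) = sum((neuron(W, b, x) .- 0.5) .^ 2)

# Analytic gradient with respect to W via the chain rule:
# dL/dW = 2(y - 0.5) * y(1 - y) * xᵀ
function grad_W(W, b, x)
    y = neuron(W, b, x)[1]
    return 2 * (y - 0.5) * y * (1 - y) .* x'   # 1×5, same shape as W
end

# Central finite differences, perturbing one weight at a time.
function fd_grad_W(W, b, x; h = 1e-6)
    g = zero(W)
    for i in eachindex(W)
        Wp = copy(W); Wp[i] += h
        Wm = copy(W); Wm[i] -= h
        g[i] = (sqerr(Wp, b, x) - sqerr(Wm, b, x)) / (2h)
    end
    return g
end

Wc = [0.1 -0.2 0.3 0.0 0.5]
bc = zeros(1)
xc = [1.0, 2.0, 3.0, 4.0, 5.0]

# Largest disagreement between the two gradients; should be close to zero.
max_diff = maximum(abs.(grad_W(Wc, bc, xc) .- fd_grad_W(Wc, bc, xc)))
```

Finite differences need two loss evaluations per parameter, which is one reason automatic differentiation is preferred once the parameter count grows.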

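A neural network can have thousands of parameters or more, so passing every one of them to `gradient` as a separate argument does not scale. Instead, we wrap the arrays in `param`, which tells Flux to track them, and then backpropagate from the loss with `back!`.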

In [None]:
using Flux.Tracker: param, back!, grad

W = param(randn(1, 5)) # tracked weights
b = param(zeros(1))    # tracked bias
x = rand(5)            # plain input, not tracked

y = loss(W, b, x)

back!(y) # automatic differentiation (backpropagation)

grad(W), grad(b) # gradients stored by back!

We can now use these gradients to update our parameters.

In [None]:
using Flux.Tracker: update!

η = 0.1 # learning rate
for p in (W, b)
  update!(p, -η * grad(p)) # gradient descent step: p ← p - η * grad(p)
end
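
Repeating this update many times is what training amounts to. The following sketch runs the same rule in a loop in plain Julia, with hand-derived gradients standing in for `grad(W)` and `grad(b)`. All names here (`predict`, `sqerr`, `gradients`, `descend`) are illustrative, and the input and initial weights are arbitrary fixed values.

```julia
logistic(z) = 1 / (1 + exp(-z))

# Forward pass and squared-error loss for one neuron with target 0.5.
predict(W, b, x) = logistic.(W * x .+ b)
sqerr(W, b, x) = sum((predict(W, b, x) .- 0.5) .^ 2)

# Hand-derived gradients (chain rule) for this specific loss.
function gradients(W, b, x)
    y = predict(W, b, x)[1]
    δ = 2 * (y - 0.5) * y * (1 - y)   # dL/dz, where z = W*x + b
    return δ .* x', [δ]               # dL/dW (1×5) and dL/db (length 1)
end

function descend(W, b, x; η = 0.1, steps = 100)
    W, b = copy(W), copy(b)           # leave the caller's arrays untouched
    for _ in 1:steps
        gW, gb = gradients(W, b, x)
        W .-= η .* gW                 # same rule as update!(p, -η * grad(p))
        b .-= η .* gb
    end
    return W, b
end

W0 = [0.1 -0.2 0.3 0.0 0.5]
b0 = zeros(1)
x0 = [1.0, 2.0, 3.0, 4.0, 5.0]

W1, b1 = descend(W0, b0, x0)
loss_before, loss_after = sqerr(W0, b0, x0), sqerr(W1, b1, x0)
```

With these values the loss after the loop is much smaller than before it, because the neuron's output has moved toward the target 0.5.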

Flux also provides ready-made optimizers, such as [stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) variants and `ADAM`, so the update loop above rarely has to be written by hand.
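
To give an idea of what such an optimizer does beyond plain gradient descent, here is a sketch of the textbook Adam update rule in plain Julia. It is not Flux's implementation: `AdamState` and `adam_step!` are hypothetical names, and the default hyperparameters are the commonly quoted ones. It is applied to the toy objective `f(θ) = sum(θ.^2)`, whose gradient is `2θ`.

```julia
# Per-parameter state: running averages of the gradient and its square, plus a step counter.
mutable struct AdamState
    m::Vector{Float64}
    v::Vector{Float64}
    t::Int
end
AdamState(n::Integer) = AdamState(zeros(n), zeros(n), 0)

# One Adam step: updates θ in place given the gradient g.
function adam_step!(θ, g, s::AdamState; η = 0.01, β1 = 0.9, β2 = 0.999, ϵ = 1e-8)
    s.t += 1
    s.m .= β1 .* s.m .+ (1 - β1) .* g        # first moment estimate
    s.v .= β2 .* s.v .+ (1 - β2) .* g .^ 2   # second moment estimate
    mhat = s.m ./ (1 - β1^(s.t))             # bias correction for the zero initialisation
    vhat = s.v ./ (1 - β2^(s.t))
    θ .-= η .* mhat ./ (sqrt.(vhat) .+ ϵ)
    return θ
end

θ = [1.0, -2.0]        # arbitrary starting point
state = AdamState(2)
for _ in 1:1000
    adam_step!(θ, 2 .* θ, state)   # gradient of sum(θ.^2) is 2θ
end
final_value = sum(θ .^ 2)
```

Compared with the fixed-step rule, each coordinate's step is rescaled by a running estimate of its gradient magnitude, which is the main practical difference.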

# A small neural network

In [None]:
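m = Chain(
    Dense(10, 5), # 10 inputs -> 5 hidden units (no activation given, so the layer is linear)
    Dense(5, 2),  # 5 hidden units -> 2 outputs
    softmax # normalize output neurons
)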

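opt = ADAM(0.01) # ADAM optimizer with learning rate 0.01

data, labels = rand(10, 100), fill(0.5, 2, 100) # 100 random samples with constant targets

loss(x, y) = Flux.mse(m(x), y) # mse already returns a scalar

Flux.train!(loss, params(m), [(data,labels)], opt) # one pass over a single batch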

In [None]:
m(rand(10)) # evaluate the trained model on a new random input
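
Because the last layer is `softmax`, the two outputs are positive and sum to 1. A minimal plain-Julia sketch of that normalization for a single vector follows; `softmax_vec` is a hypothetical helper, not the function Flux exports.

```julia
# Numerically stable softmax for a vector: subtracting the maximum avoids
# overflow in exp and does not change the result.
function softmax_vec(z)
    e = exp.(z .- maximum(z))
    return e ./ sum(e)
end

p = softmax_vec([1.0, 2.0, 3.0])
total = sum(p)   # ≈ 1
```

Larger inputs receive larger shares of the total, and shifting every input by the same constant leaves the result unchanged.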