# Machine Learning in Julia

## A neural network in 10 lines of Julia code

Let's define a fully-connected two-layer network.

In [24]:
dense(W, b, σ = identity) = x -> σ.(W * x .+ b)

chain(f...) = foldl(∘, reverse(f))

m = chain(
    dense(randn(5,10), randn(5), tanh),
    dense(randn(2,5), randn(2)))

#56 (generic function with 1 method)
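
To see why `chain` reverses its arguments before folding with `∘`, here is a small sketch with two toy functions (`double` and `inc` are hypothetical names used only for illustration). Since `(f ∘ g)(x) = f(g(x))`, reversing the list makes the first listed layer run first.

```julia
# chain as defined above: compose so that the first argument is applied first
chain(f...) = foldl(∘, reverse(f))

double(x) = 2x   # hypothetical toy "layer"
inc(x) = x + 1   # hypothetical toy "layer"

pipeline = chain(double, inc)   # equivalent to inc ∘ double
result = pipeline(3)            # inc(double(3)) = 7
```

Swapping the arguments, `chain(inc, double)`, would instead compute `double(inc(3))`.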

In [2]:
x = rand(10); # some input

In [3]:
m(x)

2-element Array{Float64,1}:
 0.8754463570982725
 6.689820598949918 

Let's train it!

In [4]:
using Zygote # source-to-source reverse-mode AD

dm, = gradient(m) do model
    sum(model(x))
end

((f = (W = [-0.9943756641734363 -0.7264554321852847 … 0.9902804359311098 -0.9764219816644919; -0.9943756641734363 -0.7264554321852847 … 0.9902804359311098 -0.9764219816644919], b = [1.0, 1.0], σ = nothing), g = (W = [-0.013256470222723266 -0.004703684439401277 … -0.012285795091278674 -0.009904269650146834; 0.08985334383138735 0.03188192392899642 … 0.08327403539792784 0.06713187834480724; … ; 0.013515618537128547 0.004795635907136311 … 0.012525967854883273 0.010097886408011332; 0.02146570098870199 0.007616498360802444 … 0.019893923450738475 0.016037609352231724], b = [-0.029014298518512876, 0.19666107923221282, -0.1568155031808028, 0.029581493739291607, 0.0469817565257875], σ = nothing)),)

In [33]:
m.f.W

2×5 Array{Float64,2}:
 -1.97977  2.02208  1.21005  -1.10881    -1.61574 
 -0.19324  1.42054  0.12562   0.0174818   0.377436

In [6]:
dm.f.W

2×5 Array{Float64,2}:
 -0.994376  -0.726455  -0.973175  0.99028  -0.976422
 -0.994376  -0.726455  -0.973175  0.99028  -0.976422
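
Both rows of `dm.f.W` are identical because the loss is `sum(W * h .+ b)`, where `h` is the output of the hidden layer, so the derivative with respect to `W[i, j]` is `h[j]` for every row `i`. A minimal sketch that checks this with finite differences on made-up values (`layer_loss`, `numgrad`, `h`, `W`, and `b` are illustrative and not taken from the notebook):

```julia
# Loss of the output layer alone: the sum of all outputs
layer_loss(W, b, h) = sum(W * h .+ b)

h = [0.5, -0.25, 0.75]          # made-up hidden activations
W = [1.0 2.0 3.0; 4.0 5.0 6.0]  # made-up 2×3 weight matrix
b = [0.1, 0.2]

# Central finite differences for each entry of W
function numgrad(W, b, h; ϵ = 1e-6)
    G = zeros(size(W))
    for idx in eachindex(W)
        Wp = copy(W); Wp[idx] += ϵ
        Wm = copy(W); Wm[idx] -= ϵ
        G[idx] = (layer_loss(Wp, b, h) - layer_loss(Wm, b, h)) / (2ϵ)
    end
    return G
end

G = numgrad(W, b, h)   # every row of G approximates h'
```

The same reasoning explains why the gradient for the output bias shown earlier is `[1.0, 1.0]`: each bias entry contributes to the sum with coefficient one.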

In [7]:
η = 0.01

m.f.W .-= η * dm.f.W # Gradient descent step, in place

m.f.W .= m.f.W - η * dm.f.W # the same update written out; running both lines takes a second step

2×5 Array{Float64,2}:
 -0.133093  -0.97321  -1.24067  -0.517461   1.18785 
 -2.41376    1.41869  -1.68311   2.00703   -0.140608

In [8]:
m(x)

2-element Array{Float64,1}:
 0.787493441482203
 6.601867683333849
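
The single update above can be repeated in a loop. The following is a sketch under stated assumptions rather than the notebook's own code: it passes the parameters explicitly (`W1`, `b1`, `W2`, `b2`, `predict`, and `descend!` are names introduced here) instead of reading them from the closure, because field names such as `m.f` and `m.g` depend on how the Julia version represents composed functions. It keeps the same toy objective, `sum` of the outputs, which is only useful for watching the value go down.

```julia
using Zygote

# Same architecture as above, with the parameters passed explicitly
predict(W1, b1, W2, b2, x) = W2 * tanh.(W1 * x .+ b1) .+ b2

# Repeat the gradient-descent step `steps` times, updating the arrays in place
function descend!(W1, b1, W2, b2, x; η = 0.01, steps = 10)
    objective = (A1, c1, A2, c2) -> sum(predict(A1, c1, A2, c2, x))
    for _ in 1:steps
        # gradient returns one gradient per argument of `objective`
        grads = gradient(objective, W1, b1, W2, b2)
        for (p, g) in zip((W1, b1, W2, b2), grads)
            p .-= η .* g
        end
    end
    return objective(W1, b1, W2, b2)
end

W1, b1 = randn(5, 10), randn(5)
W2, b2 = randn(2, 5), randn(2)
x = rand(10)

loss_before = sum(predict(W1, b1, W2, b2, x))
loss_after = descend!(W1, b1, W2, b2, x)
```

With a small step size such as `η = 0.01`, `loss_after` should come out smaller than `loss_before`.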

## Flux - The ML library that doesn't make you tensor

Web page: https://fluxml.ai/, Examples: [Model zoo](https://github.com/FluxML/model-zoo/)

<img src="https://fluxml.ai/logo.png" width=300>

<img src="flux.png" width=800>

In [9]:
using Flux


In [10]:
m = Chain(
    Dense(10, 5),
    Dense(5, 2),
    softmax # normalize output neurons
)

Chain(Dense(10, 5), Dense(5, 2), softmax)
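
As a rough sketch of what the final `softmax` layer computes (a plain-Julia illustration named `softmax_sketch`, not Flux's implementation): it exponentiates each output and divides by the sum, so the results are positive and add up to 1.

```julia
# Subtracting the maximum does not change the result but avoids overflow in exp
function softmax_sketch(z)
    e = exp.(z .- maximum(z))
    return e ./ sum(e)
end

p = softmax_sketch([1.0, 2.0, 3.0])   # positive entries that sum to 1
```

The largest input still gets the largest share, so the ordering of the outputs is preserved.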

In [11]:
data, labels = rand(10, 100), fill(0.5, 2, 100); # fake data

In [12]:
loss(x, y) = Flux.mse(m(x), y) # mean squared error (already a scalar)

loss (generic function with 1 method)
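
Mean squared error averages the squared differences between predictions and targets into a single number. A plain-Julia sketch of that computation on made-up values (`mse_sketch`, `ŷ`, and `y` are illustrative names, not part of Flux):

```julia
using Statistics

# Average of the elementwise squared differences
mse_sketch(ŷ, y) = mean(abs2, ŷ .- y)

ŷ = [0.2 0.9; 0.8 0.1]   # made-up predictions, one column per sample
y = fill(0.5, 2, 2)      # targets of 0.5, like the fake labels above
err = mse_sketch(ŷ, y)   # (0.09 + 0.16 + 0.09 + 0.16) / 4 = 0.125
```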

In [13]:
opt = Descent(0.01) # or ADAM

Descent(0.01)

In [14]:
Flux.train!(loss, params(m), [(data,labels)], opt)

In [15]:
m(rand(10)) # model output after one training step

2-element Array{Float32,1}:
 0.32356524
 0.67643476