# Introduction

**Lilith** is a deep learning library in Julia with a focus on **high performance** and **interoperability with existing DL frameworks**. Its main features include:

 * _tracing autograd engine_ - models are just structs, transformations are just functions
 * _optimizing code generator_ based on hackable computational graph
 * _GPU support_
 * _layer API similar to PyTorch's_ to ease translation of existing Python code to Julia
 * high _backward compatibility_, so that an accumulated collection of models keeps working as the library evolves
 
 

A quick example of a Lilith model definition:

In [5]:
using Lilith


mutable struct Net
    conv1::Conv2d
    conv2::Conv2d
    fc1::Linear
    fc2::Linear
end


Net() = Net(
    Conv2d(1, 20, 5),
    Conv2d(20, 50, 5),
    Linear(4 * 4 * 50, 500),
    Linear(500, 10)
)


function (m::Net)(x::AbstractArray)
    x = maxpool2d(relu.(m.conv1(x)), (2, 2))
    x = maxpool2d(relu.(m.conv2(x)), (2, 2))
    x = reshape(x, 4*4*50, :)
    x = relu.(m.fc1(x))
    x = logsoftmax(m.fc2(x))
    return x
end
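
The `4 * 4 * 50` in `fc1` and in the `reshape` call follows from the layer sizes. The sketch below traces the spatial size of a 28x28 MNIST image through the network in plain Julia. It assumes that `Conv2d` uses stride 1 with no padding and that `maxpool2d` uses non-overlapping windows; these defaults are not stated above, but they are consistent with the `Linear(800=>500)` shown in the model summary. `conv_out` and `pool_out` are hypothetical helpers, not part of Lilith.

```julia
# Output size of a convolution along one axis (assumed: stride 1, no padding).
conv_out(n::Int, k::Int) = n - k + 1

# Output size of max pooling with window p (assumed: stride equal to p).
pool_out(n::Int, p::Int) = n ÷ p

n = 28                   # MNIST images are 28x28
n = conv_out(n, 5)       # after conv1 (5x5 kernel): 24
n = pool_out(n, 2)       # after the first maxpool2d: 12
n = conv_out(n, 5)       # after conv2 (5x5 kernel): 8
n = pool_out(n, 2)       # after the second maxpool2d: 4

n_features = n * n * 50  # conv2 produces 50 channels
@assert n_features == 4 * 4 * 50
```

If the input size or kernel sizes change, the first argument of `fc1` and the `reshape` call have to change accordingly.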

Training (you may see a few warnings from underlying packages, but they shouldn't break anything):

In [6]:
using MLDatasets    # run `] add MLDatasets` to install this package

function get_mnist_data(train::Bool; device=CPU())
    X, Y = train ? MNIST.traindata() : MNIST.testdata()
    X = convert(Array{Float64}, reshape(X, 28, 28, 1, :)) |> device
    # shift class labels from 0..9 to 1..10 so they can be used as 1-based class indices
    Y = Y .+ 1 |> device
    return X, Y
end

# choose device: GPU() if CUDA is available on the system, otherwise CPU()
device = best_available_device()

# instantiate the model
m = Net() |> device
# load training data
X_trn, Y_trn = get_mnist_data(true);
# set loss function and optimizer, then fit the model
loss_fn = NLLLoss()
opt = Adam(lr=1e-2)
@time fit!(m, X_trn, Y_trn, loss_fn; n_epochs=10, opt=opt, batch_size=100, device=device)

┌ Info: Epoch 1: avg_cost=0.0818943272211722, elapsed=12.219861569
└ @ Lilith /home/slipslop/work/Lilith/src/fit.jl:25
┌ Info: Epoch 2: avg_cost=0.08067442104220389, elapsed=11.577970364
└ @ Lilith /home/slipslop/work/Lilith/src/fit.jl:25
┌ Info: Epoch 3: avg_cost=0.078615898798619, elapsed=11.786745248
└ @ Lilith /home/slipslop/work/Lilith/src/fit.jl:25
┌ Info: Epoch 4: avg_cost=0.07356383970805576, elapsed=11.84160564
└ @ Lilith /home/slipslop/work/Lilith/src/fit.jl:25
┌ Info: Epoch 5: avg_cost=0.05924170171575886, elapsed=11.949758801
└ @ Lilith /home/slipslop/work/Lilith/src/fit.jl:25
┌ Info: Epoch 6: avg_cost=0.03640910144895315, elapsed=11.874725977
└ @ Lilith /home/slipslop/work/Lilith/src/fit.jl:25
┌ Info: Epoch 7: avg_cost=0.02330157606463347, elapsed=11.598434645
└ @ Lilith /home/slipslop/work/Lilith/src/fit.jl:25
┌ Info: Epoch 8: avg_cost=0.018194284556167464, elapsed=11.583370803
└ @ Lilith /home/slipslop/work/Lilith/src/fit.jl:25
┌ Info: Epoch 9: avg_cost=0.015693947872413

117.652222 seconds (42.22 M allocations: 8.994 GiB, 2.04% gc time)


Net(Conv2d(5x5, 1=>20), Conv2d(5x5, 20=>50), Linear(800=>500), Linear(500=>10))
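
The model ends in `logsoftmax` and is trained with `NLLLoss()`, so the loss is the negative log probability assigned to the true class. The following is a minimal plain-Julia sketch of that computation, not Lilith's implementation. It assumes one sample per column (matching the `reshape(x, 4*4*50, :)` layout), 1-based class labels, and averaging over the batch; `logsoftmax_cols` and `nll` are hypothetical names.

```julia
using Statistics: mean

# Numerically stable log-softmax over the class dimension (rows);
# each column holds the raw scores of one sample.
function logsoftmax_cols(z::AbstractMatrix)
    shifted = z .- maximum(z; dims=1)
    return shifted .- log.(sum(exp.(shifted); dims=1))
end

# Negative log-likelihood: mean of -log p(true class) over the batch.
# `y` holds 1-based class indices, which is why the MNIST labels are shifted by one.
function nll(logp::AbstractMatrix, y::AbstractVector{<:Integer})
    return -mean(logp[y[i], i] for i in eachindex(y))
end

z = [2.0 0.0; 0.0 2.0; 0.0 0.0]  # raw scores: 3 classes x 2 samples
y = [1, 2]                        # true class of each sample
logp = logsoftmax_cols(z)
loss = nll(logp, y)
```

The loss approaches zero as the log probability of the correct class approaches zero, that is, as its probability approaches one.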

And evaluation:

In [9]:
import Lilith: accuracy

# load test data
X_tst, Y_tst = get_mnist_data(false, device=device)
# predict log probabilities and calculate accuracy
Ŷ = m(X_tst)
@info accuracy(Y_tst, Ŷ)

┌ Info: 0.8959
└ @ Main In[9]:7
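
For intuition, accuracy on log probabilities can be computed by taking the most likely class in each column and comparing it with the true label. The sketch below is plain Julia written for illustration; it is an assumption about what such a metric does, not the source of `Lilith.accuracy`, and `predict_classes` and `accuracy_sketch` are hypothetical names.

```julia
# Predicted class = row index of the largest log probability in each column.
predict_classes(logp::AbstractMatrix) = [argmax(col) for col in eachcol(logp)]

# Fraction of samples whose predicted class equals the 1-based true label.
accuracy_sketch(y, logp) = sum(predict_classes(logp) .== y) / length(y)

# 3 classes x 3 samples; each column is one sample's class probabilities.
logp = log.([0.7 0.2 0.1;
             0.2 0.5 0.3;
             0.1 0.3 0.6])
y = [1, 2, 1]                   # the third sample is misclassified as class 3
acc = accuracy_sketch(y, logp)  # 2 of 3 predictions are correct
```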
