Merlin: deep learning framework for Julia
Merlin is a deep learning framework written in Julia.
It aims to provide a fast, flexible and compact deep learning library for machine learning.
Merlin is tested against Julia 0.6 on Linux, OS X, and Windows (x64).
- Julia 0.6
- g++ (for OSX or Linux)
- Wrap your data with `Var`.
- Apply functions to `Var`. `Var` memorizes a history of function calls for auto-differentiation.
- Compute gradients if necessary.
- Update the parameters with an optimizer.
Merlin supports both dynamic and static evaluation of neural networks. Here is an example of a three-layer network using dynamic evaluation:
```julia
using Merlin

T = Float32
x = zerograd(rand(T,10,5)) # instantiate Var with zero gradients
y = Linear(T,10,7)(x)
y = relu(y)
y = Linear(T,7,3)(y)

params = gradient!(y)
println(x.grad)

opt = SGD(0.01)
foreach(opt, params)
```
If you don't need gradients of `x`, use `x = Var(rand(T,10,5))`, where `x.grad` is set to `nothing`.
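For instance, the input of the three-layer network above can be a plain `Var` while the `Linear` layers still carry trainable parameters. The sketch below assumes that `gradient!` collects only the variables created with gradients (i.e. the layer parameters, not `x`):

```julia
using Merlin

T = Float32
x = Var(rand(T,10,5))        # plain input: x.grad is nothing, no gradient is tracked for x
y = relu(Linear(T,10,7)(x))  # the Linear layer's own parameters are created with gradients

params = gradient!(y)        # assumed to contain only the Linear layer's parameters
foreach(SGD(0.01), params)   # only those parameters are updated; x stays untouched
```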
For static evaluation, the process is as follows.
- Construct a `Graph`.
- Feed your data to the graph.
When you apply a function to a `Node`, it is evaluated lazily.
```julia
using Merlin

T = Float32
n = Node(name="x")
n = Linear(T,10,7)(n)
n = relu(n)
n = Linear(T,7,3)(n)
@assert typeof(n) == Node
g = Graph(n)

x = zerograd(rand(T,10,10))
y = g("x"=>x)

params = gradient!(y)
println(x.grad)

opt = SGD(0.01)
foreach(opt, params)
```
When the network structure can be represented as a static graph, this style is recommended.
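A practical benefit of a static graph is that it can be built once and reused for every mini-batch. Below is a minimal sketch under the assumption that a `Graph` can be called repeatedly with inputs of different batch sizes; the loop and batch sizes are illustrative only:

```julia
using Merlin

T = Float32
n = Node(name="x")
n = relu(Linear(T,10,7)(n))
n = Linear(T,7,3)(n)
g = Graph(n)                            # built once

opt = SGD(0.01)
for batchsize in (4, 8, 16)
    x = zerograd(rand(T,10,batchsize))  # only the input changes between calls
    y = g("x"=>x)                       # assumption: the graph accepts any batch size
    params = gradient!(y)
    foreach(opt, params)
end
```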
- See the MNIST example for details.
This is an example of a batched LSTM: three sequences of lengths 3, 2, and 5 are concatenated along the second dimension, and the sequence lengths are passed to the LSTM call.
```julia
using Merlin

T = Float32
a = rand(T,20,3)
b = rand(T,20,2)
c = rand(T,20,5)
x = Var(cat(2,a,b,c))

lstm = LSTM(T, 20, 20) # input size: 20, output size: 20
y = lstm(x, [3,2,5])
```
More examples can be found in the examples directory of the repository.