Julia port of the Python autograd package.

AutoGrad

AutoGrad.jl is an automatic differentiation package for Julia. It is based on the popular Python autograd package and forms the foundation of the Knet Julia deep learning framework. AutoGrad can differentiate regular Julia code that includes loops, conditionals, helper functions, closures etc. by keeping track of the primitive operations and using this execution trace to compute gradients. It uses reverse mode differentiation (a.k.a. backpropagation) so it can efficiently handle functions with array inputs and scalar outputs. It can compute gradients of gradients to handle higher order derivatives. Please see the comments in core.jl for a description of how the code works in detail.
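As a minimal sketch of differentiating ordinary control flow, the snippet below applies grad to a function built from a plain for loop, then applies grad again to get a second derivative. The helper name powsum and the evaluation point are assumptions made for illustration; only grad comes from AutoGrad.

```julia
using AutoGrad

# Hypothetical helper (not part of AutoGrad): x + x^2 + ... + x^n computed with a loop.
function powsum(x, n)
    total = 0.0
    term = 1.0
    for _ in 1:n
        term = term * x        # term is x^i after the i-th iteration
        total = total + term
    end
    return total
end

dpowsum  = grad(powsum)    # derivative with respect to the first argument, x
d2powsum = grad(dpowsum)   # gradient of the gradient

# Analytically, for n = 3: d/dx = 1 + 2x + 3x^2 and d^2/dx^2 = 2 + 6x,
# which evaluate to 17 and 14 at x = 2.
x = 2.0
first_derivative  = dpowsum(x, 3)
second_derivative = d2powsum(x, 3)
```

The extra argument n is passed through unchanged, since the function returned by grad takes the same arguments as the original and differentiates with respect to the first one.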

Installation

You can install AutoGrad in Julia using:

julia> using Pkg; Pkg.add("AutoGrad")

In order to use it in your code start with:

using AutoGrad

Example

Here is a linear regression example simplified from housing.jl:

using AutoGrad

function loss(w)
    global xtrn,ytrn
    ypred = w[1]*xtrn .+ w[2]
    sum(abs2, ypred - ytrn) / size(ypred,2)
end

function train(w; lr=.1, epochs=20)
    lossgrad = grad(loss)
    for epoch=1:epochs
        g = lossgrad(w)
        for i in 1:length(w)
            w[i] -= lr * g[i]
        end
    end
    return w
end

The loss function takes parameters as input and returns the loss to be minimized. The parameter w for this example is a pair: w[1] is a weight matrix, and w[2] is a bias vector. The training data xtrn,ytrn are in global variables. ypred is the predicted output, and the last line computes the quadratic loss. The loss function is implemented in regular Julia.

The train function takes initial parameters and returns optimized parameters. grad is the only AutoGrad function used: it creates a function lossgrad that takes the same arguments as loss, but returns the gradient instead. The returned gradient will have the same type and shape as the input argument. The for loop implements gradient descent, where we calculate the gradient and subtract a scaled version of it from the weights.
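To try the loss and train functions above end to end, the following sketch repeats them and runs them on synthetic data. The data sizes, the random seed, the "true" weights and the initial parameters are all assumptions chosen for illustration; they are not the housing data used in housing.jl.

```julia
using AutoGrad, Random

Random.seed!(1)

# Synthetic training data (assumed for this sketch): 3 features, 100 instances.
xtrn = randn(3, 100)
ytrn = [1.0 -2.0 0.5] * xtrn .+ 0.3   # 1x100 targets from a known linear model

function loss(w)
    global xtrn, ytrn
    ypred = w[1] * xtrn .+ w[2]
    sum(abs2, ypred - ytrn) / size(ypred, 2)
end

function train(w; lr=0.1, epochs=20)
    lossgrad = grad(loss)
    for epoch in 1:epochs
        g = lossgrad(w)
        for i in 1:length(w)
            w[i] -= lr * g[i]
        end
    end
    return w
end

# Parameters as a mutable pair: a 1x3 weight matrix and a bias (a scalar here).
w = Any[0.1 * randn(1, 3), 0.0]

loss_before = loss(w)   # evaluate first: train updates the entries of w in place
train(w)
loss_after = loss(w)
```

Since train overwrites the entries of w, the initial loss has to be recorded before calling it. With this step size and data scale, loss_after should come out well below loss_before.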

See the examples directory for more examples, and the extensively documented core.jl for details.

Extending AutoGrad

AutoGrad can only handle a function if the primitives it uses have known gradients. You can add your own primitives with gradients, as described in detail in core.jl, or by using the @primitive and @zerograd macros in macros.jl. Here is an example:

@primitive hypot(x1,x2),dy,y  (dy.*x1./y)  (dy.*x2./y)

The @primitive macro marks the hypot(::Any,::Any) method as a new primitive, and the next two expressions define the gradient functions with respect to the first and second arguments. The gradient expressions can refer to the parameters (x1,x2) as well as to the return variable y and its gradient dy, which are optionally named after the argument list in the method declaration.
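The same pattern can be applied to a function of your own. The sketch below is an illustration under assumptions: logistic is a hypothetical user-defined function, not something AutoGrad provides, and the gradient expression is the standard derivative y*(1-y) written by hand.

```julia
using AutoGrad

# Hypothetical user-defined function (the name is an assumption for this sketch).
logistic(x) = 1 / (1 + exp(-x))

# Declare it as a primitive with one argument and a hand-written gradient.
# dy is the incoming gradient and y is the value returned by logistic(x).
@primitive logistic(x),dy,y  (dy .* y .* (1 .- y))

dlogistic = grad(logistic)

# Analytically the derivative at 0 is 0.5 * (1 - 0.5) = 0.25.
slope_at_zero = dlogistic(0.0)
```

Declaring a primitive this way means AutoGrad records a single logistic operation and uses the supplied gradient expression, rather than tracing through exp and the division inside it.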

Note that Julia supports multiple dispatch, i.e. a function may have multiple methods, each supporting different argument types. For example, hypot(x1::Number,x2::Number) and hypot(x1::Array,x2::Array) are two different hypot methods. In AutoGrad.jl each method can independently be defined as a primitive and can have its own specific gradient. In general, AutoGrad defines gradients without argument types to keep the rules generic.

Code structure

core.jl implements the main functionality and acts as the main documentation source. macros.jl has some support functions to define and test new primitives. getindex.jl, iterate.jl and cat.jl set up support for common data structures including Arrays, Tuples, and Dictionaries. The numerical gradients are defined in files such as base.jl and math.jl.

Current status and future work

The gradient coverage and unit testing are spotty; I am still adding more gradients and tests to cover the Julia base. Documentation needs to be improved. Overwriting functions (e.g. setindex!) are not supported. Efficiency could be improved by reducing runtime compilation, adding memoization, and supporting static computation.

Acknowledgments and references

AutoGrad.jl was written by Deniz Yuret. Parts of the code were initially ported from the Python autograd package. I'd like to thank autograd author Dougal Maclaurin for his support. See (Baydin et al. 2015) for a general review of automatic differentiation, autograd tutorial for some Python examples, and Dougal's PhD thesis for design principles. JuliaDiff and FluxML have alternative differentiation tools for Julia. I would like to thank Carlo Lucibello, Mike Innes, Rene Donner, Ekin Akyurek, Ozan Arkan Can and Emre Yolcu for their contributions.

The suggested citation for AutoGrad is:

@inproceedings{knet2016mlsys,
  author={Yuret, Deniz},
  title={Knet: beginning deep learning with 100 lines of Julia},
  year={2016},
  booktitle={Machine Learning Systems Workshop at NIPS 2016}
}