
Initial implementation of autograd #30

Merged: 20 commits merged into arrayfire:master on Jul 6, 2017

Conversation

@pavanky (Member) commented Jul 2, 2017

What is done so far:

  • A proof-of-concept implementation of automatic differentiation using autograd::Variable and autograd::backward.
  • This currently implements only a few basic operations (+ and * for now).
  • Ability to perform first-order derivatives. Higher-order derivatives will come later.

Variable

  • A Variable can be constructed in two ways:
    • From an af::array supplied by the user.
    • By an operator returning a Variable. The operator constructs the Variable from a set of input Variables, the output array, and a grad function.
  • When var.backward(grad_var) is invoked, the DAG (stored as a vector) is built starting from the current variable, and gradients are propagated down the graph to every Variable it contains, using the grad function registered at each variable.
  • Calculating gradients for a variable (and its subgraph) can be disabled by invoking var.setCalcGrad(false). See the sketch after this list.
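
For illustration, a minimal usage sketch of the Variable API described above. Only calls mentioned in this PR are used; the include paths are assumptions, since the header layout is not shown here.

    #include <arrayfire.h>           // af::array, af::randu, af::constant
    // #include <af/autograd.h>      // assumed include path for autograd::Variable

    void variable_sketch()
    {
        using af::autograd::Variable;

        // 1. Constructed directly from user-supplied af::arrays.
        auto a = Variable(af::randu(3), true);
        auto b = Variable(af::randu(3), true);

        // 2. Constructed by an operator: the inputs, the output array, and a
        //    grad function are bundled into the returned Variable
        //    (see the example operator+ below).
        auto c = a + b;

        // Disable gradient calculation for b and its subgraph.
        b.setCalcGrad(false);

        // Seed the backward pass with a gradient of ones and propagate it
        // through the graph rooted at c.
        c.backward(Variable(af::constant(1.0, 3), false));
    }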

Functions

  • Each function takes Variables as parameters and returns a Variable.
  • Each function performs its operation immediately (eagerly) on the data.
  • Each function returns a Variable constructed from the following arguments:
    • af::array: the result computed above
    • std::vector<Variable>: the inputs to the function
    • BackwardFunction_t: the backward-pass callable, usually implemented as a lambda

Example function:

    Variable operator +(const Variable &lhs, const Variable &rhs)
    {
        // Forward pass: compute the result eagerly.
        auto result = lhs.getData() + rhs.getData();
        // Backward pass: d(lhs + rhs)/d(lhs) = d(lhs + rhs)/d(rhs) = 1, so the
        // incoming gradient is passed through unchanged to both inputs.
        auto backward = [](std::vector<Variable> inputs, Variable grad_output) {
            inputs[0].addGrad(grad_output);
            inputs[1].addGrad(grad_output);
        };
        // Bundle the result, the inputs, and the backward function into the
        // returned Variable.
        return Variable(result, {lhs, rhs}, backward);
    }

Example:

A simple example showing how this can be used currently:

void test()
{
    using af::autograd::Variable;

    // Two independent inputs that require gradients.
    auto x = Variable(af::randu(5), true);
    af_print(x.array());
    auto y = Variable(af::randu(5), true);
    af_print(y.array());

    // z = x^2 + x*y + y^2, built from the autograd operators.
    auto z = x * x + x * y + y * y;

    // Seed gradient of ones, then propagate backwards through the graph.
    auto dz = Variable(af::constant(1.0, 5), false);
    z.backward(dz);

    // Analytically, dz/dx = 2x + y and dz/dy = 2y + x, so both prints below
    // should show zero arrays.
    auto dx = x.grad();
    auto dy = y.grad();
    af_print(dx.array() - 2 * x.array() - y.array());
    af_print(dy.array() - 2 * y.array() - x.array());
}

TODO: for this PR

  • Add all math operations: +, -, *, /, sin, cos, exp, tanh
  • Add array operations: tile, sum, transpose
  • Add operations required for Dense layers: matmul
  • Reimplement existing layers using autograd
  • Option to enable or disable building subgraphs
  • Option to enable or disable retaining graphs for gradients
  • Make sure perceptron example is working.
  • Add train and evaluation mode for modules

@pavanky (Member, Author) commented Jul 2, 2017

@botev @jramapuram @itsnarsi This has been a long time coming, but I'd appreciate it if you had any feedback as well.

@pavanky (Member, Author) commented Jul 2, 2017

CC @arrayfire/core-devel

@pavanky (Member, Author) commented Jul 2, 2017

@Reithan too

@jramapuram (Member) commented:

Awesome work @pavanky. Will take a look in more detail when I get to a terminal. Quick question: can you take second derivatives with your implementation?

@pavanky (Member, Author) commented Jul 2, 2017

@jramapuram Not yet, I wanted to get the first order working first :)

@pavanky (Member, Author) commented Jul 2, 2017

@jramapuram went ahead and changed the gradients to be Variables too. This should make it easy to perform higher order derivatives.
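
As a purely hypothetical illustration of why this helps (higher-order support is explicitly not part of this PR): once gradients are themselves Variables carrying their own grad functions, a second derivative could in principle be obtained by running another backward pass over a first-order gradient.

    // Hypothetical sketch only; this PR does not yet implement higher-order derivatives.
    using af::autograd::Variable;

    auto x = Variable(af::randu(5), true);
    auto z = x * x;                                     // dz/dx = 2x, d2z/dx2 = 2
    z.backward(Variable(af::constant(1.0, 5), false));
    auto dx = x.grad();                                 // now a Variable, not a raw af::array
    // dx.backward(...);                                // would yield d2z/dx2, once the
                                                        // gradient graph can be retained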

@itsnarsi commented Jul 3, 2017

@pavanky just tested it on my laptop and it looks pretty neat. Unlike Python, I did not see any initial delay; I guess that is because there is no JIT involved. When will this be merged into this repo?

@pavanky (Member, Author) commented Jul 3, 2017

@itsnarsi This is still very nascent. I want to incorporate some of the stuff mentioned here to make it more efficient:
http://pytorch.org/docs/master/notes/autograd.html#excluding-subgraphs

@FloopCZ left a comment:

Hmm, nice job!


using namespace af;
using namespace afml;
using namespace afml::nn;
using namespace af;
@FloopCZ:

Duplicated line

@pavanky (Member, Author):

Do you have a tool for detecting this or a really good eye :D

@FloopCZ:

A tool would be great. Unfortunately, I'm just an irritating nitpicker. 😇

{
    if (m_grads.size() == 1) return;
    Variable grad = m_grads[0];
    for (int i = 1; i < (int)m_grads.size(); i++) {
@FloopCZ:

I would prefer an unsigned index variable to avoid clang's -Wconversion signedness warnings when indexing into a std::vector.
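
For illustration, a minimal sketch of the suggested change; the loop body is assumed, since the quoted snippet cuts off after the for line.

    // Use an unsigned index so no signed/unsigned conversion is needed
    // when indexing into the std::vector.
    if (m_grads.size() == 1) return;
    Variable grad = m_grads[0];
    for (std::size_t i = 1; i < m_grads.size(); ++i) {
        grad = grad + m_grads[i];   // assumed accumulation of the remaining gradients
    }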

@pavanky (Member, Author):

Will do, thanks.

@pavanky (Member, Author) commented Jul 5, 2017

Decreased the scope of the PR to get a minimum viable version going. The additional functions and operators can be added once this PR is merged.

- autograd::Variable::Shared now a thin layer without methods
- Variable::BackwardFunc_t renamed to Variable::GradFunc_t
- Variable::getData renamed to Variable::array
- Variable::getGrad renamed to Variable::grad
- Variable::backward renamed to Variable::calcGradInputs
@pavanky (Member, Author) commented Jul 5, 2017

@jramapuram I think enabling support for higher order derivatives by default will increase memory usage. I am going to add a flag to enable it during the backward pass; by default only the values will be stored.
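
As a hypothetical illustration of that flag (the public signature is not settled in this PR; a retain_grad_graph boolean does appear later in the diff with a default of false):

    using af::autograd::Variable;

    auto x  = Variable(af::randu(5), true);
    auto z  = x * x + x;                              // any small graph
    auto dz = Variable(af::constant(1.0, 5), false);

    z.backward(dz);                                   // default: store gradient values only
    // z.backward(dz, /*retain_grad_graph=*/true);    // hypothetical opt-in flag that keeps
                                                      // the graph alive for higher-order use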

@umar456 (Member) left a comment:

Minor preliminary comments. Everything looks great. We can refactor it later as long as we have a clean user-facing API.


find_package(ArrayFire REQUIRED)

add_library(afml SHARED "")
@umar456 (Member):
If you don't add SHARED, you can control the type of library you build with the BUILD_SHARED_LIBS variable.

Variable operator +(const Variable &lhs, const Variable &rhs)
{
    auto result = lhs.array() + rhs.array();
    auto grad_func = [](std::vector<Variable> &inputs, const Variable &grad_output) {
@umar456 (Member):

Don't we usually have outputs then inputs?

It looks like you know the number of inputs for each function. I would use something like std::array<Variable, N> for that.

@pavanky (Member, Author):

Both of these are inputs. grad_output is an input coming from a different place.

@pavanky (Member, Author):

And using std::array is not an option. All functions need to share the same signature so they can be stored as GradFunc_t inside Variable.
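
For context, a minimal sketch of the constraint being described; the GradFunc_t definition here is inferred from the lambda signature in the quoted diff, not copied from it.

    #include <functional>
    #include <vector>

    // Variable is the autograd::Variable type from this PR.
    // Inferred alias; the actual definition inside Variable may differ.
    using GradFunc_t = std::function<void(std::vector<Variable> &inputs,
                                          const Variable &grad_output)>;

    // Every operator, whatever its arity (unary sin, binary +, ...), produces
    // a callable with this one signature, so a single GradFunc_t member can
    // store any of them. A std::array<Variable, N> parameter would bake N into
    // the type and break that uniformity.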

- Implemented baseclass nn::Module
- Added basic modules: nn::Linear, nn::Sigmoid, nn::Tanh
- Added container modules: nn::Container, nn::Sequential
- Deleted unnecessary examples, cleaned up perceptron.cpp
@umar456 (Member) left a comment:

A couple of minor issues. This is looking great!


// Update parameters
// TODO: Should use optimizer
for (auto param : perceptron.parameters()) {
@umar456 (Member):

auto& ?

@@ -0,0 +1,88 @@
/*******************************************************
* Copyright (c) 2015, ArrayFire
@umar456 (Member):

2017

GradFunc_t m_grad_func;
};

public:
@umar456 (Member):

Needs to be aligned with other access qualifiers.

@@ -0,0 +1,61 @@
/*******************************************************
* Copyright (c) 2015, ArrayFire
@umar456 (Member):

2017


void Module::eval()
{
    for (auto parameter : m_parameters) {
@umar456 (Member):

auto&?

private:
    void evalGrad(bool retain_grad_graph = false);

    std::vector<Variable> getInputs() const;
@umar456 (Member):

Does this need to return by value?

@umar456 merged commit 8129b47 into arrayfire:master on Jul 6, 2017.
@pavanky deleted the autograd branch on July 6, 2017 at 15:57.
@pavanky changed the title from "[WIP] Initial attempt at autograd" to "Initial implementation of autograd" on Jul 10, 2017.
@pavanky mentioned this pull request on Jul 10, 2017.
@pavanky modified the milestone: 0.1 on Jul 11, 2017.