# ATNN tutorial pt. 1: design of autograd functions

## settings

- ATNN only requires the [ATen library](https://github.com/zdevito/ATen) and a recent C++17 compiler (GCC 7, Clang 4).
- If you installed PyTorch via conda, you can find it at `${CONDA_PREFIX}/lib/python3.6/site-packages/torch/lib/libATen.so.1`, because ATen is the backend of PyTorch.
- If you do not have it or run into trouble, you can build it with `cd <atnn_repo>/test; make build/stage/lib/libATen.so`

## concepts

- Variable: autograd `at::Tensor` object, defined in `<atnn/variable.hpp>`
- Function: ephemeral forward/backward implementation object used inside a Chain. As in PyTorch, Variables are never applied to Functions directly. Defined in `<atnn/function.hpp>`
- Chain: autograd computation graph combining Functions and Variables, defined in `<atnn/chain.hpp>`

In [1]:
.L libATen.so



In [2]:
#include <atnn/function.hpp>
#include <atnn/testing.hpp>
#include <atnn/grad_check.hpp>
#include <atnn/chain.hpp>
#include <iostream>
#include <vector>



In [4]:
// maybe cling's bug
at::CPU(at::kFloat).randn({1})

IncrementalExecutor::executeFunction: symbol '__emutls_v._ZSt11__once_call' unresolved while linking [cling interface function]!
IncrementalExecutor::executeFunction: symbol '__emutls_v._ZSt15__once_callable' unresolved while linking [cling interface function]!




# Functional interface for autograd function

- use `atnn::chain::lambda(forward_func_obj, backward_func_obj)(args ...)` to create autograd functions. The type of this expression is `atnn::Variable` or `std::vector<atnn::Variable>`, depending on the return type of `forward_func_obj`, where
  - `forward_func_obj` takes `std::vector<at::Tensor>` as inputs and returns `at::Tensor` or `std::vector<at::Tensor>`
  - `backward_func_obj` takes the inputs and the output gradients, each as `std::vector<atnn::Variable>`, and returns `std::vector<atnn::Variable>` with one gradient per input
- you can find more examples in `atnn/chain.hpp`
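
To make the forward/backward pairing concrete without depending on ATen, here is a minimal scalar sketch. `ScalarOp`, `make_op` and `make_pow` are hypothetical names invented for illustration; they are not part of ATNN, and the real `atnn::chain::lambda` works on tensors and Variables rather than on `double`.

```cpp
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical scalar stand-in for an autograd function (not ATNN's API).
struct ScalarOp {
    // forward(inputs) -> output
    std::function<double(const std::vector<double>&)> forward;
    // backward(inputs, upstream gradient) -> one gradient per input
    std::function<std::vector<double>(const std::vector<double>&, double)> backward;
};

// Pair a forward callable with a backward callable.
template <typename F, typename B>
ScalarOp make_op(F&& f, B&& b) {
    return ScalarOp{std::forward<F>(f), std::forward<B>(b)};
}

// x^y with gradients (y * x^(y-1), x^y * log(x)), each scaled by gy.
inline ScalarOp make_pow() {
    return make_op(
        [](const std::vector<double>& xs) { return std::pow(xs[0], xs[1]); },
        [](const std::vector<double>& xs, double gy) {
            return std::vector<double>{
                gy * xs[1] * std::pow(xs[0], xs[1] - 1.0),
                gy * std::pow(xs[0], xs[1]) * std::log(xs[0])};
        });
}
```

Calling `make_pow().forward(...)` evaluates the function, and `backward(...)` returns one gradient per input, which is the same contract the bullets above describe for the two function objects.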

In [5]:
// this function is double-backwardable
auto fpow(atnn::Variable lhs, atnn::Variable rhs) {
    return atnn::chain::lambda(
        [](auto&& xs) { return xs[0].pow(xs[1]); },
        [](auto&& xs, auto&& gys) { return
            std::vector<atnn::Variable> {
                gys[0] * fpow(xs[0], xs[1] - 1.0) * xs[1], // recursive call
                gys[0] * fpow(xs[0], xs[1]) * xs[0].log() // d(x^y)/dy = x^y * log(x)
            };
        }
    )(lhs, rhs);
}
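
For reference, the two gradients a backward lambda for `pow` is expected to return are the partial derivatives of $x^{y}$, each multiplied by the upstream gradient `gys[0]`:

```latex
\frac{\partial}{\partial x}\, x^{y} = y\, x^{y-1},
\qquad
\frac{\partial}{\partial y}\, x^{y} = x^{y} \ln x \quad (x > 0),
\qquad
\frac{\partial^{2}}{\partial x^{2}}\, x^{y} = y\,(y-1)\, x^{y-2}
```

Because the first derivative is itself expressed through a power of $x$, writing it with a recursive `fpow` call keeps it differentiable, which is what makes the function double-backwardable. With $y = 2$ the first formula reduces to $2x$.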



In [6]:
// functional pow test
// with GCC, you do not need to write `atnn::Init<T, N>` explicitly
atnn::Variable u0 = atnn::Init<float, 2> {{1, 2}, {3, 4}};
auto u1 = fpow(u0, atnn::Init<float> {2});
std::cout << u1 << std::endl;
u1.backward();
std::cout << u0.grad() << std::endl;

Variable(
data=
  1   4
  9  16
[ CPUFloatTensor{2,2} ]
)
Variable(
data=
 2  4
 6  8
[ CPUFloatTensor{2,2} ]
)


(std::basic_ostream<char, std::char_traits<char> >::__ostream_type &) @0x7f45d3f58420


# OOP interface for autograd function

**NOTE** The OOP style is not recommended because of its complexity. Use the functional interface instead.

- use CRTP: `class Foo : atnn::function::Function<Foo>`
- implement `auto Foo::impl_forward(const atnn::TList&)`. Here, the return type is `at::Tensor` or `atnn::TList` (a.k.a. `std::vector<at::Tensor>`)
- implement `atnn::VList Foo::impl_backward(const atnn::VList&)` to return the gradients w.r.t. each input
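
The CRTP pattern described above can be sketched in plain C++ with scalars. `FunctionBase` and `PowN` are hypothetical names for illustration only; ATNN's actual `atnn::function::Function` operates on tensors and is wired into the Chain, which this sketch does not attempt to reproduce.

```cpp
#include <cmath>
#include <vector>

// Hypothetical CRTP base (not ATNN's Function): it only dispatches to Derived.
template <typename Derived>
struct FunctionBase {
    std::vector<double> saved;  // inputs kept for the backward pass

    double forward(const std::vector<double>& xs) {
        return static_cast<Derived*>(this)->impl_forward(xs);
    }
    std::vector<double> backward(double gy) {
        return static_cast<Derived*>(this)->impl_backward(gy);
    }
};

// Fixed-exponent power x^n, written in the OOP style.
struct PowN : FunctionBase<PowN> {
    double n;
    explicit PowN(double n) : n(n) {}

    double impl_forward(const std::vector<double>& xs) {
        saved = xs;  // remember the input for impl_backward
        return std::pow(xs[0], n);
    }
    std::vector<double> impl_backward(double gy) {
        // d(x^n)/dx = n * x^(n-1), scaled by the upstream gradient
        return {gy * n * std::pow(saved[0], n - 1.0)};
    }
};
```

The base class resolves `impl_forward` and `impl_backward` at compile time, so no virtual calls are involved. The cost is that state such as the saved inputs lives on the object, which is part of the extra complexity the note above warns about.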

In [7]:
// this function does not support double-backward
struct Pow : atnn::function::Function<Pow> {
    auto impl_forward(const std::vector<at::Tensor>& x) {
        this->save_for_backward(x);
        return x[0].pow(this->n);
    }

    std::vector<atnn::Variable> impl_backward(const std::vector<atnn::Variable>& gy) {
        auto&& _x = this->saved_tensors[0];
        return {atnn::Variable(gy[0].data() * _x.pow(this->n - 1) * this->n, false)};
    }
    double n = 2;
    Pow(double n) : n(n) {}
};



In [8]:
// oop pow test
// with GCC, you do not need to write `atnn::Init<T, N>` explicitly
atnn::Variable v0 = atnn::Init<float, 2> {{1, 2}, {3, 4}};
auto func = atnn::chain::chain_ptr<Pow>(2); // need to wrap with chain_ptr
auto v1 = func(v0);
std::cout << v1 << std::endl;
v1.backward();
std::cout << v0.grad() << std::endl;

Variable(
data=
  1   4
  9  16
[ CPUFloatTensor{2,2} ]
)
Variable(
data=
 2  4
 6  8
[ CPUFloatTensor{2,2} ]
)


(std::basic_ostream<char, std::char_traits<char> >::__ostream_type &) @0x7f45d3f58420


# gradient check

you can use `atnn::grad_check` to validate your backward implementation against numerical gradients

In [9]:
auto device = at::CPU;
atnn::Variable x = device(at::kFloat).rand({3, 4});
atnn::Variable y = device(at::kFloat).rand({3, 4});
auto gy = device(at::kFloat).rand({3, 4});
atnn::grad_check([](auto xs) { return fpow(xs[0], xs[1]); }, {x, y}, {gy});

(void) @0x7f45c9ff9c10
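
The idea behind a numerical gradient check can be sketched in plain C++ using central differences. This is not how `atnn::grad_check` is implemented (its internals are not shown here); `numeric_grad`, `grads_close`, the step size and the tolerances are all assumptions chosen for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using ScalarFn = std::function<double(const std::vector<double>&)>;

// Central-difference estimate of df/dx_i at xs (eps is an assumed step size).
inline std::vector<double> numeric_grad(const ScalarFn& f, std::vector<double> xs,
                                        double eps = 1e-6) {
    std::vector<double> grad(xs.size());
    for (std::size_t i = 0; i < xs.size(); ++i) {
        const double orig = xs[i];
        xs[i] = orig + eps;
        const double hi = f(xs);
        xs[i] = orig - eps;
        const double lo = f(xs);
        xs[i] = orig;  // restore before perturbing the next input
        grad[i] = (hi - lo) / (2.0 * eps);
    }
    return grad;
}

// True when every analytic gradient is within atol + rtol * |numeric| of the estimate.
inline bool grads_close(const std::vector<double>& analytic,
                        const std::vector<double>& numeric,
                        double rtol = 1e-4, double atol = 1e-6) {
    if (analytic.size() != numeric.size()) return false;
    for (std::size_t i = 0; i < analytic.size(); ++i) {
        if (std::abs(analytic[i] - numeric[i]) > atol + rtol * std::abs(numeric[i]))
            return false;
    }
    return true;
}
```

For `pow`, comparing the analytic gradients `y * x^(y-1)` and `x^y * log(x)` with `numeric_grad` at a point with positive `x` should succeed, while a mistaken formula such as `x^y * log(y)` should fail the check.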
