# AD Tutorial - Clad & Jupyter Notebook

xeus-cpp provides a Jupyter kernel for C++. It is built on the C++ interpreter clang-repl and on xeus, a native implementation of the Jupyter protocol.

Within the xeus-cpp framework, Clad enables automatic differentiation (AD): users can automatically generate C++ code that computes the derivatives of their functions.



In [1]:
#include "clad/Differentiator/Differentiator.h"
#include <iostream>

## Forward Mode AD

For a function _f_ with several inputs and a single (scalar) output, forward-mode AD computes a directional derivative of _f_ with respect to a single specified input variable. Clad does not compute the value directly but generates a function that computes it. The generated derivative function has the same signature as the original function _f_, but its return value is the value of the derivative.

In [2]:
double fn(double x, double y) {
  return x*x*y + y*y;
}

In [3]:
auto fn_dx = clad::differentiate(fn, "x");

In [4]:
std::cout << fn_dx.execute(5, 3) << std::endl;

30
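
To make the idea behind forward mode concrete, here is a minimal hand-written sketch using dual numbers. It is not how Clad works internally (Clad generates source code at compile time); the `Dual` type and the names `fn_dual` and `fn_dx_by_hand` are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical helper type (not part of Clad): a value paired with its
// derivative with respect to one chosen ("seeded") input.
struct Dual {
  double val; // function value
  double der; // derivative w.r.t. the seeded input
};

// Sum rule: (a + b)' = a' + b'
Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }

// Product rule: (a * b)' = a' * b + a * b'
Dual operator*(Dual a, Dual b) {
  return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Same body as fn above, written over Dual instead of double.
Dual fn_dual(Dual x, Dual y) { return x * x * y + y * y; }

// Derivative of fn w.r.t. x: seed x with derivative 1 and y with 0.
double fn_dx_by_hand(double x, double y) {
  return fn_dual({x, 1.0}, {y, 0.0}).der;
}

// Prints the derivative at (5, 3) and checks it against 2 * x * y.
void forward_mode_demo() {
  double d = fn_dx_by_hand(5, 3);
  assert(d == 2 * 5 * 3);
  std::cout << d << std::endl;
}
```

Calling `forward_mode_demo()` in a later cell prints the same value as `fn_dx.execute(5, 3)` above. Like the Clad-generated function, `fn_dx_by_hand` keeps the signature of `fn` and returns the derivative instead of the function value.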


## Reverse Mode AD

Reverse-mode AD computes the whole gradient in a single pass over the computation graph of _f_, using at most a constant factor (around 4) more arithmetic operations than the original function. Although its constant factor and memory overhead are higher than those of forward mode, its cost is independent of the number of inputs.

The generated function has a `void` return type and the same input arguments as _f_. For each parameter listed in the call to `clad::gradient`, it takes an additional pointer argument (`_d_x` and `_d_y` in the generated code below). The generated code accumulates the corresponding gradient component into the pointed-to variable with `+=`, so the caller has to zero-initialize these variables before the call.

In [5]:
double fn2(double x, double y) {
  return x*x + y*y;
}

In [6]:
auto d_fn_2 = clad::gradient(fn2, "x, y");
double d_x_2 = 0, d_y_2 = 0; // zero-initialized: the gradient is accumulated with +=

In [7]:
d_fn_2.dump();

The code is: 
void fn2_grad(double x, double y, double *_d_x, double *_d_y) {
    {
        *_d_x += 1 * x;
        *_d_x += x * 1;
        *_d_y += 1 * y;
        *_d_y += y * 1;
    }
}
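
The notebook dumps the gradient code but never runs it. The sketch below shows how such a function is called. It is a standalone hand copy of the dumped body, renamed `fn2_grad_copy` (a hypothetical name) so that it does not clash with the function Clad generates, and it needs no Clad installation.

```cpp
#include <cassert>
#include <iostream>

// Hand copy of the code printed by d_fn_2.dump() above.
void fn2_grad_copy(double x, double y, double *_d_x, double *_d_y) {
  *_d_x += 1 * x;
  *_d_x += x * 1;
  *_d_y += 1 * y;
  *_d_y += y * 1;
}

// Evaluates the gradient of fn2 at (3, 4) and prints both components.
void reverse_mode_demo() {
  // The accumulators must start at zero because the body uses +=.
  double d_x = 0, d_y = 0;
  fn2_grad_copy(3, 4, &d_x, &d_y);
  assert(d_x == 6 && d_y == 8); // analytic gradient: (2 * x, 2 * y)
  std::cout << d_x << " " << d_y << std::endl; // prints "6 8"
}
```

With Clad itself, the corresponding call would presumably be `d_fn_2.execute(3, 4, &d_x_2, &d_y_2);`, following the same pattern as the `execute` call in the error-estimation cell later in this notebook. Calling the function a second time without resetting the accumulators adds the gradient again instead of overwriting it.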



## Hessian

Clad can produce the Hessian matrix of a function using its forward- and reverse-mode capabilities. The interface is similar to that of reverse mode but differs when arrays are involved. The matrix is written to a caller-provided array as a flattened vector in row-major order.

In [8]:
double kinetic_energy(double mass, double velocity) {
  return mass * velocity * velocity * 0.5;
}

In [9]:
auto hessian = clad::hessian(kinetic_energy, "mass, velocity");
double matrix[4];
hessian.execute(10, 2, matrix);

In [10]:
for(int i = 0; i < 4; i++) {
    std::cout << matrix[i] << " ";
}

0 2 2 10 
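
To read the flattened output, it helps to see how row-major indexing maps to the entries of the Hessian. The sketch below fills the same four values by hand from the analytic second derivatives of `kinetic_energy` and looks them up by row and column. The helper names `kinetic_energy_hessian_by_hand` and `hessian_at` are hypothetical and not part of Clad.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// Hand-derived second derivatives of 0.5 * mass * velocity * velocity,
// stored row-major with parameter order (mass, velocity):
//   out[0] = d2/dmass2          out[1] = d2/(dmass dvelocity)
//   out[2] = d2/(dvelocity dmass) out[3] = d2/dvelocity2
void kinetic_energy_hessian_by_hand(double mass, double velocity, double *out) {
  out[0] = 0.0;
  out[1] = velocity;
  out[2] = velocity;
  out[3] = mass;
}

// Row-major lookup in an n x n matrix stored as a flat array.
double hessian_at(const double *flat, std::size_t n, std::size_t row,
                  std::size_t col) {
  return flat[row * n + col];
}

// Prints the 2 x 2 Hessian at (mass, velocity) = (10, 2), one row per line.
void hessian_layout_demo() {
  double flat[4];
  kinetic_energy_hessian_by_hand(10, 2, flat);
  for (std::size_t row = 0; row < 2; ++row) {
    for (std::size_t col = 0; col < 2; ++col)
      std::cout << hessian_at(flat, 2, row, col) << " ";
    std::cout << std::endl;
  }
}
```

The values agree with the output `0 2 2 10` above: the mixed derivatives equal the velocity and the second derivative with respect to velocity equals the mass.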

## Jacobian

Clad can produce the Jacobian of a function using its reverse mode. It returns the Jacobian matrix as a `clad::matrix` for every pointer/array parameter. The generated function has a `void` return type and the same input arguments as the original function. For every pointer/array parameter `arr`, the function has an additional argument `_d_vector_arr`. Its type is `clad::matrix<T>`, where `T` is the pointee type of `arr`. Each matrix stores the derivatives of the elements of `arr` w.r.t. all inputs. The caller is responsible for allocating the matrices.

In [11]:
void fn_jacobian(double i, double j, double *res) {
  res[0] = i*i;
  res[1] = j*j;
  res[2] = i*j;
}

In [12]:
auto d_fn = clad::jacobian(fn_jacobian);
double res[3] = {0, 0, 0};
clad::matrix<double> d_res(3, 5);
d_fn.execute(3, 5, res, &d_res);

In [13]:
for(int i = 0; i < 3; i++) {
    for(int j = 0; j < 5; j++) {
        std::cout << d_res[i][j] << " ";
    }
    std::cout << std::endl;
}

6 0 0 0 0 
0 10 0 0 0 
5 3 0 0 0 


In [14]:
std::cout << "Jacobian matrix:\n";
for (int i = 0; i < 3; ++i) {
  for (int j = 0; j < 2; ++j) {
    std::cout << d_res[i][j] << " ";
  }
  std::cout << "\n";
}

Jacobian matrix:
6 0 
0 10 
5 3 
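
One way to sanity-check a Jacobian is to compare it with a finite-difference approximation. The sketch below does this for a standalone copy of `fn_jacobian` and needs no Clad installation. The names `fn_jacobian_ref`, `jacobian_by_differences`, and `check_jacobian_against_output`, as well as the step size, are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Standalone copy of fn_jacobian from the cell above (renamed to avoid a clash).
void fn_jacobian_ref(double i, double j, double *res) {
  res[0] = i * i;
  res[1] = j * j;
  res[2] = i * j;
}

// Approximates the 3 x 2 Jacobian with central differences.
// jac is row-major: jac[row * 2 + 0] = d res[row] / d i,
//                   jac[row * 2 + 1] = d res[row] / d j.
void jacobian_by_differences(double i, double j, double *jac) {
  const double h = 1e-4; // step size, an arbitrary choice for this sketch
  double plus[3], minus[3];

  fn_jacobian_ref(i + h, j, plus);
  fn_jacobian_ref(i - h, j, minus);
  for (int row = 0; row < 3; ++row)
    jac[row * 2 + 0] = (plus[row] - minus[row]) / (2 * h);

  fn_jacobian_ref(i, j + h, plus);
  fn_jacobian_ref(i, j - h, minus);
  for (int row = 0; row < 3; ++row)
    jac[row * 2 + 1] = (plus[row] - minus[row]) / (2 * h);
}

// Compares the approximation at (3, 5) with the matrix printed above.
void check_jacobian_against_output() {
  const double expected[6] = {6, 0, 0, 10, 5, 3};
  double jac[6];
  jacobian_by_differences(3, 5, jac);
  for (int k = 0; k < 6; ++k)
    assert(std::abs(jac[k] - expected[k]) < 1e-6);
}
```

Unlike AD, finite differences only approximate the derivative and depend on the step size, which is why the comparison uses a tolerance rather than exact equality.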


## Floating-point error estimation

Clad can annotate a given function with floating-point error estimation code using the reverse mode of AD.

**clad::estimate_error(f)** takes one argument: *f*, a pointer to the function or method to be annotated with floating-point error estimation code.

The signature of the generated code is the same as that of the code generated by *clad::gradient(f)*, except that it has an extra final argument of type double&. The total floating-point error of the function is returned by reference through this argument.

In [15]:
// Generate the floating-point error estimation code for 'fn'.
auto df = clad::estimate_error(fn);
// Print the generated code to standard output.
df.dump();
// Declare the inputs (sample values) and zero-initialize the accumulators,
// since the generated code updates them with +=.
double x = 5, y = 3, d_x = 0, d_y = 0, final_error = 0;
// Finally, call execute on the generated code.
df.execute(x, y, &d_x, &d_y, final_error);
// After this, 'final_error' contains the estimated floating-point error in 'fn'.

The code is: 
void fn_grad(double x, double y, double *_d_x, double *_d_y, double &_final_error) {
    double _ret_value0 = 0.;
    _ret_value0 = x * x * y + y * y;
    {
        *_d_x += 1 * y * x;
        *_d_x += x * 1 * y;
        *_d_y += x * x * 1;
        *_d_y += 1 * y;
        *_d_y += y * 1;
    }
    _final_error += std::abs(*_d_x * x * 1.1920928955078125E-7);
    _final_error += std::abs(*_d_y * y * 1.1920928955078125E-7);
    _final_error += std::abs(1. * _ret_value0 * 1.1920928955078125E-7);
}
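
To see what the generated estimate evaluates to, the sketch below runs a standalone hand copy of the dumped code at the sample point (5, 3). The names `fn_error_copy` and `estimate_error_at` are hypothetical. The constant 1.1920928955078125E-7 in the dump equals 2^-23, the machine epsilon of `float`.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hand copy of the code printed by df.dump() above (renamed to avoid a clash).
void fn_error_copy(double x, double y, double *_d_x, double *_d_y,
                   double &_final_error) {
  double _ret_value0 = 0.;
  _ret_value0 = x * x * y + y * y;
  {
    *_d_x += 1 * y * x;
    *_d_x += x * 1 * y;
    *_d_y += x * x * 1;
    *_d_y += 1 * y;
    *_d_y += y * 1;
  }
  _final_error += std::abs(*_d_x * x * 1.1920928955078125E-7);
  _final_error += std::abs(*_d_y * y * 1.1920928955078125E-7);
  _final_error += std::abs(1. * _ret_value0 * 1.1920928955078125E-7);
}

// Returns the error estimate at (x, y); all accumulators start at zero.
double estimate_error_at(double x, double y) {
  double d_x = 0, d_y = 0, final_error = 0;
  fn_error_copy(x, y, &d_x, &d_y, final_error);
  return final_error;
}

// At (5, 3): d_x = 30, d_y = 31 and the function value is 84, so the
// estimate is (30 * 5 + 31 * 3 + 84) * eps = 327 * eps.
void error_estimate_demo() {
  const double eps = std::numeric_limits<float>::epsilon();
  assert(std::abs(estimate_error_at(5, 3) - 327 * eps) < 1e-15);
}
```

Each term weights a variable by its derivative and its magnitude, so inputs that influence the result more strongly contribute more to the estimated error.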



## Functors

In [16]:
class Equation {
  double m_x, m_y;

  public:
  Equation(double x = 0, double y = 0) : m_x(x), m_y(y) {}
  double operator()(double i, double j) {
    return m_x*i*j + m_y*i*j;
  }
  void setX(double x) {
    m_x = x;
  }
};

In [17]:
Equation E(3,5);
auto d_E = clad::differentiate(E, "i");

In [18]:
d_E.dump();

The code is: 
double operator_call_darg0(double i, double j) {
    double _d_i = 1;
    double _d_j = 0;
    Equation _d_this_obj;
    Equation *_d_this = &_d_this_obj;
    double _d_m_x = 0;
    double _d_m_y = 0;
    double &_t0 = this->m_x;
    double _t1 = _t0 * i;
    double &_t2 = this->m_y;
    double _t3 = _t2 * i;
    return (_d_m_x * i + _t0 * _d_i) * j + _t1 * _d_j + (_d_m_y * i + _t2 * _d_i) * j + _t3 * _d_j;
}
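
The dumped code is easier to follow once the constant seeds are substituted: with `_d_i = 1` and `_d_j = _d_m_x = _d_m_y = 0`, the return expression reduces to `m_x * j + m_y * j`. The sketch below checks this simplification on a standalone stand-in for the functor. `EquationRef` and `d_wrt_i` are hypothetical names, and no Clad installation is needed.

```cpp
#include <cassert>

// Standalone stand-in for the Equation functor (renamed to avoid a clash).
class EquationRef {
  double m_x, m_y;

public:
  EquationRef(double x = 0, double y = 0) : m_x(x), m_y(y) {}

  double operator()(double i, double j) const {
    return m_x * i * j + m_y * i * j;
  }

  // Hand-simplified form of the dumped derivative w.r.t. i.
  double d_wrt_i(double i, double j) const {
    (void)i; // the functor is linear in i, so the derivative ignores i
    return m_x * j + m_y * j;
  }
};

// Compares the simplified derivative with a difference quotient, which is
// exact here because the functor is linear in i.
void functor_derivative_demo() {
  EquationRef e(3, 5);
  assert(e.d_wrt_i(7, 9) == 72.0);        // (3 + 5) * 9
  assert(e(8, 9) - e(7, 9) == 72.0);      // step of 1 in i
}
```

The member variables act as constants during differentiation, which is why their derivative seeds `_d_m_x` and `_d_m_y` are zero in the generated code.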

