# AD Tutorial - Clad & Jupyter Notebook

xeus-cling is a Jupyter kernel for C++ built on the C++ interpreter cling and on xeus, a native implementation of the Jupyter protocol.

Within the xeus-cling framework, Clad enables automatic differentiation (AD): it generates C++ code that computes the derivatives of user-defined functions.



In [3]:
#include "clad/Differentiator/Differentiator.h"
#include <iostream>

## Forward Mode AD

For a function _f_ of several inputs and a single (scalar) output, forward mode AD computes a directional derivative of _f_ with respect to a single specified input variable. Clad does not compute the value directly; it generates a function that computes it. The generated derivative function has the same signature as the original function _f_, but its return value is the value of the derivative.

In [4]:
double fn(double x, double y) {
  return x*x*y + y*y;
}

In [3]:
auto fn_dx = clad::differentiate(fn, "x");

In [4]:
fn_dx.execute(5, 3)

30.000000
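
To see where the value 30 comes from, forward mode can be mimicked by hand with dual numbers, where every value carries its derivative with respect to the chosen input. The following is a minimal sketch under that assumption, not the code Clad generates; the names Dual, fn_dual and fn_dx_by_hand are hypothetical and exist only for this illustration.

```cpp
// Hypothetical dual number: a value together with its derivative
// with respect to the single input selected for differentiation.
struct Dual {
  double val;
  double der;
};

// Sum rule: (a + b)' = a' + b'
Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }

// Product rule: (a * b)' = a' * b + a * b'
Dual operator*(Dual a, Dual b) {
  return {a.val * b.val, a.der * b.val + a.val * b.der};
}

// Same expression as fn above, evaluated on dual numbers.
Dual fn_dual(Dual x, Dual y) { return x * x * y + y * y; }

// Differentiate with respect to x: seed x with derivative 1 and y with 0.
double fn_dx_by_hand(double x, double y) {
  return fn_dual({x, 1.0}, {y, 0.0}).der;
}
```

Evaluating fn_dx_by_hand(5, 3) gives 2 * 5 * 3 = 30, which agrees with the result of fn_dx.execute(5, 3).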

## Reverse Mode AD

Reverse-mode AD computes the gradient in a single pass over the computation graph of _f_, using at most a constant factor (around 4) more arithmetic operations than the original function. Although its constant factor and memory overhead are higher than those of forward mode, its cost is independent of the number of inputs.

The generated function has a void return type and takes the same input arguments as _f_, plus output arguments that receive the gradient. Depending on the Clad version, this is either a single “result” argument of type T* (where T is the return type of _f_) that points to the beginning of the vector where the gradient is stored, or, as in the generated code printed below, one clad::array_ref<double> argument per differentiated input. The derivative values are accumulated with +=, so the storage should be zero-initialized before the call.

In [5]:
double fn(double x, double y) {
  return x*x + y*y;
}

In [6]:
auto d_fn_2 = clad::gradient(fn, "x, y");
// Gradient storage; zero-initialized because the generated code accumulates with +=.
double d_x = 0, d_y = 0;

In [7]:
d_fn_2.dump();

The code is: void fn_grad(double x, double y, clad::array_ref<double> _d_x, clad::array_ref<double> _d_y) {
    double _t2;
    double _t3;
    double _t4;
    double _t5;
    _t3 = x;
    _t2 = x;
    _t5 = y;
    _t4 = y;
    goto _label0;
  _label0:
    {
        double _r0 = 1 * _t2;
        * _d_x += _r0;
        double _r1 = _t3 * 1;
        * _d_x += _r1;
        double _r2 = 1 * _t4;
        * _d_y += _r2;
        double _r3 = _t5 * 1;
        * _d_y += _r3;
    }
}
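
The structure of the generated code can be reproduced by hand to make the accumulation convention visible. The following is a sketch only: fn_grad_by_hand is a hypothetical name, and plain pointers stand in for clad::array_ref<double> so that the snippet compiles without Clad.

```cpp
// Hand-written counterpart of the gradient of fn(x, y) = x*x + y*y.
// Each product contributes one term per factor, exactly as in the dump above.
void fn_grad_by_hand(double x, double y, double* d_x, double* d_y) {
  // The reverse sweep starts from the adjoint of the return value, which is 1.
  const double d_ret = 1.0;
  *d_x += d_ret * x;  // x*x, derivative through the first factor
  *d_x += x * d_ret;  // x*x, derivative through the second factor
  *d_y += d_ret * y;  // y*y, derivative through the first factor
  *d_y += y * d_ret;  // y*y, derivative through the second factor
}
```

With both outputs zeroed beforehand, calling fn_grad_by_hand(3, 4, &d_x, &d_y) leaves 6 in d_x and 8 in d_y. Calling it a second time without resetting them doubles both values, which is why the storage has to be zero-initialized by the caller.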



## Hessian

Clad can produce the Hessian matrix of a function using its forward and reverse mode capabilities. Its interface is similar to that of reverse mode but differs when arrays are involved. It returns the matrix as a flattened vector in row-major format.

In [8]:
double kinetic_energy(double mass, double velocity) {
  return mass * velocity * velocity * 0.5;
}

In [9]:
auto hessian = clad::hessian(kinetic_energy, "mass, velocity");
double matrix[4];
hessian.execute(10, 2, matrix)

In [10]:
matrix

{ 0.0000000, 2.0000000, 2.0000000, 10.000000 }
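
The row-major layout means that entry (row, col) of the n x n Hessian sits at index row * n + col of the flattened array. The sketch below spells this out and fills in the Hessian of kinetic_energy derived by hand as a cross-check; hessian_entry and kinetic_energy_hessian_by_hand are hypothetical helpers, not part of Clad.

```cpp
#include <cstddef>

// Read entry (row, col) of an n x n matrix stored in row-major order.
double hessian_entry(const double* h, std::size_t n, std::size_t row,
                     std::size_t col) {
  return h[row * n + col];
}

// Hand-derived Hessian of 0.5 * mass * velocity * velocity,
// with inputs ordered as (mass, velocity).
void kinetic_energy_hessian_by_hand(double mass, double velocity, double* out) {
  out[0] = 0.0;       // d2/dmass2
  out[1] = velocity;  // d2/(dmass dvelocity)
  out[2] = velocity;  // d2/(dvelocity dmass)
  out[3] = mass;      // d2/dvelocity2
}
```

For mass = 10 and velocity = 2 this yields 0, 2, 2, 10, matching the matrix printed above.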

## Jacobian

Clad can produce the Jacobian of a function using its reverse mode. It returns the Jacobian matrix as a flattened vector in row-major format. The generated function has a void return type and takes the same input arguments as the original function, plus an additional argument of type T\*, where T is the pointee type of the output (the last parameter) of *fn_jacobian*. This additional argument stores the Jacobian matrix. The caller is responsible for allocating and zeroing out the Jacobian storage.

In [11]:
void fn_jacobian(double i, double j, double *res) {
  res[0] = i*i;
  res[1] = j*j;
  res[2] = i*j;
}

In [12]:
auto d_fn = clad::jacobian(fn_jacobian);
double res[3] = {0, 0, 0};
double derivatives[6] = {0, 0, 0, 0, 0, 0};
d_fn.execute(3, 5, res, derivatives);

In [13]:
derivatives

{ 6.0000000, 0.0000000, 0.0000000, 10.000000, 5.0000000, 3.0000000 }

In [14]:
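// Print the flattened 3x2 Jacobian row by row (row-major: index = row*2 + column).
std::cout << "Jacobian matrix:\n";
for (int i = 0; i < 3; ++i) {
  for (int j = 0; j < 2; ++j) {
    std::cout << derivatives[i*2 + j] << " ";
  }
  std::cout << "\n";
}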

Jacobian matrix:
6 0 
0 10 
5 3 


## Floating-point error estimation

Clad can annotate a given function with floating-point error estimation code using the reverse mode of AD.

**clad::estimate_error(f)** takes one argument: *f*, a pointer to the function or method to be annotated with floating-point error estimation code.

The signature of the generated function is the same as that produced by *clad::gradient(f)*, except that it has an extra final argument of type double&. The total floating-point error of the function is returned by reference through this argument.

In [5]:
// Generate the floating-point error estimation code for 'fn'.
auto df = clad::estimate_error(fn);
// Print the generated code to standard output.
df.dump();
// Declare the necessary variables. The inputs (5, 3) are example values;
// the derivative and error accumulators must start at zero.
double x = 5, y = 3, d_x = 0, d_y = 0, final_error = 0;
// Finally call execute on the generated code.
df.execute(x, y, &d_x, &d_y, final_error);
// After this, 'final_error' contains the estimated floating-point error in 'fn'.

The code is: 
void fn_grad(double x, double y, clad::array_ref<double> _d_x, clad::array_ref<double> _d_y, double &_final_error) {
    double _t2;
    double _t3;
    double _t4;
    double _t5;
    double _t6;
    double _t7;
    double _ret_value0 = 0;
    _t4 = x;
    _t3 = x;
    _t5 = _t4 * _t3;
    _t2 = y;
    _t7 = y;
    _t6 = y;
    _ret_value0 = _t5 * _t2 + _t7 * _t6;
    goto _label0;
  _label0:
    {
        double _r0 = 1 * _t2;
        double _r1 = _r0 * _t3;
        * _d_x += _r1;
        double _r2 = _t4 * _r0;
        * _d_x += _r2;
        double _r3 = _t5 * 1;
        * _d_y += _r3;
        double _r4 = 1 * _t6;
        * _d_y += _r4;
        double _r5 = _t7 * 1;
        * _d_y += _r5;
    }
    double _delta_x = 0;
    _delta_x += std::abs(* _d_x * x * 1.1920928955078125E-7);
    double _delta_y = 0;
    _delta_y += std::abs(* _d_y * y * 1.1920928955078125E-7);
    _final_error += _delta_y + _delta_x + std::abs(1. * _ret_value0 * 1.1920928955078125E-7);
}
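
The last lines of the generated code show the error model in use: each input contributes |derivative * value * eps|, the return value contributes |value * eps|, and eps is 1.1920928955078125E-7, which is the machine epsilon of float (2^-23). The sketch below evaluates the same expression by hand for fn(x, y) = x*x*y + y*y; fn_error_by_hand and kEps are hypothetical names, and the derivatives are written out analytically instead of being produced by Clad.

```cpp
#include <cmath>

// Constant taken from the generated code above (float machine epsilon, 2^-23).
const double kEps = 1.1920928955078125e-7;

// Hand-written version of the error estimate for fn(x, y) = x*x*y + y*y.
double fn_error_by_hand(double x, double y) {
  const double ret = x * x * y + y * y;  // function value
  const double d_x = 2 * x * y;          // d fn / d x
  const double d_y = x * x + 2 * y;      // d fn / d y
  const double delta_x = std::abs(d_x * x * kEps);
  const double delta_y = std::abs(d_y * y * kEps);
  return delta_y + delta_x + std::abs(1.0 * ret * kEps);
}
```

At (5, 3) the three contributions are 150, 93 and 84 times eps, so the estimate is 327 times eps.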



## Functors

Clad can also differentiate functors, that is, objects of classes that define operator(). The differentiated call operator can read the data members of the object, as the generated code below shows.

In [15]:
class Equation {
  double m_x, m_y;

  public:
  Equation(double x, double y) : m_x(x), m_y(y) {}
  double operator()(double i, double j) {
    return m_x*i*j + m_y*i*j;
  }
  void setX(double x) {
    m_x = x;
  }
};

In [16]:
Equation E(3,5);
auto d_E = clad::differentiate(E, "i");

In [17]:
d_E.dump()

The code is: double operator_call_darg0(double i, double j) {
    double _d_i = 1;
    double _d_j = 0;
    double &_t2 = this->m_x;
    double _t3 = _t2 * i;
    double &_t4 = this->m_y;
    double _t5 = _t4 * i;
    return (0. * i + _t2 * _d_i) * j + _t3 * _d_j + (0. * i + _t4 * _d_i) * j + _t5 * _d_j;
}
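
Dropping the terms that are multiplied by zero, the generated expression reduces to m_x * j + m_y * j. The sketch below writes that derivative by hand as a member function so the result can be compared; EquationSketch and d_di are hypothetical names and not something Clad produces.

```cpp
// Standalone copy of the Equation functor with a hand-written derivative.
class EquationSketch {
  double m_x, m_y;

 public:
  EquationSketch(double x, double y) : m_x(x), m_y(y) {}

  // Original computation: m_x*i*j + m_y*i*j
  double operator()(double i, double j) const {
    return m_x * i * j + m_y * i * j;
  }

  // Derivative of operator() with respect to i. The expression is linear
  // in i, so the derivative does not depend on the value of i.
  double d_di(double /*i*/, double j) const {
    return m_x * j + m_y * j;
  }
};
```

For an object constructed as EquationSketch(3, 5), the derivative with respect to i at j = 9 is (3 + 5) * 9 = 72, whatever the value of i.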

