Automatic differentiation in C++; infinite differentiability of conditionals, loops, recursion and all things C++
dCpp is a tool for automatic differentiation made to be intuitive to the mind of a C++ programmer and non-invasive to their process. Despite its ease of use, it retains flexibility, allowing implementations of differentiable (sub)programs that operate on differentiable derivatives of other (sub)programs, where the entire process may again be differentiable. This enables trainable training processes and flexible program analysis through operational calculus.
dCpp was originally developed as an example of how automatic differentiation can be viewed through Tensor and Operational Calculus. It has since been applied by various parties to a variety of tasks, from dynamical systems analysis and digital geometry to general program analysis and optimization.
Note that this was the author's first C++ project, which is reflected in the repository :).
We demonstrate the utilities of dCpp on a simple, encompassing example.
First, we include the necessities:
```cpp
#include <iostream>
#include <dCpp.h>
```
We initialize an n-differentiable programming space:
```cpp
using namespace dCpp;

int n_differentiable = 3;
dCpp::initSpace(n_differentiable);
```
The API of `var` complies with that of the standard C++ types, and when an instance of `var` is left uninitialized, it behaves as the type `double` would have. We may envision an instance of `var` as an element of the differentiable virtual memory algebra, elevating C++ to a differentiable programming space dCpp. This means that any program can be made differentiable by simply substituting the type `double` for the type `var`, while the coding process of the user can be left unchanged towards the initially intended goal.
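For illustration, here is a minimal sketch of that substitution (the function `norm2` is a hypothetical example, not part of the repository):

```cpp
// An ordinary numeric routine...
double norm2(double a, double b)
{
    return a * a + b * b;
}

// ...becomes differentiable by substituting var for double;
// the body is left untouched.
var norm2(const var& a, const var& b)
{
    return a * a + b * b;
}
```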
By coding a simple recursive function `foo`, we see that the usual usage of constructs such as conditionals, loops, and recursion remains unchanged:
```cpp
var foo(const var& x, const var& y)
{
    if (x < 1)
        return y;
    else if (y < 1)
        return x;
    else
        return x / foo(x / 2, y) + y * foo(x, y / 3);
}
```
To test it, we declare two instances of `var`:

```cpp
var x = 10;
var y = 13;
```
Variables with respect to which differentiation is to be performed need to be initialized as such. This assures that uninitialized instances behave as the type `double` does, with the difference that all instances of `var` are differentiable with respect to all initialized instances.
```cpp
dCpp::init(x);
dCpp::init(y);
```
The derivatives are extracted by specifying the memory location of the variable with respect to which differentiation is to be performed:
```cpp
var f = foo(x, y);
std::cout << f << std::endl;
std::cout << f.d(&x) << std::endl;
std::cout << f.d(&y) << std::endl;
```

```
884.998
82.1202
193.959
```
The virtual memory space is constructed through tensor products of the C++ internal representation of memory. This means that derivatives are themselves elements of the differentiable virtual memory:
```cpp
var fx = f.d(&x);
std::cout << fx.d(&x) << std::endl;
std::cout << fx.d(&y) << std::endl;

var fy = f.d(&y);
std::cout << fy.d(&x) << std::endl;
std::cout << fy.d(&y) << std::endl;
```

```
-0.103319
18.7722
18.7722
28.8913
```
We can thus employ derivatives of `f` in further (n-1)-differentiable calculations:
```cpp
var F = dCpp::sqrt((fx^2) + (fy^2));
std::cout << F << std::endl;
std::cout << F.d(&x) << std::endl;
std::cout << F.d(&y) << std::endl;
```

```
210.627
17.2464
33.9239
```
As the derivatives of `f` are (n-1)-differentiable (twice, in our case), we can interweave them in calculations containing `f` itself:
```cpp
var t = dCpp::sqrt(((fx^2) + (fy^2)) / f);
std::cout << t << std::endl;
std::cout << t.d(&x) << std::endl;
std::cout << t.d(&y) << std::endl;
```

```
7.08016
0.251241
0.364486
```
This is particularly useful when analyzing and optimizing differential equations, where usually both `f` and its (higher-order) derivatives appear in the same expression.
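As a sketch (reusing `f`, `fx`, and `fy` from above; the particular equation is a hypothetical illustration), one could form the residual of such an expression and differentiate it further:

```cpp
// Sketch: a residual of a (hypothetical) differential equation,
// mixing f with its derivatives in one differentiable expression.
var residual = fx + fy - f;
std::cout << residual << std::endl;        // how far f is from solving it
std::cout << residual.d(&x) << std::endl;  // the residual is itself
                                           // differentiable, so it can
                                           // be minimized in turn
```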
The order of an expression is the lowest order among the expressions appearing in its construction.
| Expression | Order |
|---|---|
| `f` | 3 |
| `fx = f.d(&x)` | 2 |
| `fy = f.d(&y)` | 2 |
| `(fx^2 + fy^2) / f` | 2 |
| `fxx = fx.d(&x)` | 1 |
| `fxy = fx.d(&y)` | 1 |
| `f * (fxy + fxx) / (fx - fy)` | 1 |
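As a sketch of the last row (assuming the variables from the session above), each extracted derivative drops one order of differentiability, and the mixed expression inherits the lowest order among its constituents:

```cpp
// Each d() lowers the order of the result by one; an expression
// has the lowest order among its constituents.
var fxx = fx.d(&x);                   // order 1
var fxy = fx.d(&y);                   // order 1
var g = f * (fxy + fxx) / (fx - fy);  // order 1
std::cout << g.d(&x) << std::endl;    // still differentiable once;
                                      // g.d(&x) itself has order 0
```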
Hence, when we want to perform some non-differentiable operation on an expression, such as incrementing a variable in a gradient descent iteration, we should extract the value of its derivative through the `id` attribute of the `var` instance:
```cpp
double lambda = 0.1;
double fx_double = f.d(&x).id;
x += lambda * fx_double;
double fy_double = f.d(&y).id;
y += lambda * fy_double;
```
An example of gradient descent can be found in examples/barycenterGD, with a detailed explanation available in the closed issue here.
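As a minimal sketch of how such an update might be placed in a loop (the step count and learning rate are illustrative assumptions, and we assume `var` supports `-=` with a `double` analogously to the `+=` above; consult examples/barycenterGD for the authoritative version):

```cpp
// Minimal gradient-descent sketch (illustrative parameters only;
// see examples/barycenterGD for a complete, explained example).
double rate = 0.1;
for (int step = 0; step < 100; ++step)
{
    var f = foo(x, y);       // differentiable forward pass
    double gx = f.d(&x).id;  // extract plain doubles, so the update
    double gy = f.d(&y).id;  // itself is not differentiated
    x -= rate * gx;          // descend along the gradient
    y -= rate * gy;
}
```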
If a certain mapping the user desires is not provided in the dCpp namespace, but its derivative exists, they may create the desired map by employing the operator `tau`.
Let us assume that the map `log` is not provided, and create it using `tau` by providing it with two maps: `log : double -> double` and `log_primitive : var -> var`.
```cpp
var log_primitive(const var& v)
{
    return 1 / v;
}

tau log(std::log, log_primitive);
```
The map is now ready to use:

```cpp
var l = log(((x^2) - (y^0.23))^2.1);
std::cout << l << std::endl;
std::cout << l.d(&x) << std::endl;
std::cout << l.d(&y) << std::endl;
```

```
9.63263
0.427715
-0.000682522
```
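The same pattern extends to any map whose derivative we can express. As a sketch (wrapping the arctangent is a hypothetical addition here, assuming it is not already provided; its derivative is 1 / (1 + v^2)):

```cpp
// Sketch: wrapping std::atan the same way (hypothetical example).
var atan_primitive(const var& v)
{
    return 1 / (1 + (v^2));  // derivative of atan(v)
}

tau atan(std::atan, atan_primitive);

var a = atan(x / y);
std::cout << a.d(&x) << std::endl;
```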
- examples/softmax demonstrates the construction of a vectorized softmax.
- examples/barycenterGD demonstrates gradient descent on the example of finding a barycenter.
- examples/dTauUse demonstrates the use of the `tau` operator.
- examples/dEigenSoftmax demonstrates integration with external libraries, on the example of Eigen.
As this tutorial is quite brief, please consult the following discussions regarding common mistakes and solutions:
- Common mistakes with initializations, differentiability of differentiable processes, and general understanding of functionality
- Solutions with detailed explanations of the rationale behind them
Or consult the corresponding papers:
- Operational Calculus for Differentiable Programming is the paper in which the theory is derived and its use for program analysis and deep learning is outlined.
- Automatic Differentiation: a look through Tensor and Operational Calculus is the paper in which the implementation of dCpp is explained, taking a look at automatic differentiation through the eyes of tensor and operational calculus.
If you use dCpp in your work, please cite one of the following papers:
Žiga Sajovic, et al.: Operational Calculus for Differentiable Programming. arXiv e-prints arXiv:1610.07690 (2016)
```bibtex
@article{sajovic2016operational,
  author     = {Žiga Sajovic and others},
  title      = {Operational Calculus for Differentiable Programming},
  journal    = {arXiv e-prints},
  year       = {2016},
  volume     = {arXiv:1610.07690},
  eprint     = {1610.07690},
  eprinttype = {arXiv},
}
```
Žiga Sajovic: Automatic Differentiation: a look through Tensor and Operational Calculus. arXiv e-prints arXiv:1612.02731 (2016)
```bibtex
@article{sajovic2016automatic,
  author     = {Žiga Sajovic},
  title      = {Automatic Differentiation: a look through Tensor and Operational Calculus},
  journal    = {arXiv e-prints},
  year       = {2016},
  volume     = {arXiv:1612.02731},
  eprint     = {1612.02731},
  eprinttype = {arXiv},
}
```
dC++ by Žiga Sajovic is licensed under a Creative Commons Attribution 4.0 International License.