There's a very large inefficiency (a factor of around 2 at second order, more at third order) in the current implementation of fvar for our intended uses. It currently uses the template type for both values and tangents.
template <typename T>
struct fvar {
  T val_;
  T d_;
  ...
};
This winds up propagating all kinds of derivative information to values which we never use.
New Approach
What we do use is fvar<var> and fvar<fvar<var> >. In both of these cases, we can get away with a double value type.
Change storage type to the following:
template <typename T>
struct fvar {
  double val_;
  T d_;
  ...
};
Deal with the resulting test and compilation storm.
Example of Savings at Second Order
For example, multiplying two fvar instances a and b gives you value a.val_ * b.val_ and tangent a.d_ * b.val_ + a.val_ * b.d_.
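The product rule above can be sketched with the proposed layout. This is a minimal illustration, not the library's implementation: `fvar_sketch` and `product_example` are hypothetical names, and plain `double` stands in for `var` as the tangent type so the block compiles on its own. Nested tangent types would need additional mixed-type operators that are not shown here.

```cpp
#include <cassert>

// Hypothetical stand-in for the proposed layout:
// the value is always a double, the tangent keeps the template type.
template <typename T>
struct fvar_sketch {
  double val_;  // value
  T d_;         // tangent
};

// Product rule:
//   value   = a.val_ * b.val_
//   tangent = a.d_ * b.val_ + a.val_ * b.d_
template <typename T>
fvar_sketch<T> operator*(const fvar_sketch<T>& a, const fvar_sketch<T>& b) {
  return {a.val_ * b.val_, a.d_ * b.val_ + a.val_ * b.d_};
}

// Usage with T = double (an assumption made to keep the example standalone).
inline fvar_sketch<double> product_example() {
  fvar_sketch<double> a{3.0, 1.0};  // a = 3, tangent 1
  fvar_sketch<double> b{4.0, 0.0};  // b = 4, tangent 0
  return a * b;
}
```

With these inputs the value is 3 * 4 and the tangent is 1 * 4 + 3 * 0, which makes it easy to see that the value computation involves only doubles.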
Value type var: value) var * var; tangent) var * var, var * var, var + var
Value type double: value) double * double; tangent) var * double, double * var, var + var
double * double: 0 nodes, 1 multiply
var * double: 1 node, 1 multiply (value), 1 multiply (adjoint), 1 add (adjoint)
var * var: 1 node, 1 multiply (value), 2 multiplies (adjoint), 2 adds (adjoint)
var + var: 1 node, 1 add (value), 2 adds (adjoint)
Total original: 4 nodes, 9 multiplies, 9 adds
Total new: 3 nodes, 5 multiplies, 5 adds
By removing a node, there's less pointer chasing to chain() implementations and one fewer constructor call.
As an added bonus, it'll save 25% of the memory used in a multiplication (half in an addition).
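The totals can be checked by summing the per-operation counts stated above. The following is only a bookkeeping sketch: the `Cost` struct and the constant and function names are made up for illustration, and the numbers are copied from the counts in this issue rather than measured.

```cpp
#include <cassert>

// Operation counts for one autodiff operation (value pass plus adjoint pass).
struct Cost {
  int nodes;
  int multiplies;
  int adds;
};

constexpr Cost operator+(Cost a, Cost b) {
  return {a.nodes + b.nodes, a.multiplies + b.multiplies, a.adds + b.adds};
}

constexpr Cost operator*(int n, Cost c) {
  return {n * c.nodes, n * c.multiplies, n * c.adds};
}

// Per-operation counts as listed in the issue text.
constexpr Cost kDoubleTimesDouble{0, 1, 0};
constexpr Cost kVarTimesDouble{1, 2, 1};
constexpr Cost kVarTimesVar{1, 3, 2};
constexpr Cost kVarPlusVar{1, 0, 3};

// Current layout: three var * var products and one var + var.
constexpr Cost original_total() { return 3 * kVarTimesVar + kVarPlusVar; }

// Proposed layout: one double * double, two var * double, one var + var.
constexpr Cost proposed_total() {
  return kDoubleTimesDouble + 2 * kVarTimesDouble + kVarPlusVar;
}
```

Summing this way reproduces 4 nodes, 9 multiplies, 9 adds for the current layout and 3 nodes, 5 multiplies, 5 adds for the proposed one.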