A possible memory issue? #160

Closed
sergeigribanov opened this issue Jun 4, 2021 · 9 comments

Comments

@sergeigribanov

sergeigribanov commented Jun 4, 2021

I would like to use autodiff in my project, which deals with nonlinear constrained optimization. In my case the optimization is performed not once but hundreds of thousands or millions of times in a row, and the dimension of the parameter space can reach several hundred. I want to use autodiff to calculate the gradient and Hessian of the objective function inside the optimization algorithm. Before adopting autodiff in the project, I decided to run a series of tests, and in one of them I came across a memory usage issue. The test program is the following:

// C++ includes
#include <iostream>

// Eigen includes
#include <Eigen/Core>
using namespace Eigen;

// autodiff include
#include <autodiff/reverse.hpp>
#include <autodiff/reverse/eigen.hpp>
using namespace autodiff;

var f(const VectorXvar& x)
{
  return x.dot(x);
}

int main() {
  const int n = 100;
  const int numberOfEntries = 1000000;
  for (int i = 0; i < numberOfEntries; ++i) {
    // VectorXvar x = VectorXvar::Random(n);
    VectorXvar x = VectorXvar::Ones(n);
    var u = f(x);
    VectorXd g;
    MatrixXd H = hessian(u, x, g);
  }
  std::cout << "Finished!" << std::endl;
  return 0;
}

The CMake file I use to build this test program is the following:

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(autodiff-tests)

find_package(Eigen3 REQUIRED NO_MODULE)
find_package(autodiff)

add_executable(reverse-hessian ${CMAKE_CURRENT_SOURCE_DIR}/src/reverse-hessian.cpp)
target_link_libraries(reverse-hessian Eigen3::Eigen autodiff::autodiff)

The issue happens when the variable numberOfEntries is equal to 1M. In this case, memory usage gradually rises to 100%. Eventually, the process ends with the message: Killed.

If the variable numberOfEntries is equal to 100K, then the program runs without issues.

I have tested this program on my laptop (16 GB of RAM, CPU: Intel Core i7, OS: Arch Linux).

@jan-grimo
Contributor

I've encountered a memory leak in reverse auto-differentiation as well. My scalar function also contains a dot product of variables, along with plenty of other complexity, and I likewise run into OOM issues due to repeated evaluation.

Compiling @sergeigribanov's program and running it through valgrind --tool=memcheck --leak-check=full with 10 loop repeats yields the following:

==15024== 876,320 (560 direct, 875,760 indirect) bytes in 10 blocks are definitely lost in loss record 14 of 14
[...]
==15024==    by 0x128117: std::shared_ptr<autodiff::detail::IndependentVariableExpr<double> > std::make_shared<autodiff::detail::IndependentVariableExpr<double>, int const&>(int const&) (shared_ptr.h:718)
==15024==    by 0x1276F8: autodiff::detail::Variable<double>::Variable<int>(int const&) (var.hpp:1052)
==15024==    by 0x127081: Eigen::DenseBase<Eigen::Matrix<autodiff::detail::Variable<double>, -1, 1, 0, -1, 1> >::Ones(long) (CwiseNullaryOp.h:582)
==15024==    by 0x126447: main (main.cpp:23)

I'm having trouble making heads or tails of this. Reducing the program to

var f(const VectorXvar& x) { return x.dot(x); }

int main() {
  VectorXvar x = VectorXvar::Ones(7);
  var u = f(x);
  VectorXd g;
  MatrixXd H = hessian(u, x, g);
  std::cout << "Finished!" << std::endl;
  return 0;
}

yields the following ctor/dtor counts, omitting Variable(const int&) and Variable(const double&), which aren't particularly interesting:

 N  Fn
-----------------------------
 8  Variable()
    - 7 placement new in VectorXvar::Ones(n)
    - 1 in dot product
                                                                                 
10  Variable(const Variable&)
    - 1 in VectorXvar::Ones(n) -> Variable(const int&) Eigen::nullaryop ctor
    - 1 in VectorXvar::Constant(n)
    - 1 in Eigen::nullaryop copy
    - 7 in Eigen::nullaryop eval
                                                                                 
23  Variable(const ExprPtr&)
    - 10 forwarded from Variable(const Variable&) in Eigen::nullaryop
    - 1 in dot product (independent expr)
    - 12 in dot product (dependent expr)
                                                                                 
32  ~Variable()
    - 1 temporary Variable in VectorXvar::Ones(n) from int that's copied
    - 1 moved-from Variable (?) between
      Eigen::[...]::Ones(n) and Eigen::[...]::Constant(n, var(1))
    - 7 in Eigen::nullaryop eval assignment
    - 1 in Eigen::nullaryop evaluator dtor
    - 1 in Eigen::nullaryop dtor
    - 13 in dot product
    - 8 uncaught by gdb at end of main
                                                                                 
 9  IndependentVariable()
    - 1 in VectorXvar::Ones(n) -> Variable<double>(const int&)
    - 7 in VectorXvar::Ones(n) (field initializations)
    - 1 in dot product initialization
                                                                                 
 8  ~IndependentVariable()
    - 7 in Eigen::nullaryop eval assignment
    - 1 in dot product

Most interestingly, the program does not leak if hessian is not called. The otherwise missing ~IndependentVariable() call is then performed in a ~Matrix() stack that has to be the VectorXvar deletion at the end of main.

Any ideas what this could be @allanleal so I can look closer?

@allanleal
Member

Thanks for reporting this. Are you able to create a reproducible example that does not involve Eigen? That could simplify the memory leak checking.

@jan-grimo
Contributor

I don't think I can. The Eigen-related functionality seems to be important for this. If I replace the VectorXvar with individual var instances and redefine the function as a manual dot-product of sorts, the program doesn't leak.
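
For reference, here is a rough sketch of the variant I mean: a hand-rolled dot product over individual var instances instead of VectorXvar (the derivatives call at the end is only illustrative, not the exact code I ran):

// Sketch only: same workload as the original test, but with individual var
// instances and a manual dot product instead of VectorXvar. This variant
// does not appear to leak.
#include <vector>

#include <autodiff/reverse.hpp>
using namespace autodiff;

int main() {
  const int n = 100;
  const int numberOfEntries = 1000000;
  for (int i = 0; i < numberOfEntries; ++i) {
    std::vector<var> x(n, var(1.0));
    var u = 0.0;
    for (int k = 0; k < n; ++k)
      u += x[k] * x[k];                      // manual dot product
    auto [ux0] = derivatives(u, wrt(x[0]));  // illustrative 1st-order derivative
  }
  return 0;
}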

I can try to run this through cvise though. Not quite sure how to preserve the exact leak, but it might narrow down the culprit?

@jan-grimo
Contributor

I whipped up a graphviz representation here and ran it against an even smaller variant to keep the graphs reasonably small:

VectorXvar x = VectorXvar::Ones(3);
var u = x.dot(x);

std::ofstream dp("dp-pre.dot");
dp << graphviz(u.expr);

VectorXd g;
MatrixXd H = hessian(u, x, g);

std::ofstream dpo("dp-post.dot");
dpo << graphviz(u.expr);

The resulting graphs are here in SVG form. The short summary is that forming the expression tree is unproblematic and looks okay, but computing the gradients generates shared_ptr cycles which can't be freed. I highlighted in red one cycle I found at a glance in the graph.
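
For anyone less familiar with the failure mode, here is a minimal standalone illustration of such a shared_ptr ownership cycle (plain C++, nothing autodiff-specific):

#include <memory>

// Two nodes that own each other via shared_ptr are never destroyed, because
// each keeps the other's reference count above zero. This is the general
// mechanism behind the leak.
struct Node {
    std::shared_ptr<Node> other;
};

int main() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->other = b;
    b->other = a;   // ownership cycle
    return 0;       // neither Node is freed; valgrind reports the blocks as lost
}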

A fix might be as easy as clearing all of the gradx members in the expression tree before returning from hessian, but I've not had any success so far.

@jan-grimo
Contributor

There are two possible solutions in my view:

  1. Separate out the grad and gradx members from the expression tree entirely. That way, there can be no ownership cycles. The gradient and hessian coefficient expression trees could be kept and reevaluated in some combination with the update functionality.
  2. Traverse the full tree after the gradient and hessian functions in eigen.hpp, nulling out all VariableExpr gradx members (see the sketch after this list). Cycles would still exist, but only temporarily. All gradient or hessian coefficient expression trees would need to be freed within the gradient and hessian functions to avoid leaving cycles in memory that could leak.
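
A rough sketch of the cleanup pass in option 2, using toy types rather than the actual autodiff expression classes (the real traversal would need to cover every expression node type):

#include <memory>

// Toy expression node: operands are owned, and 'deriv' stands in for gradx,
// which may point back into the same tree and create ownership cycles.
struct Expr {
    std::shared_ptr<Expr> left, right;   // operands
    std::shared_ptr<Expr> deriv;         // gradx-like edge (possibly cyclic)
};

// Cleanup pass: once the derivative values have been extracted, walk the tree
// and drop every deriv edge so the remaining ownership graph is acyclic again.
void clearDerivatives(const std::shared_ptr<Expr>& node) {
    if (!node) return;
    node->deriv.reset();
    clearDerivatives(node->left);
    clearDerivatives(node->right);
}

int main() {
    auto root = std::make_shared<Expr>();
    root->left = std::make_shared<Expr>();
    root->left->deriv = root;    // cycle: the child's deriv owns the root
    clearDerivatives(root);      // break the cycle before root goes out of scope
    return 0;
}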

Personally, I'd advocate for 1 (it feels cleaner to me), but I think I've still not completely understood the relationships between grad and gradx in combination with the seed functions. I think it should be possible to reformulate the calculation of gradients and hessians in a way that the grad and gradx members aren't part of the expression tree itself. What do you think @allanleal?

@allanleal
Member

allanleal commented Aug 18, 2021

I'm also trying to fix this here, but so far to no avail. I've managed to reduce the problem to its simplest reproducible form:

#include <autodiff/reverse/var/var.hpp>
#include <autodiff/reverse/var/eigen.hpp>
using namespace autodiff;

var f(const var& x)
{
    return x*x;
}

int main()
{
    var x(1.0);
    var u = f(x);

    x.seedx();

    u.expr->propagatex(detail::constant<double>(1.0));

    return 0;
}

We may need to use a mix of shared_ptr and weak_ptr to eliminate these memory leaks, but I'm not fully certain how to do so at the moment. This seems to be the standard way of resolving memory leaks caused by cyclic shared pointers. But I'm also open to your proposed solutions above.
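
To show the shared_ptr/weak_ptr idea in isolation (again plain C++, not the actual expression types):

#include <memory>

// Breaking an ownership cycle with weak_ptr: the back edge observes its target
// without owning it, so reference counts can reach zero and both nodes are
// destroyed normally.
struct Node {
    std::shared_ptr<Node> child;   // owning edge (think: expression operand)
    std::weak_ptr<Node> back;      // non-owning back edge (think: a gradx-like link)
};

int main() {
    auto parent = std::make_shared<Node>();
    auto leaf = std::make_shared<Node>();
    parent->child = leaf;
    leaf->back = parent;               // no ownership cycle is formed
    if (auto p = leaf->back.lock()) {  // promote to shared_ptr before use
        // p keeps parent alive only within this scope
    }
    return 0;                          // both nodes destroyed; no leak
}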

FYI, the grad member holds the value of the gradient (useful and more efficient when only 1st-order derivatives are being computed). The gradx member is used for higher-order derivatives (instead of being of double type, it is of an expression type). This permits the expression to be differentiated again and again to obtain higher-order derivatives.

@allanleal
Member

I'm also wondering if we could get rid of shared_ptr altogether (or at least not use them when creating the expression nodes that correspond to operators). We would then free these expression graph nodes ourselves later, instead of relying on a reference count hitting zero.
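
As a generic illustration of that idea (not autodiff code): expression nodes could reference each other with plain pointers while an arena owns every node, so nothing is freed until the whole graph is discarded and cycles no longer affect lifetime.

#include <memory>
#include <vector>

// Illustration only: nodes use raw pointers for their edges; the arena owns
// them all and frees them together in its destructor.
struct Expr {
    Expr* lhs = nullptr;
    Expr* rhs = nullptr;
    double val = 0.0;
};

struct ExprArena {
    std::vector<std::unique_ptr<Expr>> nodes;

    Expr* make(double v, Expr* l = nullptr, Expr* r = nullptr) {
        nodes.push_back(std::make_unique<Expr>(Expr{l, r, v}));
        return nodes.back().get();
    }
};

int main() {
    ExprArena arena;                  // owns all nodes created below
    Expr* x = arena.make(1.0);
    Expr* u = arena.make(1.0, x, x);  // stands in for u = x * x
    (void)u;
    return 0;                         // arena destructor frees every node at once
}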

@jan-grimo
Contributor

jan-grimo commented Aug 18, 2021

use a mix of shared_ptr and weak_ptr

I thought about this too, but I was convinced that sometimes the gradx member would refer to an existing expression in the tree, and sometimes own its own expression. Having a maybe-owning pointer in each expression would complicate reasoning about the whole data structure quite a bit, and operator overloading too.

This permits this expression to be derived again and again to obtain higher order derivatives

So the main purpose for storing them somewhere is to reuse them for higher-order derivatives? There's no inherent reason they need to be stored within the expression tree itself? So hypothetically hessian could return a VectorXvar gradient and a MatrixXvar hessian containing the coefficient calculation expressions for reuse instead? Maybe supply a function for converting those to values by calling val on each coefficient?
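
Something like this hypothetical helper, for example (it assumes the MatrixXvar alias from the Eigen support header and a val() accessor for var; this is only an illustration of the proposal, not existing autodiff API):

#include <Eigen/Core>

#include <autodiff/reverse.hpp>
#include <autodiff/reverse/eigen.hpp>
using namespace autodiff;

// Hypothetical helper: take a matrix of var coefficients (e.g. a returned
// MatrixXvar Hessian) and evaluate each entry into a plain double matrix.
inline Eigen::MatrixXd evalCoefficients(const MatrixXvar& H)
{
    Eigen::MatrixXd out(H.rows(), H.cols());
    for (Eigen::Index i = 0; i < H.rows(); ++i)
        for (Eigen::Index j = 0; j < H.cols(); ++j)
            out(i, j) = val(H(i, j));   // evaluate one coefficient expression
    return out;
}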

get rid of shared_ptr altogether

I don't see a good way to keep structural sharing between expression trees with this approach.

@allanleal
Member

This issue has been fixed with PR #177 provided by @jargonzombies! Thanks a lot for this.
