Stack overflow during code generation #22

Closed
guestieng opened this issue Jun 5, 2019 · 13 comments

Comments

@guestieng

Hello,

Calling "createDynamicLibrary(...)" causes a fatal stack overflow
when code (e.g. for a "zero-order forward") is generated for
more complex models.

These models have a one-dimensional "dependent" vector and a high-dimensional "independent" one.
The dependencies in the respective function are implemented, among other things, via nested loops (nesting depth > 2).
Depending on the complexity of the function body, everything works fine as long as the dimension of the "independent" vector stays below some fairly low threshold.
Above that threshold, however, the stack overflow occurs.
The problem apparently originates in the recursive methods in code_handler_impl.hpp and /lang/c/language_c.hpp. One of the worst-behaving methods seems to be "markCodeBlockUsed(...)" in code_handler_impl.hpp.

@joaoleal
Owner

joaoleal commented Jun 5, 2019

Are you using ModelCSourceGen<>.setRelatedDependents?
Could you please provide a small example that triggers this error?
I'll try to use valgrind to identify the problem.

@guestieng
Author

guestieng commented Jun 6, 2019

No, I'm not using ModelCSourceGen<>.setRelatedDependents.
Here is a small code example that causes an overflow
(tested with a stack size of 4194304 bytes; if, e.g., dim2 is decreased, everything should work; depending on the build mode (debug/release), this threshold may shift):

#include <string>
#include <cppad/cg.hpp>
#include <cppad/cg/support/cppadcg_eigen.hpp>

using namespace CppAD;
using namespace CppAD::cg;

typedef CppAD::cg::CG<double> CGD;
typedef CppAD::AD<CGD> ADCG;
typedef Eigen::Matrix<ADCG, Eigen::Dynamic, 1> ADCGvec;

// Sum of squares of all independents, accumulated through three nested loops.
void ThisAdFunc(ADCGvec& adscalar, const ADCGvec& advec, int dim1, int dim2, int dim3) {
    adscalar(0) = 0;
    for (int ii = 0; ii < dim2; ++ii) {
        for (int jj = 0; jj < dim1; ++jj) {
            for (int kk = 0; kk < dim3; ++kk) {
                const int idx = (ii * dim1 + jj) * dim3 + kk;
                adscalar(0) = adscalar(0) + advec(idx) * advec(idx);
            }
        }
    }
}

int main(void) {
    int dim1 = 35;
    int dim2 = 29;
    int dim3 = 3;

    // Record the operation tape
    ADCGvec thisAdvec(dim1 * dim2 * dim3);
    ADCGvec thisAdscalar(1);
    CppAD::Independent(thisAdvec);
    ThisAdFunc(thisAdscalar, thisAdvec, dim1, dim2, dim3);

    // Generate and compile the model library
    CppAD::ADFun<CGD> adfun(thisAdvec, thisAdscalar);
    CppAD::cg::ModelCSourceGen<double> cgen(adfun, "adModel");
    cgen.setCreateJacobian(true);
    CppAD::cg::ModelLibraryCSourceGen<double> libcgen(cgen);
    CppAD::cg::DynamicModelLibraryProcessor<double> dynLibProc(libcgen, "<path of dynamic lib.>");
    CppAD::cg::GccCompiler<double> compiler("<compiler path>");
    std::string storageDir = "<code storage directory>";
    compiler.setSaveToDiskFirst(true);
    compiler.setTemporaryFolder(storageDir);
    compiler.setSourcesFolder(storageDir);
    dynLibProc.createDynamicLibrary(compiler);  // the stack overflow occurs here

    return 0;
}

@joaoleal
Owner

joaoleal commented Jun 6, 2019

I can't reproduce the issue on my side.
Which version of CppADCodeGen, CppAD, compilers, and operating system are you using?

I'm attaching the code that I compiled and ran under valgrind: error.zip
I'm also attaching the resulting sources: generated_sources.zip

Can you try to run your program under valgrind?
It should say exactly where the issue is.

ps: I was able to bring dim2 up to 100: generated_sources_100.zip

@guestieng
Author

For CppADCodeGen I used the version from 28.10.2018 and for CppAD the version 20180000.
As operating systems I have tested Win10 (stack size 4 MB) and now also its Linux subsystem (Ubuntu, with stack size 8 MB).
On Windows the compiler is MSVC (VS 2015) and on the Linux system g++ (v.5.4.0).
For the latter, the problems reported by valgrind differ a bit, but they also appear in "generateSourceCode(...)" of language_c.hpp after generateCode(...) in code_handler_impl.hpp has been called.
Ultimately, compared to your values, the Linux program ended with a segmentation fault for, e.g.,

dim1 = 40;
dim2 = 85;
dim3 = 4;

What stack size did you use?
Depending on that, can you reproduce the problem for higher values of dim2?

@joaoleal
Owner

joaoleal commented Jun 7, 2019

I'm running native Ubuntu 18.04.2 LTS (64bit) with a stack limit of 8192 kB.
I believe that I used the same version of CppADCodeGen (28.10.2018) and I tried CppAD 20180000 and 20170000.3.
The compiler version was significantly different: g++ (7.4.0)
The last execution I did was with:

int dim1 = 40;
int dim2 = 100;
int dim3 = 4;

Increasing dim2 even further leads to a very long execution time (the sparse Jacobian has 8000 non-zero elements for dim2=100), which becomes even longer under valgrind.
I will try to analyze memory usage (or reduce the default stack memory limit).

@bradbell
Collaborator

bradbell commented Jun 8, 2019

Do you think part of the problem is the amount of memory being used by CppAD? If so, perhaps the following functions will help diagnose the problem:
https://coin-or.github.io/CppAD/doc/ta_inuse.htm
https://coin-or.github.io/CppAD/doc/ta_available.htm

@joaoleal
Owner

Setting dim2 = 1000 will trigger the behaviour you observed.
This is due to some parts of CppADCodeGen using recursive algorithms.
I've been replacing some of the current recursive algorithms/methods with new implementations that use a non-recursive algorithm.
I can already successfully generate the node graph and the operation order in CodeHandler.
LanguageC still needs to be updated so that it uses a non-recursive algorithm.

@guestieng
Author

guestieng commented Jun 11, 2019

Thank you a lot for tackling this!
Meanwhile, I have also attempted to reproduce the overflow on a system very similar to yours
(Xubuntu 18.04 (64bit), g++ (7.4.0), default stack size limit of 8192 kB, CppAD 20180000, CppADCodeGen (28.10.2018)).
Here, I also couldn't reproduce the problem up to dim2 = 100.
If, however, I decrease the stack size to, e.g., 4096 kB via "ulimit -s 4096" in the same terminal in which the program is executed, I get the overflow at around dim2 = 50.
I don't know to what extent this issue is already resolved (I'm going to check it as soon as possible too), but perhaps this is useful for reducing the effort and execution times.

@joaoleal
Owner

I've updated LanguageC to use a non-recursive algorithm and it is now able to generate sources for dim2 = 1000: sources.zip
Unfortunately, gcc can't handle these files 😆 :
[image]

@guestieng
Author

guestieng commented Jun 13, 2019

Thanks! I have tried it out and got the same result, ...unfortunately...
However, it seems this can be circumvented by introducing temporary or auxiliary(?) variables in the code that is written into <adModel_forward_zero.c> :
I have just done a (successful) quick test by hand where I used the same gcc compiler configuration as generated in the library:

...
void adModel_forward_zero(double const *const * in,
                          double*const * out,
                          struct LangCAtomicFun atomicFun) {
   //independent variables
   const double* x = in[0];

   //dependent variables
   double* y = out[0];

   // auxiliary variables (!!!)
   double v[<appropriate dimension n>];
   v[0] = ... ;
    .
    .
    .
   v[n-1] = ... ;
   y[0] = v[0] + ... + v[n-1];
}

... or is there perhaps already some implemented option/functionality that allows you to create such temporary variables on demand, which I'm not aware of yet?

P.s.: For possible future updates perhaps the following post is also of interest: #23

@joaoleal
Owner

Temporary variables are already created automatically but only when a variable/value is used in more than one location.
In this case, the rule for creating a new temporary variable would probably have to depend on the number of operations used to compute a left-hand-side variable, so that each expression stays shorter.

@joaoleal
Owner

There is now an option to define the maximum number of operations per variable assignment:

ModelCSourceGen<Base>::setMaxOperationsPerAssignment(size_t);

@guestieng
Author

That's great! The current version passes all the concerned tests now (even with dim2 around 10000).
