Memory problems with sparse jacobian #19

Open
Saelyos opened this issue Mar 2, 2020 · 5 comments

@Saelyos

Saelyos commented Mar 2, 2020

Issue migrated from gitlab.

Hello, I'm experiencing memory problems with the sparse Jacobian.
To make them reproducible, I took the code from additional_examples/sparse/sparse_jacobian.cpp, kept only the part that computes the sparse Jacobian, and ran it 10000 times so that the bugs occur more frequently:

/*----------------------------------------------------------------------------
 ADOL-C -- Automatic Differentiation by Overloading in C++
 File:     sparse_jacobian.cpp
 Revision: $Id: sparse_jacobian.cpp 299 2012-03-21 16:08:40Z kulshres $
 Contents: example for computation of sparse jacobians

 Copyright (c) Andrea Walther, Andreas Griewank, Andreas Kowarz, 
               Hristo Mitev, Sebastian Schlenkrich, Jean Utke, Olaf Vogel
 
 This file is part of ADOL-C. This software is provided as open source.
 Any use, reproduction, or distribution of the software constitutes 
 recipient's acceptance of the terms of the accompanying license file.
 
---------------------------------------------------------------------------*/

#include <math.h>
#include <cstdlib>
#include <cstdio>
#include <iostream>   /* std::cout is used in the loop below */

#include <adolc/adolc.h>
#include <adolc/adolc_sparse.h>

#define tag 1

void ceval_ad(adouble *x, adouble *c);

int main() {
    int n=6, m=3;
    double x[6], c[3];
    adouble xad[6], cad[3];

    int i, j;

/****************************************************************************/
/*******                function evaluation                   ***************/
/****************************************************************************/

    for(i=0;i<n;i++)
        x[i] = log(1.0+i);

    /* Tracing of function c(x) */

    trace_on(tag);
      for(i=0;i<n;i++)
        xad[i] <<= x[i];

      ceval_ad(xad,cad);

      for(i=0;i<m;i++)
        cad[i] >>= c[i];
    trace_off();


/****************************************************************************/
/*******       sparse Jacobians, complete driver              ***************/
/****************************************************************************/

    for (i = 0; i < 10000; i++) {
        std::cout << i << std::endl;

        /* coordinate format for Jacobian */
        unsigned int *rind  = NULL;        /* row indices    */
        unsigned int *cind  = NULL;        /* column indices */
        double       *values = NULL;       /* values         */
        int nnz;
        int options[4];

        options[0] = 0;          /* sparsity pattern by index domains (default) */
        options[1] = 0;          /*                         safe mode (default) */
        options[2] = 0;          /*              not required if options[0] = 0 */
        options[3] = 0;          /*                column compression (default) */

        sparse_jac(tag, m, n, 0, x, &nnz, &rind, &cind, &values, options);


        free(rind); rind=NULL;
        free(cind); cind=NULL;
        free(values); values=NULL;
    }
}


/***************************************************************************/

void ceval_ad(adouble *x, adouble *c) {
    c[0] = 2*x[0]+x[1]-2.0;
    c[0] += cos(x[3])*sin(x[4]);
    c[1] = x[2]*x[2]+x[3]*x[3]-2.0;
    c[2] = 3*x[4]*x[5] - 3.0+sin(x[4]*x[5]);
}

/***************************************************************************/

I observe "double free or corruption (out)" and sometimes "free(): invalid next size (fast)" errors occurring randomly (on average it takes about 1000 calls to sparse_jac before one appears).

Compiling the code with -fsanitize=address, I could get the following trace:

=================================================================
==12746==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x603000c6f5b0 in thread T0
    #0 0x7ff75a37cc40 in operator delete[](void*) (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xebc40)
    #1 0x7ff759a25f5c in ColPack::BipartiteGraphPartialColoring::Seed_reset() (/usr/lib/x86_64-linux-gnu/libColPack.so.0+0x42f5c)
    #2 0x7ff759a2bcee in ColPack::BipartiteGraphPartialColoringInterface::~BipartiteGraphPartialColoringInterface() (/usr/lib/x86_64-linux-gnu/libColPack.so.0+0x48cee)
    #3 0x7ff759a2bd18 in ColPack::BipartiteGraphPartialColoringInterface::~BipartiteGraphPartialColoringInterface() (/usr/lib/x86_64-linux-gnu/libColPack.so.0+0x48d18)
    #4 0x7ff75a275617 in freeSparseJacInfos (/usr/lib/x86_64-linux-gnu/libadolc.so.2+0xbe617)
    #5 0x7ff75a1d6740 in setTapeInfoJacSparse (/usr/lib/x86_64-linux-gnu/libadolc.so.2+0x1f740)
    #6 0x7ff75a275d6b in sparse_jac (/usr/lib/x86_64-linux-gnu/libadolc.so.2+0xbed6b)
    #7 0x55ad70fd595c in main /home/sebastien/CLionProjects/Test_Adol-C/sparse_jacobian.cpp:73
    #8 0x7ff759cf909a in __libc_start_main ../csu/libc-start.c:308
    #9 0x55ad70fd52b9 in _start (/home/sebastien/CLionProjects/Test_Adol-C/cmake-build-debug/sparse+0x22b9)

Address 0x603000c6f5b0 is a wild pointer.
SUMMARY: AddressSanitizer: bad-free (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xebc40) in operator delete[](void*)
==12746==ABORTING

However, this does not seem to be the only source of errors; I couldn't catch all of them with sanitizers.


After some investigation, I found that the bug only occurs with column compression; everything is fine with row compression (i.e. with options[3] = 1).
So the error might be related to these lines from sparsedrivers.cpp:

g->GenerateSeedJacobian(&(sJinfos.Seed), &(sJinfos.seed_rows),
                        &(sJinfos.seed_clms), "SMALLEST_LAST","COLUMN_PARTIAL_DISTANCE_TWO");
sJinfos.seed_rows = depen;
ret_val = sJinfos.seed_clms;

From what I understand, memory is being corrupted at some point (something is written where it shouldn't be, but without causing a segfault), and when this memory is later freed we get the "double free or corruption" or "invalid next size" error, depending on which part of the allocated block is corrupted. However, I couldn't find what causes this.

Additional information

OS: Debian 10
Adol-C version: tested with v2.6.3 and with the current master branch (9d229164)

@awalther1
Contributor

Hi,

I just committed a fix to the master branch of ADOL-C. Could you check whether it resolves your problem?

Thanks,

Andrea

@Saelyos
Author

Saelyos commented Mar 30, 2020

Hi,

I've run the same program as above with the latest master version, but unfortunately it doesn't solve the issue: I still get the "double free or corruption" error.

@awalther1
Contributor

Hi,

I just noticed that you suspect that

g->GenerateSeedJacobian(&(sJinfos.Seed), &(sJinfos.seed_rows),
                        &(sJinfos.seed_clms), "SMALLEST_LAST","COLUMN_PARTIAL_DISTANCE_TWO");

might cause the problem. If so, the problem is located in the ColPack library, i.e., the external library used by ADOL-C. I will write to its developers about this issue.

Best

Andrea

@jiesky

jiesky commented Feb 3, 2023

Hi,
I also have the same problem with ADOL-C and ColPack. Has anyone solved it?

@awalther1
Contributor

Hi,
unfortunately, as far as I know, it has not been solved. I will try once more to contact the ColPack developers.
