Segmentation fault in CVAfindIndex #31

aseyboldt · 2020-02-26T10:33:29Z

I am using the adjoint sensitivity analysis functionality of sunodes, and sporadically I get segmentation faults during the backward pass. Unfortunately I have not been able to reproduce this with a small example so far, but the core dump seems to indicate that CVAfindIndex tries to access checkpoint data that does not exist. This happens relatively near t0 of the forward problem, in a region where the solver is making (ridiculously?) small steps.

It seems that CVAfindIndex is trying to find a checkpoint for t = 161.33623519238427 but the largest of the 600 entries in ca_mem->dt_mem has only t = 161.33623519238293.
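To make the numbers above concrete, here is a minimal sketch of the failure mode. It is not the actual CVAfindIndex code. `find_index` and the evenly spaced `times` list are hypothetical stand-ins for a lookup over the 600 stored entries; only the two t values come from the core dump.

```python
import bisect

def find_index(times, t):
    """Return the first index i with times[i] >= t (times is ascending).

    Hypothetical stand-in for a lookup over stored step data. It raises
    instead of reading past the end of the buffer.
    """
    i = bisect.bisect_left(times, t)
    if i == len(times):
        raise IndexError(f"t = {t!r} is beyond the last stored time {times[-1]!r}")
    return i

t_requested = 161.33623519238427  # time the lookup was asked for
t_last = 161.33623519238293       # largest stored time in the core dump

# 600 ascending entries ending at t_last; the 1e-12 spacing is made up
# to mimic the very small steps, only the last value is from the dump.
times = [t_last - 1e-12 * k for k in range(599, -1, -1)]

try:
    find_index(times, t_requested)
    overran = False
except IndexError:
    overran = True

print(overran)
```

Without the explicit `i == len(times)` check, a C loop that keeps scanning for an entry with time >= t would walk off the end of the array, which matches what the core dump seems to show.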

The details of how I'm using sundials are somewhat hidden in a Python wrapper and pymc3 (I'm sampling the parameter space with a Hamiltonian sampler), but here is a rough outline of what I'm doing:

Initialize forward and backward solvers with polynomial interpolation and a checkpoint every 600 steps
Repeat (a lot):
- Change user_data
- Call CVodeReInit and CVodeAdjReInit
- Run forward solver
- Call CVodeReInitB, CVodeQuadReInitB and CVodeBsolve repeatedly, as the adjoint rhs is not continuous.

The t of the segfault is nowhere near the discontinuities of the rhs; the first of those is at t ~ 12000.

The source for the solver calls is here: https://github.com/aseyboldt/sunode/blob/master/sunode/solver.py#L365

I can also provide the coredump if that is helpful.


aseyboldt · 2020-03-02T07:11:17Z

I think I figured out what the problem is:
Let's assume there is only one backward problem.
At the beginning of the loop that advances all the backward problems (here), ck_mem is initialized so that ck_mem->ck_t0 < cvB_mem->cv_mem->cv_tn < ck_mem->ck_t1.

The solver sets ck_mem->ck_t0 as the stop time (here) and advances the backward problem. If the solver reaches that stop time (so cvB_mem->cv_tout == ck_mem->ck_t0), then cvB_mem->cv_mem->cv_tn will still be larger than the stop time by a small amount, since it assumes (incorrectly in this case) that it cannot compute the rhs at the stop time itself.
In the next iteration, after moving on to the next checkpoint, the invariant from above no longer holds. CVStep will continue at cv_tn, so CVAfindIndex will access out-of-bounds memory (here) when looking for a step with t >= ck_mem->ck_t1.
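A toy model of the sequence described above, in case it helps to follow the argument. This is only a sketch under my reading of the code, not the CVODES implementation. The checkpoint intervals, the starting time, and the `overshoot` value are all made up for illustration.

```python
def invariant_holds(ck_t0, ck_t1, tn):
    """The condition assumed at the top of the loop: ck_t0 < tn < ck_t1."""
    return ck_t0 < tn < ck_t1

# Two consecutive checkpoint intervals, visited newest first
# (hypothetical values).
checkpoints = [(100.0, 200.0), (0.0, 100.0)]

# Assumption: the internal time tn ends up this far above the stop time.
overshoot = 1e-9

tn = 150.0  # the backward problem starts inside the newest interval
history = []
for ck_t0, ck_t1 in checkpoints:
    # Record whether the invariant holds on entry to this interval.
    history.append(invariant_holds(ck_t0, ck_t1, tn))
    # Integrate backward with ck_t0 as the stop time; tn stops slightly
    # above ck_t0 instead of exactly on it.
    tn = ck_t0 + overshoot

print(history)
```

In the second interval, tn is slightly larger than ck_t1 = 100.0, so a lookup for stored forward data at tn asks for a time that is past the end of that section.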

Wouldn't it be better to compute a few more points when re-integrating the forward problem, so that the checkpoint data sections overlap slightly? Then the solver would not have to integrate right up to the stop time in all but the last checkpoint section. That might also lower interpolation errors somewhat, I guess.
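A rough sketch of what I mean by overlapping sections. Everything here is hypothetical (`section_times`, `covers`, the step counts, the uniform step size); it only illustrates that a couple of extra stored points past the end of a section would cover a tn that sits slightly above the section boundary.

```python
def section_times(t_start, t_end, n_steps, extra_steps=0):
    """Evenly spaced stored times for one checkpoint section.

    With extra_steps > 0 the section continues past t_end using the same
    step size, so it overlaps the start of the following section.
    """
    h = (t_end - t_start) / n_steps
    return [t_start + k * h for k in range(n_steps + extra_steps + 1)]

def covers(times, t):
    """True if t lies within the range of stored times."""
    return times[0] <= t <= times[-1]

tn = 100.0 + 1e-9  # backward solver time, slightly above the boundary

plain = section_times(0.0, 100.0, n_steps=10)
padded = section_times(0.0, 100.0, n_steps=10, extra_steps=2)

print(covers(plain, tn))
print(covers(padded, tn))
```

With the plain section the last stored time is exactly the boundary, so tn falls outside it. With two extra steps the stored range extends past the boundary and the lookup stays inside the buffer.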

aseyboldt · 2020-03-18T21:43:49Z

@balos1 Not sure who to ping, I hope this is alright.
I just ran into an example where I think this bug leads to silently incorrect results. I'd really appreciate it if someone who knows the code could have a look.

aseyboldt mentioned this issue Feb 26, 2020

Check upper end of buffer in CVAfindIndex #32

Closed

aseyboldt mentioned this issue Jul 1, 2020

Infinite loop in CVodeF #44

Closed

balos1 added the pkg-CVODES label Sep 10, 2020

aseyboldt mentioned this issue Sep 29, 2020

Segmentation fault in CVAdataStore #49

Open

balos1 added the triage label Mar 29, 2023
