Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hard to trigger crash with async queries and InactiveDestroy #37

Open
johto opened this issue Apr 20, 2018 · 1 comment
Open

Hard to trigger crash with async queries and InactiveDestroy #37

johto opened this issue Apr 20, 2018 · 1 comment
Assignees

Comments

@johto
Copy link

johto commented Apr 20, 2018

Hi,

I unfortunately can't share the test case since it uses thousands of lines of proprietary code, but it does roughly this:

use DBI;
use DBD::Pg ':async';
use Time::HiRes;

sub mayhem {
    my $dbh = DBI->connect("dbi:Pg:", '', '', {pg_enable_utf8 => 1, PrintError => 0, RaiseError => 1, pg_server_prepare => 0});
    my $sth = $dbh->prepare(q{SELECT CASE WHEN pg_sleep(random() * 0.25) IS NOT NULL THEN random()::text::uuid END}, {pg_async => PG_ASYNC});
    $dbh->{InactiveDestroy} = 1;
    $sth->execute();
    for (my $i = 0; ; ++$i) {
        if ($sth->pg_ready) {
            last;
        }
        if ($i >= 100) {
            print STDERR "wait timeout\n";
            $sth->cancel();
            exit(1);
        }
        Time::HiRes::sleep(0.05);
    }
    eval {
        $sth->pg_result();
    };
}
mayhem();

which sometimes crashes with a backtrace like this:

(gdb) bt
#0  0x00007ffff61c40fb in PQclear () from /usr/lib/x86_64-linux-gnu/libpq.so.5
#1  0x00007ffff64107f8 in pg_db_destroy (dbh=0x1d1bc40, imp_dbh=0x1d22570) at dbdimp.c:688
#2  0x00007ffff6407b54 in XS_DBD__Pg__db_DESTROY (my_perl=0x603010, cv=<optimized out>) at ./Pg.xsi:390
#3  0x00007ffff664a656 in XS_DBI_dispatch (my_perl=<optimized out>, cv=0x8c0268) at DBI.xs:3797
#4  0x00007ffff7b0ad26 in Perl_pp_entersub () from /usr/lib/libperl.so.5.18
#5  0x00007ffff7a94846 in Perl_call_sv () from /usr/lib/libperl.so.5.18
#6  0x00007ffff7b13d71 in ?? () from /usr/lib/libperl.so.5.18
#7  0x00007ffff7b14500 in Perl_sv_clear () from /usr/lib/libperl.so.5.18
#8  0x00007ffff7b14b8d in Perl_sv_free2 () from /usr/lib/libperl.so.5.18
#9  0x00007ffff7aecf48 in ?? () from /usr/lib/libperl.so.5.18
#10 0x00007ffff7aed731 in Perl_mg_free () from /usr/lib/libperl.so.5.18
#11 0x00007ffff7b1479b in Perl_sv_clear () from /usr/lib/libperl.so.5.18
#12 0x00007ffff7b14b8d in Perl_sv_free2 () from /usr/lib/libperl.so.5.18
#13 0x00007ffff7ac2938 in Perl_cv_undef () from /usr/lib/libperl.so.5.18
#14 0x00007ffff7b1473c in Perl_sv_clear () from /usr/lib/libperl.so.5.18
#15 0x00007ffff7b14b8d in Perl_sv_free2 () from /usr/lib/libperl.so.5.18
#16 0x00007ffff7b36698 in Perl_free_tmps () from /usr/lib/libperl.so.5.18
#17 0x00007ffff7b03a35 in Perl_pp_nextstate () from /usr/lib/libperl.so.5.18
#18 0x00007ffff7b03346 in Perl_runops_standard () from /usr/lib/libperl.so.5.18
#19 0x00007ffff7a9bb24 in perl_run () from /usr/lib/libperl.so.5.18
#20 0x0000000000400dd9 in main ()

If I reset InactiveDestroy back to 0 before exiting the subroutine, the problem goes away.

My hypothesis is that the culprit is this code in dbd_st_destroy:

   /* If the InactiveDestroy flag has been set, we go no further */
   if (DBIc_IADESTROY(imp_dbh)) {
       if (TRACE4_slow) {
           TRC(DBILOGFP, "%sskipping sth destroy due to InactiveDestroy\n", THEADER_slow);
       }
       fprintf(stderr, "%x\n", imp_sth);
       DBIc_IMPSET_off(imp_sth); /* let DBI know we've done it */
       if (TEND_slow) TRC(DBILOGFP, "%sEnd dbd_st_destroy (InactiveDestroy set)\n", THEADER_slow);
       return;
   }

which DBIc_IMPSET_off()s the imp_sth still assigned to imp_dbh->async_sth. Later on, when dbd_db_destroy is called, its imp_dbh->async_sth points to freed memory containing garbage and the PQclear() call crashes. My digging around with gdb supports this hypothesis: imp_dbh->async_sth still has a pointer to the same location that previously held a valid sth, but *imp_dbh->async_sth is just garbage.

I've verified this against the master branches of both DBI and DBD::Pg, but this problem seems to have existed for a long time.

@turnstep
Copy link
Contributor

We recently cleared up some problems with PQclear - could you see if a newer DBD::Pg clears it up for you? Anything >= version 3.10.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants