Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqliteutil.Exec doesn't release prepare statements / stmt.Reset() after INTERRUPT #20

Closed
2 tasks done
fasterthanlime opened this issue Jun 30, 2018 · 6 comments
Closed
2 tasks done

Comments

@fasterthanlime
Copy link
Contributor

fasterthanlime commented Jun 30, 2018

In light of #18 (comment) - I've adjusted the pool.Put check to read:

defer func() {
	if rowReturned {
		stmt.lastHasRow = true
	}
}()

and sure enough, connections with non-reset statements were being put back into the pool.

There's two problems afaict:

  • stmt.Reset() is never called from sqliteutil/exec.go
    • since it's a high-level utility, it seems like it should take care of that
  • stmt.Reset() will not reset if we've already been interrupted
    • because, before it calls sqlite3_reset, it calls stmt.interrupted, which calls stmt.conn.interrupted, which sure does return an SQLITE_INTERRUPT error

What I've done locally to "fix" it is:

  • Introduce stmt.ForceReset(), which does not check for stmt.interrupted
  • defer stmt.ForceReset() at the beginning of exec() (lower-case).

However, this isn't a great solution, as will become apparent in my next issue...

@crawshaw
Copy link
Owner

Good catch on Reset. It really should always be called, even after interrupt. I'm going to remove that condition so Reset always passes through to sqlite.

I've also found another case for #18 that I have a patch coming for in a moment.

crawshaw added a commit that referenced this issue Jun 30, 2018
According to the SQLite docs:

	Any new SQL statements that are started after the
	sqlite3_interrupt() call and before the running statements
	reaches zero are interrupted as if they had been running
	prior to the sqlite3_interrupt() call.

When we detect a closed interrupt and return false from Step without
resetting a statemnt, we leave open a statement. This means future
calls to SetInterrupt don't succeed, because of an outstanding sqlite
statement.

To fix that conditon, reset stmts when we interrupt Step without
calling into cgo. To fix the more general condition, reset any
open statements on call to SetInterrupt.

Finally, make sure Reset always resets, as discovered by Amos Wenger.

For #18
For #20
@fasterthanlime
Copy link
Contributor Author

@crawshaw #21 is slightly more annoying - it's not just Reset() that has to run even after interrupt.

It might be a good idea to make the pool.Put check even more thorough, along the lines of:

// bad pseudocode ahead
for {
  C.sqlite3_stmt_next(conn, &stmt)
  if (stmt == nil) {
    break
  }
  if C.sqlite3_stmt_busy(stmt) != 0 {
    panic("conn put back with busy statement")
  }
}

This might be too expensive for prod, so it would make sense to only enable it in debug and/or testing.

But I think this would catch any other instances of #18 we have left.


Also, are you interested in:

  • Adding a chao-like test to the codebase? (it would be skipped on -short)
  • Running all the tests on travis-ci ?

I can take care of either or both of these if that's something you'd want.

@crawshaw
Copy link
Owner

Both of those are good, please send pull requests! (Out of interest, how long is the chao test?)

crawshaw added a commit that referenced this issue Jun 30, 2018
(This is very hard to test.)

For #20
@fasterthanlime
Copy link
Contributor Author

Out of interest, how long is the chao test?

I just had it failing against 7f92843 in about 25 seconds.

I've never seen it take more than a minute (except when it deadlocks, of course).

I've really tried to make it faster, but believe me when I say I've spent two days on it and I don't think there's any parts I can remove:

  • The connection pool cannot be < 2
  • Having fewer workers drops the chance of seeing the bad behavior significantly
  • The problematic behavior often exhibits in round 2 or 3, not always in round 1
  • Making the 'connection tests' deadlines shorter may result in false negatives

The actual run time will vary across machines, and there's still a bit of randomness involved (both in the deadlines and in the actual SQL operations), but yeah, under a minute for a test like that is the best I can offer for now :)

@fasterthanlime
Copy link
Contributor Author

I think this has been addressed by 7f92843 and 5954824

chao runs successfully on master if I comment out sqliteutil.Save, which would be #21

👍 to close this

@crawshaw
Copy link
Owner

I think the issues here are fixed, please reopen if I missed something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants