Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unwinding can result in connections with broken invariants being returned to the connection pool #31

Open
sgrif opened this issue Mar 6, 2017 · 11 comments · May be fixed by #70
Open

Comments

@sgrif
Copy link
Contributor

sgrif commented Mar 6, 2017

Currently PooledConnection unconditionally returns the connection to the pool on drop. Since a panic could happen at any time, we can't assume that the connection will be in a good state if it's being dropped as the result of unwinding. r2d2 should either check thread::panicking before checking the connection back into the pool, or should have a T: UnwindSafe constraint on the connection.

@sgrif sgrif changed the title r2d2 should not attempt to return connections to the pool during unwinding Unwinding can result in connections with broken invariants being returned to the connection pool Mar 6, 2017
@sfackler
Copy link
Owner

sfackler commented Mar 7, 2017

Is this in reference to a connection implementation that will be in a bad state during a panic, or is this more of a general principle of the thing?

@sgrif
Copy link
Contributor Author

sgrif commented Mar 13, 2017

I don't have a concrete example of a connection which has broken invariants when a panic occurs if that's what you mean. I'm sure I can find one pretty easily if you really don't believe that those cases exist.

@sfackler
Copy link
Owner

The fact that I'm not aware of any instance in which this has been a problem in the last several years guides my instincts to some extent 😃 .

That being said, I don't think I have a problem dropping connections that are checked back in after a panic. It can always be made configurable if that impacts someone's use case negatively.

@sgrif
Copy link
Contributor Author

sgrif commented Mar 15, 2017

The fact that I'm not aware of any instance in which this has been a problem in the last several years

catch_unwind has been stable for less than a year. 😄

@sfackler
Copy link
Owner

I don't see how that's really related? Other kinds of "panic safety" like mutex poisoning has been around for quite a long time.

@Diggsey
Copy link

Diggsey commented Apr 17, 2018

I just ran into this issue - it's definitely not theoretical, and has some pretty nasty consequences: namely transactions may not be closed correctly when a thread panics. Since database locks are held until the transaction is closed, this can result in the entire system locking up indefinitely.

I believe the "most correct" solution here would be to require manually returning the connection to the pool, because using thread::panicking is not quite correct (the connection may have been opened whilst unwinding, in which case it is safe to return).

For backwards compatibility, perhaps you could add a "discard" method and an "unwind-safe" wrapper which automatically discards the connection when dropped unless you explicitly return it to the pool.

@Diggsey
Copy link

Diggsey commented Apr 18, 2018

Hm, on second thoughts, the thread::panicking solution is probably "good enough" (tm) - the remaining functionality can be implemented on the underlying connection type using has_broken.

sgrif added a commit to sgrif/r2d2 that referenced this issue Jan 24, 2019
Currently `Pool` does not implement `UnwindSafe` or `RefUnwindSafe`.
This is due to `Condvar` not implementing it in the standard library.
That type probably should implement it, but `Pool` shouldn't be
`UnwindSafe` prior to this commit anyway. `antidote::Mutex` incorrectly
implements `UnwindSafe`, when it should not as it removes the poisoning
mechanism that makes `Mutex` be `UnwindSafe` in the first place.

Ultimately, prior to this commit, `Pool<M>` should only be `UnwindSafe`
if `M: UnwindSafe` and `M::Connection: UnwindSafe`. The need for that
bound on `M::Connection` is because we return the connection to the pool
on panic, even if it's in a potentially invalid state.

This commit adds explicit implementations of `UnwindSafe` and
`RefUnwindSafe`, and also removes the need to bound that on the
connection being `UnwindSafe` by only returning a connection to the pool
if it was not dropped during a panic. This ensures that we don't end up
in a situation where a connection is potentially returned to the pool in
a state where `is_broken` would return `true`, but it is not in an
expected state (e.g. having an open transaction). It also means that the
connection can be expected to be dropped if a panic occurs while it is
being used (e.g. ensuring the connection is terminated if there was an
open transaction).

Fixes sfackler#63
Fixes sfackler#31
sgrif added a commit to sgrif/r2d2 that referenced this issue Jan 24, 2019
Currently `Pool` does not implement `UnwindSafe` or `RefUnwindSafe`.
This is due to `Condvar` not implementing it in the standard library.
That type probably should implement it, but `Pool` shouldn't be
`UnwindSafe` prior to this commit anyway. `antidote::Mutex` incorrectly
implements `UnwindSafe`, when it should not as it removes the poisoning
mechanism that makes `Mutex` be `UnwindSafe` in the first place.

Ultimately, prior to this commit, `Pool<M>` should only be `UnwindSafe`
if `M: UnwindSafe` and `M::Connection: UnwindSafe`. The need for that
bound on `M::Connection` is because we return the connection to the pool
on panic, even if it's in a potentially invalid state.

This commit adds explicit implementations of `UnwindSafe` and
`RefUnwindSafe`, and also removes the need to bound that on the
connection being `UnwindSafe` by only returning a connection to the pool
if it was not dropped during a panic. This ensures that we don't end up
in a situation where a connection is potentially returned to the pool in
a state where `is_broken` would return `true`, but it is not in an
expected state (e.g. having an open transaction). It also means that the
connection can be expected to be dropped if a panic occurs while it is
being used (e.g. ensuring the connection is terminated if there was an
open transaction).

We still need an `UnwindSafe` bound on the connection manager, as we
can't guarantee that it will be in a reasonable internal state if a
panic occurs in one of its methods

Fixes sfackler#63
Fixes sfackler#31
@sgrif sgrif linked a pull request Jan 24, 2019 that will close this issue
@Boscop
Copy link

Boscop commented Apr 24, 2020

We're running into this issue in production.

Does anyone know a workaround that works now?

@sfackler
Copy link
Owner

Can you not use RAII for your transaction management?

@Boscop
Copy link

Boscop commented Apr 24, 2020

@sfackler You mean rolling back the transaction on Drop if the drop was caused by a panic?

@sfackler
Copy link
Owner

I mean rolling back the connection on Drop if the transaction hasn't been committed: https://github.com/sfackler/rust-postgres/blob/master/tokio-postgres/src/transaction.rs#L30

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants