Absolute lack of thread-safety SqlBackend should be documented.
#981
Comments
Dang. That's a surprising thing to hear! Thanks for reporting it. |
@parsonsmatt the issue boils down to most (all?) query functions being built on top of |
I'm actually somewhat baffled what prompted this design in the first place... |
Hell, you're entirely right. I think I get the intent - we want to prepare a statement once, then execute it multiple times. But sharing it across multiple threads is problematic. Gating access behind an Changing it to
- pure $ Map.lookup sqlText stmtMap
+ threadId <- myThreadId
+ pure $ Map.lookup (threadId, sqlText) stmtMap
This fixes the multi-threaded access problem, but it isn't obvious to me that this is "good" in the initial design goals for the Could also have:
- Map Sql (IORef Statement)
+ Map Sql (IO Statement)
where the |
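For anyone following along, the per-thread lookup in the diff above could be sketched roughly like this. This is not persistent's actual code: `Stmt`, `StmtCache`, `newCache` and `getStmt` are hypothetical names, and SQL is a plain `String` to keep the example dependency-free beyond `containers`. Note that it only separates threads from each other; it does nothing for one thread running the same query twice at once, which comes up later in the thread.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a backend's prepared statement.
newtype Stmt = Stmt String deriving (Eq, Show)

-- Cache keyed by (thread, SQL text), so two threads never share a statement.
type StmtCache = IORef (Map.Map (ThreadId, String) Stmt)

newCache :: IO StmtCache
newCache = newIORef Map.empty

-- Return the calling thread's statement for this SQL, preparing it on a miss.
-- Only the calling thread ever writes its own key, so the read-then-insert
-- sequence cannot race with another thread on the same entry.
getStmt :: StmtCache -> (String -> IO Stmt) -> String -> IO Stmt
getStmt cache prepare sql = do
  tid <- myThreadId
  cached <- Map.lookup (tid, sql) <$> readIORef cache
  case cached of
    Just stmt -> pure stmt
    Nothing -> do
      stmt <- prepare sql
      atomicModifyIORef' cache (\m -> (Map.insert (tid, sql) stmt m, ()))
      pure stmt
```

One obvious cost of this design is that entries for threads that have exited are never evicted, so a long-running program that spawns many short-lived threads would need some cleanup strategy.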
Ok, bad news first: After having This means your suggested fix 1) wouldn't actually fix it and 2) would trigger version bumps across the entire set of persistent packages, since SqlBackend is part of the public API. Now, on to the good news: I believe it's possible to implement a workaround that keeps the current types/API that should have negligible impact on current uses that are safe and minor performance impact (in the form of additional statement preparation/finalisation, but I think that generally shouldn't be that costly) on uses that are currently utterly broken and unsafe anyway. |
Spoke too soon, since persistent exports almost all of its internals I think a version bump is unavoidable... |
I think the right way to go is to change The happy path (i.e. code that's currently correct) will just keep reusing the same prepared statement from the map, whereas Alternatively, the Unfortunately a change like this will have a pretty large ripple effect across On the upside this change should be fairly mechanical and simple. |
Ok, so I actually did figure out how to make a backwards compatible fix to So without a fix for that issue my fix is somewhat useless (well, it still fixes a whole bunch of issues, including running the same query multiple times in the same thread and concurrent non-insert operations). |
This can probably be solved by extending |
So can this issue now be closed thanks to 983 and 984? I see #917 but am not clear on how that relates. |
Eh, no, this is very much not solved. 983 and 984 just exist so you can more easily use 1 connection per thread with the #917 is the exact error you get when you try to use the same |
I'm a little lost. Why would anyone ever imagine that they can use the same |
SQLite database connections are thread-safe when used directly. It's not obvious that Additionally, what the current "Why would you ever do that?" Well, if you try and combine the results of multiple queries in a streaming way via Conduit, for one. Suppose we have two conduits |
Oh, actually, I think you don't even need to have the same query running twice from the same thread. Simply having a premature abort in your conduit (i.e. not consuming all the query output) will leave the query in an inconsistent state and crash you on the next invocation if I understand the code correctly. |
I admit that I have never used the SQLite backend, and I'm not familiar with whatever specialized thread-safety guarantees you get there. But in general, DB connections are not thread-safe, and pools are industry-wide standard practice. Within that context, the current design of Does it make sense to make a global breaking change for all backends just to make it a bit easier to leverage a backend-specific concurrency feature? |
I disagree that this is clearly communicated. Many of the persistent functions do not make clear how to use them together with a pool. Using a pool is what I initially did in my own code, but that kept getting stuck, so I went back to just having a single connection.
Anyway, besides the docs being rather lacking in how to use things, this remark glosses over the fact that (as I pointed out in the other comment) the current Even after switching back to using pools with one In my code I currently use one |
I completely agree that the documentation should be clarified. I should not have used the word "obviously". It's only obvious when you're coming to persistent as an experienced user of a particular DB where re-using a connection non-sequentially in any way is never safe. |
Thinking more about what @merijn is saying about conduits - this is an inherent part of the persistent API, and it is a deep problem. For most backends, the issue is not the prepared statement cache. It's methods like the I am most familiar with the PostgreSQL backend. That backend currently does not attempt to do streaming. It just dumps all the results down the conduit all at once. So if you use |
Well, conduit used to support eager cleanup, but that was gotten rid of. I made an issue on The statement query cache is actually also an issue, because the current API makes it easy/encourages sharing a single connection across an entire Conduit. If different parts of the conduit run the same query at different points (as an intermediate step), then these queries will trash each other's results or crash (if you're lucky). |
If you share a connection across a conduit, you are totally trashed for most backends long before you get to the prepared statement cache. For most backends, DB connections cannot be re-used if there is an active query. I believe the only reason persistent gives the appearance of being generally working is because no one is actually doing streaming in practice. Once you fix that, and use a pool to avoid multi-threading issues, the current prepared statement caching will work just fine the way it is. EDIT: To clarify - we should first focus on how to provide a way to prevent re-use of connections in the |
@ygale My team is using Persistent streaming queries with a pool and still having issues. I'm looking into it currently though, hopefully I can post something useful. I believe that for some reason persistent is returning connections to the Pool too quickly from other asynchronous processes. |
Couldn't selectSource take a Pool as input and each time part of the pipeline receives input it first has to acquire a SqlBackend from the pool? Also relevant, this is how streaming-postgresql-simple does it:

    doFold :: forall row m.
              (MonadIO m,MonadMask m,MonadResource m)
           => FoldOptions
           -> RowParser row
           -> Connection
           -> Query
           -> Stream (Of row) m ()
    doFold FoldOptions{..} parser conn q = do
        stat <- liftIO (withConnection conn LibPQ.transactionStatus)
        case stat of
          LibPQ.TransIdle ->
            bracket (liftIO (beginMode transactionMode conn))
                    (\_ -> ifInTransaction $ liftIO (commit conn))
                    (\_ -> go `onException` ifInTransaction (liftIO (rollback conn)))
          LibPQ.TransInTrans -> go
          LibPQ.TransActive -> liftIO (fail "foldWithOpts FIXME: PQ.TransActive")
            -- This _shouldn't_ occur in the current incarnation of
            -- the library, as we aren't using libpq asynchronously.
            -- However, it could occur in future incarnations of
            -- this library or if client code uses the Internal module
            -- to use raw libpq commands on postgresql-simple connections.
          LibPQ.TransInError -> liftIO (fail "foldWithOpts FIXME: PQ.TransInError")
            -- This should be turned into a better error message.
            -- It is probably a bad idea to automatically roll
            -- back the transaction and start another.
          LibPQ.TransUnknown -> liftIO (fail "foldWithOpts FIXME: PQ.TransUnknown")
            -- Not sure what this means.
      where
        ifInTransaction m = ... snip ... |
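To make the "acquire a connection from the pool for each consumer" idea concrete, here is a minimal sketch of a blocking pool built only on `base`. It is not how persistent does it (a library such as `resource-pool` would be the realistic choice); `Pool`, `newPool` and `withConn` are hypothetical names for illustration. The point is only that a connection is checked out for the whole duration of one consumer and cannot be handed to anyone else until it is returned.

```haskell
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Exception (bracket)

-- Hypothetical minimal pool: idle connections wait in a channel.
newtype Pool conn = Pool (Chan conn)

-- Fill the pool with already-opened connections.
newPool :: [conn] -> IO (Pool conn)
newPool conns = do
  ch <- newChan
  mapM_ (writeChan ch) conns
  pure (Pool ch)

-- Check a connection out for the duration of the action, blocking while
-- none is idle, and hand it back afterwards even if the action throws.
withConn :: Pool conn -> (conn -> IO a) -> IO a
withConn (Pool ch) = bracket (readChan ch) (writeChan ch)
```

With this shape, two streaming queries that overlap in time each hold their own connection, at the price of blocking (or deadlocking, if nested deeper than the pool size) when the pool runs dry.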
So, it looks like the new SQLite release this week would allow getting rid of @ygale question, in #918 you commented you were opposed to that PR, because:
I wonder what makes you say it's a breaking change? If there is only ever one statement that is prepared/used (i.e. the happy path), the semantics/statement cache stays the same as it is now. The only change in my proposed PR results in preparing a new second statement IFF the existing is already in use. Which, I guess, is technically a change in semantics, but given that the current semantics are "produce random garbage and/or queries", I'd hardly call that a breaking change. |
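A rough sketch of the "prepare a second statement only if the cached one is already in use" behaviour described above, under the assumption that each cache entry carries an in-use flag. This is not the code from the PR: `Cached`, `Cache` and `withStmt` are hypothetical names, and the `prepare`/`finalize` callbacks stand in for whatever the backend provides.

```haskell
import Control.Exception (bracket)
import Control.Monad (unless)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef, writeIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical cache entry: the prepared statement plus an "in use" flag.
data Cached stmt = Cached { cachedStmt :: stmt, inUse :: IORef Bool }

type Cache stmt = IORef (Map.Map String (Cached stmt))

-- Run an action with a statement for the given SQL. The happy path reuses
-- the cached statement; if it is already in use, a temporary statement is
-- prepared and finalized afterwards instead of sharing the busy one.
withStmt :: Cache stmt -> (String -> IO stmt) -> (stmt -> IO ())
         -> String -> (stmt -> IO a) -> IO a
withStmt cache prepare finalize sql act = do
  entry <- lookupOrInsert
  bracket (acquire entry) snd (act . fst)
  where
    lookupOrInsert = do
      existing <- Map.lookup sql <$> readIORef cache
      case existing of
        Just e -> pure e
        Nothing -> do
          stmt <- prepare sql
          flag <- newIORef False
          let fresh = Cached stmt flag
          -- If another thread inserted first, keep its entry and drop ours.
          (entry, ours) <- atomicModifyIORef' cache $ \m ->
            case Map.lookup sql m of
              Just e -> (m, (e, False))
              Nothing -> (Map.insert sql fresh m, (fresh, True))
          unless ours (finalize stmt)
          pure entry
    -- Returns the statement to use together with its cleanup action.
    acquire entry = do
      busy <- atomicModifyIORef' (inUse entry) (\b -> (True, b))
      if busy
        then do
          tmp <- prepare sql
          pure (tmp, finalize tmp)
        else pure (cachedStmt entry, writeIORef (inUse entry) False)
```

Code that only ever has one use of a query in flight sees exactly one prepared statement per SQL text, as before; only overlapping uses pay for an extra prepare and finalize.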
FWIW I'm totally fine with breaking changes, especially if they improve the UX of the library, and especially especially if they fix a bug, and super duper especially if it wasn't even possible to use the old code path in the first place. |
Ran into this in prod the first time a year ago (or so), just fixed one way that I ran into it today. |
Right now SqlBackend has almost no documentation. There should be some big, bright warning saying "No way, no how can you ever under any conditions assume that using a single SqlBackend is remotely safe to use from separate threads". I've been bitten by cryptic errors in the past and only NOW by browsing the source do I realise how utterly unsafe SqlBackend is in this regard.