concurrent connections either cause SQLITE_BUSY errors or really long pauses #5

Closed
nurpax opened this Issue Dec 31, 2012 · 10 comments

Comments

Projects
None yet
2 participants
Owner

nurpax commented Dec 31, 2012

If multiple concurrent page requests are done to a snap app that all talk to the sqlite database, it seems that it's very easy to trigger SQLITE_BUSY conditions the way connection pooling is currently implemented in snaplet-sqlite-simple. The connection pool code is probably just fine, it's just that sqlite seems to work in a surprising way.

How to repro:

  • send 5-10 simultaneous requests to Snap that do write access the the sqlite db
  • observe how you get either SQLITE_BUSY or weird latencies (depending on configuration, see below)

Different options & how they affect this:

  1. instead of connection pooling, open only a single connection, stick it into an MVar and effectively serve sqlite sequentially - no problems (mvar-singleton-connection-pooling.png)
    • works without errors, throughput ok
  2. out of the box behavior with connection pooling + no setting of busy timeout (master-no-set-busy-timeout.png)
    • start seeing SQLITE_BUSY errors (see red lines on the screenshot)
  3. connection pool + set sqlite3 max busy timeout to 10s (set-busy-timeout-1s-latency.png)
    • no busy errors, but requests take really long time to complete (1s or so)

mvar-singleton-connection-pooling

master-no-set-busy-timeout

set-busy-timeout-1s-latency

Owner

nurpax commented Dec 31, 2012

Some interesting pointers here: http://beets.radbox.org/blog/sqlite-nightmare.html

Owner

nurpax commented Jan 1, 2013

This? http://sqlite.org/faq.html#q6

The restriction on moving database connections across threads was relaxed somewhat in version 3.3.1. With that and subsequent versions, it is safe to move a connection handle across threads as long as the connection is not holding any fcntl() locks. You can safely assume that no locks are being held if no transaction is pending and all statements have been finalized.

How to ensure there are no pending transactions? Statements should be finalized as all queries are written with withStatement (but double check this.)

Is this due to lazy IO?

Owner

nurpax commented Jan 1, 2013

Here's a simpler repro case without snap or snaplet-sqlite-simple:

{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent
import Control.Exception
import Control.Monad
import Database.SQLite.Simple


waitableForkIO :: IO () -> IO (MVar ())
waitableForkIO io = do
  mvar <- newEmptyMVar
  forkIO (io >> putMVar mvar ())
  return mvar

randomDbActions :: Connection -> IO ()
randomDbActions c = do
  putStrLn "db action here"
  replicateM_ 20 $ do
    execute_ c "INSERT INTO a (text) VALUES ('foo')"
    foo <- readList
    return ()
  where
    readList :: IO [Only Int]
    readList = query_ c "SELECT id FROM a"


main :: IO ()
main = do
  bracket (open "test.db") close $ \c -> do
    execute_ c "DROP TABLE IF EXISTS a"
    execute_ c "DROP TABLE IF EXISTS b"
    execute_ c "CREATE TABLE a (id INTEGER PRIMARY KEY, text TEXT)"
    execute_ c "CREATE TABLE b (id INTEGER PRIMARY KEY, text TEXT)"

--  conns <- replicateM 5 (open "test.db" >>= \c -> execute_ c "PRAGMA busy_timeout=100000" >> return c)
  conns <- replicateM 5 (open "test.db")
  threads <- mapM (waitableForkIO . randomDbActions) conns
  mapM_ takeMVar threads
  mapM_ close conns

Cabal file:

Name:                sqlite-test
Build-type:          Simple
version:             0.1
Cabal-version:       >=1.2

Executable sqlite-test
  hs-source-dirs: .
  main-is: Main.hs

  Build-depends:
    base >= 4 && < 5,
    bytestring >= 0.9.1 && < 0.11,
    text >= 0.11 && < 0.12,
    sqlite-simple

The errors I get are:

~/dev/sqlite-test$ ./dist/build/sqlite-test/sqlite-test
db action here
db action here
sqlite-test: SQLite3 returned ErrorBusy while attempting to perform step: database is locked
db action here
sqlite-test: SQLite3 returned ErrorBusy while attempting to perform step: database is locked
db action here
sqlite-test: SQLite3 returned ErrorBusy while attempting to perform step: database is locked
db action here
sqlite-test: SQLite3 returned ErrorBusy while attempting to perform step: database is locked
sqlite-test: thread blocked indefinitely in an MVar operation

The errors go away a little bit if I set a high busy timeout (see commented out code), but not completely. Setting the busy timeout also adds significant delays into executing the SQL statements, similar to what's happening in the ajax response timeline in the above screenshots.

Owner

nurpax commented Jan 1, 2013

See www.caijibbs.com/attachment.php?id=193, page 197:

In SQLite, using multiple connections in the same block of code can cause problems. You have to be careful of how you go about it. Consider the following example:

c1 = open('foods.db')
c2 = open('foods.db')

stmt = c1.prepare('SELECT * FROM episodes')

while stmt.step()
  print stmt.column('name')
  c2.exec('UPDATE episodes SET …)
end

stmt.finalize()

c1.close()
c2.close()

In the while loop, c2 attempts an UPDATE while c1
has a SHARED lock open. That SHARED lock won’t be released until stmt is finalized after the while
loop. Therefore, it is impossible to write to the database within the while loop.

Owner

nurpax commented Jan 2, 2013

I would've expected the below code to not have the busy problem, but it still fails. The only allows for a single writer at a time.

{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent
import Control.Exception
import Control.Monad
import Database.SQLite.Simple


waitableForkIO :: IO () -> IO (MVar ())
waitableForkIO io = do
  mvar <- newEmptyMVar
  forkIO (io >> putMVar mvar ())
  return mvar

randomDbActions :: MVar () -> Connection -> IO ()
randomDbActions writerMVar c = do
  putStrLn "db action here"
  replicateM_ 20 $ do
    withMVar writerMVar (\() -> execute_ c "INSERT INTO a (text) VALUES ('foo')")
    foo <- readList
    return ()
  where
    readList :: IO [Only Int]
    readList = query_ c "SELECT id FROM a"


main :: IO ()
main = do
  bracket (open "test.db") close $ \c -> do
    execute_ c "DROP TABLE IF EXISTS a"
    execute_ c "DROP TABLE IF EXISTS b"
    execute_ c "CREATE TABLE a (id INTEGER PRIMARY KEY, text TEXT)"
    execute_ c "CREATE TABLE b (id INTEGER PRIMARY KEY, text TEXT)"

  writerMVar <- newMVar ()
  conns <- replicateM 5 (open "test.db")
  threads <- mapM (waitableForkIO . randomDbActions writerMVar) conns
  mapM_ takeMVar threads
  mapM_ close conns
Owner

nurpax commented Jan 4, 2013

I think the problem is overlapped statements. One is opened for read, it's being stepped, when another prepared statement tries to insert into the same table. Here's an example that fails with Busy:

{-# LANGUAGE OverloadedStrings #-}

module Direct (
  directSqliteTest
  ) where

import Control.Monad
import Database.SQLite3.Direct

directSqliteTest :: IO ()
directSqliteTest = do
  Right conn1 <- open "test.db"
  Right conn2 <- open "test.db"

  exec conn1 "INSERT INTO a (text) VALUES ('foo')"
  exec conn1 "INSERT INTO a (text) VALUES ('foo')"

  Right (Just sr) <- prepare conn1 "SELECT * from a"
  a <- step sr
  a <- step sr
  print a
  x <- exec conn2 "INSERT INTO a (text) VALUES ('foo')"
  finalize sr
  print x

  void $ close conn1
  void $ close conn2

The latter INSERT step prints "Left ErrorBusy". If I move the "finalize sr" line before "f <- step sw", the insert succeeds.

This is a somewhat contrived example, but it seems that this is exactly what happens in the multi-thread case.

nurpax referenced this issue in nurpax/sqlite-simple Jan 4, 2013

Merged

Expose statements as first-class entities. #22

Emm commented Jan 4, 2013

Interesting. What happens when you read from a and insert in b? Looking at the documentation, probably the same thing but you never know.

Owner

nurpax commented Jan 4, 2013

@Emm I didn't try accessing 'b' but it seems like restricting both test apps to use only a single db connection makes this work again! I didn't try this in any sort of exhaustive manner, but both the above examples seem to work fine in the single connection case.

Emm commented Jan 5, 2013

This makes sense: the documentation was talking about different "processes". I just read about WAL mode, which is not on per default (http://www.sqlite.org/wal.html). You might want to try your separate connections example again after running PRAGMA journal_mode=WAL;.

From what I read, the trade-offs compared to the rollback journal mode which is on per default are worth it in most situations, as long as you don't do anything crazy with your transactions.

Owner

nurpax commented Feb 14, 2013

Closing as in the current released version this doesn't happen as db accesses are serialized using a single connection in the app, access being protected by an MVar.

nurpax closed this Feb 14, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment