large insert using persistent-sqlite has 2.5x performance penalty vs sqlite #1441
Properly fixing this adds a field to Fortunately, we have two work-arounds:
Checking this out locally. The SQL dump has a bunch of Test code embedded instead of attached:

{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE BangPatterns #-}

import Control.Monad.IO.Class (liftIO)
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Text
import Data.Text.IO
import Control.Monad

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Val
    foo Text
    deriving Show
|]

main :: IO ()
main = runSqlite "sqlitedb" $ do
    runMigration migrateAll
    forM_ [1..50000] $ \x -> do
        f <- liftIO Data.Text.IO.getLine
        _ <- insert $ Val f
        return ()

This is going to do 50,000 single-inserts, each of which is followed by a Locally, I'm seeing these results:
Runs are pretty consistently around 0.24s. Doing the direct import here:
So, about twice as slow - but we're doing twice as many queries. To test this, I'll use If I change the code to use

 main :: IO ()
 main = runSqlite "sqlitedb" $ do
     runMigration migrateAll
     forM_ [1..50000] $ \x -> do
         f <- liftIO Data.Text.IO.getLine
-        _ <- insert $ Val f
+        rawExecute "INSERT INTO val (foo) VALUES(?)" [PersistText f]
         return ()
Well, it's slightly faster, which is - weird and not what I would have expected! It's possible that GHC is really optimizing this Even faster is a batch insert:

main :: IO ()
main = runSqlite "sqlitedb" $ do
    runMigration migrateAll
    uuids <- forM [1..50000] $ \x -> do
        f <- liftIO Data.Text.IO.getLine
        pure f
    insertMany_ $ Prelude.map Val uuids
Much faster, though we've traded memory for this time gain. 18MB total memory in use, though fewer total allocations.
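One way to keep most of the batch-insert speedup without holding all 50,000 rows in memory at once might be to read and insert in fixed-size batches. The sketch below is only an illustration of that idea, not something from this thread: the helper name `batched` and the chunk size of 1000 are hypothetical, and the demo in `main` uses an in-memory counter in place of stdin and the database so that it depends on nothing but `base`.

```haskell
import Control.Monad (replicateM)
import Data.IORef (atomicModifyIORef', modifyIORef', newIORef, readIORef)

-- | Hypothetical helper: run 'produce' a total of 'total' times and hand the
-- results to 'consume' in batches of at most 'chunkSize' items, so that only
-- one batch is held in memory at a time.
batched :: Monad m => Int -> Int -> m a -> ([a] -> m ()) -> m ()
batched chunkSize total produce consume = go total
  where
    go remaining
      | remaining <= 0 = pure ()
      | otherwise = do
          -- Clamp to at least 1 so a non-positive chunk size cannot loop forever.
          let n = max 1 (min chunkSize remaining)
          xs <- replicateM n produce
          consume xs
          go (remaining - n)

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  sizes <- newIORef []
  -- Stand-ins for "read one line" and "insert one batch".
  let produce = atomicModifyIORef' counter (\n -> (n + 1, n))
      consume xs = modifyIORef' sizes (length xs :)
  batched 1000 2500 produce consume
  readIORef sizes >>= print . reverse
```

In the test program above, `produce` would presumably be `liftIO Data.Text.IO.getLine` and `consume` would be `insertMany_ . Prelude.map Val`, giving something like `batched 1000 50000 (liftIO Data.Text.IO.getLine) (insertMany_ . Prelude.map Val)` inside `runSqlite`. That combination is an assumption and has not been benchmarked here.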
Thanks for looking into this @parsonsmatt. In my situation, using
So is there any impediment to adding a field to
Fortunately I think we can do this with a patch bump:
I'm using persistent-sqlite to populate a table with 50,000 uuids, using the attached test.hs. This takes 1.16 seconds, which is 2.5 times slower than using the sqlite3 command-line utility to perform the same inserts. That is much more overhead than I would have expected; persistent-sqlite is building some SQL statements, while sqlite is doing a rather more complex thing involving writing to disk.
In profiling this, I noticed that there is 1 gigabyte of allocations. Since the list of uuids is ~2 megabytes, that's a lot of churning. I suspect that is responsible for a large part of the extra runtime.
test.hs.txt
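For anyone wanting to reproduce the allocation figure without a full profiling build, a rough sketch using `GHC.Stats` from `base` might look like the following. The helper name `withAllocations` is hypothetical, and the sketch assumes the program is run with RTS statistics enabled (for example by linking with `-with-rtsopts=-T`); without that it reports that no measurement is available.

```haskell
import Data.Word (Word64)
import GHC.Stats (RTSStats (allocated_bytes), getRTSStats, getRTSStatsEnabled)

-- | Hypothetical helper: run an action and report how many bytes the program
-- allocated while it ran. The byte count is Nothing when RTS statistics are
-- not enabled for this run.
withAllocations :: IO a -> IO (a, Maybe Word64)
withAllocations action = do
  enabled <- getRTSStatsEnabled
  if not enabled
    then do
      r <- action
      pure (r, Nothing)
    else do
      before <- allocated_bytes <$> getRTSStats
      r <- action
      after <- allocated_bytes <$> getRTSStats
      pure (r, Just (after - before))

main :: IO ()
main = do
  -- A small stand-in workload; the insert loop would go here instead.
  (total, bytes) <- withAllocations (pure $! sum [1 .. 100000 :: Int])
  print total
  putStrLn $ case bytes of
    Nothing -> "RTS stats disabled; rerun with +RTS -T"
    Just b -> show b ++ " bytes allocated"
```

Wrapping the whole `runSqlite` call in `withAllocations` would give one number for the single-insert loop and another for the batch insert, which makes the two easier to compare than wall-clock time alone.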
The text was updated successfully, but these errors were encountered: