Slow, design approach #13
The tests using the PRAGMA settings are good but not great:
A little less than 5k transactions per second, which is roughly half of what I get using bulk inserts. |
I believe I am making too much of a problem out of the whole persistence deal... By default Redis is not very persistence-aware. SQLite, of course, was born with different use cases in mind, and its defaults make that clear. |
I decided that I would be better off using a pool and a FLUSH command. This keeps the persistence invariant on the SQLite side: after a FLUSH you are always sure that everything has been written to disk. Before the FLUSH you rely on the persistence guarantees of Redis, which are more than enough on most occasions. |
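The pool + FLUSH idea can be sketched with Python's stdlib `sqlite3` module. This is a hypothetical illustration, not RediSQL's actual code (the class and method names are invented): statements only accumulate in memory until `flush()` runs the whole batch inside a single transaction, after which everything is durably on disk.

```python
import sqlite3

class FlushingWriter:
    """Hypothetical sketch of the pool + FLUSH idea: statements queue up
    in memory, and flush() runs the whole batch in one transaction, so
    after flush() everything is guaranteed to be on disk."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.pending = []  # queued (sql, params) pairs, not yet executed

    def write(self, sql, params=()):
        self.pending.append((sql, params))  # fast path: memory only

    def flush(self):
        # One transaction for the whole batch: one sync instead of N.
        with self.conn:  # BEGIN ... COMMIT
            for sql, params in self.pending:
                self.conn.execute(sql, params)
        self.pending.clear()

w = FlushingWriter(":memory:")
w.write("CREATE TABLE test (a INT)")
for i in range(1000):
    w.write("INSERT INTO test VALUES (?)", (i,))
w.flush()
rows = w.conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]
print(rows)  # 1000
```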
I don't like the idea of batching in memory and then flushing. Another interesting idea is to not care too much about persistence, store everything in memory, and implement the rdb_save and rdb_load callbacks (to write the whole database inside the RDB file of Redis), possibly also implementing another command, REDISQL.SAVE, that writes an in-memory database to a file passed as input (using the backup API). I tried a simple benchmark to understand how fast the in-memory database is: it is fast enough for small datasets, however it gets slower and slower after every insert.
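The backup-API idea can be sketched with Python's stdlib `sqlite3`, whose `Connection.backup` wraps SQLite's online backup API; the temporary file path here is made up for the example:

```python
import os
import sqlite3
import tempfile

# Build a small in-memory database, as the module would hold it.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE test (a INT, b INT, c INT)")
mem.executemany("INSERT INTO test VALUES (?, ?, ?)",
                [(i, 2 * i, 3 * i) for i in range(100)])
mem.commit()

# SAVE: stream the whole in-memory database into a file on disk.
path = os.path.join(tempfile.mkdtemp(), "dump.sqlite")
disk = sqlite3.connect(path)
mem.backup(disk)  # SQLite's online backup API under the hood
disk.close()

# LOAD: reopen the file and check the data survived the round trip.
count = sqlite3.connect(path).execute(
    "SELECT COUNT(*) FROM test").fetchone()[0]
print(count)  # 100
```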
4k inserts per second is not that bad, but it is not extremely good either, considering that we are keeping everything in memory. To reproduce the results of my simulation it is sufficient to start Redis with RediSQL and then:
to create the database connection and the table and run the first 100,000 inserts. Then
to insert another 100,000 tuples. |
I wrote on the SQLite mailing list to try to get a better insight into the whole issue. An extremely helpful user wrote a simple TCL benchmark script that lets me benchmark SQLite by itself; the script is at the bottom of the message. It seems like the script is doing exactly the same thing as my implementation, but the performance is extremely different, and I don't see a degradation of performance in the SQLite / TCL script. Overall, all the data are in the following table:
The graphs below show even more clearly the difference in performance, and the absence of performance degradation when using vanilla SQLite:

```tcl
package require sqlite3
sqlite3 db :memory:
db eval { CREATE TABLE test (a INT, b INT, c INT); }

proc insert_n_rows {n} {
  for {set i 0} {$i<$n} {incr i} {
    db eval { INSERT INTO test VALUES(random(), random(), random()) }
  }
}

set nStep 100000
for {set i 0} {$i < 100} {incr i} {
  set us [lindex [time { insert_n_rows $nStep }] 0]
  puts "[expr $i*$nStep] [format %.2f [expr (1000000.0 * $nStep) / $us]]/sec"
}
```
|
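For comparison, the same experiment can be run with Python's stdlib `sqlite3` module. This is a rough equivalent of the TCL script above, not the original benchmark, with smaller steps so it finishes quickly:

```python
import sqlite3
import time

# Insert rows into an in-memory table in steps and report the rate per
# step, to check whether throughput degrades as the table grows.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE test (a INT, b INT, c INT)")

n_step = 10_000  # smaller than the TCL script's 100k steps, same idea
rates = []
for step in range(5):
    start = time.perf_counter()
    for _ in range(n_step):
        db.execute("INSERT INTO test VALUES (random(), random(), random())")
    rates.append(n_step / (time.perf_counter() - start))
    print(f"{step * n_step} {rates[-1]:.2f}/sec")
```

Note that, like the TCL binding, Python's `sqlite3` keeps a per-connection statement cache, so the identical INSERT text is compiled only once and then reused.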
Let me establish a simple baseline: on my machine a simple PING, which is the simplest and fastest command you can run, performs at a little less than 80k requests per second.
Opening a key and incrementing its value takes about the same time.
Overall I don't think I can go much faster than these numbers anyway... |
Finally some updates. I decided to run perf on Redis running RediSQL and on the TCL script. Since the two pieces of software are doing similar things, intuition led me to think that the hottest paths were the same. I was actually very wrong. I started a TCL session, got its pid, and ran perf against it: there is not much about SQLite itself, and even the hottest function is not so hot. Then I did similar work for RediSQL: start Redis and load the module, get the pid from the log, start perf as above, then run the benchmark. The results are in the image below; here we see where most of the time is spent. Also, we see that even though the second benchmark ran for a lot more time, we have fewer samples: ~600 against ~14k. |
**First performance improvement.** I changed the SQLite version after the mailing list help: SQLite had a bug in version 3.15, and this caused the slowdown behaviour. Performance does not degrade anymore and is stable at around 34k inserts per second; after optimization we get to ~44k inserts per second. However, Redis can keep up at ~90k operations per second while SQLite can write more than 200k tuples per second, so we are leaving around half of the potential performance on the table.

**Potential problem in context switches.** Queries are loaded into a queue that another thread consumes and writes to SQLite; the processor switches very often and I don't see a stable 100% on any processor, while writing only to SQLite gives me a stable 100% on one processor.

**Disproving the context-switch hypothesis.** I wrote two small C programs, one that writes directly to SQLite and another that uses a thread pool just like RediSQL. Two interesting facts:
The performance of both programs is, yes, slower than the TCL script, but still higher than Redis, so we can still bring performance up to the level of Redis. I still haven't understood why we are slower.

**TCL was cheating.** The TCL script was always executing the same insert, which was heavily cached: the statement wasn't recompiled but simply loaded from the cache. Considering that compilation was one of the slowest operations, this makes a big difference. With the cache disabled, the TCL script is down to roughly 50k inserts per second.

**Different approach to message passing.** I would like to get a single CPU to 100%, so I wrote a simple queue and wrapped it in a mutex. Now I am able to saturate a whole CPU and also SQLite, reaching roughly 50k inserts per second. |
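The mutex-protected queue between the command thread and the SQLite writer can be sketched as follows. This is a Python stand-in for the actual implementation, with invented names; `queue.Queue` is internally exactly a lock-protected queue, and the SQLite connection is owned by the single writer thread:

```python
import queue
import sqlite3
import threading

# One producer enqueues statements; a single writer thread owns the
# SQLite connection and drains the queue.
jobs = queue.Queue()
result = {}

def writer():
    conn = sqlite3.connect(":memory:")  # connection lives in this thread only
    conn.execute("CREATE TABLE test (a INT)")
    while True:
        item = jobs.get()
        if item is None:  # sentinel: no more work
            break
        sql, params = item
        conn.execute(sql, params)
    result["rows"] = conn.execute("SELECT COUNT(*) FROM test").fetchone()[0]

t = threading.Thread(target=writer)
t.start()
for i in range(1000):
    jobs.put(("INSERT INTO test VALUES (?)", (i,)))
jobs.put(None)  # ask the writer to stop
t.join()
print(result["rows"])  # 1000
```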
Overall I find the whole module + SQLite combination quite slow, especially when it writes to disk.
On my machine I get roughly 8 inserts per second on disk using the most naive method, independently of the concurrency used (which makes sense, since at least for now everything is single-threaded).
Clearly this is not enough.
However, the slow performance is expected: if we hit the disk for each insert, we will be slow no matter what.
The obvious solution is not to hit the disk for each and every insert, but to batch multiple inserts and execute them all together inside a single transaction.
This approach provides a much higher throughput (two orders of magnitude) with very little effort.
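The gain comes from amortizing the per-transaction sync. A rough illustration with Python's stdlib `sqlite3` (actual numbers will vary a lot with the disk): with `isolation_level=None` (autocommit) every INSERT is its own transaction, while an explicit BEGIN/COMMIT groups them into one.

```python
import os
import sqlite3
import tempfile
import time

def insert_rows(batch, n=200):
    # Fresh on-disk database for each run; isolation_level=None means
    # autocommit, so without BEGIN every INSERT is its own transaction.
    path = os.path.join(tempfile.mkdtemp(), "bench.sqlite")
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE test (a INT)")
    start = time.perf_counter()
    if batch:
        conn.execute("BEGIN")  # one transaction, one sync at COMMIT
    for i in range(n):
        conn.execute("INSERT INTO test VALUES (?)", (i,))
    if batch:
        conn.execute("COMMIT")
    rate = n / (time.perf_counter() - start)
    conn.close()
    return rate

one_by_one = insert_rows(batch=False)  # a transaction (and sync) per row
batched = insert_rows(batch=True)      # a single transaction for all rows
print(f"per-insert: {one_by_one:.0f} inserts/sec")
print(f"batched:    {batched:.0f} inserts/sec")
```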
Now, before going blindly ahead and creating a pool of statements and providing a new FLUSH command to force a transaction, I am going to explore more deeply what PRAGMA options SQLite can offer.
Usually there is a trade-off between speed and safety; however, living inside Redis, I can use the guarantees of Redis to make up for some of the safety lost and see if I can reach good enough performance.
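The main knobs here are `PRAGMA journal_mode` and `PRAGMA synchronous`. A small sketch (again Python's stdlib `sqlite3`, numbers are machine-dependent) comparing the fully durable defaults against relaxed settings; note that `synchronous = OFF` skips fsyncs, so a crash can lose the latest writes — exactly the gap that Redis' own persistence would cover:

```python
import os
import sqlite3
import tempfile
import time

def bench(pragmas, n=200):
    # Fresh on-disk database, autocommit mode: one transaction per INSERT.
    path = os.path.join(tempfile.mkdtemp(), "bench.sqlite")
    conn = sqlite3.connect(path, isolation_level=None)
    for pragma in pragmas:
        conn.execute(pragma)
    conn.execute("CREATE TABLE test (a INT)")
    start = time.perf_counter()
    for i in range(n):
        conn.execute("INSERT INTO test VALUES (?)", (i,))
    rate = n / (time.perf_counter() - start)
    conn.close()
    return rate

# Durable defaults vs relaxed settings that skip most syncs.
defaults = bench([])
relaxed = bench(["PRAGMA journal_mode = WAL", "PRAGMA synchronous = OFF"])
print(f"defaults: {defaults:.0f} inserts/sec")
print(f"relaxed:  {relaxed:.0f} inserts/sec")
```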
[1]: Statement: something that modifies the internal status of the database, in contraposition with a query, which simply reads the database (INSERT, UPDATE, DELETE vs SELECT)