
Optimize performance #20

Closed
wei2912 opened this issue Jun 26, 2013 · 8 comments

wei2912 (Contributor) commented Jun 26, 2013

gosexy/db tends to work slowly. I haven't benchmarked any results yet, but will do so as soon as possible. Right now it appears to be doing about one insert per second.

This is with the sqlite wrapper.

xiam (Contributor) commented Jun 26, 2013

While gosexy/db could in fact be a little slower than raw queries, one second per insert is excessive, as we are not doing any CPU-intensive work. I'd like to get to the root of this problem, so please:

  • Run the sqlite test and watch the output. Is it also slow?
  • Create a .sql file with INSERTs into an SQLite table and time it. How fast is that?
  • Mount a chunk of RAM as a disk and put your database file there; this should speed things up, since disk access is the slowest part of SQLite's work. Does this have any effect on the speed?

ghost assigned xiam on Jun 26, 2013
wei2912 (Contributor, Author) commented Jun 27, 2013

I modified the sqlite test to provide a benchmark for appending.

Code:

// [Benchmark] Appends maps and structs.
func BenchmarkAppend(b *testing.B) {
    sess, err := db.Open(wrapperName, settings)

    if err != nil {
        b.Fatal(err)
    }

    defer sess.Close()

    people := sess.ExistentCollection("people")
    people.Truncate()

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        people.Append(db.Item{"name": "randomname"})
    }
}

Switching off the debug output temporarily:

BenchmarkAppend       10     219560280 ns/op
ok      db/sqlite   25.182s

Since b.N has a value of 10, I decided to create a .sql file with 10 INSERTs.

Benchmarking the execution of a .sql file with INSERTs:

wei2912@wei-lm-desktop ~ $ touch test.db
wei2912@wei-lm-desktop ~ $ sqlite3 test.db
SQLite version 3.7.15.2 2013-01-09 11:53:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .tables
sqlite> CREATE TABLE test (
   ...> id int primary key,
   ...> string text
   ...> );
sqlite> .exit
wei2912@wei-lm-desktop ~ $ time sqlite3 test.db < sqlite_test.sql 

real    0m1.379s
user    0m0.000s
sys 0m0.008s

The .sql file:

INSERT INTO test VALUES (0, "a");
INSERT INTO test VALUES (1, "b");
INSERT INTO test VALUES (2, "c");
INSERT INTO test VALUES (3, "d");
INSERT INTO test VALUES (4, "e");
INSERT INTO test VALUES (5, "f");
INSERT INTO test VALUES (6, "g");
INSERT INTO test VALUES (7, "h");
INSERT INTO test VALUES (8, "i");
INSERT INTO test VALUES (9, "j");
INSERT INTO test VALUES (10, "k");

Looks like it's a problem on my side. I'll do more research on this issue.

I'm still not sure whether it's due to disk access. I'll try placing the DB in RAM.

EDIT:

Placing the database in RAM had a major effect:

wei2912@wei-lm-desktop ~ $ sudo mkdir /mnt/tmpfs
wei2912@wei-lm-desktop ~ $ sudo mount -osize=100m tmpfs /mnt/tmpfs -t tmpfs
wei2912@wei-lm-desktop ~ $ cd /mnt/tmpfs/
wei2912@wei-lm-desktop /mnt/tmpfs $ touch test.db
wei2912@wei-lm-desktop /mnt/tmpfs $ sqlite3 test.db
SQLite version 3.7.15.2 2013-01-09 11:53:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> CREATE TABLE test (
   ...> id int primary key,
   ...> string text
   ...> );
sqlite> .exit
wei2912@wei-lm-desktop /mnt/tmpfs $ time sqlite3 test.db < ~/sqlite_test.sql 

real    0m0.004s
user    0m0.000s
sys 0m0.000s

The results were extremely fast, as shown above.

EDIT 2:

Since the benchmarks above were rather unreliable, with only 10 inserts, I went back to the earlier benchmark tests.

I put the whole sqlite test suite on the RAM disk, built it, and benchmarked it there.

BenchmarkAppend    50000         65145 ns/op
ok      _/mnt/tmpfs 3.984s

Appears to be a hard drive issue. I'll run a couple of scans on my drive and update you on the issue. Thanks!

EDIT 3:

Did a drive benchmark.

[drive-benchmark screenshot]

Looks fine to me.

xiam added a commit that referenced this issue Jun 27, 2013
xiam (Contributor) commented Jun 27, 2013

What a fine and detailed contribution!

I added your benchmark function to the sqlite_test.go file.

Then I ran it on an SSD:

$ go test -test.bench=. -test.run=none
PASS
BenchmarkAppendRaw          1000           2348241 ns/op
BenchmarkAppendDbItem       1000           2699280 ns/op
BenchmarkAppendStruct       1000           2410441 ns/op
ok      menteslibres.net/gosexy/db/sqlite       8.168s

And then on a normal disk:

$ go test -test.bench=. -test.run=none
PASS
BenchmarkAppendRaw        10     158871802 ns/op
BenchmarkAppendDbItem         10     157771451 ns/op
BenchmarkAppendStruct         10     156629528 ns/op
ok      menteslibres.net/gosexy/db/sqlite   6.255s

Got the same awful performance...

Then I tried a bench.sql file with 20 INSERTs, run directly against the same database gosexy/db uses for testing:

$ time sqlite3 gotest.sqlite3.db < bench.sql  
sqlite3 gotest.sqlite3.db < bench.sql  0.00s user 0.04s system 1% cpu 3.210 total
$ head -n 1 bench.sql 
INSERT INTO people (name) VALUES("john");

Same results, so I asked strace:

$ strace sqlite3 gotest.sqlite3.db < bench.sql

The output was huge, but I quickly noticed some odd file creations and deletions related to SQLite3's journal feature, so I tried disabling the journal with a line at the beginning of bench.sql:

PRAGMA journal_mode = OFF;
INSERT INTO people (name) VALUES("john");

Noticed an improvement!

$ time sqlite3 gotest.sqlite3.db < bench.sql
off
sqlite3 gotest.sqlite3.db < bench.sql  0.01s user 0.01s system 1% cpu 1.232 total

One second is still slow for just 20 rows, but at least we know the problem seems to lie in SQLite3's configuration. File creation and deletion can be slow even when write operations are fast; this does not seem to be the case with SSDs.

If you want to turn off journaling in your Go program, you could use the db.Database.Driver() method and then run the PRAGMA through Exec, I guess. Please see: https://menteslibres.net/gosexy/db/database
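
For illustration, here is a minimal sketch of that idea. It assumes the sqlite wrapper is backed by database/sql, so that Driver() hands back a *sql.DB; the function name disableJournal, the db.Database parameter type, and the type assertion are assumptions for illustration, not confirmed gosexy/db API:

package sqlitebench

import (
    "database/sql"
    "fmt"

    "menteslibres.net/gosexy/db"
)

// disableJournal is a sketch only: it assumes sess.Driver() returns
// the wrapper's underlying *sql.DB. If the sqlite wrapper uses a
// different binding, the assertion and the Exec call would change.
func disableJournal(sess db.Database) error {
    drv, ok := sess.Driver().(*sql.DB)
    if !ok {
        return fmt.Errorf("unexpected driver type: %T", sess.Driver())
    }
    _, err := drv.Exec("PRAGMA journal_mode = OFF;")
    return err
}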

wei2912 (Contributor, Author) commented Jun 28, 2013

I noticed the journal file being constantly rewritten and thought it might be the cause of the problem. Thanks for helping to debug this :)

BTW, more benchmarks coming soon from my fork.

I'll try disabling the journal file and let you know the results, probably this afternoon.

EDIT 1:

http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html

This might be helpful. I've looked through it, and the options that seem most promising to me are:

  1. Wrapping multiple statements in BEGIN and END transactions: http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html#transaction-wrappers (see the sketch after this list).
  2. Compacting the database: http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html#compact
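
To make option 1 concrete, here is a minimal sketch in plain database/sql, independent of gosexy/db (the mattn/go-sqlite3 driver import is an assumption; any SQLite driver would do). Wrapping all the INSERTs in a single transaction lets SQLite sync the journal once at COMMIT instead of once per statement:

package main

import (
    "database/sql"
    "log"

    _ "github.com/mattn/go-sqlite3" // assumed SQLite driver
)

func main() {
    sqldb, err := sql.Open("sqlite3", "test.db")
    if err != nil {
        log.Fatal(err)
    }
    defer sqldb.Close()

    // One transaction around all INSERTs: the journal is synced
    // once at COMMIT rather than after every statement.
    tx, err := sqldb.Begin()
    if err != nil {
        log.Fatal(err)
    }
    stmt, err := tx.Prepare("INSERT INTO people (name) VALUES (?)")
    if err != nil {
        log.Fatal(err)
    }
    defer stmt.Close()

    for i := 0; i < 10000; i++ {
        if _, err := stmt.Exec("randomname"); err != nil {
            log.Fatal(err)
        }
    }
    if err := tx.Commit(); err != nil {
        log.Fatal(err)
    }
}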

As for file journaling:

For a writable database, it is possible to turn off journaling at the C source level. This should only be done as a last resort, but there are some applications such as games where speed can be more important than data integrity. In many cases you can achieve the same effect by using a TEMPORARY table.

In my use case, it seems fine to disable journaling, so I'll go ahead.

EDIT 2:

Applying the BEGIN and END transaction optimizations helped a lot.

The code is attached in pull request #22:

PASS
BenchmarkAppendRaw        10     234559364 ns/op
BenchmarkAppendDbItem         10     178592011 ns/op
BenchmarkAppendDbItem_Transaction      50000         33807 ns/op
BenchmarkAppendStruct         10     211561835 ns/op
ok      db/sqlite   12.482s

A rather significant improvement in my opinion.

wei2912 (Contributor, Author) commented Jun 28, 2013

I've done some tests, and it seems that applying BEGIN and END transactions is very helpful for optimization. Note that this will only be useful for multiple INSERTs.

xiam (Contributor) commented Jun 28, 2013

So we got to the root of the problem! I'm closing this issue, as it relates to the database itself; I'll add the link to the docs.

Thanks,

xiam closed this as completed on Jun 28, 2013
wei2912 (Contributor, Author) commented Jun 29, 2013

I will write a pull request applying the optimization.

xiam (Contributor) commented Jun 29, 2013

That's actually a good idea; I think we could add Database methods named Begin and End. See #23, what do you think?
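
Purely as a hypothetical sketch of that shape (the real proposal is in #23; the struct and field names below are invented for illustration, not actual gosexy/db types):

package db

import "database/sql"

// Database is a stand-in for the wrapper's session type; the real
// gosexy/db type and its fields will differ.
type Database struct {
    session *sql.DB
}

// Begin opens a transaction by sending BEGIN straight to SQLite.
func (d *Database) Begin() error {
    _, err := d.session.Exec("BEGIN")
    return err
}

// End commits it; SQLite accepts END as a synonym for COMMIT.
func (d *Database) End() error {
    _, err := d.session.Exec("END")
    return err
}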
