Important
I noticed my SQLite driver github.com/ncruces/go-sqlite3 was doing poorly in this benchmark that got posted on Hacker News and Reddit. I traced it back to a new feature and a serious performance regression (introduced to fix a compiler crash). I implemented a fix (a 4x larger, PLRU bit cache), released a new version, and ran the numbers again (on a different machine and configuration). The results of the new experiment are below.
This work is sponsored by Monibot - Easy Server and Application Monitoring. Try out Monibot at https://monibot.io. It's free.
For the benchmarks I used the following libraries:
- craw, crawshaw.io/sqlite, a CGO-based solution. This is not a database/sql driver.
- eaton, github.com/eatonphil/gosqlite, a CGO-based solution. This is not a database/sql driver.
- mattn, github.com/mattn/go-sqlite3, a CGO-based solution. This library is (still) the de-facto standard and widely used.
- modernc, modernc.org/sqlite, a pure Go solution. This is a newer library, based on the SQLite C code re-written in Go.
- ncruces, github.com/ncruces/go-sqlite3/driver, a pure Go solution based on WASM.
- ncruces2, github.com/ncruces/go-sqlite3, a pure Go solution based on WASM. This is not a database/sql driver.
- sqinn, github.com/cvilsmeier/sqinn-go, a solution without CGO. It uses github.com/cvilsmeier/sqinn to access SQLite database files.
- zombie, github.com/zombiezen/go-sqlite, a rewrite of the crawshaw driver, using the modernc libraries. This is not a database/sql driver.
The test setup is as follows:
- OS: Debian GNU/Linux rodete
- CPU: Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz
- RAM: 64GB
- Disk: 512GB NVMe SSD
- go version go1.21.5 linux/amd64
The benchmark was run on 2023-12-15, with then-current library versions. See go.mod for library versions. Each test was run once for warmup. The second run was then recorded. This is not very scientific.
A general note on benchmarks:
Do not trust benchmarks, write your own. This specific benchmark is modelled after my very own database usage scenarios. Your scenarios may be totally different.
The test database consists of the following tables and indexes:
PRAGMA journal_mode=DELETE;
PRAGMA synchronous=FULL;
PRAGMA foreign_keys=1;
PRAGMA busy_timeout=5000;
CREATE TABLE users (
id INTEGER PRIMARY KEY NOT NULL,
created INTEGER NOT NULL,
email TEXT NOT NULL,
active INTEGER NOT NULL);
CREATE INDEX users_created ON users(created);
CREATE TABLE articles (
id INTEGER PRIMARY KEY NOT NULL,
created INTEGER NOT NULL,
userId INTEGER NOT NULL REFERENCES users(id),
text TEXT NOT NULL);
CREATE INDEX articles_created ON articles(created);
CREATE INDEX articles_userId ON articles(userId);
CREATE TABLE comments (
id INTEGER PRIMARY KEY NOT NULL,
created INTEGER NOT NULL,
articleId INTEGER NOT NULL REFERENCES articles(id),
text TEXT NOT NULL);
CREATE INDEX comments_created ON comments(created);
CREATE INDEX comments_articleId ON comments(articleId);
Result times are measured in milliseconds. Lower numbers indicate better performance.
Insert 1 million user rows in one database transaction. Then query all users once.
insert query
-------------------------------------
craw 1769 ms 878 ms
eaton 1474 ms 987 ms
mattn 2230 ms 1704 ms
modernc 8044 ms 1473 ms
ncruces 4445 ms 1312 ms
ncruces2 3354 ms 779 ms
sqinn 1343 ms 775 ms
zombie 2947 ms 489 ms
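For the database/sql drivers, the single-transaction insert loop looks roughly like the sketch below. This is a simplified assumption of the pattern, not the benchmark's exact code; the driver name, file name, column values, and error handling are illustrative, and compiling it requires the mattn module:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3" // or any other database/sql driver
)

func main() {
	db, err := sql.Open("sqlite3", "bench.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// one transaction around all one million inserts
	tx, err := db.Begin()
	if err != nil {
		log.Fatal(err)
	}
	stmt, err := tx.Prepare("INSERT INTO users (id, created, email, active) VALUES (?, ?, ?, ?)")
	if err != nil {
		log.Fatal(err)
	}
	for id := 1; id <= 1_000_000; id++ {
		if _, err := stmt.Exec(id, id, "user@example.com", 1); err != nil {
			log.Fatal(err)
		}
	}
	stmt.Close()
	if err := tx.Commit(); err != nil {
		log.Fatal(err)
	}
}
```

Committing once at the end is what makes the insert column comparable across drivers; per-row transactions would be dominated by fsync cost under journal_mode=DELETE and synchronous=FULL.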
Insert 200 users in one database transaction. Then insert 20000 articles (100 articles for each user) in another transaction. Then insert 400000 comments (20 comments for each article) in another transaction. Then query all users, articles and comments in one big JOIN statement.
insert query
-------------------------------------
craw 1034 ms 833 ms
eaton 921 ms 1056 ms
mattn 1266 ms 1689 ms
modernc 4705 ms 1668 ms
ncruces 2715 ms 1680 ms
ncruces2 2197 ms 1257 ms
sqinn 872 ms 1065 ms
zombie 2266 ms 769 ms
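Against the schema above, the "one big JOIN" of this scenario can be written roughly as follows (column list abbreviated; a sketch, not necessarily the benchmark's exact statement):

SELECT u.id, u.email, a.id, a.text, c.id, c.text
FROM users u
JOIN articles a ON a.userId = u.id
JOIN comments c ON c.articleId = a.id
ORDER BY u.id, a.id, c.id;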
Insert N users in one database transaction. Then query all users 1000 times. This benchmark is used to simulate a read-heavy use case.
query/N=10 query/N=100 query/N=1000
--------------------------------------------------------
craw 33 ms 109 ms 604 ms
eaton 28 ms 98 ms 762 ms
mattn 39 ms 155 ms 1363 ms
modernc 41 ms 144 ms 1170 ms
ncruces 62 ms 182 ms 1298 ms
ncruces2 48 ms 115 ms 702 ms
sqinn 53 ms 120 ms 996 ms
zombie 18 ms 49 ms 303 ms
Insert 10000 users with N bytes of row content. Then query all users. This benchmark is used to simulate reading of large (multi-gigabyte) databases.
query/N=50000 query/N=100000 query/N=200000
---------------------------------------------------
craw 385 ms 698 ms 1429 ms
eaton 280 ms 505 ms 1126 ms
mattn 249 ms 498 ms 1017 ms
modernc 378 ms 1521 ms 2724 ms
ncruces 305 ms 558 ms 1149 ms
ncruces2 304 ms 582 ms 1098 ms
sqinn 964 ms 1869 ms 3899 ms
zombie 282 ms 1376 ms 2434 ms
Insert one million users. Then have N goroutines query all users. This benchmark is used to simulate concurrent reads.
query/N=2 query/N=4 query/N=8
---------------------------------------
craw 801 ms 868 ms 1045 ms
eaton 1033 ms 1029 ms 1144 ms
mattn 1557 ms 1676 ms 1765 ms
modernc 2738 ms 7365 ms 19667 ms
ncruces 1438 ms 1469 ms 1563 ms
ncruces2 873 ms 884 ms 980 ms
sqinn 891 ms 1182 ms 1680 ms
zombie 452 ms 556 ms 689 ms
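The concurrent-read scenario boils down to the fan-out pattern below. `queryAll` is a hypothetical stand-in for a full SELECT over the users table through the driver under test; only the goroutine structure mirrors the benchmark:

```go
package main

import (
	"fmt"
	"sync"
)

// queryAll stands in for "SELECT id, created, email, active FROM users"
// executed through the driver under test; here it just returns a row count.
func queryAll() int {
	return 1_000_000
}

func main() {
	const n = 8 // number of reader goroutines (N in the table above)
	var wg sync.WaitGroup
	rows := make([]int, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			rows[i] = queryAll() // each goroutine reads the whole table
		}(i)
	}
	wg.Wait()
	fmt.Println("total rows read:", rows[0]*n)
}
```

Because every goroutine reads the same data, this pattern mostly measures how well a driver handles concurrent connections; modernc's numbers above suggest contention that grows sharply with N.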
- We cannot declare a winner; it all depends on the use case.
- Crawshaw and Zombiezen are pretty fast.
- Mattn, although the de-facto standard, is not the best overall solution.
- SQLite without CGO is possible.