Benchmark experiments: WAL, Bulk Insert #7

Open
2 of 3 tasks
infogulch opened this issue Apr 9, 2024 · 1 comment

Comments

@infogulch

I've had a few ideas for improvements and additional benchmarks and I added them to my fork. The code and commits are a bit messy which is why I opened an issue to discuss rather than a PR, though I'm open to cleaning it up enough to submit if desired:

master...infogulch:go-sqlite-bench:bench-wal

General improvements

  • Changed go-sqlite-bench to output binaries and results to ./bench and added bench to .gitignore, instead of dumping junk in the parent directory
  • Added flags to allow selection of which specific benchmarks to run instead of always running all of them.
  • I think it would be nice if benchmark results were automatically tabulated and charts regenerated at the end of the run.
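
As a rough illustration of the benchmark-selection flags mentioned above, here is a minimal sketch. The `-bench` flag name, the `parseSelection` helper, and the benchmark names are all hypothetical and are not taken from the fork; the sketch only shows one way a comma-separated selection could be parsed with the standard `flag` package.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// parseSelection parses a hypothetical -bench flag such as "simple,wal"
// into a set of benchmark names. An empty value selects every benchmark.
func parseSelection(args []string, all []string) (map[string]bool, error) {
	fs := flag.NewFlagSet("bench", flag.ContinueOnError)
	list := fs.String("bench", "", "comma-separated benchmarks to run (default: all)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	known := make(map[string]bool, len(all))
	for _, name := range all {
		known[name] = true
	}
	if *list == "" {
		return known, nil
	}

	selected := make(map[string]bool)
	for _, name := range strings.Split(*list, ",") {
		name = strings.TrimSpace(name)
		if !known[name] {
			return nil, fmt.Errorf("unknown benchmark %q", name)
		}
		selected[name] = true
	}
	return selected, nil
}

func main() {
	// Placeholder benchmark names, used only for this illustration.
	all := []string{"simple", "complex", "wal"}
	selected, err := parseSelection(os.Args[1:], all)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	for _, name := range all {
		if selected[name] {
			fmt.Println("would run:", name)
		}
	}
}
```

Running it with `-bench=simple,wal` would list only those two names; running it with no flag selects everything.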

Bench WAL

ncruces is working on a new version of ncruces/go-sqlite3 that implements mmapped shared memory, which is necessary to enable WAL. I expect WAL could improve performance for mixed-workload applications. I think this is pretty cool, but go-sqlite-bench doesn't have a benchmark for WAL, so I tried to add one:

https://github.com/infogulch/go-sqlite-bench/blob/bench-wal/app/app.go#L443-L524

  • The results show a slight improvement but in general are a mixed bag.
  • I think the new benchmark is somewhat flawed, since it often times out with SQLITE_BUSY.
  • It's a bit annoying to compare benchmark results because the benchmark is designed to test a single version of all dependencies, but to compare ncruces/go-sqlite3:master to ncruces/go-sqlite3:wal it needs different versions of dependencies. Maybe a solution that uses a separate go.mod file for ./cmd/bench-ncruces-wal would work.

I'm open to ideas to improve the WAL benchmark, it doesn't seem to be good enough to PR at the moment.
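
One thing that might be worth checking for the SQLITE_BUSY timeouts is whether a busy timeout is set on the connections, since WAL still allows only one writer at a time. The sketch below is an assumption about a possible setup, not what the linked benchmark does: `configureWAL` and `walPragmas` are hypothetical helpers, and only the two pragmas themselves (`journal_mode` and `busy_timeout`) are standard SQLite.

```go
package main

import (
	"database/sql"
	"fmt"
)

// execer is the subset of *sql.DB used by configureWAL.
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// walPragmas returns the statements that switch a database to WAL mode and
// make a blocked connection wait up to busyTimeoutMillis before SQLite
// reports SQLITE_BUSY.
func walPragmas(busyTimeoutMillis int) []string {
	return []string{
		"PRAGMA journal_mode=WAL",
		fmt.Sprintf("PRAGMA busy_timeout=%d", busyTimeoutMillis),
	}
}

// configureWAL (hypothetical helper) runs the pragmas in order and stops at
// the first failure.
func configureWAL(db execer, busyTimeoutMillis int) error {
	for _, q := range walPragmas(busyTimeoutMillis) {
		if _, err := db.Exec(q); err != nil {
			return fmt.Errorf("%s: %w", q, err)
		}
	}
	return nil
}

func main() {
	// No driver is imported here, so this only prints the statements.
	for _, q := range walPragmas(5000) {
		fmt.Println(q)
	}
}
```

A caveat on this design: `journal_mode=WAL` is stored in the database file, but `busy_timeout` applies only to the connection that ran it. With a pooled `*sql.DB`, the timeout would need to reach every connection, which drivers usually expose through connection-string options that are not shown here.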

Bench multiple row binding BulkInsert

I've seen a lot of anecdotal advice to improve bulk insert performance by binding multiple rows of data at once, so I implemented a generic BulkInsert function and added a copy of the simple benchmark that uses it:

https://github.com/infogulch/go-sqlite-bench/blob/bench-wal/app/sqldb.go#L32-L74

For mattn/go-sqlite3 this gives a 55% improvement in bulk insert performance on my machine. I implemented it in a bit of a hurry, so it doesn't integrate very well with the rest of the benchmark framework (i.e. it only works with Go sql.DB). I'm open to ideas here.
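
For readers unfamiliar with the technique, multi-row binding can be sketched roughly as follows. This is not the BulkInsert function from the fork; `buildBulkInsert` is a hypothetical helper that only builds the statement text and the flattened argument list, assuming `?` placeholders and trusted table and column names.

```go
package main

import (
	"fmt"
	"strings"
)

// buildBulkInsert returns a single INSERT statement with one placeholder
// group per row, plus the row values flattened in matching order.
// Table and column names are interpolated directly, so they must be trusted.
func buildBulkInsert(table string, columns []string, rows [][]any) (string, []any, error) {
	if len(columns) == 0 || len(rows) == 0 {
		return "", nil, fmt.Errorf("need at least one column and one row")
	}
	// For two columns this is "(?,?)".
	group := "(" + strings.TrimSuffix(strings.Repeat("?,", len(columns)), ",") + ")"

	var sb strings.Builder
	fmt.Fprintf(&sb, "INSERT INTO %s (%s) VALUES ", table, strings.Join(columns, ","))
	args := make([]any, 0, len(rows)*len(columns))
	for i, row := range rows {
		if len(row) != len(columns) {
			return "", nil, fmt.Errorf("row %d has %d values, want %d", i, len(row), len(columns))
		}
		if i > 0 {
			sb.WriteByte(',')
		}
		sb.WriteString(group)
		args = append(args, row...)
	}
	return sb.String(), args, nil
}

func main() {
	// The users table and its columns are placeholders for this example.
	query, args, err := buildBulkInsert("users", []string{"id", "name"},
		[][]any{{1, "ann"}, {2, "bob"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(query) // INSERT INTO users (id,name) VALUES (?,?),(?,?)
	fmt.Println(len(args), "bound values")
}
```

The result could then be passed to `db.Exec(query, args...)`. SQLite caps the number of bound parameters per statement (SQLITE_MAX_VARIABLE_NUMBER, 32766 by default since 3.32.0 and 999 before that), so a large input would need to be split into chunks of rows.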

Idea: Bulk data transfer via JSONB

SQLite 3.45.0 just landed support for JSONB. The authors pretty explicitly state that they don't really want library users to use it directly. That said, SQLite's JSONB format looks like a pretty efficient way to encode structured data. I wonder if encoding Go data as JSONB, passing the encoded data as a single blob parameter, and processing it into normal inserts etc. with SQLite functions like JSON_EACH would be faster.

In particular I'd expect TEXTRAW, TRUE, FALSE, NULL, ARRAY, and OBJECT types would be very fast to encode, however integer and floating point types are serialized into strings so you'd be paying for number encoding/parsing on each (though INT5 allows encoding ints as hex?).
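
To make the shape of this idea concrete, here is a sketch that uses plain text JSON instead of JSONB, since writing a JSONB encoder is out of scope here. The `users` table, the `insertFromJSON` statement, and the `encodeRows` helper are hypothetical. Since SQLite 3.45.0 accepts a JSONB blob wherever its JSON functions accept text JSON, a JSONB encoder could presumably replace `encodeRows` without changing the SQL.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// insertFromJSON expands one JSON array-of-arrays parameter into rows.
// json_each yields one row per top-level element, and json_extract picks
// the fields out of each element. The users table is a placeholder.
const insertFromJSON = `INSERT INTO users (id, name)
SELECT json_extract(value, '$[0]'), json_extract(value, '$[1]')
FROM json_each(?)`

// encodeRows serializes rows as a JSON array of arrays,
// for example [[1,"ann"],[2,"bob"]].
func encodeRows(rows [][]any) (string, error) {
	b, err := json.Marshal(rows)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	payload, err := encodeRows([][]any{{1, "ann"}, {2, "bob"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(insertFromJSON)
	fmt.Println(payload) // [[1,"ann"],[2,"bob"]]
}
```

The whole batch then travels as a single bound parameter, e.g. `db.Exec(insertFromJSON, payload)`, which sidesteps the per-statement parameter limit. Whether it is actually faster than multi-row binding would need to be measured.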

@ncruces
Contributor

ncruces commented Apr 11, 2024

WAL mode ncruces/go-sqlite3#71 was merged. I'll tag a release as soon as wazero tags a release.

Released!
