This repository has been archived by the owner on Mar 9, 2019. It is now read-only.

DB.NoSync, other performance considerations #612

Closed
DavidVorick opened this issue Oct 28, 2016 · 1 comment

Comments

@DavidVorick

We're using bolt on some spinning disks and we really need a substantial performance boost over what we're currently getting. Some workloads are write-intensive, and others just require us to run a large number of serial (atomic?) transactions per second.

What are the risks of using DB.NoSync? Obviously you aren't guaranteed that the data has hit the platter, but as long as we're taking that into consideration are there other footguns? Can we assume that DB.NoSync will still provide fully atomic transactions?

For the write-heavy loads, will it be faster to sort our entries before we insert them into the database? Is that something that bolt would consider (or already does) implementing on the backend?

Will it help to move some of the bulkier objects (>64kb) into their own files? At that point we have to add a lot of overhead, because we need to keep things atomic & consistent.

Are there other things you can think of that will help us squeeze more performance out of the database? We're at the point where for some of the code we've begun writing custom databases to handle things.

@benbjohnson
Member

> What are the risks of using DB.NoSync? Obviously you aren't guaranteed that the data has hit the platter, but as long as we're taking that into consideration are there other footguns? Can we assume that DB.NoSync will still provide fully atomic transactions?

NoSync is pretty dangerous. The order of writes can be rearranged by the OS/disk to optimize write throughput, which means the meta page could be written before the data pages; if the computer crashes at that point, you would corrupt your database. It's really just meant for bulk loading, where you can load all your data, verify the computer hasn't crashed, and then re-enable sync.

> For the write-heavy loads, will it be faster to sort our entries before we insert them into the database? Is that something that bolt would consider (or already does) implementing on the backend?

Sorting can help in some instances but usually not unless you're doing appends to your bucket.
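As a sketch of the sorting idea: keys can be pre-sorted with the same bytewise comparison bolt uses, so an insert loop walks the B+tree mostly in order instead of jumping around it (the `sortKeys` helper is hypothetical):

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// sortKeys orders keys bytewise, the same way bolt compares them,
// so a subsequent insert loop touches pages mostly sequentially.
func sortKeys(keys [][]byte) {
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})
}

func main() {
	keys := [][]byte{[]byte("zeta"), []byte("alpha"), []byte("mid")}
	sortKeys(keys)
	for _, k := range keys {
		fmt.Println(string(k)) // alpha, mid, zeta
	}
	// Insert in this order inside a single db.Update(...) transaction.
}
```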

> Will it help to move some of the bulkier objects (>64kb) into their own files? At that point we have to add a lot of overhead, because we need to keep things atomic & consistent.

If you can shard your data into multiple bolt instances then I would expect that to improve your write performance. The sync() blocks an individual Commit(), but you would effectively batch those syncs if you spread writes across multiple databases. That's not always an option, though.
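A minimal sketch of routing keys to shards, assuming each shard is a separate bolt.DB file (the `shardFor` helper and the shard count are made up for illustration):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor hashes a key to one of n shards. Each shard would be its
// own bolt.DB file, so each commit only fsyncs one smaller file and
// commits on different shards can proceed concurrently.
func shardFor(key []byte, n uint32) uint32 {
	h := fnv.New32a()
	h.Write(key)
	return h.Sum32() % n
}

func main() {
	for _, k := range []string{"user:1", "user:2", "user:3"} {
		fmt.Printf("%s -> shard %d\n", k, shardFor([]byte(k), 4))
	}
}
```

Note that sharding gives up cross-shard atomicity: a write that spans two shards is two separate transactions, which is exactly the overhead the question above is worried about.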

> We're using bolt on some spinning disks and we really need some substantial performance boosts compared to what we are currently getting.

Honestly, spinning disks aren't going to give very good performance with Bolt. Bolt does a lot of random writes, which work well on SSDs but are pretty rough for spinning disks. If write performance is a big issue then you may need to look at a write-optimized data store such as RocksDB.

> Are there other things you can think of that will help us squeeze more performance out of the database?

If you can batch your transactions together then I would expect that to improve performance on a spinning disk.
