This repository has been archived by the owner on Mar 9, 2019. It is now read-only.

Add DVID to projects using Bolt #107

Merged
merged 1 commit into from
Mar 31, 2014

Conversation

DocSavage
Contributor

DVID added Bolt as an optional storage engine.

@benbjohnson
Member

@DocSavage Nice! How was it integrating into DVID?

benbjohnson added a commit that referenced this pull request Mar 31, 2014
Add DVID to projects using Bolt
@benbjohnson benbjohnson merged commit 4364c2f into boltdb:master Mar 31, 2014
@DocSavage
Contributor Author

Took less than a day, which included some batch write simplifications I needed to make to my generic key-value interface that was primarily tailored to leveldb variants. The "driver" is a small file. I'm probably doing some stupid things in there.

A couple of questions:

  • I noticed both for leveldb-go and your database the size of the database is "int" instead of "int64". Are there advantages to using int? Seems like the variability of its actual max size as well as signed nature would make uint64 better, e.g., Stat.MmapSize.
  • The original lmdb has some bulk write optimizations. Is that something that could be added?

One benefit of the pure Go storage engine is being able to copy the executable without any libraries tagging along.

It's already clear from my initial use that the random small-key reads and writes needed at one point in the code are considerably slower than with leveldb. These are essentially random key-only puts, which previous benchmarks indicate are faster in leveldb than lmdb, but not by 2x. It looks like a 30x or greater difference with the Go version. Have you benchmarked your random writes against the C lmdb? I'm not batching my puts.

Thanks for open-sourcing your project :)

@benbjohnson
Member

@DocSavage Good to hear that it was pretty easy.

> I noticed both for leveldb-go and your database the size of the database is "int" instead of "int64". Are there advantages to using int? Seems like the variability of its actual max size as well as signed nature would make uint64 better, e.g., Stat.MmapSize.

I generally prefer int for external APIs unless there's a good reason to use something else. It seems like a uint probably makes a better choice here since the DB size can't be mmap'd to anything larger than 4GB on 32-bit systems. Bolt needs better testing for 32-bit systems though.

> The original lmdb has some bulk write optimizations. Is that something that could be added?

LMDB has some unsafe optimizations such as MDB_APPEND that I'm not interested in adding in. I don't want to make it possible to corrupt the DB. Currently there's a bug (#94) where nodes don't split during multiple inserts for a single transaction so bulk loading is slow. This will be fixed soon though.

The goal thus far has been to make Bolt solid and API stable and optimize for sequential reads (which is my personal use case). However, Bolt is starting to get pretty solid so I'm going to start working on optimizations soon. I have no doubt that it's significantly slower than LevelDB & LMDB right now but hopefully it'll get a lot closer in performance to both of them.

Thanks for checking out the project!
