Slow write performance when initializing with many keys #392
Comments
What's the value of NoFreelistSync when opening the db? If the freelist isn't synced to disk, it may take a while to rebuild the whole freelist on startup. Note that only one goroutine can write at a time, so batching the writes seems right to me. |
To optimize writing, consider setting a higher 'AllocSize' to minimize the number of file resizes during writes. |
Unfortunately, we don't expose AllocSize in Options; it's hard-coded. It seems we should add an item to Options. |
Thank you guys for the input! I really appreciate it.
PageSize default, NoFreelistSync false: took 321.526453s.
PageSize 64kb, NoFreelistSync true: took 301.830299s.
During usage the database is opened in read-only mode. |
Interesting. Collecting stacktrace using |
An e2e test case includes two stages: the first generates the sample db file, and the second performs whatever test is needed on that file. Please run the following two commands,
|
Sorry for the late response, I moved on to a custom solution, which works well. When I run the test, I do the following:
bucketName := []byte(idx.wal.Bucket())
idx.db.Batch(func(tx *bolt.Tx) error {
	bucket, err := tx.CreateBucketIfNotExists(bucketName)
	if err != nil {
		return packageError(fmt.Sprintf("error synchronizing WAL, unable to create the level bucket %s", bucketName), err)
	}
	for idx.wal.Len() > 0 {
		key, value := idx.wal.Pop()
		err = bucket.Put(key, value)
		if err != nil {
			return fmt.Errorf("error synchronizing WAL at bucket %s and key %s: %v", bucketName, key, err)
		}
	}
	return nil
})
to write the WAL entries into the DB. The resulting database file is 268,435,456 bytes. During creation I use:
options := &bolt.Options{
	Timeout:        10 * time.Second,
	NoFreelistSync: true,
	PageSize:       64000, // 64kb page size
}
I tried playing around with the PageSize values and also turned NoFreelistSync on and off, but it made very little difference. Filling the database takes around 25 minutes.
Maybe it is just not a good use case. And sorry for this minimal description. I have a lot going on currently. |
This is a known issue. bbolt gets very slow when inserting many keys in a single transaction. Code for demonstrating the issue: f75d1e0 Output:
As you can see, the numbers do not grow linearly with the key count. IIRC there was some operation in the B+ tree that shifts a large slice every time a page is written to disk. We need to capture a pprof profile to make sure. Until this gets fixed, you can commit the transaction intermittently as a workaround. Examples: |
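A minimal sketch of the "commit intermittently" workaround, written against a plain callback so it needs no third-party imports. The names `KV`, `commitInChunks`, and the chunk size of 4 are hypothetical and chosen only for illustration. With bbolt, the `commit` callback would presumably wrap `db.Update` and call `bucket.Put` for each pair, so that every chunk gets its own transaction.

```go
package main

import "fmt"

// KV is one key/value pair waiting to be written (hypothetical helper type).
type KV struct{ Key, Value []byte }

// commitInChunks splits items into slices of at most chunkSize entries and
// hands each slice to commit. The callback is expected to write its slice
// inside ONE transaction and commit it, so no single transaction ever holds
// more than chunkSize keys.
func commitInChunks(items []KV, chunkSize int, commit func(chunk []KV) error) error {
	if chunkSize <= 0 {
		return fmt.Errorf("chunkSize must be positive, got %d", chunkSize)
	}
	for start := 0; start < len(items); start += chunkSize {
		end := start + chunkSize
		if end > len(items) {
			end = len(items)
		}
		if err := commit(items[start:end]); err != nil {
			return fmt.Errorf("committing items [%d,%d): %w", start, end, err)
		}
	}
	return nil
}

func main() {
	items := make([]KV, 10)
	for i := range items {
		items[i] = KV{Key: []byte(fmt.Sprintf("key-%02d", i)), Value: []byte("v")}
	}
	// Stand-in for a real transaction: only report how many keys each
	// "commit" received.
	err := commitInChunks(items, 4, func(chunk []KV) error {
		fmt.Println("committed", len(chunk), "keys")
		return nil
	})
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

The right chunk size depends on the workload and would have to be found by measurement; the only point of the sketch is that each transaction stays bounded in size.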
It can be seen from the above flame graph that most of the time is spent on
Still not ideal but better than previous results. |
@marcus-wishes You can try setting @ahrtr @ptabor We can document it in README and close this issue. What do you think? |
Hi,
Maybe I am using BoltDB for a purpose it doesn't fit, but I have the following situation. I need a high-performance key/value store for reading data. The store is initialized with a large number of keys and small payloads (several GB in total) and, after initialization, is only used in read-only mode (the store is sometimes closed and opened again, but not very frequently).
My problem is that the initialization is taking a huge amount of time.
What I tried so far:
I am using both Windows and Linux, and the database files are also copied between different machines with both OSs.
Is there anything I am missing, or is this normal behavior of the tree rebalancing?