WriteBatchWithIndex seems slow #608

Open
adamretter opened this issue May 17, 2015 · 14 comments

@adamretter
Collaborator

I am trying to atomically write 5,305,512 key/value pairs into RocksDB. The performance of this feels slow to me, at about 60 seconds.

I am writing to a single WriteBatchWithIndex by calling org.rocksdb.WriteBatchWithIndex#put(byte[], int, byte[], int, long), which takes some ~50 seconds, whereas subsequently writing that batch into Rocks using org.rocksdb.RocksDB#write(org.rocksdb.WriteOptions, org.rocksdb.WriteBatchWithIndex) takes just ~8 seconds.

Is WriteBatchWithIndex known to be slow? I was under the impression that it was all in-memory in a skip-list and so would be quite quick.
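For reference, a condensed C++ sketch of the two-phase pattern described above (the actual code is Java via RocksJava; `pairs` and `db` here are stand-ins for the caller's data and open database):

#include <string>
#include <utility>
#include <vector>

#include "rocksdb/db.h"
#include "rocksdb/utilities/write_batch_with_index.h"

using namespace rocksdb;

void AtomicWrite(DB* db,
                 const std::vector<std::pair<std::string, std::string>>& pairs) {
  // Phase 1: accumulate all key/value pairs in an indexed batch (~50s here).
  // Each Put appends to the batch buffer and also updates the in-memory index.
  WriteBatchWithIndex batch;
  for (const auto& kv : pairs) {
    batch.Put(kv.first, kv.second);
  }
  // Phase 2: commit the whole batch atomically (~8s here).
  db->Write(WriteOptions(), batch.GetWriteBatch());
}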

@yhchiang
Contributor

cc @igorcanadi, @siying, and @agiardullo who know more about WriteBatchWithIndex().

@siying
Contributor

siying commented May 18, 2015

@adamretter WriteBatchWithIndex is slower than WriteBatch because it also has to insert every entry into the index, which involves a binary search. If you are building a super large batch in WriteBatchWithIndex, it can even be slower than inserting into the DB, as you are seeing, since it has to build a binary index on the fly.

For your case, it is possible that the slowness is caused by resizing the underlying buffer in WriteBatchWithIndex, which is implemented using std::string. You can try to preallocate a sufficiently large buffer for the WriteBatchWithIndex when constructing it: its C++ constructor has a reserved_bytes parameter. Not sure about RocksJava.
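A minimal sketch of that suggestion in C++, assuming the constructor takes a comparator followed by reserved_bytes; the 80 MB figure is picked purely for illustration:

#include "rocksdb/comparator.h"
#include "rocksdb/utilities/write_batch_with_index.h"

using namespace rocksdb;

// Reserve ~80 MB up front (illustrative figure) so the underlying
// std::string never has to reallocate while millions of entries are added.
const size_t reserved_bytes = 80 << 20;
WriteBatchWithIndex batch(BytewiseComparator(), reserved_bytes);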

@adamretter
Collaborator Author

@siying It is very hard for me to know how much to preallocate in advance, as I use the WriteBatchWithIndex for every update operation on the database; it might contain anywhere between 5 key/value pairs and tens of millions. Apart from preallocating, what other options are available for speeding this up?

@siying
Contributor

siying commented May 18, 2015

@adamretter do you have to put 5 million keys in one write batch? Could you write several smaller batches instead?

@adamretter
Collaborator Author

@siying Afraid not. I need to write objects into Rocks atomically, and to do that I shred them into key/value pairs. The size of each object and the number of objects the end-user elects to store determine how many key/value pairs go into the write batch. This is actually not a big one - just about 75MB of object data.

@siying
Contributor

siying commented May 18, 2015

@adamretter I can't think of a way to solve the problem for you right now. std::string's resizing is not exponential. What if we implemented a WriteBatchWithIndex::ReserveBuffer() that you could call as the batch grows, so that you could track the buffer size yourself and call ReserveBuffer() in an exponential way? Would that help you?
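A caller-side sketch of how that proposal might be used. Note that ReserveBuffer() is hypothetical (it was never added to WriteBatchWithIndex), and the 24-byte per-record overhead is a rough assumption:

#include <algorithm>

// Hypothetical usage of the proposed ReserveBuffer() API.
size_t capacity = 4096;                        // arbitrary starting reservation
batch.ReserveBuffer(capacity);                 // hypothetical method
for (const auto& kv : pairs) {                 // `pairs`: the caller's data
  size_t needed = batch.GetWriteBatch()->GetDataSize()
                  + kv.first.size() + kv.second.size() + 24;  // 24 = assumed record overhead
  if (needed > capacity) {
    capacity = std::max(capacity * 2, needed); // grow exponentially, like std::vector
    batch.ReserveBuffer(capacity);             // hypothetical method
  }
  batch.Put(kv.first, kv.second);
}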

@adamretter
Collaborator Author

@siying I am not sure I can estimate the size up-front before storing a complex object, so manually resizing the buffer is probably not useful. However, I wonder if it would be more efficient to have a master WriteBatch plus sub-WriteBatches that are merged into the master: each complex object I am storing could be written to its own sub-WriteBatch and then merged with the master WriteBatch before the final commit. Any idea whether that might speed up the index operations, i.e. would one bulk index update be faster than lots of incremental index updates?

@siying
Contributor

siying commented May 26, 2015

@adamretter if my earlier guess that the bottleneck is string resizing is correct, then your approach would give you a better estimate of the final write buffer size, which you can pass through the constructor. The total time could be faster. Give it a try and see whether you get a speed-up.
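A sketch of that combined approach (`objects` and ShredIntoBatch() are stand-ins for the caller's shredding logic): stage each object in its own WriteBatch, sum the staged sizes, and pass the total through the WriteBatchWithIndex constructor:

std::vector<WriteBatch> staged;
size_t total_bytes = 0;
for (const auto& obj : objects) {
  staged.emplace_back();
  ShredIntoBatch(obj, &staged.back());  // caller's object-to-key/value logic
  total_bytes += staged.back().GetDataSize();
}
// Construct the indexed batch with an accurate reservation, then replay
// the staged batches into it (e.g. via an Append(), as discussed below).
WriteBatchWithIndex batch(BytewiseComparator(), total_bytes);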

@ghost

ghost commented Aug 5, 2015

Thank you for reporting this issue; we appreciate your patience. We've notified the core team for an update on this issue. We're looking for a response within the next 30 days, or the issue may be closed.

@adamretter
Collaborator Author

@siying (cc @igorcanadi @agiardullo) My latest idea is to put each big group of updates into their own WriteBatch and then append each of those to my WriteBatchWithIndex.

I am hopeful this helps in two ways. First, because the size of a WriteBatch is known when we come to append it to a WriteBatchWithIndex, we could correctly reallocate memory for the WriteBatchWithIndex up front. Second, since all of the items in the WriteBatch are already sorted by the comparator, updating the index could be done in one pass: the keys arrive in order, and it looks to me like this path is already optimised at "// fast path for sequential insertion".

To do this, I am trying to add the following method to WriteBatchWithIndex:

void WriteBatchWithIndex::Append(WriteBatch* other_batch)

However I am struggling a bit. My implementation looks something like this so far:

// Handler that replays the records of the appended batch into the
// WriteBatchWithIndex's index. Declared before WriteBatchWithIndex::Append
// so that Append can instantiate it.
class AppendHandler : public WriteBatch::Handler {
 private:
  WriteBatchWithIndex::Rep* rep;
  size_t last_entry_offset;

 public:
  explicit AppendHandler(WriteBatchWithIndex::Rep* wbwi_rep)
      : rep(wbwi_rep),
        // Capture the offset at which the appended records will start;
        // this must be taken before the underlying batch is appended to.
        last_entry_offset(rep->write_batch.GetDataSize()) {}

  virtual Status PutCF(uint32_t column_family_id, const Slice& key,
                       const Slice& value) override {
    rep->SetLastEntryOffset(last_entry_offset);

    if (column_family_id == 0) {
      rep->AddOrUpdateIndex(key);
    } else {
      rep->AddOrUpdateIndex(column_family_id, key);
    }

    // TODO(AR) we need to calculate what this will be, something like
    // last_entry_offset += sizeof(this_put_cf)

    return Status::OK();
  }

  // ... handlers for the other record types (DeleteCF, MergeCF, etc.) elided ...
};

void WriteBatchWithIndex::Append(WriteBatch* other_batch) {
  // Construct the handler first so it captures the current end of the batch,
  // then append the raw records and replay them to update the index.
  AppendHandler appendHandler(rep);
  rep->write_batch.Append(other_batch);
  other_batch->Iterate(&appendHandler);
}
I am not sure I am on the right track, so any comments would be appreciated. My main concrete problem at the moment is that I cannot see how to go from a column_family_id to a ColumnFamilyHandle within my AppendHandler - any ideas?

@agiardullo
Contributor

Take a look at WriteBatchWithIndex::Rep::ReBuildIndex(). I added this function a couple months ago in order to support RollbackToSavePoint(). While this function isn't exactly what you need, it demonstrates one way of taking records from a WriteBatch and updating the index of a WBWI. It could probably be extended and reused as part of your solution.

@adamretter
Collaborator Author

adamretter commented Apr 17, 2016

@siying I wrote a small benchmark for testing the three methods of insertion into the database:

  1. Direct puts, i.e. RocksDB#put(cf, key, value)
  2. WriteBatch
  3. WriteBatchWithIndex

The results were gathered using the latest version of RocksDB as of today, i.e. with the micro-optimization for WriteBatchWithIndex https://reviews.facebook.net/D55539.

Unfortunately it seems that WBWI is still about 60% slower for puts than WriteBatch. Note that I believe I am inserting my keys in already-sorted order, so I would hope it could take the fast path in WriteBatchWithIndex that is optimized for inserting sequential keys (although I do use a custom comparator).

https://github.com/adamretter/rocksjava-write-methods-benchmark#results
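For reference, a condensed C++ sketch of the three code paths being compared (the benchmark itself is Java; `cf`, `db`, and `pairs` are stand-ins):

// 1. Direct puts - one DB write per pair.
for (const auto& kv : pairs) db->Put(WriteOptions(), cf, kv.first, kv.second);

// 2. WriteBatch - buffer everything, commit once.
WriteBatch wb;
for (const auto& kv : pairs) wb.Put(cf, kv.first, kv.second);
db->Write(WriteOptions(), &wb);

// 3. WriteBatchWithIndex - as above, but each Put also updates the index.
WriteBatchWithIndex wbwi;
for (const auto& kv : pairs) wbwi.Put(cf, kv.first, kv.second);
db->Write(WriteOptions(), wbwi.GetWriteBatch());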

Just wondering where to start digging to improve this - is it perhaps still a good idea to attempt #608 (comment)?

@gfosco
Contributor

gfosco commented Jan 10, 2018

Closing this via automation due to lack of activity. If discussion is still needed here, please re-open or create a new/updated issue.

@siying
Contributor

siying commented Jul 12, 2018

@adamretter are you saying it's 60% slower than DB::Put() or than WriteBatch::Put()? When I try it, it is much faster than DB::Put(). Being 60% slower than WriteBatch::Put() is not a bad result.
