
Optimizations; driver is now about 250% faster #3

Closed
wants to merge 0 commits into from

Conversation

FooBarWidget
Contributor

The Ruby driver was quite inefficient with handling data. Strings (read from the network or passed by the user) were being unpacked into arrays all over the place and vice versa. We've modified the driver to work with strings instead of byte arrays as much as possible. Most notably: ByteBuffer has been rewritten to use a binary string as underlying storage object instead of an array.
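(Not the actual patch, but a minimal sketch of the idea, with hypothetical names: a buffer that appends to a binary String instead of pushing individual byte values onto an Array.)

class StringBackedByteBuffer
  def initialize
    @buf = ''
    # On Ruby 1.9, make sure the storage string is treated as raw bytes.
    @buf.force_encoding('BINARY') if @buf.respond_to?(:force_encoding)
  end

  # Append data that is already a packed string -- no unpacking into an
  # Array of byte values and no repacking later.
  def put_binary(str)
    @buf << str
  end

  # Append a 32-bit little-endian integer by packing straight into the string.
  def put_int(i)
    @buf << [i].pack('V')
  end

  # The serialized message is simply the string itself; an Array-backed
  # buffer would need an expensive pack("C*") call here.
  def to_s
    @buf
  end
end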

The Ruby 1.8 implementation of BSON::OrderedHash was also inefficient: it used a Set even though one isn't necessary. We removed the dependency on Set and greatly improved OrderedHash's Ruby 1.8 performance.
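(Again a hypothetical sketch rather than the driver's actual OrderedHash: on Ruby 1.8, where Hash is unordered, insertion order can be tracked with a plain Array, and Hash#key? already answers the membership question a Set would be used for.)

class SimpleOrderedHash < Hash
  def initialize(*args, &block)
    super
    @ordered_keys = []
  end

  def []=(key, value)
    # Hash#key? replaces the Set membership test.
    @ordered_keys << key unless key?(key)
    super
  end

  def keys
    @ordered_keys.dup
  end

  # Iterate in insertion order, which a bare Ruby 1.8 Hash does not guarantee.
  def each
    @ordered_keys.each { |k| yield k, self[k] }
    self
  end
end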

The end result is a driver that's 274% faster on Ruby 1.8 and 204% faster on Ruby 1.9. We used the following benchmark:

requests = Mongo::Connection.new.db('foobar').collection('requests')
query = { :_id => "6d61bbc7e32795e7ace8b98e8e83961cb8e3ee53" }
40000.times do
    requests.find(query, :limit => -1) do |cursor|
        cursor.next_document
    end
end

Original runtime: 74.0s (Ruby 1.8) / 33.9s (Ruby 1.9)
New runtime: 25.8s (Ruby 1.8) / 16.7s (Ruby 1.9)

For reviewing and cherry-picking convenience, we've split the optimizations into small commits. Some can be cherry-picked individually; others depend on earlier commits.

@moonmaster9000

nice work!

@rodrigoalvesvieira

You guys from Phusion rock!

@darkhelmet

Bad ass.

@FooBarWidget
Contributor Author

Doh, it was 0:30 AM when I sent the pull request. It's actually 186% and 104% faster, not 274% and 204% faster. Not as impressive as the original percentages, but still. :)
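(For anyone checking the arithmetic: the corrected figures treat "X% faster" as old runtime divided by new runtime, minus one, rather than the ratio itself. Applied to the runtimes quoted above, this lands close to the corrected numbers, with the small differences presumably due to rounding in the posted times.)

# "X% faster" = (old_runtime / new_runtime - 1) * 100
(74.0 / 25.8 - 1) * 100   # ~187% on Ruby 1.8, close to the corrected ~186%
(33.9 / 16.7 - 1) * 100   # ~103% on Ruby 1.9, close to the corrected ~104%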

@banker
Contributor

banker commented Sep 12, 2010

Thanks for the code review and for the proposed patches. Excellent work!

I haven't reviewed the changes thoroughly yet, but I decided to pull them in just to benchmark. Please see this gist and compare the results from 9/11/10 to those from 9/1/10.

https://gist.github.com/c5e474b753d4878416a2

The code to run this is located in bin/standard_benchmark

As you can see, your changes have significantly improved query speed; however, inserts, for some reason, are in some cases half as fast. I'll have a chance to look more deeply at the code this week.

I'll definitely pull in many of these changes. For the moment, we should be a little more precise and say that query speed has been improved, but that certain operations, notably inserts, may have degraded as a result.

@FooBarWidget
Contributor Author

That's probably because I haven't bothered to benchmark inserts. There are some places in the driver that do things like this:

byte_buffer.put_array(some_data.unpack("C*"))

Code like this is still written under the assumption that ByteBuffer uses an array as its underlying storage. It should be replaced with a call to put_binary, without unpacking the data.
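In other words, the fix is roughly (a sketch; put_binary takes the packed string directly):

# Before: unpack the string into an Array of byte values, only for the
# string-backed ByteBuffer to turn it back into a string.
byte_buffer.put_array(some_data.unpack("C*"))

# After: hand the packed string straight to the buffer.
byte_buffer.put_binary(some_data)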

I ran into similar problems while rewriting ByteBuffer. After the rewrite my microbenchmark somehow became almost twice as slow as before; it was only after I fixed the put_array call that it became faster than before.

@FooBarWidget
Contributor Author

What do "Passenger optimizations" in your gist refer to and how is it different from "Post-Phusion improvements"?

@banker
Contributor

banker commented Sep 12, 2010

Fixed. Both are Post-Phusion improvements, but one is 1.8.7 and the other is 1.9.2.

What you say about the insert performance makes sense. Again, will look more closely at this tomorrow.

@FooBarWidget
Contributor Author

Yeah, it's exactly as I suspected. Here's a commit that fixes not only inserts but also a few other things, like update, remove and last_error_message. They're all significantly faster now. In my own microbenchmark I used this to benchmark large inserts:

2000.times do |i|
    LARGE['x'] = i
    col.insert(LARGE)
    LARGE.delete(:_id)
end

LARGE is taken directly from bin/standard_benchmark.

Runtime with the original driver: 3.45s
Runtime with the original optimizations (before this fix): 5.90s
New runtime with the fix: 1.08s (219% faster than the original driver, 446% faster than the last attempt)

@FooBarWidget
Contributor Author

My results BTW: http://gist.github.com/576397

@banker
Contributor

banker commented Sep 12, 2010

This is huge. I always thought that moving to a string representation would be faster, but some initial tests I did on the idea proved slow. Thanks for showing the right way to go about this. Impressive work. People will be thrilled.

Kyle

@FooBarWidget
Contributor Author

Looks like standard_benchmark has become almost 200% faster in user op/s. :)
Original avg user op/s: 2886
New avg user op/s: 8627

@jamieorc

This is great news. You guys rock!

@pius

pius commented Sep 13, 2010

Nice work!

@bkeepers

Thank you, amazing work!

@michaeldwan

Great work guys!

This pull request was closed.