
Optimizations; driver is now about 250% faster #3

Closed
wants to merge 0 commits into from

Conversation

FooBarWidget
Contributor

The Ruby driver was quite inefficient with handling data. Strings (read from the network or passed by the user) were being unpacked into arrays all over the place and vice versa. We've modified the driver to work with strings instead of byte arrays as much as possible. Most notably: ByteBuffer has been rewritten to use a binary string as underlying storage object instead of an array.
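(Not the actual patch, but a minimal sketch of the idea, with hypothetical names: a buffer that appends to a binary String instead of pushing individual byte values onto an Array.)

class StringBackedByteBuffer
  def initialize
    @buf = ''
    # On Ruby 1.9, make sure the storage string is treated as raw bytes.
    @buf.force_encoding('BINARY') if @buf.respond_to?(:force_encoding)
  end

  # Append data that is already a packed string -- no unpacking into an
  # Array of byte values and no repacking later.
  def put_binary(str)
    @buf << str
  end

  # Append a 32-bit little-endian integer by packing straight into the string.
  def put_int(i)
    @buf << [i].pack('V')
  end

  # The serialized message is simply the string itself; an Array-backed
  # buffer would need an expensive pack("C*") call here.
  def to_s
    @buf
  end
end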

The Ruby 1.8 implementation of BSON::OrderedHash was also inefficient: it used a Set even though one isn't necessary. We removed the dependency on Set and greatly improved OrderedHash's Ruby 1.8 performance.
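(Again a hypothetical sketch rather than the driver's actual OrderedHash: on Ruby 1.8, where Hash is unordered, insertion order can be tracked with a plain Array, and Hash#key? already answers the membership question a Set would be used for.)

class SimpleOrderedHash < Hash
  def initialize(*args, &block)
    super
    @ordered_keys = []
  end

  def []=(key, value)
    # Hash#key? replaces the Set membership test.
    @ordered_keys << key unless key?(key)
    super
  end

  def keys
    @ordered_keys.dup
  end

  # Iterate in insertion order, which a bare Ruby 1.8 Hash does not guarantee.
  def each
    @ordered_keys.each { |k| yield k, self[k] }
    self
  end
end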

The end result is a driver that's 274% faster on Ruby 1.8 and 204% faster on Ruby 1.9. We used the following benchmark:

requests = Mongo::Connection.new.db('foobar').collection('requests')
query = { :_id => "6d61bbc7e32795e7ace8b98e8e83961cb8e3ee53" }
40000.times do
    requests.find(query, :limit => -1) do |cursor|
        cursor.next_document
    end
end

Original runtime: 74.0s (Ruby 1.8) / 33.9s (Ruby 1.9)
New runtime: 25.8s (Ruby 1.8) / 16.7s (Ruby 1.9)

For reviewing and cherry-picking convenience, we've split the optimizations into small commits. Some can be cherry-picked individually; others depend on earlier commits.

@moonmaster9000

nice work!

@rodrigoalvesvieira

You guys from Phusion rock!

@darkhelmet

Bad ass.

@FooBarWidget
Contributor Author

Doh, it was 0:30 AM when I sent the pull request. It's actually 186% and 104% faster, not 274% and 204% faster. Not as impressive as the original percentages, but still. :)
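(For anyone checking the arithmetic: the corrected figures treat "X% faster" as old runtime divided by new runtime, minus one, rather than the ratio itself. Applied to the runtimes quoted above, this lands close to the corrected numbers, with the small differences presumably due to rounding in the posted times.)

# "X% faster" = (old_runtime / new_runtime - 1) * 100
(74.0 / 25.8 - 1) * 100   # ~187% on Ruby 1.8, close to the corrected ~186%
(33.9 / 16.7 - 1) * 100   # ~103% on Ruby 1.9, close to the corrected ~104%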

@banker
Contributor

banker commented Sep 12, 2010

Thanks for the code review and for the proposed patches. Excellent work!

I haven't reviewed the changes thoroughly yet, but I decided to pull them in just to benchmark. Please see this gist and compare the results from 9/11/10 to those from 9/1/10.

https://gist.github.com/c5e474b753d4878416a2

The code to run this is located in bin/standard_benchmark

As you can see, your changes have significantly improved query speed; however, inserts, for some reason, are in some cases half as fast. I'll have a chance to look more deeply at the code this week.

I'll definitely pull in many of these changes. For the moment, we should be a little more precise and say that query speed has been improved, but that certain operations, notably inserts, may have degraded as a result.

@FooBarWidget
Contributor Author

That's probably because I haven't bothered to benchmark inserts. There are some places in the driver that do things like this:

byte_buffer.put_array(some_data.unpack("C*"))

Code like this is still written under the assumption that ByteBuffer uses an array as its underlying storage. It should be replaced with a call to put_binary, without unpacking the data.
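In other words, the fix is roughly (a sketch; put_binary takes the packed string directly):

# Before: unpack the string into an Array of byte values, only for the
# string-backed ByteBuffer to turn it back into a string.
byte_buffer.put_array(some_data.unpack("C*"))

# After: hand the packed string straight to the buffer.
byte_buffer.put_binary(some_data)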

I ran into similar problems while rewriting ByteBuffer. After the rewrite my microbenchmark somehow became almost twice as slow as before; it was only after I fixed the put_array call that it became faster than before.

@FooBarWidget
Contributor Author

What do "Passenger optimizations" in your gist refer to and how is it different from "Post-Phusion improvements"?

@banker
Contributor

banker commented Sep 12, 2010

Fixed. Both are Post-Phusion improvements, but one is 1.8.7 and the other is 1.9.2.

What you say about the insert performance makes sense. Again, will look more closely at this tomorrow.

@FooBarWidget
Contributor Author

Yeah, it's exactly as I suspected. Here's a commit that fixes not only inserts but also a few other things, like update, remove and last_error_message. They're all significantly faster now. In my own microbenchmark I used this to benchmark large inserts:

2000.times do |i|
    LARGE['x'] = i
    col.insert(LARGE)
    LARGE.delete(:_id)
end

LARGE is taken directly from bin/standard_benchmark.

Runtime with the original driver: 3.45s
Runtime with the original optimizations (before this fix): 5.90s
New runtime with the fix: 1.08s (219% faster than the original driver, 446% faster than the last attempt)

@FooBarWidget
Contributor Author

My results BTW: http://gist.github.com/576397

@banker
Contributor

banker commented Sep 12, 2010

This is huge. I always thought that moving to a string representation would be faster, but some initial tests I did on the idea proved slow. Thanks for showing the right way to go about this. Impressive work. People will be thrilled.

Kyle

@FooBarWidget
Contributor Author

Looks like standard_benchmark has become almost 200% faster in user op/s. :)
Original avg user op/s: 2886
New avg user op/s: 8627

@jamieorc

This is great news. You guys rock!

@pius

pius commented Sep 13, 2010

Nice work!

@bkeepers

Thank you, amazing work!

@michaeldwan

Great work guys!

This pull request was closed.