sandal edited this page Dec 14, 2011 · 7 revisions

We've all heard the famous Knuth quote "Premature optimization is the root of all evil", but that's not to say that we can outright ignore the performance characteristics of our code. Instead, it's meant to serve as a reminder that performance optimization should only be done when there is an established need for speedier computations.

Assuming there is a real need for improving performance, the first step should be to produce some benchmarks that measure both the specific feature you are trying to optimize and the overall performance of the application. No performance tuning should be done without at least a minimal amount of measurement of the impact you're having on the system. In the simplest cases, the time it takes to run your project's test suite will give you a good hint at the impact you're having system-wide. For measuring the impact of a change to a single feature, either the Benchmark standard library or some simple manipulations of Time objects will do the trick.
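As a rough illustration of both approaches, the sketch below times a single operation with `Benchmark.realtime` and then again with plain `Time` arithmetic. The `build_report` method is a hypothetical stand-in for whatever feature you are trying to optimize; it is not taken from any real project.

```ruby
require "benchmark"

# Hypothetical stand-in for the feature being optimized.
def build_report(n)
  (1..n).map { |i| i.to_s.rjust(8, "0") }.join("\n")
end

# Benchmark.realtime returns the elapsed wall-clock time in seconds as a Float.
elapsed = Benchmark.realtime { build_report(50_000) }
puts format("Benchmark.realtime: %.4fs", elapsed)

# The same measurement using simple manipulations of Time objects.
started = Time.now
build_report(50_000)
manual_elapsed = Time.now - started
puts format("Time arithmetic:    %.4fs", manual_elapsed)
```

A single run is noisy, so it is usually worth repeating the measurement a few times before and after a change and comparing the typical values rather than any one number.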

Once measurements are in place, the next step should be to determine whether there are more efficient algorithms that can be used to approach your problem. Changes in algorithms can lead to order-of-magnitude improvements in performance, which can dwarf the improvement seen by implementing the same inefficient algorithm in a lower-level language such as C. That said, if the bottleneck in your code is primarily in simple computations rather than in complex algorithmic procedures, dropping down below the level of Ruby can make a big difference.
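To make the point about algorithms concrete, here is a small sketch comparing two ways of finding duplicate values in an array. Both method names and the sample data are made up for illustration. The first rescans the array for every element (quadratic work), while the second makes a single pass using a `Set` (roughly linear work).

```ruby
require "benchmark"
require "set"

# Quadratic: Array#index rescans the array from the start for every element.
def duplicates_slow(items)
  items.select.with_index { |item, i| items.index(item) != i }.uniq
end

# Linear: one pass, remembering what has already been seen.
def duplicates_fast(items)
  seen  = Set.new
  dupes = Set.new
  # Set#add? returns nil when the item was already present.
  items.each { |item| dupes << item unless seen.add?(item) }
  dupes.to_a
end

# Illustrative data: 5,000 integers in which every value appears twice.
data = Array.new(5_000) { |i| i % 2_500 }

slow_time = Benchmark.realtime { duplicates_slow(data) }
fast_time = Benchmark.realtime { duplicates_fast(data) }

puts format("quadratic: %.4fs", slow_time)
puts format("linear:    %.4fs", fast_time)
```

The exact timings depend on your machine and Ruby version, but the gap between the two grows quickly as the input gets larger, which is the kind of improvement that a straight port of the slower version to C would not provide.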

A classic example is that of PNG alpha channel splitting. The process of breaking out the alpha values from pixels is a simple one, but it involves a lot of low-level bit manipulation, which is simply a whole lot faster in C than it ever would be in Ruby. This is a problem we encountered in the Prawn PDF generation library: a single relatively simple PNG with transparency layers can take up to a second to embed, and more complex images can take several seconds. Because it was a design goal of ours to implement Prawn in pure Ruby, the official library has a known bottleneck when it comes to PNGs with alpha layers. Perhaps unsurprisingly, shortly after this problem was discovered in Prawn, a user came along and introduced an extension that replaces our PNG generation code with something that calls out to ImageMagick. While this code drags in far more complex dependencies, it increases performance dramatically, rendering in just 0.03s an image that takes 0.9s via our pure Ruby code.
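To give a feel for the kind of work involved, here is a deliberately simplified sketch of alpha splitting in pure Ruby. It is not Prawn's actual implementation: it assumes the pixel data has already been decompressed and unfiltered into a flat string of 8-bit RGBA samples, and the `split_alpha` name is invented for this example.

```ruby
# Splits 8-bit RGBA pixel data into separate color and alpha byte strings.
# Assumes four bytes per pixel in R, G, B, A order, with no scanline filter bytes.
def split_alpha(rgba)
  rgb   = []
  alpha = []

  # Every pixel costs several Ruby-level operations, which is why this
  # kind of loop becomes a bottleneck on large images.
  rgba.unpack("C*").each_slice(4) do |r, g, b, a|
    rgb.push(r, g, b)
    alpha << a
  end

  [rgb.pack("C*"), alpha.pack("C*")]
end

# Two pixels: half-transparent red, then mostly transparent green.
pixels = [255, 0, 0, 128, 0, 255, 0, 64].pack("C*")
rgb, alpha = split_alpha(pixels)

p rgb.bytes   # => [255, 0, 0, 0, 255, 0]
p alpha.bytes # => [128, 64]
```

Nothing here is algorithmically complicated. The cost comes from running this loop body once per pixel in an interpreted language, which is the situation where moving the work into C tends to pay off.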

A couple of years later, a standalone pure Ruby PNG processing library, ChunkyPNG, was built, and it exhibited the same performance characteristics as Prawn's PNG code. Soon enough, history repeated itself and a library called OilyPNG was released, which keeps the same API as ChunkyPNG but rewrites some of the core methods in C. This pattern repeats itself all over the Ruby world, with everything from database adapters to JSON parsers having pure Ruby and C-based alternatives that allow developers to choose whether they want a codebase that is simpler to maintain or one that is faster. Because this is a real trade-off we need to face in Ruby, it's worth having at least a working knowledge of how to build C extensions or FFI bindings if you plan to do any performance-intensive work in Ruby.

That said, Ruby is only "slow" for truly low-level, computationally expensive operations. For pretty much everything else, learning how to write smarter code is the way to go.

Turn the page if you're taking the linear tour, or feel free to jump around via the sidebar.
