Packed arrays #3988
This branch is an implementation of packed arrays for one- and two-element Array instances.
There are three new classes: RubyArraySpecialized, a superclass for all the specialized instances, and RubyArrayOneObject and RubyArrayTwoObject, the one- and two-element versions of RubyArray. Generally these are used when constructing a new array known to have one or two elements. If the size of the array needs to change from the packed size, it "unpacks" into a regular backing array and proceeds to use the normal array-based RubyArray logic.
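The pack-then-unpack behavior described above can be sketched roughly as follows. This is a simplified, hypothetical illustration, not the code in this branch: `SmallArray`, its fields, and its methods are invented names, and it stores plain `Object` references rather than JRuby's own types.

```java
import java.util.Arrays;

// Hypothetical stand-in for a packed one- or two-element array.
// Elements live in fields until the size has to change.
final class SmallArray {
    private Object first;     // packed slot 0
    private Object second;    // packed slot 1 (unused for one-element arrays)
    private Object[] values;  // null while packed; backing array once unpacked
    private int size;

    private SmallArray(Object first, Object second, int size) {
        this.first = first;
        this.second = second;
        this.size = size;
    }

    static SmallArray of(Object a) { return new SmallArray(a, null, 1); }

    static SmallArray of(Object a, Object b) { return new SmallArray(a, b, 2); }

    boolean isPacked() { return values == null; }

    int size() { return size; }

    Object get(int index) {
        checkIndex(index);
        if (values != null) return values[index];  // unpacked: one extra hop through the array
        return index == 0 ? first : second;        // packed: read the field directly
    }

    // Replacing an element keeps the size, so a packed array stays packed.
    void set(int index, Object value) {
        checkIndex(index);
        if (values != null) values[index] = value;
        else if (index == 0) first = value;
        else second = value;
    }

    // Appending changes the size, so a packed array must unpack first.
    void append(Object value) {
        if (values == null) unpack();
        if (size == values.length) values = Arrays.copyOf(values, size * 2);
        values[size++] = value;
    }

    // One-way transition: copy the packed fields into a backing array.
    private void unpack() {
        values = new Object[4];
        values[0] = first;
        if (size == 2) values[1] = second;
        first = null;
        second = null;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
    }
}
```

In this sketch, `SmallArray.of("a", "b")` never allocates a backing array unless `append` is called, which mirrors the one-way "unpack" transition described above.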
In addition to the many small arrays used by RubyGems, there are many pieces of code in stdlib and other libraries that create small arrays as transient data carriers. We also use small arrays in many places in the JRuby runtime for passing block arguments, splatted arguments, and so on. All of these cases can benefit from the packing.
The two primary benefits of this work are reduced memory usage for one- and two-element arrays, and reduced dereferencing requirements for those arrays (since we don't have to hop through an IRubyObject[] to get to the elements).
Memory-wise, the packed arrays add one or two references to the size of RubyArray, but eliminate the much larger IRubyObject[] header required for arbitrarily-sized arrays. This translates to 60% savings in the one-object case and around 40% savings in the two-object case.
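As a rough illustration of why dropping the backing array matters, the following sketch estimates the bytes involved. The layout constants are assumptions (typical 64-bit HotSpot defaults with compressed references), `ArrayFootprintSketch` is a hypothetical name, and the sketch does not attempt to reproduce the percentages above, which depend on the full size of the RubyArray object itself.

```java
// Back-of-the-envelope estimate; every constant below is an assumption.
final class ArrayFootprintSketch {
    static final int ARRAY_HEADER_BYTES = 16; // assumed: 12-byte object header + 4-byte length
    static final int REFERENCE_BYTES = 4;     // assumed: compressed references
    static final int ALIGNMENT_BYTES = 8;     // assumed: default object alignment

    // Round up to the next multiple of the alignment.
    static long align(long bytes) {
        return (bytes + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    // Estimated size of a separate reference array of the given length.
    static long backingArrayBytes(int length) {
        return align(ARRAY_HEADER_BYTES + (long) REFERENCE_BYTES * length);
    }

    // Extra bytes added to the owning object when elements are stored as fields.
    static long packedFieldBytes(int elements) {
        return (long) REFERENCE_BYTES * elements;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2}) {
            System.out.printf(
                "%d element(s): exact-size array %d bytes, 16-slot array %d bytes, packed fields %d bytes%n",
                n, backingArrayBytes(n), backingArrayBytes(16), packedFieldBytes(n));
        }
    }
}
```

Under these assumptions the array header dominates the cost of a tiny array, so replacing the array with one or two fields removes most of the per-array overhead.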
Analysis of a Rails app
An analysis of a blank Rails app in development mode shows me the following, after this patch:
There are 909 live RubyArrayOneObject instances and 565 RubyArrayTwoObject instances, consuming about 102KB of memory. If these were all regular RubyArray instances with a perfectly-sized IRubyObject[], that would be about doubled. If those RubyArray instances had the usual "base" size of 16 elements, it would be considerably higher.
Of the 3046 remaining RubyArray instances, 1626 have one element and 1045 have two elements. These are opportunities to improve our use of packing, but they will require deeper analysis. There are many ways to produce an Array, and the packing logic I've written here can't handle resizes, so many of these arrays may have started out at a different size. In the future we may look at speculatively allocating arrays we know will eventually be packable and adding a way to go from unpacked form to packed form (only the reverse exists now).
Of the 909 RubyArrayOneObject instances, only 2 ended up having to unpack (about 0.2%). Of the 565 RubyArrayTwoObject instances, 37 unpacked (about 6.5%). This tells me we're doing a good job of choosing a packed version only where it won't just end up unpacking.
There are only 190 three-element RubyArray and 229 four-element RubyArray instances in the remaining pool. The benefit of packing drops off quickly after one- and two-element arrays.
A large app with more "data" may show greater benefit from this patch, especially if it uses many small arrays to represent any data (e.g. a large binary tree).
Doing an allocation profile of that same Rails app, booting up the server and hitting the info page and an error page, I see the following results:
There are a few risks with this patch: