Fixing asset number for asset_path with %d to be consistent in ruby 1.9 #2005

Merged
merged 1 commit into from Jul 7, 2011

Projects

None yet

5 participants

@acroca
acroca commented Jul 7, 2011

When using wildcard %d in assets path is suposed to have a random-like number between [0-3].
It works using String.hash instance method, but in ruby 1.9 this method is not returning always the same hash for the same string.
Just run "hi".hash twice in two different irb consoles. (In the same it works, I read that have some random seed...)

So in production, every server instance have a different asset number. Or with one server, every restart has a different number too.

The asset is still available but in the browser side is better to have the same asset ID because the browser can cache the asset just once and not 4 times.

I've changed this to use a String.bytes.sum instead of String.hash

@acroca acroca Using the sum of bytes instead the hash of the path when replacing th…
…e wildcard of the assets path because in ruby 1.9 is not consistent
f6a6b51
@josh
Member
josh commented Jul 7, 2011

Yeah, depending on #hash is a bad idea. I like this better.

@josevalim josevalim merged commit 453f222 into rails:master Jul 7, 2011
@fxn

Just wondering, is the sum better than just taking source.bytes % 4? I think source.bytes will generally be evenly distributed mod 4, a priori I see no bias in the number of bytes of an asset.

Member
fxn replied Jul 7, 2011

Well, the last char may have a bias, it will generally be "s", (css, js), g (png, jpg)... nah, forget the comment.

Member

Is this hash supposed to be unique?

@tenderlove

This is not a good assertion. Are we not allowed to use strings as hash keys in our implementation?

@tenderlove
Member

Why are we not using MD5 or a Zlib.crc32 for this? Summing the bytes seems like it might be slow, and it can cause collisions (though I'm not sure if we should worry about collisions).

@josh
Member
josh commented Jul 7, 2011

@tenderlove this has nothing to do with asset bodies. We just want a consistent value for each asset bundle. The hash is being called on a filename like "application.js".bytes.sum.

@tenderlove
Member

@josh gotcha. We may want to consider using Zlib.crc32 anyway:

irb(main):018:0> M = 100000
(irb):18: warning: already initialized constant M
=> 100000
irb(main):019:0> Benchmark.realtime { M.times { "foooo.rb".bytes.sum } }
=> 0.3236
irb(main):020:0> Benchmark.realtime { M.times { Zlib.crc32 "foooo.rb" } }
=> 0.047122
irb(main):021:0>
@josh
Member
josh commented Jul 7, 2011

thats hilarious. unpacking string bytes is that slow in ruby. using a simple crc seems good then.

@tenderlove
Member

@josh well, it's not just an unpack. We have to consider that the bytes.sum solution calls 2 + (n - 1) ruby methods (2 for the calls to bytes and sum and n - 1 for the calls to + across n elements of the array), vs the Zlib.crc32 solution that has one Ruby level method call and the rest are outside the VM.

@josh
Member
josh commented Jul 7, 2011

yeah, if the crc code is C, it can work directly with the bytes.

anyone want to patch it :D

@acroca
acroca commented Jul 7, 2011

Yes, this looks good. I was thinking about a easy and fast way to get a random-like number based on the string and I thought that MD5 or SHA1 was slower than just bytes.sum.
If you find any faster way to get this kind of number should be good :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment