Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Fixing asset number for asset_path with %d to be consistent in ruby 1.9 #2005

Merged
merged 1 commit into from Jul 7, 2011

Conversation

Projects
None yet
5 participants
Contributor

acroca commented Jul 7, 2011

When using wildcard %d in assets path is suposed to have a random-like number between [0-3].
It works using String.hash instance method, but in ruby 1.9 this method is not returning always the same hash for the same string.
Just run "hi".hash twice in two different irb consoles. (In the same it works, I read that have some random seed...)

So in production, every server instance have a different asset number. Or with one server, every restart has a different number too.

The asset is still available but in the browser side is better to have the same asset ID because the browser can cache the asset just once and not 4 times.

I've changed this to use a String.bytes.sum instead of String.hash

@acroca acroca Using the sum of bytes instead the hash of the path when replacing th…
…e wildcard of the assets path because in ruby 1.9 is not consistent
f6a6b51
Member

josh commented Jul 7, 2011

Yeah, depending on #hash is a bad idea. I like this better.

@josevalim josevalim added a commit that referenced this pull request Jul 7, 2011

@josevalim josevalim Merge pull request #2005 from acroca/master
Fixing asset number for asset_path with %d to be consistent in ruby 1.9
453f222

@josevalim josevalim merged commit 453f222 into rails:master Jul 7, 2011

Just wondering, is the sum better than just taking source.bytes % 4? I think source.bytes will generally be evenly distributed mod 4, a priori I see no bias in the number of bytes of an asset.

Owner

fxn replied Jul 7, 2011

Well, the last char may have a bias, it will generally be "s", (css, js), g (png, jpg)... nah, forget the comment.

Owner

tenderlove replied Jul 7, 2011

Is this hash supposed to be unique?

This is not a good assertion. Are we not allowed to use strings as hash keys in our implementation?

Owner

tenderlove commented Jul 7, 2011

Why are we not using MD5 or a Zlib.crc32 for this? Summing the bytes seems like it might be slow, and it can cause collisions (though I'm not sure if we should worry about collisions).

Member

josh commented Jul 7, 2011

@tenderlove this has nothing to do with asset bodies. We just want a consistent value for each asset bundle. The hash is being called on a filename like "application.js".bytes.sum.

Owner

tenderlove commented Jul 7, 2011

@josh gotcha. We may want to consider using Zlib.crc32 anyway:

irb(main):018:0> M = 100000
(irb):18: warning: already initialized constant M
=> 100000
irb(main):019:0> Benchmark.realtime { M.times { "foooo.rb".bytes.sum } }
=> 0.3236
irb(main):020:0> Benchmark.realtime { M.times { Zlib.crc32 "foooo.rb" } }
=> 0.047122
irb(main):021:0>
Member

josh commented Jul 7, 2011

thats hilarious. unpacking string bytes is that slow in ruby. using a simple crc seems good then.

Owner

tenderlove commented Jul 7, 2011

@josh well, it's not just an unpack. We have to consider that the bytes.sum solution calls 2 + (n - 1) ruby methods (2 for the calls to bytes and sum and n - 1 for the calls to + across n elements of the array), vs the Zlib.crc32 solution that has one Ruby level method call and the rest are outside the VM.

Member

josh commented Jul 7, 2011

yeah, if the crc code is C, it can work directly with the bytes.

anyone want to patch it :D

Contributor

acroca commented Jul 7, 2011

Yes, this looks good. I was thinking about a easy and fast way to get a random-like number based on the string and I thought that MD5 or SHA1 was slower than just bytes.sum.
If you find any faster way to get this kind of number should be good :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment