gzip issue with jruby and Rack::ETag #571
Comments
This is happening during Rack::ETag's
This code assumes that What this means is:
Here is an example stand-alone script, so you can play around with JRuby and MRI's GzipStreams. Three solutions exist, that I see:
Good catch. I wonder whether Rack::ETag needs to duplicate the body as a string in memory like that at all. It seems rather wasteful, especially with large responses. Would it be possible to update the MD5 digest incrementally, as is done here: http://www.artima.com/forums/flat.jsp?forum=123&thread=40956? Something like:
The only issue I see with this is that digest_body currently returns both the digest and the parts array, so it would need refactoring to return just the digest. The caller already has a reference to the body, so it could simply call .each on it again.
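The incremental hashing idea above can be sketched as follows. This is a minimal illustration, not the code from the linked thread and not Rack::ETag's actual implementation; the helper name `incremental_md5` is hypothetical. It relies only on `Digest::MD5` from Ruby's standard library, whose instances accept data piece by piece via `<<`.

```ruby
require 'digest/md5'

# Hypothetical helper (not part of Rack): hash a Rack-style body, i.e. any
# object that responds to #each and yields Strings, without joining the
# parts into one big String first.
def incremental_md5(body)
  digest = Digest::MD5.new
  saw_data = false

  body.each do |part|
    next if part.empty?
    saw_data = true
    digest << part          # feed each part into the running digest
  end

  # Return nil for an empty body so the caller can skip the ETag header.
  saw_data ? digest.hexdigest : nil
end

parts = ["hello", " ", "world\n"]

# The incremental digest matches the digest of the concatenated body.
puts incremental_md5(parts) == Digest::MD5.hexdigest(parts.join)
```

Because each part is consumed as soon as it is yielded, this approach keeps no references to the parts. It does assume the body can safely be iterated a second time when the response is actually written.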
MRI (and others, I'm sure) use a copy-on-write policy when dup'ing Strings. I don't think there is much of a performance hit here (except in the case of Deflater preceding ETag). I would like someone else to confirm, though.
Right, here I will defer to a Rack maintainer to explain whether and why
Oops, my bad. Solution 1 (
Which are way too roundabout to commit to Rack::ETag. I think a bug needs to be opened against JRuby.
Your understanding of this issue is much better than mine, so perhaps it is best if you file the bug against JRuby? My current understanding is that with ETags the header needs to go out before the body. So, if JRuby modifies the parts while the ETag logic is also consuming the parts and keeping a reference to them, I can easily see some conflict arising. That's why I brought up incremental hashing above: the logic would not need to keep references to the parts, and it would be fine for JRuby to do its thing (this still assumes the ETag logic sees each part first). Also, the current logic concatenates the parts into a new string after all parts have been yielded and only then calculates the hash, which sounds redundant to me even if Ruby does it pretty efficiently. In any case, the issue is not pressing for me anymore. I now handle compression in nginx.
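The kind of conflict described above can be simulated in plain Ruby. This is only a sketch under an assumption: the `ReusedBufferBody` class is hypothetical and merely imitates a producer that reuses one String as its output buffer. It is not JRuby's actual gzip code.

```ruby
# Hypothetical body whose #each yields the SAME String object every time,
# overwriting its contents between yields (a simulated reused buffer).
class ReusedBufferBody
  def initialize(chunks)
    @chunks = chunks
  end

  def each
    buffer = String.new
    @chunks.each do |chunk|
      buffer.replace(chunk)  # overwrite the shared buffer in place
      yield buffer
    end
  end
end

body = ReusedBufferBody.new(["aaa", "bbb", "ccc"])

# Keeping references: every element ends up pointing at the final contents.
kept_refs = []
body.each { |part| kept_refs << part }

# Copying each part as it is yielded preserves the original contents.
copies = []
body.each { |part| copies << part.dup }

p kept_refs  # => ["ccc", "ccc", "ccc"]
p copies     # => ["aaa", "bbb", "ccc"]
```

A consumer that stores the yielded parts and reads them later sees corrupted data with such a producer, while one that copies or hashes each part immediately does not.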
Reusing buffers in As far as some of the other proposals here, none of them address the issue that in this case the fault is outside of Rack.
Filed a bug against JRuby for this: jruby/jruby#1371
See the following: rack/rack#571 jruby/jruby#1371
I'm running into a weird issue that seems to be JRuby-specific, involving some interaction between Rack::ETag and Rack::Deflater.
The application below returns invalid gzip content:
use Rack::ETag
use Rack::Deflater

class TestApp
  def call(env)
    [200, {}, ["hello\n"]]
  end
end

run TestApp.new
I test this using
$ rackup
$ wget -q -O- --header 'accept-Encoding: gzip' 'http://localhost:9292/' | gzcat
gzip: stdin: not in gzip format
As soon as I remove the Rack::ETag line, everything is fine. Everything also works on MRI; only JRuby is problematic.
I'm using jruby-1.7.4 and have these gems installed:
jruby-launcher (1.0.16 java)
rack (1.5.2)
rack-protection (1.5.0)
rake (10.0.3)
sinatra (1.4.2)
tilt (1.4.1)