Skip to content
Permalink
Browse files

Rack::Lint no longer requires a Content-Length response header

  • Loading branch information...
rtomayko committed Mar 6, 2009
1 parent 1270a6d commit 86ddc7a6ec68d7b6951c2dbd07947c4254e8bc0d
Showing with 19 additions and 49 deletions.
  1. +19 −35 lib/rack/lint.rb
  2. +0 −14 test/spec_rack_lint.rb
@@ -374,59 +374,43 @@ def check_content_type(status, headers)

## === The Content-Length
def check_content_length(status, headers, env)
chunked_response = false
headers.each { |key, value|
if key.downcase == 'transfer-encoding'
chunked_response = value.downcase != 'identity'
end
}

headers.each { |key, value|
if key.downcase == 'content-length'
## There must be a <tt>Content-Length</tt>, except when the
## +Status+ is 1xx, 204 or 304, in which case there must be none
## given.
## There must not be a <tt>Content-Length</tt> header when the
## +Status+ is 1xx, 204 or 304.
assert("Content-Length header found in #{status} response, not allowed") {
not Rack::Utils::STATUS_WITH_NO_ENTITY_BODY.include? status.to_i
}

assert('Content-Length header should not be used if body is chunked') {
not chunked_response
}

bytes = 0
string_body = true

@body.each { |part|
unless part.kind_of?(String)
string_body = false
break
end
if @body.respond_to?(:to_ary)

This comment has been minimized.

Copy link
@dkubb

dkubb Mar 13, 2009

Contributor

Should this line of code be checking if it responds to #to_ary, and then using #each inside the block? I guess most of the time something that can respond to #to_ary will be able to handle #each, but it’s not really gauranteed.

Was the intention within the block to do something like @body.to_ary.each { … } ?

@body.each { |part|
unless part.kind_of?(String)
string_body = false
break
end

bytes += Rack::Utils.bytesize(part)
}

if env["REQUEST_METHOD"] == "HEAD"
assert("Response body was given for HEAD request, but should be empty") {
bytes == 0
bytes += Rack::Utils.bytesize(part)
}
else
if string_body
assert("Content-Length header was #{value}, but should be #{bytes}") {
value == bytes.to_s

if env["REQUEST_METHOD"] == "HEAD"
assert("Response body was given for HEAD request, but should be empty") {
bytes == 0
}
else
if string_body
assert("Content-Length header was #{value}, but should be #{bytes}") {
value == bytes.to_s
}
end
end
end

return
end
}

if [ String, Array ].include?(@body.class) && !chunked_response
assert('No Content-Length header found') {
Rack::Utils::STATUS_WITH_NO_ENTITY_BODY.include? status.to_i
}
end
end

## === The Body
@@ -222,13 +222,6 @@ def env(*args)
end

specify "notices content-length errors" do
lambda {
Rack::Lint.new(lambda { |env|
[200, {"Content-type" => "text/plain"}, []]
}).call(env({}))
}.should.raise(Rack::Lint::LintError).
message.should.match(/No Content-Length/)

[100, 101, 204, 304].each do |status|
lambda {
Rack::Lint.new(lambda { |env|
@@ -238,13 +231,6 @@ def env(*args)
message.should.match(/Content-Length header found/)
end

lambda {
Rack::Lint.new(lambda { |env|
[200, {"Content-type" => "text/plain", "Content-Length" => "0", "Transfer-Encoding" => "chunked"}, []]
}).call(env({}))
}.should.raise(Rack::Lint::LintError).
message.should.match(/Content-Length header should not be used/)

lambda {
Rack::Lint.new(lambda { |env|
[200, {"Content-type" => "text/plain", "Content-Length" => "1"}, []]

8 comments on commit 86ddc7a

@rtomayko

This comment has been minimized.

Copy link
Contributor Author

replied Mar 13, 2009

I’ve had the same question myself. I’ve seen examples of both ways. Not actually calling #to_ary seems to be more prevalent. I think it’s a bit confusing, though and am happy to change it.

Also, what’s your feeling on these changes in general? I was hoping to run it by you since you were involved with the Thin Content-Length vs. chunked discussion that led to the previous behavior.

@dkubb

This comment has been minimized.

Copy link
Contributor

replied Mar 13, 2009

rtokayko: I'm not sure. In general I think it should be consistent: the respond_to? check should match the method you're using. I always thought that the purpose of the to_ary method was to indicate the object is effectively an Array. In this case I'd consider pairing @body.respond_to?(:to_ary) with @body.to_ary.each { ... }, but I might be more inclined to pair it with Array(body).each { … } instead.

As far as these changes I am fine with them. At the time, I thought it was the best course of action to update Rack::Lint to enforce either Content-Length or Transfer-Encoding is returned. The new implementation, where the handlers add these headers is better than ensuring every Rack app do it. My primary concern is the optimal headers are sent to the client.

I was actually thinking it might be nice to have some sort of HTTP proxy (HTTP::Lint?) that could sit in front of a server and inspect the HTTP headers, logging when something doesn’t match the specs. Or perhaps a suite of test cases that could be used to verify a server implements the spec properly.

@raggi

This comment has been minimized.

Copy link
Member

replied Mar 13, 2009

I think it shouldn’t be required, proxying #each is trivial, and then requires only a single implementation of an enumerable interface.

Using both just means people might be tempted to do stuff like:

  def to_ary
    self
  end

Or the like. I’m interested to know why we need to coerce to a sepecific type – this seems very anti-ruby / duck typing.

@raggi

This comment has been minimized.

Copy link
Member

replied Mar 13, 2009

Oh, and -2 to Array(@body), that’ll do inconsistent things with non-array datastructures, that’ll lead to hard to debug code paths, e.g.

run lambda { |env| [200, {}, DeferrableBody.new] }
...
Array(@body) #=> [#<DeferrableBody ...>]

And that breaches the rack spec now, even though the body handed back conformed to “responds to #each and yields string objects”. I could subclass under Array, but strictly speaking, it’s a streaming interface, and does not have an array like nature.

@sporkmonger

This comment has been minimized.

Copy link

replied Mar 13, 2009

I’m inclined to say that checking for to_ary and then calling each is a mistake. Consider the edge case where someone writes a class that could potentially be treated as say, both an Array and a Hash, but the each method implementation performs like a Hash and sends back a key and a value.

I’d probably argue that such a class is a bad idea to begin with, but I think you’ll find that there are no shortages of bad ideas that have managed to find their way into actual code.

It’s much more obvious to anyone reading the code that the behavior is correct if you call @body.to_ary.each { ... }.

@sporkmonger

This comment has been minimized.

Copy link

replied Mar 13, 2009

@raggi: Duck typing is not a universal good. In widely-used public APIs where the desired object is one of String, Symbol, Array, Hash, or Proc, an implicit cast is generally preferred to a duck type. Duck typing has always been a powerful, but sharp tool, and if you’re not careful, you are absolutely in danger of cutting yourself with poor assumptions about how method names map to method behavior. The more widely-used the API, the greater the danger of introducing hard-to-debug issues. Implicit casts have their own share of danger, but they’re much, much easier to specify in an API.

As for to_ary returning self, anyone who does that deserves to have their code broken, unless self.kind_of?(Array). Doing a type check on to_ary’s return value would be total overkill, but in some cases (not here) might be justified, if for no other reason than to train people that to_ary should always return an Array.

@rtomayko

This comment has been minimized.

Copy link
Contributor Author

replied Mar 13, 2009

@raggi said: “I’m interested to know why we need to coerce to a sepecific type – this seems very anti-ruby / duck typing.”

We’re not coercing to a certain type – #to_ary is an implicit cast, not an explicit cast like #to_a. Objects that aren’t Arrays (or extremely Array like) must not respond to it. The reason we check for arrayness is that we can’t calculate the Content-Length on anything that isn’t of fixed/known size. We can’t iterate over the body unless we know it’s already completely available and restartable.

@raggi

This comment has been minimized.

Copy link
Member

replied Mar 13, 2009

@rtomayko: got it, agreed, and +1

:-)

Please sign in to comment.
You can’t perform that action at this time.