Skip to content

Fix default Content-Type (RFC2616 7.2.1) #644

Open
wants to merge 4 commits into from

5 participants

@yegortimoshenko

This patch fixes the behaviour of Content-Type header if it was not included in the request according to RFC 2616 section 7.2.1. It also fixes sinatra/sinatra#804 issue.

@yegortimoshenko

@raggi This patch seems to break quite a lot of tests, had no idea. Should I rewrite them so it will pass the test suite?

@rkh
Official Rack repositories member
rkh commented Feb 21, 2014

Getting this tests to pass again would be much appreciated.

@guillec
guillec commented Mar 1, 2014

I sent @somu a pr that fixes test.

@raggi
Official Rack repositories member
raggi commented Mar 6, 2014

I'm not certain this patch is what we want. The API essentially currently has the semantic that if no type is provided by the client, then we expose that information in the form of a nil value here. This could be important for users doing content autodetection through mime magic / magic hints, etc. I would have some agreement that application/octet-stream suggests the same should be done, but I'm not sure that my initial example of mime autodetection is the only use case here. We could also break basic REST apps for client that don't send content-type JSON in requests, where the server code uses the nil to fall-through to a JSON default.

@rtyler
rtyler commented Jul 1, 2014

@raggi Is there some form of this change that you might find more acceptable? Perhaps a Rack.strictly_rfc! or something like that where behavior following RFC could be turned on?

My team just spent a couple of days pulling their head out trying to understand why our Rack-based app was throwing a 500, which wasn't logging a traceback, when a client was POSTing binary data.

server logs:

127.0.0.1 - - [01/Jul/2014 09:17:10] "POST /api/public/event HTTP/1.1" 500 - 0.1840     

client logs

Status: 500
content-type: text/plain
transfer-encoding: chunked
connection: close
Body: invalid byte sequence in UTF-8

To me there are two problems here:

  1. Rack is not following the RFC and is choosing an incorrect default
  2. Rack is silently failing making it nigh impossible to debug.

I'm definitely willing to spend some weekend cycles coming up with a fix that you might find more acceptable

@raggi
Official Rack repositories member
raggi commented Jul 2, 2014

Can you explain how this alone was causing a 500 in your app?

In the current RFCs (7231, 3.1.1.5), the language is yet more clear:

   A sender that generates a message containing a payload body SHOULD
   generate a Content-Type header field in that message unless the
   intended media type of the enclosed representation is unknown to the
   sender.  If a Content-Type header field is not present, the recipient
   MAY either assume a media type of "application/octet-stream"
   ([RFC2046], Section 4.5.1) or examine the data to determine its type.

Please note the last part. If we accept this patch, it becomes cumbersome for applications to perform the latter.

Here's an example application that provides the content type of it's message bodies:

require 'mimemagic'
run lambda { |env|
  req = Rack::Request.new(env)
  content_type = req.content_type || MimeMagic.by_magic(req.body)
  [content_type ? 200 : 404, {}, [content_type]]
}
@raggi raggi added this to the Rack 1.6 milestone Jul 6, 2014
@rtyler
rtyler commented Jul 6, 2014

@raggi This repository contains a good reproduction case of the behavior we were seeing.

In our use-case we're posting Protobufs, which are binary object serializations, to a Sinatra application. Due to a client error (which was a whole other bag of frustration) the client was not properly setting it's Content-Type header to application/x-protobuf/

Below is a transcript of the server side of the error from the repository:

[20:42:34] tyler:rack-500 git:(master*) $ RACK_ENV=production rackup -p 9111
[2014-07-05 20:42:36] INFO  WEBrick 1.3.1
[2014-07-05 20:42:36] INFO  ruby 2.1.0 (2013-12-25) [x86_64-freebsd11.0]
[2014-07-05 20:42:36] INFO  WEBrick::HTTPServer#start: pid=1064 port=9111
LOG! [{"CONTENT_LENGTH"=>"27", "CONTENT_TYPE"=>"application/x-www-form-urlencoded", "GATEWAY_INTERFACE"=>"CGI/1.1", "PATH_INFO"=>"/crash", "QUERY_STRING"=>"", "REMOTE_ADDR"=>"127.0.0.1", "REMOTE_HOST"=>"localhost", "REQUEST_METHOD"=>"POST", "REQUEST_URI"=>"http://localhost:9111/crash", "SCRIPT_NAME"=>"", "SERVER_NAME"=>"localhost", "SERVER_PORT"=>"9111", "SERVER_PROTOCOL"=>"HTTP/1.1", "SERVER_SOFTWARE"=>"WEBrick/1.3.1 (Ruby/2.1.0/2013-12-25)", "HTTP_ACCEPT_ENCODING"=>"gzip;q=1.0,deflate;q=0.6,identity;q=0.3", "HTTP_ACCEPT"=>"*/*", "HTTP_USER_AGENT"=>"Ruby", "HTTP_CONNECTION"=>"close", "HTTP_HOST"=>"localhost:9111", "rack.version"=>[1, 2], "rack.input"=>#<StringIO:0x00000806c46ea0>, "rack.errors"=>#<IO:<STDERR>>, "rack.multithread"=>true, "rack.multiprocess"=>false, "rack.run_once"=>false, "rack.url_scheme"=>"http", "HTTP_VERSION"=>"HTTP/1.1", "REQUEST_PATH"=>"/crash", "rack.logger"=>#<Rack::NullLogger:0x00000806c3baa0 @app=#<AStupidMiddleware:0x00000806c3bac8 @app=#<LoggingMiddleware:0x00000806c3baf0 @app=#<MyApi:0x00000806c469f0 @default_layout=:layout, @preferred_extension=nil, @app=nil, @template_cache=#<Tilt::Cache:0x00000806c469c8 @cache={}>>, @logger=nil>>>, "sinatra.commonlogger"=>true, "rack.request.query_string"=>"", "rack.request.query_hash"=>{}, "rack.request.form_input"=>#<StringIO:0x00000806c46ea0>, "rack.request.form_hash"=>{"\n\u0019I'm a UTF-8 safe protobuf"=>nil}, "rack.request.form_vars"=>"\n\x19I'm a UTF-8 safe protobuf", "sinatra.route"=>"POST /crash"}, 200, {"Content-Type"=>"text/html;charset=utf-8", "Content-Length"=>"6"}, 2014-07-05 20:42:42 -0700]
I should raise ArgumentError but I can't process input either!
[2014-07-05 20:42:42] ERROR ArgumentError: invalid byte sequence in UTF-8
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/utils.rb:104:in `normalize_params'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/utils.rb:96:in `block in parse_nested_query'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/utils.rb:93:in `each'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/utils.rb:93:in `parse_nested_query'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/request.rb:373:in `parse_query'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/request.rb:211:in `POST'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/request.rb:225:in `params'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/sinatra-1.4.5/lib/sinatra/base.rb:893:in `call!'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/sinatra-1.4.5/lib/sinatra/base.rb:886:in `call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/commonlogger.rb:33:in `call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/sinatra-1.4.5/lib/sinatra/base.rb:217:in `call'
    /usr/home/tyler/source/github/rack-500/app.rb:10:in `call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/nulllogger.rb:9:in `call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/head.rb:11:in `call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/sinatra-1.4.5/lib/sinatra/base.rb:180:in `call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/sinatra-1.4.5/lib/sinatra/base.rb:2014:in `call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1478:in `block in call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1788:in `synchronize'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/sinatra-1.4.5/lib/sinatra/base.rb:1478:in `call'
    /home/tyler/.rvm/gems/ruby-2.1.0@rubygems/gems/rack-1.5.2/lib/rack/handler/webrick.rb:60:in `service'
    /home/tyler/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/webrick/httpserver.rb:138:in `service'
    /home/tyler/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/webrick/httpserver.rb:94:in `run'
    /home/tyler/.rvm/rubies/ruby-2.1.0/lib/ruby/2.1.0/webrick/server.rb:295:in `block in start_thread'

Then from the client side:

[20:41:04] tyler:rack-500 git:(master*) $ bundle exec rake compile && ./test_post
protoc --ruby_out=./lib  -I proto ./proto/*.proto
POSTing to http://localhost:9111//crash
body: "\n\x19I'm a UTF-8 safe protobuf"
====response====
Status: 200
Body: Hello!
POSTing to http://localhost:9111//crash
body: "\n\x13I'm a bad protobuf!\x10\xF8U"
====response====
Status: 500
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<HTML>
  <HEAD><TITLE>Internal Server Error</TITLE></HEAD>
  <BODY>
    <H1>Internal Server Error</H1>
    invalid byte sequence in UTF-8
    <HR>
    <ADDRESS>
     WEBrick/1.3.1 (Ruby/2.1.0/2013-12-25) at
     localhost:9111
    </ADDRESS>
  </BODY>
</HTML>

The line ERROR ArgumentError: invalid byte sequence in UTF-8 is written to $stderr. In our case we are using raven-ruby for exception tracing into Sentry, along with a custom logging middleware. In our environment, I think somewhere along the line rack.errors is getting changed away from $stderr and is swallowing up any information about the error (perhaps some behavior in Puma is causing NullLogger to be used).

A couple of suggestions that I think might make Rack more resilient to failure here:

  • If the form processing/decoding code fails for whatever reason, logging to rack.errors and setting the request params to nil.
  • A flag to explicitly disable this auto-detection behavior
@raggi
Official Rack repositories member
raggi commented Jul 6, 2014

The form media type is being passed by the client in this example, which is why when Sinatra is calling Rack::Request#params we're trying to parse the params as form encoded data. This crash is valid. Sinatra should be catching the error and returning a 400.

@rtyler
rtyler commented Jul 6, 2014

@somu I've been chatting the issue over with @raggi quite a bit, and I've now come full circle. While I believe the right fix here would be to update Rack::Request#form_data? to be the following:

    def form_data?
      type = media_type
      meth = env["rack.methodoverride.original_method"] || env['REQUEST_METHOD']
      # NOTE: the change in the following conditional, if the media_type is nil, this method will now return false
      (meth == 'POST' && !type.nil?) || FORM_DATA_MEDIA_TYPES.include?(type)
    end

The problem is that this change cannot be made without requiring a major version bump of Rack.

That said, there is good news! This behavior can be easily, and safely, added with the a Rack middleware (what a great library):

class DefaultContentTypeMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    # Default to the 'application/octet-stream' content type if one is not set.
    if env['CONTENT_TYPE'].nil? || env['CONTENT_TYPE'].empty?
      env['CONTENT_TYPE'] = 'application/octet-stream'
    end
    @app.call(env)
  end
end

@raggi I think you can close this pull request since we have a work-around, but I wouldn't mind this behavior-change being added to a Rack 2.0 wishlist :)

@raggi
Official Rack repositories member
raggi commented Jul 6, 2014

Great, thanks @rtyler I appreciate you taking the time.

I will mark this for the 2.0 milestone and leave open.

@raggi raggi modified the milestone: Rack 2.0, Rack 1.6 Jul 6, 2014
@yegortimoshenko

@rtyler That will do just nice. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.