
Multipart filename can not be decoded. #323

Open
comron opened this Issue January 20, 2012 · 16 comments

Comron Sattari

I'm using a simple form in a Rails application to upload a file. It looks as if the browser is not encoding the filename in the multipart "Content-Disposition" header. As a result, Rack cannot decode the filename if it contains a %, because it expects the % to be the start of an escape sequence.

<form accept-charset="UTF-8" action="/documents.json" enctype="multipart/form-data" method="post">
  <input name="utf8" type="hidden" value="✓">
  <input id="document_attachment" name="document[attachment]" type="file">
  <input type="submit">
</form>

This results in the following request:

Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Content-Type:multipart/form-data; boundary=----WebKitFormBoundary2NHc7OhsgU68l3Al
Origin:http://localhost:3000
Referer:http://localhost:3000/
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.18 (KHTML, like Gecko) Chrome/18.0.1010.0 Safari/535.18

------WebKitFormBoundary2NHc7OhsgU68l3Al
Content-Disposition: form-data; name="utf8"

✓
------WebKitFormBoundary2NHc7OhsgU68l3Al
Content-Disposition: form-data; name="document[attachment]"; filename="100% of a photo.jpeg"
Content-Type: image/jpeg


------WebKitFormBoundary2NHc7OhsgU68l3Al--

As you can see, the filename is not encoded, but the % causes Rack to throw an error. Where exactly is the problem? It seems I have no control over how the browser (Chrome and Firefox on OS X) encodes this name, and Rack assumes in multipart/parser.rb that it is always encoded.

Thanks and sorry if this is the wrong forum for such a question.
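
For anyone who wants to see the failure without a browser, here is a minimal sketch. It assumes that Ruby's stdlib `URI.decode_www_form_component` behaves like the unescaping Rack applies to the filename. That is an assumption on my part, and `try_decode` is just a hypothetical helper for illustration.

```ruby
require 'uri'

# Filename exactly as it appears in the Content-Disposition header above.
raw_filename = '100% of a photo.jpeg'

# Hypothetical helper: attempt URL-style decoding and report a failure
# as a string instead of letting the exception propagate.
def try_decode(name)
  URI.decode_www_form_component(name)
rescue ArgumentError => e
  "error: #{e.message}"
end

# "% o" is not a valid escape sequence, so decoding is expected to fail.
result = try_decode(raw_filename)
puts result
```

The point is only that a strict percent-decoder rejects a literal % that is not followed by two hex digits.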

Brian Morearty

That's funny, I was just about to file this same issue.

Here is a config.ru file that can be used for testing:

require 'rubygems'
require 'rack'
require 'rack/request'

class App
  def call(env)
    request = Rack::Request.new(env)
    params = request.params  # <== THIS BLOWS UP with percents in the filename

    html = <<-HTML
      <html>
        <head>
          <style>
            html { background-color: lightblue; }
            body { margin: 3em; padding: 3em; font-family: Verdana; background-color: white; border: 1px solid gray; }
            strong { background-color: yellow; }
            .error { color: red; }
          </style>
        </head>
        <body>
          <form accept-charset="UTF-8" action="/" method="POST" enctype="multipart/form-data">
            <p>
              Upload a file <strong>whose filename has a %</strong> but does not follow the URL-encoding pattern.
            </p>
            <p>
              The call to <tt>request.params</tt> fails with "invalid %-encoding" in IE8, IE9, FF9/Mac, Chrome 16/Mac,
              and Safari 5/Mac because the browsers are not URL-encoding the filename but <tt>Rack::Request</tt>
              expects it to be encoded.
            </p>
            <p>
              In development a stack trace is shown.
              When rack is running in production mode (<tt>rackup -E production config.ru</tt>),
              Safari 5 goes into an infinite loop of requests.  The other browsers show error pages.
            </p>
            <input name="upload" type="file" />
            <input type="submit" value="Submit and Explode"/>
          </form>
          <div class='error'>
            #{ "There was no % in that filename" if params['upload'] && !(params['upload'][:filename] =~ /%/) }
          </div>
        </body>
      </html>
    HTML

    [200, {"Content-Type" => "text/html"}, [html]]
  end
end

run App.new
James Tucker
Owner

Can you guys validate: a5f681e

Brian Morearty

@raggi That fixes most cases for me. Certainly much better than before and will work most of the time. But when I tried "a%b.txt" I still got the same error because it's followed by one valid hex letter but not by two. It might be good to make it robust enough to handle that case because Safari goes into an infinite-request loop if it gets this error.
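
One way to tolerate a stray percent, sketched under my own assumptions (this is not the code in the commit above, and `lenient_unescape` and `STRAY_PERCENT` are hypothetical names), is to escape any % that is not followed by two hex digits before decoding:

```ruby
require 'uri'

# Matches a % that is NOT followed by exactly two hex digits,
# e.g. the % in "a%b.txt" (one hex digit) or a trailing %.
STRAY_PERCENT = /%(?![0-9A-Fa-f]{2})/

# Hypothetical helper: turn stray percents into %25 so that a strict
# decoder accepts the string and maps them back to a literal %.
def lenient_unescape(name)
  URI.decode_www_form_component(name.gsub(STRAY_PERCENT, '%25'))
end

puts lenient_unescape('a%b.txt')
puts lenient_unescape('100% of a photo.jpeg')
```

Note that this still decodes every well-formed `%XX` sequence and still treats `+` as a space, so it only addresses the exception, not the question of whether decoding should happen at all.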

James Tucker
Owner

Oh, I screwed up that regex. I'll make some more tests. Thanks for checking.

James Tucker
Owner

@BMorearty it's getting pretty nasty, but, 8d20282 should pass that example too. Any others you can think of?

James Tucker raggi closed this January 22, 2012
Brian Morearty

@raggi Here's one that still fails: '100%' (filename ends with percent--has no extension).

I'm starting to question whether unescaping percents should be done at all. As we discussed before this ticket was filed, RFC 1867 says:

...if the file name of the client's
operating system is not in US-ASCII, the file name might be
approximated or encoded using the method of RFC 1522.

RFC 1522 encoding, though, does not use percents and does not look like URL-encoding (RFC 1738).

Looking at Rack's history, the original call to Utils.unescape for the filename was in a36ac97 to handle "UAs that don't correctly escape Content-Disposition filenames." Looking at the code and the tests, this was introduced because browsers apparently are inconsistent in their treatment of an uploaded filename with quotes in it. (RFC 1867 says nothing about what to do here.)

I believe a36ac97 ended up with a bad side-effect: it fixed filenames with quotes but broke filenames with percents.

Here's what I found with various browsers (Chrome 16/Mac, FF 9/Mac, Safari 5/Mac, IE8/Win)

  • The only ASCII character any browser escapes in the filename attribute of a file field is double-quote. I didn't scientifically try every single ASCII character, but I tried a bunch that you might think would be escaped: double quote ("), single quote ('), percent (%), ampersand (&), question mark (?), backslash (\).
  • The way a double quote is escaped varies by browser:
    • Chrome and Safari: %22
    • Firefox: \"
    • IE: N/A (double-quote is not a valid filename character in NTFS, which is the filesystem used by almost all Windows installations).

Based on this research, here's my proposal for how Rack should handle filenames in a form post:

  • Change %22 into "
  • Change \" into "
  • Do not attempt any other conversions.

This satisfies the intent of the original commit (different UAs do different things with quotes) without having other side-effects.
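
The proposal could be sketched roughly like this. `normalize_upload_filename` is a hypothetical name used for illustration and is not an existing Rack method:

```ruby
# Hypothetical helper illustrating the proposal: undo only the two
# double-quote escapes seen in the wild and leave everything else alone.
def normalize_upload_filename(name)
  name
    .gsub('%22', '"')   # Chrome and Safari send %22 for a double quote
    .gsub('\\"', '"')   # Firefox sends a backslash followed by a double quote
end

puts normalize_upload_filename('my %22quoted%22 file.txt')
puts normalize_upload_filename('100% of a photo.jpeg')
```

With this approach a lone %, a trailing %, and a + all pass through untouched. The one known ambiguity is a file whose real name contains the literal text %22.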

James Tucker
Owner

Thanks for being so thorough!

Out of interest, webkit browsers don't escape % any way do they?

Brian Morearty

No problemo. :-)

No, no browser that I tried escapes % in any way.

But yeah, I know why you're asking. It's true that the server can't tell the difference between double quote and %22.

Brian Morearty

Updated test case that shows a little more information:

require 'rubygems'
require 'rack'
require 'rack/request'

class App
  def call(env)
    request = Rack::Request.new(env)
    params = request.params  # <== THIS BLOWS UP with percents in the filename

    html = <<-HTML
      <html>
        <head>
          <style>
            html { background-color: lightblue; }
            body { margin: 3em; padding: 3em; font-family: Verdana; background-color: white; border: 1px solid gray; }
            strong { background-color: yellow; }
            .error { color: red; }
          </style>
        </head>
        <body>
          <form accept-charset="UTF-8" action="/" method="POST" enctype="multipart/form-data">
            <p>
              Upload a file <strong>whose filename has a %</strong> but does not follow the URL-encoding pattern.
            </p>
            <p>
              The call to <tt>request.params</tt> fails with "invalid %-encoding" in IE8, IE9, FF9/Mac, Chrome 16/Mac,
              and Safari 5/Mac because the browsers are not URL-encoding the filename but <tt>Rack::Request</tt>
              expects it to be encoded.
            </p>
            <p>
              In development a stack trace is shown.
              When rack is running in production mode (<tt>rackup -E production config.ru</tt>),
              Safari 5 goes into an infinite loop of requests.  The other browsers show error pages.
            </p>
            <input name="upload" type="file" />
            <input type="submit" value="Submit and Explode"/>
          </form>
          <div>
            #{ "The last filename was #{params['upload'][:filename]}" if params['upload'] }
          </div>
          <div class='error'>
            #{ "There was no % in that filename (#{params['upload'][:filename]})" if params['upload'] && !(params['upload'][:filename] =~ /%/) }
          </div>
          <div>
            #{request.body.read}
          </div>
        </body>
      </html>
    HTML

    [200, {"Content-Type" => "text/html"}, [html]]
  end
end

run App.new
James Tucker raggi referenced this issue from a commit January 22, 2012
James Tucker Multipart percentage fail, round 3, the final character. Fixes strings terminated with %. See #323. Revisit for 1.5.
7d3c3fd
James Tucker
Owner

This is awesome info. I'm going to reopen this and revisit for a big cleanup in a future release.

James Tucker raggi reopened this January 22, 2012
Brian Morearty

Thanks for getting on it so quick, dude.

James Tucker
Owner

It was mostly good timing. I'll back port stuff to 1.3.x when I do that release, maybe next weekend.

Mitesh Jain

I made the pull request

chneukirchen#24

please check this if it helps

Comron Sattari

@raggi @BMorearty Has the 1.4.1 release of rack solved this for you?

I think it's still a little overzealous in decoding: it tries to decode any %[A-Fa-f0-9]{2}, instead of sticking to %22 and \". I know the case is silly, but a filename like Foo%A0bar.jpg still causes problems because Rack changes the filename to have some funky character in it.
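
To illustrate, here is a small sketch that uses Ruby's stdlib `URI.decode_www_form_component` as a stand-in for Rack's decoding (an assumption; I have not traced the exact call in 1.4.1):

```ruby
require 'uri'

original = 'Foo%A0bar.jpg'

# %A0 happens to be a well-formed escape, so a generic percent-decoder
# replaces the three characters "%A0" with the single byte 0xA0.
decoded = URI.decode_www_form_component(original)

puts decoded.bytes.include?(0xA0)   # the raw byte is now in the filename
puts decoded.valid_encoding?        # a lone 0xA0 byte is not valid UTF-8
puts decoded == original            # the name no longer matches what was uploaded
```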

James Tucker
Owner
raggi commented March 17, 2012

Gosh, the way this stuff gets encoded by these browsers is just silly. The issue is still open; I will consider exactly when and where is the appropriate time and place to address this.

octavpo

This issue is still not completely fixed. I found that if you have a + in a filename, it will be changed into a space. Like for instance 5x+2eq2x8 is changed into 5x 2eq2x8. However if I add a % to the filename, like 5x+2eq2x%8, then it goes through fine.
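
A likely explanation, offered as an assumption rather than something I traced through Rack's source: form-style URL decoding treats + as an encoded space. Ruby's stdlib decoder shows the same behavior:

```ruby
require 'uri'

uploaded = '5x+2eq2x8'

# In application/x-www-form-urlencoded data, "+" stands for a space,
# so a form decoder rewrites it even though the browser sent it literally.
decoded = URI.decode_www_form_component(uploaded)

puts decoded
```

Since browsers send the + in a multipart filename literally, running a form decoder over it changes the name.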
