Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Fix for files and directories with + as a character in the file name b0rking Rack::Directory #265

Open
wants to merge 2 commits into from

7 participants

@VictorLowther

Having a + in file and path names is a perfectly cromulent thing, and unescaping the name magically turns those + es into es

VictorLowther added some commits
@VictorLowther VictorLowther Unescaping PATH_INFO transforms /file_name.file on the filesystem to …
…/file name.file from the POV of the web server, which is a Bad Thing.
245b420
@VictorLowther VictorLowther Unescaping PATH_INFO transforms /file+name.file on the filesystem to …
…/file name.file from the POV of the web server, which is a Bad Thing.
290cbe7
@josh

+s in urls mean spaces.

@josh josh closed this
@VictorLowther
@richmeyers

+ is a reserved character in URLs, whoever is generating a URL with unencoded + needs to encode +.

http://tools.ietf.org/html/rfc3986#section-2.2

@VictorLowther
@lgierth

Maybe that's your HTTP client not escaping + as it thinks it's a space-plus? Rack should definitely stick to the URI spec.

@richmeyers

@VictorLowther it does look like Rack::Directory is not encoding file names, and that would be the bug. The correct fix would be to have Rack::Directory encode file names in directory listings.

@VictorLowther

A but more background:

I encountered this bug when trying to use Rainbows to serve up an Ubuntu install tree for net installs. Out of all the web servers I tried, only Rainbows failed to act as a backing webserver when using Rack::Directory to serve up the directory tree -- the install failed the first time it tried to serve a filename with a + in the name.

To replicate:
sudo gem install rack rainbows
mkdir -p test/foo+bar
echo "here"> test/foo+bar/bar+baz
echo 'run Rack::Directory.new(".")' >test/config.ru
(cd test; rainbows -p 8080)

Then, in a different terminal:
wget -O - --save-headers http://localhost:8080/
wget -O - --save-headers http://localhost:8080/foo+bar/
wget -O - --save-headers http://localhost:8080/foo+bar/bar+baz

wget will print a directory index for the first pull, and get 404 errors for the second and third. The Rainbows output will show:

127.0.0.1 - - [19/Nov/2011 09:27:45] "GET / HTTP/1.0" 200 - 0.0153
127.0.0.1 - - [19/Nov/2011 09:28:14] "GET /foo+bar/ HTTP/1.0" 404 28 0.0024
127.0.0.1 - - [19/Nov/2011 09:31:40] "GET /foo+bar/bar+baz HTTP/1.0" 404 35 0.0023

With the changes in this patch, I get:

127.0.0.1 - - [19/Nov/2011 09:39:55] "GET / HTTP/1.0" 200 - 0.0046
127.0.0.1 - - [19/Nov/2011 09:40:00] "GET /foo+bar/ HTTP/1.0" 200 - 0.0046
127.0.0.1 - - [19/Nov/2011 09:40:07] "GET /foo+bar/bar+baz HTTP/1.0" 200 - 0.0023

Webrick, nginx, thttpd, and apache all do the expected thing. The issue is not whether the client is failing to properly encode requests to the server (unless wget has managed to get it wrong for all these years). The issue is that Rack appears to be doing the Wrong Thing when decoding the requests and matching them against the files I am asking it to serve.

@richmeyers

What happens when you have foo bar and foo+bar in the same directory?

Also, what is in the directory listings for foo'bar and foo"bar?

@VictorLowther
@raggi
Owner

I am patching rack to correct the output of the Directory Index from Rack::Directory.

I will not support not escaping file names at this time. Escaping is expected and required by rack in order to conform to RFC.

If you believe this is in error, feel free to continue the discussion. At this time, supporting reading files from uris that are not correctly escaped will not be supported.

Thank you for your bug report, even though we may not agree on the solution, I have fixed the bug that is apparent in Rack::Directory.

@jeremy
Owner
I will not support not escaping file names at this time. Escaping is expected and required by rack in order to conform to RFC.

Of course file names should be escaped. The bug is that valid URI paths are incorrectly unescaped.

For example, + is explicitly allowed in paths.

From RFC 3986:

      path          = path-abempty    ; begins with "/" or is empty
                    / path-absolute   ; begins with "/" but not "//"
                    / path-noscheme   ; begins with a non-colon segment
                    / path-rootless   ; begins with a segment
                    / path-empty      ; zero characters

      path-abempty  = *( "/" segment )
      path-absolute = "/" [ segment-nz *( "/" segment ) ]
      path-noscheme = segment-nz-nc *( "/" segment )
      path-rootless = segment-nz *( "/" segment )
      path-empty    = 0<pchar>

      segment       = *pchar
      segment-nz    = 1*pchar
      segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                    ; non-zero-length segment without any colon ":"

      pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

Where sub-delims is "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

See also http://bugs.ruby-lang.org/issues/1680

@jeremy
Owner

This is still a bug. Worse, it's been worked around in Rails in an additionally buggy way, to great confusion and gnashing of teeth: rails/rails#11816 (comment)

The proposed fix here (removing unescaping) is wrong. The right fix is to unescape as pchar instead of the blanket Utils.unescape as a query component.

@spastorino spastorino reopened this
@spastorino spastorino added this to the Rack 1.6 milestone
@raggi
Owner

This is fundamentally the CGI heritage, which we should drop, but if we drop it, we have to do that as a major change, as we should drop it across the board, both in escape and unescape.

@spastorino spastorino modified the milestone: Rack 2.0, Rack 1.6
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Nov 16, 2011
  1. @VictorLowther

    Unescaping PATH_INFO transforms /file_name.file on the filesystem to …

    VictorLowther authored
    …/file name.file from the POV of the web server, which is a Bad Thing.
  2. @VictorLowther

    Unescaping PATH_INFO transforms /file+name.file on the filesystem to …

    VictorLowther authored
    …/file name.file from the POV of the web server, which is a Bad Thing.
This page is out of date. Refresh to see the latest.
Showing with 2 additions and 2 deletions.
  1. +1 −1  lib/rack/directory.rb
  2. +1 −1  lib/rack/file.rb
View
2  lib/rack/directory.rb
@@ -56,7 +56,7 @@ def call(env)
def _call(env)
@env = env
@script_name = env['SCRIPT_NAME']
- @path_info = Utils.unescape(env['PATH_INFO'])
+ @path_info = env['PATH_INFO']
if forbidden = check_forbidden
forbidden
View
2  lib/rack/file.rb
@@ -32,7 +32,7 @@ def call(env)
F = ::File
def _call(env)
- @path_info = Utils.unescape(env["PATH_INFO"])
+ @path_info = env["PATH_INFO"]
parts = @path_info.split SEPS
return fail(403, "Forbidden") if parts.include? ".."
Something went wrong with that request. Please try again.