PATH_INFO should be decoded by servers
miyagawa committed Sep 29, 2009
1 parent ff8cc91 commit b2f90a3
Showing 2 changed files with 23 additions and 1 deletion.
22 changes: 22 additions & 0 deletions FAQ.pod
@@ -313,6 +313,28 @@ to implement an adapter to support PSGI. For instance,
L<Catalyst::Engine::PSGI> only differs in a few dozen lines from
L<Catalyst::Engine::CGI> and was written in less than an hour.

=head3 Why is PATH_INFO URI decoded?

To be compatible with the CGI spec (RFC 3875) and most web server
implementations, such as Apache and lighttpd.

I understand it could be inconvenient that you can't distinguish
C<foo%2fbar> from C<foo/bar> in the trailing path, but the CGI spec
clearly says C<PATH_INFO> should be decoded by servers, and that web
servers can deny requests containing C<%2f>, since decoding them
loses information in C<PATH_INFO>. Leaving those reserved characters
undecoded (partial decoding) would make things worse, since then you
can't tell C<foo%2fbar> from C<foo%252fbar>, and that could become a
security hole through double encoding or decoding.
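
The two ambiguities described above can be reproduced in a few lines. This is only an illustrative sketch in Python, not part of PSGI or any server: the helper names C<full_decode> and C<partial_decode> are hypothetical, and C<partial_decode> stands in for an assumed "decode everything except an escaped slash" scheme.

```python
import re
from urllib.parse import unquote


def full_decode(raw_path: str) -> str:
    """Decode every percent-escape once, as a server would for PATH_INFO."""
    return unquote(raw_path)


def partial_decode(raw_path: str) -> str:
    """Hypothetical scheme: decode everything except an escaped slash (%2f)."""
    # Split on %2f / %2F while keeping the separators, then decode only
    # the pieces that are not an escaped slash.
    pieces = re.split(r"(%2[fF])", raw_path)
    return "".join(
        piece if re.fullmatch(r"%2[fF]", piece) else unquote(piece)
        for piece in pieces
    )


# Full decoding: an escaped slash and a literal slash become identical.
print(full_decode("/foo%2fbar") == full_decode("/foo/bar"))

# Partial decoding: an escaped slash and a double-encoded slash collide.
print(partial_decode("/foo%2fbar") == partial_decode("/foo%252fbar"))
```

Both comparisons hold. Full decoding merges C<foo%2fbar> with C<foo/bar>, and the partial scheme merges C<foo%2fbar> with C<foo%252fbar>, so an application cannot tell how many times it still needs to decode the result.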

For the record, WSGI (PEP-333) defines both C<SCRIPT_NAME> and
C<PATH_INFO> as decoded, and Rack leaves it implementation dependent,
while L<fixing> most of PATH_INFO left encoded in Ruby web server
implementations.

L<http://www.python.org/dev/peps/pep-0333/#url-reconstruction>
L<http://groups.google.com/group/rack-devel/browse_thread/thread/ddf4622e69bea53f>

=head1 SEE ALSO

WSGI's FAQ clearly answers lots of questions about how some API design
2 changes: 1 addition & 1 deletion PSGI.pod
@@ -104,7 +104,7 @@ C<PATH_INFO>: The remainder of the request URL's "path", designating
the virtual "location" of the request's target within the
application. This may be an empty string if the request URL targets
the application root and does not have a trailing slash. This value
-may be percent-encoded when it originates from a URL.
+should be URI decoded by servers to be compatible to RFC 3875.

=item *

