PATH_INFO should be decoded by servers

1 parent ff8cc91 commit b2f90a3409238f07eee480a69af9e8e5d0adb35c miyagawa committed Sep 29, 2009
Showing with 23 additions and 1 deletion.
  1. +22 −0 FAQ.pod
  2. +1 −1 PSGI.pod
22 FAQ.pod
@@ -313,6 +313,28 @@ to implement an adapter to support PSGI. For instance,
L<Catalyst::Engine::PSGI> differs by only a few dozen lines from
L<Catalyst::Engine::CGI> and was written in less than an hour.
+=head3 Why is PATH_INFO URI decoded?
+To be compatible with the CGI spec (RFC 3875) and most web server
+implementations, such as Apache and lighttpd.
+I understand it could be inconvenient that you can't distinguish
+C<foo%2fbar> from C<foo/bar> in the trailing path, but the CGI spec
+clearly says C<PATH_INFO> should be decoded by servers, and that web
+servers can deny requests containing C<%2f>, since decoding them
+loses information in C<PATH_INFO>. Leaving those reserved
+characters undecoded (partial decoding) would make things worse, since
+then you can't tell C<foo%2fbar> from C<foo%252fbar>, and that could be a
+security hole with double encoding or decoding.
+For the record, WSGI (PEP-333) requires both C<SCRIPT_NAME> and
+C<PATH_INFO> to be decoded, and Rack leaves it implementation dependent,
+while L<fixing> most of PATH_INFO left encoded in Ruby web server
=head1 SEE ALSO
WSGI's FAQ clearly answers lots of questions about how some API design
2 PSGI.pod
@@ -104,7 +104,7 @@ C<PATH_INFO>: The remainder of the request URL's "path", designating
the virtual "location" of the request's target within the
application. This may be an empty string if the request URL targets
the application root and does not have a trailing slash. This value
-may be percent-encoded when it originates from a URL.
+should be URI decoded by servers to be compatible with RFC 3875.
=item *
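
The trade-off described in the new FAQ entry can be illustrated with a small sketch. It is written in Python rather than Perl and is not part of this commit. The helper names `full_decode` and `partial_decode` are hypothetical, and `partial_decode` is only an assumption about what "leaving reserved characters undecoded" might look like, restricted here to an encoded slash and to ASCII escapes.

```python
import re
from urllib.parse import unquote


def full_decode(raw_path: str) -> str:
    """Decode every percent-escape once, as RFC 3875 expects for PATH_INFO."""
    return unquote(raw_path)


def partial_decode(raw_path: str) -> str:
    """Hypothetical alternative: decode everything except an encoded slash.

    ASCII-only sketch; multi-byte UTF-8 escapes are not handled here.
    """
    def replace(match: re.Match) -> str:
        hex_digits = match.group(1)
        if hex_digits.lower() == "2f":
            return match.group(0)  # keep %2f / %2F untouched
        return chr(int(hex_digits, 16))

    return re.sub(r"%([0-9A-Fa-f]{2})", replace, raw_path)


# Full decoding: an encoded slash and a literal slash become identical.
print(full_decode("foo%2fbar"), full_decode("foo/bar"))

# Partial decoding: %25 still decodes to "%", so a double-encoded slash
# collapses onto a single-encoded one and the two inputs are ambiguous.
print(partial_decode("foo%2fbar"), partial_decode("foo%252fbar"))
```

With full decoding the application loses the distinction between `foo%2fbar` and `foo/bar`, which is the inconvenience the FAQ acknowledges. With the partial scheme sketched here, `foo%2fbar` and `foo%252fbar` produce the same string, so an application that decodes again cannot know whether it is decoding once or twice.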
