New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

guice servlet: HttpServletRequest.getPathInfo not decoded #745

Closed
gissuebot opened this Issue Jul 7, 2014 · 10 comments

Comments

Projects
None yet
5 participants
@gissuebot
Copy link

gissuebot commented Jul 7, 2014

From matt@hillsdon.net on April 11, 2013 06:52:21

getPathInfo in the anonymous HttpServletRequestWrapper in ServletDefinition.doService does not perform URL decoding, but it is required to do so by the servlet spec.

The inputs, getRequestURI and getContextPath, are provided by the container and are (correctly) provided without decoding.

The HttpServletRequest docs for getPathInfo say "a String, decoded by the web container".  Tomcat decodes it, and provides container-level configuration to control the character encoding used.  See http://wiki.apache.org/tomcat/FAQ/CharacterEncoding "How do I change how GET parameters are interpreted?"

I switched an existing application to use Guice to create my servlets and tests for character encoding failed as the raw string was returned from getPathInfo.

Original issue: http://code.google.com/p/google-guice/issues/detail?id=745

@gissuebot

This comment has been minimized.

Copy link

gissuebot commented Jul 7, 2014

From t.cservenak on February 13, 2014 12:53:27

Looks like a problem at this place, as it wraps original Request but calculates the pathInfo (should be decoded) using requestUrl (is not decoded): https://code.google.com/p/google-guice/source/browse/extensions/servlet/src/com/google/inject/servlet/ServletDefinition.java?r=389cb718a70cd11fdf9c336209246ebcfe944755#205

@gissuebot

This comment has been minimized.

Copy link

gissuebot commented Jul 7, 2014

From t.cservenak on February 14, 2014 03:40:34

Proposed patch sonatype/sisu-guice#8

@gissuebot

This comment has been minimized.

Copy link

gissuebot commented Jul 7, 2014

From ukjung.kim on February 19, 2014 20:01:47

I have same issue with it.
Gitiles gerrit plugin is not working well because of this issue.

Please refer following discussion. https://groups.google.com/forum/#!topic/repo-discuss/px6X8_HOMPQ

@oberlies

This comment has been minimized.

Copy link

oberlies commented Oct 9, 2014

BTW in the HttpServletRequest instances provided by Guice, getPathInfo() consists of an URI-encoded path and the query string. The spec requires that getPathInfo() only returns a de-coded path.

@sameb

This comment has been minimized.

Copy link
Contributor

sameb commented Oct 9, 2014

@oberlies , would you be able to supply a pull request with the fix+test? Thanks!

@mcculls

This comment has been minimized.

Copy link
Contributor

mcculls commented Oct 9, 2014

@sameb FYI there's a working patch+tests here: https://github.com/sonatype/sisu-guice/blob/master/PATCHES/GUICE_745_getpathinfo_not_decoded.patch which we've used successfully in production (this fix was originally suggested by Tamas a few comments above in #745 (comment))

@oberlies

This comment has been minimized.

Copy link

oberlies commented Oct 9, 2014

Sorry, no time for this now, but I can share the code & test how to get from the current getPathInfo() value to the correct one. It is probably a lot easier to apply the same approach to the Guice code than it would be for me.

public class HttpServletRequestWithDecodedPathInfo extends HttpServletRequestWrapper {

    private final String correctedPathInfo;

    /**
     * Creates a valid {@link HttpServletRequest} from the instance that Guice passes to servlets
     * (see Guice issue 745 - https://github.com/google/guice/issues/745)
     *
     * @param originalRequest
     *            The {@link HttpServletRequest} passed to
     *            {@link HttpServlet#service(ServletRequest, javax.servlet.ServletResponse)} by
     *            Guice.
     * @return An {@link HttpServletRequest} which returns only returns the decoded path from
     *         {@link HttpServletRequest#getPathInfo()}, as required by the spec.
     */
    public static HttpServletRequest createRepairedRequest(final HttpServletRequest originalRequest) {
        String originalPathInfo = originalRequest.getPathInfo();
        if (originalPathInfo == null) {
            return originalRequest;
        }
        if (!originalPathInfo.startsWith("/")) {
            // never observed
            throw new IllegalArgumentException("Invalid pathInfo: " + originalPathInfo);
        }

        return requestWithDecodedPathInfo(originalRequest, originalPathInfo);
    }

    private static HttpServletRequest requestWithDecodedPathInfo(final HttpServletRequest originalRequest,
            final String originalPathInfo) {
        String decodedPathInfo = extractAndDecodePath(originalPathInfo);

        if (decodedPathInfo.equals(originalPathInfo)) {
            return originalRequest;
        } else {
            return new HttpServletRequestWithDecodedPathInfo(originalRequest, decodedPathInfo);
        }
    }

    private static String extractAndDecodePath(final String uriEncodedPathAndQuery) {
        try {
            return new URI(uriEncodedPathAndQuery).getPath();
        } catch (URISyntaxException e) {
            // TODO use an exception type which produces a "bad request" responses
            throw new IllegalArgumentException(e);
        }
    }

    private HttpServletRequestWithDecodedPathInfo(final HttpServletRequest originalRequest,
            final String correctedPathInfo) {
        super(originalRequest);
        this.correctedPathInfo = correctedPathInfo;
    }

    @Override
    public String getPathInfo() {
        return correctedPathInfo;
    }
}

Test:

import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.CoreMatchers.nullValue;
import static org.junit.Assert.assertThat;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;
import javax.servlet.http.HttpServletRequest;
import org.junit.Test;

public class HttpServletRequestWithDecodedPathInfoTest {

    private HttpServletRequest subject;

    @Test
    public void testEmptyPathInfo() throws Exception {
        subject = HttpServletRequestWithDecodedPathInfo.createRepairedRequest(requestWithPathInfo(null));
        assertThat(subject.getPathInfo(), is(nullValue()));
    }

    @Test
    public void testSimplePathInfo() throws Exception {
        subject = HttpServletRequestWithDecodedPathInfo.createRepairedRequest(requestWithPathInfo("/simple/path"));
        assertThat(subject.getPathInfo(), is("/simple/path"));
    }

    @Test
    public void testEncodedPathInfo() throws Exception {
        subject =
            HttpServletRequestWithDecodedPathInfo.createRepairedRequest(requestWithPathInfo("/uri+encoded%20path/with/percent-literal:%2520/"));
        assertThat(subject.getPathInfo(), is("/uri+encoded path/with/percent-literal:%20/"));
    }

    @Test(expected = IllegalArgumentException.class)
    public void testIncorrectlyEncodedPathInfo() throws Exception {
        subject =
            HttpServletRequestWithDecodedPathInfo.createRepairedRequest(requestWithPathInfo("/pipe|is-illegal-in-URLs"));
    }

    @Test
    public void testDecodingKeepsPathIntact() throws Exception {
        subject =
            HttpServletRequestWithDecodedPathInfo.createRepairedRequest(requestWithPathInfo("/http://may%20be%20part%20of%20the%20path"));
        assertThat(subject.getPathInfo(), is("/http://may be part of the path"));
    }

    @Test
    public void testPathInfoDoesntIncludeQuery() throws Exception {
        subject =
            HttpServletRequestWithDecodedPathInfo.createRepairedRequest(requestWithPathInfo("/path/with/%3Fquestionmark?querykey=val"));
        assertThat(subject.getPathInfo(), is("/path/with/?questionmark"));
    }

    private HttpServletRequest requestWithPathInfo(final String pathInfo) {
        HttpServletRequest request = mock(HttpServletRequest.class);
        when(request.getPathInfo()).thenReturn(pathInfo);
        return request;
    }
}
@sameb

This comment has been minimized.

Copy link
Contributor

sameb commented Oct 9, 2014

@cstamas -- could you sign the Google Contributor CLA @ https://developers.google.com/open-source/cla/individual (see the 'Sign Electronically' at the bottom) so I can incorporate your patch from #745 (comment) (also contributed to sisu @ sonatype/sisu-guice#8)? If you'd like to open a new pull request to Guice w/ the patch after signing the CLA, I can merge it directly from the PR, or I can manually patch your patch... but both need your CLA. Thanks!

cstamas added a commit to cstamas/guice that referenced this issue Oct 9, 2014

@cstamas

This comment has been minimized.

Copy link
Contributor

cstamas commented Oct 9, 2014

CLA signed, PR created:
#860

sameb added a commit that referenced this issue Oct 10, 2014

@sameb

This comment has been minimized.

Copy link
Contributor

sameb commented Oct 10, 2014

PR merged.

@sameb sameb closed this Oct 10, 2014

openstack-gerrit pushed a commit to openstack-infra/gerrit that referenced this issue Oct 17, 2014

Extract path info from requests without decoding
The fix for Guice issue #745[1] causes getPathInfo() within the
GuiceFilter to return decoded values, eliminating the difference
between "foo/bar" and "foo%2Fbar". This is in spec with the servlet
standard, whose javadoc for getPathInfo[2] states that the return
value be "decoded by the web container".

Work around this by extracting the path part directly from the request
URI, which is unmodified by the container. This is copying the Guice
behavior prior to the bugfix. We do this with a static method rather
than using our own HttpServletRequestWrapper as we don't want to
produce a request wrapper that is not in spec.

[1] google/guice#745
[2] http://docs.oracle.com/javaee/7/api/javax/servlet/http/HttpServletRequest.html#getPathInfo()

Change-Id: I48fb32710d45f573fb167c6344ad76541672858c

hubot pushed a commit to eclipse/jgit that referenced this issue Dec 10, 2014

David Pletcher
Extract path info from requests without decoding
Gitiles malfunctions in conjunction with jgit and guice
because of a recent Guice bug fix. Work around the problem
by parsing the URI directly, bypassing the unescaping
performed by the getPathInfo method.

This rest of this message is copied from
https://gerrit-review.googlesource.com/#/c/60820/ :

The fix for Guice issue #745[1] causes getPathInfo() within the
GuiceFilter to return decoded values, eliminating the difference
between "foo/bar" and "foo%2Fbar". This is in spec with the servlet
standard, whose javadoc for getPathInfo[2] states that the return
value be "decoded by the web container".

Work around this by extracting the path part directly from the request
URI, which is unmodified by the container. This is copying the Guice
behavior prior to the bugfix.

[1] google/guice#745
[2] http://docs.oracle.com/javaee/7/api/javax/servlet/http/HttpServletRequest.html#getPathInfo()

Change-Id: I7fdb291bda377dab6160599ee537962d5f60f1e8
Signed-off-by: David Pletcher <dpletcher@google.com>

uwolfer pushed a commit to gerrit-review/gerrit that referenced this issue Apr 7, 2015

Saša Živkov
Workaround Guice bug "getPathInfo not decoded"
Due to the Guice issue [1] cloning of a repository with a space in its
name was impossible. If the space wasn't encoded:

  $ git clone 'http://gerrit/My Project'

then such a request failed already in Jetty code.
If the space was encoded:

  $ git clone 'http://gerrit/My+Project'

or:

  $ git clone 'http://gerrit/My%20Project'

then the HttpServletRequest.getPathInfo in the GitOverHttpServlet would
return a non-decoded string i.e. "My%20Project' and the project wouldn't
be found. The HttpServletRequest.getPathInfo is expected to return a
decoded string [2].

Unfortunately, there is no Guice release which contains the bugfix for
the issue [1].

Workaround: decode the projectName in the GitOverHttpServlet.Resolver
explicitly.

[1] google/guice#745
[2] http://docs.oracle.com/javaee/7/api/javax/servlet/http/HttpServletRequest.html#getPathInfo()

Change-Id: I8872cd2d1bab3f8e454d0d7399b35eb850068042

dpursehouse added a commit to gerrit-review/gerrit that referenced this issue Oct 19, 2015

Revert "Workaround Guice bug "getPathInfo not decoded""
This reverts commit 304130b.

This was added as a workaround for Guice issue #745 [1].

Since then, a fix [2] has been merged and is included in Guice
version 4.0, which we are now using.

[1] google/guice#745
[2] cstamas/guice@41c126f

Change-Id: I8048a9e8897d8b4e42a0f518d2deed5e17264bcf

lucamilanesio pushed a commit to GerritCodeReview/plugins_x-docs that referenced this issue Dec 10, 2015

Fix retrieval of path from URL
The XDocServlet needs the endcoded URL path, but it used
HttpServletRequest.getPathInfo() to retrieve it which returns the
decoded path. So far this worked because there was a bug in Guice [1]
that made HttpServletRequest.getPathInfo() return the encoded path.
Now this bug was fixed in Guice and Gerrit was updated to use the new
Guice version. Hence the XDocServlet started to fail and needs to be
adapted.

[1] google/guice#745

Change-Id: Ia14908ed0970e9e5eda0569e345c14fb5f255709
Signed-off-by: Edwin Kempin <ekempin@google.com>
(cherry picked from commit 10c5263)

lucamilanesio pushed a commit to GerritCodeReview/plugins_x-docs that referenced this issue Dec 10, 2015

Fix retrieval of path from URL
The XDocServlet needs the endcoded URL path, but it used
HttpServletRequest.getPathInfo() to retrieve it which returns the
decoded path. So far this worked because there was a bug in Guice [1]
that made HttpServletRequest.getPathInfo() return the encoded path.
Now this bug was fixed in Guice and Gerrit was updated to use the new
Guice version. Hence the XDocServlet started to fail and needs to be
adapted.

[1] google/guice#745

Change-Id: Ia14908ed0970e9e5eda0569e345c14fb5f255709
Signed-off-by: Edwin Kempin <ekempin@google.com>

lucamilanesio pushed a commit to GerritCodeReview/plugins_imagare that referenced this issue Dec 21, 2015

Fix retrieval of path from URL
The ImageServlet needs the encoded URL path, but it used
HttpServletRequest.getPathInfo() to retrieve it which returns the
decoded path. So far this worked because there was a bug in Guice [1]
that made HttpServletRequest.getPathInfo() return the encoded path.
Now this bug was fixed in Guice and Gerrit was updated to use the new
Guice version. Hence the ImageServlet started to fail and needs to be
adapted.

[1] google/guice#745

Change-Id: Ie44786acaa70b9612572727dad856d3fdcf472ae
Signed-off-by: Edwin Kempin <ekempin@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment