Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LinkBuilderSupport.toUri() double-encodes request parameters #1722

Closed
jochenberger opened this issue Nov 23, 2021 · 20 comments
Closed

LinkBuilderSupport.toUri() double-encodes request parameters #1722

jochenberger opened this issue Nov 23, 2021 · 20 comments
Assignees
Labels
in: core Core parts of the project type: bug
Milestone

Comments

@jochenberger
Copy link

We're experiencing a change in behavior regarding path variable expansion.
Consider the following Java code and output:

    public static void main(String[] args) {
        Link l = Link.of("/foo/{data}");
        String data = Base64.getUrlEncoder()
                .encodeToString("Hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("data: " + data);
        System.out.println(l.expand(data).getHref());
    }

Output:

data = SGVsbG8=
/foo/SGVsbG8%3D

In 1.4.0, the "=" is encoded as "%3D", in 1.3.x, it wasn't. This causes issues with Spring Security because a request to the resulting URL is rejected with org.springframework.security.web.firewall.RequestRejectedException: The request was rejected because the URL contained a potentially malicious String "%25".
This is probably related to db1cd5d (#1485).
Are we doing something wrong?

@odrotbohm
Copy link
Member

It's likely to be #1583, which implements encoding according to the rules defined in RFC 6570. That lists the equals sign (=) as reserved character. The section for simple String expansions ({var}) the defines:

For each defined variable in the variable-list, perform variable expansion, as defined in Section 3.2.1, with the allowed characters being those in the unreserved set.

I read this as: as = is part of the reserved set, it needs to be encoded on expansion. @rwinch, can you elaborate on why the exception is triggered and the encoded = is considered malicious?

@odrotbohm odrotbohm self-assigned this Nov 23, 2021
@odrotbohm odrotbohm changed the title Legal characters in path variables are URL-encoded in 1.4.0 Unexpected character encoding in path variables in 1.4 Nov 23, 2021
@jochenberger
Copy link
Author

We're using the resulting URL to perform a query with MockMvc. Since Spring Security complains about %25 and not about %3D, maybe they're encoding the URL again?

@dpasek-senacor
Copy link

@jochenberger We are struggling with the same problem.

Yes, MockMvc performs an additional encoding on URIs. So, if you are directly taking an href from a HATEOAS link that contains an encoded char it will perform a double encoding.
Our example

Variable value is in the format clientx:123456
URI template is /acounts/{variable}

WebMvcLinkBuilder in Spring HATEOAS 1.3.x created an URI
/accounts/clientx:123456 which is a valid URI. Colons are allowed for path segments.
This URI could directly be used by MockMvc to perform a request.

WebMvcLinkBuilder in Spring HATEOAS 1.4.x creates now URI /accounts/clientx%3A123456
If we now use the URI with WebMVC it will be encoded a second time: /accounts/clientx%253A123456 which is now longer a valid URI on our server and might break security rules as on the server it will be decoded to /accounts/clientx%3A123456 again.

A workaround is to decode the links from HATEOAS for MockMVC, e.g.: mockMvc.perform(put(decode(putAccountLink.getHref(), UTF_8)) I do not like this, but it works.

We are still in the analysis if this new behavior breaks other clients on our side, e.g. calls done by RestTemplate with variable expansion. or our Angular/React single page applications.

I think the root cause can be found in the implementation of the encoding of the `TemplateVariables, see:
https://github.com/spring-projects/spring-hateoas/blob/main/src/main/java/org/springframework/hateoas/TemplateVariable.java#L512

public String encode(String value) {
    switch (this) {
        case DOT:
        case SEGMENT:
        case PATH_SEGMENT:
        case PATH_STYLE_PARAMETER:
        case REQUEST_PARAM:
        case REQUEST_PARAM_CONTINUED:
        case SIMPLE:
            return UriUtils.encode(value, StandardCharsets.UTF_8);
        case FRAGMENT:
        default:
            return UriUtils.encodePath(value, StandardCharsets.UTF_8);
    }
}

The different variable types are encoded either by UriUtils.encode(...) or UriUtils.encodePath(...) but the mapping is not right in my opinion.

  • SEGMENT, PATH_SEGMENT, SIMPLE and PATH_VARIABLE should be encoded using UriUtils.encodePath(...)
  • REQUEST_PARAM and REQUEST_PARAM_CONTINUED should be encoded using UriUtils.encodeRequestParam(...)
  • FRAGMENT should be encoded using UriUtils.encodeFragment(...)
    and so on.

@odrotbohm
Copy link
Member

That's useful feedback, thank you. One thing you have to be careful about is to note that UriUtils does not claim to follow the encoding rules for UriTemplates. The rules for the latter are defined here and the unit tests for our UriTemplate expansion have been based on exactly the examples given in that RFC. To get all of those green, the implementation has to use UriUtils the way it uses it right now. I've been quite surprised by that oddity, too. Feel free to tweak the switch statement to rather use the apparently more matching methods of UriUtils and see unit tests for the template encoding fail left and right. I assume changing the encoding rules in UriUtils has quite a ripple effect, that's unlikely to be bearable for Spring applications.

I'll take this back to the Spring team and see what they think.

@dpasek-senacor
Copy link

@odrotbohm Thx for your insights. The RFC for UriTemplates is really odd as it does not seems to reflect any context information regarding the URI component the variable is used in. Always only allowing the "unreserved characters" independent of the URI component seems to be quite restrictive to build URIs via templates.

Nevertheless the encoded URI are valid URIs and would work wenn used in a browser or other clients but in combination with Spring MVC test framework (aka MockMvc) we now have a problem as links extracted from HAL resources can no longer directly be used in MockMvc if a reserved character with would be valid for a path is used for a variable value.

Should I create a simple demonstration project / test case for this behaviour?

@odrotbohm
Copy link
Member

Anything small that helps us see the problem is appreciated.

@dpasek-senacor
Copy link

dpasek-senacor commented Jan 21, 2022

Simple test project to clarify the unexpected / changed behavior:
spring-hateoas-mockmvc.zip

Using Spring Boot 2.5.x with Spring HATEOAS 1.3.x all test will be successful.

@dpasek-senacor
Copy link

I have extended the sample project to show case that it might also break Spring RestTemplate.
RestTemplate uses own URI template handler causing double encoding.

spring-hateoas-mockmvc.zip

@dpasek-senacor
Copy link

dpasek-senacor commented Jan 21, 2022

After some further tests I need some clarification how HAL/HATEOAS links schould be handled by a client as this problem also occurs with URLs containing request parameters that contain character like a space ' '.

  • Imagine an endpoint for searching resources. This endpoint support a single query param for search terms: /find?term={search_term}
  • The search term might include the space or other characters to be URI encoded.
  • This endpoint creates a HAL response with a self link pointing to the search, e.g. /find?term=hello%20world
  • The client now extract this URL from JSON document and executes the call via a spring RestTemplate using the overloaded method accepting plain URI strings. This will lead to a double encoding, e.g. the HTTP request will be /find?term=hello%2520world. On erver side URI will be decoded to hello%20world which is not the original search term.

So my question is:
Is it mandatory for a client of HAL endpoints to decode or parse all URIs extracted from HAL content before using?
Clients which provide an API of plain text strings will perform URI encoding for these strings leading to this double encoding.
Is this the way Spring / Spring HATEOAS should be used or not?
Should all operations in Spring API with take URL strings be as unencoded URI templates, i.e. we must decode any URI strings provided by HAL links.

I also checked the Traverson client implementation and they avoid this issue as they internally always directly transform the plain text URIs to parsed URIs before using the RestOperations for performing the client call.

@dpasek-senacor
Copy link

Some further insights: the documentation of Spring is not very consistent in regard of the various REST clients and the behaviour of URI-String- and URI-based methods.

The documentation of MockMvc is good and clarifies the different methods:
https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/test/web/servlet/request/MockMvcRequestBuilders.html#get-java.lang.String-java.lang.Object...-

urlTemplate - a URL template; the resulting URL will be encoded

https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/test/web/servlet/request/MockMvcRequestBuilders.html#get-java.net.URI-

uri - the URL

So no confusion here. It should be clear that you should use URI-based method if you have a complete URL and the String-based method should be used if expanding and encoding is still necessary.

The documentation of RestOperations is inconsistent across the method overloading:
https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/web/client/RestOperations.html
The terms "URI template" and so on are used not in a proper way, for example in the exchange(...) methods.

So it seems to me, that Spring HATEOAS does not have a bug, but the usage patterns in combination with the different Spring.-based clients might need better, more consistent documentation with examples.

@ThanksForAllTheFish
Copy link

I am not sure this is fully related, but I created a repo with a very very basic test and it seems there is indeed some issue in spring hateoas: https://github.com/ThanksForAllTheFish/doubleencoding/blob/main/src/test/kotlin/com/t4atf/doubleencoding/DoubleEncodingTest.kt

@Test
fun doubleEncoding() {
	Assertions.assertEquals(
		URI.create("/root?param=${UriUtils.encode("I+will:be+double+encoded", Charset.defaultCharset())}"),
		linkTo<MyController> { test("I+will:be+double+encoded") }.toUri()
	)
}

fails with

org.opentest4j.AssertionFailedError: 
Expected :/root?param=I%2Bwill%3Abe%2Bdouble%2Bencoded
Actual   :/root?param=I%252Bwill%253Abe%252Bdouble%252Bencoded

toUriComponentBuilder() is affected as well.

The issue is the the backing uri components has its encodeState set to RAW, which then causes a seconding encoding pass when this.components.toUri() is called

@odrotbohm @dpasek-senacor (please apologies if it was not ok to tag you)

@odrotbohm
Copy link
Member

Thanks for the sample, but I'm not gonna debug a project written in Kotlin if the issue to be discussed does not involve Kotlin in the first place. Please avoid anything that distracts from the actual problem to investigate.

@ThanksForAllTheFish
Copy link

I converted the project to java, relevant test is https://github.com/ThanksForAllTheFish/doubleencoding/blob/java/src/test/java/com/t4atf/doubleencoding/DoubleEncodingTest.java

Again, doubleEncoding fails with

org.opentest4j.AssertionFailedError: 
Expected :/root?param=I%2Bwill%3Abe%2Bdouble%2Bencoded
Actual   :/root?param=I%252Bwill%253Abe%252Bdouble%252Bencoded

@odrotbohm
Copy link
Member

Thanks! I'll have a look ASAP.

@odrotbohm
Copy link
Member

There's a mismatch in the way the underlying UriComponents produce the URI. I guess we need to take a step back and debate whether what we see, is really a bug. I wonder why exactly you consider the method, that is not re-encoding, correct? The reason I ask is this: if you issue a request to /root?param=I+will:be+double+encoded, the controller will not see an encoded value in the first place. That means, it would never try to produce the link using already encoded values in the first place. That's why the values given as parameters for the dummy method invocation have to be encoded as there's no way we can find out whether they are encoded already or not.

@ThanksForAllTheFish
Copy link

Are you saying I am using WebMvcLinkBuilder.linkTo(WebMvcLinkBuilder.methodOn(MyController.class).test("I+will:be+double+encoded")).toUri() incorrectly in my example?

Sorry if this is a dumb question, but in my eyes the value used is not encoded (I+will:be+double+encoded) so I expect the result to be encoded only once, but it gets encoded twice, once in WebMvcLinkBuilder.linkTo and once in toUri. Meaning that (assumption, probably wrong) https://github.com/spring-projects/spring-hateoas/blob/main/src/main/java/org/springframework/hateoas/server/core/WebHandler.java#L267 is encoding too aggressively; this could be the mismatch you mention, right?

I am not sure about https://github.com/spring-projects/spring-hateoas/blob/main/src/main/java/org/springframework/hateoas/server/core/WebHandler.java#L166, because the query parameter at that point is encoded, but buildAndExpand set the encoding flag to RAW, which seems wrong (?)

Again, sorry for babbling, I have a small example here, but I am sure that on a larger scale there are more considerations in place that I am not seeing

@odrotbohm
Copy link
Member

You're essentially right, but we're running into an (apparent) limitation of UriComponentsBuilder. We would like to set this up from a template containing variables, but equip it with fully-encoded values for both parameters and variable expansion. I currently cannot find a way to achieve this and took this back to the Framework team. In the meantime, I've tweaked the LinkBuilderSupport.toUri() to rather call ….toUriString(…) that skips the encoding, which is what the code path creating links does anyway. A call to LBS.toUri() of course expects the underlying arrangement to be fully resolved to produce a proper URI, but that would've failed with the previous version, too.

I'll push a fix in a minute that basically has the gist of it in its commit message, too.

@odrotbohm odrotbohm added type: bug in: core Core parts of the project labels May 16, 2022
@odrotbohm odrotbohm added this to the 2.0 M4 milestone May 16, 2022
@odrotbohm odrotbohm changed the title Unexpected character encoding in path variables in 1.4 LinkBuilderSupport.toUri() double-encodes request parameters May 16, 2022
odrotbohm added a commit that referenced this issue May 16, 2022
odrotbohm added a commit that referenced this issue May 16, 2022
odrotbohm added a commit that referenced this issue May 16, 2022
odrotbohm added a commit that referenced this issue May 16, 2022
…rt.toUri(…).

We unfortunately cannot use UriComponentsBuilder in a way that we can populate it with encoded parameters *and* expand encoded path variable values. We currently use ….buildAndExpand(…) which considers the values provided unencoded but we never actually call ….toUri() on the resulting UriComponents instance.

We unfortunately cannot fix the problem at its root, as the only alternative would be to call ….build(true) indicating values already encoded but that stumbles above the template variables still present in the original template.

The only workaround right now is never calling UriComponents.toUri() but ….toUriString() as that doesn't apply the pending encoding that's not actually needed as we start with fully encoded values in the first place.
@odrotbohm
Copy link
Member

That's in place now. Fixed for 2.0, 1.5 and 1.4. Feel free to give the snapshots a try.

@Integer93
Copy link

Do you know when this will be available in a release? We need that fix for our project.

@nealeu
Copy link

nealeu commented Jan 1, 2024

A footnote for anyone arriving here is to watch out for passing Link.getHref() to RestTemplate.getForEntity() etc because passing a string triggers the restTemplate to double encode and therefore you get the %25 issue. Changing to passing Link.toUri() as that argument solved the issue for me.

It is also possible to configure the restTemplate to only encode values via org.springframework.web.util.DefaultUriBuilderFactory.EncodingMode#VALUES_ONLY

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: core Core parts of the project type: bug
Projects
None yet
Development

No branches or pull requests

6 participants