Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boundary info in Content-Type multipart request header not parsed correctly because of charset [SPR-17030] #21568

Closed
spring-issuemaster opened this issue Jul 11, 2018 · 13 comments
Assignees
Milestone

Comments

@spring-issuemaster
Copy link
Collaborator

@spring-issuemaster spring-issuemaster commented Jul 11, 2018

bissorc opened SPR-17030 and commented

I’m using Spring Rest Template to upload a file. Code looks as follows:

MultipartBodyBuilder builder = new MultipartBodyBuilder();
builder.part("file", "2;3;4".getBytes());
MultiValueMap<String, HttpEntity<?>> body = builder.build();
HttpHeaders headers = new HttpHeaders();
headers.setAccept(Collections.singletonList(MediaType.ALL));
headers.setContentType(MediaType.MULTIPART_FORM_DATA);
HttpEntity<MultiValueMap<String, Object>> requestEntity = new HttpEntity(body, headers);
RestTemplate restTemplate = new RestTemplate();
restTemplate.postForLocation("***", requestEntity, String.class);

This leads to the following raw request:

POST *** HTTP/1.1
Accept: */*
Content-Type: multipart/form-data;boundary=059h2BBM-KlM_XP2rY8W1X3_jnzFLcYY;charset=UTF-8
User-Agent: Java/1.8.0_121
Host: ***:***
Connection: keep-alive
Content-Length: 187

--059h2BBM-KlM_XP2rY8W1X3_jnzFLcYY
Content-Disposition: form-data; name="file"
Content-Type: text/plain;charset=UTF-8
Content-Length: 5

2;3;4
--059h2BBM-KlM_XP2rY8W1X3_jnzFLcYY—

The problem is the following line:

Content-Type: multipart/form-data;boundary=059h2BBM-KlM_XP2rY8W1X3_jnzFLcYY;charset=UTF-8

Spring Rest Template adds ;charset=UTF-8 to the content type and this leads to problems when. I found no possibility to remove the charset. From my point of view this is a bug since the charset should not be part of the content type?!


Affects: 5.0.7

Issue Links:

  • #21599 Multipart: Invalid boundary with RestTemplate ("is duplicated by")

Referenced from: commits f89511e, 390bb87

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 11, 2018

Brian Clozel commented

Could you elaborate about the issue you're seeing here?

What's happening? What were you expecting instead? What makes you think this charset declaration is at fault here? Is the RFC forbidding that?

In general, a small sample showing the issue should help us work on this issue.

Thanks!

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 11, 2018

bissorc commented

I’m not an expert in the RFC stuff. I just find out that other clients do not append ;charset… to the Content-Type if it is multipart/form-data. There is also no examples in the RFCs where additional parameters are appended after the boundary. RFC (https://tools.ietf.org/html/rfc7578#section-8) states: Type name: multipart, Subtype name: form-data, Required parameters: boundary, Optional parameters: none. You can also see at https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type that multipart/form-data has just the boundary whereas other content types such as text/html have the charset. Moreover, you find there some links to RFCs.

I came across the problem when trying to post a file for the H2O (http://www.h2o.ai) server using http://localhost:54321/3/PostFile endpoint.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 11, 2018

Brian Clozel commented

Thanks for the additional information.

https://tools.ietf.org/html/rfc7578#section-8 seems to be talking about the mime type itself, and not the Content-Type header, which (as far as I understand), has a charset directive no matter what. The second link you provided shows exactly that.

Also, I'm wondering if this charset information was not provided for a specific purpose (I need to dig into the code history for that). If we don't need that charset information, removing it might be a good idea anyway.

What happened when you sent that request to H2O?

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 12, 2018

bissorc commented

The H2O endpoint interprets not only the actual content (2;3;4) as content, but also the boundary (xxx). In other words, when sending with other clients (Apache, Postman) or with Spring Rest Template (after the charset is removed), 5 bytes (2;3;4) are interpreted. When sending with Spring Rest Template (charset is there), ~ 45 bytes (
2;3;4--059h2BBM-KlM_XP2rY8W1X3_jnzFLcYY--) are interpreted.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 12, 2018

Brian Clozel commented

Thanks for the follow-up bissorc!

I've tracked down that charset parameter - it was introduced in #19769 (especially in 75117f4). I think this piece of information is the "form-charset", as explained in RFC-7578 Section 5.1.2. The commit itself suggests that this piece of information may be required by the server to properly decode the request body itself. Given that information, I'm not really in favor of removing this bit from the HTTP request.

Now the behavior you're explaining here - H2O not detecting the boundary if there's another directive in the Content-Type header - suggests that this might be a parsing issue in H2O itself. If that charset information was incorrect, the server would probably decode the request body using the wrong charset. In this case, it seems H2O is completely missing the boundary information. Could you raise an issue on their side about this?

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 12, 2018

bissorc commented

Thanks, Brian Clozel. I will do that. Just one follow-up question: in the case I would like to change the charset for the multipart/form-data, how can I do that? Thanks.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 19, 2018

Joe Amoros commented

Brian Clozel - Thank you for your response in #21599. I will begin my investigation on httpd's side. However, 

Shouldn't adding the charset after the boundary be optional? Or at least be able to omit it? So that the client can still connect to servers that have not caught up with the standard? Or at least, just allow backwards compatibility.

It seems to me that after upgrading to Spring 5 my options are:

  1. Investigate whats wrong with httpd, hope that its a bug and has been fixed and patch the server. In this case I do have access to the apache server, however others might not.
  2. Rewrite my client without using RestTemplate.

At this point, time is not in my favor, so option two is looking better. At the moment I can whitelist the request URI in modsecurity but that can't last long since we do not want to risk any issues.

Thank you!

 

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 19, 2018

Brian Clozel commented

Joe Amoros,

It already is optional, as the FormHttpMessageConverter is supporting several encoding strategies. For example, and if you're in control of the files' encoding, you can try and configure your own FormHttpMessageConverter on the RestTemplate, and set its multipart charset. Let me know how this works.

Now while I get the unpleasant effect of upgrading and seeing things failing, I'm still quite surprised so many servers are failing at parsing an HTTP header (again, I might be missing something here in the RFCs). I don't know if it is a special case for multipart requests only, but adding directives to request headers isn't such a rare use case.

We could also consider moving the charset directive before the boundary one, working around the parsing bugs in some servers. Could you try to manually send such a request to your server and see if it works? Something like

Content-Type: multipart/form-data;charset=UTF-8;boundary=059h2BBM-KlM_XP2rY8W1X3_jnzFLcYY
 
@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 19, 2018

bissorc commented

Brian Clozel: for the H2O server this will work:

multipart/form-data;charset=UTF-8;boundary=5dbb2562-8a5d-48e2-bac6-3539964d75e5
Response(status=200, payload={destination_frame=upload_a794bb7e848fb4f86b31169963ca41fb, total_bytes=5}, headers=ResponseHeaders(headers={null=[HTTP/1.1 200 OK], X-h2o-context-path=[/], X-h2o-cluster-good=[true], X-h2o-rest-api-version-max=[3], X-h2o-cluster-id=[1532014457068], X-h2o-build-project-version=[3.20.0.2], Content-Length=[85], Content-Type=[application/json;charset=ISO-8859-1]}, contentLength=85, contentType=application/json;charset=ISO-8859-1, contentETag=null, lastModified=1970-01-01T00:00:00Z, contentDisposition=null, location=null), serverUnreachable=false)
@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 20, 2018

Brian Clozel commented

I've just pushed a fix that switches the ordering of the Content-Type directives.

Since this solves the issue for H2O, hopefully this could solve it as well for other servers.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 20, 2018

Joe Amoros commented

Brian Clozel - Sorry, had a hectic couple of days. I was not able to get it to work messing around the messageAdapter, and I'm not able to replicate it sending in a raw body via postman. I tried

POST /documentUpload/multipleUpload?j_username=***&j_password=***&ert=json&uploaderEmail=*** HTTP/1.1
Accept: application/json, application/*+json
Content-Type: multipart/form-data;boundary=TZxXnSfSXQlmvRM3BDU-pz1xDJvno-acZxKIA;charset=UTF-8
User-Agent: Apache-HttpClient/4.5.3 (Java/1.8.0_112)
Host: ***
Connection: Keep-Alive
Content-Length: 265--TZxXnSfSXQlmvRM3BDU-pz1xDJvno-acZxKIA--
Content-Disposition: form-data; name=files; filename=file
Content-Type: application/pdf
Content-Length: 1

1
--TZxXnSfSXQlmvRM3BDU-pz1xDJvno-acZxKIA--
 

But Apache2 isn't complaining. I look forward to your fix in 5.0.8 to try via code.

Thanks!

 

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 20, 2018

Brian Clozel commented

Strange, with that request body (with the charset in the end), you should have triggered the issue you've created here. Does it mean this is not linked to that charset after all?

You can try it right away as it's available as snapshots, just use the 5.0.8.BUILD-SNAPSHOT version and add the spring snapshot repositories like this:

 

<repositories>
	<repository>
		<id>spring-snapshots</id>
		<name>Spring Snapshots</name>
		<url>https://repo.spring.io/snapshot</url>
		<snapshots>
			<enabled>true</enabled>
		</snapshots>
	</repository>
</repositories>
@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Jul 20, 2018

Joe Amoros commented

Brian Clozel - Hmm bummer. Just gave it a try. I see the change, but still getting the error. Here's the logging and below, the error. I removed the body from the logging to save space. I'll keep digging. Thanks again!

 

[org.apache.http.wire] - < >> "POST /documentUpload/multipleUpload?j_username=***&j_password=***&ert=json&uploaderEmail=*** HTTP/1.1[\r][\n]">
[org.apache.http.wire] - < >> "Accept: application/json, application/*+json[\r][\n]">
[org.apache.http.wire] - < >> "Content-Type: multipart/form-data;charset=UTF-8;boundary=DClQ6zimejfgBMQYPEmNpyOS0f6426[\r][\n]">
[org.apache.http.wire] - < >> "Content-Length: 32084[\r][\n]">
[org.apache.http.wire] - < >> "Host: ***[\r][\n]">
[org.apache.http.wire] - < >> "Connection: Keep-Alive[\r][\n]">
[org.apache.http.wire] - < >> "User-Agent: Apache-HttpClient/4.5.3 (Java/1.8.0_112)[\r][\n]">
[org.apache.http.wire] - < >> "[\r][\n]">
[org.apache.http.headers] - <>> POST /documentUpload/multipleUpload?j_username=***&j_password=***&ert=json&uploaderEmail=*** HTTP/1.1>
[org.apache.http.headers] - <>> Accept: application/json, application/*+json>
[org.apache.http.headers] - <>> Content-Type: multipart/form-data;charset=UTF-8;boundary=DClQ6zimejfgBMQYPEmNpyOS0f6426>
[org.apache.http.headers] - <>> Content-Length: 32084>
[org.apache.http.headers] - <>> Host: ***>
[org.apache.http.headers] - <>> Connection: Keep-Alive>
[org.apache.http.headers] - <>> User-Agent: Apache-HttpClient/4.5.3 (Java/1.8.0_112)>
[org.apache.http.wire] - < >> "--DClQ6zimejfgBMQYPEmNpyOS0f6426[\r][\n]">
[org.apache.http.wire] - < >> "Content-Disposition: form-data; name="files"; filename="99999888_inv_LP1003.99999888_invStmt_20161001_20161231_invStmt_.pdf"[\r][\n]">
[org.apache.http.wire] - < >> "Content-Type: application/pdf[\r][\n]">
[org.apache.http.wire] - < >> "Content-Length: 31830[\r][\n]">
[org.apache.http.wire] - < >> "[\r][\n]">
[[BODY]]
[org.apache.http.wire] - < >> "--DClQ6zimejfgBMQYPEmNpyOS0f6426--[\r][\n]">
[org.apache.http.wire] - < << "HTTP/1.1 400 Bad Request[\r][\n]">
[org.apache.http.wire] - < << "Date: Fri, 20 Jul 2018 20:12:04 GMT[\r][\n]">
[org.apache.http.wire] - < << "Server: ***[\r][\n]">
[org.apache.http.wire] - < << "Content-Length: 226[\r][\n]">
[org.apache.http.wire] - < << "Connection: close[\r][\n]">
[org.apache.http.wire] - < << "Content-Type: text/html; charset=iso-8859-1[\r][\n]">
[org.apache.http.wire] - < << "[\r][\n]">[org.apache.http.impl.conn.DefaultClientConnection] - <Receiving response: HTTP/1.1 400 Bad Request>
[org.apache.http.headers] - <<< HTTP/1.1 400 Bad Request>
[org.apache.http.headers] - <<< Date: Fri, 20 Jul 2018 20:12:04 GMT>
[org.apache.http.headers] - <<< Server: ***>
[org.apache.http.headers] - <<< Content-Length: 226>
[org.apache.http.headers] - <<< Connection: close>
[org.apache.http.headers] - <<< Content-Type: text/html; charset=iso-8859-1>

 

From Apache/ModSecurity

[Fri Jul 20 16:12:04.012821 2018] [:error] [pid 31042:tid 140693741733632] [client 10.139.48.106] ModSecurity: Multipart parsing error (init): Multipart: Invalid boundary in C-T (malformed). [hostname "***"] [uri "/restless/documentUpload/multipleUpload"] [unique_id "W1JCFH8AAAEAAHlC1@8AAABM"]

[Fri Jul 20 16:12:04.247460 2018] [:error] [pid 31042:tid 140693741733632] [client 10.139.48.106] ModSecurity: Access denied with code 400 (phase 2). Match of "eq 0" against "REQBODY_ERROR" required. [file "/etc/modsecurity/modsecurity.conf"] [line "57"] [id "200001"] [msg "Failed to parse request body."] [data "Multipart: Invalid boundary in C-T (malformed)."] [severity "CRITICAL"] [hostname "***"] [uri "/restless/documentUpload/multipleUpload"] [unique_id "W1JCFH8AAAEAAHlC1@8AAABM"] 

 

A little extra info, this is the config that is failing in modsecurity

1. Verify that we've correctly processed the request body.
1. As a rule of thumb, when failing to process a request body
1. you should reject the request (when deployed in blocking mode)
1. or log a high-severity alert (when deployed in detection-only mode).
#
SecRule REQBODY_ERROR "!@eq 0" \
"id:'200001', phase:2,t:none,log,deny,status:400,msg:'Failed to parse request body.',logdata:'%{reqbody_error_msg}',severity:2" 

 

One last bit, if I add a whitelist in the modsecurity config like so:

SecRule REQUEST_URI "/multipleUpload" "allow,id:999999"

The file uploads, but we still get a warning:

 [Mon Jul 23 11:56:06.654191 2018] [:error] [pid 15106:tid 139627239241472] [client 10.139.48.106] ModSecurity: Multipart parsing error (init): Multipart: Invalid boundary in C-T (malformed). [hostname "****"] [uri "/restless/documentUpload/multipleUpload"] [unique_id "W1X6ln8AAAEAADsC@EIAAAAT"]

[Mon Jul 23 11:56:06.702321 2018] [:error] [pid 15106:tid 139627239241472] [client 10.139.48.106] ModSecurity: Access allowed (phase 2). Pattern match "/multipleUpload" at REQUEST_URI. [file "/etc/modsecurity/modsecurity.conf"] [line "20"] [id "999999"] [hostname "***"] [uri "/restless/documentUpload/multipleUpload"] [unique_id "W1X6ln8AAAEAADsC@EIAAAAT"]

Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.