Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StandardMultipartHttpServletRequest cannot decode multipart Content-Disposition header encoded by FormHttpMessageConverter [SPR-15205] #19769

Closed
spring-issuemaster opened this issue Jan 30, 2017 · 16 comments

Comments

Projects
None yet
2 participants
@spring-issuemaster
Copy link
Collaborator

commented Jan 30, 2017

Miroslav Holubec opened SPR-15205 and commented

There is wrong behavior when encoding/decoding Content-Disposition header.
When I do setMultipartCharset() on AllEncompassingFormHttpMessageConverter, it emits header like this:

Content-Disposition: form-data; name="uploadedFile"; filename="=?UTF-8?Q?Declara=C3=A7=C3=A3o.pdf?="

StandardMultipartHttpServletRequest cannot decode such a name, as it expects that "filename" is followed by asterisk.

I'm a bit lost in RFCs, as not sure if RFC 6266 should be applied here (means asterisk required) or not.


Affects: 4.3.6

Issue Links:

  • #19959 Possibility to configure encoding other than ascii in FormHttpMessageConverter.MultipartHttpOutputMessage ("is duplicated by")
  • #16724 FormHttpMessageConverter writes ASCII encoded so that a multipart form data can not contain filenames with German Umlaute
  • #17904 File upload with Servlet 3 multipart requests should support filename with charset encoding (RFC 2231)
  • #18979 Introduce HttpHeaders getContentDisposition() and setContentDisposition()
  • #19115 HttpHeaders.setContentDispositionFormData() doesn't encode non-acsii characters correctly
  • #20097 Unable to emulate upload filename browser behaviour with RestTemplate

Referenced from: commits 390bb87, bb684ce, 75117f4, f89511e, 3009e29

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 30, 2017

Juergen Hoeller commented

As far as I see, StandardMultipartHttpServletRequest checks for both filename= and filename*=. Could you please double-check why it isn't parsing correctly in your case?

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 30, 2017

Miroslav Holubec commented

StandardMultipartHttpServletRequest does extractFilenameWithCharset(String) only in case that extractFilename(String) returns null (means no filename= was present). Also I'm not sure if extractFilenameWithCharset(String) can parse this encoding (RFC-6266 seems to use different encoding without question marks).

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 30, 2017

Miroslav Holubec commented

Finally found required RFCs. So it seems to be that FormHttpMessageConverter uses RFC-2047 encoding, which is unsupported by StandardMultipartHttpServletRequest (that supports only RFC-6266).

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 30, 2017

Miroslav Holubec commented

possible workaround using commons-fileupload

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 30, 2017

Rossen Stoyanchev commented

In 4.1.1 #16724 added support for writing encoded file names to FormHttpMessageConverter based on RFC-2047 which RFC-6266 says should not be used vs RFC-5987 which defines encoding for header field parameters specifically in HTTP. The second issue is that "filename*" should be used instead of "filename" as per RFC-6266, section 4.3.

In 4.2.1 #17904 added support for "filename*" in StandardMultipartHttpServletRequest based on RFC 2231 which is compatible with RFC-5987 but defining a subset of it for use in HTTP. We should however make an improvement to check for "filename*" first as recommended in RFC-6266 section 4.3.

In 4.3 #19115 added support for RFC-5987 encoding to HttpHeaders. So it looks like FormHttpMessageConverter could now be updated to use the setContentDisposition(String, String, charset) and that should produce a "filename*" attribute with RFC-5987 encoding. Hopefully commons-fileupload supports that, we'll have to check.

In 5.0 #18979 extracted content-disposition from HttpHeaders handling into a separate ContentDisposition type also moving along RFC-5987 related encoding and further adding support for decoding. Currently ContentDisposition refers to RFC-2183 (updated by RFC 2231 and hence compatible with RFC-5987) but should really be based on RFC-6266. Also RFC-5987 related encoding and decoding should probably move back into HttpHeaders since it applies more generally to any header parameters.

In 5.0 at least we can also have StandardMultipartHttpServletRequest delegate to ContentDisposition for the parsing.

/cc Brian Clozel, Sébastien Deleuze

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 30, 2017

Rossen Stoyanchev commented

This was also recently discussed under #19072. Specifically this from RFC-7578 section 4.3:

In most multipart types, the MIME header fields in each part are
restricted to US-ASCII; for compatibility with those systems, file
names normally visible to users MAY be encoded using the percent-
encoding method in Section 2, following how a "file:" URI
[URI-SCHEME] might be encoded.

NOTE: The encoding method described in [RFC5987], which would add a
"filename*" parameter to the Content-Disposition header field, MUST
NOT be used.

So the most recent advice seems to be that you're supposed to encode file names like URIs but on the server side at least we can't avoid processing "filename*" as per #10655 it seems to be. And also continue to support it in FormHttpMessageConverter through the multipartCharset property.

Sébastien Deleuze, Brian Clozel what's your take on all this?

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 30, 2017

Miroslav Holubec commented

Hi Rossen, I think you are wrong about saying:

...based on RFC-2047 which RFC-6266 says should not be used. RFC-6266 states clearly:
This document does not apply to Content-Disposition header fields appearing in payload bodies transmitted over HTTP, such as when using the media type "multipart/form-data" ([RFC2388])..
Anyway, receiving side should ideally support all possible variants and by my recent opinion, RFC-2047 on sender side should be fine, as it follows MIME standard.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 30, 2017

Rossen Stoyanchev commented

Yes you're right that RFC-6266 explicitly excludes the multipart/form-data cases. The RFC-2388 it refers to is obsoleted by RFC-7578 which explains the quote from it, in my last comment.

On the receiving side we should support all options indeed. The question is about the sending side, i.e. in FormHttpMessageConverter and also in ContentDisposition / HttpHeaders. The former is RFC-2047, the latter RFC-5987. Or maybe the key is for ContentDisposition to separate out the case of multipart/form-data from other content-disposition types.

We'll have to experiment a bit with browser support too. Here is one article from Mozilla that seems to favor RFC-5987.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 31, 2017

Rossen Stoyanchev commented

Looking around at a few Java libraries and servers:

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 31, 2017

Rossen Stoyanchev commented

A quick experiment with browsers. Firefox and Chrome encode the filename based on the charset of the form (i.e. accept-charset form attribute). This is consistent with what the HTML5 spec says. Apache Commons FileUpload however does not decode that correctly.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 31, 2017

Miroslav Holubec commented

Yes, you are right. RFC-7578 seems to be valid document for implementation here. Unfortunately URI encoding is currently not widely supported on receiving sides. Because of that I would vote to support RFC-2047 encoding on sending site, at least until URI encoding support is implemented in commons-fileupload.

About receiving side, as far as I understood we should support RFC-2047, percent encoding and directly encoded header (in non ASCII charset). Support for filename*= can be dropped as RFC-7578 clearly forbids it.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 31, 2017

Juergen Hoeller commented

Let's turn this into a 5.0 topic then, possibly picking up a Commons FileUpload update as well.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented May 4, 2017

Rossen Stoyanchev commented

I've made one change already in FormHttpMessageConverter for the client side to encode part headers directly to the configured charset (UTF-8 by default), also setting the charset of the "Content-Type" header so the server knows how to decode, which is what Apache Commons FileUpload seems to do. The RFC 2047 filename encoding is still there but used only if you set the separate "multipartCharset" property.

I'm moving the ticket to RC2 to complete the server side work. My plan for the server side is to support direct encoding (based on the charset of the "Content-Type" header). I would also drop the RFC 5987 encoding that we currently support since RFC-7578 forbids it. I hesitate to support RFC 2047 mainly because it requires javax.mail as a dependency, which is less than ideal for this reason alone, and because I'm hoping direct encoding goes far enough to meet the need for encoded filenames. As for URI encoding I'm still not sure how would the server detect that is used, would it check for a scheme prefix for example?

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 20, 2017

Rossen Stoyanchev commented

StandardMultipartHttpServletRequest now supports both RFC 5987 and RFC 2047 filename encoding although the latter works only if javax.mail is present.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 7, 2017

Juergen Hoeller commented

rstoyanchev, it just occurred to me that Servlet 3.1 has a getSubmittedFileName() method on Part: Since Spring Framework 5.0 is Servlet 3.1+ anyway, shouldn't we simply be delegating to that method? Or at least have an option to delegate to that?

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Sep 7, 2017

Rossen Stoyanchev commented

Good point.

The Javadoc (and spec) are scant on information about what the method should do.

The Tomcat implementation seems to respect RFC-6266 which we already discussed above shouldn't be used. The Jetty implementation doesn't support any decoding, just unquotes.

We would probably have to have an option to delegate to that, if required, to call it as an alternative to what we have now which supports RFC 5987 and 2047.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.