According to RFC 2046 and RFC 7578, the request header of multipart/form-data should not include the charset. #6250

Open
dongfangtianyu opened this issue Mar 17, 2024 · 3 comments · May be fixed by #6251 or #6253

@dongfangtianyu

Expected behavior

HTTP Header

Content-Type: multipart/form-data; boundary=2W1aSJ1TtJC_jRaGnbotI-RaHchFMAO

Actual behavior

HTTP Header

Content-Type: multipart/form-data; boundary=2W1aSJ1TtJC_jRaGnbotI-RaHchFMAO; charset=UTF-8

Steps to reproduce the problem

JMeter 5.6.3 always adds ; charset= to the Content-Type request header of multipart/form-data requests.

On some web server implementations, including charset in the request header Content-Type of multipart/form-data can result in parsing errors of the boundary, leading to a failure in sending form content.

Example 1: spring-projects/spring-framework#21599
Example 2: https://bz.apache.org/bugzilla/show_bug.cgi?id=61384
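
To illustrate how a trailing charset parameter can break boundary parsing, here is a minimal sketch. The class and both method names are hypothetical and are not taken from any of the servers linked above; the "naive" variant is only an assumption about the kind of shortcut that leads to this failure.

```java
import java.util.Locale;

class BoundaryParsing {
    // Hypothetical naive parser: takes everything after "boundary=" as the boundary,
    // so any parameter that follows it ends up inside the boundary value.
    static String naiveBoundary(String contentType) {
        String key = "boundary=";
        int i = contentType.indexOf(key);
        return i < 0 ? null : contentType.substring(i + key.length());
    }

    // More tolerant parser: splits the header value on ';' and picks the boundary parameter.
    // Quoted-string parameter values are not handled, to keep the sketch short.
    static String tolerantBoundary(String contentType) {
        String key = "boundary=";
        String[] parts = contentType.split(";");
        for (int k = 1; k < parts.length; k++) {
            String p = parts[k].trim();
            if (p.toLowerCase(Locale.ROOT).startsWith(key)) {
                return p.substring(key.length());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header =
            "multipart/form-data; boundary=2W1aSJ1TtJC_jRaGnbotI-RaHchFMAO; charset=UTF-8";
        System.out.println("naive:    " + naiveBoundary(header));
        System.out.println("tolerant: " + tolerantBoundary(header));
    }
}
```

With the naive variant, the computed boundary no longer matches the delimiter lines in the body, so no form parts are found.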

According to the current RFC specifications, sending charset in this header is incorrect:

In RFC 2046 [4.1.2] :

  • For text and its subtypes (e.g., text/plain), the charset parameter should be passed in the Content-Type.
  • That is, the HTTP body consists entirely of characters in the specified charset.

multipart/form-data does not belong to the text subtype, and the HTTP body may contain both text and binary data.

In RFC 7578 [5.1.2], rules for form encoding (form-charset) are defined:

  1. If multipart/form-data specifies a charset, it should be located in the HTTP body rather than the HTTP header.
  2. If charset is not specified for multipart/form-data, UTF-8 is used by default.
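
As a rough sketch of what "charset in the body" can look like, the example below builds a small multipart/form-data body by hand. It assumes the `_charset_` form field and the per-part `Content-Type: text/plain; charset=...` header described in RFC 7578 (sections 4.6 and 4.5 respectively); the class, method, and field names are made up for illustration and are not JMeter or HttpClient code.

```java
import java.nio.charset.StandardCharsets;

class FormCharsetSketch {
    // Builds one text part. partCharset may be null to omit the per-part Content-Type.
    static String textPart(String boundary, String name, String value, String partCharset) {
        StringBuilder sb = new StringBuilder();
        sb.append("--").append(boundary).append("\r\n");
        sb.append("Content-Disposition: form-data; name=\"").append(name).append("\"\r\n");
        if (partCharset != null) {
            // The charset is declared on the part, inside the body, not on the request header.
            sb.append("Content-Type: text/plain; charset=").append(partCharset).append("\r\n");
        }
        sb.append("\r\n").append(value).append("\r\n");
        return sb.toString();
    }

    // Builds a body with a "_charset_" field followed by one text field ("comment" is hypothetical).
    static byte[] body(String boundary) {
        String s = textPart(boundary, "_charset_", "UTF-8", null)
                 + textPart(boundary, "comment", "h\u00e9llo", "UTF-8")
                 + "--" + boundary + "--\r\n";
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String boundary = "2W1aSJ1TtJC_jRaGnbotI-RaHchFMAO";
        // The request header carries only the boundary parameter.
        System.out.println("Content-Type: multipart/form-data; boundary=" + boundary);
        System.out.println();
        System.out.print(new String(body(boundary), StandardCharsets.UTF_8));
    }
}
```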

Therefore, HTTP headers like the following are non-compliant with the specification (and cause errors in some web server behaviors):

Content-Type: multipart/form-data; boundary=2W1aSJ1TtJC_jRaGnbotI-RaHchFMAO; charset=UTF-8

Interestingly, this HTTP header is also non-compliant with the specification (but doesn't cause errors as it lacks a boundary):

Content-type: application/json; charset=utf-8

I am not yet familiar with JMeter. If I have misunderstood something, please let me know and close this issue.

Thank you.

JMeter Version

5.6.3

Java Version

17

OS Version

No response

@vlsi
Collaborator

vlsi commented Mar 18, 2024

I've filed the issue for HttpClient: https://issues.apache.org/jira/browse/HTTPCLIENT-2325

vlsi removed the to-triage label Mar 18, 2024
@dongfangtianyu
Author

If HttpClient removes new BasicNameValuePair("charset", charsetCopy.name()),
or moves it before new BasicNameValuePair("boundary", boundaryCopy),
that should solve this problem.
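
The two options above can be sketched with plain Java. This is not HttpClient's actual code: `ContentTypeParams`, `format`, `withoutCharset`, and `charsetFirst` are hypothetical names, and the sketch only shows how parameter order and presence affect the resulting header value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ContentTypeParams {
    // Joins a MIME type and its parameters in insertion order.
    static String format(String mimeType, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(mimeType);
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append("; ").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Option 1: drop the charset parameter entirely.
    static String withoutCharset(String boundary) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("boundary", boundary);
        return format("multipart/form-data", params);
    }

    // Option 2: keep charset, but emit it before boundary so boundary is the last parameter.
    static String charsetFirst(String boundary, String charset) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("charset", charset);
        params.put("boundary", boundary);
        return format("multipart/form-data", params);
    }

    public static void main(String[] args) {
        String boundary = "2W1aSJ1TtJC_jRaGnbotI-RaHchFMAO";
        System.out.println("Content-Type: " + withoutCharset(boundary));
        System.out.println("Content-Type: " + charsetFirst(boundary, "UTF-8"));
    }
}
```

Option 2 only works around parsers that read everything after boundary= as the boundary; the header would still carry a charset parameter.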

However, I still don't understand why JMeter insists on calling multipartEntityBuilder.setCharset.
I would be grateful for an explanation.

@dongfangtianyu
Author

I might understand now. Thanks again.

vlsi added a commit to vlsi/httpcomponents-client that referenced this issue Mar 18, 2024
…uests

Previously, the "charset" parameter was added to the Content-Type header; however, adding "charset=..."
is not specified in RFC 7578, and it causes issues with (flawed?) HTTP servers.

The change does not modify ContentType.MULTIPART_FORM_DATA, and it might have backward compatibility
side-effects.

See
* apache/jmeter#6250
* owasp-modsecurity/ModSecurity@6e56950
* https://bz.apache.org/bugzilla/show_bug.cgi?id=61384
* akka/akka-http#338
vlsi added a commit to vlsi/jmeter that referenced this issue Mar 18, 2024
ok2c pushed a commit to apache/httpcomponents-client that referenced this issue Mar 18, 2024