Inconsistent/Incompatible handling of filename escaping in multipart/form-data compared to RFC 7578 and browsers #7789

Closed
@sleevi

Description

I did this

  1. Create a file named "foo.jpg (note the leading quote)
  2. Submit as a form using curl -i -F encoded_image=@\"foo.jpg https://www.google.com/searchbyimage/upload

I expected the following

Within the multipart/form-data that is generated, the filename parameter of the request is currently of the following form: filename="\"foo.jpg", using the quoted-string production of RFC 822.

However, I would expect the filename to be of the form filename="%22foo.jpg", using the Percent-Encoding Option of RFC 7578.

curl/libcurl version

ToT

operating system

Further Details

The context here is that multipart/form-data encoding is handled by lib/formdata.c. The form-data support is implemented using the generic MIME encoder, the curl_mime_* family of functions. In the example above, attaching a file is handled by calling curl_mime_filedata:

    result = curl_mime_filedata(part, file->contents);

curl_mime_filedata then sets the filename using curl_mime_filename, passing only the basename of the file path.

Later, when compiling the multipart message, the field name and filename for the Content-Disposition header are escaped using the escape_string function, as shown in curl/lib/mime.c, lines 1867 to 1886 in 52fab72:

    if(part->name) {
      name = escape_string(part->name);
      if(!name)
        ret = CURLE_OUT_OF_MEMORY;
    }
    if(!ret && part->filename) {
      filename = escape_string(part->filename);
      if(!filename)
        ret = CURLE_OUT_OF_MEMORY;
    }
    if(!ret)
      ret = Curl_mime_add_header(&part->curlheaders,
                                 "Content-Disposition: %s%s%s%s%s%s%s",
                                 disposition,
                                 name? "; name=\"": "",
                                 name? name: "",
                                 name? "\"": "",
                                 filename? "; filename=\"": "",
                                 filename? filename: "",
                                 filename? "\"": "");
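
To make the observed output concrete, here is a minimal standalone sketch of how a quoted-string style escaper combined with a similar format string yields filename="\"foo.jpg". The functions quoted_string_escape and build_disposition are hypothetical stand-ins written for illustration; they are not curl's escape_string or Curl_mime_add_header, and the 256-byte scratch buffers are an arbitrary assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a quoted-string escaper (NOT curl's actual
 * escape_string): copies src into dst, prefixing '"' and '\' with '\'.
 * Returns 0 on success, -1 if dst is too small. */
static int quoted_string_escape(const char *src, char *dst, size_t dstsize)
{
  size_t o = 0;

  assert(src != NULL && dst != NULL);

  for(; *src; src++) {
    if(*src == '"' || *src == '\\') {
      if(o + 1 >= dstsize)
        return -1;
      dst[o++] = '\\';
    }
    if(o + 1 >= dstsize)
      return -1;
    dst[o++] = *src;
  }
  if(o >= dstsize)
    return -1;
  dst[o] = '\0';
  return 0;
}

/* Hypothetical helper: composes a form-data Content-Disposition header
 * from an escaped field name and filename.
 * Returns 0 on success, -1 if any buffer is too small. */
static int build_disposition(const char *name, const char *filename,
                             char *out, size_t outsize)
{
  char ename[256]; /* arbitrary scratch size for this sketch */
  char efile[256];
  int n;

  if(quoted_string_escape(name, ename, sizeof(ename)) ||
     quoted_string_escape(filename, efile, sizeof(efile)))
    return -1;

  n = snprintf(out, outsize,
               "Content-Disposition: form-data; name=\"%s\"; filename=\"%s\"",
               ename, efile);
  return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

Calling build_disposition("encoded_image", "\"foo.jpg", buf, sizeof(buf)) fills buf with a header whose filename parameter is backslash-escaped, which is the form reported above.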

This is a complex area due to issues of character sets and filenames. Within this space, RFC 6266, Section 4.3 is relevant, and is further expanded upon in RFC 6266, Appendix C.2. RFC 7578, Section 4.2 attempts to provide guidance on handling filenames interoperably within a Content-Disposition: form-data part.

The HTML Living Standard's multipart/form-data specification places a normative dependency on RFC 7578 and provides further guidance on encoding and escaping field names and the filenames of file fields, the two values escaped via escape_string: encode to the appropriate encoding, then replace { 0x0A, 0x0D, 0x22 } with { "%0A", "%0D", "%22" }, respectively. This is compatible with the guidance of RFC 7578 and reflects widespread practice in browsers (e.g. Chromium/Chrome or WebKit/Safari).
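
As a rough illustration of that replacement rule, the sketch below percent-encodes only the three bytes listed (LF, CR and the double quote) and leaves everything else untouched. The function name form_percent_escape is hypothetical and not part of libcurl, and the sketch assumes the input has already been converted to the desired character encoding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libcurl function): returns a newly allocated
 * copy of src with LF, CR and '"' replaced by "%0A", "%0D" and "%22".
 * Returns NULL on allocation failure; the caller frees the result. */
static char *form_percent_escape(const char *src)
{
  size_t len = 0;
  const char *p;
  char *dst;
  char *out;

  assert(src != NULL);

  /* First pass: each escaped byte grows from one byte to three. */
  for(p = src; *p; p++)
    len += (*p == '\n' || *p == '\r' || *p == '"') ? 3 : 1;

  dst = malloc(len + 1);
  if(!dst)
    return NULL;

  /* Second pass: copy, substituting the three special bytes. */
  out = dst;
  for(p = src; *p; p++) {
    switch(*p) {
    case '\n':
      memcpy(out, "%0A", 3);
      out += 3;
      break;
    case '\r':
      memcpy(out, "%0D", 3);
      out += 3;
      break;
    case '"':
      memcpy(out, "%22", 3);
      out += 3;
      break;
    default:
      *out++ = *p;
      break;
    }
  }
  *out = '\0';
  return dst;
}
```

With this rule, the filename "foo.jpg from the reproduction steps would be emitted as %22foo.jpg, matching the form expected above. Note that a literal % in the input is deliberately left alone here, since the rule as quoted only names the three bytes.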

This is an area where MIME attachments differ in practice from form data. It may be useful to at least align the form encoder with the approach described in RFC 7578, for greater harmonization.
