Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Character encoding problem in Content-Disposition section in HTTP POST body (multipart/form-data requests) #95

Closed
saliksaly opened this issue May 25, 2022 · 18 comments

Comments

@saliksaly
Copy link

saliksaly commented May 25, 2022

Hi,

I have encountered interesting problem with proxying http post requests - when uploading files.

Content-Type: multipart/form-data is used and form values are each in it's own Content-Disposition section (the format is described here: https://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2).

In Content-Disposition header, characters in the filename value are malformed - different than in the original request.

Example from testing:

from the original request:

Content-Disposition: form-data; name="files"; filename="Abby Wynne - Jak se mít dobře.pdf"
Content-Type: application/pdf

from the proxied request:

Content-Disposition: form-data; name="files"; filename="Abby Wynne - Jak se m�t dobre.pdf"
Content-Type: application/pdf

There is problem when using non-ascii characters in file name. Something with character encoding...

Before, I used ProxyKit (https://github.com/proxykit/ProxyKit). It's similar library - there was not this problem. It is not supported so I switched to AspNetCore.Proxy. It works without problems in my project (all traffic from one "light" application goes to other one, instead of some local endpoints).

Could you please throw an eye on this issue? :)

Thanks

@twitchax
Copy link
Owner

Yeah, cool. This is likely an easy fix. Will be even easier if you could add a failing test, and I will debug it.

@saliksaly
Copy link
Author

Yeah, cool. This is likely an easy fix. Will be even easier if you could add a failing test, and I will debug it.

OK, I will add the test.

@saliksaly
Copy link
Author

Here is pull request for the failing test: #96

@twitchax
Copy link
Owner

Thanks: I should have time to take a look this weekend!

@twitchax
Copy link
Owner

twitchax commented Jun 1, 2022

@saliksaly, hey, thanks for the test!

However, after lots of debugging, it looks like the proxy server is behaving properly. I think this has something to do with the postman endpoint you are using. I have verified that the Content-Disposition remains completely unchanged right before it is sent over the wire.

More importantly, I have swapped out one line in your test. Instead of proxying the request through the proxy, I changed it to go straight to postman:

[Fact]
public async Task CanProxyControllerPostWithFormAndFilesRequest()
{
    var content = new MultipartFormDataContent();
    content.Add(new StringContent("123"), "xyz");
    content.Add(new StringContent("456"), "xyz");
    content.Add(new StringContent("321"), "abc");
    const string fileName = "Test こんにちは file.txt";
    const string fileString = "This is a test file こんにちは with non-ascii content.";
    var fileContent = new StreamContent(new System.IO.MemoryStream(Encoding.UTF8.GetBytes(fileString)));
    content.Add(fileContent, "testFile", fileName);

    //var response = await _client.PostAsync("api/multipart", content);

    /// //////////////////////////////////////////
    var response = await new HttpClient().PostAsync("https://postman-echo.com/post", content);
    /// ///////////////////////

    /// response.EnsureSuccessStatusCode();
    var responseString = await response.Content.ReadAsStringAsync();
    var json = JObject.Parse(responseString);

    var form = Assert.IsAssignableFrom<JObject>(json["form"]);
    Assert.Equal(2, form.Count);
    var xyz = Assert.IsAssignableFrom<JArray>(form["xyz"]);
    Assert.Equal(2, xyz.Count);
    Assert.Equal("123", xyz[0]);
    Assert.Equal("456", xyz[1]);
    Assert.Equal("321", form["abc"]);

    var files = Assert.IsAssignableFrom<JObject>(json["files"]);
    Assert.Single(files);
    var file = files.ToObject<Dictionary<string, string>>().Single();
    Assert.Equal($"data:application/octet-stream;base64,{Convert.ToBase64String(Encoding.UTF8.GetBytes(fileString))}", file.Value);
    Assert.Equal(fileName, file.Key);
}

It fails in the exact same way, so it appears postman is giving back the wrong file name.

@twitchax twitchax reopened this Jun 1, 2022
@saliksaly
Copy link
Author

Yes, it seems there is no problem in this test case. I tested the post request and filename is in correct format:

--1d5a1936-c3f3-4611-b4c5-0ca2bdd5deca
Content-Disposition: form-data; name=testFile; filename="=?utf-8?B?VGVzdCDjgZPjgpPjgavjgaHjga8gZmlsZS50eHQ=?="; filename*=utf-8''Test%20%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF%20file.txt

When the request is built so using MultipartFormDataContent object, the request is ok.

But, there is still a problem, when forwarding file request. I will create some repo to show the problem. I am unable to show it in the tests...

@saliksaly
Copy link
Author

Here is simple ASPNET Core application which makes multipart/form-data http post request with file content to postman-echo.com/post to examine the request.
https://github.com/saliksaly/proxy_http_post_multipard_bad_encoding_test

There are three html forms, each sends file in different way:

Try to uload file with non-ascii characters in name (Test こんにちは file.txt for example), you will see broken characters in the filename sent through AspNetCore.Proxy:

image

Side-note

Direct request from browser differs from request made in automaic tests (using MultipartFormDataContent class).
The filename is encoded with base64:

--9a6a972e-0c9c-4deb-b51a-b8aacdd8ab7a
Content-Disposition: form-data; name=testFile; filename="=?utf-8?B?VGVzdCDjgZPjgpPjgavjgaHjga8gZmlsZS50eHQ=?="; filename*=utf-8''Test%20%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF%20file.txt

@twitchax
Copy link
Owner

twitchax commented Jun 1, 2022

Thanks!

@twitchax
Copy link
Owner

twitchax commented Jun 1, 2022

Where are you capturing those files that you diffed?

@twitchax
Copy link
Owner

twitchax commented Jun 1, 2022

The reason I ask is because I cannot repro. When I forward the request to a an endpoint running in the same process by starting a background kestral server, the file name is correct.

image

@twitchax
Copy link
Owner

twitchax commented Jun 3, 2022

@saliksaly, thoughts?

@saliksaly
Copy link
Author

Where are you capturing those files that you diffed?

Hi,

For comparison of behaviour, in all three cases the requests are sent to https://postman-echo.com/post (defined in https://github.com/saliksaly/proxy_http_post_multipard_bad_encoding_test/blob/master/Constants.cs).

You cannot repro? Ty to upload the same file with unicode characters in name in all three forms and you will see the difference...

@twitchax
Copy link
Owner

twitchax commented Jun 5, 2022

I think the problem is with postman, as a I said before. Have you tried this with something other than postman?

@saliksaly
Copy link
Author

No no, it is not the postman's problem.
Try again the test app: https://github.com/saliksaly/proxy_http_post_multipard_bad_encoding_test.
It now sends http posts to local application: https://github.com/saliksaly/proxy_http_post_multipard_bad_encoding_test (http://localhost:5141/file).

Here are the differences:

  • Post file - direct - file name ok
    image
    image
  • Post file - through ProxyKit - file name ok
    image
    image
  • Post file - through AspNetCore.Proxy - file name BAD
    image
    image

Could you try it on your own, please, to see the difference?

@twitchax
Copy link
Owner

twitchax commented Jun 6, 2022

That's the confusing part: I did try it on my own, and the IFormFile has the correct file name.
image

@twitchax
Copy link
Owner

twitchax commented Jun 6, 2022

Finally figured it out. HttpClient uses an old school RFC for Content-Disposition, so I forced the header.

@twitchax twitchax closed this as completed Jun 6, 2022
@twitchax
Copy link
Owner

twitchax commented Jun 6, 2022

New release is here.

@saliksaly
Copy link
Author

It already works! :-)
Thank you, very much :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants