Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File corrupted on post when using feign-form with feign httpclient #18

Closed
RaphC opened this issue Apr 5, 2017 · 8 comments
Closed

File corrupted on post when using feign-form with feign httpclient #18

RaphC opened this issue Apr 5, 2017 · 8 comments
Labels

Comments

@RaphC
Copy link

RaphC commented Apr 5, 2017

Hi,

I need help about an issue I'm facing when I use feign-form with feign httpclient.

Basically, the MultipartEncodedDataProcessor set body with a UTF_8 charset.

    template.header("Content-Type", contentType);
    template.body(outputStream.toByteArray(), UTF_8);
    outputStream.close();

whereas feign/httpclient/ApacheHttpClient.java process it as the StringEntity and not a ByteArrayEntity. (I added some comment to explain the issue).

//request body
if (request.body() != null) {
HttpEntity entity = null;
// charset is UTF-8
if (request.charset() != null) {
// extract content-type with no charset because of Content-type header looks like Content-Type: multipart/form-data; boundary=AaB03x)
ContentType contentType = getContentType(request);
// convert body to string with UTF_8 charset
String content = new String(request.body(), request.charset());
// create a StringEntity with defaut charset ISO-8859-1 (referred to RFC) when no charset in content-type. So the entity can be corrupted.
entity = new StringEntity(content, contentType);
} else {
entity = new ByteArrayEntity(request.body());
}

Is it mandatory to set the charset in template.body(outputStream.toByteArray(), UTF_8);?

Could we just set body like this template.body(outputStream.toByteArray());

Regards,

@RaphC
Copy link
Author

RaphC commented Apr 6, 2017

Apparently, default charset is ISO-8859-1 (https://tools.ietf.org/html/rfc2616#section-3.4.1). Why feign-form uses UTF_8 ?

@xxlabaza
Copy link
Collaborator

xxlabaza commented Apr 6, 2017

ISO-8859-1 - is a default only for English, UTF-8 is more common.
Take a look at feign core sources, it always uses UTF-8 as default charset.

I'm not going to implement support for another charsets, because in my opinion it is unnecessarily to use something different from UTF-8/UTF-16 in 21 century...

But if you want, you could create a pull request with different charsets support and tests, I merge it ASAP

@RaphC
Copy link
Author

RaphC commented Apr 7, 2017

Thank you @xxlabaza for your response. My problem comes from the use of feign-httpclient (which uses ISO-8859-1 as default charset if it's not specified in content-type (e.g Content-type: multipart/form-data; boundary=xxxxxxxx)) and feign-form which set UTF-8 as request default charset. I'll find another way to fix it.

@RaphC RaphC closed this as completed Apr 7, 2017
@leveluptor
Copy link

Hi, I have a similar problem.

I try to send binary file (a PNG image) as a multipart/form-data using feign-form, looks like a pretty common case to me.
I don't specify any charset, but it's set anyway in FormEncoder#encode.
Because of the non-null charset, Feign's ApacheHttpClient encodes the data as a StringEntity and it's not a valid PNG image anymore.

So, is it really mandatory to set charset for such cases?

@xxlabaza
Copy link
Collaborator

@leveluptor could you tell which Feign-Form version do you use?
Also, it would be great, if you provide a small code example for reproducing your issue

@xxlabaza xxlabaza reopened this Nov 29, 2017
@leveluptor
Copy link

leveluptor commented Nov 29, 2017

I'm using feign-form 3.0.0, feign-core and feign-httpclient 3.5.1.

Here is a sample gradle project: https://github.com/leveluptor/feignforms-demo.
The sample image is a single red pixel.
Also I included my custom MultipartFormContentProcessorWithWorkaround, which is copypasted from MultipartFormContentProcessor but doesn't set charset to the request. FormEncoderWithWorkaround is used only to register this custom processor and doesn't do anything extra. With these workarounds everything works as I expected (however, I don't insist that it's the right way, it's just a quick dirty fix).

How to reproduce the problem:
I used http://httpbin.org. http://httpbin.org/anything sends back any request it receives, so I thought it could be useful to show request data.
Run the Demo class, take a look at the output (json representation of HTTP request) and find the "files" field.
For unmodified Feign client it will look like this:

...
  "files": {
    "image": "\ufffdPNG\r\n\u001a\n\u0000\u0000\u0000\rIHDR\u0000\u0000\u0000\u0001\u0000\u0000\u0000\u0001\b\u0002\u0000\u0000\u0000\ufffdwS\ufffd\u0000\u0000\u0000\u0001sRGB\u0000\ufffd\ufffd\u001c\ufffd\u0000\u0000\u0000\u0004gAMA\u0000\u0000\ufffd\ufffd\u000b\ufffda\u0005\u0000\u0000\u0000\tpHYs\u0000\u0000\u000e\ufffd\u0000\u0000\u000e\ufffd\u0001\ufffdo\ufffdd\u0000\u0000\u0000\fIDAT\u0018Wcx+\ufffd\u0002\u0000\u0003'\u0001.\u0015k\ufffd\ufffd\u0000\u0000\u0000\u0000IEND\ufffdB`\ufffd"
...

For Feign client with a workaround it will look like this:

...
  "files": {
    "image": ""
...

The first one is a weird mix of slightly invalid unicode symbols, but the second one is a good base64-encoded image. If I paste iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAAMSURBVBhXY3growIAAycBLhVrvukAAAAASUVORK5CYII= into https://codebeautify.org/base64-to-image-converter , I get back my nice red pixel.

There is another way to check, it's more reliable, but requires boring walking through the debugger.

  1. Run Demo and look at the first lines of the output:
--- first three bytes from source image ---
-119 80 78
  1. Open feign.httpclient.ApacheHttpClient and set the breakpoint here:
  @Override
  public Response execute(Request request, Request.Options options) throws IOException {
    HttpUriRequest httpUriRequest;
    try {
      httpUriRequest = toHttpUriRequest(request, options);
    } catch (URISyntaxException e) {
      throw new IOException("URL '" + request.url() + "' couldn't be parsed into a URI", e);
    }
    HttpResponse httpResponse = client.execute(httpUriRequest); // <-- BREAKPOINT
    return toFeignResponse(httpResponse).toBuilder().request(request).build();
  }
  1. Run Demo again.
    3.1. At the first breakpoint, check the request body in request:
    request_body_plain
    The actual image starts from 154th byte. The first three bytes of the image are correct: -119, 80, 78.
    3.2. At the same breakpoint, check the request body in httpUriRequest (also you can see that it's a StringEntity):
    request_body_httpurirequest
    The first bytes are wrong.
    3.3. Release the breakpoint and wait for the second invocation (with workaround). check the request body in request:
    request_body_plain_workaround
    The first bytes are correct.
    3.4. At the same breakpoint, check the request body in httpUriRequest (also you can see that it's a ByteArrayEntity):
    request_body_httpurirequest_workaround
    The first bytes are also correct.

Overall, this looks like an interesting problem involving both feign-form and feign-httpclient. However, from the point of view of feign-httpclient, checking charset to determine if the data is binary or non-binary looks OK. Right now I can't think of a better way to suggest. Maybe something involving Content-Transfer-Encoding: binary header, but I'm not sure if it's always set.

xxlabaza pushed a commit that referenced this issue Nov 30, 2017
@xxlabaza
Copy link
Collaborator

Fixed in version 3.0.3

@leveluptor
Copy link

@xxlabaza Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants