utf-8 encoding issue when uploading a string #266

Closed
schorsch opened this Issue Sep 9, 2013 · 9 comments

schorsch commented Sep 9, 2013

I am reading a file's content and putting it as a string into the request body, and I get UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)

In connection.rb https://github.com/geemus/excon/blob/master/lib/excon/connection.rb#L176

if datum[:body].is_a?(String) # write out string body
  socket.write(request << datum[:body]) # write out request + headers + body

You concatenate the request string (probably encoded as UTF-8) with the body in whatever encoding it comes in, in my case binary, and this probably throws the error.

I think making two calls to socket.write could solve the problem. For now I am saving the string to a tempfile and passing the file handle to Excon, which is not very elegant.
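
For readers following along, here is a minimal sketch of the failure and of the two-write idea. It assumes the request head happens to contain a non-ASCII UTF-8 character, since Ruby only raises when both strings hold non-ASCII bytes in incompatible encodings. `FakeSocket`, `REQUEST_HEAD`, `BINARY_BODY` and both helper methods are hypothetical stand-ins, not Excon code.

```ruby
# Hypothetical stand-ins; none of these names come from Excon.
REQUEST_HEAD = "PUT /caf\u00e9 HTTP/1.1\r\nHost: example.com\r\n\r\n" # UTF-8 with a non-ASCII char
BINARY_BODY  = "\xFF\xD8\xFF\xE0".b                                   # ASCII-8BIT with high bytes

# Records every write as raw bytes, roughly the way a socket would see them.
class FakeSocket
  attr_reader :writes

  def initialize
    @writes = []
  end

  def write(data)
    @writes << data.b # .b returns a binary copy; no transcoding happens
    data.bytesize
  end
end

# Returns the exception raised by concatenating head and body, or nil.
def concat_error(head, body)
  head.dup << body
  nil
rescue Encoding::CompatibilityError => e
  e
end

# Writes head and body with two separate calls, so the strings are never joined.
def write_separately(socket, head, body)
  socket.write(head)
  socket.write(body)
end

error = concat_error(REQUEST_HEAD, BINARY_BODY)
puts error.class # the concatenation fails with a compatibility error

socket = FakeSocket.new
write_separately(socket, REQUEST_HEAD, BINARY_BODY)
puts socket.writes.map(&:bytesize).inspect
```

With an ASCII-only head the same concatenation succeeds and the result simply becomes binary, which may explain why the error only shows up for some requests.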

geemus commented Sep 10, 2013

@schorsch - I believe that if you just provide a file handle, i.e. File.open('path'), as the value for the body you pass in, Excon should make the proper invocations for the encoding to work. Could you try that and see if you have any better luck?

ghost commented Sep 10, 2013

@schorsch Instead of using a tempfile, you could use StringIO. Excon.put(url, :body => StringIO.new(my_str))
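
A sketch of why wrapping the string can help, assuming the client streams an IO-like body in fixed-size chunks. The `CHUNK_SIZE` value and the `each_chunk` helper are illustrative, not Excon's. `StringIO#read` called with a length returns binary (ASCII-8BIT) strings, so the chunks do not carry the original encoding into any later concatenation.

```ruby
require 'stringio'

CHUNK_SIZE = 4 # illustrative; real clients use far larger chunks

# Yields successive chunks read from any IO-like object until EOF.
def each_chunk(io, size = CHUNK_SIZE)
  while (chunk = io.read(size))
    yield chunk
  end
end

my_str = "caf\u00e9 au lait" # UTF-8 string with a multibyte character
chunks = []
each_chunk(StringIO.new(my_str)) { |chunk| chunks << chunk }

puts chunks.map(&:encoding).uniq.inspect # every chunk is binary
puts chunks.join.bytes == my_str.bytes   # no bytes were lost or changed
```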

schorsch commented Sep 11, 2013

@geemus yes, this is what I am doing and it works, but I already have the binary string and don't want to create another tempfile
@burns thanks, I'll give it a try

I still think this is a bad workaround, which could be prevented by not concatenating the two strings. Btw, I am using it together with fog S3.

geemus commented Sep 11, 2013

@schorsch - will it "just work" if the body is written separately from the rest? The current implementation was chosen for performance reasons (concatenation is faster than writing to the socket). But if it solves this problem, it seems like it might be worth the small overhead it imposes.

ghost commented Sep 14, 2013

@geemus I don't think the overhead of a separate write would be an issue. But I'll let @schorsch confirm his use case.
However, I wanted to point out another solution. Instead of Excon treating a String body differently from an IO-like body, have Excon convert a String body to a StringIO. Note that StringIO.new(str) does not create a new string, but simply holds a pointer to str. I say this because I realized there was an issue with using a String body under ruby-1.8.7. Even with a relatively small string, all requests were timing out (see backup/backup@f6a8408). I decided not to bring it up at that time, since ruby-1.8.7 is EOL. But given this issue, I thought I'd mention it.
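
The point about StringIO not copying can be checked directly. This is a minimal sketch; `to_io_body` is a hypothetical helper illustrating the proposal, not an Excon method.

```ruby
require 'stringio'

# Wraps a String body in a StringIO and leaves IO-like bodies untouched.
# (Hypothetical helper, not part of Excon.)
def to_io_body(body)
  body.is_a?(String) ? StringIO.new(body) : body
end

str = "x" * 1_000_000
io  = to_io_body(str)

puts io.string.equal?(str)     # the StringIO refers to the very same String object
puts to_io_body(io).equal?(io) # IO-like bodies pass through unchanged
```

With a helper like this, the rest of the request code would only need one code path that reads from an IO.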

geemus commented Sep 16, 2013

@burns interesting. Treating both more similarly seems desirable in its own right. As long as that takes care of the encoding issue, it certainly seems worth doing.

geemus added a commit that referenced this issue Sep 24, 2013

nelhage added a commit to nelhage/excon that referenced this issue Mar 4, 2016

Write the first body chunk in the initial packet.
When excon#266 was fixed, it caused a regression in the behavior from excon#233,
which deliberately merged the initial writes to the socket, for network
efficiency and to avoid triggering the pathological behavior caused by
the combination of Nagle's Algorithm and TCP delayed ACKs.

Restore that optimization, in a slightly more general way: Do a
nonblocking read of an initial chunk of data off the provided body, and
merge that with the headers before sending.
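
A rough sketch of the idea in that commit message, under stated assumptions: it uses a plain blocking `read` instead of the nonblocking read the commit describes, and `RecordingSocket`, `send_request` and `CHUNK` are hypothetical names, not the actual Excon implementation. Both pieces are converted to binary before merging, so the encoding error from this issue cannot come back.

```ruby
require 'stringio'

CHUNK = 8 # illustrative chunk size

# Records each write so the packet boundaries can be inspected. (Hypothetical.)
class RecordingSocket
  attr_reader :writes

  def initialize
    @writes = []
  end

  def write(data)
    @writes << data.b
    data.bytesize
  end
end

# Sends the headers plus the first body chunk in one write, then streams the rest.
def send_request(socket, head, body_io, chunk_size = CHUNK)
  first = body_io.read(chunk_size) || "".b # simplification: blocking read
  socket.write(head.b << first.b)          # binary + binary never conflicts
  while (chunk = body_io.read(chunk_size))
    socket.write(chunk)
  end
end

head = "PUT /caf\u00e9 HTTP/1.1\r\n\r\n"                # UTF-8 with a non-ASCII char
body = StringIO.new("\xFF\xD8\xFF\xE0 rest of body".b) # binary payload

socket = RecordingSocket.new
send_request(socket, head, body)
puts socket.writes.length
puts socket.writes.first.bytesize
```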

geemus commented Mar 15, 2016

I think this was fixed, but do let me know if you are still seeing issues.

@geemus geemus closed this Mar 15, 2016

schorsch commented Mar 16, 2016

Just recently re-checked my hack and removed it, so the issue is resolved for me, thanks for closing

geemus commented Mar 16, 2016

Great, thanks for confirming.
