Multipart upload of STDIN #112

Open
kgilpin opened this Issue Aug 27, 2012 · 9 comments

kgilpin commented Aug 27, 2012

I'd like to send STDIN as a multipart upload.

I'm starting like this:

      file = STDIN
      class << file
        attr_accessor :size
      end
      file.size = file_size.to_i
      body = { 'file' => file }

      require 'httpclient'
      client = HTTPClient.new
      res = client.post(upload_url, body)

But getting:

/home/kgilpin/.rvm/gems/ruby-1.9.2-p290@client/gems/httpclient-2.2.4/lib/httpclient/http.rb:549:in `pos='
/home/kgilpin/.rvm/gems/ruby-1.9.2-p290@client/gems/httpclient-2.2.4/lib/httpclient/http.rb:549:in `reset_pos'
/home/kgilpin/.rvm/gems/ruby-1.9.2-p290@client/gems/httpclient-2.2.4/lib/httpclient/http.rb:475:in `block in dump'

or

Illegal seek - <STDIN>
/home/kgilpin/.rvm/gems/ruby-1.9.2-p290@inscitiv-client/gems/httpclient-2.2.7/lib/httpclient/http.rb:557:in `pos'
/home/kgilpin/.rvm/gems/ruby-1.9.2-p290@inscitiv-client/gems/httpclient-2.2.7/lib/httpclient/http.rb:557:in `remember_pos'
/home/kgilpin/.rvm/gems/ruby-1.9.2-p290@inscitiv-client/gems/httpclient-2.2.7/lib/httpclient/http.rb:639:in `block in build_query_multipart_str'
Owner

nahi commented Sep 13, 2012

Some IOs, such as STDIN and the ends of IO.pipe, are not seekable and have no meaningful 'pos', so Ruby raises Errno::ESPIPE for STDIN.pos. For now I don't think we can define how STDIN should be used for POST... Do you have a workaround?
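To illustrate the failure described above, here is a minimal sketch. `seekable_pos` is a hypothetical helper written for this example and is not part of HTTPClient; it only shows that asking a pipe for its position raises Errno::ESPIPE while a regular file answers normally.

```ruby
require 'tempfile'

# Hypothetical helper (not part of HTTPClient): return the IO's current
# position, or nil when the IO is not seekable.
def seekable_pos(io)
  io.pos
rescue Errno::ESPIPE
  nil
end

# A pipe behaves like piped STDIN: it has no position to report.
reader, writer = IO.pipe
pipe_pos = seekable_pos(reader)
reader.close
writer.close

# A regular file is seekable, so its position is available.
tmp = Tempfile.new('pos-demo')
tmp.write('abc')
file_pos = seekable_pos(tmp)
tmp.close!

puts "pipe: #{pipe_pos.inspect}, file: #{file_pos.inspect}"
```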

kgilpin commented Sep 13, 2012

The multipart_post gem is able to upload STDIN.

Basically, it looks like this:

require 'net/http/post/multipart'

file = STDIN
class << file
  attr_accessor :length
end
file.length = file_size.to_i
upload_io = UploadIO.new(STDIN, mime, file_name)

require 'uri'
url = URI.parse(upload_url)
req = Net::HTTP::Post::Multipart.new url.request_uri, "file" => upload_io, "size" => file_size
res = Net::HTTP.start(url.host, url.port, :use_ssl => url.scheme == 'https') do |http|
  http.request(req)
end
unless [ 200, 201 ].member?(res.code.to_i)
  raise "upload failed with HTTP status #{res.code} : #{res.body}"
end
Owner

nahi commented Sep 13, 2012

I guess the multipart_post gem doesn't properly handle authentication that requires posting the body again. The following could work (not tested), but I'm still not sure it's what you want to do.


class SizedIO
  def initialize(io, size)
    @io, @size = io, size
  end

  def read
    @io.read(@size)
  end
end

HTTPClient.post("http://dev.ctor.org/", :file => SizedIO.new(STDIN, 5))
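As a slightly fuller sketch of the same idea, the wrapper below also exposes `size` and honors the length argument of `read`, so a caller that reads in chunks never goes past the declared size. `BoundedIO` is a hypothetical name for this example; whether HTTPClient would accept such an object is an assumption and has not been tested here. The optional output-buffer argument of IO#read is not supported.

```ruby
require 'stringio'

# Hypothetical wrapper (not part of HTTPClient): exposes a declared size
# and never reads more than that many bytes from the underlying IO.
class BoundedIO
  attr_reader :size

  def initialize(io, size)
    @io = io
    @size = size
    @remaining = size
  end

  # Follows IO#read(length) conventions: with a length, returns up to that
  # many bytes, or nil once the declared size is used up; without a length,
  # returns everything that remains ('' when nothing is left).
  def read(length = nil)
    return '' if length == 0
    want = length ? [length, @remaining].min : @remaining
    if want.zero?
      return length ? nil : ''
    end
    data = @io.read(want)
    @remaining -= data.bytesize if data
    data
  end
end

# StringIO stands in for STDIN in this example.
sized = BoundedIO.new(StringIO.new('hello world'), 5)
puts sized.read(3).inspect
puts sized.read(10).inspect
puts sized.read(10).inspect
```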

Hi,

I'm interested in this topic.

As far as I know, HTTPClient can handle non-seekable IOs by sending them with chunked encoding.
HTTPClient calls IO#pos in many places and rescues the error, but the call in HTTP::Message::Body#remember_pos is not rescued.
The following patch adds a rescue clause to remember_pos.

diff --git a/lib/httpclient/http.rb b/lib/httpclient/http.rb
index 2945540..09b81c0 100644
--- a/lib/httpclient/http.rb
+++ b/lib/httpclient/http.rb
@@ -555,6 +555,7 @@ module HTTP
       def remember_pos(io)
         # IO may not support it (ex. IO.pipe)
         @positions[io] = io.pos if io.respond_to?(:pos)
+      rescue Errno::ESPIPE
       end

       def reset_pos(io)

With the above patch applied, you can POST the content of STDIN.

require 'httpclient'
body = { "test" => STDIN }
client = HTTPClient.new
res = client.post("http://somewhere", body)

BTW, please bear in mind that using a non-seekable IO has some restrictions, which are described in the source code of HTTPClient.

  • The upload content will be sent with chunked encoding, but some Web servers and Web applications don't support chunked encoding.
  • Seeking is needed for following HTTP redirection.
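To make the second restriction concrete, here is a small sketch. `replay` is a hypothetical helper for this example only: it reads a body once, then tries to rewind and read it again, which is roughly what a client has to do when it re-sends a request after a redirect. A seekable IO can be replayed; a pipe cannot.

```ruby
require 'stringio'

# Hypothetical helper: read the body, then try to rewind and read it again.
# Returns [first_read, second_read]; second_read is nil when the IO cannot
# be rewound.
def replay(io)
  first = io.read
  begin
    io.rewind
  rescue Errno::ESPIPE
    return [first, nil] # not seekable: the body cannot be sent a second time
  end
  [first, io.read]
end

# Seekable source: the body is available for a second request.
puts replay(StringIO.new('body')).inspect

# Pipe (like piped STDIN): once consumed, the data is gone.
reader, writer = IO.pipe
writer.write('body')
writer.close
puts replay(reader).inspect
reader.close
```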

I think writing the content of STDIN to a temporary file and then uploading that file is much better.
Here is some sample code:

require 'httpclient'
require 'tempfile'
temp = Tempfile.open("temp")
while data = STDIN.read(4096)
  temp << data
end
temp.pos = 0
body = { "test" => temp }
client = HTTPClient.new
res = client.post("http://somewhere", body)

Regards,

kgilpin commented Sep 26, 2012

My files are large (~5GB) and must be streamed, not saved, for security & compliance purposes.

People often assume that files can be saved to temp files and that just isn't always the case.

Understood.
Could you try the patch I posted? I hope it satisfies what you need.

BTW, let me correct what I wrote previously.

The upload content will be sent with chunked encoding only IF the IO does not respond to .size.
If the IO responds to .size, as it does in kgilpin's example, chunked encoding will not be used.
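The rule above can be sketched as follows. `framing_headers` is a hypothetical helper written for this example, not HTTPClient's actual code, and it ignores the multipart boundaries and part headers that a real request would add to the total length.

```ruby
require 'stringio'

# Hypothetical helper (not HTTPClient's implementation): pick the framing
# headers for a request body based on whether the IO reports a size.
def framing_headers(io)
  if io.respond_to?(:size) && io.size
    { 'Content-Length' => io.size.to_s }
  else
    { 'Transfer-Encoding' => 'chunked' }
  end
end

# A plain pipe has no size, so the body would be chunked.
reader, writer = IO.pipe
puts framing_headers(reader).inspect

# Adding a size accessor, as in the original report, switches the choice.
class << reader
  attr_accessor :size
end
reader.size = 42
puts framing_headers(reader).inspect

reader.close
writer.close
```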

Owner

nahi commented Oct 7, 2012

OK, so you want to post via an IO and mark the end of the body by specifying its size rather than by closing the IO, right?

The wrapper I posted above needs more work to do a chunked request.

@nahi nahi closed this in 13c3f87 Oct 10, 2012

Owner

nahi commented Oct 10, 2012

Now HTTPClient can post sized IO properly. Please reopen this ticket if it looks bad. Thanks for the discussion!

Owner

nahi commented Oct 10, 2012

The fix in 2.3.0 seems to cause a compatibility issue, so I reverted it and pushed 2.3.0.1. We need another fix...

@nahi nahi reopened this Oct 10, 2012

@nahi nahi added the FeatureRequest label Nov 3, 2014
