Streaming downloads using net/http, http.rb or wget
janko-m Make internal Tempfile of Down::ChunkedIO inaccessible to outside pro…

When opening a Down::ChunkedIO to a remote file, the user might want to
directly stream the contents to some other destination, for example an
AWS S3 bucket. Since Down::ChunkedIO is by default caching the retrieved
content to disk for rewindability, other programs can theoretically
retrieve the downloaded file content before the Tempfile is deleted, by
searching for filesystem entries.

This is not ideal in terms of security. To combat that, other libraries
like Rack (Rack::RewindableInput) and Unicorn (Unicorn::TeeInput) use a
technique where the file permissions are immediately stripped to
completely prohibit read & write, and in case of POSIX filesystems the
file entry is even removed from the filesystem. This ensures no one can
access the file other than the Ruby process that's holding the file

So, we copy the same technique here.
Latest commit cdc71c3 Dec 11, 2018


Down is a utility tool for streaming, flexible and safe downloading of remote files. It can use open-uri + Net::HTTP, HTTP.rb or wget as the backend HTTP library.


gem "down", "~> 4.4"


The primary method is Down.download, which downloads the remote file into a Tempfile:

require "down"

tempfile = Down.download("...")
tempfile #=> #<Tempfile:/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/20150925-55456-z7vxqz.jpg>


The returned Tempfile has some additional attributes extracted from the response data:

tempfile.content_type      #=> "text/plain"
tempfile.original_filename #=> "document.txt"
tempfile.charset           #=> "utf-8"

Maximum size

When you're accepting URLs from an outside source, it's a good idea to limit the file size, because an attacker can point your servers at an arbitrarily large file. Down allows you to pass a :max_size option:

Down.download("...", max_size: 5 * 1024 * 1024) # 5 MB
# Down::TooLarge: file is too large (max is 5MB)

What is the advantage over simply checking size after downloading? Well, Down terminates the download very early, as soon as it gets the Content-Length header. And if the Content-Length header is missing, Down will terminate the download as soon as the downloaded content surpasses the maximum size.
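
The two checks described above can be sketched in plain Ruby. This is only an illustration of the idea, not Down's implementation: `TooLarge` and `copy_with_limit` are hypothetical names, and the chunks come from an ordinary Enumerator rather than a real HTTP response.

```ruby
# Sketch only: TooLarge and copy_with_limit are hypothetical, not part of Down.
class TooLarge < StandardError; end

# chunks:         any enumerable yielding String chunks
# content_length: Integer taken from the Content-Length header, or nil if absent
# max_size:       maximum number of bytes to accept
def copy_with_limit(chunks, content_length:, max_size:)
  # Reject before reading any body when the header already exceeds the limit.
  if content_length && content_length > max_size
    raise TooLarge, "file is too large (max is #{max_size} bytes)"
  end

  received = String.new(encoding: Encoding::BINARY)
  chunks.each do |chunk|
    received << chunk.b
    # Header missing: stop as soon as the running total passes the limit.
    raise TooLarge, "file is too large (max is #{max_size} bytes)" if received.bytesize > max_size
  end
  received
end

body = copy_with_limit(["ab", "cd"], content_length: 4, max_size: 10)
```

Raising from inside the loop abandons the rest of the enumeration, so the remaining chunks are never requested.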


By default the remote file will be downloaded into a temporary location and returned as a Tempfile. If you would like the file to be downloaded to a specific location on disk, you can specify the :destination option:

Down.download("...", destination: "/path/to/destination")

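Basic authentication

Down.download and Down.open will automatically detect and apply HTTP basic authentication from the URL:

Down.download("http://user:password@example.org")
Down.open("http://user:password@example.org")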
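
To see what "detecting credentials from the URL" amounts to, here is a small sketch using Ruby's standard URI library. The helper name `credentials_from` is hypothetical and this is not necessarily how Down extracts them internally.

```ruby
require "uri"

# Hypothetical helper: returns [user, password], or nil when the URL
# carries no userinfo component.
def credentials_from(url)
  uri = URI.parse(url)
  return nil unless uri.user
  [uri.user, uri.password]
end

with_auth    = credentials_from("http://user:password@example.org")
without_auth = credentials_from("http://example.org")
```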

Progress

Down.download supports :content_length_proc, which gets called with the value of the Content-Length header as soon as it's received, and :progress_proc, which gets called with the current file size whenever a new chunk is downloaded.

Down.download "...",
  content_length_proc: -> (content_length) { ... },
  progress_proc:       -> (progress)       { ... }
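
A common use of these two callbacks is computing a percentage. The sketch below runs without the gem: `simulate_download` is a hypothetical stand-in that invokes the procs in the order described above, so only the two lambdas resemble what you would actually pass.

```ruby
# Hypothetical stand-in for a downloader: reports the total once, then the
# running byte count after every chunk.
def simulate_download(chunks, total, content_length_proc:, progress_proc:)
  content_length_proc.call(total)
  downloaded = 0
  chunks.each do |chunk|
    downloaded += chunk.bytesize
    progress_proc.call(downloaded)
  end
end

total       = nil
percentages = []

simulate_download(
  ["a" * 25, "b" * 25, "c" * 50], 100,
  content_length_proc: ->(content_length) { total = content_length },
  progress_proc:       ->(progress) { percentages << (100.0 * progress / total).round },
)
```

In real use the Content-Length header may be missing, so a progress callback should be prepared for a total that was never set.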


Down has the ability to retrieve content of the remote file as it is being downloaded. The Down.open method returns a Down::ChunkedIO object which represents the remote file on the given URL. When you read from it, Down internally downloads chunks of the remote file, but only as much as is needed.

remote_file = Down.open("...")
remote_file.size # read from the "Content-Length" header

remote_file.read(1024) # downloads and returns first 1 KB
remote_file.read(1024) # downloads and returns next 1 KB

remote_file.eof? #=> false
remote_file.read # downloads and returns the rest of the file content
remote_file.eof? #=> true

remote_file.close # closes the HTTP connection and deletes the internal Tempfile


By default the downloaded content is internally cached into a Tempfile, so that when you rewind the Down::ChunkedIO, it continues reading the cached content that it had already retrieved.

remote_file = Down.open("...")
remote_file.read(1*1024*1024) # downloads, caches, and returns first 1MB
remote_file.rewind
remote_file.read(1*1024*1024) # reads the cached content
remote_file.read(1*1024*1024) # downloads the next 1MB
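
The caching behaviour can be illustrated with a small stand-alone class. This is a sketch under stated assumptions, not Down's code: `CachingReader` and `chunks_fetched` are hypothetical names, and the "remote" content is just an Enumerator over an array.

```ruby
require "tempfile"

# Hypothetical sketch: serves reads from a Tempfile cache and only pulls new
# chunks from the source when the cache runs out.
class CachingReader
  attr_reader :chunks_fetched

  def initialize(chunks)
    @chunks         = chunks # Enumerator yielding String chunks
    @cache          = Tempfile.new("cache", binmode: true)
    @chunks_fetched = 0
  end

  def read(length)
    # Fetch until the cache holds `length` bytes past the current position.
    while @cache.size - @cache.pos < length
      chunk = next_chunk
      break if chunk.nil?
      position = @cache.pos
      @cache.seek(0, IO::SEEK_END) # append the new chunk at the end
      @cache.write(chunk)
      @cache.seek(position)        # restore the read position
    end
    @cache.read(length).to_s
  end

  def rewind
    @cache.rewind
  end

  def close
    @cache.close! # close and delete the cache file
  end

  private

  def next_chunk
    chunk = @chunks.next
    @chunks_fetched += 1
    chunk
  rescue StopIteration
    nil
  end
end

reader = CachingReader.new(%w[abcd efgh ijkl].each)
first  = reader.read(4) # fetches one chunk from the source
reader.rewind
again  = reader.read(4) # served from the cache, nothing is fetched
```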

If you want to save on IO calls and on disk usage, and don't need to be able to rewind the Down::ChunkedIO, you can disable caching downloaded content:

Down.open("...", rewindable: false)

Yielding chunks

You can also yield chunks directly as they're downloaded via #each_chunk, in which case the downloaded content is not cached into a file regardless of the :rewindable option.

remote_file = Down.open("...")
remote_file.each_chunk { |chunk| ... }


You can access the response status and headers of the HTTP request that was made:

remote_file = Down.open("...")
remote_file.data[:status]   #=> 200
remote_file.data[:headers]  #=> { ... }
remote_file.data[:response] # returns the response object

Note that a Down::NotFound error is automatically raised if the response status is 4xx or 5xx.


Down.open performs the HTTP logic and returns an instance of Down::ChunkedIO. However, Down::ChunkedIO is a generic class that can wrap any kind of streaming. It accepts an Enumerator that yields chunks of content, and provides an IO-like interface over that enumerator, calling it whenever more content is needed.

require "down/chunked_io"

Down::ChunkedIO.new(...)

  • :chunks – Enumerator that yields chunks of content
  • :size – size of the file if it's known (returned by #size)
  • :on_close – called when streaming finishes or IO is closed
  • :data – custom data that you want to store (returned by #data)
  • :rewindable – whether to cache retrieved data into a file (defaults to true)
  • :encoding – force content to be returned in specified encoding (defaults to Encoding::BINARY)

Here is an example of creating a streaming IO of a MongoDB GridFS file:

require "down/chunked_io"

mongo = Mongo::Client.new(...)
bucket = mongo.database.fs

content_length = bucket.find(_id: id).first[:length]
stream = bucket.open_download_stream(id)

io = Down::ChunkedIO.new(
  size: content_length,
  chunks: stream.enum_for(:each),
  on_close: -> { stream.close },
)
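
If your source doesn't already expose an each-style iterator, you can build the chunk Enumerator yourself. The sketch below assumes only that the source responds to #read; `chunk_enumerator` is a hypothetical helper and the 16 KB chunk size is an arbitrary choice.

```ruby
require "stringio"

# Hypothetical helper: turns any object responding to #read(length) into an
# Enumerator that yields fixed-size chunks until the source is exhausted.
def chunk_enumerator(io, chunk_size: 16 * 1024)
  Enumerator.new do |yielder|
    while (chunk = io.read(chunk_size))
      yielder << chunk
    end
  end
end

source = StringIO.new("a" * 40_000)
chunks = chunk_enumerator(source)

# With the gem loaded, the enumerator could then be wrapped using the options
# listed above, for example:
#   Down::ChunkedIO.new(chunks: chunks, size: source.size, on_close: -> { source.close })
```

Nothing is read from the source until the enumerator is actually consumed, which is what makes on-demand streaming possible.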


Down tries to recognize various types of exceptions and re-raise them as one of the Down::Error subclasses. This is Down's exception hierarchy:

  • Down::Error
    • Down::TooLarge
    • Down::NotFound
      • Down::InvalidUrl
      • Down::TooManyRedirects
      • Down::ResponseError
        • Down::ClientError
        • Down::ServerError
      • Down::ConnectionError
      • Down::TimeoutError
      • Down::SSLError
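
Because of this nesting, the order of rescue clauses matters: more specific classes have to come before their ancestors. The sketch below uses stand-in classes in a hypothetical `DownSketch` module that mirror part of the hierarchy above, so it runs without the gem; with Down loaded you would rescue the Down::* classes directly. The `classify` helper and the symbols it returns are illustrative only.

```ruby
# Stand-in classes mirroring part of the hierarchy listed above (assumption:
# same parent/child relationships as shown in the list).
module DownSketch
  class Error < StandardError; end
  class TooLarge < Error; end
  class NotFound < Error; end
  class ResponseError < NotFound; end
  class ClientError < ResponseError; end
  class ServerError < ResponseError; end
  class TimeoutError < NotFound; end
end

# Hypothetical helper: runs the block and maps any download error to a symbol.
def classify
  yield
  :ok
rescue DownSketch::TooLarge
  :too_large
rescue DownSketch::ServerError, DownSketch::TimeoutError
  :retry_later # likely transient, listed before the NotFound parent
rescue DownSketch::NotFound
  :not_found   # also catches ClientError and other subclasses via inheritance
rescue DownSketch::Error
  :failed
end

outcome = classify { raise DownSketch::ClientError, "404 Not Found" }
```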


By default Down implements Down.download and Down.open using the built-in open-uri + Net::HTTP Ruby standard libraries. However, there are other backends as well; see the sections below.

You can use the backend directly:

require "down/net_http"

Down::NetHttp.download("...")
Down::NetHttp.open("...")

Or you can set the backend globally (default is :net_http):

require "down"

Down.backend :http # use the Down::Http backend

Down.download("...")
Down.open("...")

open-uri + Net::HTTP

gem "down", "~> 4.4"
require "down/net_http"

tempfile = Down::NetHttp.download("...")
tempfile #=> #<Tempfile:/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/20150925-55456-z7vxqz.jpg>

io = Down::NetHttp.open("...")
io #=> #<Down::ChunkedIO ...>

Down::NetHttp.download is implemented as a wrapper around open-uri, and fixes some of open-uri's undesired behaviours:

  • uses URI::HTTP#open or URI::HTTPS#open directly for security
  • always returns a Tempfile object, whereas open-uri returns StringIO when file is smaller than 10KB
  • gives the extension to the Tempfile object from the URL
  • allows you to limit maximum number of redirects
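
Two of these points (always returning a Tempfile, and carrying over the extension from the URL) can be sketched with the standard library alone. The helper name `to_tempfile` and the "download" basename are hypothetical; this illustrates the idea rather than Down's internals.

```ruby
require "stringio"
require "tempfile"
require "uri"

# Hypothetical helper: copies any readable IO into a Tempfile whose extension
# is taken from the path component of the URL.
def to_tempfile(io, url)
  extension = File.extname(URI.parse(url).path.to_s)
  tempfile  = Tempfile.new(["download", extension], binmode: true)
  tempfile.write(io.read)
  tempfile.rewind
  tempfile
end

small = StringIO.new("hello") # the kind of object open-uri returns for small bodies
file  = to_tempfile(small, "http://example.com/notes.txt?version=2")
```

Taking the extension from the parsed path rather than the raw URL keeps query strings out of the file name.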

Down::NetHttp.open, on the other hand, is implemented using Net::HTTP directly, as open-uri doesn't support downloading on demand.


Down::NetHttp#download turns off open-uri's redirect following, as open-uri doesn't have a way to limit the maximum number of hops, and implements its own. By default a maximum of 2 redirects will be followed, but you can change it via the :max_redirects option:

Down::NetHttp.download("...")                   # 2 redirects allowed
Down::NetHttp.download("...", max_redirects: 5) # 5 redirects allowed
Down::NetHttp.download("...", max_redirects: 0) # 0 redirects allowed

Down::NetHttp.open("...")                       # 2 redirects allowed
Down::NetHttp.open("...", max_redirects: 5)     # 5 redirects allowed
Down::NetHttp.open("...", max_redirects: 0)     # 0 redirects allowed
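
Bounded redirect following is simple to express once the HTTP call is abstracted away. In the sketch below everything is hypothetical: `follow_redirects`, the `TooManyRedirects` class and the `fetch` callable (which returns a status and a Location value) stand in for real requests, so the example runs without any network access.

```ruby
# Sketch only: these names are illustrative and not Down's API.
class TooManyRedirects < StandardError; end

# fetch: callable taking a URL and returning [status, location_or_nil]
def follow_redirects(url, fetch, max_redirects: 2)
  redirects = 0
  loop do
    status, location = fetch.call(url)
    # Anything that is not a redirect ends the chain.
    return url unless (300..399).cover?(status) && location
    raise TooManyRedirects, "too many redirects" if redirects >= max_redirects
    redirects += 1
    url = location
  end
end

# A fake "server" with two hops: a -> b -> c
routes = {
  "http://a.example" => [302, "http://b.example"],
  "http://b.example" => [302, "http://c.example"],
  "http://c.example" => [200, nil],
}
fetch = ->(url) { routes.fetch(url) }

final_url = follow_redirects("http://a.example", fetch) # two hops fit the default limit
```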


An HTTP proxy can be specified via the :proxy option:

Down::NetHttp.download("...", proxy: "...")
Down::NetHttp.open("...", proxy: "...")


Timeouts can be configured via the :open_timeout and :read_timeout options:

Down::NetHttp.download("...", open_timeout: 5)
Down::NetHttp.open("...", read_timeout: 10)


Request headers can be added via the :headers option:

Down::NetHttp.download("...", headers: { "Header" => "Value" })
Down::NetHttp.open("...", headers: { "Header" => "Value" })

SSL options

The :ssl_ca_cert and :ssl_verify_mode options are supported, and they have the same semantics as in open-uri:

Down::NetHttp.download("...",
  ssl_ca_cert:     "/path/to/cert",
  ssl_verify_mode: OpenSSL::SSL::VERIFY_PEER)

Additional options

Any additional options passed to Down::NetHttp.download will be forwarded to open-uri, so you can for example add basic authentication or a timeout:

Down::NetHttp.download "...",
  http_basic_authentication: ['john', 'secret'],
  read_timeout: 5

You can also initialize the backend with default options:

net_http = Down::NetHttp.new(open_timeout: 3)

net_http.download("...")
net_http.open("...")


gem "down", "~> 4.4"
gem "http", "~> 4.0"
require "down/http"

tempfile = Down::Http.download("...")
tempfile #=> #<Tempfile:/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/20150925-55456-z7vxqz.jpg>

io = Down::Http.open("...")
io #=> #<Down::ChunkedIO ...>

Some features that give the HTTP.rb backend an advantage over open-uri + Net::HTTP include:

  • Low memory usage (10x less than open-uri/Net::HTTP)
  • Proper SSL support
  • Support for persistent connections
  • Global timeouts (limiting how long the whole request can take)
  • Chainable builder API for setting default options

Additional options

All additional options will be forwarded to HTTP::Client#request:

Down::Http.download("...", headers: { "Foo" => "Bar" })
Down::Http.open("...", follow: { max_hops: 0 })

However, it's recommended to configure request options using http.rb's chainable API, as it's more convenient than passing raw options.

Down::Http.open("...") do |client|
  client.timeout(connect: 3, read: 3)
end

You can also initialize the backend with default options:

http = Down::Http.new(headers: { "Foo" => "Bar" })
# or
http = Down::Http.new { |client| client.timeout(connect: 3) }

http.download("...")
http.open("...")

Request method

By default Down::Http makes a GET request to the specified endpoint, but you can specify a different request method using the :method option:

Down::Http.download("...", method: :post)
Down::Http.open("...", method: :post)

down = Down::Http.new(method: :post)
down.download("...")

Wget (experimental)

gem "down", "~> 4.4"
gem "posix-spawn" # omit if on JRuby
gem "http_parser.rb"
require "down/wget"

tempfile = Down::Wget.download("...")
tempfile #=> #<Tempfile:/var/folders/k7/6zx6dx6x7ys3rv3srh0nyfj00000gn/T/20150925-55456-z7vxqz.jpg>

io = Down::Wget.open("...")
io #=> #<Down::ChunkedIO ...>

The Wget backend uses the wget command line utility for downloading. One major advantage of wget is that it automatically resumes downloads that were interrupted due to network failures, which is very useful when you're downloading large files.

However, the Wget backend should still be considered experimental, as it wasn't easy to implement a CLI wrapper that streams output, so it's possible that I've made mistakes. Let me know how it's working out for you 😉.

Additional arguments

You can pass additional arguments to the underlying wget command via symbols:

Down::Wget.download("...", :no_proxy, connect_timeout: 3)
Down::Wget.open("...", user: "janko", password: "secret")

You can also initialize the backend with default arguments:

wget = Down::Wget.new(:no_proxy, connect_timeout: 3)

wget.download("...")
wget.open("...")

Supported Ruby versions

  • MRI 2.2
  • MRI 2.3
  • MRI 2.4
  • JRuby


You can run tests with

$ bundle exec rake test

The test suite pulls and runs kennethreitz/httpbin as a Docker container, so you'll need to have Docker installed and running.