
Past ideas (~2020) Shrine

Koichi Sasada edited this page Feb 19, 2021 · 1 revision

[Shrine] is a toolkit for file attachments in Ruby applications, similar to the CarrierWave, Paperclip, and Active Storage gems. Its most common use case is managing file uploads and processing in web applications. The Shrine team has also created additional gems that support other aspects of managing and processing files; two examples are [Down] and [ImageProcessing]. This summer, we have two project ideas for you to work on, for the [Down] and [ImageProcessing] gems.

1. Add resumable download feature to [Down] gem

We would like to add resumable downloads to [Down]. This is especially handy for large files, because an interrupted download can be resumed instead of being restarted from scratch.

Problem

When communicating with a server over a TCP connection, temporary network "hiccups" can happen, where a specific TCP operation fails unexpectedly. For example, if you're downloading a large file over HTTP, some "read" operations can fail temporarily.

When a TCP operation fails, it's probably safe to just retry that TCP operation; I think the acknowledgements in the TCP protocol handle re-sending the same data multiple times. In the context of an HTTP interaction, though, retrying an individual TCP operation might not always be useful, and it's generally better to retry the whole request.

However, when downloading large files over HTTP, it's a bit wasteful to retry the whole download from the beginning. This unexpected overhead can cause unwanted delays in a bigger system.

Feature

The HTTP protocol supports [Range requests], which allow an HTTP client to ask for only a portion of the file (if the server supports it). This enables the client to resume an interrupted download by checking how much content it has already downloaded and asking the server for only the remainder of the file content.
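To make the resume step concrete, here is a minimal sketch using Ruby's standard `Net::HTTP`. It is not part of Down; the helper names `resume_range`, `write_mode_for`, and `resume_download` are hypothetical. It assumes the partial file on disk is a valid prefix of the remote file, and it does not follow redirects.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: builds the Range header value needed to resume a
# download, based on how many bytes are already on disk.
# Returns nil when nothing has been downloaded yet.
def resume_range(path)
  offset = File.exist?(path) ? File.size(path) : 0
  offset.zero? ? nil : "bytes=#{offset}-"
end

# Hypothetical helper: picks the file mode for a given response status.
# 206 Partial Content: the server honoured the range, so append.
# 200 OK: the server sent the whole body, so start the file over.
def write_mode_for(status)
  case status
  when 206 then "ab"
  when 200 then "wb"
  else raise ArgumentError, "unexpected status #{status}"
  end
end

# Hypothetical helper: downloads url into path, asking only for the bytes
# that are still missing. Defined here for illustration, not called.
def resume_download(url, path)
  uri = URI(url)
  request = Net::HTTP::Get.new(uri)
  range = resume_range(path)
  request["Range"] = range if range

  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request) do |response|
      mode = write_mode_for(response.code.to_i)
      File.open(path, mode) do |file|
        # Stream the body in chunks instead of loading it into memory.
        response.read_body { |chunk| file.write(chunk) }
      end
    end
  end
end
```

Checking the status code matters because a server that does not support range requests may answer `200 OK` with the full body; appending that to the partial file would corrupt it.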

The idea is to implement a generic component in Ruby which would wrap a regular HTTP download and automatically resume the download on any network failures or server errors. It needs to:

  • retry the download on 5xx HTTP status code
  • retry the download on TCP connection errors
  • allow setting the number of retries, after which the download will be considered permanently failed
  • allow waiting for a certain amount of time before retrying (exponential backoff)
  • delete the downloaded file in case of permanent failure
  • allow hooking the component into both Down backends (Down::NetHttp and Down::Http)

  • Prerequisites: Ruby
  • Programming areas include: Ruby
  • Estimated difficulty level: medium to hard
  • Potential mentors: @hmistry