Using Checksums in Direct Uploads

Janko Marohnić edited this page Jun 13, 2018 · 4 revisions

When doing direct uploads to your app or to a cloud service such as AWS S3, it's good practice to have the upload endpoint verify the integrity of the upload using a checksum. You can do this by calculating a base64-encoded MD5 hash of the file on the client side before the upload and including it in the Content-MD5 request header (AWS S3, Google Cloud Storage, and Shrine's upload_endpoint support this).

You can calculate the base64-encoded MD5 hash of the file using the spark-md5 and chunked-file-reader JavaScript libraries. You can pull them from unpkg:

<html>
  <head>
    <script src="https://unpkg.com/spark-md5/spark-md5.js"></script>
    <script src="https://unpkg.com/chunked-file-reader/chunked-file-reader.js"></script>
    <!-- ... -->
  </head>
  
  <body>
    <!-- ... -->
  </body>
</html>

Now you can create a fileMD5() function that calculates a base64-encoded MD5 hash of a File object and returns it as a Promise:

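function fileMD5 (file) {
  return new Promise(function (resolve, reject) {
    var spark  = new SparkMD5.ArrayBuffer(),
        reader = new ChunkedFileReader();

    // feed each chunk into the incremental MD5 state as it is read
    reader.subscribe('chunk', function (e) {
      spark.append(e.chunk);
    });

    // once the whole file has been read, finalize the hash
    reader.subscribe('end', function (e) {
      var rawHash    = spark.end(true); // true => raw binary string instead of hex
      var base64Hash = btoa(rawHash);   // Content-MD5 expects base64 of the raw digest

      resolve(base64Hash);
    });

    reader.readChunks(file);
  });
}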
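
The value fileMD5() resolves with is the base64 encoding of the raw 16-byte digest, not of the hex string, so it is always 24 characters long. As a rough way to cross-check the browser output, the same value can be produced in Node.js. This is only a sketch: it uses Node's built-in crypto module rather than the browser libraries above, and contentMD5 is a hypothetical helper name.

```javascript
const crypto = require('crypto');

// Hypothetical helper: base64-encoded MD5 of a string or Buffer,
// in the form expected by the Content-MD5 header.
function contentMD5(data) {
  return crypto.createHash('md5').update(data).digest('base64');
}

// MD5 of empty input, base64-encoded
console.log(contentMD5('')); // 1B2M2Y8AsgTpgAmY7PhCfg==
```

Hashing the same bytes with fileMD5() in the browser should give an identical string.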

Now, how you're going to include that MD5 checksum depends on whether you're uploading directly to the cloud service (with Shrine's presign_endpoint plugin), or to your app using the upload_endpoint plugin.

AWS S3, Google Cloud Storage etc.

When fetching upload parameters from the presign endpoint, Shrine storage's #presign function needs to know that you'll be adding the Content-MD5 request header to the upload request. For both AWS S3 and Google Cloud Shrine storage this is done by passing the :content_md5 presign option:

Shrine.plugin :presign_endpoint, presign_options: -> (request) do
  {
    content_md5: request.params["checksum"],
    method: :put # only for AWS S3 storage
  }
end

The above setup allows you to pass the MD5 hash via the checksum query parameter in the request to the presign endpoint. With Uppy it could look like this:

Uppy.Core({
    // ...
  })
  .use(Uppy.AwsS3, {
    getUploadParameters: function (file) {
      return fileMD5(file.data)
        .then(function (hash) { return fetch('/presign?filename=' + encodeURIComponent(file.name) + '&checksum=' + encodeURIComponent(hash)) })
        .then(function (response) { return response.json() })
    }
  })
  // ...
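
Note that a base64 string can contain +, / and = characters, and a literal + in a query string is usually decoded as a space on the server, so the checksum needs to be URL-encoded before it is sent to the presign endpoint. One possible way to build the URL is with the standard URLSearchParams class. This is a sketch: presignUrl is a hypothetical helper, and the /presign path is taken from the example above.

```javascript
// Hypothetical helper: builds the presign endpoint URL with both
// parameters safely encoded for use in a query string.
function presignUrl(filename, checksum) {
  var query = new URLSearchParams({ filename: filename, checksum: checksum });

  return '/presign?' + query.toString();
}
```

The fetch() call in getUploadParameters could then be written as fetch(presignUrl(file.name, hash)).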

Upload endpoint

When uploading the file directly to your app using the upload_endpoint Shrine plugin, you can also use checksums, as the upload endpoint automatically detects the Content-MD5 header. With Uppy it could look like this:

fileMD5(file).then(function (hash) { // `file` is the File object that will be uploaded
  Uppy.Core({
      // ...
    })
    .use(Uppy.XHRUpload, {
      endpoint: '/upload',
      fieldName: 'file',
      headers: {
        'Content-MD5': hash,
        'X-CSRF-Token': document.querySelector('meta[name=_csrf]').content,
      }
    })
    // ...
})
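
Conceptually, verifying a Content-MD5 header means recomputing the MD5 digest of the received bytes and comparing it with the header value. The Node.js sketch below illustrates that idea only. It is not Shrine's implementation (Shrine is a Ruby library), and matchesContentMD5 is a hypothetical helper.

```javascript
const crypto = require('crypto');

// Hypothetical helper: returns true when the base64-encoded MD5 of the
// received body equals the value sent in the Content-MD5 header.
function matchesContentMD5(body, headerValue) {
  var actual = crypto.createHash('md5').update(body).digest('base64');

  return actual === headerValue;
}
```

If the upload was corrupted or truncated in transit, the recomputed digest differs from the header and the request can be rejected.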

