Multipart upload fails on empty files #364

@fcheung

S3 multipart upload fails if the file is empty: reading the first chunk of data returns nil, so the upload loop at https://github.com/fog/fog-aws/blob/master/lib/fog/aws/models/storage/file.rb#L274 never runs and no parts are uploaded. S3 considers it an error to complete an upload with no parts.
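
To illustrate the failure mode, here is a minimal sketch that uses StringIO as a stand-in for the file body. The helper name read_chunks is made up for this example and is not part of fog-aws; it only mirrors the shape of the loop in multipart_save.

```ruby
require "stringio"

# Hypothetical helper (not fog-aws code): collects the chunks that a
# `while (chunk = io.read(size))` loop would see.
def read_chunks(io, chunk_size)
  chunks = []
  while (chunk = io.read(chunk_size))
    chunks << chunk
  end
  chunks
end

# With a positive length, read returns nil at EOF, so an empty IO
# never enters the loop body and yields no chunks at all.
empty_chunks = read_chunks(StringIO.new(""), 5)          # => []
data_chunks  = read_chunks(StringIO.new("hello world"), 5)
# => ["hello", " worl", "d"]
```

With an empty body the list of chunks is empty, which corresponds to part_tags being empty when the upload is completed.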

We need to ensure we always upload at least one part, even if that part is empty. I can't think of a very nice way of doing this other than checking whether the part_tags array is still empty at the end of multipart_save and uploading an empty part at that point.

Of course ideally I wouldn't use multipart at all here, but I don't always know the length of the IO that has been passed to me.

Any ideas? I was thinking along the lines of:

def multipart_save(options)
  # Initiate the upload
  res = service.initiate_multipart_upload(directory.key, key, options)
  upload_id = res.body["UploadId"]

  # Store ETags of upload parts
  part_tags = []

  # Upload each part
  # TODO: optionally upload chunks in parallel using threads
  # (may cause network performance problems with many small chunks)
  # TODO: Support large chunk sizes without reading the chunk into memory
  if body.respond_to?(:rewind)
    body.rewind rescue nil
  end
  while (chunk = body.read(multipart_chunk_size)) do
    part_upload = service.upload_part(directory.key, key, upload_id, part_tags.size + 1, chunk, part_headers(chunk, options))
    part_tags << part_upload.headers["ETag"]
  end

  # S3 rejects completing a multipart upload with no parts,
  # so upload a single empty part if nothing was read from the body
  if part_tags.empty?
    part_upload = service.upload_part(directory.key, key, upload_id, 1, '', part_headers('', options))
    part_tags << part_upload.headers["ETag"]
  end

rescue
  # Abort the upload & reraise
  service.abort_multipart_upload(directory.key, key, upload_id) if upload_id
  raise
else
  # Complete the upload
  service.complete_multipart_upload(directory.key, key, upload_id, part_tags)
end

but that's not the prettiest solution.
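
One possibly tidier variation, offered only as a sketch: move the "at least one part" guarantee into a small chunking helper, so the upload loop itself needs no special case. The name each_part is hypothetical and not part of fog-aws, and StringIO stands in for the real body.

```ruby
require "stringio"

# Hypothetical helper (not part of fog-aws): yields successive chunks of io,
# and yields a single empty string if the IO has no data at all, so the
# caller always gets at least one part.
def each_part(io, chunk_size)
  yielded = false
  while (chunk = io.read(chunk_size))
    yield chunk
    yielded = true
  end
  yield "" unless yielded
end

empty_parts = []
each_part(StringIO.new(""), 5) { |chunk| empty_parts << chunk }
# empty_parts is [""]: one empty part

data_parts = []
each_part(StringIO.new("hello world"), 5) { |chunk| data_parts << chunk }
# data_parts is ["hello", " worl", "d"]
```

Under that assumption, the loop in multipart_save could iterate with each_part(body, multipart_chunk_size) and call service.upload_part inside the block, and the separate part_tags.empty? check would no longer be needed.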
