Glacier: Raise exception on empty archive upload attempt #1083

Closed
wants to merge 1 commit into
from
Jump to file or symbol
Failed to load files and symbols.
+12 −0
Split
@@ -44,3 +44,6 @@ def __init__(self, expected_responses, response):
class UploadArchiveError(Exception):
pass
+
+class EmptyArchiveError(UploadArchiveError):
+ pass
View
@@ -27,6 +27,7 @@
import math
import json
+from .exceptions import EmptyArchiveError
_ONE_MEGABYTE = 1024 * 1024
@@ -89,6 +90,10 @@ def compute_hashes_from_fileobj(fileobj, chunk_size=1024 * 1024):
linear_hash.update(chunk)
chunks.append(hashlib.sha256(chunk).digest())
chunk = fileobj.read(chunk_size)
+
+ if not chunks:
+ raise EmptyArchiveError()
@jamesls

jamesls Oct 29, 2012

Owner

Wouldn't it make more sense for this check to happen up a level? It just seems odd that compute_hashes_from_fileobj raises an EmptyArchiveError. It seems more appropriate for Vault.upload_archive to perform the empty file check and raise an exception if appropriate

@petertodd

petertodd Oct 30, 2012

It's almost a reasonable thing to do if Vault.upload_archive has been given a file, because with a file it can find out ahead of time how big the file it, but unfortunately it isn't a reasonable thing if it's been given a file descriptor instead. (like stdin) In that case the current code calls compute_hashes_from_fileobj() without being able to do any validation on the file object. To allow Vault.upload_archive() to raise the error you'd have to give compute_hashes_from_fileobj() it's own "empty file" error message, because tree_hash() just doesn't make sense without something to hash. (well, it should have just done sha256('').digest(), but that's not how Amazon defined it)

Even with a file of a known length, someone could still truncate the file between the check of it's length and when we upload it, so putting a similar check in the Writer class as opposed to upload_archive() still makes sense too.

IMO Amazon really should have allowed zero-length uploads...

+
return linear_hash.hexdigest(), bytes_to_hex(tree_hash(chunks))
@@ -156,6 +161,10 @@ def close(self):
return
if self._buffer_size > 0:
self.send_part()
+
+ if not self._tree_hashes:
+ raise EmptyArchiveError()
+
# Complete the multiplart glacier upload
hex_tree_hash = bytes_to_hex(tree_hash(self._tree_hashes))
response = self.vault.layer1.complete_multipart_upload(self.vault.name,