New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust chunksize if necessary #49
Conversation
This passes in chunksize instead of the entire config, allowing a custom chunksize to be provided.
What do you all think about not making the adjust chunk size "global"? That's one of the quirks I remember from the CLI/boto3 that this meant you had to update every single test and use "realistic" sized data in unit tests when they don't really matter (for those particular tests). What if these objects instead took a "part size adjuster". We could then have the tests use the "noop" version for unit tests that don't actually care and in the "real" code we could pass the actual chunk size adjuster. Also, just a heads up, the travis build failed in case you haven't seen it yet. |
Just wanted to capture offline conversation I had with @kyleknap . Let's go ahead and update this to be configurable. I think having a this validation be global was reasonable for the CLI, but I think for this library it makes sense to allow this to be configurable. @kyleknap let me know if I missed anything. |
As we discussed offline, there's some consideration on how we would want to open this up in the future. In the meantime, we'll settle for leaving it open to configuration by allowing an adjuster to be provided with |
self.min_size = min_size | ||
self.max_parts = max_parts | ||
|
||
def adjust_chunksize(self, current_chunksize, file_size=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we allow file_size to be None
? Do we actually need that behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. We don't know the file size when uploading non-seekable files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. It would be good to document what None
implies here though.
I'll probably defer to @kyleknap on this, but after seeing this version, I'm a little hesitant about this new concept of default values in task submission. These are essentially params that will never be invoked by "real" code (i.e through the TransferManager public API). If we're not going to offer a public API right now, I'm not sure if there's much benefit in plumbing this in right now. I suppose that puts me at -0 but I can be convinced. |
# If we try to upload a 5TB file, we'll need to use 896MB part | ||
# sizes. | ||
new_size = self.adjuster.adjust_chunksize(chunksize, file_size) | ||
self.assertEqual(new_size, 896 * (1024 ** 2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also test for when the file size is None
.
Yeah I would prefer holding off on making the Otherwise it looks good. Just had comments on the tests. |
52774d2
to
8c04bdd
Compare
Updated |
num_parts = file_size / new_size | ||
self.assertLessEqual(num_parts, MAX_PARTS) | ||
|
||
def test_unknown_file_size(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add a test when we don't know the file size and we actually make an adjustment (either above/below the thresholds).
Looks good to me, just had a small request for a test. |
Tests added |
Thanks. Looks good. 🚢 Also make sure you add a changelog entry. |
This will adjust the chunksize of an upload or copy if it is too small, too large, or if it looks like max parts will be exceeded. This is unnecessary for downloads since there are no such restrictions for ranged downloading.
84254d2
to
45a6050
Compare
This will adjust the chunksize of an upload or copy if it is too small, too
large, or if it looks like max parts will be exceeded. This is unnecessary
for downloads since there are no such restrictions for ranged downloading.
cc @kyleknap @jamesls