JCLOUDS-894: Add portable multipart upload #762
@danbroudy @kahing @zack-shoylev This pull request follows up on the earlier one that exposed the component multipart operations.
We need a better strategy here -- we should pick a combination of minimum part size, maximum part size, and number of parts. A good combination will redo less work after a network error and make better use of the uplink via parallel uploads.
AWS Java S3 SDK does the following:
```java
public static long calculateOptimalPartSize(PutObjectRequest putObjectRequest, TransferManagerConfiguration configuration) {
    double contentLength = TransferManagerUtils.getContentLength(putObjectRequest);
    double optimalPartSize = (double)contentLength / (double)MAXIMUM_UPLOAD_PARTS;
    // round up so we don't push the upload over the maximum number of parts
    optimalPartSize = Math.ceil(optimalPartSize);
    return (long)Math.max(optimalPartSize, configuration.getMinimumUploadPartSize());
}
```

AWS SDK defaults to a maximum of 10000 parts. The minimum default part size is 5MB. So, uploading a 51GB file, for example, would use close to 10000 parts of just over 5MB each.
jclouds could use a similar mechanism. It would probably make sense to expose the configuration parameters so that users can change the default behavior.
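A minimal sketch of what such a configurable mechanism might look like, assuming three tunables: minimum part size, maximum part size, and maximum number of parts. The class `PartSizeCalculator` and its method names are hypothetical and are not part of jclouds or the AWS SDK. It uses integer ceiling division instead of the `double` arithmetic above.

```java
// Hypothetical helper for illustration only; not a jclouds or AWS SDK class.
final class PartSizeCalculator {
    private final long minPartSize;
    private final long maxPartSize;
    private final int maxParts;

    PartSizeCalculator(long minPartSize, long maxPartSize, int maxParts) {
        if (minPartSize <= 0 || maxPartSize < minPartSize || maxParts <= 0) {
            throw new IllegalArgumentException("invalid part limits");
        }
        this.minPartSize = minPartSize;
        this.maxPartSize = maxPartSize;
        this.maxParts = maxParts;
    }

    /** Smallest part size that stays within maxParts, never below minPartSize. */
    long partSize(long contentLength) {
        if (contentLength < 0) {
            throw new IllegalArgumentException("negative content length");
        }
        // Ceiling division: round up so the upload never exceeds maxParts.
        long needed = (contentLength + maxParts - 1) / maxParts;
        long size = Math.max(needed, minPartSize);
        if (size > maxPartSize) {
            throw new IllegalArgumentException("content too large for the configured limits");
        }
        return size;
    }

    /** Number of parts needed at the chosen part size (0 for empty content). */
    long partCount(long contentLength) {
        long size = partSize(contentLength);
        return (contentLength + size - 1) / size;
    }
}
```

With limits assumed here of 5 MiB minimum, 5 GiB maximum, and 10000 parts, a 1 MiB object yields a single 5 MiB part slot, while a 100 GiB object yields 10000 parts of 10737419 bytes each. Raising the minimum part size trades more retransmitted data per failed part for fewer requests overall.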
I reparented the S3 MultipartUploadSlicingAlgorithm to core so we have the same algorithm as before.
This unifies the provider multipart upload code paths and removes code duplication.