
Cannot upload large files using Multipart to an S3 bucket with both Object Locking and Encryption Enabled. #2851

@jfleming-ic

Description


Describe the bug

We are attempting to upload large files, using multipart uploads, to an S3 bucket that has both Object Lock and encryption enabled. We set the following configuration on the TransferManager:

setAlwaysCalculateMultipartMd5(true)

However, we receive the following stack trace when attempting to upload:

com.amazonaws.services.s3.model.AmazonS3Exception: Content-MD5 OR x-amz-checksum- HTTP header is required for Put Part requests with Object Lock parameters (Service: Amazon S3; Status Code: 400; Error Code: InvalidRequest; Request ID:; S3 Extended Request ID:; Proxy: null)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1387)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
    at com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3887)
    at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3872)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadPartsInSeries(UploadCallable.java:323)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInParts(UploadCallable.java:226)
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:147)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:115)
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:45)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)

Expected Behavior

Uploads should respect the following configuration when uploading parts, including when encryption is enabled (which forces parts to be uploaded in series):

transferManager.getConfiguration().setAlwaysCalculateMultipartMd5(true)

Multipart uploads should work on an S3 bucket with both encryption and Object Lock enabled.

Current Behavior

We are required to increase our multipart upload threshold so that the multipart path is never taken, which causes a steep performance penalty on large uploads.

Reproduction Steps

1 - Create an S3 bucket with both Object Lock and encryption enabled.
2 - Create a file with 50 MB of content at /my-mega-file.txt.
3 - Run the following code to trigger the issue (I don't think I can share our actual production code that's hitting this, so this is a rough mockup of it):

import com.amazonaws.AmazonClientException;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.kms.AWSKMS;
import com.amazonaws.services.kms.AWSKMSClientBuilder;
import com.amazonaws.services.s3.AmazonS3EncryptionClientV2Builder;
import com.amazonaws.services.s3.AmazonS3EncryptionV2;
import com.amazonaws.services.s3.model.CryptoConfigurationV2;
import com.amazonaws.services.s3.model.CryptoMode;
import com.amazonaws.services.s3.model.KMSEncryptionMaterialsProvider;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;

import java.io.File;

public class MultipartObjectLockRepro {

    public static void main(String[] args) {
        String bucketName = "my-bucket-for-aws";
        String keyName = "my-file";
        File file = new File("/my-mega-file.txt");
        String kmsKeyId = "some-key-id";

        AWSKMS kmsClient = AWSKMSClientBuilder.standard()
                .withRegion(Regions.DEFAULT_REGION)
                .build();

        // Client-side encryption client; encryption is what forces the
        // serial part-upload path inside TransferManager.
        AmazonS3EncryptionV2 s3Encryption = AmazonS3EncryptionClientV2Builder.standard()
                .withRegion(Regions.DEFAULT_REGION)
                .withKmsClient(kmsClient)
                .withCryptoConfiguration(new CryptoConfigurationV2().withCryptoMode(CryptoMode.AuthenticatedEncryption))
                .withEncryptionMaterialsProvider(new KMSEncryptionMaterialsProvider(kmsKeyId))
                .build();

        TransferManager tm = TransferManagerBuilder.standard()
                .withS3Client(s3Encryption)
                .withAlwaysCalculateMultipartMd5(true)
                .build();

        Upload upload = tm.upload(bucketName, keyName, file);

        try {
            upload.waitForCompletion();
        } catch (AmazonClientException | InterruptedException e) {
            System.out.println("Uh oh, this could be a bug");
        }
    }
}

Possible Solution

When increasing the threshold size, we see the file upload successfully; however, this is non-ideal for our use case.
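For reference, the threshold workaround looks roughly like this (a sketch, reusing the s3Encryption client from the repro above; the 100 MB value is illustrative and would need to exceed the largest file we upload):

```java
// Workaround sketch: raise the multipart threshold above the size of the
// files being uploaded so TransferManager falls back to a single PutObject,
// which does carry the Content-MD5 header.
TransferManager tm = TransferManagerBuilder.standard()
        .withS3Client(s3Encryption)
        .withAlwaysCalculateMultipartMd5(true)
        .withMultipartUploadThreshold(100L * 1024 * 1024)
        .build();
```

This sidesteps the failing UploadPart requests entirely, at the cost of losing the multipart performance benefits on large objects.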

Additional Information/Context

We dug around in the AWS SDK code base and found the following.

The SDK checks the setAlwaysCalculateMultipartMd5 flag while performing a multipart upload, here:

futures.add(threadPool.submit(new UploadPartCallable(s3, request, shouldCalculatePartMd5())));

However, that execution path is only taken when uploading parts in parallel. Unfortunately, because we use encryption, parallel part upload is not supported, so uploadPartsInSeries() gets executed instead. Sadly, that path doesn't compute or include the MD5 hash, hence the error.
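As a lower-level workaround sketch (not our production code, and assuming one drives the multipart upload manually rather than through TransferManager), the per-part Content-MD5 values S3 wants can be computed with plain JDK classes; each base64 digest would then be attached to its part upload, e.g. via UploadPartRequest's MD5 digest setter:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;

// Sketch: compute the base64-encoded MD5 digest for each part of a payload,
// which is the format S3 expects in the Content-MD5 header. In a manual
// multipart upload, each digest would be set on the corresponding
// UploadPartRequest before calling uploadPart.
public class PartMd5Sketch {

    // Base64-encoded MD5 of a byte array.
    public static String md5Base64(byte[] data) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return Base64.getEncoder().encodeToString(md5.digest(data));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Split the payload into partSize-byte chunks and digest each one.
    public static List<String> partMd5s(byte[] payload, int partSize) {
        List<String> digests = new ArrayList<>();
        for (int offset = 0; offset < payload.length; offset += partSize) {
            int end = Math.min(offset + partSize, payload.length);
            digests.add(md5Base64(Arrays.copyOfRange(payload, offset, end)));
        }
        return digests;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[10];  // stand-in for a large file's bytes
        System.out.println(partMd5s(payload, 4));  // 3 parts: 4 + 4 + 2 bytes
    }
}
```

A real fix would presumably do the equivalent inside uploadPartsInSeries(), mirroring what UploadPartCallable already does for the parallel path.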

AWS Java SDK version used

1.12.279

JDK version used

openjdk version "1.8.0_292"

Operating System and version

Debian 9

Labels: bug (This issue is a bug.), response-requested (Waiting on additional info or feedback.)