Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement(aws_s3 source): Send content_md5 when writing objects #7936

Merged
merged 2 commits into from Jun 21, 2021

Conversation

jszwedko
Copy link
Member

This is required when the bucket has object
locking

enabled, but also gives a bit more confidence that the object makes it
to S3 uncorrupted.

Unfortunately the test I added doesn't actually validate that Vector
works with object locked buckets due to a gap in
localstack
but
I did verify it manually be creating a bucket with object locking
enabled, observing the failure, making the implementation change, and
observing the success.

Closes: #7310

Signed-off-by: Jesse Szwedko jesse@szwedko.me

This is required when the bucket has [object
locking](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock.html)
enabled, but also gives a bit more confidence that the object makes it
to S3 uncorrupted.

Unfortunately the test I added doesn't actually validate that Vector
works with object locked buckets due to [a gap in
localstack](localstack/localstack#4166) but
I did verify it manually be creating a bucket with object locking
enabled, observing the failure, making the implementation change, and
observing the success.

Closes: #7310

Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
@jszwedko jszwedko requested review from spencergilbert and a team June 18, 2021 19:57
Signed-off-by: Jesse Szwedko <jesse@szwedko.me>
#[tokio::test]
async fn s3_insert_message_into() {
let cx = SinkContext::new_test();

let config = config(1000000).await;
let bucket = uuid::Uuid::new_v4().to_string();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored these tests a bit to always create a new bucket to avoid any pollution and separate the config creation from the bucket creation.

Copy link
Contributor

@spencergilbert spencergilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to have this toggle-able?

@jszwedko
Copy link
Member Author

Is there any reason to have this toggle-able?

The only thing I can think of is if it reduced performance, but it seems to be negligible. Using their benches:

test bench1_10    ... bench:          19 ns/iter (+/- 0) = 526 MB/s
test bench2_100   ... bench:         153 ns/iter (+/- 10) = 653 MB/s
test bench3_1000  ... bench:       1,509 ns/iter (+/- 74) = 662 MB/s
test bench4_10000 ... bench:      15,090 ns/iter (+/- 670) = 662 MB/s

I believe the HTTP requests will far outweigh the additional cost introduced here.

@jszwedko jszwedko merged commit 444a166 into master Jun 21, 2021
@jszwedko jszwedko deleted the aws-s3-object-lock branch June 21, 2021 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AWS S3 sink with Object Lock enabled
2 participants