feat(s3): add option to auto delete objects upon bucket removal #9751
Conversation
Great suggestion, will wait for this one +1 :)
+1
Thanks @Chriscbr - this is great! :)
This PR has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.
Don't close.
How about using the CR framework from core here: https://github.com/aws/aws-cdk/tree/master/packages/%40aws-cdk/core#the-corecustomresourceprovider-class? I think it will be much cleaner...
Thanks! Definitely seems cleaner. @Chriscbr what do you think? Will save a lot of boilerplate.
Yeah, that sounds like a great idea - thanks for bringing this feature to my attention, @jogold! I'll try to rework the implementation as soon as I'm able.
Not finished, but I've begun the main porting (the code coverage check is failing). Todo: [...]
policyStatements: [
  {
    Effect: 'Allow',
    Resource: '*',
Suggested change:
-    Resource: '*',
+    Resource: this.bucketArn,
It turns out the policy for this lambda needs access to all buckets, since the function is a singleton shared by the entire stack. So I think specifying either * or arn:aws:s3:::* should work here.
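For context, a minimal sketch of the kind of singleton wiring under discussion, using the core CustomResourceProvider (the resource type id, code directory, runtime member, and action list here are assumptions, not the PR's exact code):

```ts
import * as path from 'path';
import { Construct, CustomResourceProvider, CustomResourceProviderRuntime } from '@aws-cdk/core';

// Hypothetical helper illustrating the trade-off above: the provider is created
// once per stack, so its role policy cannot easily be scoped to a single bucket.
function getAutoDeleteProvider(scope: Construct): string {
  return CustomResourceProvider.getOrCreate(scope, 'Custom::S3AutoDeleteObjects', {
    codeDirectory: path.join(__dirname, 'auto-delete-objects-handler'),
    runtime: CustomResourceProviderRuntime.NODEJS_12,
    policyStatements: [{
      Effect: 'Allow',
      Action: ['s3:GetBucket*', 's3:List*', 's3:DeleteObject*'],
      // '*' (or 'arn:aws:s3:::*'), because the singleton serves every bucket in the stack
      Resource: '*',
    }],
  });
}
```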
I actually think this is a very important consideration. I'm not comfortable with creating a lambda function that has permissions to delete all objects from all buckets.
I think it's better to create a provider per bucket, which I know is currently not possible, but maybe we should add it.
@rix0rrr What do you think about either making the constructor public, or allowing passing a scope?
Can't we use a bucket policy here instead? (a policy that would allow the role of the custom resource to act on the bucket)
Hi all - sorry it took me a while to get back to this PR. I've updated the unit tests to use [...].

@iliapolo mentioned: [...]

I'm not totally sure how significant the security risks are, but this sounds like a valid consideration. The one point I will mention is that with the current implementation, even if we changed CustomResourceProvider to make this possible (since right now the [...]).

@jogold mentioned: [...]

I did look into this today, but because of CustomResourceProvider's restricted APIs there's no direct way to refer to the IAM role that would need to be included in the Bucket Policy's policy statement. Currently you can do something hacky like this:

const provider = Stack.of(this).node.findChild(`${AUTO_DELETE_OBJECTS_RESOURCE_TYPE}CustomResourceProvider`);
const role = Node.of(provider).children.find((x) => (x as CfnResource).cfnResourceType === 'AWS::IAM::Role')!;
const roleArn = Token.asString((role as CfnResource).getAtt('Arn'));
this.addToResourcePolicy(new iam.PolicyStatement({
  principals: [new iam.ArnPrincipal(roleArn)],
  actions: [...perms.BUCKET_READ_ACTIONS, ...perms.BUCKET_DELETE_ACTIONS],
  resources: [this.bucketArn, this.arnForObjects('*')],
}));

But even when I tried this (and removed the blanket policy statement in [...])
This is in great shape now! Some small comments:
@@ -0,0 +1,43 @@
export async function handler(event: AWSLambda.CloudFormationCustomResourceEvent) {
  // eslint-disable-next-line @typescript-eslint/no-require-imports, import/no-extraneous-dependencies
  const s3 = new (require('aws-sdk').S3)();
Why not initialize this outside of the handler scope, in the global one? This would avoid passing the instance to every function and also the usage of the any type for the s3 client.
Ah right, this is a remnant of how I was mocking the s3 client previously - I'll go fix this.
You could do something like:
import * as AWS from 'aws-sdk';
const s3 = new AWS.S3();
So, for reasons I can't ascertain, when I hoist the const s3 = new (require(... line to the file's outer scope, or even perform a regular import like you suggested, jest stops mocking the module properly, causing all my tests to make actual aws-sdk API calls (the first error it runs into is a NoSuchBucket exception, since the tests use dummy bucket names).

One thing to keep in mind is that the @aws-cdk/aws-s3 package doesn't actually have aws-sdk as a dependency. (That makes the above behavior odd - but the entire repo shares a single node_modules directory which does have aws-sdk, so it's not that surprising.)

Based on that, unless there is a way to fix this import behavior, I'll just keep passing the s3 client around. At the end of the day I think this is OK - I've tested that it works, and I'm not worried about any performance implications of instantiating the client inside the method, since the "create" and "delete" operations only run once per bucket-and-custom-resource pair. But if you have any tips to fix this, that is cool too. :)
The jest mocks might not be working in this new scenario because of how the actual module is imported. Previously, since the module was imported inside the handler, the jest.mock call could be placed in the describe block. Now that the module is imported at the file level, the jest.mock call should also be done at the top of the test file (since the actual module gets imported as soon as you import the handler file).

Please give it a try. If it works, great! Otherwise you can keep this in the current form.
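For illustration, a minimal sketch of that layout (the import path, event shape, and mocked S3 calls are assumptions rather than the PR's actual test file; it also assumes the handler lists object versions when handling a Delete):

```ts
// Declaring the mock before the handler import, as discussed above, ensures the
// handler's top-level `new AWS.S3()` call receives the mocked constructor.
jest.mock('aws-sdk', () => {
  const listObjectVersions = jest.fn().mockReturnValue({
    promise: jest.fn().mockResolvedValue({ Versions: [] }),
  });
  const deleteObjects = jest.fn().mockReturnValue({
    promise: jest.fn().mockResolvedValue({}),
  });
  return { S3: jest.fn(() => ({ listObjectVersions, deleteObjects })) };
});

import { handler } from '../lib/auto-delete-objects-handler';

test('lists object versions on a Delete event', async () => {
  // The factory always hands out the same mock functions, so a fresh instance
  // here exposes the calls made by the handler's own module-level client.
  // eslint-disable-next-line @typescript-eslint/no-require-imports
  const s3 = new (require('aws-sdk').S3)();

  await handler({
    RequestType: 'Delete',
    ResourceProperties: { BucketName: 'MyBucket' },
  } as any);

  expect(s3.listObjectVersions).toHaveBeenCalled();
});
```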
Ah, that made it work! Thank you for teaching me something new. :) I did have to move the mock above the handler import in order for it to fully work, but I think this is OK.
    BucketName: 'MyBucket',
  },
};
await invokeHandler(event);
Awaiting the invokeHandler is the "WHEN" section here :)
Amazing! LGTM 👍
@Chriscbr can you share the resulting bucket policy? I don't see why it wouldn't work. Also, if we go in this direction we should cleanly expose the role of the custom resource provider. The policy should look like this one:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/<CR-LAMBDA-ROLE>"
      },
      "Action": "s3:DeleteObject",
      "Resource": "arn:aws:s3:::<BUCKET>/*"
    }
  ]
}
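In CDK terms, attaching such a statement to the bucket could look roughly like the sketch below, assuming the provider's role ARN were cleanly exposed (the helper name, the hypothetical providerRoleArn argument, and the exact action list are illustrative):

```ts
import * as iam from '@aws-cdk/aws-iam';
import * as s3 from '@aws-cdk/aws-s3';

// Hypothetical: `providerRoleArn` stands in for a cleanly exposed role ARN on the
// custom resource provider, which is exactly what is being asked for above.
function allowProviderToEmptyBucket(bucket: s3.Bucket, providerRoleArn: string): void {
  bucket.addToResourcePolicy(new iam.PolicyStatement({
    effect: iam.Effect.ALLOW,
    principals: [new iam.ArnPrincipal(providerRoleArn)],
    actions: ['s3:ListBucket', 's3:ListBucketVersions', 's3:DeleteObject', 's3:DeleteObjectVersion'],
    resources: [bucket.bucketArn, bucket.arnForObjects('*')],
  }));
}
```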
@iliapolo I want to take another look, but for the next week or so I am fairly occupied with university exams, so it will be hard to find time. I'm okay with moving this to a draft or closed state, or letting anyone else take a stab at this.
@Chriscbr cool, good luck with your exams! :)
The role ARN can then be used in resource-based policies. See aws#9751 (comment)
Hi, I am new to this thread. I also looked into this issue and solved it with something like this: [...]
The role ARN can then be used in resource-based IAM policies. See #9751 (comment)

Use the custom resource provider from core to delete objects in the bucket. A bucket policy gives the correct permissions to the provider's Lambda function role. Credits to @Chriscbr for starting the work on this. Closes #3297 Closes #9751
Closes #3297
Implementation
To include the custom resource needed for this feature, I initially wanted to just use a nested sub-package like I saw used in aws-cdk/aws-dynamodb-global - this would allow me to write tests, separate the dependencies out, etc. However, when I tried creating a lambda function, I quickly discovered that we can't actually use lambda as a dependency, because that would cause a dependency cycle. After further research, I found this problem had been encountered before in the BucketNotification class, and it was solved by creating a fake object that behaved like a lambda function, so I copied a lot of this code.

The second problem I found was that since I can't directly use lambda.Code.fromAsset, I can't easily include the sub-package as a zip file for the lambda. (In particular, my understanding is that lambda.Code.fromAsset works by hooking into the cdk deploy system and uploading the asset, so this would be complex to do without the aws-cdk/aws-lambda library.) The solution to this is to just inline the function directly, so this is also the approach I went for.

Once we migrate to a monolithic cdk library later this year, we can probably refactor this if desired.
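A minimal sketch of the inline approach described above (the construct name, runtime string, and property shape are assumptions, not the PR's actual code):

```ts
import * as cdk from '@aws-cdk/core';

// A "fake" lambda function declared as a raw CfnResource, so that the aws-s3
// package does not need to depend on @aws-cdk/aws-lambda. The handler source is
// embedded directly in the template via ZipFile instead of being uploaded as an asset.
class AutoDeleteObjectsFunction extends cdk.Construct {
  public readonly functionArn: string;

  constructor(scope: cdk.Construct, id: string, props: { roleArn: string; inlineCode: string }) {
    super(scope, id);

    const resource = new cdk.CfnResource(this, 'Resource', {
      type: 'AWS::Lambda::Function',
      properties: {
        Handler: 'index.handler',
        Runtime: 'nodejs12.x',
        Role: props.roleArn,
        Timeout: 300,
        Code: { ZipFile: props.inlineCode },
      },
    });

    this.functionArn = resource.getAtt('Arn').toString();
  }
}
```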
Testing
I've added a few unit tests so far, and I tested that this works with versioned and encrypted buckets on my personal AWS account. Looking for constructive feedback / missing blind spots as always!
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license