Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack/BuildPipeline/ArtifactsBucketEncryptionKeyAlias fails to get recreated when cdk deploy follows cd destroy #4336

Closed
vgribok opened this issue Oct 2, 2019 · 22 comments · Fixed by #4400
Assignees
Labels
@aws-cdk/aws-codepipeline Related to AWS CodePipeline guidance Question that needs advice or information.

Comments

@vgribok
Copy link

vgribok commented Oct 2, 2019

A .NET CDK 1.9 (and then 1.10) project that builds a CodePipeline with CodeCommit and CodeBuild stages, fails when cdk deploy is run after cdk destroy with the error messages saying "XxxxxxxStack/BuildPipeline/ArtifactsBucketEncryptionKeyAlias (BuildPipelineArtifactsBucketEncryptionKeyAliasXXXXXXXX) alias/codepipeline-somepipelineb5eb558e already exists"
Since a KMS key can only be deleted only after at least 7 days later, the region where this error happens makes the region unusable for the CDK project for at least 7 days!

Reproduction Steps

  1. Build a stack including CodePipeline containing CodeCommit and CodeBuild. Deploy with cdk deploy
  2. Trigger a build that will check out source into an artifact.
  3. Run cdk destroy.
  4. Run cdk deploy again. Observe error above"
  5. Try to delete or rename the KMS key alias - cannot be done as it's not meant to be. KMS key can be scheduled to be deleted in nor earlier than 7 days.

Error Log

XxxxxxxStack/BuildPipeline/ArtifactsBucketEncryptionKeyAlias (BuildPipelineArtifactsBucketEncryptionKeyAliasXXXXXXXX) alias/codepipeline-somepipelineb5eb558e already exists

Environment

  • CLI Version : 1.10
  • Framework Version: 1.10
  • OS : Windows 10
  • Language : C#/.NET

Other

19/28 | 5:39:18 PM | CREATE_FAILED | AWS::KMS::Alias | CicdInfraAsCodeStack/BuildPipeline/ArtifactsBucketEncryptionKeyAlias (BuildPipelineA
rtifactsBucketEncryptionKeyAlias68E67B02) alias/codepipeline-cicdinfraascodestackbuildpipelineb5eb558e already exists
new Alias (C:\Users\username\AppData\Local\Temp\jsii-kernel-AAuzoS\node_modules@aws-cdk\aws-kms\lib\alias.js:69:26)
_ new Pipeline (C:\Users\username\AppData\Local\Temp\jsii-kernel-AAuzoS\node_modules@aws-cdk\aws-codepipeline\lib\pipeline.js:86:13)
_ C:\Users\username\AppData\Local\Temp\dcqv0cxm.fwo\jsii-runtime.js:6798:49
_ Kernel._wrapSandboxCode (C:\Users\username\AppData\Local\Temp\dcqv0cxm.fwo\jsii-runtime.js:7250:20)
_ Kernel._create (C:\Users\username\AppData\Local\Temp\dcqv0cxm.fwo\jsii-runtime.js:6798:26)
_ Kernel.create (C:\Users\username\AppData\Local\Temp\dcqv0cxm.fwo\jsii-runtime.js:6552:21)
_ KernelHost.processRequest (C:\Users\username\AppData\Local\Temp\dcqv0cxm.fwo\jsii-runtime.js:6327:28)
_ KernelHost.run (C:\Users\username\AppData\Local\Temp\dcqv0cxm.fwo\jsii-runtime.js:6270:14)
_ Immediate._onImmediate (C:\Users\username\AppData\Local\Temp\dcqv0cxm.fwo\jsii-runtime.js:6273:37)
_ processImmediate (internal/timers.js:439:21)


This is 🐛 Bug Report

@vgribok vgribok added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Oct 2, 2019
@skinny85
Copy link
Contributor

skinny85 commented Oct 2, 2019

Hey @vgribok ,

thanks for opening the issue. I think there's a few things you can do here:

  1. Remove the Alias from the AWS CLI. I'm pretty sure an Alias can be removed without removing the underlying Key (only the Key has the 7 day grace period).
  2. The Key is there only by default. You can pass an explicit Bucket when creating the Pipeline, one that doesn't have a Key (or a different Key, possibly with a different Alias):
const myArtifactBucket = new s3.Bucket(this, 'Bucket', {
  // possibly pass a Key, maybe with an Alias, here?
  // ...
});

const pipeline = new codepipeline.Pipeline(this, 'Pipeline', {
  artifactBucket: myArtifactBucket.
  // ...
});

Let me know if this helps!

Thanks,
Adam

@skinny85
Copy link
Contributor

skinny85 commented Oct 2, 2019

Actually, I just thought of an even simpler way: just change the logical ID you have of your Pipeline (for example, if the current logical ID is 'Pipeline', change it to something like 'MyPipeline', or 'Pipeline2'). That will make it create an Alias with a different name, and avoid the conflict.

@vgribok
Copy link
Author

vgribok commented Oct 2, 2019

@skinny85 Thank you for suggested workarounds. I used aws kms delete-alias --alias-name alias/codepipeline-pipelineNameXxxxx and it did let me move past this issue. Hopefully, this issue will be fixed to achieve parity between cdk deploy and cdk destroy, so that it would not be necessary to do manual/scripted cleanups after cdk destroy before running next cdk deploy.

@skinny85
Copy link
Contributor

skinny85 commented Oct 2, 2019

Actually, it was customer feedback that made us add the Alias, and not remove it :) The complaint was that otherwise, we're generating KMS keys that the customers have no idea were used in a CodePipeline (because of the lack of a human-readable alias), and so they're not sure they can remove, accumulating junk in their accounts.

@vgribok
Copy link
Author

vgribok commented Oct 2, 2019

I can certainly see that, but then subsequent cdk deploy should not be failing, by either reusing the existing key or creation a new one with randomized suffix appended to the name. WDYT?

@skinny85
Copy link
Contributor

skinny85 commented Oct 2, 2019

That would be ideal, but unfortunately we can't do that :( Aliases cannot be automatically named by CloudFormation (you are required to provide them a name), and we must auto-generate stable names (if the names were different between 2 versions of the CDK-resulting template without any code changes, it would result in a replacement of the alias each time you did cdk deploy).

I understand this is not ideal, but I think this is the best we can with the tools at our disposal.

@skinny85 skinny85 self-assigned this Oct 2, 2019
@skinny85 skinny85 added guidance Question that needs advice or information. @aws-cdk/aws-codepipeline Related to AWS CodePipeline and removed needs-triage This issue or PR still needs to be triaged. bug This issue is a bug. labels Oct 2, 2019
@vgribok
Copy link
Author

vgribok commented Oct 4, 2019

@skinny85 Adding unencrypted S3 bucket to the PipelineProps resulted in "Amazon.JSII.Runtime.JsiiException System.Reflection.TargetParameterCountException: Parameter count mismatch" exception when running .NET CDK app (cdk synth) :-(.

That's using CDK 1.11 CLI and libs.

Regarding specifying key alias explicitly, would that not have the same behavior as it just would replace auto-generated Cfn key alias name with one specified explicitly?

Meanwhile, as a workaround, the key needs to be stripped of its alias name and then scheduled for deletion with shortest possible grace period of 7 days:

aws kms delete-alias --alias-name alias/codepipeline-pipelineNameXxxxx
aws kms schedule-key-deletion --key-id <value> --pending-window-in-days 7

After the above, cdk deploy can be run again to create the pipeline.

@skinny85
Copy link
Contributor

skinny85 commented Oct 4, 2019

@skinny85 Adding unencrypted S3 bucket to the PipelineProps resulted in "Amazon.JSII.Runtime.JsiiException System.Reflection.TargetParameterCountException: Parameter count mismatch" exception when running .NET CDK app (cdk synth) :-(.

That's weird... I know it's more of a JSII bug, but out of curiosity, can you post the entire stack trace that you get?

You can also add a bucket with a key, but without an alias - maybe that works?

Thanks,
Adam

@vgribok
Copy link
Author

vgribok commented Oct 4, 2019

Here's the stack:

Unhandled Exception: Amazon.JSII.Runtime.JsiiException: System.Reflection.TargetParameterCountException: Parameter count mismatch.
   at System.Reflection.RuntimeMethodInfo.InvokeArgumentsCheck(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at Amazon.JSII.Runtime.CallbackExtensions.InvokeMethod(InvokeRequest request, IReferenceMap referenceMap)
   at Amazon.JSII.Runtime.CallbackExtensions.InvokeCallback(Callback callback, IReferenceMap referenceMap, IFrameworkToJsiiConverter converter, String& error)
   at Amazon.JSII.Runtime.Services.Client.TryDeserialize[TResponse](String responseJson)
   at Amazon.JSII.Runtime.Services.Client.ReceiveResponse[TResponse]()
   at Amazon.JSII.Runtime.Services.Client.TryDeserialize[TResponse](String responseJson)
   at Amazon.JSII.Runtime.Services.Client.ReceiveResponse[TResponse]()
   at Amazon.JSII.Runtime.Deputy.DeputyBase..ctor(DeputyProps props)
   at CicdInfraAsCode.CicdInfraAsCodeStack.CreateCiCdPipeline(Repository dockerRepo, Repository gitRepo) in C:\!Projects\AWS\Samples\Unicorn-Store\infra-as-code\CicdInfraAsCode\src\CicdInfraAsCodeStack.cs:line 44
   at CicdInfraAsCode.CicdInfraAsCodeStack..ctor(Construct parent, String id, UnicornStoreCiCdStackProps settings) in C:\!Projects\AWS\Samples\Unicorn-Store\infra-as-code\CicdInfraAsCode\src\CicdInfraAsCodeStack.cs:line 26
   at CicdInfraAsCode.Program.Main(String[] args) in C:\!Projects\AWS\Samples\Unicorn-Store\infra-as-code\CicdInfraAsCode\src\Program.cs:line 15
Subprocess exited with error 3762504530

@skinny85
Copy link
Contributor

skinny85 commented Oct 4, 2019

Thanks. Can you open an issue in the JSII repository, showing the stack trace and the code that causes it?

@skinny85
Copy link
Contributor

skinny85 commented Oct 4, 2019

Thanks!

@PeterBengtson
Copy link

PeterBengtson commented Oct 6, 2019

It would be great if the aws-codepipeline documentation Overview page could mention that the KMS key for the pipeline is used for the artifact bucket only, as it otherwise may seem that there's no way of controlling how the Pipeline construct uses KMS. I certainly fell into that trap.

It's sometimes not clear what abstractions CDK provides by default. This is especially true with encryption. The source is there of course, but with something as stateful as KMS it would be nice with a few extra details in the docs.

However. If this situation indeed isn't a bug but a feature requiring manual steps – without which a redeploy after a destroy will fail – then this forces certain architectural decisions when using CDK. That's less ideal in my opinion; CDK should widen, not narrow down our options.

In my particular use case, I will have to resort to either using manually managed KMS keys (and set up procedures around them with devs and ops and write appropriate installation and maintenance sections in the relevant runbooks), a non-encrypted bucket, or a custom resource which handles the existence of the alias.

@PeterBengtson
Copy link

PeterBengtson commented Oct 6, 2019

And also, this is a new kind of statefulness which requires deployment-time manual steps. It is also a backwards-incompatible feature released in a minor version, which isn't ideal for a module marked stable. You changed the default, mind you.

Doesn't the same kind of question of aliases-for-clarity apply to all other places where CDK defaults to using KMS encryption? Why just with pipelines? To my mind consistency across the CDK is more important. On the usability level it's good to remember that many of us deploy and tear down pipelines several times a day – it's a very common occurrence.

This is a case of CDK trying to be too clever, thereby shooting its users in the foot. What you have done is prompt people to use unencrypted buckets because the abstractions you more or less randomly have added now make pipelines too cumbersome to use. We suddenly find us having painted ourselves into a corner without knowing it, as we now have aliases which need special care. That's a breaking change we didn't ask for, and in a minor release.

I think you should bring up the issue of reverting the alias feature with the team. It needs to be rethought, and there ought to be a discussion about the principles of CDK's use of extra abstraction in general.

@skinny85
Copy link
Contributor

skinny85 commented Oct 7, 2019

However. If this situation indeed isn't a bug but a feature requiring manual steps – without which a redeploy after a destroy will fail – then this forces certain architectural decisions when using CDK. That's less ideal in my opinion; CDK should widen, not narrow down our options.

In my particular use case, I will have to resort to either using manually managed KMS keys (and set up procedures around them with devs and ops and write appropriate installation and maintenance sections in the relevant runbooks), a non-encrypted bucket, or a custom resource which handles the existence of the alias.

Can you explain why CDK narrows your options here? Why not use something like this in your code:

const bucket = new s3.Bucket(this, 'ArtifactBucket', {
  encryption: s3.BucketEncryption.KMS, // encrypted bucket with a key, _without_ an alias! 
  // ...
});

const pipeline = new codepipeline.Pipeline(this, 'Pipeline', {
  artifactBucket: bucket,
  // ...
});

or even this:

// no alias, key destroyed on `cdk destroy`
const key = new kms.Key(this, 'Key', {
  removalPolicy: RemovalPolicy.DESTROY,
  // ...
});
// bucket destroyed on `cdk destroy`, but only if it's empty!
const bucket = new s3.Bucket(this, 'ArtifactBucket', {
  encryptionKey: key,
  removalPolicy: RemovalPolicy.DESTROY,
  // ...
});
const pipeline = new codepipeline.Pipeline(this, 'Pipeline', {
  artifactBucket: bucket,
  // ...
});

@PeterBengtson
Copy link

PeterBengtson commented Oct 7, 2019

You narrow down the available options by changing the default behaviour in a minor version. This is something we didn't discover until now, after a few versions have been released, and we're thus deep in the management of aliases without having asked for it. True, we can change the infrastructure by explicitly specifying the artefact bucket, but that wasn't part of the initial contract.

Your examples are great, and they should be part of the CDK docs for aws-codepipeline. As I've said above, it's not always obvious how the extra high-level abstractions added by CDK work, even to people with many years of AWS experience. For every undocumented high-level abstraction or feature, the learning curve becomes steeper. This is a real problem, and one which the CDK team should recognise.

It would have been far better had you introduced the alias feature using a new prop to Pipeline, such as useKmsAlias: true, with a default of false. Then we would have had a choice, and no code would have been broken.

Also, since the CDK team likes to provide advanced abstractions as part of high-level constructs, you could have added a custom resource to handle the deletion issue automatically and transparently when useKmsAlias: true, just like you have added extra infrastructure to manage retention times for CloudWatch logs. Or indeed, you could have chosen either of your solutions above as the new default. Instead you went with something that breaks redeployment by default.

I would also like to point out that you haven't reacted to my second posting above, the one about CDK consistency. Suddenly, the CDK handles KMS keys in pipelines completely differently than everywhere else. Not ideal. I fail to understand why you listened to reactions from CodePipeline users specifically, to the extent that you changed the default behaviour. And in a minor version. Of a stable module.

As you say above, "I understand this is not ideal, but I think this is the best we can with the tools at our disposal." While I agree on the first part, it definitely isn't the best you can do with the tools at hand.

Please revert this feature. Rethink it. Then release it in a major release.

@skinny85
Copy link
Contributor

skinny85 commented Oct 7, 2019

I didn't respond to every point, because I don't want to prolong the discussion too much. While I don't agree with the conclusions you've drawn (like that CDK narrows your options), it's clearly causing you pain, and you're the second person to call this out, so this warrants attention.

I believe I have an idea on a middle-ground solution that will keep the advantages of having an alias, while getting rid of the issues mentioned here. I will try to submit a PR today. Stay tuned.

skinny85 added a commit to skinny85/aws-cdk that referenced this issue Oct 7, 2019
Currently, the KMS key and alias used for the default CodePipeline artifact bucket are created with RemovalPolicy.RETAIN.
That is problematic when trying to re-deploy a stack after running `cdk destroy`,
as the alias name will already be taken.
Because of that, change the removal policy of both the key and the alias to RemovalPolicy.DESTROY -
there is a grace period of a few days on the key before it's removed permanently,
so that should be good enough if anyone needs it,
and it doesn't seem like directly reading the artifacts of the pipeline is an important use case anyway,
especially after it has been deleted.

Fixes aws#4336
@skinny85
Copy link
Contributor

skinny85 commented Oct 7, 2019

@PeterBengtson let me know what you think about the solution proposed in #4400.

@PeterBengtson
Copy link

PeterBengtson commented Oct 8, 2019

Sorry, Adam. It's the same thing, but with a time limit. CDK now by default assumes pipelines will not be redeployed more often than 7 days. It's now a quirk in the framework which users will have to learn from the documentation and/or discover after the fact. (My team sure did, two weeks before launch.)

Also, the issue of consistency is still there: CDK handles KMS keys differently in CodePipeline than everywhere else. This doesn't make it easier to learn CDK.

What's wrong with the way I suggested: an extra parameter to Pipeline turning this breaking behaviour on if the user so chooses? That solution would work for every user and every use case.

@skinny85
Copy link
Contributor

skinny85 commented Oct 9, 2019

It's the same thing, but with a time limit. CDK now by default assumes pipelines will not be redeployed more often than 7 days.

It does not. The key will be removed, as well as the alias; re-deploying will create a new key, with a different ID (as the ID is generated by KMS), as well as an alias with the same name, pointing at the new key.

Does it make sense now?

@PeterBengtson
Copy link

PeterBengtson commented Oct 10, 2019

The issue is not whether what you've done makes sense. The issue is consistency across the CDK, backward compatibility, introducing breaking changes in minor versions, and increasing the learning curve of CDK. It's the CDK team's cavalier attitude towards semantic versioning. It's not understanding that adding abstractions in an opaque way to projects which already are underway whilst at the same time disregarding the conventions of semantic versioning makes professional projects considerably more difficult to manage. It's not understanding that how the CDK team currently does things will lead to rejection of CDK as a tool in any kind of controlled environment.

I'll explain our use case in detail privately to you and Elad – security concerns prevent me from doing so publicly – for it is important that you understand who we are and where we're coming from.

@skinny85
Copy link
Contributor

Sorry for the bad experience. I assure you we take semantic versioning very seriously. It was not my intention for this change to be breaking. In my defense, I can say I was trying to take customer feedback into account, and to make the CodePipeline construct more usable. It resulted in breaking your workflow, which I did not anticipate (although, to his credit, @rix0rrr did), and for which I apologize. I hope the change made in #4400 , which will be released in 1.13.0, will allow you to get back to the workflow you had before this change happened.

@mergify mergify bot closed this as completed in #4400 Oct 10, 2019
mergify bot pushed a commit that referenced this issue Oct 10, 2019
)

Currently, the KMS key and alias used for the default CodePipeline artifact bucket are created with RemovalPolicy.RETAIN.
That is problematic when trying to re-deploy a stack after running `cdk destroy`,
as the alias name will already be taken.
Because of that, change the removal policy of both the key and the alias to RemovalPolicy.DESTROY -
there is a grace period of a few days on the key before it's removed permanently,
so that should be good enough if anyone needs it,
and it doesn't seem like directly reading the artifacts of the pipeline is an important use case anyway,
especially after it has been deleted.

Fixes #4336
@PeterBengtson
Copy link

PeterBengtson commented Oct 11, 2019

Adam, I just realised that you might be thinking that I'm criticising you specifically. I should perhaps have been clearer in pointing out that when I've used the pronoun "you" I mean the CDK team. I'm trying to talk in general terms as much as I can; the specifics around the KMS alias thing are really not the main point. So please forgive me if I've led you to believe this is some kind of personal attack. It's not.

I've been a professional programmer for 40-odd years, and I've been doing infrastructure for the past 10, so I know where you're coming from. I'm a passionate advocate for the CDK – it's my preferred way of working. I consider the CDK to be a revolution in its own right. But as I and my colleagues see it, there are also a few tendencies which really need to be adressed. We're concerned about a few things which seem to afflict the entire CDK team.

I'll elaborate in that letter I mentioned, but for now, let me just say that the CDK team is not using semantic versioning. Almost every release note has contained a section called BREAKING CHANGES. These do not properly belong in anything but major releases (1.0.0, 2.0.0, 3.0.0, etc). My team and I are working in a extremely strict FinTech environment with security demands far exceeding PCI. We're absolutely dependent on semantic versioning in order to do our job. That's not all we're concerned about, but again, I'll describe the situation in a letter to you and Elad shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-codepipeline Related to AWS CodePipeline guidance Question that needs advice or information.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants