Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(cli): cdk diff cdk-assume-role-credential-plugin auth issue introduced in v1.75.0 #11792

Closed
scarytom opened this issue Nov 30, 2020 · 16 comments · Fixed by #11966
Closed

(cli): cdk diff cdk-assume-role-credential-plugin auth issue introduced in v1.75.0 #11792

scarytom opened this issue Nov 30, 2020 · 16 comments · Fixed by #11966
Assignees
Labels
@aws-cdk/core Related to core CDK functionality bug This issue is a bug. p1 package/tools Related to AWS CDK Tools or CLI

Comments

@scarytom
Copy link
Contributor

scarytom commented Nov 30, 2020

I recently upgraded from v1.74.0 to v1.75.0, and our cdk diff started failing with the an auth error. We use the cdk-assume-role-credential-plugin to allow cdk to run via a user in our central authentication account.

Strangely, cdk deploy works fine, so this only affects cdk diff.

Reproduction Steps

We bootstrap our account using the CloudFormation template (https://raw.githubusercontent.com/aws/aws-cdk/master/packages/aws-cdk/lib/api/bootstrap/bootstrap-template.yaml) with the following params:
"TrustedAccounts": "our auth account id",
"Qualifier": "infra",
"CloudFormationExecutionPolicies": "arn:aws:iam::aws:policy/AdministratorAccess"

What did you expect to happen?

cdk diff should output differences between the stack definitions and the current deployed stacks.

What actually happened?

We get the following error message:

Could not assume role in target account (did you bootstrap the environment with the right '--trust's?): User: arn:aws:sts::676932xxxxxx:assumed-role/cdk-infra-deploy-role-676932xxxxxx-us-east-1/676932xxxxxx-0-session is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::676932xxxxxx:role/cdk-infra-deploy-role-676932xxxxxx-us-east-1

Environment

  • CDK CLI Version : 1.75.0
  • Framework Version: ?
  • Node.js Version: v15.3.0
  • OS : Debian Buster
  • Language (Version): OpenJDK Runtime Environment (build 11.0.9+11-post-Debian-1deb10u1)

We run our CDK in the following docker container, if that helps at all:

https://hub.docker.com/repository/docker/strategicblue/cdk

FROM node:buster

ENV MVN_VERSION 3.6.3

RUN npm install -g aws-cdk cdk-assume-role-credential-plugin \
  && apt-get update \
  && apt-get install -y openjdk-11-jdk \
  && curl -fsSLO --compressed "https://downloads.apache.org/maven/maven-3/$MVN_VERSION/binaries/apache-maven-$MVN_VERSION-bin.tar.gz" \
  && tar -xzf "apache-maven-$MVN_VERSION-bin.tar.gz" -C /usr/local --strip-components=1 --no-same-owner \
  && rm "apache-maven-$MVN_VERSION-bin.tar.gz"

Other

The failure message is weird, because it suggests that cdk has already assumed the correct role in the target account, but is then trying to assume the role again, and that role doesn't have sts permissions to assume itself, so that fails.


This is 🐛 Bug Report

@scarytom scarytom added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Nov 30, 2020
@SomayaB SomayaB changed the title diff: cdk-assume-role-credential-plugin auth issue introduced in v1.75.0 (cli): cdk diff cdk-assume-role-credential-plugin auth issue introduced in v1.75.0 Nov 30, 2020
@github-actions github-actions bot added the package/tools Related to AWS CDK Tools or CLI label Nov 30, 2020
@polothy
Copy link
Contributor

polothy commented Dec 1, 2020

Ran into this same problem with 1.75.0 and the cdk-assume-role-credential-plugin. Fails for me with CDK diff and deploy though. Oddly works fine when I run it in my pipeline in CodeBuild.

What seems to trigger it for me is cross account deployment. If I stick to the same account as my AWS profile, it works, but once I include dependent stacks that goes cross account, it fails to assume the role (whereas it worked before).

@polothy
Copy link
Contributor

polothy commented Dec 1, 2020

Reading through the 1.75.0 release notes, the #11350 PR looks like it could have introduced the change that is causing the problem.

If I had to guess, it might be due to the change to masterCredentials and it may no longer fallback to your default AWS credentials if you have a plugin. Just a guess though...

@jmortlock
Copy link
Contributor

I have hit this problem today as well, Anyone found a workaround to this problem?

@rix0rrr
Copy link
Contributor

rix0rrr commented Dec 7, 2020

To all in this thread, I want to stress that cdk-assume-role-credential-plugin is not a recommended plugin to assume roles created by cdk bootstrap. It might come in useful as a stopgap if you want to auth into roles that exist outside the CDK ecosystem and tooling, but we do not recommend the combination of CDK's built-in auth delegation and credential providers.

If CDK's built-in auth mechanisms are not always working (such as there being a mismatch between deploy and diff, then that's a bug we should be fixing).

I will describe the flow the auth system goes through, and then you can decide how to set up your auth.

[cli] Hey, we're going to deploy to 1111111111/us-east-1
[app] Okay cool. In order to deploy you'll have to assume role `cdk-hnb659fds-deploy-role-11111111111-us-east-1`
[cli] Cool. Let me look for credentials in order to assume that role.

(...this is where any credential providers you have configured come in... but remember those were invented before we had modern bootstrapping and so don't know about delegation...)
[cli] Any credential providers have credentials for account 1111111111?

(if you have the cdk-assume-role-credential-plugin configured)
    [cdk-assume-role-credential-plugin] Yes, me. Hold on! 
    (proceeds to assume-role into `cdk-hnb659fds-deploy-role-11111111111-us-east-1`)
    [cdk-assume-role-credential-plugin] Here you go. Fresh credentials for account 1111111111!
    
    [cli] Thanks for the credentials. Remember, I'm now going to use those to assume into `cdk-hnb659fds-deploy-role-11111111111-us-east-1`!
    (assume-role fails)

(if you didn't have any credential plugins configured)
    [cli] No credential providers, I'm going to use AWS CLI credentials.
    [cli] Okay let me use those to assume into `cdk-hnb659fds-deploy-role-11111111111-us-east-1`
    (it works)

You see the credential provider mechanism and the modern bootstrapping/stack synthesis are trying to solve the same problem, and thereby stepping on each other's toes.

@rix0rrr
Copy link
Contributor

rix0rrr commented Dec 7, 2020

@scarytom I'm suprised the behavior between deploy and diff would be different though.

Would you mind running both with -v and pasting the results?

@rix0rrr rix0rrr added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Dec 7, 2020
@SomayaB SomayaB removed the needs-triage This issue or PR still needs to be triaged. label Dec 7, 2020
@corymhall
Copy link
Contributor

@rix0rrr

To all in this thread, I want to stress that cdk-assume-role-credential-plugin is not a recommended plugin to assume roles created by cdk bootstrap. It might come in useful as a stopgap if you want to auth into roles that exist outside the CDK ecosystem and tooling, but we do not recommend the combination of CDK's built-in auth delegation and credential providers.

Prior to #11350 the cdk-assume-role-credential-plugin worked I think the way that you intended, i.e. it didn't assume roles created by CDK Bootstrap. The modern stack synthesis authentication would always use the default user credentials when it assumed one of the bootstrapped roles, regardless of whether you were using a credential plugin. With #11350 the authentication modes are now mixed, which it sounds like is not something that was intended.

prior to #11350

[cli] Hey, we're going to deploy to 1111111111/us-east-1
[cli] Do you have access to 1111111111? Checking default credentials...
[cli]Nope, let me check any credential plugins .

[cli] Any credential providers have credentials for account 1111111111?

(if you have the cdk-assume-role-credential-plugin configured)
    [cdk-assume-role-credential-plugin] Yes, me. Hold on! 
    (proceeds to assume-role into `cdk-readOnlyRole` or other specified role)
    [cdk-assume-role-credential-plugin] Here you go. Fresh credentials for account 1111111111!
    
    [cli] Thanks for the credentials, but I'm not going to use those to assume into `cdk-hnb659fds-deploy-role-11111111111-us-east-1` I'm still going to use the default credentials!
    (assume-role succeeds)

@scarytom
Copy link
Contributor Author

scarytom commented Dec 7, 2020

@rix0rrr thanks for your detailed response.

I'm trying to get my head around what you are saying. Take, for example, the second path you describe:

(if you didn't have any credential plugins configured)
[cli] No credential providers, I'm going to use AWS CLI credentials.
[cli] Okay let me use those to assume into cdk-hnb659fds-deploy-role-11111111111-us-east-1
(it works)

I tried setting my AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to credentials in my auth account for a user that has permission to assume the deploy role in the target account. When I run cdk diff, I just get the following output:
Need to perform AWS calls for account 111111111111, but the current credentials are for 000000000000.

Doesn't this show that the cli doesn't just use the available creds to assume into cdk-hnb659fds-deploy-role-11111111111-us-east-1?

@jmortlock
Copy link
Contributor

I think the logic for the case where no credential providers are provided is also to strict as the current credential check explicitly matches the account you are deploying / diffing into

(if you didn't have any credential plugins configured)
    [cli] No credential providers, I'm going to use AWS CLI credentials.
    if) Check if the destination account matches the current credentials account (obtainCredentials method)
        [cli] Okay let me use those to assume into `cdk-hnb659fds-deploy-role-11111111111-us-east-1`
      (it works)
   else)
      [cli] error Need to perform AWS calls for account xxxx, but the current credentials are for yyyy

Also look forward to moving forward without the cdk-assume-role-credential-plugin plugin :)

@scarytom
Copy link
Contributor Author

scarytom commented Dec 7, 2020

@scarytom I'm suprised the behavior between deploy and diff would be different though.

Would you mind running both with -v and pasting the results?

@rix0rrr as requested, here are the outputs for a failing cdk diff and a successful cdk deploy

cdk-1.75.0-deploy.txt
cdk-1.75.0-diff.txt

@scarytom
Copy link
Contributor Author

scarytom commented Dec 7, 2020

@rix0rrr If I read your description of the cdk auth system flow correctly, then the correct way to do this is to have another role in the target account that the cdk-assume-role-credential-plugin is configured to assume before cdk takes over and assumes the appropriate bootstrap roles? If so, what bootstrap roles does this role need to be able to assume, just the cdk-xxx-deploy role, or does it need to be able to assume the cfn-exec or publishing roles too?

I believe that I must still use the cdk-assume-role-credential-plugin if my user is in a different account to the target account. Please correct me if there is a way to avoid using the plugin in this scenario.

Thanks again.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Dec 8, 2020
@rix0rrr rix0rrr added @aws-cdk/core Related to core CDK functionality p1 labels Dec 8, 2020
@rix0rrr rix0rrr added this to the [GA] CDK Pipelines milestone Dec 8, 2020
@redbaron
Copy link
Contributor

redbaron commented Dec 8, 2020

@scarytom ,

If so, what bootstrap roles does this role need to be able to assume

CDK bootstrap doesn't create such role yet. New role has to be created for plugin to work, this role permissions should allow it to:

  • assume existing bootstrap roles for deploy, image upload and file upload
  • allow resource lookups in the target account

I was able to successfully use plugin to do cross account cdk deploy with a new role with following policies:

  • attached job-function/ViewOnlyAccess policy
  • inline policy to assume CDK bootstrapped roles:
             "Action": "sts:AssumeRole",
             "Resource": [
                 "arn:aws:iam::99999999:role/cdk-hnb659fds-deploy-role-9999999-eu-west-2",
                 "arn:aws:iam::99999999:role/cdk-hnb659fds-file-publishing-role-9999999-eu-west-2",
                 "arn:aws:iam::99999999:role/cdk-hnb659fds-image-publishing-role-999999-eu-west-2"
             ],
             "Effect": "Allow"
    

@polothy
Copy link
Contributor

polothy commented Dec 8, 2020

Just wanted to chime in and +1 this comment: #11792 (comment)

This is the only reason why I use the cdk-assume-role-credential-plugin. Without the plugin, I get this error:

Need to perform AWS calls for account AAA, but the current credentials are for BBB.

I configure the plugin to just use the bootstrap role in cdk.json:

"assume-role-credentials:readIamRoleName": "cdk-hnb659fds-deploy-role-{ACCOUNT_ID}-us-east-1"

And this basically allowed me to get past the Need to perform AWS calls... error and then CDK would go ahead and assume the new bootstrap roles for me. This is incredibly useful for deploying a stack that depend on other stacks that are not in the same account. A single cdk deploy root-stack-name deploys all the stacks to all the different AWS accounts. Otherwise, I'd have to deploy them separately, passing in different credentials, which aren't even used due to the switching to the new bootstrap roles.

I did also try the new --lookups=false flag, in hopes it would skip this AWS account check, but get the same error.

Any recommendations on how to get CDK to just attempt to assume the new bootstrap roles vs doing the account ID checking? Right now, it only seems possible with this plugin.

@rix0rrr
Copy link
Contributor

rix0rrr commented Dec 9, 2020

With #11350 the authentication modes are now mixed, which it sounds like is not something that was intended.

Well, it was somewhat. The point is that:

  • you could have different credentials in 11111111111, distinct from cdk-hnb659fds-deploy-role-11111111111-us-east-1 (let's say, my-account-user) which a plugin can get for you, and
  • you would use THOSE credentials to assume cdk-hnb659fds-deploy-role-11111111111-us-east-1

You would then completely forego the --trust mechanism that CDK Bootstrap provides... but then apparently you don't need it.

@rix0rrr
Copy link
Contributor

rix0rrr commented Dec 9, 2020

I believe that I must still use the cdk-assume-role-credential-plugin if my user is in a different account to the target account. Please correct me if there is a way to avoid using the plugin in this scenario.

The design goal is that you don't have to. It might be that we broke that though, which your previous comment also seems to indicate. I'll try to confirm.

@rix0rrr
Copy link
Contributor

rix0rrr commented Dec 9, 2020

Ah, yep. Can reproduce.

rix0rrr added a commit that referenced this issue Dec 9, 2020
This addresses a regression introduced in #11350 and reported in #11792.

The original PR added support for using a credential provider as a source
for credentials that will be used to assume the CDK Deploy Roles
(created by Modern Bootstrapping and used by Default Synthesis).

However, by doing so, it broke support for using current AWS credentials
to do the same.

Closes #11792.
@mergify mergify bot closed this as completed in #11966 Dec 11, 2020
mergify bot pushed a commit that referenced this issue Dec 11, 2020
This addresses a regression introduced in #11350 and reported in #11792.

The original PR added support for using a credential provider as a source
for credentials that will be used to assume the CDK Deploy Roles
(created by Modern Bootstrapping and used by Default Synthesis).

However, by doing so, it broke support for using current AWS credentials
to do the same.

## Original Behavior

- In case of legacy stack, use "current credentials" if they are for the right account, otherwise ask credential provider plugins, fail if current credentials are for the wrong account
- In case of DefaultSynthesized stack, always use "current credentials" to assume role

## Broken Behavior

- In case of legacy stack, use "current credentials" if they are for the right account, otherwise ask credential provider plugins for credentials to the target account, fail if current credentials are for the wrong account
- In case of DefaultSynthesized stack, 
  - Use "current credentials" if they are for the right account, otherwise ask credential provider plugins for credentials to the target account, fail if current credentials are for the wrong account (reusing the logic from legacy stacks)
  - Then assume role using credentials obtained in the previous step

The reuse of the same "credential obtaining" logic here broke cross-account role assumption, because we'd fail as soon as the current credentials would be for the wrong account instead of still trying to use them for AssumeRole.

## New Behavior

- Try to get "base credentials" for a target environment: 
  - Use "current credentials" if they are for the right account
  - Ask credential provider plugins
  - Use "current credentials" if they are for the wrong account (but remember that they were wrong)
- In case of a legacy stack:
  - If the credentials are for the wrong account, fail
- In case of a DefaultSynthesized stack:
  - Use the set of "base credentials" to try to assume the target role
  - If that succeeds, we're done
  - (Fallback) If it fails and the base credentials were already for the right account, use them after all. This fallback will allow people already using credential plugins to keep on using them, even when interoperating with DefaultSynthesized stacks. It will also allow people who seed their terminal with "ReadOnly" credentials (which might not have `sts:AssumeRole` permissions) to still run `cdk diff` as in the past.

## Changes to tests

There has been a major refactoring of the tests around this.

The current way of testing (mocking some calls left and right) was completely insufficient to test these mechanisms properly. I've therefore resorted to implementing a fake, in-memory version of STS's `GetCallerIdentity` and `AssumeRole`, and using the testing library `nock` to redirect network calls to that in-memory service. This will allow us to test the entire chain from our code down to and including the SDK's behavior, and make sure the right behavior happens without worrying about exact call orders.

At the same time, the single gigantic test fixture (with the `~/.aws/credentials` and `~/.aws/config` files) was becoming rather cumbersome. Instead, each test now includes just the one or two profile sections it uses, and a helper function creates both the config files as well as immediately creates those users in-memory in the fake STS service.

Closes #11792.


----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
@github-actions
Copy link

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

flochaz pushed a commit to flochaz/aws-cdk that referenced this issue Jan 5, 2021
This addresses a regression introduced in aws#11350 and reported in aws#11792.

The original PR added support for using a credential provider as a source
for credentials that will be used to assume the CDK Deploy Roles
(created by Modern Bootstrapping and used by Default Synthesis).

However, by doing so, it broke support for using current AWS credentials
to do the same.

## Original Behavior

- In case of legacy stack, use "current credentials" if they are for the right account, otherwise ask credential provider plugins, fail if current credentials are for the wrong account
- In case of DefaultSynthesized stack, always use "current credentials" to assume role

## Broken Behavior

- In case of legacy stack, use "current credentials" if they are for the right account, otherwise ask credential provider plugins for credentials to the target account, fail if current credentials are for the wrong account
- In case of DefaultSynthesized stack, 
  - Use "current credentials" if they are for the right account, otherwise ask credential provider plugins for credentials to the target account, fail if current credentials are for the wrong account (reusing the logic from legacy stacks)
  - Then assume role using credentials obtained in the previous step

The reuse of the same "credential obtaining" logic here broke cross-account role assumption, because we'd fail as soon as the current credentials would be for the wrong account instead of still trying to use them for AssumeRole.

## New Behavior

- Try to get "base credentials" for a target environment: 
  - Use "current credentials" if they are for the right account
  - Ask credential provider plugins
  - Use "current credentials" if they are for the wrong account (but remember that they were wrong)
- In case of a legacy stack:
  - If the credentials are for the wrong account, fail
- In case of a DefaultSynthesized stack:
  - Use the set of "base credentials" to try to assume the target role
  - If that succeeds, we're done
  - (Fallback) If it fails and the base credentials were already for the right account, use them after all. This fallback will allow people already using credential plugins to keep on using them, even when interoperating with DefaultSynthesized stacks. It will also allow people who seed their terminal with "ReadOnly" credentials (which might not have `sts:AssumeRole` permissions) to still run `cdk diff` as in the past.

## Changes to tests

There has been a major refactoring of the tests around this.

The current way of testing (mocking some calls left and right) was completely insufficient to test these mechanisms properly. I've therefore resorted to implementing a fake, in-memory version of STS's `GetCallerIdentity` and `AssumeRole`, and using the testing library `nock` to redirect network calls to that in-memory service. This will allow us to test the entire chain from our code down to and including the SDK's behavior, and make sure the right behavior happens without worrying about exact call orders.

At the same time, the single gigantic test fixture (with the `~/.aws/credentials` and `~/.aws/config` files) was becoming rather cumbersome. Instead, each test now includes just the one or two profile sections it uses, and a helper function creates both the config files as well as immediately creates those users in-memory in the fake STS service.

Closes aws#11792.


----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/core Related to core CDK functionality bug This issue is a bug. p1 package/tools Related to AWS CDK Tools or CLI
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants