
need to check the environment variables for AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY #49

Closed
InbarRose opened this issue Mar 30, 2022 · 25 comments


@InbarRose
Contributor

InbarRose commented Mar 30, 2022

We had an annoying issue where the checker showed everything was configured okay, and we could connect to some of our containers but not all of them. We had to get AWS support involved.

the error we got was

An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.

It turns out that SSM uses the AWS SDK, which by default picks up the AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY environment variables if they are present. We had those configured in our container, but with very limited access, and this caused SSM to fail, so we couldn't use execute command.

I suggest adding a check when we "describe task" that warns if AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY are defined in a container, since this could cause an issue. Ideally they should not be used at all, but if they are, the credentials should have SSM permissions.

ref: https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html#credentials-default
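A minimal sketch of the proposed check (hypothetical helper name; the input shape follows the `containerDefinitions` list returned by ECS DescribeTaskDefinition, which in practice you'd fetch with the AWS SDK):

```python
# Variables that shadow the task role in the AWS SDK credential chain.
RISKY_VARS = {"AWS_ACCESS_KEY", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"}

def find_risky_env_vars(container_definitions):
    """Return {container_name: [var, ...]} for containers whose
    `environment` list defines credential variables."""
    findings = {}
    for container in container_definitions:
        hits = [env["name"] for env in container.get("environment", [])
                if env["name"] in RISKY_VARS]
        if hits:
            findings[container["name"]] = hits
    return findings

# Example task definition fragment:
containers = [
    {"name": "web", "environment": [{"name": "AWS_ACCESS_KEY_ID", "value": "AKIA..."}]},
    {"name": "worker", "environment": [{"name": "LOG_LEVEL", "value": "info"}]},
]
print(find_risky_env_vars(containers))  # {'web': ['AWS_ACCESS_KEY_ID']}
```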

InbarRose added a commit to InbarRose/amazon-ecs-exec-checker that referenced this issue Mar 30, 2022
AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY can override the AWS SDK
see aws-containers#49
@InbarRose
Contributor Author

I've opened a PR #50

@kalleth

kalleth commented Mar 31, 2022

@InbarRose which error did you get? I've started having a problem recently when attempting to connect to some of my containers: ecs exec checker shows no problem, but my "web" containers (which have AWS_ACCESS_KEY set for S3 access) fail with an error like the following:

An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.

However, I don't get this error on other services. I'm wondering if this is the problem I'm running into - but it only started happening a week or so ago.

@kalleth

kalleth commented Mar 31, 2022

Having created a new task definition for a new service, identical except without the AWS_ACCESS_KEY_ID environment variable present, I'm now able to connect. So yes, it looks like this was the issue -- weird that it only started happening within the last week or so, though.

@InbarRose
Contributor Author

Yes, that was the error I got too.

@george-silva

george-silva commented Mar 31, 2022

Great utility, thanks for building it!

I'm having a similar issue and I'm trying to name the env variable something else. Full details: aws/copilot-cli#3408

I'll keep it posted here.

EDIT:

I can confirm: having AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in your environment variables will prevent you from connecting via ECS Exec.

@kroehre

kroehre commented Mar 31, 2022

We were also struggling with this issue and found we had AWS_SECRET_ACCESS_KEY set. Renaming it fixed the issue. @InbarRose @george-silva THANK YOU

@adamjkeller
Contributor

Hey y'all, thanks for bringing this up and for proposing a fix! I do have a follow-up question to better understand the problem statement.

What is the reason for using the AWS_ACCESS_KEY_ID/SECRET as opposed to letting the AWS SDK pull credentials dynamically from the IAM role attached to the task? Generally it's recommended not to hardcode credentials as environment variables, as these values can be read in plaintext in the task definition.

Thanks again!

@kroehre

kroehre commented Mar 31, 2022

@adamjkeller it's common to have configs in ENV because of https://12factor.net/config. In our case they aren't plain-text in the task definition, but injected during deploy. We're moving away from this though.

@InbarRose
Contributor Author

Well, I am happy that this is helpful to so many people.

The reason we had it is that even though we were using the task role for everything, one of the libraries we used expected these environment variables in order to connect to a resource outside the scope of the task role, so we have since changed the library to use differently named environment variables.

@adamjkeller
Contributor

@adamjkeller it's common to have configs in ENV because of https://12factor.net/config. In our case they aren't plain-text in the task definition, but injected during deploy. We're moving away from this though.

Thank you for the response!

Awesome, I'm very familiar with 12-factor! Using environment variables for dynamic configuration values is absolutely the right approach, but I want to better understand why some are using AWS_ACCESS_* env vars when the SDK will grab those values dynamically from the metadata endpoints available to the container at runtime (via the credential provider chain).

examples of how sdk grabs credentials:
https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html
https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html#credentials-chain

@adamjkeller
Contributor

Well, I am happy that this is helpful to so many people.

The reason we had it is that even though we were using the task role for everything, one of the libraries we used expected these environment variables in order to connect to a resource outside the scope of the task role, so we have since changed the library to use differently named environment variables.

Ah, I had a feeling this would be one of the reasons we see this. Glad the library gave the option to look at a different env var name. Thanks for sharing!

@george-silva

TLDR: AWS Copilot does not fully support all AWS services equally, and we needed a way to access those services. Therefore we injected KEY/SECRET variables into our containers.

Longer answer:

We are using Python to build an application, and we're using AWS Copilot to deploy it on ECS. The truth is that while Copilot is great, there are some pain points when integrating with non-Copilot things. Example: we have infrastructure built with Terraform that doesn't quite play well with Copilot's built-in way of doing things (or Copilot doesn't support it).

So our Terraform scripts create IAM profiles with the specific permissions for the different services. These credentials are injected into our containers.

mergify bot pushed a commit that referenced this issue Mar 31, 2022
AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY can override the AWS SDK
see #49
@InbarRose
Contributor Author

@adamjkeller thanks for approving and merging my PR. I hope it helps. I guess this can be closed now?

@InbarRose
Contributor Author

weird it only started happening within the last week or so though.

Yes, this is something we noticed too. AWS must have changed something internally. 🤷‍♂️

@kalleth

kalleth commented Mar 31, 2022

Awesome, i'm very familiar with 12-factor! Using environment variables for dynamic configuration values is absolutely the right approach, but I want to better understand why some are using AWS_ACCESS_* env vars when the SDK will grab those values dynamically from the metadata endpoints available to the container at runtime (via the credential provider chain).

In my case, it was for legacy reasons -- we previously hosted on DigitalOcean, on plain VMs running docker compose, so these credentials needed to be read from ENV rather than provided through integrated AWS auth.

When we migrated to AWS, I left the config in there for our app to use for S3 access.

@pauricthelodger

@InbarRose, just to check - your PR checks for AWS_ACCESS_KEY but @george-silva mentioned the issue was with AWS_ACCESS_KEY_ID, are both causing the problem? I have the latter =)

@InbarRose
Contributor Author

InbarRose commented Apr 1, 2022 via email

@george-silva

george-silva commented Apr 1, 2022

@InbarRose I can confirm that AWS_ACCESS_KEY_ID also gets you into trouble:

[screenshot: task definition diff renaming AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY]

The image shows a diff from our previous configuration, which set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, moving toward other names.

PS: don't worry, these are not the secrets themselves, just paths used to resolve them internally in Copilot.

@InbarRose
Contributor Author

I've opened another PR to add a check for AWS_ACCESS_KEY_ID: #52

@lukasz90b

Hi, you don't need to delete the AWS environment variables/credentials from your tasks.

  1. Go to IAM.
  2. Click on Users.
  3. In the search box, paste the value of your aws-access-key-id (from your task environment).
  4. Add SSM permissions to the user you found:
  5.
```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}
```
  6. Everything should work fine. I struggled with this for a long time, and luckily support helped me. Maybe it will help someone. GL and have a nice day.
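For reference, steps 4 and 5 above can be sketched with boto3; this is only a sketch, and the inline policy name and user name are made up:

```python
import json

# The SSM Messages permissions from step 5, as a Python dict.
SSM_MESSAGES_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "ssmmessages:CreateControlChannel",
            "ssmmessages:CreateDataChannel",
            "ssmmessages:OpenControlChannel",
            "ssmmessages:OpenDataChannel",
        ],
        "Resource": "*",
    }],
}

def attach_ssm_policy(iam_client, user_name):
    """Attach the policy above as an inline policy on the IAM user
    whose access key is baked into the task (step 4)."""
    iam_client.put_user_policy(
        UserName=user_name,
        PolicyName="ecs-exec-ssmmessages",  # hypothetical policy name
        PolicyDocument=json.dumps(SSM_MESSAGES_POLICY),
    )

# Usage against real AWS: attach_ssm_policy(boto3.client("iam"), "my-task-user")
```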

@adamjkeller
Contributor

Thanks for the feedback and contributions! Closing this issue as both PRs fix it up.

@lukasz90b

weird it only started happening within the last week or so though.

Yes, this is something we noticed too. AWS must have changed something internally. 🤷‍♂️

Yes indeed. AWS probably changed something security-related and didn't announce it.

@woodjme

woodjme commented Apr 28, 2022

Sorry to comment on a closed issue, but it seems the new checks for AWS_ACCESS_KEY etc. only check the environment array and not the secrets array, both of which can be used to set environment variables inside the container.

I've currently got an issue where I'm loading the value for AWS_ACCESS_KEY from Secrets Manager and can't connect to ECS Exec as the variable is set, but these checks pass.
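A sketch of a check that covers both paths (hypothetical helper; `environment` and `secrets` are the two container-definition fields in an ECS task definition that set variables inside the container):

```python
# Variables that shadow the task role in the AWS SDK credential chain.
RISKY_VARS = {"AWS_ACCESS_KEY", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"}

def find_risky_vars(container):
    """Return risky variable names set via either the plain `environment`
    list or the `secrets` list (Secrets Manager / SSM Parameter refs)."""
    names = [e["name"] for e in container.get("environment", [])]
    names += [s["name"] for s in container.get("secrets", [])]
    return sorted(set(names) & RISKY_VARS)

container = {
    "name": "web",
    "environment": [],
    "secrets": [{"name": "AWS_ACCESS_KEY", "valueFrom": "arn:aws:secretsmanager:..."}],
}
print(find_risky_vars(container))  # ['AWS_ACCESS_KEY']
```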

@lukasz90b

Sorry to comment on a closed issue, but it seems the new checks for AWS_ACCESS_KEY etc. only check the environment array and not the secrets array, both of which can be used to set environment variables inside the container. I've currently got an issue where I'm loading the value for AWS_ACCESS_KEY from Secrets Manager and can't connect to ECS Exec as the variable is set, but these checks pass.

Did you add permissions to the AWS_ACCESS_KEY_ID user like I wrote above?

@woodjme

woodjme commented Apr 28, 2022

Sure, I can add the permissions to fix this, or better still I will rename that variable, but I just want to raise that the check itself doesn't cover all scenarios for setting the variable.
