Lambda was unable to decrypt the environment variables because KMS access was denied #279

Open
mohitkale opened this issue Jul 13, 2018 · 57 comments

Comments

@mohitkale

mohitkale commented Jul 13, 2018

Dear Author,

For some strange reason, only the GET SINGLE TODO ITEM request is not working, while all the other APIs work fine (i.e., LIST, CREATE, UPDATE, and DELETE).

I am getting this error in the API Gateway console.

Reference Example: https://github.com/serverless/examples/tree/master/aws-node-rest-api-with-dynamodb

Endpoint response body before transformations: {"Message":"Lambda was unable to decrypt the environment variables because KMS access was denied. Please check the function's KMS key settings. KMS Exception: AccessDeniedExceptionKMS Message: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access.","Type":null}

I am using the same item ID in both the GET and DELETE methods. The DELETE method works, but the GET method throws an Internal Server Error (with the response body shown above).

Please suggest.

@tremby

tremby commented Aug 29, 2018

I came across the same error today in my own project. Like you, it seems only one of my functions is affected, and I'm not sure why.

@andarilhoz

I'm having the same issue, did someone figure out a workaround?

@liampauling

I had this issue and, after some head banging, found out it was due to deleting an IAM policy and creating a new one with the same name. Simply changing the IAM role of the Lambda to something else, saving, and then changing it back fixed it.
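
The role swap described above can also be scripted instead of clicked through in the console. The following is a minimal Python sketch, not a confirmed fix: it assumes a boto3 Lambda client is passed in, and that a second role which Lambda is allowed to assume already exists. The helper name `refresh_role_grant` and the placeholder role ARN are hypothetical.

```python
def refresh_role_grant(lambda_client, function_name, placeholder_role_arn):
    """Swap a function's execution role to a placeholder and back again.

    lambda_client is expected to behave like boto3.client("lambda").
    placeholder_role_arn is an assumption: any other role whose trust
    policy lets Lambda assume it.
    """
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    original_role_arn = config["Role"]
    waiter = lambda_client.get_waiter("function_updated")

    # Step 1: point the function at the placeholder role and wait for the update.
    lambda_client.update_function_configuration(
        FunctionName=function_name, Role=placeholder_role_arn
    )
    waiter.wait(FunctionName=function_name)

    # Step 2: switch back to the original role, which re-assigns it to the function.
    lambda_client.update_function_configuration(
        FunctionName=function_name, Role=original_role_arn
    )
    waiter.wait(FunctionName=function_name)
    return original_role_arn


# Example call (hypothetical names, requires boto3 and AWS credentials):
#   import boto3
#   refresh_role_grant(
#       boto3.client("lambda"),
#       "my-service-dev-get",
#       "arn:aws:iam::123456789012:role/placeholder-lambda-role",
#   )
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.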

@tremby

tremby commented Sep 11, 2018

I ran the command to remove everything serverless had deployed, then deployed again, and for some reason it was then OK. 😕

@Lasim

Lasim commented Sep 12, 2018

I had the same issue.
It was necessary to remove all Lambda functions and then deploy them again.

@jaybarts

jaybarts commented Dec 5, 2018

Same issue happened to me today with my own project using sls version 1.32.0. It's an unfortunate workaround, since removing and deploying results in brand new endpoints, which would be a problem for me in production.

@dschep
Contributor

dschep commented Dec 5, 2018

I've never seen this. Can this be reproduced reliably? If so, could you provide me with your serverless.yml so I can debug this?

@jaybarts

jaybarts commented Dec 6, 2018

> I've never seen this. Can this be reproduced reliably? If so, could you provide me with your serverless.yml so I can debug this?

@dschep I was able to reproduce it quite a few times today, but it seemed to take a few tries (of deploys & removes) before I got the same exact error. I created a repo with the serverless.yml as well as instructions on how to reproduce. I think it's related to a serverless deployment failing midway, which in my case was due to a duplicate name for a CloudWatch Event Rule. I'm sure any name conflict error would also cause the issue, but I included this particular case since it did the trick for reproducing the issue.

Link to the Repo: https://github.com/jaybarts/sls-kms-issue

Thank You for offering to take a look at this issue. Please let me know if you need anything else.

@dschep
Contributor

dschep commented Dec 6, 2018

Thanks for the details @jaybarts!! I'll take a look at this tomorrow or early next week.

@dschep dschep self-assigned this Dec 6, 2018
@PvanHengel

I had the same issue today. It is related to deleting and re-deploying. I've had some instances where I want to do a clean test of the entire stack.

@huangenyang

I was developing and deploying fine on one computer. I was about to travel, so I set up a new laptop. Same code, just a new serverless setup on a different computer, and I got this error and couldn't get past it. The Lambda it complained about was configured using the default encryption. I went back to the other PC and tried to deploy the same code: no problem. So I have two computers, one that can deploy and another (possibly running a newer version of serverless and other tools) that cannot.

@sverraest

I ran into this issue just now.

  • Nested stack (Api/Log)
  • Initial stack deployment failed due to hitting rate limit on Lambda creation
  • Redeployed and succeeded
  • A single lambda of 34 lambdas in that package has this issue

@hard-coders

I had the same issue and figured out the problem.
AWS Doc said,

AWS Lambda authorizes your function to use the default KMS key through a user grant, which it adds when you assign the role to the function. If you delete the role and create a new role with the same name, you need to refresh the role's grant. Refresh the grant by re-assigning the role to the function

So I just redeployed the function and it worked well.

@ctippur

ctippur commented May 7, 2019

Experienced the same issue as well. Had to delete the lambda function manually and recreate using terraform to resolve it.

@GCCreemars

This happens to me quite frequently, more so as the number of functions in my serverless service grows. Removing and subsequently re-deploying has an almost 50% chance of having this error pop up when I try to test my deployment now.

@Hyperadministrator

My problem had two causes:

  1. I changed the user's key that is used when building new instances (the first key placed into the instance to enable SSH connections) without changing the corresponding KMS key policy in AWS.
  2. I also had a few orphaned account IDs in the key policy. I read somewhere that these might also cause failures.

When I added my AWS user account ARN to the list of allowed users under the policy's decrypt action and removed the orphaned user account IDs (orphaned because we deleted one AWS user, but that user's account ID persisted in the policy), the problems disappeared.

@Al-Jp

Al-Jp commented Oct 30, 2019

Go to the Lambda console > Encryption Configuration and reset the configuration. For example, change it to a customer master key and save, then return it to the default and save again. This solved my problem.

@adimoraret
Contributor

I deployed my Lambdas with the Serverless Framework and got this for only one function, not the others. All functions use the same role. Manually changing the role of the affected function in AWS to some other random role, and then back to the original role, fixed the problem. If it helps, the one that was not working was triggered by HTTP GET, and the one that worked was triggered by HTTP POST.

@cmardonespino

cmardonespino commented Dec 3, 2019

I got the same problem, and it started when I changed from one custom KMS key to another. After changing the custom KMS key on the Lambda, I tried to update the Lambda configuration with this AWS CLI command:

aws lambda update-function-configuration --function-name notifications-status-update-emitter --runtime nodejs10.x --handler handler.handler --timeout 60 --memory-size 256 --environment Variables={ENVIRONMENT=staging}

I got the following output

An error occurred (AccessDeniedException) when calling the UpdateFunctionConfiguration operation: Lambda was unable to configure access to your environment variables because KMS returned Access Denied. Please check your KMS permissions. KMS Exception: AccessDeniedException KMS Message: User: arn:aws:iam::xxxxxx:user/deploy is not authorized to perform: kms:CreateGrant on resource: arn:aws:kms:us-west-2:xxxxxx:key/xxxxxxxxxxxxxxxxxx

That was pretty weird, because I had already granted the deploy user permission to update the Lambda configuration.
After a couple of attempts at finding a solution, I fixed it as follows:

  1. Change the encryption configuration to the default key, (default) aws/lambda
  2. Update the Lambda configuration again
  3. Re-enable the encryption configuration with my custom KMS key
  4. Update the Lambda configuration once more, and it should work again

I think maybe this is an AWS bug?

@wafaaSultan

I have tested from the AWS side, and I am able to create the Lambda function without any issue:
f018982d4db3:testlambda wafaas$ sls deploy -s dev
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Uploading CloudFormation file to S3...
Serverless: Uploading artifacts...
Serverless: Uploading service service-name.zip file to S3 (1.84 KB)...
Serverless: Validating template...
Serverless: Updating Stack...
Serverless: Checking Stack update progress...
.................................
Serverless: Stack update finished...
Service Information
service: service-name
stage: dev
region: eu-west-1
stack: service-name-dev
resources: 9
api keys:
None
endpoints:
None
functions:
lambda1: service-name-dev-lambda1
lambda2: service-name-dev-lambda2
lambda3: service-name-dev-lambda3
layers:
None
Serverless: Run the "serverless" command to setup monitoring, troubleshooting and testing.
f018982d4db3:testlambda wafaas$

It looks like the issue is on the Serverless side.

Here is a sample of my template:

functions:
  lambda1: # Do Not Change This Lambda Name Without Update The manna-serverless-plugin !!!
    handler: handler.hello
    description: testing function
    integration: lambda
    resultTtlInSeconds: 0
    type: request
    tags:
      LambdaName: lambda1
    environment:
      test: testdata
      ENVIRONMENT: lambda

@tjcobb

tjcobb commented Jan 14, 2020

@dschep Is there any update on this? This happens to us fairly consistently when doing a remove followed shortly by a re-deploy. The issue usually resolves itself within 5-10 minutes. Is there anything we can add to our deployment to speed that up?
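
Nothing in this thread is known to speed the recovery up, but a pipeline can at least wait for it before reporting success. Below is a small Python sketch of such a gate, under the assumption that you already have some smoke test for the affected function. The names `wait_until_healthy` and `probe` are hypothetical, and the default timeout simply mirrors the 5-10 minute window mentioned above.

```python
import time


def wait_until_healthy(probe, timeout_s=600.0, interval_s=15.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll probe() until it returns True or the timeout elapses.

    probe is any zero-argument callable, e.g. a smoke test that invokes
    the function and checks the response for the KMS error.
    Returns True if the probe succeeded, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example (hypothetical smoke test):
#   ok = wait_until_healthy(lambda: my_smoke_test("https://example.invalid/todos/1"))
#   if not ok:
#       raise SystemExit("function still failing after timeout")
```

The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.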

@dschep dschep removed their assignment Jan 15, 2020
@wafaaSultan

I found a workaround for this issue: add a role directly to your "serverless.yml" template, with Lambda full access, as follows:
functions:
  lambda1: # Do Not Change This Lambda Name Without Update The manna-serverless-plugin !!!
    handler: handler.hello
    description: testing function
    integration: lambda
    role: arn:aws:iam::xxxxxxxxxxxx:role/Lambda
    resultTtlInSeconds: 0
    type: request
    tags:
      LambdaName: lambda1
    environment:
      test: testdata
      ENVIRONMENT: lambda

I have tested this on my side and it's working.

@ajoga

ajoga commented Feb 22, 2020

> I had this issue and after some head banging found out it was due to deleting an IAM policy and creating using the same name, simply changing the IAM of the lambda to something else, saving and then changing back fixed it.

I believe this is because the lambda references the identifier of the IAM role to use, not the ARN of the IAM role. Read more about identifiers here : https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_identifiers.html#identifiers-unique-ids

@sh4des

sh4des commented Apr 27, 2020

AWS, multi-billion dollar company, compute in the cloud, remove the need for servers.
-- have you tried turning it off and on again?

@campellcl

Still seeing this issue intermittently. Redeployment does appear to resolve it, but is far from an ideal solution. What is worse, Lambda still appears to return an HTTP status code of 200 or 202 when invoked (depending on whether the invocation is sync or async), which makes this error rather hard to detect programmatically.
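
One way to approach the detection problem is to inspect the response body rather than the status code. The Python sketch below assumes the raw body is available to your test (for example from a direct invocation or from API Gateway execution logs). `is_kms_decrypt_error` and `KMS_DENIED_MARKER` are hypothetical names, and the marker text is taken from the error message quoted at the top of this thread.

```python
import json

# Substring of the error text quoted earlier in this thread.
KMS_DENIED_MARKER = (
    "unable to decrypt the environment variables because KMS access was denied"
)


def is_kms_decrypt_error(body):
    """Return True if a response body carries the KMS access-denied message."""
    if isinstance(body, bytes):
        body = body.decode("utf-8", errors="replace")
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        payload = None
    if isinstance(payload, dict):
        # The logged body in this thread uses "Message"; the other keys are
        # guesses at common variants and may never occur.
        text = " ".join(
            str(payload.get(key, "")) for key in ("Message", "message", "errorMessage")
        )
    else:
        text = str(body)
    return KMS_DENIED_MARKER.lower() in text.lower()
```

A post-deployment smoke test could call this on each function's response and trigger one of the workarounds from this thread when it returns True.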

@prashanthtiramareddi

We are facing the same issue with one of our applications, and wonder why it is happening to only one Lambda. Has anyone fixed the issue recently?

@GCCreemars

I think this happens to me when I have the AWS Lambda GUI open in a browser tab on one of the Lambdas in the service when I redeploy. The error seems to occur less frequently when closing all open Lambda tabs before redeploying.

@marcelomanchester

@liampauling your suggestion is still working!!! thanks

@gurunathchoukekar9

Deleting the IAM role and recreating it causes this KMS issue when the Lambda runs.

Resolution: do not delete the IAM role when redeploying. You can delete all policies under the role and recreate them instead.

I did the same in my Azure DevOps AWS CLI script to resolve this issue.

@tomaszdudek7

Quick fix provided by @ramgrandhi above (go to Lambda UI -> edit Lambda config (with no tweaks whatsoever) -> save) solves the issue for me.

Any idea why and when it occurs? I am not able to reproduce it. Duh.

@david-mcqueen

david-mcqueen commented Jan 26, 2021

We had this issue when our Role was unchanged between deployments and we ran serverless remove && serverless deploy.
We solved it by removing the name from the Role within serverless.yml. With the name omitted, Serverless generates a unique name for each deployment.

ExecutionRole:
    Type: AWS::IAM::Role
    Properties:
        RoleName: my-execution-role-name        # Remove this line
        AssumeRolePolicyDocument:
        ...
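
To check a whole stack for roles that would be recreated under the same name, the packaged template can be scanned before deploying. This is only a sketch in Python, assuming the template has already been loaded into a dict (for instance from the JSON that `serverless package` writes under `.serverless/`). `find_named_roles` is a hypothetical helper, not part of the Serverless Framework.

```python
def find_named_roles(template):
    """Return logical IDs of AWS::IAM::Role resources that set RoleName.

    template is a CloudFormation template as a dict. Roles with an explicit
    RoleName come back under the same name after a remove and redeploy,
    which is the situation described in this thread.
    """
    named = []
    for logical_id, resource in (template.get("Resources") or {}).items():
        if not isinstance(resource, dict):
            continue
        if resource.get("Type") != "AWS::IAM::Role":
            continue
        properties = resource.get("Properties") or {}
        if "RoleName" in properties:
            named.append(logical_id)
    return sorted(named)


# Example (the file path is an assumption about your project layout):
#   import json
#   with open(".serverless/cloudformation-template-update-stack.json") as fh:
#       print(find_named_roles(json.load(fh)))
```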

@parencik

> I had the same issue.
> It was necessary to remove all lambda functions and they deploy them again.

Well... after redeployment my function worked well but another one failed with this error...

@mikaelvesavuori

Seeing the same every now and then. A real heart-breaker, and creates a huge mess when you're working on something that is not really ideal to "rip and replace".

@dspenard

dspenard commented Feb 16, 2021

This is frustrating, I've been working with a CloudFormation deployment all day with one Lambda function, building and tearing down repeatedly to do some testing, and now all of a sudden I get this error message. Redeploying the Lambda solved the issue, which is troubling, but at least I'm back in business. This is far from ideal for a robust CI/CD process, but I'm not dealing with a production-ready system at the moment, so for my situation this solution is fine for now.

@createdbykartik

Yup, re-deploying fixes the problem. It's that simple.

@tomaszdudek7

tomaszdudek7 commented Mar 1, 2021

It may sound simple, but having your CI/CD randomly fail now and then (well, even worse than fail: deploy something that does not work) is awful. And so is telling your teammates "Well, this rock-solid framework can sometimes render your deployment unusable. Just try deploying again when it does!".

I'd love for the sls team to track down and fix this bug.

@sambonator1

sambonator1 commented Mar 13, 2021

Having to write custom post-IaC-deployment tests that automatically redeploy portions of the stack to get around this bug really sucks.

@sverraest

Almost 3 years later...

@drexler

drexler commented Mar 17, 2021

Took me 3 days to track this issue down!! Perhaps, as a mitigation step, the CF template could be analyzed for the renaming changes that cause this issue and, if any are present, the APIs could be redeployed. This could be exposed via serverless.yml to control when it should be triggered. I'll hash up a draft PR for this when I have a few cycles.

@fkunecke

> I had the same issue and figured out the problem.
> AWS Doc said,
>
> AWS Lambda authorizes your function to use the default KMS key through a user grant, which it adds when you assign the role to the function. If you delete the role and create a new role with the same name, you need to refresh the role's grant. Refresh the grant by re-assigning the role to the function
>
> So, I just re-deploy function and it worked well.

This worked for me, except I am using AWS Amplify. Thanks!

@jweilhammer

Was also able to fix this by just changing the execution role of the lambda function in the Configure tab to anything else, and then back to the role it needs. Seems to re-apply the role to the lambda and it runs as expected.

Re-deploying the entire lambda itself also works, but I found this to be an easier and quicker solution :-)

@nathant727

We saw these errors recently too:
"Lambda was unable to decrypt the environment variables because KMS access was denied. Please check the function's KMS key settings. KMS Exception: AccessDeniedExceptionKMS Message: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access."
I resolved this error ^ by assigning our Lambda function to a different Execution role and then reassigning it to the correct Execution role.

@jweilhammer

After hitting this again, I believe this error is caused by the IAM role session time. I think that if the role is changed and the Lambda tries to execute again within its maximum session duration, this error will occur.

Waiting for the old role's session to expire would potentially fix it as well, and this would explain why switching the role fixes it (the Lambda retrieves a new session with the updated role).

@dithos211

We ran into this issue a couple of days back. Our lambdas have been deployed using Terraform and the lambdas are meant to be triggered using event bridge events.
But the lambdas were not recognizing the events since event bridge was not added as a trigger to the lambdas. I suspect it might be because the terraform scripts for events were executed before the lambdas were deployed.
Once the triggers were set (had to edit the rules and save them manually), got the below error when we tried to test the lambdas.

"Lambda was unable to decrypt the environment variables because KMS access was denied. Please check the function's KMS key settings. KMS Exception: AccessDeniedExceptionKMS Message: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access."

Setting the IAM role to a different one, saving, setting it back to the original and saving it again got the lambdas to work.

@steven-hunt-devopsgroup

Thanks @dithos211 , those steps worked for me perfectly. Ta very

@DenysVyskrebetsTR

DenysVyskrebetsTR commented Apr 7, 2022

Manually changing the Lambda role to something else in the web console and then back to the original role fixed it.

@JCapriotti

For anyone looking for a fix, there's a great write-up of the problem here: https://www.lastweekinaws.com/blog/the-sneaky-weakness-behind-aws-managed-kms-keys/

A brief summary is that this issue can occur if you delete and recreate the IAM role used by a Lambda function. The workarounds mentioned above seem to work: either update the lambda's role or recreate the lambda.

To avoid this altogether, one should avoid removing the IAM role used by a lambda (if possible) or use a customer managed key for encryption of the environment variables.
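
Switching a function to a customer managed key can be done through the Lambda API. This is a brief Python sketch under stated assumptions: a boto3 Lambda client is passed in, the key ARN is one you own, and the calling identity is allowed `kms:CreateGrant` on that key (the permission named in the error output quoted earlier in this thread). `use_customer_managed_key` is a hypothetical helper name.

```python
def use_customer_managed_key(lambda_client, function_name, kms_key_arn):
    """Encrypt a function's environment variables with a customer managed key.

    lambda_client is expected to behave like boto3.client("lambda").
    Returns the key ARN reported back by Lambda, if any.
    """
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        KMSKeyArn=kms_key_arn,
    )
    return response.get("KMSKeyArn")


# Example call (hypothetical names, requires boto3 and AWS credentials):
#   import boto3
#   use_customer_managed_key(
#       boto3.client("lambda"),
#       "my-service-dev-get",
#       "arn:aws:kms:us-west-2:123456789012:key/your-key-id",
#   )
```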

@domengabrovsek

@liampauling thank you, this was a lifesaver!

#279 (comment)

@miguellgramacho96

Still happening to this day. I had to manually change the IAM role to something else, save, and then change it back, as @liampauling shared.

@Fibio

Fibio commented Jan 30, 2024

I just had the same problem and as people mention here: it is related with redeployment using the same role name.

I solved it by: IAM -> Roles -> $YourRoleNameHere -> Revoke Sessions -> Revoke active sessions

I hope it helps.
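
For anyone wanting to script the console step above: revoking active sessions is documented to work by attaching an inline deny policy keyed on `aws:TokenIssueTime`. The Python sketch below reproduces that policy under the assumption that a boto3 IAM client is passed in. The helper names are hypothetical, and whether this clears the KMS error in every case is not confirmed in this thread.

```python
import json
from datetime import datetime, timezone


def build_revoke_sessions_policy(now=None):
    """Build a policy denying all actions for sessions issued before now (UTC)."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    issued_before = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": ["*"],
                "Resource": ["*"],
                "Condition": {
                    "DateLessThan": {"aws:TokenIssueTime": issued_before}
                },
            }
        ],
    }


def revoke_active_sessions(iam_client, role_name, now=None):
    """Attach the deny policy inline to the role, as the console action does.

    iam_client is expected to behave like boto3.client("iam").
    """
    policy = build_revoke_sessions_policy(now)
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName="AWSRevokeOlderSessions",
        PolicyDocument=json.dumps(policy),
    )
    return policy
```

Sessions issued after the timestamp are unaffected, so the role keeps working for new invocations.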

thank you a lot !!!

@lagouyn

lagouyn commented Jun 12, 2024

This helped me with my particular KMS/lambda issue, which occurred after my lambda role had gotten deleted, and I redeployed a replacement for that role:
https://repost.aws/knowledge-center/lambda-kmsaccessdeniedexception-errors

@felipegabry

> I've deployed my lambdas with serverless framework and I got this only for one function, but not for the others. All functions are using the same role. Manually changing role in AWS for the function with this issue, to some other random role, and back to the original role fixed the problem. If it helps the one that was not working was triggered by Http GET, the one that worked was triggered by Http POST

Still working, thanks
