Fetching SSM params: InvalidSignatureException: Signature expired #1123

Closed
hamilton-s opened this issue Nov 6, 2023 · 13 comments · Fixed by #1150

@hamilton-s

hamilton-s commented Nov 6, 2023

Describe the bug
Since the end of October, we have seen our Lambda functions intermittently fail due to SSM parameters not being fetched. The error we are seeing looks like the following:

InvalidSignatureException: Signature expired: 20231103T171116Z is now earlier than 20231103T171224Z (20231103T171724Z - 5 min.)
 at throwDefaultError (/var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:8:22)
 at /var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:18:39
 at de_GetParametersCommandError (/var/runtime/node_modules/@aws-sdk/client-ssm/dist-cjs/protocols/Aws_json1_1.js:4194:20)
 at process.processTicksAndRejections ...
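The timestamps in the message can be compared directly. The following is a rough sketch with made-up helper names. It assumes the three stamps are, in order, the signing time, the oldest signing time still accepted, and the service's current time. Under that reading, the request above was signed a little over six minutes before the service evaluated it, which is outside the 5 minute window the message mentions.

```javascript
// Parse a compact UTC timestamp such as 20231103T171116Z into epoch milliseconds.
function parseAwsTimestamp(stamp) {
  const m = /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/.exec(stamp);
  if (!m) throw new Error(`Unrecognized timestamp: ${stamp}`);
  const [, year, month, day, hour, minute, second] = m.map(Number);
  return Date.UTC(year, month - 1, day, hour, minute, second);
}

// Age of the signature, in seconds, at the moment the service rejected it.
// Assumes the first stamp is the signing time and the third is the service time.
function signatureAgeSeconds(message) {
  const stamps = message.match(/\d{8}T\d{6}Z/g);
  if (!stamps || stamps.length < 3) {
    throw new Error('Expected three timestamps in the message');
  }
  const [signedAt, , serviceNow] = stamps.map(parseAwsTimestamp);
  return (serviceNow - signedAt) / 1000;
}

const message =
  'Signature expired: 20231103T171116Z is now earlier than ' +
  '20231103T171224Z (20231103T171724Z - 5 min.)';

console.log(signatureAgeSeconds(message)); // → 368
```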

We have managed to bring our errors down by disabling prefetch and reducing the cacheExpiry - however, we would prefer to keep these options as they were. We're also not sure why the issue has suddenly started happening as the issues didn't seem to coincide with a particular upgrade/code change.
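The mitigated configuration is not shown in this report. As a sketch only, it might look like the options object below, assuming the `disablePrefetch` and `cacheExpiry` options of `@middy/ssm`. The one minute expiry is a hypothetical value, not the one actually used.

```javascript
// Options that would be passed to ssm(...); the values are illustrative.
const ssmOptions = {
  fetchData: {
    paramA: 'some/path/a',
    paramB: 'some/path/b',
  },
  disablePrefetch: true, // fetch during the first invocation, not at load time
  cacheExpiry: 60 * 1000, // hypothetical shorter expiry, in milliseconds
  setToContext: true,
};
```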

To Reproduce

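import middy from '@middy/core'
import ssm from '@middy/ssm'

// `handler` is the application's Lambda handler, defined elsewhere.
export default middy(handler)
  .use(
    ssm({
      fetchData: {
        paramA: `some/path/a`,
        paramB: `some/path/b`,
      },
      cacheExpiry: 600000, // 10 minutes, in milliseconds
      setToContext: true,
    })
  )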

We notice it only fails in around 1-2% of cases in our production environments.

Environments

  • Node.js: 18
  • "@middy/ssm": "4.6.5"
  • "@middy/core": "4.6.5"
  • "@aws-sdk/client-lambda": "3.425.0"

Additional context
We've noticed this across a range of our services, with different versions of @middy/ssm

@hamilton-s hamilton-s added the bug label Nov 6, 2023
@willfarrell
Member

willfarrell commented Nov 8, 2023

Thanks for reporting. You mentioned "We're also not sure why the issue has suddenly started happening as the issues didn't seem to coincide with a particular upgrade/code change.", which makes me think there was a change on the AWS infra side, but I don't see anything in the docs that jumps out as changed. I wonder if there was an SDK change (or one of its deps)?

If you reach out to AWS support, could you share their response here?

I'll do some digging as well.

@hamilton-s
Author

Thanks for your reply!

We've discovered a significant problem in our Observability lambda layer that seems to slow down boot-up times due to instrumenting the aws-sdk. Initially, we thought this was a middy issue because it only occurred with middy, but we've now determined that it's related to aws-sdk being in the dependency tree. We're addressing this with our Observability partner and hope it resolves our problem. If others are also facing this issue, it could remain open, but for now, we're focusing on fixing our lambda layer to see if it resolves the problem. We can close this unless others are experiencing the same issue.

@willfarrell
Member

From what you described, that sounds like it could easily cause this issue. I'll close for now. If you need to reopen, please do so. Others are welcome to comment if they're also running into this.

@HumbleBeck

Hello,
I've hit the same error on Lambda@Edge recently.
Setup looks like

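import middy from '@middy/core'
import doNotWaitForEmptyEventLoop from '@middy/do-not-wait-for-empty-event-loop'
import ssm from '@middy/ssm'

// `handler` is the application's Lambda handler, defined elsewhere.
export default middy(handler)
  .use(doNotWaitForEmptyEventLoop())
  .use(ssm({
    fetchData: {
      paramA: "path"
    },
    setToContext: true,
    awsClientOptions: {
      region: process.env.REGION || 'us-east-1'
    }
  }))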

Environment

  • Node.js: 18
  • "@middy/ssm": "4.6.5"
  • "@middy/core": "4.6.5"

INFO	InvalidSignatureException: Signature expired: 20231119T162823Z is now earlier than 20231119T163124Z (20231119T163624Z - 5 min.)
    at throwDefaultError (/var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:8:22)
    at /var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:18:39
    at de_GetParametersByPathCommandError (/var/runtime/node_modules/@aws-sdk/client-ssm/dist-cjs/protocols/Aws_json1_1.js:4242:20)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /var/runtime/node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24
    at async /var/runtime/node_modules/@aws-sdk/middleware-signing/dist-cjs/awsAuthMiddleware.js:14:20
    at async /var/runtime/node_modules/@aws-sdk/middleware-retry/dist-cjs/retryMiddleware.js:27:46
    at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:7:26
    at async Promise.allSettled (index 0)
    at async to (/var/task/src/edgeGate/handler.js:63:43001) {
  '$fault': 'client',
  '$metadata': {
    httpStatusCode: 400,
    requestId: '55d1f065-3954-4531-9ffe-af7e2e2a8a29',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  __type: 'InvalidSignatureException'
}

It happens randomly to a small number of requests. I couldn't figure out a pattern yet.

@hamilton-s
Author

We managed to fix our observability lambda layer issue but we are still experiencing the middy issue. @willfarrell Can we please reopen this issue since others appear to have the same issue in the thread above too?

@willfarrell willfarrell reopened this Dec 12, 2023
@HumbleBeck

My theory on what is happening in our case:

  • here we cache a client in memory
    let client
  • an AWS signature is valid for 15 minutes, so if you have traffic and your Lambda keeps running without cold starts, you reuse your client
  • eventually, the signature expires and your client starts throwing InvalidSignatureException
  • what's interesting is that this line
    value[internalKey] = undefined
    sets undefined in the cache, so while the fetch keeps throwing an error, all future invocations receive undefined, which breaks everything
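The effect described in the last bullet can be illustrated with a minimal sketch. This is not middy's actual code: `createCache`, `makeFetcher`, `demo` and the cache shape are hypothetical. It only shows why remembering a failed fetch as `undefined` makes later invocations see `undefined` too, and how dropping the entry on failure lets the next invocation fetch again.

```javascript
// Hypothetical in-memory cache; not middy's implementation.
function createCache({ dropOnFailure }) {
  const entries = new Map();
  return async function get(key, fetcher) {
    if (entries.has(key)) return entries.get(key);
    try {
      const value = await fetcher();
      entries.set(key, value);
      return value;
    } catch (err) {
      // Remembering the failure as `undefined` poisons the cache: later calls
      // hit this entry and never fetch again.
      if (!dropOnFailure) entries.set(key, undefined);
      return undefined;
    }
  };
}

// Fetcher that fails on the first call and succeeds afterwards.
function makeFetcher() {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls === 1) throw new Error('InvalidSignatureException');
    return 'secret';
  };
}

// Returns what two consecutive invocations would see for the same key.
async function demo(dropOnFailure) {
  const get = createCache({ dropOnFailure });
  const fetcher = makeFetcher();
  return [await get('paramA', fetcher), await get('paramA', fetcher)];
}
```

With `demo(false)` the second invocation still sees `undefined`, while with `demo(true)` it sees the fetched value.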

@willfarrell
Member

willfarrell commented Dec 15, 2023

After some digging, I think I have a theory.

  1. Middy is set up with an expiry
  2. The first request is made, ssm is fetched, it gets cached, and a timer is set to refresh the cache
  3. While the lambda has been idle for a while, the timer expires, triggering the refresh. A fetch promise is created but stopped somehow by AWS (the theory part)
  4. The next request comes in >5min later, and the fetch promise with a now-expired signature fails.

How to fix: all middlewares that fetch from AWS services will need to catch InvalidSignatureException and force a retry during the request. I'll have to think about how best to implement this.
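A minimal sketch of that idea follows. It is not the change that was eventually merged: `isExpiredSignature`, `sendWithSignatureRetry` and `makeFlakySend` are hypothetical names, and the sketch assumes that every call to `send` signs a new request with the current time, as a fresh SDK call would.

```javascript
// Hypothetical check based on the error fields logged in this thread.
function isExpiredSignature(err) {
  return Boolean(err) &&
    (err.name === 'InvalidSignatureException' ||
      err.__type === 'InvalidSignatureException');
}

// Call `send` once more if the first attempt fails with an expired signature.
// Any other error is rethrown without a retry.
async function sendWithSignatureRetry(send) {
  try {
    return await send();
  } catch (err) {
    if (!isExpiredSignature(err)) throw err;
    return send();
  }
}

// Stand-in for an SDK call: fails once with an expired signature, then succeeds.
function makeFlakySend() {
  let attempts = 0;
  const send = async () => {
    attempts += 1;
    if (attempts === 1) {
      const err = new Error('Signature expired');
      err.__type = 'InvalidSignatureException';
      throw err;
    }
    return { Parameters: [{ Name: 'paramA', Value: 'secret' }] };
  };
  return { send, getAttempts: () => attempts };
}
```

Retrying only once keeps a genuinely skewed clock from looping forever, since a second failure is surfaced to the caller.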

Would love to hear if the above steps make sense to those running into this issue.

Ref:
https://repost.aws/knowledge-center/lambda-sdk-signature

@willfarrell
Member

I pushed a PR. If someone could test it in a real environment, that would be great.

@HumbleBeck

I'll try to copy-paste your changes, we are still on v4, and run it for a few days.

@willfarrell
Member

@HumbleBeck Any feedback on this?

@HumbleBeck

HumbleBeck commented Dec 30, 2023

Hi @willfarrell. While this bug rarely happens to us, I can confirm that the fix works, and it started recovering expired signature calls.

@willfarrell
Member

Awesome, I'll update the PR to cover all AWS service middleware (just in case) and merge in. Thanks a lot for testing it out.

@pranav-chefman

pranav-chefman commented May 8, 2024

Hi, if you are still on version 4, one workaround is to override the retry strategy and pass it to middy.

const middy = require('@middy/core');
const ssm = require('@middy/ssm');
const {ConfiguredRetryStrategy} = require('@smithy/util-retry');


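class ClockSkewRetryStrategy extends ConfiguredRetryStrategy {
  constructor(maxAttempts, computeNextBackoffDelay) {
    super(maxAttempts, computeNextBackoffDelay);
  }

  // Also retry errors classified as CLIENT_ERROR, on top of the defaults.
  // Note that this retries every client error, not only expired signatures.
  isRetryableError(errorType) {
    return errorType === 'CLIENT_ERROR' || super.isRetryableError(errorType);
  }
}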

...

middy()
      .use(
        ssm({
          ...
          awsClientOptions: {
            retryStrategy: new ClockSkewRetryStrategy(3, 500),
          },
          ...
        })
      )
      .before(async (request) => {
        ...
      });
