
Bug: Log state not cleared before logging event in middleware #1328

Closed
angelcabo opened this issue Feb 23, 2023 · 12 comments · Fixed by #1329
Assignees: dreamorosi
Labels: bug · completed · logger

Comments

@angelcabo

Expected Behaviour

I am currently using the injectLambdaContext middleware with the middy library to both clear persistent log state and log the invocation event. My Lambda handler puts a request ID on the logger for debugging purposes.

According to the docs, because I've set clearState to true, persistent attributes added inside the handler will NOT be cached. So, my expectation would be that the request_id attribute would never be shared between invocations (assuming each invocation contained a unique request_id).

Current Behaviour

I am seeing the initial event logged by the middleware include persistent attributes added during a previous invocation, despite clearState being set to true.

Code snippet

import middy from '@middy/core';
import { Logger, injectLambdaContext } from '@aws-lambda-powertools/logger';

const logger = new Logger({
    logLevel: 'DEBUG',
    persistentLogAttributes: {
        foo: 'bar'
    }
});

const lambdaHandler = async (event: { headers: { 'x-request-id': string } }, _context: unknown): Promise<void> => {
    logger.appendKeys({ request_id: event.headers['x-request-id'] });
    ...
};

export const handler = middy(lambdaHandler)
    .use(injectLambdaContext(logger, { logEvent: true, clearState: true }));

Possible Solution

It seems that the logEvent step occurs in the "before" middleware, here: https://github.com/awslabs/aws-lambda-powertools-typescript/blob/main/packages/logger/src/Logger.ts#L338, whereas the clearState step happens in the "after" middleware, here: https://github.com/awslabs/aws-lambda-powertools-typescript/blob/main/packages/logger/src/Logger.ts#L346, with initialPersistentAttributes being used to override any existing persistent attributes set in a previous invocation.
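To make the ordering concrete, here is a minimal sketch of the "before"/"after" shape described above. This is an illustration only, not the actual Powertools source; injectLambdaContextSketch and its internals are simplified names:

import { Logger } from '@aws-lambda-powertools/logger';
import type { MiddlewareObj } from '@middy/core';

// Sketch only: shows why logEvent in "before" can emit attributes from a
// previous warm invocation when clearState is only applied in "after".
const injectLambdaContextSketch = (
  logger: Logger,
  options: { logEvent?: boolean; clearState?: boolean } = {}
): MiddlewareObj => {
  let initialPersistentAttributes = {};

  return {
    before: async (request) => {
      initialPersistentAttributes = { ...logger.getPersistentLogAttributes() };
      if (options.logEvent === true) {
        // The event is logged here, before any state is reset, so keys
        // appended during a previous warm invocation are still attached.
        logger.info('Lambda invocation event', { event: request.event });
      }
    },
    after: async () => {
      if (options.clearState === true) {
        // Persistent state is only restored here, after the handler
        // (and the event log line above) have already run.
        logger.setPersistentLogAttributes(initialPersistentAttributes);
      }
    },
  };
};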

Steps to Reproduce

  1. Host a Lambda function behind API Gateway.
  2. Invoke the function twice, each time using a unique x-request-id header in the request.
  3. Observe via CloudWatch Logs that the "Lambda invocation event" log line for the second invocation (assuming it was warm) contains the first invocation's x-request-id value in the request_id logger attribute.

AWS Lambda Powertools for TypeScript version

latest

AWS Lambda function runtime

16.x

Packaging format used

Lambda Layers

Execution logs

No response

@angelcabo angelcabo added triage This item has not been triaged by a maintainer, please wait bug Something isn't working labels Feb 23, 2023
@dmeehan1968

I experienced the same with just a simple use of clearState: true. I worked around it with the following at the top of my Lambda handler:

logger.removePersistentLogAttributes(Object.keys(logger.getPersistentLogAttributes()))

NB: This only works for me because I don't set any 'initial persistent attributes'.

@dreamorosi (Contributor)

Hi both, thank you for the report.

I'll be looking at this tomorrow morning (EMEA timezone). I'll try to reproduce the issue and either share a path forward or an explanation in case it's intended behavior.

@dreamorosi dreamorosi self-assigned this Feb 23, 2023
@dreamorosi dreamorosi added the logger This item relates to the Logger Utility label Feb 23, 2023
@dreamorosi (Contributor)

Hi, I've been trying to reproduce the issue following the steps described above and I was not able to.

To isolate the issue and provide a self-contained example, I have created this sample repo that has the same components as the ones described in the "steps to reproduce" section of the OP (link to resources definition):

  • 1 API Gateway
  • 1 Lambda function with the code described below
  • 1 Proxy integration that binds the function to a get method in a route

The code of the Lambda I used is this (link to repo):

import { APIGatewayProxyEvent } from 'aws-lambda';
import { Logger, injectLambdaContext } from '@aws-lambda-powertools/logger';
import middy from '@middy/core';

const logger = new Logger({
  logLevel: process.env.LOG_LEVEL || 'info',
  persistentLogAttributes: {
    version: 1,
  }
});

export const handler = middy(async (event: APIGatewayProxyEvent) => {
  logger.appendKeys({ request_id: event.headers['x-request-id'] });

  logger.info('Hello World!');

  return {
    statusCode: 200,
    body: JSON.stringify({
      message: 'Hello World!',
    }),
  }
}).use(injectLambdaContext(logger, { clearState: true, logEvent: true }));

The command I used to make the requests to the API was:

curl https://API_ID.execute-api.REGION.amazonaws.com/prod/test --header "x-request-id: $(uuidgen)"

Note that both API_ID and REGION were replaced with the values of my deployment, while uuidgen is a command that generates a unique ID each time it's run.

Based on the above, after making two requests these are the logs:
[Screenshots: CloudWatch log entries for each of the two requests]

As you can see from the screenshots, in my case the custom x-request-id header values in each request are different and appear to be scoped to the logs of their respective executions.

Also, for the sake of completeness: I'm using the latest published version of the utility, which is 1.5.1.

Could you please confirm that I'm understanding the issue and the steps correctly? If it appears so, could you please also try the sample I provided and check if you can observe the same behavior?

@dreamorosi dreamorosi added need-more-information Requires more information before making any calls and removed triage This item has not been triaged by a maintainer, please wait labels Feb 24, 2023
@angelcabo (Author) commented Feb 24, 2023

@dreamorosi My apologies, the repro steps are a bit more complicated than I laid out. Simply executing two requests sequentially won't cause the issue to appear. The issue seems to arise when multiple requests hit a warm lambda.

You can re-work the sample code as follows:

import type {APIGatewayEvent, APIGatewayProxyResult} from 'aws-lambda';
import middy from "@middy/core";
import {injectLambdaContext, Logger} from '@aws-lambda-powertools/logger';

const logger = new Logger({
    logLevel: 'info',
    persistentLogAttributes: {
        version: 1
    },
});

const doSomething: () => Promise<string> = function () {
    return new Promise((resolve) => setTimeout(() => resolve('Done!'), 1000))
};

const main = async (event: APIGatewayEvent): Promise<APIGatewayProxyResult> => {
    logger.appendKeys({request_id: event.headers['X-Request-Id']});

    logger.info('Hello World!');

    const message = await doSomething();

    logger.info(message);

    return {
        statusCode: 200,
        body: JSON.stringify({
            message: 'Hello World!',
        }),
    }
};

export const handler = middy(main)
    .use(injectLambdaContext(logger, {logEvent: true, clearState: true}))

Then, issue a single request to the lambda which will warm it up. Then, issue multiple requests in parallel to reproduce the issue with the shared logger instance.
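For instance, a minimal driver for this repro could look like the following sketch (assumptions: Node 18+ for the global fetch and node:crypto's randomUUID; API_URL is a placeholder for the deployed endpoint; run it as an ES module for top-level await):

import { randomUUID } from 'node:crypto';

const API_URL = 'https://API_ID.execute-api.REGION.amazonaws.com/prod/test';

// One request to warm up an execution environment.
await fetch(API_URL, { headers: { 'x-request-id': randomUUID() } });

// Several requests in parallel, so at least one lands on the warm environment.
await Promise.all(
  Array.from({ length: 5 }, () =>
    fetch(API_URL, { headers: { 'x-request-id': randomUUID() } })
  )
);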

I used the exact code above and this is what my logs look like (note the difference between request_id and X-Request-ID on the log event)

[Screenshot: log entries where the request_id attribute differs from the X-Request-Id header on the logged event]

@dreamorosi (Contributor)

Thank you for clarifying, I have two notes:

The issue seems to arise when multiple requests hit a warm lambda.

You can see from the screenshots above that the second one is not a cold start (see the first line of each log entry in the second screenshot).

Then, issue a single request to the lambda which will warm it up. Then, issue multiple requests in parallel to reproduce the issue with the shared logger instance.

Regardless of whether your code uses async/await or promises, Lambda invokes your function one request at a time for each execution environment. This means that if you have two requests at the same time, or if the second request arrives before the first invocation has ended, Lambda will spin up two environments, both of which will have a cold start.

From the docs, in the lifecycle section, let's look at this diagram:
[Diagram: Lambda execution environment lifecycle, showing an Init phase, two Invoke phases, and a Shutdown phase]

From left to right, the diagram represents two subsequent requests. When the first request arrives, Lambda performs the "Init" phase, then proceeds to the "Invoke" one. In this case, the subsequent request arrives some time after the first invocation has already returned (the first "Invoke" has ended), so another "Invoke" phase (the second) happens. Finally, the environment is shut down.

If the second request arrived before the first "Invoke" had completed, which could happen when issuing parallel requests, then Lambda would have to spin up a second environment, which would result in a separate "Init" & "Invoke", completely independent from the first one.


I'm going to try updating my sample with your code & query the logs like you did to see if I can reproduce and get back to you.

@dreamorosi (Contributor) commented Feb 24, 2023

Okay, I can reproduce the issue and I can do so even without the promise or async/await in the code:

[Screenshot: logs showing persistent attributes leaking into a subsequent warm invocation]

Some of the persistent attributes appear to leak into the next invocation, but I'm able to reproduce this only from the second warm invocation onwards. Super weird, but interesting!

I'll continue investigating and work on a fix.

@dreamorosi dreamorosi added confirmed The scope is clear, ready for implementation and removed need-more-information Requires more information before making any calls labels Feb 24, 2023
@dreamorosi (Contributor)

The decorator does not seem to be affected by the issue:

[Screenshot: logs from the decorator version, with no leaked attributes]

import { APIGatewayProxyEvent } from 'aws-lambda';
import { Logger } from '@aws-lambda-powertools/logger';
import { LambdaInterface } from '@aws-lambda-powertools/commons';

const logger = new Logger({
  logLevel: process.env.LOG_LEVEL || 'info',
  persistentLogAttributes: {
    version: 1,
  }
});

class Lambda implements LambdaInterface {
  // Decorate your handler class method
  @logger.injectLambdaContext({ clearState: true, logEvent: true })
  public async handler(event: APIGatewayProxyEvent, _context: unknown): Promise<unknown> {
    logger.appendKeys({ request_id: event.headers['x-request-id'] });

    logger.info('Hello World!');

    return {
      statusCode: 200,
      body: JSON.stringify({
        message: 'Hello World!',
      }),
    }
  }

}

const myFunction = new Lambda();
export const handler = myFunction.handler.bind(myFunction);

@github-actions (bot)

⚠️ COMMENT VISIBILITY WARNING ⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@github-actions github-actions bot added the pending-release This item has been merged and will be released soon label Feb 27, 2023
@dreamorosi dreamorosi removed the confirmed The scope is clear, ready for implementation label Feb 27, 2023
@dreamorosi (Contributor)

A fix was released as part of the latest v1.6.0 release.

@dreamorosi dreamorosi added completed This item is complete and has been merged/shipped and removed pending-release This item has been merged and will be released soon labels Mar 2, 2023
@danrivett commented Mar 2, 2023

I believe we also ran into this, but I didn't have time to investigate the root cause at that point. So thank you so much for reporting and fixing.

At the time, we overrode the addPersistentLogAttributes() method to clear out any existing keys matching the ones being added in the call.

This is what we added:

  // Note: this override lives in a subclass of Logger.
  // Override addPersistentLogAttributes() and not appendKeys(), since the latter delegates to the former
  addPersistentLogAttributes(attributes?: LogAttributes | undefined): void {
    // I've found that attributes don't always get cleared consistently even when `clearState` is set to `true`,
    // so we'll clear out any keys we're about to set, just to make sure
    if (attributes != null) {
      this.removePersistentLogAttributes(Object.keys(attributes));
    }

    super.addPersistentLogAttributes(attributes);
  }

We've upgraded to v1.6.0 but haven't yet removed this override; once we have some time to test, we'll remove it and retest, as I think that PR should fix the issue we saw. If not, I'll create a separate bug later on.

Side note: why did we have to clear out the attributes if we're about to set them anyway? Because we were passing in objects that can contain fields that aren't always present.

For example the first request may set:

logger.addPersistentLogAttributes({
  user: {
    userId: "user-A",
    csrUserId: "csr-user" // set during CSR User Impersonation when responding to helpdesk requests
  }
})

The second request may set:

logger.addPersistentLogAttributes({
  user: {
    userId: "user-B",
  }
})

And what we found was that, for the second request, we had the following metadata on its log entries:

{
  user: {
    userId: "user-B",
    csrUserId: "csr-user" // <-- This should not be here, as it was only set on the first request
  }
}
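The retained csrUserId is consistent with new persistent attributes being deep-merged into the existing ones rather than replacing them. A minimal sketch of that merge behavior, using lodash.merge purely as an example (not necessarily what Logger uses internally):

import merge from 'lodash.merge';

const firstRequest = { user: { userId: 'user-A', csrUserId: 'csr-user' } };
const secondRequest = { user: { userId: 'user-B' } };

// A deep merge keeps nested keys from the first object that the
// second object does not override.
console.log(merge({}, firstRequest, secondRequest));
// -> { user: { userId: 'user-B', csrUserId: 'csr-user' } }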

@Aaron-Zhao-Asurion

@dreamorosi

If the second request arrived before the first "Invoke" had completed, which could happen when issuing parallel requests, then Lambda would have to spin up a second environment, which would result in a separate "Init" & "Invoke", completely independent from the first one.

Is it guaranteed that they are "completely independent"? Quoting from the official docs:

After the function and all extensions have completed, Lambda maintains the execution environment for some time in anticipation of another function invocation. In effect, Lambda freezes the execution environment. When the function is invoked again, Lambda thaws the environment for reuse. Reusing the execution environment has the following implications: ...

Does that mean there is a chance the environment will be reused, so the invocations are not completely independent? If so, my question is: if the environment is reused, can log context from the first request still leak into the second? If the state is reset globally while the first request hasn't completed, the first request's context would be replaced by the second request's.

@dreamorosi (Contributor)

Hi @Aaron-Zhao-Asurion, in Lambda each execution environment can only process one request at a time; this is guaranteed by the service. If a new request comes in while the first one hasn't finished processing, Lambda will provision a new execution environment.

The only case in which Lambda will reuse the first execution environment is when the first request has completed.
