Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lambda: onHealthCheck handling doesn't stop normal request processing #3999

Closed
6 tasks done
pontusvision opened this issue Apr 16, 2020 · 7 comments
Closed
6 tasks done

Comments

@pontusvision
Copy link

This bug report should include:

  • A short, but descriptive title. The title doesn't need "Apollo" in it.
  • The package name and version of Apollo showing the problem:

apolo-server-lambda: 2.12.0

  • If applicable, the last version of Apollo where the problem did not occur.

n/a

  • The expected behaviour

when registered (by calling ApolloServer.createHandler(), the callback should be invoked whenever an HTTP request is made to /.well-known/apollo/server-health.

  • The actual behaviour:

Because of a bug in the implementation, the call gets executed, but the handler function carries on running, and interprets the server-health request as a normal GQL query, or as a Playground request. The problem is here in the ApolloServer.createHandler() method:

        if (event.path === '/.well-known/apollo/server-health') {
        const successfulResponse = {
          body: JSON.stringify({ status: 'pass' }),
          statusCode: 200,
          headers: {
            'Content-Type': 'application/json',
            ...requestCorsHeadersObject,
          },
        };
        if (onHealthCheck) {
      /// >>>>>>  THIS FUNCTION NEVER RETURNS <<<<<<<<<<
          onHealthCheck(event)
            .then(() => {
              return callback(null, successfulResponse);
            })
            .catch(() => {
              return callback(null, {
                body: JSON.stringify({ status: 'fail' }),
                statusCode: 503,
                headers: {
                  'Content-Type': 'application/json',
                  ...requestCorsHeadersObject,
                },
              });
            });
          } else {
            return callback(null, successfulResponse);
          }
      }


To reproduce, instantiate apollo server, and set a simple 'hello world' function:

const apolloServer = new ApolloServer();
apolloServer.createHandler(
  { cors: null, 
  onHealthCheck: (event: APIGatewayProxyEvent)=> {
    console.log(`got event ${JSON.stringify(event)}`); 
  return true;
  });

Then run a bash script to call the health check:

while [[ true ]]; do curl -v -XGET -H 'Accept: application/json' -H 'Content-Type: application/json' 'http://localhost:4000/.well-known/apollo/server-health'; done

Proposed solution:

My proposed solution (after spending about 3 days trying to troubleshoot what was going on in lambdas) is to convert the handler function to use the more modern 'async/await' pattern:

class ApoloServer{
   ...

  public createHandlerHC(
    { cors, onHealthCheck }: CreateHandlerOptions = {
      cors: undefined,
      onHealthCheck: undefined
    }
  ) {
    // We will kick off the `willStart` event once for the server, and then
    // await it before processing any requests by incorporating its `await` into
    // the GraphQLServerOptions function which is called before each request.
    const promiseWillStart = this.willStart();

    const corsHeaders = new Headers();

    if (cors) {
      if (cors.methods) {
        if (typeof cors.methods === "string") {
          corsHeaders.set("access-control-allow-methods", cors.methods);
        } else if (Array.isArray(cors.methods)) {
          corsHeaders.set(
            "access-control-allow-methods",
            cors.methods.join(",")
          );
        }
      }

      if (cors.allowedHeaders) {
        if (typeof cors.allowedHeaders === "string") {
          corsHeaders.set("access-control-allow-headers", cors.allowedHeaders);
        } else if (Array.isArray(cors.allowedHeaders)) {
          corsHeaders.set(
            "access-control-allow-headers",
            cors.allowedHeaders.join(",")
          );
        }
      }

      if (cors.exposedHeaders) {
        if (typeof cors.exposedHeaders === "string") {
          corsHeaders.set("access-control-expose-headers", cors.exposedHeaders);
        } else if (Array.isArray(cors.exposedHeaders)) {
          corsHeaders.set(
            "access-control-expose-headers",
            cors.exposedHeaders.join(",")
          );
        }
      }

      if (cors.credentials) {
        corsHeaders.set("access-control-allow-credentials", "true");
      }
      if (typeof cors.maxAge === "number") {
        corsHeaders.set("access-control-max-age", cors.maxAge.toString());
      }
    }

    return async (
      event: APIGatewayProxyEvent,
      context: LambdaContext
      // callback: APIGatewayProxyCallback
    ) => {
      // We re-load the headers into a Fetch API-compatible `Headers`
      // interface within `graphqlLambda`, but we still need to respect the
      // case-insensitivity within this logic here, so we'll need to do it
      // twice since it's not accessible to us otherwise, right now.
      const eventHeaders = new Headers(event.headers);

      // Make a request-specific copy of the CORS headers, based on the server
      // global CORS headers we've set above.
      const requestCorsHeaders = new Headers(corsHeaders);

      if (cors && cors.origin) {
        const requestOrigin = eventHeaders.get("origin");
        if (typeof cors.origin === "string") {
          requestCorsHeaders.set("access-control-allow-origin", cors.origin);
        } else if (
          requestOrigin &&
          (typeof cors.origin === "boolean" ||
            (Array.isArray(cors.origin) &&
              requestOrigin &&
              cors.origin.includes(requestOrigin)))
        ) {
          requestCorsHeaders.set("access-control-allow-origin", requestOrigin);
        }

        const requestAccessControlRequestHeaders = eventHeaders.get(
          "access-control-request-headers"
        );
        if (!cors.allowedHeaders && requestAccessControlRequestHeaders) {
          requestCorsHeaders.set(
            "access-control-allow-headers",
            requestAccessControlRequestHeaders
          );
        }
      }

      // Convert the `Headers` into an object which can be spread into the
      // various headers objects below.
      // Note: while Object.fromEntries simplifies this code, it's only currently
      //       supported in Node 12 (we support >=6)
      const requestCorsHeadersObject = Array.from(requestCorsHeaders).reduce<
        Record<string, string>
      >((headersObject, [key, value]) => {
        headersObject[key] = value;
        return headersObject;
      }, {});

      if (event.httpMethod === "OPTIONS") {
        context.callbackWaitsForEmptyEventLoop = false;
        return {
          body: "",
          statusCode: 204,
          headers: {
            ...requestCorsHeadersObject
          }
        };
        // return callback(null, {
        //   body: '',
        //   statusCode: 204,
        //   headers: {
        //     ...requestCorsHeadersObject,
        //   },
        // });
      }

      if (event.path === "/.well-known/apollo/server-health") {
        const successfulResponse = {
          body: JSON.stringify({ status: "pass" }),
          statusCode: 200,
          headers: {
            "Content-Type": "application/json",
            ...requestCorsHeadersObject
          }
        };

        const failureResponse = {
          body: JSON.stringify({ status: "fail" }),
          statusCode: 503,
          headers: {
            "Content-Type": "application/json",
            ...requestCorsHeadersObject
          }
        };

        if (onHealthCheck) {
          // wait HERE!!! Otherwise, the apollo server execution carries on!!!
          // return;
          const hcRes = await onHealthCheck(event);

          return hcRes ? successfulResponse : failureResponse;
          // .then(() => {
          //   // return callback(null, successfulResponse);
          //   return successfulResponse;
          // })
          // .catch(() => {
          //   return callback(null, {
          //     body: JSON.stringify({ status: 'fail' }),
          //     statusCode: 503,
          //     headers: {
          //       'Content-Type': 'application/json',
          //       ...requestCorsHeadersObject,
          //     },
          //   });
          // });
        } else {
          return successfulResponse;
          // return callback(null, successfulResponse);
        }
      }

      if (this.playgroundOptions && event.httpMethod === "GET") {
        const acceptHeader = event.headers["Accept"] || event.headers["accept"];
        if (acceptHeader && acceptHeader.includes("text/html")) {
          const path =
            event.path ||
            (event.requestContext && event.requestContext.path) ||
            "/";

          const playgroundRenderPageOptions: PlaygroundRenderPageOptions = {
            endpoint: path,
            ...this.playgroundOptions
          };

          // return callback(null, {
          return {
            body: renderPlaygroundPage(playgroundRenderPageOptions),
            statusCode: 200,
            headers: {
              "Content-Type": "text/html",
              ...requestCorsHeadersObject
            }
          };
          // );
        }
      }

      return this.handleGQL(
        event,
        context,
        promiseWillStart,
        requestCorsHeaders
      );

      // return graphqlLambda()
      // const callbackFilter: APIGatewayProxyCallback = (error, result) => {
      //   callback(
      //     error,
      //     result && {
      //       ...result,
      //       headers: {
      //         ...result.headers,
      //         ...requestCorsHeadersObject
      //       }
      //     }
      //   );
      // };

      // graphqlLambda(async () => {
      //   // In a world where this `createHandler` was async, we might avoid this
      //   // but since we don't want to introduce a breaking change to this API
      //   // (by switching it to `async`), we'll leverage the
      //   // `GraphQLServerOptions`, which are dynamically built on each request,
      //   // to `await` the `promiseWillStart` which we kicked off at the top of
      //   // this method to ensure that it runs to completion (which is part of
      //   // its contract) prior to processing the request.
      //   await promiseWillStart;
      //   return this.createGraphQLServerOptions(event, context);
      // })(event, context, callbackFilter);
    };
  }

  private async handleGQL(
    event: APIGatewayProxyEvent,
    context: LambdaContext,
    promiseWillStart: Promise<void>,
    requestCorsHeaders: Headers
  ) {
    context.callbackWaitsForEmptyEventLoop = false;
    const optionsFunc = async () => {
      // In a world where this `createHandler` was async, we might avoid this
      // but since we don't want to introduce a breaking change to this API
      // (by switching it to `async`), we'll leverage the
      // `GraphQLServerOptions`, which are dynamically built on each request,
      // to `await` the `promiseWillStart` which we kicked off at the top of
      // this method to ensure that it runs to completion (which is part of
      // its contract) prior to processing the request.
      await promiseWillStart;
      return this.createGraphQLServerOptions(event, context);
    };
    if (event.httpMethod === "POST" && !event.body) {
      return {
        body: "POST body missing.",
        statusCode: 500
      };
    }
    const corsHeaders = Array.from(requestCorsHeaders).reduce<
      Record<string, string>
    >((headersObject, [key, value]) => {
      headersObject[key] = value;
      return headersObject;
    }, {});
    // let headers = new Headers(event.headers);
    // headers.append (requestCorsHeadersObject);
    const eventHeaders = Array.from(new Headers(event.headers)).reduce<
      Record<string, string>
    >((headersObject, [key, value]) => {
      headersObject[key] = value;
      return headersObject;
    }, {});

    try {
      const gqlResp: HttpQueryResponse = await runHttpQuery([event, context], {
        method: event.httpMethod,
        options: optionsFunc,
        query:
          event.httpMethod === "POST" && event.body
            ? JSON.parse(event.body)
            : event.queryStringParameters,
        request: {
          url: event.path,
          method: event.httpMethod,
          headers: new Headers(event.headers)
        }
      });

      const combinedHeaders = new Headers({
        ...corsHeaders,
        ...eventHeaders,
        ...gqlResp.responseInit.headers
      });

      return {
        body: gqlResp.graphqlResponse,
        statusCode: 200,
        headers: combinedHeaders
      };
    } catch (error) {
      if ("HttpQueryError" !== error.name) {
        return error;
      }
      const combinedHeaders = new Headers({
        ...corsHeaders,
        ...eventHeaders,
        ...error.headers
      });

      return {
        body: error.message,
        statusCode: error.statusCode,
        headers: combinedHeaders
      };
    }
  }

}
@uehondor
Copy link
Contributor

+1

I've also found that handler doesn't wait on the promise returned from the onHealthCheck callback.

I think a simple return in this code will suffice.

For example:
image

uehondor added a commit to uehondor/apollo-server that referenced this issue Feb 26, 2021
This fixes the issue "Custom onHealthCheck function is intermittently called" reported at apollographql#3999.
This fix ensures the healthcheck handling code does not carry on executing and waits for the returned promise to be resolved or rejected.
@glasser
Copy link
Member

glasser commented Feb 26, 2021

I agree that this looks like a bug in the Lambda health check handler that you can escape from the health check block and keep going.

Note that @adikari in #1989 is discussing refactoring the Lambda handler to work either as async (Promise-returning) or callback-taking, which may conflict with this work.

@glasser glasser changed the title Custom onHealthCheck function is intermittently called. Lambda: onHealthCheck handling doesn't stop normal request processing Feb 26, 2021
@uehondor
Copy link
Contributor

@glasser Hi David, thanks for bringing this to my attention. I agree the lambda handler needs refactoring.
I think a reactor at that scale might take longer to get in, merged and released with the backlog of open PRs.

I wondered if the simple bug fix above might be easier to get merged and released to unblock teams.

@glasser
Copy link
Member

glasser commented Feb 26, 2021

Yep, I'd also take a PR for that!

glasser pushed a commit that referenced this issue Mar 1, 2021
…mise (#4969)

* Ensure lambda healthcheck callback waits on returned promise

This fixes the issue "Custom onHealthCheck function is intermittently called" reported at #3999.
This fix ensures the healthcheck handling code does not carry on executing and waits for the returned promise to be resolved or rejected.

* Add new tests to exercise lambda healthcheck

The tests validates that a healthcheck endpoint exists and correctly handles promises returned from the callback.

This commit also moves the 'mock server'-like code (responsible for transforming lambda response into HTTP response) has been moved to a common location where it is reused in the newly created apollo-server-lambda/src/ApolloServer.test.ts.
@glasser
Copy link
Member

glasser commented Mar 1, 2021

Fixed in #4969

@glasser glasser closed this as completed Mar 1, 2021
@glasser
Copy link
Member

glasser commented Mar 6, 2021

I've published a release with this to version 2.21.1-alpha.0. My plan is to release 2.21.1 on Monday. If you want to test that this fix works before then, try running the prerelease yourself, and provide feedback on #4990

@adikari
Copy link
Contributor

adikari commented Mar 7, 2021

I have updated the changes in this #4969 slightly to support changes for #5004

glasser added a commit that referenced this issue Mar 11, 2021
Lambda has two ways of defining a handler: as an async Promise-returning function, and as a callback-invoking function. https://docs.aws.amazon.com/lambda/latest/dg/nodejs-handler.html The async variety was introduced with Lambda's Node 8 runtime. Apparently the callback-invoking variety was removed with their Node 14 runtime (their docs don't mention this anywhere but our users have reported this: #1989 (comment)). While AWS doesn't directly support pre-Node-8 runtimes any more, it's possible some users are using a Custom Runtime that still requires the Node 6 version, and Apollo Server still technically supports Node 6. So we would like to support both an async and a callback-invoking handler.

To implement this, I rewrote the handler to be an async function and used a little maybeCallbackify function to make it work as either type of handler. (Apollo Server 3 will drop Node 6 support, at which point we should just make this package always return an async handler. Tracking in #5018.) This simplified a lot of the logic and I was able to replace various Promise and callback-y stuff with direct code. To make this easier, I inlined the graphqlLambda function (which was the main API in Apollo Server v1 but is not exported in v2). This makes it easier to avoid bugs like #3999.

(We aren't positive if there is actually a Node 14 compatibility issue or not; see #1989 (comment). It's still a helpful refactor though.)

Original version by @adikari.
Fixes #1989

Co-authored-by: David Glasser <glasser@davidglasser.net>
This was referenced Mar 15, 2021
kodiakhq bot added a commit to ProjectXero/dbds that referenced this issue Mar 17, 2021
Bumps apollo-datasource from 0.7.2 to 0.7.3.

Changelog
Sourced from apollo-datasource's changelog.

CHANGELOG
The version headers in this history reflect the versions of Apollo Server itself.  Versions of other packages (e.g., those which are not actual HTTP integrations; packages not prefixed with "apollo-server", or just supporting packages) may use different versions.
🆕 Please Note!: 🆕 The @apollo/federation and @apollo/gateway packages now live in the apollographql/federation repository.

@apollo/gateway
@apollo/federation

vNEXT

The changes noted within this vNEXT section have not been released yet.  New PRs and commits which introduce changes should include an entry in this vNEXT section as part of their development.  With few exceptions, the format of the entry should follow convention (i.e., prefix with package name, use markdown backtick formatting for package names and code, suffix with a link to the change-set à la [PR #YYY](https://link/pull/YYY), etc.).  When a release is being prepared, a new header will be (manually) created below and the appropriate changes within that release will be moved into the new section.


apollo-server-core: The SIGINT and SIGTERM signal handlers installed by default (when not disabled by stopOnTerminationSignals: false) now stay active (preventing process termination) while the server shuts down, instead of letting a second signal terminate the process. The handlers still re-signal the process after this.stop() concludes. Also, if this.stop() throws, the signal handlers will now log and exit 1 instead of throwing an uncaught exception. [Issue #4931](apollographql/apollo-server#4931)
apollo-server-lambda: (UPDATE THIS MESSAGE BEFORE RELEASE; we are not sure if this actually helps nodejs14 compatibility or if it's just a nice refactor.) Support the nodejs14 runtime by changing the handler to be an async handler. (For backwards compatibility, if the handler receives a callback, it still acts like a non-async handler.) [Issue #1989](apollographql/apollo-server#1989) [PR #5004](apollographql/apollo-server#5004)

v2.21.1

apollo-server-lambda: The onHealthCheck option did not previously work. Additionally, health checks (with onHealthCheck or without) didn't work in all Lambda contexts, such as behind Custom Domains; the path check is now more flexible. [Issue #3999](apollographql/apollo-server#3999) [PR #4969](apollographql/apollo-server#4969) [Issue #4891](apollographql/apollo-server#4891) [PR #4892](apollographql/apollo-server#4892)
The debug option to new ApolloServer (which adds stack traces to errors) now affects errors that come from requests executed with server.executeOperation (and its wrapper apollo-server-testing), instead of just errors that come from requests executed over HTTP. [Issue #4107](apollographql/apollo-server#4107) [PR #4948](apollographql/apollo-server#4948)
Bump version of @apollographql/graphql-playground-html to v1.6.27 and @apollographql/graphql-playground-react to v1.7.39 to resolve incorrectly rendered CDN URL when Playground version was false-y.  [PR #4932](apollographql/apollo-server#4932) [PR #4955](apollographql/apollo-server#4955) [Issue #4937](apollographql/apollo-server#4937)

v2.21.0

Apollo Server can now be installed with graphql@15 without causing peer dependency errors or warnings. (Apollo Server has a file upload feature which was implemented as a wrapper around the graphql-upload package. We have been unable to upgrade our dependency on that package due to backwards-incompatible changes in later versions, and the version we were stuck on did not allow graphql@15 as a peer dependency. We have now switched to a fork of that old version called @apollographql/graphql-upload-8-fork that allows graphql@15.) Also bump the graphql-tools dependency from 4.0.0 to 4.0.8 for graphql@15 support. [Issue #4865](apollographql/apollo-server#4865)

v2.20.0

apollo-server: Previously, ApolloServer.stop() functioned like net.Server.close() in that it did not close idle connections or close active connections after a grace period. This meant that trying to await ApolloServer.stop() could hang indefinitely if there are open connections. Now, this method closes idle connections, and closes active connections after 10 seconds. The grace period can be adjusted by passing the new stopGracePeriodMillis option to new ApolloServer, or disabled by passing Infinity (though it will still close idle connections). Note that this only applies to the "batteries-included" ApolloServer in the apollo-server package with its own built-in Express and HTTP servers. [PR #4908](apollographql/apollo-server#4908) [Issue #4097](apollographql/apollo-server#4097)
apollo-server-core: When used with ApolloGateway, ApolloServer.stop now invokes ApolloGateway.stop. (This makes sense because ApolloServer already invokes ApolloGateway.load which is what starts the behavior stopped by ApolloGateway.stop.) Note that @apollo/gateway 0.23 will expect to be stopped in order for natural program shutdown to occur. [PR #4907](apollographql/apollo-server#4907) [Issue #4428](apollographql/apollo-server#4428)
apollo-server-core: Avoid instrumenting schemas for the old graphql-extensions library unless extensions are provided. [PR #4893](apollographql/apollo-server#4893) [Issue #4889](apollographql/apollo-server#4889)
apollo-server-plugin-response-cache@0.6.0: The shouldReadFromCache and shouldWriteToCache hooks were always documented as returning ValueOrPromise<boolean> (ie, that they could be either sync or async), but they actually only worked if they returned a bool. Now they can be either sync or async as intended. [PR #4890](apollographql/apollo-server#4890) [Issue #4886](apollographql/apollo-server#4886)
apollo-datasource-rest@0.10.0: The RESTDataSource.trace method is now protected instead of private to allow more control over logging and metrics. [PR #3940](apollographql/apollo-server#3940)

v2.19.2

apollo-server-express: types: Export ExpressContext from main module. [PR #4821](apollographql/apollo-server#4821) [Issue #3699](apollographql/apollo-server#3699)
apollo-server-env: types: The first parameter to fetch is now marked as required, as intended and in accordance with the Fetch API specification. [PR #4822](apollographql/apollo-server#4822) [Issue #4741](apollographql/apollo-server#4741)
apollo-server-core: Update graphql-tag package to latest, now with its graphql-js peerDependencies expanded to include ^15.0.0 [PR #4833](apollographql/apollo-server#4833)

v2.19.1

apollo-server-core: The debugPrintReports option to ApolloServerPluginUsageReporting now prints traces as well. [PR #4805](apollographql/apollo-server#4805)

v2.19.0

apollo-server-testing: types: Allow generic variables usage of query and mutate functions. [PR #4383](apollograpqh/apollo-server#4383)
apollo-server-express: Export the GetMiddlewareOptions type. [PR #4599](apollograpqh/apollo-server#4599)
apollo-server-lambda: Fix file uploads - ignore base64 decoding for multipart queries. [PR #4506](apollographql/apollo-server#4506)
apollo-server-core: Do not send  operation documents that cannot be executed to Apollo Studio. Instead, information about these operations will be combined into one "operation" for parse failures, one for validation failures, and one for unknown operation names.



... (truncated)


Commits

c212627 Release
See full diff in compare view




Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants