
[@sentry/node] AWS Lambda and other Serverless solutions support #1449

Closed
vietbui opened this issue Jul 28, 2018 · 77 comments
Comments

@vietbui commented Jul 28, 2018

  • @sentry/node version 4.0.0-beta.11
  • I'm using hosted Sentry

What is the current behavior?

I'm using @sentry/node to capture exceptions in an AWS Lambda function.

    .catch(err => {
      Sentry.captureException(err)
      context.fail()
    })

However, it kills the process when context.fail() is called and the exception does not end up in Sentry.

I could do a workaround like:

    .catch(err => {
      Sentry.captureException(err)
      setTimeout(() => context.fail(), 1000)
    })

What is the expected behavior?

It would be nice if I could do something like:

    .catch(err => {
      Sentry.captureException(err, () => context.fail())
    })

Or some way to handle the callback globally.

@vietbui changed the title from "[raven-node] Callback after capture exception (or event, message)" to "[raven-node] Callback after capturing exception (or event, message)" on Jul 29, 2018
@kamilogorek (Member) commented Aug 2, 2018

This may help, I guess: https://blog.sentry.io/2018/06/20/how-droplr-uses-sentry-to-debug-serverless (it's using the old raven version, which had a callback, but I'm mostly pointing at the callbackWaitsForEmptyEventLoop flag).

There's no official way yet, as we're still trying things out in beta, but it's doable with this code:

import { init, getDefaultHub } from '@sentry/node';

init({
  dsn: 'https://my-dsn.com/1337'
});

exports.myHandler = async function(event, context) {
  // your code

  await getDefaultHub().getClient().captureException(error, getDefaultHub().getScope());
  context.fail();
}
@vietbui (Author) commented Aug 6, 2018

@kamilogorek Thank you for the pointer. I'll give it a try and report back what I learn.

@vietbui (Author) commented Aug 7, 2018

@kamilogorek Your suggestion works. I'm looking forward to a more official way.

@vietbui changed the title from "[raven-node] Callback after capturing exception (or event, message)" to "[@sentry/node] Callback after capturing exception (or event, message)" on Aug 7, 2018
@HazAT (Member) commented Sep 7, 2018

@vietbui
In 4.0.0-rc.1 we introduced a function on the client called close, you call it like this:

import { getCurrentHub } from '@sentry/node';

getCurrentHub().getClient().close(2000).then(result => {
  if (!result) {
    console.log('We reached the timeout for emptying the request buffer, still exiting now!');
  }
  global.process.exit(1);
})

close waits until all pending requests are sent, up to the timeout (2000 ms in this example). It always resolves; result = false means the timeout was reached.
This is our official API.
While the previous approach will still work, the close method works for all cases.
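The timeout semantics described above can be sketched with a plain promise buffer; the class and method names here are illustrative, not the SDK's internals:

```javascript
// Minimal sketch of the close(timeout) semantics: resolve true when all
// pending sends finish in time, false when the timeout is reached first.
// Illustrative only, not the actual @sentry/node implementation.
class RequestBuffer {
  constructor() {
    this.pending = new Set();
  }

  // Track an in-flight send; it is already executing when added.
  add(promise) {
    this.pending.add(promise);
    promise.then(
      () => this.pending.delete(promise),
      () => this.pending.delete(promise)
    );
    return promise;
  }

  // Resolve true if everything settles before `timeout` ms, else false.
  close(timeout) {
    const allDone = Promise.all(this.pending).then(() => true);
    const timedOut = new Promise(resolve =>
      setTimeout(() => resolve(false), timeout)
    );
    return Promise.race([allDone, timedOut]);
  }
}
```

A fast send resolves `true` almost immediately; a send that never completes resolves `false` once the timeout elapses, so the caller can still exit.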

@HazAT closed this Sep 7, 2018
@vietbui (Author) commented Sep 13, 2018

@HazAT Nice one. Thanks for all the hard work.

@vietbui (Author) commented Sep 21, 2018

In 4.0.3 I call it like this in my lambda function:

try {
  ...
} catch (err) {
  await getCurrentHub().getClient().captureException(err, getCurrentHub().getScope())
  throw err
}

getDefaultHub() is no longer available.

@kamilogorek (Member) commented Sep 24, 2018

@vietbui it's called getCurrentHub now, as we had to unify our API with the other languages' SDKs.

@vietbui (Author) commented Sep 26, 2018

@kamilogorek Thanks for the clarification. There is a problem with the getCurrentHub approach, though: somehow the scope I set up did not end up in Sentry.

In the end I took a different approach as suggested by @HazAT to capture exception in my lambda functions:

try {
  ...
} catch (err) {
  Sentry.captureException(err)
  await new Promise(resolve => Sentry.getCurrentHub().getClient().close(2000).then(resolve))
  throw err
}

And it works perfectly.

@albinekb commented Oct 3, 2018

Is this the recommended way to wait/force sentry to send events?

@Andriy-Kulak commented Nov 11, 2018

@albinekb yes – https://docs.sentry.io/learn/draining/?platform=browser

This solution does not work for me for some reason. It only works the first time in production, when there is a cold start, and does not work after that. Here is example code:

'use strict'

const Sentry =  require('@sentry/node')
Sentry.init({
  dsn: 'xxx',
  environment: process.env.STAGE
});

module.exports.createPlaylist = async (event, context, callback) => {
  context.callbackWaitsForEmptyEventLoop = false
  if(!event.body) {
    const error = new Error('Missing body parameters')
    Sentry.captureException(error)
    await new Promise(resolve => Sentry.getCurrentHub().getClient().close(2000).then(resolve))
    return {
      statusCode: 500,
      headers: { 'Content-Type': 'text/plain' },
      body: 'Missing body parameters'
    }
  }
  return {
    statusCode: 200,
  }
};
@albinekb commented Nov 12, 2018

@Andriy-Kulak That's also stated in the docs:

After shutdown the client cannot be used any more so make sure to only do that right before you shut down the application.

So I don't know how we can handle this in lambda, where we don't know when the application will be killed. The best option would be to drain sentry per request, like we could with the old API?

@LinusU commented Nov 12, 2018

@HazAT could we reopen this, please? I think it's important to have a way to work with this on Lambda, which is becoming an increasingly common target to deploy to.

This is currently blocking me from upgrading to the latest version...

Personally, I would prefer being able to get a Promise/callback when reporting an error. Having a way to drain the queue without actually closing it afterward would be the next best thing...

What was the rationale of removing the callback from captureException?

@Andriy-Kulak commented Nov 12, 2018

@albinekb it does not work at all if I remove the following line

await new Promise(resolve => Sentry.getCurrentHub().getClient().close(2000).then(resolve))

@LinusU what is your solution, and are you using sentry or raven?

For me, basically, the following works with @sentry/node 4.3.0, but I have to make the lambda manually wait some period of time (in this case I put 2 seconds) for Sentry to do what it needs to do. I'm not sure why that wait needs to be there, because we are already awaiting Sentry's captureException request. If I don't have the waiting period afterwards, Sentry does not seem to send the error.

'use strict'

const Sentry =  require('@sentry/node')
Sentry.init({
  dsn: 'xxx',
  environment: process.env.STAGE
});

module.exports.createPlaylist = async (event, context, callback) => {
  context.callbackWaitsForEmptyEventLoop = false
  if(!event.body) {
    const error = new Error('Missing body parameters in createPlaylist')
    await Sentry.captureException(error)
    await new Promise(resolve => {setTimeout(resolve, 2000)})
    return {
      statusCode: 500,
      headers: { 'Content-Type': 'text/plain' },
      body: 'Missing body parameters'
    }
  }
  return {
    statusCode: 200,
  }
};
@dwelch2344 commented Nov 12, 2018

We're also getting burned by this on Lambda. We started with the new libs and are totally boxed out, considering going back to Raven. We're writing tests right now that attempt to close the hub and then reinitialize it, which would be a workable workaround if it holds water. But it's still hacky and likely to cause problems under load.

Personally I'd prefer some sort of flush() that returns a promise – hard to find a downside. Think it'd ever happen?

@LinusU commented Nov 12, 2018

what is your solution, and are you using sentry or raven?

I'm using the following express error handler:

app.use((err: any, req: express.Request, res: express.Response, next: express.NextFunction) => {
  let status = (err.status || err.statusCode || 500) as number

  if (process.env.NODE_ENV === 'test') {
    return next(err)
  }

  if (status < 400 || status >= 500) {
    Raven.captureException(err, () => next(err))
  } else {
    next(err)
  }
})

I'm then using scandium to deploy the Express app to Lambda

edit: this is with Raven "raven": "^2.6.3",

@LinusU commented Nov 13, 2018

The dream API would be something like this 😍

Sentry.captureException(err: Error): Promise<void>
@kamilogorek (Member) commented Nov 13, 2018

@LinusU https://github.com/getsentry/sentry-javascript/blob/master/packages/core/src/baseclient.ts#L145-L152 🙂

You have to use the client instance directly to get it, though. The reason is that we decided the main scenario is a "fire and forget" type of behavior, thus it's not an async method. Internally, however, we do have an async API which we use ourselves.

@LinusU commented Nov 13, 2018

Seems that what I actually want is something more like:

const backend = client.getBackend()
const event = await backend.eventFromException(error)
await client.processEvent(event, finalEvent => backend.sendEvent(finalEvent))

In order to skip all the queueing and buffering...

I get that the design is tailored to "fire and forget", and to running in a long-running server, and it's probably quite good at that since it does a lot of buffering, etc. The problem is that this is the exact opposite of what you want for Lambda, App Engine, and other "serverless" architectures, which are becoming more and more common.

Would it be possible to have a special method that sends the event as fast as possible, and returns a Promise that we can await? That would be perfect for the serverless scenarios!

class Sentry {
  // ...

  async unbufferedCaptureException(err: Error): Promise<void> {
    const backend = this.client.getBackend()
    const event = await backend.eventFromException(err)
    await this.client.processEvent(event, finalEvent => backend.sendEvent(finalEvent))
  }

  // ...
}
@kamilogorek (Member) commented Nov 13, 2018

@LinusU we'll most likely create a specific serverless package for this scenario. We just need to find some time, as it's the end of the year and things are getting crowded. Will keep you posted!

@LinusU commented Nov 13, 2018

we'll most likely create a specific serverless package for this scenario

That would be amazing! 😍

@kamilogorek changed the title from "[@sentry/node] Callback after capturing exception (or event, message)" to "[@sentry/node] AWS Lambda and other Serverless solutions support" on Nov 13, 2018
@kamilogorek (Member) commented Nov 20, 2018

@mtford90

when exactly would I use this better solution? As far as I know it's not possible to know when the lambda will be shutdown - plus it seems silly to wait for an arbitrary amount of time for shutdown to allow sentry to do its thing - especially on expensive high memory/cpu lambda functions.

(talking about draining)

It's meant to be used as the last thing before shutting down the server process. The timeout in the drain method is the maximum time we'll wait before shutting down; it doesn't mean we will always use up that time. If the server is fully responsive, it'll send all the remaining events right away.

There's no way to know this per se, but there's a way to tell the lambda when it should be shut down, using the handler's callback argument.
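That callback-based shutdown signal can be sketched as follows; `flushEvents` is a hypothetical stand-in for the SDK's drain step, not a real Sentry API:

```javascript
// Sketch: tell Lambda the invocation is done via the handler callback,
// but only after the drain step has finished. `flushEvents` is a
// hypothetical stand-in for the SDK's drain, not a real Sentry API.
function makeHandler(flushEvents) {
  return (event, context, callback) => {
    // Don't keep the container alive waiting for stray timers/sockets.
    context.callbackWaitsForEmptyEventLoop = false;

    Promise.resolve()
      .then(() => {
        // ... the actual handler work would go here ...
        return 'Hello from Lambda';
      })
      .then(result =>
        // Drain first, then signal completion with the result.
        flushEvents(2000).then(() => callback(null, result))
      )
      .catch(err => flushEvents(2000).then(() => callback(err)));
  };
}
```

The callback fires only after the drain resolves, so events are already on the wire when Lambda freezes or recycles the container.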

Also @LinusU, I re-read your previous comment, specifically this part:

Would it be possible to have a special method that sends the event as fast as possible, and returns a Promise that we can await? That would be perfect for the serverless scenarios!

This is how we implemented our buffer. Every captureX call on the client adds the request to the buffer, that's correct, but it's not queued in any way; it executes right away, and this pattern is only used so we can tell whether everything was successfully sent through to Sentry.

public async add(task: Promise<T>): Promise<T> {
  if (this.buffer.indexOf(task) === -1) {
    this.buffer.push(task);
  }
  task.then(async () => this.remove(task)).catch(async () => this.remove(task));
  return task;
}
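A self-contained sketch of that pattern (class and method names are illustrative): the task promise is already executing when add() is called, and the buffer merely records it so completion can be observed later:

```javascript
// Self-contained sketch of the buffer described above: add() does not
// queue or delay the task (it is already running); it only records the
// promise so we can later tell whether everything was sent.
class PromiseBuffer {
  constructor() {
    this.buffer = [];
  }

  remove(task) {
    this.buffer = this.buffer.filter(t => t !== task);
    return task;
  }

  add(task) {
    if (this.buffer.indexOf(task) === -1) {
      this.buffer.push(task);
    }
    // Drop the task from the buffer once it settles, either way.
    task.then(() => this.remove(task)).catch(() => this.remove(task));
    return task;
  }
}
```

Awaiting add(task) simply awaits the original task, so the caller gets the send's outcome with no extra queueing delay.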

This means that if you do something like this in AWS Lambda (assuming you want to use default client, which is the simplest case):

import * as Sentry from '@sentry/node';

Sentry.init({ dsn: '__YOUR_DSN__' });

exports.handler = (event, context, callback) => {
    try {
      // do something
    } catch (err) {
      Sentry.getCurrentHub()
        .getClient()
        .captureException(err)
        .then((status) => {
          // request status
          callback(null, 'Hello from Lambda');
        })
    }
};

You can be sure that it was sent right away and there was no timing/processing overhead.

@jviolas commented Nov 20, 2018

@kamilogorek
Does this mean something like this should work in a async/await handler (where you don't use the callback)?

import * as Sentry from '@sentry/node';

Sentry.init({ dsn: '__YOUR_DSN__' });

exports.handler = async (event, context) => {
    try {
      // do something

      return 'Hello from Lambda';
    } catch (err) {
      await Sentry.getCurrentHub().getClient().captureException(err);
      return 'Hello from Lambda with error';
    }
};
@zeusdeux commented Mar 4, 2019

@kamilogorek I wish I had debug info, but there's nothing in the logs. I always did await the overridden captureException. By "the regular way", do you mean without overriding captureException?

@kamilogorek (Member) commented Mar 4, 2019

@zeusdeux exactly, just call our native Sentry.captureException(error) without any overrides.

So your helper will be:

import * as Sentry from '@sentry/node'

export function init({ host, method, lambda, deployment }) {
  const environment = host === process.env.PRODUCTION_URL ? 'production' : host

  Sentry.init({
    dsn: process.env.SENTRY_DSN,
    environment,
    beforeSend(event, hint) {
      if (hint && hint.originalException) {
        // eslint-disable-next-line
        console.log('Error:', hint.originalException);
      }
      return event;
    }
  })

  Sentry.configureScope(scope => {
    scope.setTag('deployment', deployment)
    scope.setTag('lambda', lambda)
    scope.setTag('method', method)
  })
}

and in the code you call it:

import * as Sentry from '@sentry/node'

try {
  // ...
} catch (err) {
  Sentry.captureException(err);
  await Sentry.flush(2000);
  return respondWithError('Something went wrong', 500);
}
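That capture-then-flush pattern generalizes into a small wrapper. In this sketch, `capture` and `flush` are injected stand-ins (with the real SDK they would be Sentry.captureException and Sentry.flush), so the example stays self-contained:

```javascript
// Sketch of the capture-then-flush pattern as a reusable wrapper.
// `capture` and `flush` are injected stand-ins for Sentry.captureException
// and Sentry.flush so this example is self-contained.
function withErrorReporting(handler, { capture, flush, timeout = 2000 }) {
  return async (...args) => {
    try {
      return await handler(...args);
    } catch (err) {
      capture(err);          // fire-and-forget enqueue
      await flush(timeout);  // wait for the event to leave the process
      throw err;             // let the caller/platform see the failure
    }
  };
}
```

Rethrowing after the flush keeps the Lambda's own error semantics intact while still giving the event time to reach Sentry.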
@zeusdeux commented Mar 4, 2019

@kamilogorek I'll give it a go and report back. Also, thanks for the tip on beforeSend ^_^

@tanduong commented Mar 6, 2019

await Sentry.flush(2000);

is also not working for me.

@kamilogorek (Member) commented Mar 6, 2019

@tanduong can you provide repro case? Just stating that it doesn't work isn't too helpful 😅

@tanduong commented Mar 6, 2019

@kamilogorek actually, I just found out that

await Sentry.getCurrentHub().getClient().close(2000)

doesn't work for me either, because my lambda function is attached to a VPC.

I confirm that

await Sentry.flush(2000);

is actually working.

BTW, how would you deal with a lambda in a VPC? Attach a NAT gateway? I just want Sentry, not the public internet.

@vietbui (Author) commented Mar 7, 2019

@tanduong Sentry is on the public internet, so yes, you need a NAT gateway if your lambda runs inside your VPC. Otherwise you would have to explore the self-hosted Sentry option.

@enapupe commented Jun 21, 2019

What's the flush(2000) actually doing? I had this code working mostly fine, but now that I have a couple of captureMessage calls happening concurrently, it's timing out every time!

@zeusdeux commented Jun 21, 2019

Flushing the internal queue of messages over the wire

@enapupe commented Jun 21, 2019

Ok, that makes total sense. I think my issue, then, is that this promise never resolves when there's nothing else to flush? Whenever I run my wrapped captureException fn concurrently, it times out my handler.

@enapupe commented Jun 21, 2019

export const captureMessage = async (
  message: string,
  extras?: any,
): Promise<boolean> =>
  new Promise((resolve) => {
    Sentry.withScope(async (scope) => {
      if (typeof extras !== 'undefined') {
        scope.setExtras(extras)
      }
      Sentry.captureMessage(message)
      await Sentry.flush(2000)
      resolve(true)
    })
  })

await Sentry.flush() doesn't resolve after the first captureMessage call.

@esetnik commented Jul 1, 2019

I have what I believe is a similar issue to @enapupe's. If you call await client.flush(2000); in parallel, only the first promise is resolved. This can happen in AWS Lambda contexts where the client is reused among multiple concurrent calls to the handler.

I am using code like this:

let client = Sentry.getCurrentHub().getClient();
if (client) {
  // flush the sentry client if it has any events to send
  log('begin flushing sentry client');
  try {
    await client.flush(2000);
  } catch (err) {
    console.error('sentry client flush error:', err);
  }
  log('end flushing sentry client');
}

But when I make two calls to my lambda function in rapid succession, I get:

  app begin flushing sentry client +2ms
  app begin flushing sentry client +0ms
  app end flushing sentry client +2ms

You can see that the second promise is never resolved.

@enapupe commented Jul 1, 2019

@esetnik I've filed an issue on that: #2131
My current workaround is a wrapper flush fn that always resolves (based on a timeout):

const resolveAfter = (ms: number) =>
  new Promise((resolve) => setTimeout(resolve, ms))

const flush = (timeout: number) =>
  Promise.race([resolveAfter(timeout), Sentry.flush(timeout)])
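Written as a generic helper, with a deliberately hanging flush standing in for the stuck SDK call, the race guarantees the wrapper settles within the timeout:

```javascript
// Generic version of the workaround above: race the real flush against a
// timer so the wrapper always settles within `timeout` ms, even when the
// underlying flush promise never resolves.
const resolveAfter = ms =>
  new Promise(resolve => setTimeout(resolve, ms));

const flushWithDeadline = (flush, timeout) =>
  Promise.race([resolveAfter(timeout), flush(timeout)]);
```

The trade-off is that a timed-out race cannot tell you whether the events actually went out; it only bounds how long the handler waits.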
@esetnik commented Jul 1, 2019

@enapupe I added a note about your workaround in #2131. I believe it will cause a performance regression on concurrent flush.

@SarasArya commented Aug 10, 2019

In case anybody is having any issues: this works beautifully.

@cibergarri commented Dec 10, 2019

@SarasArya @HazAT
First of all, thanks for sharing your solution! :)
The configureScope callback is, I take it, supposed to be called before captureException, but it is not executed in the same "thread".
Couldn't this lead to race conditions?

@SarasArya commented Dec 11, 2019

@cibergarri I don't think so; it looks synchronous to me. If you had an async method in there, then there would be race conditions.
Consider it like .map on arrays: the same thing is happening here, in case you have issues wrapping your head around it. I hope that helps.

@kamilogorek (Member) commented Dec 11, 2019

Yeah, it's totally fine to do that

@ajjindal commented Sep 21, 2020

Update: Sentry now supports automated error capture for Node/Lambda environments: https://docs.sentry.io/platforms/node/guides/aws-lambda/

@armando25723 commented Nov 9, 2020

I'm using @sentry/serverless like this:

const Sentry = require("@sentry/serverless");
Sentry.AWSLambda.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  environment: appEnv
});

exports.main = Sentry.AWSLambda.wrapHandler(async (event, context) => {
  try {
    // my code
  } catch (error) {
    Sentry.captureException(error);
    await Sentry.flush(3000);
  }
});

It does not work on Lambda.
In my testing environment it was working, but in prod, where there are many concurrent executions and containers are reused, it logs only about 10% of the total amount.

Any advice?

@marshall-lee (Contributor) commented Nov 10, 2020

@armando25723

Please tell us how you measured that it loses exception events. Do you have a code sample of how such a lost exception was thrown? We need more context.

@armando25723 commented Nov 10, 2020

const Sentry = require("@sentry/serverless"); // "version": "5.27.3"
Sentry.AWSLambda.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampleRate: 1.0,
  environment: appEnv
});
exports.main = Sentry.AWSLambda.wrapHandler(async (event, context) => {
  try {
    throw new Error('Test Error');
  } catch (error) {
    Sentry.captureException(error);
    await Sentry.flush(3000);
  }
});

What is happening: if the function is invoked several times with a short interval between invocations, the event is only logged once. If the interval between invocations is longer, all events are logged.

I assume the problem occurs when the invocation lands on a reused container.

I have also tried await Sentry.captureException(error);, await Sentry.flush();, and no flushing at all, with the same result.

@armando25723 commented Nov 11, 2020

@marshall-lee What do you recommend? Should I create an issue? I'm stuck here.

@ajjindal commented Nov 11, 2020

@armando25723 Looks like the server is responding with 429 (Too Many Requests) while sending these events. We throw that in over-quota/rate-limiting scenarios. Do you know if you are sending errors in rapid succession or are over quota? We can debug further if you think these are real error events getting dropped and you are under the 5k limit for the free tier.

@armando25723 commented Nov 11, 2020

@ajjindal All our other projects are working fine with Sentry. The organization slug is "alegra"; the project name is mail-dispatch-serverless under #mail-micros. We have been using Sentry for a long time, but this is our first time with serverless. This is not the free tier; I can't tell you exactly which plan we are using, but it is a paid one.
It would be great if you could help me debug further.
Thanks for the reply :)

PS: it's the Team Plan.
