Error Monitoring

Mathew Charles edited this page Oct 29, 2015 · 34 revisions

Overview

As summarized in the README, you can apply the ErrorTrigger attribute to functions so that they are invoked automatically when errors pass certain thresholds. For example, here's a function that will be called whenever 10 errors occur within a 30-minute sliding window (throttled to a maximum of 1 notification per hour):

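public static async Task ErrorMonitor(
    [ErrorTrigger("0:30:00", 10, Throttle = "1:00:00")] TraceFilter filter)
{
    // Send a text notification using IFTTT. filter.Message is inserted as-is,
    // so it must not contain characters that need JSON escaping.
    string body = string.Format("{{ \"value1\": \"{0}\" }}", filter.Message);
    var request = new HttpRequestMessage(HttpMethod.Post, WebNotificationUri)
    {
        Content = new StringContent(body, Encoding.UTF8, "application/json")
    };

    // Await the request so that exceptions are observed and the function
    // does not complete before the request does.
    await HttpClient.SendAsync(request);
}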

That function makes a POST request to IFTTT to trigger a recipe that will send a text message alert to a mobile phone. The notification sections below walk through the details of setting that up end to end, as well as showing a few ways to send email notifications.
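To make the threshold and throttle semantics of the binding above more concrete, here is a minimal sketch in plain C#. It does not use the WebJobs SDK: SlidingWindowAlert is a hypothetical helper, and the choice to reset the count after the threshold is reached is an assumption made for illustration, not a description of how the SDK implements it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the WebJobs SDK. It only models the rule
// "N errors within a sliding window, at most one alert per throttle period".
public sealed class SlidingWindowAlert
{
    private readonly TimeSpan _window;
    private readonly int _threshold;
    private readonly TimeSpan _throttle;
    private readonly Queue<DateTime> _errors = new Queue<DateTime>();
    private DateTime? _lastAlert;

    public SlidingWindowAlert(TimeSpan window, int threshold, TimeSpan throttle)
    {
        _window = window;
        _threshold = threshold;
        _throttle = throttle;
    }

    // Records one error and returns true when an alert should be sent.
    public bool RecordError(DateTime timestamp)
    {
        _errors.Enqueue(timestamp);

        // Drop errors that have slid out of the window.
        while (_errors.Count > 0 && timestamp - _errors.Peek() > _window)
        {
            _errors.Dequeue();
        }

        if (_errors.Count < _threshold)
        {
            return false;
        }

        // Threshold reached: reset the count (an assumption), then throttle.
        _errors.Clear();
        if (_lastAlert.HasValue && timestamp - _lastAlert.Value < _throttle)
        {
            return false;
        }

        _lastAlert = timestamp;
        return true;
    }
}
```

With a window of 30 minutes, a threshold of 10 and a throttle of 1 hour, this mirrors the attribute arguments "0:30:00", 10 and Throttle = "1:00:00": a burst of errors raises one alert, and further bursts within the next hour are suppressed.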

In addition to the sliding window binding above, ErrorTrigger also lets you specify a custom TraceFilter type (e.g. [ErrorTrigger(typeof(MyFilter))]), so you can use your own filtering logic to trigger notifications. ErrorTrigger supports binding to the following parameter types:

  • TraceFilter - to the TraceFilter that triggered the notification. You have access to all the TraceEvents and other details.
  • IEnumerable<TraceEvent> - the collection of TraceEvents that triggered the notification
  • TraceEvent - the last TraceEvent that occurred (i.e. the most recent)

IFTTT Event Notifications

You can use the IFTTT Maker Channel to receive external events via HTTP POST. That's what the function above does: when the error trigger fires, it sends a simple POST request to IFTTT, to an event trigger that was previously configured. The beauty of this is that once you set the event up, you can use IFTTT to configure the action to be whatever you want, choosing from their vast array of channels. For example, you can send a text message or an email, or make a phone call. We'll now walk through the simple steps required to set up an SMS text alert.

  1. In IFTTT create a New Recipe
  2. For the "This" trigger channel, choose "Maker"
  3. Configure the Maker trigger by selecting "Receive a web request"
  4. Choose an appropriate event name, e.g. "WebJobErrors"
  5. Configure the "That" action by choosing "SMS" and entering the mobile number to message
  6. Set the message body to the template "WebJob Error Notification {{Value1}}". The "Value1" ingredient comes from the format IFTTT dictates for HTTP request triggers. We'll send a JSON property "value1" in our payload, and IFTTT will fill in the template.

Let's test the event. If you go to the IFTTT Maker Channel and select "How to trigger events", it will take you to a test page where you can enter your event info in the browser and trigger the event. Let's do this now:

  1. In the browser, click the {event} field and change it to the event name chosen above, WebJobErrors.
  2. In the JSON body edit box for value1, enter a test value such as "The back-end is on fire!"
  3. Hit the "Test it" button. You should receive a text message within a few seconds :)
  4. Copy the full trigger URL for the event you just tested, since we'll need to add it as an app setting to your WebJob. The URL looks like this: https://maker.ifttt.com/trigger/{event}/with/key/{key}

That's it. Now that IFTTT is configured, add the IFTTT event URL as an app setting using whatever key you want. In your WebJob code load it however you wish, and use it for the "WebNotificationUri" value in the sample error function above.
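As a hedged illustration of that last step, the sketch below loads the trigger URL and builds the request body. The names are assumptions, not part of the original sample: IftttNotification is a hypothetical helper, "WebNotificationUri" is just one possible setting key, and the code assumes the app setting is visible as an environment variable (Azure App Service exposes app settings that way) and that System.Text.Json is available. Using a JSON serializer rather than string.Format means messages containing quotes or backslashes are escaped correctly.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of the WebJobs SDK or the original sample.
public static class IftttNotification
{
    // Reads the IFTTT trigger URL from an app setting exposed as an
    // environment variable. The default key name is an assumption.
    public static Uri LoadTriggerUri(string settingName = "WebNotificationUri")
    {
        var value = Environment.GetEnvironmentVariable(settingName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                "App setting '" + settingName + "' is not set.");
        }

        return new Uri(value, UriKind.Absolute);
    }

    // Builds the JSON body with the "value1" property that IFTTT maps to
    // the Value1 ingredient of the recipe.
    public static string BuildBody(string message)
    {
        var payload = new Dictionary<string, string> { { "value1", message } };
        return JsonSerializer.Serialize(payload);
    }
}
```

The results can then be used in place of the WebNotificationUri value and the hand-formatted body string in the error function shown earlier.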

Email Notifications

You can use whatever mail client you want to send an email notification. One simple way is to use the SendGrid extension binding that is part of this library. An example use of SendGrid can be seen in the extensions sample. You can also use the SendGrid binder as part of your error handler function to simplify sending an email. Here's an example that sends a detailed alert email:

public static void ErrorMonitor(
    [ErrorTrigger("0:30:00", 10, Throttle = "1:00:00")] TraceFilter filter,
    [SendGrid] SendGridMessage message)
{
    message.Subject = "WebJobs Error Alert";
    message.Text = filter.GetDetailedMessage(5);
}

The SendGrid binding will send the email when the error handler completes. This example sends the email directly to SendGrid without involving any "middle-man". You could also use IFTTT as detailed above, which would allow you to use any IFTTT-supported mail service. For example, if you prefer to use MailChimp or some other service that doesn't have first-class WebJobs SDK support, you can go that route.

TraceMonitor

The mechanism that ErrorTrigger uses behind the scenes can also be used directly if you want more control or would like to set things up yourself. Each ErrorTrigger binding creates a TraceMonitor configured with the options specified by the attribute, and adds it to the JobHostConfiguration.Tracing.Tracers TraceWriter collection. You can do the same thing manually as part of your JobHost startup code:

config.UseCore();

var traceMonitor = new TraceMonitor()
    .Filter(new SlidingWindowTraceFilter(TimeSpan.FromMinutes(5), 3))
    .Filter(p =>
    {
        var fex = p.Exception as FunctionInvocationException;
        return fex != null &&
               fex.MethodName == "ExtensionsSample.FileSamples.ImportFile";
    }, "ImportFile Job Failed")
    .Subscribe(WebNotify, EmailNotify)
    .Subscribe(p =>
    {
        // error handler code here
    })
    .Throttle(TimeSpan.FromMinutes(30));

config.Tracing.Tracers.Add(traceMonitor);

As you can see, TraceMonitors are created by chaining together one or more Filters and Subscribers via a fluent interface. A TraceMonitor inherits from TraceWriter. When TraceMonitors are added to the Tracers collection, the JobHost routes trace events through them, giving them a chance to inspect, filter and act upon events. TraceFilters are responsible for inspecting events and aggregating them as needed. They trigger a notification when their condition is met (e.g. sliding window error count, function name match, etc.). Subscribers are simply actions that take a TraceFilter instance and perform whatever action they need, e.g. sending alert notifications.
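The filter and subscriber pipeline described above can be modeled in a few lines of plain C#. This is only a sketch of the pattern, not the SDK's implementation: MiniEvent and MiniMonitor are hypothetical types, subscribers here receive the matching filter's message rather than a TraceFilter instance, and the rule that each filter is evaluated independently is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a trace event; not the SDK's TraceEvent.
public sealed class MiniEvent
{
    public string Message { get; set; }
    public Exception Exception { get; set; }
}

// Hypothetical miniature of a fluent monitor: filters decide whether an
// event warrants a notification, subscribers act on the notification.
public sealed class MiniMonitor
{
    private readonly List<(string Message, Func<MiniEvent, bool> Predicate)> _filters =
        new List<(string Message, Func<MiniEvent, bool> Predicate)>();
    private readonly List<Action<string>> _subscribers = new List<Action<string>>();

    public MiniMonitor Filter(Func<MiniEvent, bool> predicate, string message)
    {
        _filters.Add((message, predicate));
        return this; // returning the monitor is what enables chaining
    }

    public MiniMonitor Subscribe(params Action<string>[] subscribers)
    {
        _subscribers.AddRange(subscribers);
        return this;
    }

    // Routes one event through every filter and notifies all subscribers
    // once for each filter that matches.
    public void Trace(MiniEvent traceEvent)
    {
        foreach (var filter in _filters)
        {
            if (!filter.Predicate(traceEvent))
            {
                continue;
            }

            foreach (var subscriber in _subscribers)
            {
                subscriber(filter.Message);
            }
        }
    }
}
```

Because Filter and Subscribe both return the monitor itself, calls can be chained in the same style as the TraceMonitor example above, for instance new MiniMonitor().Filter(e => e.Exception is TimeoutException, "Timeout detected").Subscribe(Console.WriteLine).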

TraceMonitors (and ErrorTrigger functions) will receive events as long as the JobHost is running. Errors occurring before the JobHost is up and running will not be processed (but are still logged to the standard error log). For example, errors occurring outside of the WebJobs SDK (e.g. if the Continuous WebJob host fails to start) happen before the SDK is even running, so they are not visible to TraceMonitors. TraceMonitor/ErrorTrigger monitoring is designed to handle steady-state errors that occur while the WebJob is up and running. When a particular job function fails, the errors are logged but the host continues running. It's these errors that TraceMonitor/ErrorTrigger provide insight into.

In addition to handling general unhandled errors, you can also use the TraceWriter binding to log your own TraceEvents/Errors as needed to trigger alerts. Consider the following function:

public static void ProcessMessage(
    [QueueTrigger("messages")] string body,
    TraceWriter trace)
{
    . . .

    TraceEvent error = new TraceEvent(TraceLevel.Error, "Error detected");
    error.Properties.Add("ErrorInfo", "Some Custom Error Info");
    trace.Trace(error);
}

Any errors written to the TraceWriter will flow through all registered TraceMonitors (including those created by ErrorTrigger bindings). You can use this to trigger alerts as needed. A TraceEvent can also include additional custom properties (TraceEvent.Properties), which can be inspected by TraceFilters to do conditional aggregation/alerting.
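As one possible reading of "conditional aggregation", the sketch below counts events per value of a custom property and signals once the same value has been seen a given number of times. It is written in plain C# against a property dictionary, so it runs without the SDK. PropertyCountFilter is a hypothetical class, and wiring this logic into a real TraceFilter is left out because it depends on the SDK's base class.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical filter logic, not part of the WebJobs SDK. It aggregates
// events by the value of one custom property.
public sealed class PropertyCountFilter
{
    private readonly string _propertyName;
    private readonly int _threshold;
    private readonly Dictionary<string, int> _counts = new Dictionary<string, int>();

    public PropertyCountFilter(string propertyName, int threshold)
    {
        _propertyName = propertyName;
        _threshold = threshold;
    }

    // Returns true when the same property value has been seen
    // `threshold` times; the count for that value is then reset.
    public bool Filter(IDictionary<string, object> properties)
    {
        object raw;
        if (!properties.TryGetValue(_propertyName, out raw) || raw == null)
        {
            return false; // events without the property are ignored
        }

        string key = raw.ToString();
        int count;
        _counts.TryGetValue(key, out count);
        count++;

        if (count < _threshold)
        {
            _counts[key] = count;
            return false;
        }

        _counts.Remove(key);
        return true;
    }
}
```

With the ProcessMessage function above, a filter of this kind keyed on "ErrorInfo" would alert only when the same error info value recurs, rather than on every custom error.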