
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning" style="width: 600px; height: 163px">
</div>

# Alerting

Alerting allows you to announce the progress of different applications, which becomes increasingly important in automated production systems.  In this lesson, you explore basic alerting strategies using email and REST integration with tools like Slack.

## ![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) In this lesson you:<br>
 - Explore the alerting landscape
 - Walk through basic email alerting using Databricks Jobs
 - Create a basic REST alert integrated with Slack
 - Create a more complex REST alert for Spark jobs using `SparkListener`

### The Alerting Landscape

Alerting tools are available at various levels of sophistication:<br><br>

* PagerDuty has risen to be one of the most popular tools for monitoring production outages
  - It allows for the escalation of issues across a team with alerts including text messages and phone calls
* Slack
* Twilio   
* Email alerts

Most alerting frameworks allow custom alerting through REST integration.

One additional helpful tool for Spark workloads is the `SparkListener`, which can run custom logic in response to cluster events such as jobs, stages, and tasks starting or ending.

Run the following cell to set up our environment.

In [5]:
%run "./Includes/Classroom-Setup"

### Setting Basic Alerts

Create a basic alert using a Slack endpoint.

Define a Slack webhook.  This has been done for you.

<img alt="Side Note" title="Side Note" style="vertical-align: text-bottom; position: relative; height:1.75em; top:0.05em; transform:rotate(15deg)" src="https://files.training.databricks.com/static/images/icon-note.webp"/> Define your own Slack webhook <a href="https://api.slack.com/incoming-webhooks#getting-started" target="_blank">Using these 4 steps.</a><br>
<img alt="Side Note" title="Side Note" style="vertical-align: text-bottom; position: relative; height:1.75em; top:0.05em; transform:rotate(15deg)" src="https://files.training.databricks.com/static/images/icon-note.webp"/> This same approach applies to PagerDuty as well.
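As a rough illustration of how the same REST approach could carry over to PagerDuty, the sketch below builds the JSON body for a trigger event. It assumes PagerDuty's Events API v2 and its `routing_key`, `event_action`, and `payload` fields; the helper name `build_pagerduty_event` and the sample values are hypothetical and not part of this lesson. Nothing is sent over the network here.

```python
import json

# Assumption: PagerDuty's Events API v2, which accepts JSON events via HTTP POST.
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

def build_pagerduty_event(routing_key, summary, source, severity="error"):
  """Return the JSON body for a 'trigger' event (hypothetical helper)."""
  allowed = {"critical", "error", "warning", "info"}
  if severity not in allowed:
    raise ValueError("severity must be one of {}".format(sorted(allowed)))
  event = {
    "routing_key": routing_key,  # integration key of the PagerDuty service (placeholder below)
    "event_action": "trigger",
    "payload": {"summary": summary, "source": source, "severity": severity},
  }
  return json.dumps(event)

# Build, but do not send, an example event
body = build_pagerduty_event("YOUR_ROUTING_KEY", "Model retraining failed", "databricks-job")
print(body)
```

The resulting string could then be posted to `PAGERDUTY_EVENTS_URL` in the same way the Slack payload is posted to its webhook.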

In [8]:
webhookMLProductionAPIDemo = ""

Send a test message and check Slack.

In [10]:
def postToSlack(webhook, content):
  import json
  import requests

  # json.dumps escapes quotes and newlines in the message, which plain string templating does not
  payload = json.dumps({"text": content})

  response = requests.post(webhook, data=payload, headers={'Content-Type': 'application/json'})
  # Surface HTTP errors instead of failing silently
  response.raise_for_status()

postToSlack(webhookMLProductionAPIDemo, "This is my post from Python")
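In an automated pipeline it can help to know whether an alert was actually delivered, without letting a failed alert crash the job. The following is a minimal sketch of that idea using only the standard library. The names `http_post_json`, `send_alert`, and the `sender` parameter are hypothetical and not part of this lesson, and treating status 200 as success is an assumption about how the webhook responds.

```python
import json
import urllib.request

def http_post_json(url, body, timeout=5):
  """POST a JSON string and return the HTTP status code (stdlib only)."""
  request = urllib.request.Request(
    url,
    data=body.encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
  )
  with urllib.request.urlopen(request, timeout=timeout) as response:
    return response.status

def send_alert(webhook, content, sender=http_post_json):
  """Return True if the webhook accepted the message, False otherwise."""
  body = json.dumps({"text": content})  # escapes quotes and newlines
  try:
    return sender(webhook, body) == 200
  except Exception as error:  # network errors, bad URLs, HTTP error statuses
    print("Alert not delivered: {}".format(error))
    return False

# Usage with a stand-in sender, so nothing is sent over the network
sent = []
def fake_sender(url, body):
  sent.append((url, body))
  return 200

print(send_alert("https://example.invalid/webhook", 'Quote " and newline \n are escaped', fake_sender))
```

Passing the sender as a parameter keeps the delivery logic testable without a live webhook. In the notebook, calling `send_alert` with only the webhook and message would use the real HTTP sender.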

Do the same thing using Scala.  This involves a bit more boilerplate and a different library.

In [12]:
%scala

def postToSlack(webhook:String, content:String):Unit = {
  import org.apache.http.entity._
  import org.apache.http.impl.client.{HttpClients}
  import org.apache.http.client.methods.HttpPost

  val client = HttpClients.createDefault()
  val httpPost = new HttpPost(webhook)
  
  val payload = s"""{"text": "${content}"}"""

  val entity = new StringEntity(payload)
  httpPost.setEntity(entity)
  httpPost.setHeader("Accept", "application/json")
  httpPost.setHeader("Content-type", "application/json")

  val response = client.execute(httpPost)
  client.close()
}

val webhook = "https://hooks.slack.com/services/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"

// Use the Scala value defined above; variables from Python cells are not visible in Scala
postToSlack(webhook, "This is my post from Scala")

Now you can easily integrate custom logic back to Slack.

In [14]:
mse = .45

postToSlack(webhookMLProductionAPIDemo, "The newly trained model MSE is now {}".format(mse))
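Custom logic often means alerting only when something needs attention. The sketch below shows one way this might look: a message is produced only when the MSE crosses a threshold. The function name `mse_alert_message` and the threshold of 0.5 are illustrative assumptions, not values from this lesson.

```python
def mse_alert_message(mse, threshold=0.5):
  """Return an alert message when MSE exceeds the threshold, else None."""
  if mse > threshold:
    return "Model MSE {:.2f} exceeds threshold {:.2f}".format(mse, threshold)
  return None

for candidate_mse in [0.45, 0.62]:
  message = mse_alert_message(candidate_mse)
  if message is not None:
    # In the notebook this could be postToSlack(webhookMLProductionAPIDemo, message)
    print(message)
```

Separating the decision from the delivery keeps noisy, non-actionable messages out of the channel.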

### Using a `SparkListener`

A custom `SparkListener` lets you run your own logic in response to cluster activity.  **This API is exposed on the JVM, so the listener itself must be written in Scala (or Java) rather than Python.**  Take a look at the following code.

<img alt="Side Note" title="Side Note" style="vertical-align: text-bottom; position: relative; height:1.75em; top:0.05em; transform:rotate(15deg)" src="https://files.training.databricks.com/static/images/icon-note.webp"/> <a href="http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.scheduler.SparkListener" target="_blank">See the `SparkListener` docs here.</a>

In [16]:
%scala
// Package in a notebook helps to ensure a proper singleton
package com.databricks.academy

object SlackNotifyingListener extends org.apache.spark.scheduler.SparkListener {
  import org.apache.spark.scheduler._

  val webhook = "https://hooks.slack.com/services/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"
  
  def postToSlack(message:String):Unit = {
    import org.apache.http.entity._
    import org.apache.http.impl.client.{HttpClients}
    import org.apache.http.client.methods.HttpPost

    val client = HttpClients.createDefault()
    val httpPost = new HttpPost(webhook)

    val content = """{ "text": "%s" }""".format(message)
    
    val entity = new StringEntity(content)
    httpPost.setEntity(entity)
    httpPost.setHeader("Accept", "application/json")
    httpPost.setHeader("Content-type", "application/json")

    val response = client.execute(httpPost)
    client.close()
  }
  
  override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit = {
    postToSlack("Called when the application ends")
  }

  override def onApplicationStart(applicationStart: SparkListenerApplicationStart): Unit = {
    postToSlack("Called when the application starts")
  }

  override def onBlockManagerAdded(blockManagerAdded: SparkListenerBlockManagerAdded): Unit = {
    postToSlack("Called when a new block manager has joined")
  }

  override def onBlockManagerRemoved(blockManagerRemoved: SparkListenerBlockManagerRemoved): Unit = {
    postToSlack("Called when an existing block manager has been removed")
  }

  override def onBlockUpdated(blockUpdated: SparkListenerBlockUpdated): Unit = {
    postToSlack("Called when the driver receives a block update info.")
  }

  override def onEnvironmentUpdate(environmentUpdate: SparkListenerEnvironmentUpdate): Unit = {
    postToSlack("Called when environment properties have been updated")
  }

  override def onExecutorAdded(executorAdded: SparkListenerExecutorAdded): Unit = {
    postToSlack("Called when the driver registers a new executor.")
  }

  override def onExecutorBlacklisted(executorBlacklisted: SparkListenerExecutorBlacklisted): Unit = {
    postToSlack("Called when the driver blacklists an executor for a Spark application.")
  }

  override def onExecutorBlacklistedForStage(executorBlacklistedForStage: SparkListenerExecutorBlacklistedForStage): Unit = {
    postToSlack("Called when the driver blacklists an executor for a stage.")
  }

  override def onExecutorMetricsUpdate(executorMetricsUpdate: SparkListenerExecutorMetricsUpdate): Unit = {
    // This one is a bit on the noisy side so I'm pre-emptively killing it
    // postToSlack("Called when the driver receives task metrics from an executor in a heartbeat.")
  }

  override def onExecutorRemoved(executorRemoved: SparkListenerExecutorRemoved): Unit = {
    postToSlack("Called when the driver removes an executor.")
  }

  override def onExecutorUnblacklisted(executorUnblacklisted: SparkListenerExecutorUnblacklisted): Unit = {
    postToSlack("Called when the driver re-enables a previously blacklisted executor.")
  }

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
    postToSlack("Called when a job ends")
  }

  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    postToSlack("Called when a job starts")
  }

  override def onNodeBlacklisted(nodeBlacklisted: SparkListenerNodeBlacklisted): Unit = {
    postToSlack("Called when the driver blacklists a node for a Spark application.")
  }

  override def onNodeBlacklistedForStage(nodeBlacklistedForStage: SparkListenerNodeBlacklistedForStage): Unit = {
    postToSlack("Called when the driver blacklists a node for a stage.")
  }

  override def onNodeUnblacklisted(nodeUnblacklisted: SparkListenerNodeUnblacklisted): Unit = {
    postToSlack("Called when the driver re-enables a previously blacklisted node.")
  }

  override def onOtherEvent(event: SparkListenerEvent): Unit = {
    postToSlack("Called when other events like SQL-specific events are posted.")
  }

  override def onSpeculativeTaskSubmitted(speculativeTask: SparkListenerSpeculativeTaskSubmitted): Unit = {
    postToSlack("Called when a speculative task is submitted")
  }

  override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = {
    postToSlack("Called when a stage completes successfully or fails, with information on the completed stage.")
  }

  override def onStageSubmitted(stageSubmitted: SparkListenerStageSubmitted): Unit = {
    postToSlack("Called when a stage is submitted")
  }

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    postToSlack("Called when a task ends")
  }

  override def onTaskGettingResult(taskGettingResult: SparkListenerTaskGettingResult): Unit = {
    postToSlack("Called when a task begins remotely fetching its result (will not be called for tasks that do not need to fetch the result remotely).")
  }

  override def onTaskStart(taskStart: SparkListenerTaskStart): Unit = {
    postToSlack("Called when a task starts")
  }

  override def onUnpersistRDD(unpersistRDD: SparkListenerUnpersistRDD): Unit = {
    postToSlack("Called when an RDD is manually unpersisted by the application")
  }
}

Register this singleton as a `SparkListener`.

In [18]:
%scala
sc.addSparkListener(com.databricks.academy.SlackNotifyingListener)

Now run a basic DataFrame operation and observe the results in Slack.

In [20]:
%scala
// Parquet files store their own schema, so CSV options such as header and inferSchema are not needed
spark.read
  .parquet("/mnt/training/airbnb/sf-listings/airbnb-cleaned-mlflow.parquet")
  .count

The listener also fires for operations run from Python, since they execute on the same `SparkContext`.

In [22]:
# Parquet files store their own schema, so CSV options such as header and inferSchema are not needed
(spark.read
  .parquet("/mnt/training/airbnb/sf-listings/airbnb-cleaned-mlflow.parquet")
  .count()
)

When you're done, remove the listener by calling `sc.removeSparkListener(com.databricks.academy.SlackNotifyingListener)` in a Scala cell.

## Review
**Question:** What are the most common alerting tools?  
**Answer:** PagerDuty tends to be the tool most used in production environments.  SMTP servers emailing alerts are also popular, as is Twilio for text message alerts.  Slack webhooks and bots can easily be written as well.

**Question:** How can I write custom logic to monitor Spark?  
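**Answer:** Use the `SparkListener` API, which is exposed on the JVM (Scala or Java) rather than in Python.  It allows you to write custom logic that runs in response to cluster activity such as jobs, stages, and tasks starting or ending.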

## Next Steps

Start the next lesson, [Delta Time Travel]($./11-Delta-Time-Travel ).

## Additional Topics & Resources

**Q:** Where can I find the alerting tools mentioned in this lesson?  
**A:** Check out <a href="https://www.twilio.com" target="_blank">Twilio</a> and <a href="https://www.pagerduty.com" target="_blank">PagerDuty</a>.

&copy; 2019 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the <a href="http://www.apache.org/">Apache Software Foundation</a>.<br/>
<br/>
<a href="https://databricks.com/privacy-policy">Privacy Policy</a> | <a href="https://databricks.com/terms-of-use">Terms of Use</a> | <a href="http://help.databricks.com/">Support</a>