Spark Check

Data Jobs Monitoring helps you observe, troubleshoot, and cost-optimize your Spark and Databricks jobs and clusters.

This page only documents how to ingest Spark metrics and logs.

[Image: Spark dashboard graph]

Overview

This check monitors Spark through the Datadog Agent. Collect Spark metrics for:

  • Drivers and executors: RDD blocks, memory used, disk used, duration, etc.
  • RDDs: partition count, memory used, and disk used.
  • Tasks: number of tasks active, skipped, failed, and total.
  • Job state: number of jobs active, completed, skipped, and failed.

Setup

Installation

The Spark check is included in the Datadog Agent package. No additional installation is needed on your Mesos master (for Spark on Mesos), YARN ResourceManager (for Spark on YARN), or Spark master (for Spark Standalone).

Configuration

Host

To configure this check for an Agent running on a host:

  1. Edit the spark.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory. The following parameters may require updating. See the sample spark.d/conf.yaml for all available configuration options.

    init_config:
    
    instances:
      - spark_url: http://localhost:8080 # Spark master web UI
        #   spark_url: http://<Mesos_master>:5050 # Mesos master web UI
        #   spark_url: http://<YARN_ResourceManager_address>:8088 # YARN ResourceManager address
    
        spark_cluster_mode: spark_yarn_mode # default
        #   spark_cluster_mode: spark_standalone_mode
        #   spark_cluster_mode: spark_mesos_mode
        #   spark_cluster_mode: spark_driver_mode
    
        # required; adds a tag 'cluster_name:<CLUSTER_NAME>' to all metrics
        cluster_name: "<CLUSTER_NAME>"
        # spark_pre_20_mode: true   # if you use Standalone Spark < v2.0
        # spark_proxy_enabled: true # if you have enabled the Spark UI proxy
  2. Restart the Agent.
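On a systemd-based Linux host, for example, restarting the Agent typically looks like the following; the exact command depends on your platform and installation method:

     sudo systemctl restart datadog-agent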

Containerized

For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.

Parameter            Value
<INTEGRATION_NAME>   spark
<INIT_CONFIG>        blank or {}
<INSTANCE_CONFIG>    {"spark_url": "%%host%%:8080", "cluster_name":"<CLUSTER_NAME>"}
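On Kubernetes, for example, these parameters can be applied as pod annotations. The following is a minimal sketch that mirrors the instance values from the table above; it assumes the Spark master container is named spark, and <SPARK_IMAGE> is a placeholder for your image:

     apiVersion: v1
     kind: Pod
     metadata:
       name: spark-master
       annotations:
         ad.datadoghq.com/spark.check_names: '["spark"]'
         ad.datadoghq.com/spark.init_configs: '[{}]'
         ad.datadoghq.com/spark.instances: '[{"spark_url": "%%host%%:8080", "cluster_name":"<CLUSTER_NAME>"}]'
     spec:
       containers:
         - name: spark # must match the container name used in the annotation keys
           image: <SPARK_IMAGE>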

Log collection

  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

     logs_enabled: true
  2. Uncomment and edit the logs configuration block in your spark.d/conf.yaml file. Change the type, path, and service parameter values based on your environment. See the sample spark.d/conf.yaml for all available configuration options.

     logs:
       - type: file
         path: <LOG_FILE_PATH>
         source: spark
         service: <SERVICE_NAME>
         # To handle multi-line logs that start with yyyy-mm-dd, use the following pattern
         # log_processing_rules:
         #   - type: multi_line
         #     pattern: \d{4}\-(0?[1-9]|1[012])\-(0?[1-9]|[12][0-9]|3[01])
         #     name: new_log_start_with_date
  3. Restart the Agent.

To enable logs for Docker environments, see Docker Log Collection.
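With label-based Autodiscovery, for example, log collection for a Spark container can be configured with a single Docker label; a sketch, with <SERVICE_NAME> as a placeholder:

     com.datadoghq.ad.logs: '[{"source": "spark", "service": "<SERVICE_NAME>"}]'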

Validation

Run the Agent's status subcommand and look for spark under the Checks section.
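For example, on Linux:

     sudo datadog-agent status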

Data Collected

Metrics

See metadata.csv for a list of metrics provided by this check.

Events

The Spark check does not include any events.

Service Checks

See service_checks.json for a list of service checks provided by this integration.

Troubleshooting

Spark on AWS EMR

To receive metrics for Spark on AWS EMR, use bootstrap actions to install the Datadog Agent:

For Agent v5, create the /etc/dd-agent/conf.d/spark.yaml configuration file with the proper values on each EMR node.

For Agent v6/7, create the /etc/datadog-agent/conf.d/spark.d/conf.yaml configuration file with the proper values on each EMR node.
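As a sketch, a bootstrap action for Agent v7 might look like the following. It assumes your Datadog API key is available to the script and that Spark runs on YARN (the EMR default, with the ResourceManager web UI on port 8088); adjust the values for your environment:

     #!/usr/bin/env bash
     # Install the Datadog Agent (v7).
     DD_API_KEY=<YOUR_API_KEY> bash -c "$(curl -L https://install.datadoghq.com/scripts/install_script_agent7.sh)"

     # Write the Spark check configuration on this node.
     sudo tee /etc/datadog-agent/conf.d/spark.d/conf.yaml > /dev/null <<'EOF'
     init_config:

     instances:
       - spark_url: http://localhost:8088
         spark_cluster_mode: spark_yarn_mode
         cluster_name: "<CLUSTER_NAME>"
     EOF

     # Restart the Agent to pick up the new configuration.
     sudo systemctl restart datadog-agent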

Successful check but no metrics are collected

The Spark integration collects metrics only for running applications. If there are no currently running applications, the check only submits a health check.

Further Reading

Additional helpful documentation, links, and articles: