Monitor Apache Spark with Prometheus the alternative way

We have externalized the sink into a separate library, so you can use it either by building it yourself or by taking it from our Maven repository.

Prometheus sink

PrometheusSink is a Spark metrics sink that publishes Spark metrics to Prometheus.

Prerequisites

Prometheus uses a pull model over HTTP to scrape data from applications. For batch jobs it also supports a push model, which we need here because Spark pushes metrics to its sinks. To enable this in Prometheus, a special component called pushgateway needs to be running.
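
Before wiring up Spark, it can help to confirm that a pushgateway is reachable. The sketch below is a smoke test under stated assumptions, not part of PrometheusSink: it assumes a pushgateway listening on 127.0.0.1:9091 (for example one started from the prom/pushgateway Docker image) and that curl is installed. The function names and the job and metric names are hypothetical.

```shell
# Assumed pushgateway location; these mirror the sink defaults listed in this document.
PUSHGATEWAY_PROTOCOL="http"
PUSHGATEWAY_ADDRESS="127.0.0.1:9091"

# Print the push URL for a job name (/metrics/job/<name> is the pushgateway push endpoint).
pushgateway_job_url() {
  printf '%s://%s/metrics/job/%s\n' "$PUSHGATEWAY_PROTOCOL" "$PUSHGATEWAY_ADDRESS" "$1"
}

# Push a single sample in the Prometheus text format: <metric name> <value>.
push_test_metric() {
  job=$1
  metric=$2
  value=$3
  printf '%s %s\n' "$metric" "$value" |
    curl --silent --show-error --fail --data-binary @- "$(pushgateway_job_url "$job")"
}

# Example session (requires Docker and curl; all names are illustrative):
#   docker run -d -p 9091:9091 prom/pushgateway
#   push_test_metric smoke_test demo_metric 1
#   curl --silent "http://127.0.0.1:9091/metrics" | grep demo_metric
```

Nothing is sent until push_test_metric is called, so the definitions can be sourced safely on a machine without a running pushgateway.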

How to enable PrometheusSink in Spark

Spark publishes metrics to Sinks listed in the metrics configuration file. The location of the metrics configuration file can be specified for spark-submit as follows:

--conf spark.metrics.conf=<path_to_the_metrics_properties_file>

Add the following lines to the metrics configuration file:

# Enable Prometheus for all instances by class name
*.sink.prometheus.class=com.banzaicloud.spark.metrics.sink.PrometheusSink
# Prometheus pushgateway address
# Pushgateway URL scheme; defaults to http
*.sink.prometheus.pushgateway-address-protocol=<prometheus pushgateway protocol>
# Pushgateway host and port; defaults to 127.0.0.1:9091
*.sink.prometheus.pushgateway-address=<prometheus pushgateway address>
# Reporting period; defaults to 10
*.sink.prometheus.period=<period>
# Time unit of the period; defaults to seconds (TimeUnit.SECONDS)
*.sink.prometheus.unit=<unit>
# Send metric timestamps to pushgateway; defaults to false
*.sink.prometheus.pushgateway-enable-timestamp=<enable/disable metrics timestamp>
# Metrics name processing (version 2.3-1.1.0 +)
*.sink.prometheus.metrics-name-capture-regex=<regular expression capturing the metric name sections to be replaced>
*.sink.prometheus.metrics-name-replacement=<replacement for the captured sections>
*.sink.prometheus.labels=<labels in label=value format separated by comma>
# Support for JMX Collector (version 2.3-2.0.0 +)
*.sink.prometheus.enable-dropwizard-collector=false
*.sink.prometheus.enable-jmx-collector=true
*.sink.prometheus.jmx-collector-config=/opt/spark/conf/monitoring/jmxCollector.yaml
# Enable JVM metrics source for all instances by class name
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
*.source.jvm.class=org.apache.spark.metrics.source.JvmSource

  • pushgateway-address-protocol - the scheme of the URL where the pushgateway service is available
  • pushgateway-address - the host and port of the URL where the pushgateway service is available
  • period - controls how often metrics are sent to pushgateway
  • unit - the time unit of the period
  • pushgateway-enable-timestamp - controls whether to send the timestamp of the metrics sent to pushgateway. This is disabled by default as not all versions of pushgateway support timestamps for metrics.
  • metrics-name-capture-regex - if provided, this regexp is applied to each metric name before it is sent to Prometheus. The captured sections of the metric name (regexp groups) are replaced with the value passed in metrics-name-replacement, e.g. (.*driver_)(.+). Supported only in version 2.3-1.1.0 and above.
  • metrics-name-replacement - the replacement for the captured sections (regexp groups) of the metric name, e.g. ${2}. Supported only in version 2.3-1.1.0 and above.
  • labels - the list of labels to be passed to Prometheus with each metric in addition to the default ones. This must be specified in the format label=value, separated by commas. Supported only in version 2.3-1.1.0 and above.
  • enable-dropwizard-collector - from version 2.3-2.0.0 you can enable/disable the dropwizard collector
  • enable-jmx-collector - from version 2.3-2.0.0 you can enable/disable the JMX collector, which collects the configured metrics from JMX
  • jmx-collector-config - the location of the JMX collector config file
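
To get a feel for what the metrics-name-capture-regex and metrics-name-replacement pair from the examples above is meant to achieve, the sketch below applies the same capture regex with sed. This only illustrates the intended renaming: the sink applies the regex inside the JVM, and sed writes the second group as \2 where the sink configuration writes ${2}. The raw metric name and the function name are hypothetical.

```shell
# Hypothetical raw metric name; real names depend on the application and metrics namespace.
raw_name="app_20180101_0001_driver_DAGScheduler_job_activeJobs"

# Replace the whole match of (.*driver_)(.+) with the second captured group.
rename_metric() {
  printf '%s\n' "$1" | sed -E 's/(.*driver_)(.+)/\2/'
}

rename_metric "$raw_name"
```

With sed, the sample name above is reduced to DAGScheduler_job_activeJobs, and a name that does not contain driver_ passes through unchanged.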

Example JMX collector configuration file:

    lowercaseOutputName: false
    lowercaseOutputLabelNames: false
    whitelistObjectNames: ["*:*"]

You can find a more detailed description of this configuration file here.

spark-submit needs to know the repository from which to download the jar containing PrometheusSink:

--repositories https://raw.github.com/banzaicloud/spark-metrics/master/maven-repo/releases

Note: this is a Maven repo hosted on GitHub.

We also have to tell spark-submit which spark-metrics package includes PrometheusSink, along with the packages it depends on:

  • Spark 2.2:
--packages com.banzaicloud:spark-metrics_2.11:2.2.1-1.0.0,io.prometheus:simpleclient:0.0.23,io.prometheus:simpleclient_dropwizard:0.0.23,io.prometheus:simpleclient_pushgateway:0.0.23,io.dropwizard.metrics:metrics-core:3.1.2
  • Spark 2.3:
--packages com.banzaicloud:spark-metrics_2.11:2.3-2.0.1,io.prometheus:simpleclient:0.3.0,io.prometheus:simpleclient_dropwizard:0.3.0,io.prometheus:simpleclient_pushgateway:0.3.0,io.dropwizard.metrics:metrics-core:3.1.2
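
Putting the pieces together for Spark 2.3, the options above can be combined into a single spark-submit invocation. The sketch below only assembles and prints the command (a dry run), so it can be inspected before use. The metrics file path, application jar, main class and function name are hypothetical placeholders.

```shell
# Hypothetical location of the metrics configuration file.
METRICS_CONF="/opt/spark/conf/metrics.properties"
# Repository and package list exactly as given in this document for Spark 2.3.
REPO="https://raw.github.com/banzaicloud/spark-metrics/master/maven-repo/releases"
PACKAGES="com.banzaicloud:spark-metrics_2.11:2.3-2.0.1,io.prometheus:simpleclient:0.3.0,io.prometheus:simpleclient_dropwizard:0.3.0,io.prometheus:simpleclient_pushgateway:0.3.0,io.dropwizard.metrics:metrics-core:3.1.2"

# Print the spark-submit command line. $1: application jar, $2: main class.
build_submit_command() {
  printf 'spark-submit --class %s --conf spark.metrics.conf=%s --repositories %s --packages %s %s\n' \
    "$2" "$METRICS_CONF" "$REPO" "$PACKAGES" "$1"
}

build_submit_command /path/to/app.jar com.example.MyApp
```

Printing rather than executing keeps the sketch safe to run on a machine without Spark; the printed line can be copied into a terminal once the placeholders are replaced.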

Note: the --packages option is currently not supported by Spark 2.3 when running on Kubernetes. Behind the scenes, --packages downloads the files from the Maven repo to the local machine, then uploads them to the cluster using the Local File Dependency Management feature. This feature has not been backported from the Spark 2.2 on Kubernetes fork to Spark 2.3. See details here: Spark 2.3 Future work. This can be worked around by:

  1. building Spark 2.3 with Kubernetes support from source ourselves
  2. downloading and adding the following jars to assembly/target/scala-2.11/jars:
    1. com.banzaicloud:spark-metrics_2.11:2.3-2.0.1
    2. io.prometheus:simpleclient:0.3.0
    3. io.prometheus:simpleclient_common:0.3.0
    4. io.prometheus:simpleclient_dropwizard:0.3.0
    5. io.prometheus:simpleclient_pushgateway:0.3.0
    6. io.dropwizard.metrics:metrics-core:3.1.2
  3. building Spark docker images for Kubernetes
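
The download step of this workaround can be sketched as follows. The sketch turns each Maven coordinate from the list above into a jar URL using the standard Maven repository layout. It assumes the io.prometheus and io.dropwizard artifacts are fetched from Maven Central (the document does not say where they come from), and the function names are hypothetical.

```shell
# The first URL comes from this document; Maven Central is an assumption
# for the io.prometheus and io.dropwizard artifacts.
BANZAI_REPO="https://raw.github.com/banzaicloud/spark-metrics/master/maven-repo/releases"
MAVEN_CENTRAL="https://repo1.maven.org/maven2"

# Map group:artifact:version to the jar path of the standard Maven repository layout.
jar_path() {
  group=${1%%:*}
  rest=${1#*:}
  artifact=${rest%%:*}
  version=${rest#*:}
  printf '%s/%s/%s/%s-%s.jar\n' \
    "$(printf '%s' "$group" | tr '.' '/')" "$artifact" "$version" "$artifact" "$version"
}

# Print one download URL per jar from the list above.
list_jar_urls() {
  printf '%s/%s\n' "$BANZAI_REPO" "$(jar_path com.banzaicloud:spark-metrics_2.11:2.3-2.0.1)"
  for coordinate in \
    io.prometheus:simpleclient:0.3.0 \
    io.prometheus:simpleclient_common:0.3.0 \
    io.prometheus:simpleclient_dropwizard:0.3.0 \
    io.prometheus:simpleclient_pushgateway:0.3.0 \
    io.dropwizard.metrics:metrics-core:3.1.2
  do
    printf '%s/%s\n' "$MAVEN_CENTRAL" "$(jar_path "$coordinate")"
  done
}

list_jar_urls

# Possible full sequence, run from the Spark source root. The build and image commands
# are the ones described in the Spark 2.3 documentation; verify them against your
# checkout. <repo> and <tag> are placeholders.
#   ./build/mvn -Pkubernetes -DskipTests clean package
#   list_jar_urls | (cd assembly/target/scala-2.11/jars && xargs -n 1 curl -fsSLO)
#   ./bin/docker-image-tool.sh -r <repo> -t <tag> build
```

The block itself only prints the six URLs; the actual download is left as a commented command so that nothing is fetched until you opt in.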

Package version

The version number of the package is formatted as: com.banzaicloud:spark-metrics_<scala version>:<spark version>-<version>
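
As an illustration of this format, the sketch below splits a package coordinate into its Scala version, Spark version and sink version using plain shell parameter expansion. The function name is hypothetical, and the parsing assumes the Spark version itself contains no hyphen, as in the coordinates used in this document.

```shell
# Split com.banzaicloud:spark-metrics_<scala version>:<spark version>-<version>.
parse_package_version() {
  artifact_and_version=${1#*:}              # e.g. spark-metrics_2.11:2.3-2.0.1
  artifact=${artifact_and_version%%:*}      # e.g. spark-metrics_2.11
  full_version=${artifact_and_version#*:}   # e.g. 2.3-2.0.1
  scala_version=${artifact##*_}             # text after the last underscore
  spark_version=${full_version%%-*}         # text before the first hyphen
  sink_version=${full_version#*-}           # text after the first hyphen
  printf 'scala=%s spark=%s sink=%s\n' "$scala_version" "$spark_version" "$sink_version"
}

parse_package_version com.banzaicloud:spark-metrics_2.11:2.3-2.0.1
parse_package_version com.banzaicloud:spark-metrics_2.11:2.2.1-1.0.0
```

For the two coordinates used in this document, this prints scala=2.11 spark=2.3 sink=2.0.1 and scala=2.11 spark=2.2.1 sink=1.0.0.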

Conclusion

This is not the ideal scenario, but it does the job and it's independent of Spark core. At Banzai Cloud we are willing to contribute this sink once the community decides that it actually needs it. Meanwhile you can open a new feature request, use this existing PR or use any other means to ask for native Prometheus support, and let us know through one of our social channels. As usual, we are happy to help. All we create is open source, and all we consume is open source as well, so we are always eager to make open source projects better.