varOne

Join the chat at https://gitter.im/SparkMonitor/varOne

This is an Apache Spark web monitoring tool, named varOne.

varOne provides a web UI that lets you monitor the metrics of Spark applications more efficiently and easily. varOne ingests the Spark event logs and metrics data and summarizes them as rich charts. If you don't want to use the web UI, you can use the RESTful APIs provided by varOne and build a custom one yourself.

Usage

Prerequisites

  • Spark on YARN (we will relax this restriction in the near future)
    • Currently only yarn-client mode is supported.
  • JDK 7 or later
  • metrics.properties should enable CsvSink for all instances; you can use the following configuration:
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=1
*.sink.csv.unit=seconds
*.sink.csv.directory=/path/to/CSV_SINK
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource

Tip: metrics.properties lives in $SPARK_HOME/conf (if it does not exist yet, copy it from metrics.properties.template).

  • Set the following flags when using spark-submit:

    --files=/path/to/metrics.properties --conf spark.metrics.conf=metrics.properties
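
    For example, a full spark-submit invocation might look like this (the application jar, main class, and paths are placeholders):

    ```shell
    # Submit in yarn-client mode (the only mode varOne currently supports),
    # shipping metrics.properties to the cluster and pointing Spark at it.
    spark-submit \
      --master yarn-client \
      --class com.example.MyApp \
      --files /path/to/metrics.properties \
      --conf spark.metrics.conf=metrics.properties \
      /path/to/my-app.jar
    ```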

    or set the same properties in your application. For example:

    import org.apache.spark.{SparkConf, SparkContext}

    val sparkConf = new SparkConf()
      .set("spark.metrics.conf.*.sink.csv.class", "org.apache.spark.metrics.sink.CsvSink")
      .set("spark.metrics.conf.*.sink.csv.period", "1")
      .set("spark.metrics.conf.*.sink.csv.unit", "seconds")
      .set("spark.metrics.conf.*.sink.csv.directory", "/path/to/CSV_SINK")
      .set("spark.metrics.conf.driver.source.jvm.class", "org.apache.spark.metrics.source.JvmSource")
      .set("spark.metrics.conf.executor.source.jvm.class", "org.apache.spark.metrics.source.JvmSource")

    val sc = new SparkContext(sparkConf)
  • Enable the Spark event log in $SPARK_HOME/conf/spark-defaults.conf

    • Set spark.eventLog.enabled to true
    • Point spark.eventLog.dir at an HDFS directory

Tip: varOne currently only supports event logs stored on HDFS; we will relax this restriction in the near future.
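
The two entries in spark-defaults.conf would look like the following (the namenode hostname, port, and directory are placeholders — use a path that exists on your HDFS):

```
spark.eventLog.enabled   true
spark.eventLog.dir       hdfs://namenode:8020/spark-events
```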

a. Download

Click here to download varOne-0.1.0

b. Start the varOne daemons

Deploy varOne-0.1.0.tgz to each node in your cluster and untar it on every node.
Then pick one node and start all the daemons with the following steps:

  • Configure varOne-site.xml in $VARONE_HOME/conf directory
  • Configure varOne-env.sh in $VARONE_HOME/conf directory
    • Make sure you have set SPARK_HOME
  • Configure varonedaemond in $VARONE_HOME/conf directory
    • List each node's hostname (one host per line)
  • Run: ./bin/varOned-all.sh start
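
For example, for a three-node cluster the conf/varonedaemond file would simply contain the hostnames, one per line (the names below are placeholders):

```
spark-node1
spark-node2
spark-node3
```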

After running it, you can check with jps whether a VarOned process is listed on each node.
You can stop all the varOne daemons with this command: ./bin/varOned-all.sh stop

c. Start varOne web server

Follow the steps below to start the varOne web server:

  • Configure varOne-site.xml in $VARONE_HOME/conf directory
  • Configure varOne-env.sh in $VARONE_HOME/conf directory
    • Make sure you have set SPARK_HOME
    • Make sure you have set HADOOP_CONF_DIR
  • Run: ./bin/varOne.sh

After running it, open a browser and go to http://localhost:8080/varOne-web/index.html

d. About varOne-site.xml

varOne.server.port

  • varOne Web Server port, default is 8080

varOne.node.port

  • varOne daemon port, default is 8181

varOne.node.thread.number

  • The number of RPC handlers for the varOne daemon, default is 5

varOne.server.context.path

  • Context Path of the varOne Web Application, default is /varOne-web

varOne.war.tempdir

  • Location of Jetty's temporary directory
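
Putting these together, a minimal varOne-site.xml might look like the following. This sketch assumes the file uses the Hadoop-style configuration XML format, and the values shown are just the documented defaults:

```xml
<configuration>
  <!-- varOne web server port -->
  <property>
    <name>varOne.server.port</name>
    <value>8080</value>
  </property>
  <!-- varOne daemon port -->
  <property>
    <name>varOne.node.port</name>
    <value>8181</value>
  </property>
  <!-- RPC handler threads per daemon -->
  <property>
    <name>varOne.node.thread.number</name>
    <value>5</value>
  </property>
  <!-- Context path of the web application -->
  <property>
    <name>varOne.server.context.path</name>
    <value>/varOne-web</value>
  </property>
</configuration>
```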

Development

Check this document