
Spark jobs not showing up on Dr Elephant UI #456

Open

kartiknooli opened this issue Oct 24, 2018 · 8 comments

Comments

@kartiknooli

Hello, I am having an issue similar to what several others have reported, but none of those tickets helped me resolve it. My Spark jobs won't show up on the Dr. Elephant UI; I can only see MapReduce jobs. I went through this thread but could not figure out where to find the Dr. Elephant logs for the Spark jobs. I am on EMR with Hadoop 2.7.3 and Spark 2.1.1. All the configs mentioned above exist in my cluster, and I can see a running Spark job on the ResourceManager UI as well as in the Spark history server once it completes.

spark.yarn.historyServer.address ip-10-XX-XX-X.ec2.internal:18080
spark.eventLog.dir hdfs:///var/log/spark/apps
Here is what my Dr. Elephant folder looks like:
drwxr-xr-x 2 ec2-user ec2-user 4096 Oct 24 16:29 app-conf
drwxr-xr-x 2 ec2-user ec2-user 4096 Oct 17 22:29 bin
drwxr-xr-x 3 ec2-user ec2-user 4096 Oct 17 22:29 conf
-rwxr-xr-x 1 ec2-user ec2-user 1199 Oct 24 16:30 dr.log
drwxr-xr-x 2 ec2-user ec2-user 16384 Oct 17 22:29 lib
drwxr-xr-x 2 ec2-user ec2-user 4096 Oct 24 16:31 logs
-rwxr-xr-x 1 ec2-user ec2-user 2925 Oct 17 22:26 README.md
-rw-r--r-- 1 root root 5 Oct 24 16:30 RUNNING_PID
drwxr-xr-x 3 ec2-user ec2-user 4096 Oct 17 22:29 scripts
drwxr-xr-x 3 ec2-user ec2-user 4096 Oct 17 22:29 share
echo $SPARK_HOME
/usr/lib/spark

echo $SPARK_CONF_DIR
/usr/lib/spark/conf
Am I missing something here? Please help.

thanks,
Kartik.

@ColinArmstrong

There is a logs directory one level above your dr-elephant folder that I didn't see in your listing:

$DR_ELEPHANT_DIR/../logs/elephant/dr_elephant.log
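If the file is there, a quick way to watch what the Spark fetcher is doing is to tail it and filter for Spark activity (a minimal sketch, assuming $DR_ELEPHANT_DIR points at your dr-elephant install as above):

# Follow the Dr. Elephant log and keep only Spark-related lines.
tail -f "$DR_ELEPHANT_DIR/../logs/elephant/dr_elephant.log" | grep -i spark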

@kartiknooli (Author)

Thanks @ColinArmstrong for the response. I checked the log, and this time I reran another Spark job on the cluster and noticed that the Dr. Elephant UI says it is a Hadoop job and doesn't identify it as a Spark job. The dr_elephant.log file does not give me any error messages. Is my understanding of how Dr. Elephant displays Spark jobs on the UI incorrect?

When I filter the jobs on the UI by Job Type "Spark", it returns no results.

thanks,
Kartik.

@shahrukhkhan489 (Contributor)

Is HTTPS enabled on YARN? If HTTPS is not enabled, then use the steps below to get it working:

  1. Inject exports of SPARK_HOME and SPARK_CONF_DIR into the ./bin/start.sh file (see the sketch after this list).

  2. Make sure you have the Spark client installed as a component if you are using a vendor-specific distribution.

  3. Update the Spark fetcher configuration to com.linkedin.drelephant.spark.fetchers.SparkFetcher in the conf file app-conf/FetcherConf.xml. By default it is commented out.

This should get Dr. Elephant working against Spark Jobs.
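For step 1, a minimal sketch of what the injected exports might look like. The paths are assumptions; use whatever echo $SPARK_HOME and echo $SPARK_CONF_DIR report on your own cluster:

# Added near the top of ./bin/start.sh. Paths are examples for a
# typical EMR node; adjust them to your own Spark install.
export SPARK_HOME=/usr/lib/spark
export SPARK_CONF_DIR=/etc/spark/conf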

@lubomir-angelov commented Nov 22, 2018

@kartiknooli

To find dr_elephant.log, run: locate dr_elephant.log

In my case, to start getting Spark jobs I had to add the following to app-conf/FetcherConf.xml:

<fetcher>
  <applicationtype>spark</applicationtype>
  <classname>com.linkedin.drelephant.spark.fetchers.SparkFetcher</classname>
  <params>
    <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
    <should_process_logs_locally>true</should_process_logs_locally>
    <event_log_dir>webhdfs:///spark-history</event_log_dir>
  </params>
</fetcher>

Our Spark event log dir is configured as hdfs:///spark-history, so we added <event_log_dir>webhdfs:///spark-history</event_log_dir>.

And comment out these lines:

<applicationtype>spark</applicationtype>
<classname>com.linkedin.drelephant.spark.fetchers.FSFetcher</classname>

More info at #206
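Since the fetcher in this setup reads event logs over WebHDFS, it can help to first confirm the directory is reachable through the WebHDFS REST endpoint (a sketch; the namenode host and port 50070, the Hadoop 2.x default HTTP port, are assumptions for your cluster):

# List the Spark event log directory over WebHDFS; a JSON FileStatuses
# response means the path the fetcher will read actually resolves.
curl "http://namenode:50070/webhdfs/v1/spark-history?op=LISTSTATUS"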

@kartiknooli (Author)

@shahrukhkhan489 and @lubomir-angelov
thanks for the response.

I tried making the suggested changes.

  1. Inject exports of SPARK_HOME and SPARK_CONF_DIR into the ./bin/start.sh file. I hope you meant the following:
export SPARK_HOME=/usr/lib/spark
export SPARK_CONF_DIR=/etc/spark/conf

Please correct me if I am wrong.

  2. Make sure you have the Spark client installed as a component if you are using a vendor-specific distribution.
    We have the Spark client bootstrapped with EMR:
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.1.1
      /_/

Using Python version 2.7.12 (default, Sep  1 2016 22:14:00)
SparkSession available as 'spark'.
>>>
  3. Updated the Spark fetcher configuration to the following:
<fetcher>
    <applicationtype>spark</applicationtype>
    <classname>com.linkedin.drelephant.spark.fetchers.SparkFetcher</classname>
    <params>
      <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
      <should_process_logs_locally>true</should_process_logs_locally>
    </params>
  </fetcher>

I tried both with and without adding the HDFS path for the event logs; neither worked.

Here is the error message I got from the logs:

11-26-2018 19:24:35 INFO  [dr-el-executor-thread-2] com.linkedin.drelephant.ElephantRunner : Analyzing SPARK application_1520505558307_35023
11-26-2018 19:24:35 INFO  [ForkJoinPool-1-worker-9] com.linkedin.drelephant.spark.fetchers.SparkRestClient : calling REST API at http://hostname:18080/api/v1/applications/application_1520505558307_35027
11-26-2018 19:24:35 INFO  [dr-el-executor-thread-2] com.linkedin.drelephant.spark.fetchers.SparkFetcher : Fetching data for application_1520505558307_35023
11-26-2018 19:24:35 INFO  [ForkJoinPool-1-worker-5] com.linkedin.drelephant.spark.fetchers.SparkRestClient : calling REST API at http://hostname:18080/api/v1/applications/application_1520505558307_35023
11-26-2018 19:24:35 ERROR [ForkJoinPool-1-worker-9] com.linkedin.drelephant.spark.fetchers.SparkRestClient : error reading applicationInfo http:hostname:18080/api/v1/applications/application_1520505558307_35027. Exception Message = HTTP 404 Not Found
11-26-2018 19:24:35 WARN  [dr-el-executor-thread-1] com.linkedin.drelephant.spark.fetchers.SparkFetcher : Failed fetching data for application_1520505558307_35027. I will retry after some time! Exception Message is: HTTP 404 Not Found

Appreciate your help with this.

@lubomir-angelov commented Nov 26, 2018 via email

@shahrukhkhan489 (Contributor) commented Nov 27, 2018

@kartiknooli The 404 error indicates that the logs for that application have been rolled over. This may not be the case for all of your Spark applications.

error reading applicationInfo http:hostname:18080/api/v1/applications/application_1520505558307_35027. Exception Message = HTTP 404 Not Found

Try opening the same link in a browser; you will see the same error: http://hostname:18080/api/v1/applications/application_1520505558307_35027
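To check what the history server still holds, you can also query its REST API directly (a sketch; "hostname" is your Spark history server, as in the log above):

# List all applications the history server still has event logs for;
# if the failing id is missing here, its log has been rolled over.
curl "http://hostname:18080/api/v1/applications"

# Query the failing application directly; expect the same HTTP 404.
curl -i "http://hostname:18080/api/v1/applications/application_1520505558307_35027"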

@fusonghe commented Jan 5, 2019

Spark jobs don't show up on the dr-elephant web UI for me either. I am on dr-elephant version 2.1.7, Hadoop 3.0.0, Spark 1.6. In app-conf/FetcherConf.xml I have:

<fetcher>
  <applicationtype>spark</applicationtype>
  <classname>org.apache.spark.deploy.history.SparkFSFetcher</classname>
  <params>
    <event_log_size_limit_in_mb>100</event_log_size_limit_in_mb>
    <event_log_dir>/spark2-history</event_log_dir>
    <spark_log_ext>.snappy</spark_log_ext>
  </params>
</fetcher>

@shahrukhkhan489
