Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spark history logs fetching issue in HA Cluster #123

Closed
vinaygulani1 opened this issue Aug 2, 2016 · 1 comment
Closed

Spark history logs fetching issue in HA Cluster #123

vinaygulani1 opened this issue Aug 2, 2016 · 1 comment

Comments

@vinaygulani1
Copy link

vinaygulani1 commented Aug 2, 2016

DrElephant is not able to fetch the Spark history logs in Yarn HA cluster by setting the namenode_addresses, below are the configs :

<params>
<event_log_size_limit_in_mb>100</event_log_size_limit_in_mb>
<event_log_dir>/user/spark/jobhistory</event_log_dir>
<spark_log_ext>_1</spark_log_ext>
#the values specified in namenode_addresses will be used for obtaining spark logs. The cluster configuration will be ignored.
<namenode_addresses>hahdfs1.hostname:50070, hahdfs2.hostname:50070</namenode_addresses>
</params>

But it works with webhdfs if I specifically go for current active namenode, below are the configs:
<params>
<event_log_size_limit_in_mb>100</event_log_size_limit_in_mb>
<event_log_dir>webhdfs://hahdfs1.hostname.net:50070/user/spark/jobhistory</event_log_dir>
<event_log_dir>/user/spark/jobhistory</event_log_dir>
<spark_log_ext>_1</spark_log_ext>
</params>

Error logs:
08-01-2016 21:45:13 ERROR [dr-el-executor-thread-2] com.linkedin.drelephant.ElephantRunner : java.security.PrivilegedActionException: java.io.FileNotFoundException: File does not exist: /user/spark/jobhistory/application_1460147926973_0091_1
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1636)
at com.linkedin.drelephant.security.HadoopSecurity.doAs(HadoopSecurity.java:99)
at org.apache.spark.deploy.history.SparkFSFetcher.fetchData(SparkFSFetcher.scala:189)
at org.apache.spark.deploy.history.SparkFSFetcher.fetchData(SparkFSFetcher.scala:55)
at com.linkedin.drelephant.analysis.AnalyticJob.getAnalysis(AnalyticJob.java:231)
at com.linkedin.drelephant.ElephantRunner$ExecutorThread.run(ElephantRunner.java:181)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: File does not exist: /user/spark/jobhistory/application_1460147926973_0091_1
at sun.reflect.GeneratedConstructorAccessor25.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:385)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$600(WebHdfsFileSystem.java:91)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:656)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:622)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:458)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:487)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:483)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:838)
at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:853)
at org.apache.spark.deploy.history.SparkFSFetcher.org$apache$spark$deploy$history$SparkFSFetcher$$shouldThrottle(SparkFSFetcher.scala:324)
at org.apache.spark.deploy.history.SparkFSFetcher$$anon$1.run(SparkFSFetcher.scala:242)
at org.apache.spark.deploy.history.SparkFSFetcher$$anon$1.run(SparkFSFetcher.scala:189)

@akshayrai
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants