Not able to start external backend on YARN : java.io.IOException: Cannot run program "hadoop": error=2, No such file or directory #1759
The code throws an exception in a function in class `ExternalBackendUtils`, on the line:

```scala
val proc = cmdToLaunch.mkString(" ").!(ProcessLogger(
```

This command is passed to `launchShellCommand`:

```
hadoop jar /home/project/sparkling-water-3.28.0.1-1-2.4/h2odriver-sw3.28.0-hdp2.6-extended.jar -Dmapreduce.job.queuename=default -Dmapreduce.job.tags=H2O/Sparkling-Water,Sparkling-Water/Spark/application_1580446727168_0013 -Dai.h2o.args.config=sparkling-water-external -nodes 2 -notify notify_H2O_via_SparklingWater_application_1580446727168_0013 -jobname H2O_via_SparklingWater_application_1580446727168_0013 -mapperXmx 2G -nthreads -1 -J -log_level -J INFO -port_offset 1 -baseport 54321 -timeout 120 -disown -sw_ext_backend -J -rest_api_ping_timeout -J 60000 -J -client_disconnect_timeout -J 60000 -extramempercent 10
```
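The `error=2` in the issue title is the OS-level ENOENT: the JVM could not find a binary named `hadoop` on the PATH of the launching process. A minimal, hypothetical Java reproduction (not Sparkling Water code, and using a deliberately nonexistent command name) of how such a launch fails:

```java
import java.io.IOException;

public class LaunchFailureDemo {
    // Try to start an external command the way a JVM does: the bare command
    // name is resolved against the PATH of this process's environment.
    static String tryLaunch(String command) {
        try {
            Process p = new ProcessBuilder(command).start();
            p.destroy();
            return "started";
        } catch (IOException e) {
            // A binary that cannot be found surfaces as error=2 (ENOENT),
            // the same shape as the exception in this issue's title.
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // On Linux this prints something like:
        //   Cannot run program "no-such-binary-xyz": error=2, No such file or directory
        System.out.println(tryLaunch("no-such-binary-xyz"));
    }
}
```

The same failure occurs no matter how the command string is built; what matters is only whether the launching process's environment can resolve the first token to an executable.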
Hi @BhushG, what's the Hadoop distribution you're trying to run Sparkling Water on?
I'm running Sparkling Water on Hadoop YARN, HDP version 2.7. If I run the exact command above manually, the H2O external cluster starts successfully. But the H2O external cluster initialization fails when launched through the code.
External cluster initialization code:

If I initiate the H2O external cluster using the Sparkling Shell, it also works fine:
@jakubhava @mn-mikke Hi. I'm not able to run H2O ML programs on either the internal or the external backend. I'm stuck. Please let me know if you can help me with these exceptions.
I found the reason for this exception. I wrote a small test program (CommandTest) that runs a hadoop command and submitted it as a job to the YARN cluster; it throws the same exception. `hadoop` is not on the PATH of the process launching the command, so it cannot be executed.
This is what is happening there: https://stackoverflow.com/questions/51379619/running-hadoop-command-from-java-file
If I give the full path to hadoop, my CommandTest code runs successfully on YARN. But I can't hard-code a full path to hadoop in the Sparkling Water code.
@BhushG thanks for the investigation! Right now Sparkling Water expects `hadoop` to be on the PATH. Adding an option to specify the full path to the hadoop binary is a good idea; we might add that - https://0xdata.atlassian.net/browse/SW-1866
@jakubhava hadoop is already on the PATH; that's why I can execute hadoop commands from anywhere without specifying the full path. But when a Spark job executes a command such as `hadoop fs -ls /`, it cannot find hadoop. In that case I have to specify the full path, e.g. `/home/bhushan/usr/local/hadoop/bin/hadoop dfs -ls /`; only then does the command execute.

Even though hadoop is on my PATH, the Spark job is not able to detect it.
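This is consistent with `execvp`-style lookup: a bare command is resolved against the PATH of the environment the process actually runs in, not the PATH configured for login shells on the node. A small illustrative sketch of that resolution (the `which` helper below is hypothetical, written for this example only):

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class PathResolution {
    // Resolve a command name against an explicit PATH string, the way the OS
    // does when a process launches "hadoop" without a full path.
    static Optional<Path> which(String cmd, String pathEnv) {
        for (String dir : pathEnv.split(File.pathSeparator)) {
            Path candidate = Paths.get(dir, cmd);
            if (Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Found when the right directory is on the PATH...
        System.out.println(which("sh", "/bin:/usr/bin"));
        // ...but the same binary is "missing" under a PATH that lacks its
        // directory, which is the situation inside the YARN container.
        System.out.println(which("sh", "/tmp"));
    }
}
```

So the node can have hadoop installed and visible in an interactive shell while the container that launches the command still fails to resolve it.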
In that case it seems like some sort of Spark issue rather than a Sparkling Water one, what do you think? I have nevertheless put a configuration option for specifying the full path to hadoop into the Sparkling Water code to help with this issue: #1766
I checked out your branch SW-1866 in the IntelliJ IDE. I tried to build it using the command ./gradlew dist as mentioned here: http://docs.h2o.ai/sparkling-water/2.4/latest-stable/doc/devel/build.html. But the build failed.
Am I using the right build command? Even the master branch build is failing.
Hi @BhushG, to build this branch you also need to get the h2o-3 repo, switch to the matching branch, and build deployable artifacts there. Please also see the H2O-3 build instructions: https://github.com/h2oai/h2o-3#4-building-h2o-3
Thanks @mn-mikke @jakubhava. Hey, I finally found the problem: I was using cluster as the deploy mode. I changed it to client, and now the hadoop command executes successfully. Will setting the deploy mode to client also solve the internal backend issue #1739? What do you think?
Hello @mn-mikke, @jakubhava, I set the following and the issue has been resolved:

```
spark.yarn.appMasterEnv.PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/project/usr/local/hadoop/bin:/home/project/usr/local/hadoop/sbin:/home/project/usr/local/spark/bin:/home/project/usr/local/kafka/bin:/home/project/usr/local/scala/bin:/home/project/usr/local/hadoop/bin:/home/project/usr/local/hadoop/sbin:/home/project/usr/local/spark/bin:/home/project/usr/local/kafka/bin:/home/project/usr/local/scala/bin:/home/project/usr/local/hive/bin:/home/project/usr/local/zookeeper/bin
```

Thanks for your help @mn-mikke, @jakubhava :)
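In cluster deploy mode the driver runs inside the YARN ApplicationMaster container, whose environment can be set via Spark's `spark.yarn.appMasterEnv.[EnvironmentVariableName]` configuration. A sketch of passing such a PATH at submit time; the application jar name is a placeholder, and the hadoop bin directory is taken from the PATH in this thread (adjust for your install):

```shell
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf "spark.yarn.appMasterEnv.PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/home/project/usr/local/hadoop/bin" \
  your-app.jar
```

In client deploy mode the driver runs on the submitting machine instead, which is why the original poster's local PATH was picked up as soon as the deploy mode was switched to client.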
Here are the logs