Hi, I am facing an issue with submitting a job from Java. #16

Open

ankushreddy opened this issue May 5, 2017 · 8 comments

ankushreddy commented May 5, 2017

Hi, when I invoke the class from
https://github.com/mahmoudparsian/data-algorithms-book/blob/master/misc/how-to-submit-spark-job-to-yarn-from-java-code.md
from a spark-submit running locally, it is invoked and submits the job to the YARN cluster.

But when I invoke the same class from a spark-submit that is itself submitted to YARN, the application it submits is accepted but never moves to the RUNNING state, and it fails with the following error:

Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster

Application application_1493671618562_0072 failed 5 times due to AM Container for appattempt_1493671618562_0072_000005 exited with exitCode: 1
For more detailed output, check the application tracking page: http://headnode.internal.cloudapp.net:8088/cluster/app/application_1493671618562_0072 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e02_1493671618562_0072_05_000001
Exit code: 1
Exception message: /mnt/resource/hadoop/yarn/local/usercache/helixuser/appcache/application_1493671618562_0072/container_e02_1493671618562_0072_05_000001/launch_container.sh: line 26: $PWD:$PWD/spark_conf:$PWD/spark.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/:/usr/hdp/current/hadoop-hdfs-client/:/usr/hdp/current/hadoop-hdfs-client/lib/:/usr/hdp/current/hadoop-yarn-client/:/usr/hdp/current/hadoop-yarn-client/lib/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/:$PWD/mr-framework/hadoop/share/hadoop/common/:$PWD/mr-framework/hadoop/share/hadoop/common/lib/:$PWD/mr-framework/hadoop/share/hadoop/yarn/:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /mnt/resource/hadoop/yarn/local/usercache/helixuser/appcache/application_1493671618562_0072/container_e02_1493671618562_0072_05_000001/launch_container.sh: line 26: $PWD:$PWD/spark_conf:$PWD/spark.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/:/usr/hdp/current/hadoop-client/lib/:/usr/hdp/current/hadoop-hdfs-client/:/usr/hdp/current/hadoop-hdfs-client/lib/:/usr/hdp/current/hadoop-yarn-client/:/usr/hdp/current/hadoop-yarn-client/lib/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/:$PWD/mr-framework/hadoop/share/hadoop/common/:$PWD/mr-framework/hadoop/share/hadoop/common/lib/:$PWD/mr-framework/hadoop/share/hadoop/yarn/:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
at org.apache.hadoop.util.Shell.runCommand(Shell.java:933)
at org.apache.hadoop.util.Shell.run(Shell.java:844)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1123)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:225)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.

Thank you for your help.

Thanks,
Ankush Reddy.

@mahmoudparsian (Owner)

Please provide more details: your script, its log and error messages.


ankushreddy commented May 5, 2017

Hi @mahmoudparsian,

Here are the logs:

Log Type: directory.info
Log Upload Time: Fri May 05 06:03:26 +0000 2017
Log Length: 5492
ls -l:
total 36
lrwxrwxrwx 1 yarn hadoop 95 May 5 06:03 app.jar -> /mnt/resource/hadoop/yarn/local/filecache/10/sparkfiller-1.0-SNAPSHOT-jar-with-dependencies.jar
-rw-r--r-- 1 yarn hadoop 74 May 5 06:03 container_tokens
-rwx------ 1 yarn hadoop 710 May 5 06:03 default_container_executor_session.sh
-rwx------ 1 yarn hadoop 764 May 5 06:03 default_container_executor.sh
-rwx------ 1 yarn hadoop 6433 May 5 06:03 launch_container.sh
lrwxrwxrwx 1 yarn hadoop 102 May 5 06:03 spark_conf -> /mnt/resource/hadoop/yarn/local/usercache/helixuser/filecache/80/__spark_conf__6125877397366945561.zip
lrwxrwxrwx 1 yarn hadoop 125 May 5 06:03 spark.jar -> /mnt/resource/hadoop/yarn/local/usercache/helixuser/filecache/81/spark-assembly-1.6.3.2.5.4.0-121-hadoop2.7.3.2.5.4.0-121.jar
drwx--x--- 2 yarn hadoop 4096 May 5 06:03 tmp
find -L . -maxdepth 5 -ls:
3933556 4 drwx--x--- 3 yarn hadoop 4096 May 5 06:03 .
3933558 4 drwx--x--- 2 yarn hadoop 4096 May 5 06:03 ./tmp
3933562 4 -rw-r--r-- 1 yarn hadoop 60 May 5 06:03 ./.launch_container.sh.crc
3933517 185944 -r-x------ 1 yarn hadoop 190402950 May 5 06:03 ./spark.jar
3933564 4 -rw-r--r-- 1 yarn hadoop 16 May 5 06:03 ./.default_container_executor_session.sh.crc
3933518 4 drwx------ 2 yarn hadoop 4096 May 5 06:03 ./spark_conf
3933548 4 -r-x------ 1 yarn hadoop 945 May 5 06:03 ./spark_conf/taskcontroller.cfg
3933543 4 -r-x------ 1 yarn hadoop 249 May 5 06:03 ./spark_conf/slaves
3933541 4 -r-x------ 1 yarn hadoop 2316 May 5 06:03 ./spark_conf/ssl-client.xml.example
3933520 4 -r-x------ 1 yarn hadoop 1734 May 5 06:03 ./spark_conf/log4j.properties
3933526 4 -r-x------ 1 yarn hadoop 265 May 5 06:03 ./spark_conf/hadoop-metrics2-azure-file-system.properties
3933536 4 -r-x------ 1 yarn hadoop 1045 May 5 06:03 ./spark_conf/container-executor.cfg
3933519 8 -r-x------ 1 yarn hadoop 5685 May 5 06:03 ./spark_conf/hadoop-env.sh
3933531 4 -r-x------ 1 yarn hadoop 2358 May 5 06:03 ./spark_conf/topology_script.py
3933547 8 -r-x------ 1 yarn hadoop 4113 May 5 06:03 ./spark_conf/mapred-queues.xml.template
3933528 4 -r-x------ 1 yarn hadoop 744 May 5 06:03 ./spark_conf/ssl-client.xml
3933544 4 -r-x------ 1 yarn hadoop 417 May 5 06:03 ./spark_conf/topology_mappings.data
3933549 4 -r-x------ 1 yarn hadoop 342 May 5 06:03 ./spark_conf/spark_conf.properties
3933523 4 -r-x------ 1 yarn hadoop 247 May 5 06:03 ./spark_conf/hadoop-metrics2-adl-file-system.properties
3933535 4 -r-x------ 1 yarn hadoop 1020 May 5 06:03 ./spark_conf/commons-logging.properties
3933525 24 -r-x------ 1 yarn hadoop 22138 May 5 06:03 ./spark_conf/yarn-site.xml
3933529 4 -r-x------ 1 yarn hadoop 2450 May 5 06:03 ./spark_conf/capacity-scheduler.xml
3933538 4 -r-x------ 1 yarn hadoop 2490 May 5 06:03 ./spark_conf/hadoop-metrics.properties
3933534 12 -r-x------ 1 yarn hadoop 8754 May 5 06:03 ./spark_conf/hdfs-site.xml
3933533 8 -r-x------ 1 yarn hadoop 4261 May 5 06:03 ./spark_conf/yarn-env.sh
3933532 4 -r-x------ 1 yarn hadoop 1335 May 5 06:03 ./spark_conf/configuration.xsl
3933530 4 -r-x------ 1 yarn hadoop 758 May 5 06:03 ./spark_conf/mapred-site.xml.template
3933545 4 -r-x------ 1 yarn hadoop 1000 May 5 06:03 ./spark_conf/ssl-server.xml
3933527 8 -r-x------ 1 yarn hadoop 4680 May 5 06:03 ./spark_conf/core-site.xml
3933522 8 -r-x------ 1 yarn hadoop 5783 May 5 06:03 ./spark_conf/hadoop-metrics2.properties
3933542 4 -r-x------ 1 yarn hadoop 1308 May 5 06:03 ./spark_conf/hadoop-policy.xml
3933540 4 -r-x------ 1 yarn hadoop 1602 May 5 06:03 ./spark_conf/health_check
3933537 8 -r-x------ 1 yarn hadoop 4221 May 5 06:03 ./spark_conf/task-log4j.properties
3933521 8 -r-x------ 1 yarn hadoop 7596 May 5 06:03 ./spark_conf/mapred-site.xml
3933546 4 -r-x------ 1 yarn hadoop 2697 May 5 06:03 ./spark_conf/ssl-server.xml.example
3933539 4 -r-x------ 1 yarn hadoop 752 May 5 06:03 ./spark_conf/mapred-env.sh
3932820 135852 -r-xr-xr-x 1 yarn hadoop 139105807 May 4 22:53 ./app.jar
3933566 4 -rw-r--r-- 1 yarn hadoop 16 May 5 06:03 ./.default_container_executor.sh.crc
3933563 4 -rwx------ 1 yarn hadoop 710 May 5 06:03 ./default_container_executor_session.sh
3933559 4 -rw-r--r-- 1 yarn hadoop 74 May 5 06:03 ./container_tokens
3933565 4 -rwx------ 1 yarn hadoop 764 May 5 06:03 ./default_container_executor.sh
3933560 4 -rw-r--r-- 1 yarn hadoop 12 May 5 06:03 ./.container_tokens.crc
3933561 8 -rwx------ 1 yarn hadoop 6433 May 5 06:03 ./launch_container.sh
broken symlinks(find -L . -maxdepth 5 -type l -ls):

This is how my project is structured:

spark-application

--> scala1 class // I call the java class from this class.

--> java class // this submits another spark application to the YARN cluster.

Another spark-application

--> scala2 class

If I invoke the java class from the scala1 class via a local spark-submit, the spark-submit for --class scala2 is triggered and works fine.

If I invoke the java class from the scala1 class via a spark-submit running on YARN, the spark-submit for --class scala2 is triggered but fails with the error above.

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;

public class CallingSparkJob {

    public void submitJob(String latestreceivedpitrL, String newPtr) throws Exception {
        System.out.println("In submit job method");
        try {
            System.out.println("Building a spark command");

            // prepare arguments to be passed to
            // org.apache.spark.deploy.yarn.Client
            String[] args = new String[] {
                // the name of the application
                "--name", "name",

                // memory for the driver (optional)
                "--driver-memory", "1000M",

                "--num-executors", "2",
                "--executor-cores", "2",

                // path to the application's JAR file
                // (required in yarn-cluster mode)
                "--jar", "wasb://storage_account_container@storageaccount.blob.core.windows.net/user/ankushuser/sparkfiller/sparkfiller-1.0-SNAPSHOT-jar-with-dependencies.jar",

                // name of the application's main class (required)
                "--class", "com.test.SparkFiller",

                // (the optional --addJars list of local jars from the
                // original how-to example is omitted here)

                // argument 1: latestreceivedpitrL
                "--arg", latestreceivedpitrL,
                // argument 2: newPtr
                "--arg", newPtr,

                "--arg", "yarn-cluster"
            };

            System.out.println("create a Hadoop Configuration object");

            // create a Hadoop Configuration object
            Configuration config = new Configuration();

            // identify that Spark will be used in YARN mode
            System.setProperty("SPARK_YARN_MODE", "true");

            // create an instance of SparkConf
            SparkConf sparkConf = new SparkConf();
            sparkConf.setSparkHome("/usr/hdp/current/spark-client");
            sparkConf.setMaster("yarn-cluster");
            // alternatives from the original (commented out):
            // sparkConf.setMaster("yarn");
            // sparkConf.set("spark.submit.deployMode", "cluster"); // worked

            // in the original, the "--conf" flag before this value was
            // commented out, leaving it as a dangling entry in args[];
            // setting it on SparkConf is the safe equivalent
            sparkConf.set("spark.yarn.submit.waitAppCompletion", "false");

            // create ClientArguments, which will be passed to Client
            ClientArguments cArgs = new ClientArguments(args, sparkConf);

            // create an instance of the YARN Client
            Client client = new Client(cArgs, config, sparkConf);

            // submit the Spark job to YARN
            client.run();
        } catch (Exception e) {
            System.out.println("Error submitting spark Job");
            System.out.println(e.getMessage());
        }
    }
}
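
A note on the error above: the "bad substitution" in launch_container.sh comes from the literal ${hdp.version} placeholder left in the container classpath; bash cannot expand a variable whose name contains a dot. On HDP clusters, spark-submit normally resolves this placeholder by passing -Dhdp.version=... to the driver and AM JVMs (picked up from spark-defaults.conf / java-opts), which a hand-built yarn Client may not do. A minimal sketch of passing it programmatically before building the ClientArguments; the version 2.5.4.0-121 is inferred from the spark-assembly jar name in the log above and is an assumption, not a verified fix for this cluster:

// assumption: HDP version inferred from the
// spark-assembly-1.6.3.2.5.4.0-121 jar name in the log above
String hdpVersion = "-Dhdp.version=2.5.4.0-121";
// in yarn-cluster mode the AM hosts the driver, so the driver option
// applies; the AM option is added for completeness (yarn-client mode)
sparkConf.set("spark.driver.extraJavaOptions", hdpVersion);
sparkConf.set("spark.yarn.am.extraJavaOptions", hdpVersion);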

This is the spark-submit command I am using:

spark-submit --class scala1 --master yarn --deploy-mode cluster --num-executors 2 --executor-cores 2 --conf spark.yarn.executor.memoryOverhead=600 --conf spark.yarn.submit.waitAppCompletion=false /home/ankushuser/kafka_retry/kafka_retry_test/sparkflightaware/target/sparkflightaware-0.0.1-SNAPSHOT-jar-with-dependencies.jar

If I run this spark-submit command locally, it invokes the java class, and the spark-submit command for the scala2 application works fine.

If I run it on YARN, I face the issue above.

@mahmoudparsian (Owner)

Can you also please include error messages you are getting?
If possible, include your Scala classes (all of your classes) as well so that I can try it on my side.

Thanks,
Mahmoud

@ankushreddy (Author)

I am only getting the error message above.
I have about 10 to 20 classes that I use in the application. It basically pulls data from an API and pushes it to a Kafka topic. Within this application I invoke a filler job to push the records to DocumentDB.

Thanks,
Ankush Reddy.

@JackDavidson

Same issue here. Everything submits without any warnings or errors, but then nothing happens, and the YARN logs report 'Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster' on two nodes in the cluster (nothing else on any other node). I have instead resorted to invoking spark-submit from my Java code, which is working.
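
For reference, a minimal sketch of that workaround using Spark's programmatic launcher (org.apache.spark.launcher.SparkLauncher, in the spark-launcher module since Spark 1.4); the Spark home, jar path, main class, and arguments below are placeholders, not values from this thread:

import org.apache.spark.launcher.SparkLauncher;

public class LaunchViaSparkSubmit {
    public static void main(String[] args) throws Exception {
        // SparkLauncher forks a real spark-submit process, so cluster
        // defaults (spark-defaults.conf, java-opts) are applied, unlike
        // a hand-built org.apache.spark.deploy.yarn.Client
        Process spark = new SparkLauncher()
                .setSparkHome("/usr/hdp/current/spark-client") // placeholder
                .setAppResource("hdfs:///path/to/app.jar")     // placeholder
                .setMainClass("com.example.MainClass")         // placeholder
                .setMaster("yarn-cluster")
                .setConf(SparkLauncher.DRIVER_MEMORY, "1000M")
                .addAppArgs("arg1", "arg2")                    // placeholders
                .launch();
        // block until the forked spark-submit exits
        int exitCode = spark.waitFor();
        System.out.println("spark-submit exited with code " + exitCode);
    }
}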

@ankushreddy (Author)

@JackDavidson Hi, did you copy the jar files to the worker nodes? Our existing spark application, or the command we are running, might not run on the head node itself.

As a workaround, use Livy and store your jar/war in HDFS or any other location; you can then use it to POST the spark-submit command.
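
For context, Livy exposes a REST endpoint (POST /batches) that takes a JSON description of the job, so the jar only needs to be reachable from the cluster (e.g. in HDFS). A minimal sketch in Java; the Livy host, jar path, and class name below are placeholders, not values from this thread:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class LivySubmit {
    public static void main(String[] args) throws Exception {
        // POST /batches starts a batch session that runs the given jar
        URL url = new URL("http://livy-host:8998/batches"); // placeholder host
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);

        // the jar must live in HDFS (or other cluster-visible storage)
        String body = "{\"file\": \"hdfs:///path/to/app.jar\","     // placeholder
                    + " \"className\": \"com.example.MainClass\"}"; // placeholder

        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes("UTF-8"));
        }
        System.out.println("Livy responded: " + conn.getResponseCode());
    }
}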


JackDavidson commented Apr 20, 2018

@ankushreddy My project's jar file (there is only one, and it is a fat jar) is stored in HDFS. I haven't done any manual copying of jars anywhere. This works with spark-submit, but maybe submitting from Java is different in some way? The jar I am submitting does not contain the class that Spark can't find, though; the Spark libraries under SPARK_HOME do contain it. But I wonder: if I add a dependency on spark-yarn to my submitted fat jar, would Spark then be able to find the missing class?

Livy sounds like a great alternative; I'll start looking into it.

@ankushreddy (Author)

@JackDavidson In our case we included all the dependencies along with the jar, so we didn't face any issues with missing dependencies.

I would suggest you look at Livy; in most cases that should solve the problem.
