Unable to run samples in standalone mode. #283

Closed · sehunley opened this issue Jan 26, 2016 · 3 comments

@sehunley

I am currently trying to run the samples for SparkCLR. The local samples from within the localmode folder work great. However, I hit an error when trying to execute the samples against my Spark server (cluster) using the sparkclr-submit.cmd script:

C:\MyData\Apache_Spark\SparkCLR-master\build\runtime>sparkclr-submit.cmd --verbose --master spark://spark01:7077 --exe SparkCLRSamples.exe %SPARKCLR_HOME%\samples spark.local.dir %SPARKCLR_HOME%\Temp sparkclr.sampledata.loc %SPARKCLR_HOME%\data
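
(For reference, the shape of that command, going by the childArgs echoed in the --verbose output further down this thread: the options before --exe go to spark-submit, and everything after the application directory is handed to SparkCLRSamples.exe itself.)

REM sparkclr-submit.cmd [spark-submit options] --exe <driver exe> <app dir> [driver args...]
sparkclr-submit.cmd --verbose --master spark://spark01:7077 ^
    --exe SparkCLRSamples.exe %SPARKCLR_HOME%\samples ^
    spark.local.dir %SPARKCLR_HOME%\Temp ^
    sparkclr.sampledata.loc %SPARKCLR_HOME%\data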

I am getting the following error:
The system cannot find the path specified.
SPARKCLR_JAR=spark-clr_2.10-1.6.0-SNAPSHOT.jar
Error: Could not find or load main class org.apache.spark.launcher.SparkCLRSubmitArguments

Is this another environment variable that needs to be set? I have the SPARKCLR_HOME and JAVA_HOME variables set. Are there more that are needed?

I am on the latest SparkCLR, downloaded a day ago.

Thanks all.

@skaarthik (Contributor)

You also need to set the SPARK_HOME environment variable, in addition to JAVA_HOME and SPARKCLR_HOME. I guess SPARK_HOME is set in your case; otherwise, you would get the error message from https://github.com/Microsoft/SparkCLR/blob/master/scripts/sparkclr-submit.cmd#L77.
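
For example, something like this before invoking the script (a sketch only; the JDK path is a placeholder for wherever Java is installed on your machine, and the SparkCLR paths follow the build layout shown in this thread):

REM adjust all paths for your machine
set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_66
set SPARKCLR_HOME=C:\MyData\Apache_Spark\SparkCLR-master\build\runtime
REM Spark distribution bundled under build\tools; note: no trailing \
set SPARK_HOME=C:\MyData\Apache_Spark\SparkCLR-master\build\tools\spark-1.6.0-bin-hadoop2.6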

Since the error is on the SparkCLRSubmitArguments class, it is most likely due to an incorrect value for the SPARKCLR_CLASSPATH environment variable. You do not have to set this variable explicitly; sparkclr-submit.cmd sets it. You can simply echo its value to confirm that it points to the SparkCLR jar file.
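
A quick way to check (the expected value below is illustrative, based on the default build layout):

echo %SPARKCLR_CLASSPATH%
REM should print the full path to the SparkCLR jar, e.g.
REM C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\lib\spark-clr_2.10-1.6.0-SNAPSHOT.jar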

@sehunley (Author)

That did turn up an issue: the SPARKCLR_HOME variable was not set correctly, which was causing the error above. And you're right, now when I run I get the error for SPARK_HOME not being set:
[sparkclr-submit.cmd] Error - SPARK_HOME environment variable is not set
[sparkclr-submit.cmd] Note that SPARK_HOME environment variable should not have trailing \

Where would that point to? My Folder\SparkCLR-master\build\runtime\lib, where the spark-clr_2.10-1.6.0-SNAPSHOT.jar file resides?

Thanks.

@sehunley (Author)

Well, I pointed SPARK_HOME to C:\Spark\SparkCLR-master\build\tools\spark-1.6.0-bin-hadoop2.6. That seems to have resolved the issues with the SPARK_HOME and SPARKCLR_HOME variables. However, when I try to execute the samples on my Spark cluster with the following command, I get the error below:

C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\scripts>sparkclr-submit.cmd --verbose --master spark://spark01:7077 --exe SparkCLRSamples.exe %SPARKCLR_HOME%\samples spark.local.dir %SPARKCLR_HOME%\Temp sparkclr.sampledata.loc %SPARKCLR_HOME%\data
SPARKCLR_JAR=spark-clr_2.10-1.6.0-SNAPSHOT.jar
SPARKCLR_CLASSPATH=C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\lib\spark-clr_2.10-1.6.0-SNAPSHOT.jar
Zip driver directory C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples to C:\Users\shunley\AppData\Local\Temp\samples_1453846139169.zip
[sparkclr-submit.cmd] Command to run --verbose --master spark://spark01:7077 --name SparkCLRSamples --files C:\Users\shunley\AppData\Local\Temp\samples_1453846139169.zip --class org.apache.spark.deploy.csharp.CSharpRunner C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\lib\spark-clr_2.10-1.6.0-SNAPSHOT.jar C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples\SparkCLRSamples.exe spark.local.dir C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\Temp sparkclr.sampledata.loc C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\data
Using properties file: null
Parsed arguments:
master spark://spark01:7077
deployMode null
executorMemory null
executorCores null
totalExecutorCores null
propertiesFile null
driverMemory null
driverCores null
driverExtraClassPath null
driverExtraLibraryPath null
driverExtraJavaOptions null
supervise false
queue null
numExecutors null
files file:/C:/Users/shunley/AppData/Local/Temp/samples_1453846139169.zip
pyFiles null
archives null
mainClass org.apache.spark.deploy.csharp.CSharpRunner
primaryResource file:/C:/MyData/Apache_Spark/SparkCLR-master/build/runtime/lib/spark-clr_2.10-1.6.0-SNAPSHOT.jar
name SparkCLRSamples
childArgs [C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples\SparkCLRSamples.exe spark.local.dir C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\Temp sparkclr.sampledata.loc C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\data]
jars null
packages null
packagesExclusions null
repositories null
verbose true

Spark properties used, including those specified through
--conf and those from the properties file null:

Main class:
org.apache.spark.deploy.csharp.CSharpRunner
Arguments:
C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples
C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples\SparkCLRSamples.exe
spark.local.dir
C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\Temp
sparkclr.sampledata.loc
C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\data
System properties:
SPARK_SUBMIT -> true
spark.files -> file:/C:/Users/shunley/AppData/Local/Temp/samples_1453846139169.zip
spark.app.name -> SparkCLRSamples
spark.jars -> file:/C:/MyData/Apache_Spark/SparkCLR-master/build/runtime/lib/spark-clr_2.10-1.6.0-SNAPSHOT.jar
spark.submit.deployMode -> client
spark.master -> spark://spark01:7077
Classpath elements:
file:/C:/MyData/Apache_Spark/SparkCLR-master/build/runtime/lib/spark-clr_2.10-1.6.0-SNAPSHOT.jar

[CSharpRunner.main] Starting CSharpBackend!
[CSharpRunner.main] Port number used by CSharpBackend is 1914
[CSharpRunner.main] adding key=spark.jars and value=file:/C:/MyData/Apache_Spark/SparkCLR-master/build/runtime/lib/spark-clr_2.10-1.6.0-SNAPSHOT.jar to environment
[CSharpRunner.main] adding key=spark.app.name and value=SparkCLRSamples to environment
[CSharpRunner.main] adding key=spark.files and value=file:/C:/Users/shunley/AppData/Local/Temp/samples_1453846139169.zip to environment
[CSharpRunner.main] adding key=spark.submit.deployMode and value=client to environment
[CSharpRunner.main] adding key=spark.master and value=spark://spark01:7077 to environment
[SparkCLRSamples.exe.PrintLogLocation] Logs by SparkCLR and Apache Spark are available at C:\Users\shunley\AppData\Local\Temp\SparkCLRLogs
on object of type NullObject failed
null
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.api.csharp.CSharpBackendHandler.handleMethodCall(CSharpBackendHandler.scala:164)
at org.apache.spark.api.csharp.CSharpBackendHandler.channelRead0(CSharpBackendHandler.scala:94)
at org.apache.spark.api.csharp.CSharpBackendHandler.channelRead0(CSharpBackendHandler.scala:27)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:406)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1386)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1340)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:491)
... 25 more
()

It looks like a lot of values are missing at the beginning of the submission, like deployMode. Is there documentation on what to set and what's required?

Thanks.
