That jar contains only the classes from ch06; you need those plus all of their dependencies. That is, you need an assembly jar. Use ch06-lsa-1.0.0-jar-with-dependencies.jar instead.
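For reference, here is a rough sketch of the same submit command with the assembly jar swapped in. It assumes the assembly jar lands in the same target/ directory as the thin jar, which may not match your build, so adjust the path if it ends up elsewhere. The JAR and MAIN_CLASS variable names are only for illustration.

```shell
# Assumption: the assembly jar is written next to the thin jar in ch06-lsa/target.
JAR="aas/ch06-lsa/target/ch06-lsa-1.0.0-jar-with-dependencies.jar"
MAIN_CLASS="com.cloudera.datascience.lsa.RunLSA"

if [ -f "$JAR" ]; then
  # Same command as in the question, but pointing at the jar that bundles the dependencies.
  ./spark/bin/spark-submit --class "$MAIN_CLASS" "$JAR"
else
  # Fail softly so it is obvious when the path assumption does not hold.
  echo "Assembly jar not found at $JAR; check ch06-lsa/target after building." >&2
fi
```

The existence check is there only so that a wrong path shows up as a clear message rather than another class-loading error.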
Hi,
I'm trying chapter 6 and I have two questions:
First,
cd aas
mvn install
cd ch06-lsa
mvn package
cd ..
./spark/bin/spark-submit --class com.cloudera.datascience.lsa.RunLSA aas/ch06-lsa/target/ch06-lsa-1.0.0.jar
but I get an error:
15/05/28 18:07:33 INFO cluster.SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Exception in thread "main" java.lang.NoClassDefFoundError: edu/umd/cloud9/collection/wikipedia/WikipediaPage
at com.cloudera.datascience.lsa.RunLSA$.preprocessing(RunLSA.scala:54)
at com.cloudera.datascience.lsa.RunLSA$.main(RunLSA.scala:33)
at com.cloudera.datascience.lsa.RunLSA.main(RunLSA.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: edu.umd.cloud9.collection.wikipedia.WikipediaPage
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
I'm launching from the master node of an EC2 Spark install (https://spark.apache.org/docs/latest/ec2-scripts.html).
Secondly, how do I launch the main function of RunLSA from the Spark shell?
./spark/bin/spark-shell --jars aas/ch06-lsa/target/ch06-lsa-1.0.0.jar
I have been trying:
import com.cloudera.datascience.lsa.RunLSA
RunLSA.main(Array("100","1000","0.1"))
but I get the error:
15/05/28 18:14:21 WARN spark.SparkContext: Multiple running SparkContexts detected in the same JVM!
org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:80)
Just looking for your best practice.
Thanks a lot.