No Module in PySpark #37
Is this module compatible with PySpark? Every time I try to import it, it fails. It works fine in Scala.

Comments
What sort of error or log messages do you get when it fails?
Sorry, I thought I had attached the error messages:

Using Python version 2.7.12 (default, Nov 19 2016 06:48:10)

I started pyspark with these options:

$SPARK_HOME/bin/pyspark --master spark://base:7077 --executor-memory 1000m --executor-cores 4 --conf "spark.locality.wait.node=0" --conf "spark.executor.extraJavaOptions=-XX:MaxDirectMemorySize=1000m" --conf "spark.default.parallelism=3" --driver-memory=600m --jars $SPARK_HOME/mysql-connector-java-5.1.41-bin.jar,$SPARK_HOME/hadoop-pcap-serde-1.1-jar-with-dependencies.jar,$SPARK_HOME/hadoop-pcap-lib-1.1.jar --verbose --num-executors 10 --driver-class-path=/usr/local/spark/*

Using the same command line with spark-shell works fine.
That won't work. hadoop-pcap is a Java library, so you will have to figure out how to use it with something like Py4J or similar. I haven't done that myself yet, so I can't help with how.
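(A rough sketch of what that route could look like: PySpark already embeds a Py4J gateway, so classes from jars passed with --jars can be reached through the SparkContext rather than a Python import. This is a minimal sketch, assuming the hadoop-pcap jars are on the classpath; the class path net.ripe.hadoop.pcap.PcapReader is taken from the hadoop-pcap sources and should be verified against the deployed jar.)

```python
# Minimal sketch, assuming pyspark was started with the hadoop-pcap jars
# on the classpath (--jars / --driver-class-path as in the command above).
from pyspark import SparkContext

# In the pyspark shell `sc` already exists; getOrCreate() reuses it.
sc = SparkContext.getOrCreate()

# PySpark embeds a Py4J gateway: Java classes are reached through
# sc._jvm, not a Python "import" (a plain import can never find a Java
# class, which is why the import fails with "no module").
# The package path below is taken from the hadoop-pcap sources; verify
# it against the jar version actually deployed.
PcapReader = sc._jvm.net.ripe.hadoop.pcap.PcapReader
print(PcapReader)  # a Py4J JavaClass handle, usable only on the driver
```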
I have tried using Py4J. It imports the library.
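(Objects reached through sc._jvm live on the driver only. For a parallel read, a sketch of the Hadoop input-format route, assuming net.ripe.hadoop.pcap.io.PcapInputFormat in hadoop-pcap-lib is an old mapred-API InputFormat with LongWritable keys and ObjectWritable values, which is worth verifying; the path is hypothetical, and without a custom valueConverter PySpark cannot turn the wrapped Java packet objects into Python objects, so treat this as a starting point rather than a working recipe.)

```python
# Sketch of a parallel read through hadoop-pcap's Hadoop InputFormat.
# Assumptions (unverified): PcapInputFormat uses the old mapred API and
# yields LongWritable/ObjectWritable pairs; the HDFS path is made up.
from pyspark import SparkContext

sc = SparkContext.getOrCreate()

rdd = sc.hadoopFile(
    "hdfs:///data/capture.pcap",  # hypothetical path
    inputFormatClass="net.ripe.hadoop.pcap.io.PcapInputFormat",
    keyClass="org.apache.hadoop.io.LongWritable",
    valueClass="org.apache.hadoop.io.ObjectWritable",
)

# Even count() forces values across the JVM/Python boundary, so a
# custom valueConverter is likely required in practice.
print(rdd.count())
```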