Help please: I get error trying to feed data via spark RDD - Jupyter notebook #86
I am running TFOS_spark_demo.ipynb on a Standalone 6-node Spark cluster.
Py4JJavaError Traceback (most recent call last)
/usr/lib/python2.7/site-packages/tensorflowonspark/TFCluster.pyc in train(self, dataRDD, num_epochs, qname)
/opt/mapr/spark/spark-2.0.1/python/pyspark/rdd.pyc in foreachPartition(self, f)
/opt/mapr/spark/spark-2.0.1/python/pyspark/rdd.pyc in count(self)
/opt/mapr/spark/spark-2.0.1/python/pyspark/rdd.pyc in sum(self)
/opt/mapr/spark/spark-2.0.1/python/pyspark/rdd.pyc in fold(self, zeroValue, op)
/opt/mapr/spark/spark-2.0.1/python/pyspark/rdd.pyc in collect(self)
/opt/mapr/spark/spark-2.0.1/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py in call(self, *args)
/opt/mapr/spark/spark-2.0.1/python/pyspark/sql/utils.pyc in deco(*a, **kw)
/opt/mapr/spark/spark-2.0.1/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
Does the number of executors have to be exact? That is, does it have to be 6 if I have 6 available, or can I use 3, 4, or 5?
On May 26, 2017, at 12:30 PM, leewyang <***@***.***> wrote: @jaideepjoshi can you try modifying the ipynb script to set num_executors = 6? In that particular script, we aren't inferring the number of executors from the environment, so you'll have to tell the app how many executors are actually present.
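For anyone who would rather not hard-code the value, here is a minimal sketch of deriving num_executors from Spark settings instead. The helper name resolve_num_executors and its fallback parameter are hypothetical and not part of TensorFlowOnSpark. It assumes either spark.executor.instances is set (typical on YARN) or both spark.cores.max and spark.executor.cores are set (a common Standalone setup). The cores-based estimate ignores memory limits and uneven workers, so treat the result as a guess to check against the Spark UI.

```python
def resolve_num_executors(conf, fallback=None):
    """Estimate the executor count from a dict of Spark settings.

    Hypothetical helper for illustration only; not a TensorFlowOnSpark API.
    """
    # Explicit executor count, if the application was launched with one.
    instances = conf.get("spark.executor.instances")
    if instances is not None:
        return int(instances)

    # Assumption: with both values set, the cluster launches roughly
    # cores.max / executor.cores executors for this application.
    cores_max = conf.get("spark.cores.max")
    executor_cores = conf.get("spark.executor.cores")
    if cores_max is not None and executor_cores is not None:
        return int(cores_max) // int(executor_cores)

    # Nothing to infer from: use the caller's value or fail loudly.
    if fallback is None:
        raise ValueError("cannot infer executor count; set num_executors manually")
    return fallback


# In the notebook this could be fed from the live context, for example:
#   conf = dict(sc._conf.getAll())   # sc is the notebook's SparkContext
#   num_executors = resolve_num_executors(conf, fallback=6)
```

If neither setting is present, passing the known node count as fallback reproduces the manual num_executors = 6 suggested above.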
At the moment, yes, since we're mostly targeting YARN use cases internally. In these use cases, you would typically use