
pixiedust.enableJobMonitor() does not work on Spark 2.1 #223

Closed
markwatsonatx opened this issue Mar 2, 2017 · 3 comments

@markwatsonatx (Contributor)

As a user, I want to enable the job monitor on Spark 2.1

Expected behavior

The job monitor is enabled without errors.

Actual behavior

Results in the following stack trace:

Exception in thread Thread-5:
Traceback (most recent call last):
  File "/opt/conda/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/opt/conda/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/pixiedust/pixiedust/utils/sparkJobProgressMonitor.py", line 42, in startSparkJobProgressMonitor
    progressMonitor = SparkJobProgressMonitor()
  File "/pixiedust/pixiedust/utils/sparkJobProgressMonitor.py", line 166, in __init__
    self.addSparkListener()
  File "/pixiedust/pixiedust/utils/sparkJobProgressMonitor.py", line 195, in addSparkListener
    _env.getTemplate("sparkJobProgressMonitor/addSparkListener.scala").render()
  File "/opt/conda/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2115, in run_cell_magic
    result = fn(magic_arg_s, cell)
  File "<decorator-gen-123>", line 2, in scala
  File "/opt/conda/lib/python2.7/site-packages/IPython/core/magic.py", line 188, in <lambda>
    call = lambda f, *a, **k: f(*a, **k)
  File "/pixiedust/pixiedust/utils/scalaBridge.py", line 179, in scala
    runnerObject.callMethod("init", pd_getJavaSparkContext(), None if self.hasLineOption(line, "noSqlContext") else self.interactiveVariables.getVar("sqlContext")._ssql_ctx )
  File "/pixiedust/pixiedust/utils/javaBridge.py", line 135, in callMethod
    jMethodParams[i] = None if arg is None else (arg if arg.__class__.__name__ == "JavaClass" else arg.getClass())
  File "/root/spark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_collections.py", line 228, in __setitem__
    return self.__set_item(key, value)
  File "/root/spark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_collections.py", line 211, in __set_item
    return get_return_value(answer, self._gateway_client)
  File "/root/spark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 323, in get_return_value
    format(target_id, ".", name, value))
Py4JError: An error occurred while calling None.None. Trace:
java.lang.NullPointerException
	at py4j.commands.ArrayCommand.convertArgument(ArrayCommand.java:154)
	at py4j.commands.ArrayCommand.setArray(ArrayCommand.java:144)
	at py4j.commands.ArrayCommand.execute(ArrayCommand.java:97)
	at py4j.GatewayConnection.run(GatewayConnection.java:214)
	at java.lang.Thread.run(Thread.java:745)



ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/root/spark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 883, in send_command
    response = connection.send_command(command)
  File "/root/spark/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1040, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving

Steps to reproduce the behavior

Run pixiedust.enableJobMonitor() in a notebook running Spark 2.1.

@ptitzler (Contributor) commented Mar 8, 2017

I saw the same error running locally.

@DTAIEB DTAIEB added this to the 1.1 milestone Mar 8, 2017
@jjloesch

Any update on this issue?

I also hit the exact same issue running locally, so reproducibility is not the problem; the outcome is that a major feature advertised on the IBM website is simply not working at all:

https://ibm-watson-data-lab.github.io/pixiedust/sparkmonitor.html
#vaporware?

@DTAIEB DTAIEB modified the milestones: 1.0.9, 1.1 Jul 24, 2017
@DTAIEB DTAIEB self-assigned this Jul 24, 2017
@DTAIEB (Member) commented Jul 24, 2017

This was in Icebox and fell off the radar. Tracking it for 1.0.9 #forreal

DTAIEB pushed a commit that referenced this issue Jul 27, 2017
#223 Changes in PY4J 0.10.4 caused the error. Fix is to not set None value in py4j array
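The traceback and the commit message point at javaBridge.py, where Python None was written into a gateway-created Java array; py4j 0.10.4 turns that into a server-side NullPointerException in ArrayCommand.convertArgument. A minimal sketch of the failure mode and the "skip None" fix, using a hypothetical FakeJavaArray stand-in (not the real py4j class) so the pattern can be shown without a JVM:

```python
class FakeJavaArray:
    """Stand-in for a py4j JavaArray that mimics the 0.10.4 behavior:
    assigning Python None raises, the way the gateway surfaced
    java.lang.NullPointerException from ArrayCommand.convertArgument."""

    def __init__(self, size):
        self._items = [object()] * size

    def __setitem__(self, i, value):
        if value is None:
            # Reported to the Python side as:
            # Py4JError: An error occurred while calling None.None
            raise RuntimeError("NullPointerException in ArrayCommand")
        self._items[i] = value

    def __getitem__(self, i):
        return self._items[i]


def fill_params_buggy(arr, args):
    # Pre-fix logic (as on javaBridge.py line 135): None is written
    # straight into the Java array, which py4j 0.10.4 rejects.
    for i, arg in enumerate(args):
        arr[i] = None if arg is None else arg


def fill_params_fixed(arr, args):
    # Post-fix logic per the commit message: skip None arguments so the
    # array slot keeps its default value instead of receiving Python None.
    for i, arg in enumerate(args):
        if arg is not None:
            arr[i] = arg
```

With a None among the arguments, fill_params_buggy raises while fill_params_fixed completes, which matches "Fix is to not set None value in py4j array".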
@DTAIEB DTAIEB closed this as completed Jul 29, 2017
Labels: none
Milestone: 1.0.9
Development: no branches or pull requests
4 participants