Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make python docker interpreter work using py4j #16

Merged
merged 4 commits into from
Mar 18, 2017

Conversation

Leemoonsoo
Copy link

Bringing PythonDockerInterpreter back!

@astroshim
Copy link
Owner

Really appreciate for backing up! 👍
and I think docker interpreter setting should be added on python/src/main/resources/interpreter-setting.json.

@Leemoonsoo
Copy link
Author

@astroshim Done!

@astroshim
Copy link
Owner

In my testing, I got following error when i try to run activate eboraas/tensorflow.
and python process killed.

 INFO [2017-03-18 01:35:22,629] ({pool-3-thread-2} PythonInterpreter.java[setPythonCommand]:439) - Set Python Command : docker run -i --rm -v /tmp:/_zeppelin_tmp -v /Users/shim/review/zeppelin/zeppelin:/_zeppelin -e PYTHONPATH=":/_zeppelin/interpreter/python/py4j-0.9.2/src::/_zeppelin/interpreter/lib/python" eboraas/tensorflow python /_zeppelin_tmp/zeppelin_python-6550196246324960641.py
ERROR [2017-03-18 01:35:22,636] ({Exec Default Executor} PythonInterpreter.java[onProcessFailed]:515) - python process failed
org.apache.commons.exec.ExecuteException: Process exited with an error: 143 (Exit value: 143)
	at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
	at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
	at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
	at java.lang.Thread.run(Thread.java:745)
 INFO [2017-03-18 01:35:24,134] ({pool-3-thread-2} PythonInterpreter.java[createPythonScript]:126) - File /tmp/zeppelin_python-6550196246324960641.py created
 INFO [2017-03-18 01:35:24,137] ({pool-3-thread-2} PythonInterpreter.java[createGatewayServerAndStartScript]:205) - cmd = [docker, run, -i, --rm, -v, /tmp:/_zeppelin_tmp, -v, /Users/shim/review/zeppelin/zeppelin:/_zeppelin, -e, PYTHONPATH=:/_zeppelin/interpreter/python/py4j-0.9.2/src::/_zeppelin/interpreter/lib/python, eboraas/tensorflow, python, /_zeppelin_tmp/zeppelin_python-6550196246324960641.py, 49212, 127.0.0.1]
 INFO [2017-03-18 01:35:24,141] ({pool-3-thread-2} SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1489768518588 finished by scheduler org.apache.zeppelin.python.PythonInterpreter589903169

@Leemoonsoo
Copy link
Author

Can you try with image activate gcr.io/tensorflow/tensorflow:0.11.0 ?

@Leemoonsoo
Copy link
Author

I also tried with activate eboraas/tensorflow and it works, too.

The error Process exited with an error is error i also see.
Once activate runs, it kills python process and open new process. And the error is coming from kill python process. We can address the error later i think.

@astroshim
Copy link
Owner

In my case, when i activate docker interpreter then python process killed and doesn't open new process.
so I requested test to @jl and @cloverhearts. :)

}

@Override
public InterpreterResult interpret(String st, InterpreterContext context) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Leemoonsoo Can you check this method? I think you miss actual running part of executing code. you'd check only "st == null", "activate", "deactivate" except "running code"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PythonDockerInterpreter does not run any python code. It just replace python command for PythonInterpreter from 'python' to 'docker -i run ... 'python'.

So I think this method is okay.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But... in case you try to run "print(1)", that method always return error, isn't it?

@astroshim astroshim merged commit c9b195b into astroshim:py4jPythonInterpreter Mar 18, 2017
astroshim pushed a commit that referenced this pull request Mar 26, 2017
### What is this PR for?
I've been testing the python interpreter and I found at least 4 major issues in the current python interpreter.

1. not working streaming output.
 - https://issues.apache.org/jira/browse/ZEPPELIN-2225

2. printed "..." when there is indent in the python code.
 - https://issues.apache.org/jira/browse/ZEPPELIN-1929

3. very slow output of matplotlib
 - https://issues.apache.org/jira/browse/ZEPPELIN-1894
 - https://issues.apache.org/jira/browse/ZEPPELIN-1360

4. Unexpected output of matplotlib.
 - https://issues.apache.org/jira/browse/ZEPPELIN-2107

so I changed python interpreter to use py4j  based on pyspark interpreter and would be fixed above issues.
and I am going to recreate conda, docker for python interpreter ASAP.

### What type of PR is it?
Bug Fix | Hot Fix | Refactoring

### How should this be tested?
1. not working streaming output.
```
import time
for x in range(0, 5):
    print x
    time.sleep(1)
```

2. printed "..." when there is indent in the python code.
```
def fn():
    print("hi")

fn()
```

3. very slow output of matplotlib.
```
import matplotlib
import sys
import matplotlib.pyplot as plt
plt.plot([1,2,3])
```

4. Unexpected output of matplotlib.
```

import matplotlib.pyplot as plt
import matplotlib as mpl

# Make a figure and axes with dimensions as desired.
fig = plt.figure(figsize=(8, 3))
ax1 = fig.add_axes([0.05, 0.80, 0.9, 0.15])
ax2 = fig.add_axes([0.05, 0.475, 0.9, 0.15])
ax3 = fig.add_axes([0.05, 0.15, 0.9, 0.15])

# Set the colormap and norm to correspond to the data for which
# the colorbar will be used.
cmap = mpl.cm.cool
norm = mpl.colors.Normalize(vmin=5, vmax=10)

# ColorbarBase derives from ScalarMappable and puts a colorbar
# in a specified axes, so it has everything needed for a
# standalone colorbar.  There are many more kwargs, but the
# following gives a basic continuous colorbar with ticks
# and labels.
cb1 = mpl.colorbar.ColorbarBase(ax1, cmap=cmap,
                                norm=norm,
                                orientation='horizontal')
cb1.set_label('Some Units')

# The second example illustrates the use of a ListedColormap, a
# BoundaryNorm, and extended ends to show the "over" and "under"
# value colors.
cmap = mpl.colors.ListedColormap(['r', 'g', 'b', 'c'])
cmap.set_over('0.25')
cmap.set_under('0.75')

# If a ListedColormap is used, the length of the bounds array must be
# one greater than the length of the color list.  The bounds must be
# monotonically increasing.
bounds = [1, 2, 4, 7, 8]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
cb2 = mpl.colorbar.ColorbarBase(ax2, cmap=cmap,
                                norm=norm,
                                # to use 'extend', you must
                                # specify two extra boundaries:
                                boundaries=[0] + bounds + [13],
                                extend='both',
                                ticks=bounds,  # optional
                                spacing='proportional',
                                orientation='horizontal')
cb2.set_label('Discrete intervals, some other units')

# The third example illustrates the use of custom length colorbar
# extensions, used on a colorbar with discrete intervals.
cmap = mpl.colors.ListedColormap([[0., .4, 1.], [0., .8, 1.],
                                  [1., .8, 0.], [1., .4, 0.]])
cmap.set_over((1., 0., 0.))
cmap.set_under((0., 0., 1.))

bounds = [-1., -.5, 0., .5, 1.]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
cb3 = mpl.colorbar.ColorbarBase(ax3, cmap=cmap,
                                norm=norm,
                                boundaries=[-10] + bounds + [10],
                                extend='both',
                                # Make the length of each extension
                                # the same as the length of the
                                # interior colors:
                                extendfrac='auto',
                                ticks=bounds,
                                spacing='uniform',
                                orientation='horizontal')
cb3.set_label('Custom extension lengths, some other units')

plt.show()
```

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: astroshim <hsshim@zepl.com>
Author: Lee moon soo <moon@apache.org>
Author: HyungSung <hsshim@nflabs.com>

Closes apache#2106 from astroshim/py4jPythonInterpreter and squashes the following commits:

c9b195b [HyungSung] Merge pull request #16 from Leemoonsoo/py4jdocker
e511ebe [Lee moon soo] add PythonDockerInterpreter to interpreter-setting.json
a76b0d8 [Lee moon soo] fix test on python3
2eb5de7 [Lee moon soo] Fix PythonDockerInterpreterTest.java test
9fcf144 [Lee moon soo] Make python docker interpreter work using py4j
8a016c9 [astroshim] Merge branch 'master' into py4jPythonInterpreter
aad7ee8 [astroshim] fix testcase
ac92cdb [astroshim] fix python interpreter testcase
e8570d2 [astroshim] fix ci for pandassql
be5db4d [astroshim] fix pandas sql testcase
f8e19be [astroshim] fix matplotlib testcase
046db88 [astroshim] add testcase
e49ad24 [astroshim] add pandas
60e9820 [astroshim] bug fix about copying library
574bd21 [astroshim] fix interpreter-setting error
a48df58 [astroshim] Merge branch 'master' into py4jPythonInterpreter
3c9585f [astroshim] update interpreter-setting.json
a50179e [astroshim] add conda interpreter
cbbc15c [astroshim] fix py4j path
5ae5120 [astroshim] fix interpreter-setting
f17bff4 [astroshim] fix testcase failure.
af097ac [astroshim] add testcase
c3f5b78 [astroshim] Merge branch 'master' of https://github.com/apache/zeppelin into py4jPythonInterpreter
1395875 [astroshim] removed unnecessary code.
276011e [astroshim] add py4j lib
7304919 [astroshim] initialize python interpreter using py4j
astroshim pushed a commit that referenced this pull request Sep 12, 2017
### What is this PR for?
I've been testing the python interpreter and I found at least 4 major issues in the current python interpreter.

1. not working streaming output.
 - https://issues.apache.org/jira/browse/ZEPPELIN-2225

2. printed "..." when there is indent in the python code.
 - https://issues.apache.org/jira/browse/ZEPPELIN-1929

3. very slow output of matplotlib
 - https://issues.apache.org/jira/browse/ZEPPELIN-1894
 - https://issues.apache.org/jira/browse/ZEPPELIN-1360

4. Unexpected output of matplotlib.
 - https://issues.apache.org/jira/browse/ZEPPELIN-2107

so I changed python interpreter to use py4j  based on pyspark interpreter and would be fixed above issues.
and I am going to recreate conda, docker for python interpreter ASAP.

### What type of PR is it?
Bug Fix | Hot Fix | Refactoring

### How should this be tested?
1. not working streaming output.
```
import time
for x in range(0, 5):
    print x
    time.sleep(1)
```

2. printed "..." when there is indent in the python code.
```
def fn():
    print("hi")

fn()
```

3. very slow output of matplotlib.
```
import matplotlib
import sys
import matplotlib.pyplot as plt
plt.plot([1,2,3])
```

4. Unexpected output of matplotlib.
```

import matplotlib.pyplot as plt
import matplotlib as mpl

# Make a figure and axes with dimensions as desired.
fig = plt.figure(figsize=(8, 3))
ax1 = fig.add_axes([0.05, 0.80, 0.9, 0.15])
ax2 = fig.add_axes([0.05, 0.475, 0.9, 0.15])
ax3 = fig.add_axes([0.05, 0.15, 0.9, 0.15])

# Set the colormap and norm to correspond to the data for which
# the colorbar will be used.
cmap = mpl.cm.cool
norm = mpl.colors.Normalize(vmin=5, vmax=10)

# ColorbarBase derives from ScalarMappable and puts a colorbar
# in a specified axes, so it has everything needed for a
# standalone colorbar.  There are many more kwargs, but the
# following gives a basic continuous colorbar with ticks
# and labels.
cb1 = mpl.colorbar.ColorbarBase(ax1, cmap=cmap,
                                norm=norm,
                                orientation='horizontal')
cb1.set_label('Some Units')

# The second example illustrates the use of a ListedColormap, a
# BoundaryNorm, and extended ends to show the "over" and "under"
# value colors.
cmap = mpl.colors.ListedColormap(['r', 'g', 'b', 'c'])
cmap.set_over('0.25')
cmap.set_under('0.75')

# If a ListedColormap is used, the length of the bounds array must be
# one greater than the length of the color list.  The bounds must be
# monotonically increasing.
bounds = [1, 2, 4, 7, 8]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
cb2 = mpl.colorbar.ColorbarBase(ax2, cmap=cmap,
                                norm=norm,
                                # to use 'extend', you must
                                # specify two extra boundaries:
                                boundaries=[0] + bounds + [13],
                                extend='both',
                                ticks=bounds,  # optional
                                spacing='proportional',
                                orientation='horizontal')
cb2.set_label('Discrete intervals, some other units')

# The third example illustrates the use of custom length colorbar
# extensions, used on a colorbar with discrete intervals.
cmap = mpl.colors.ListedColormap([[0., .4, 1.], [0., .8, 1.],
                                  [1., .8, 0.], [1., .4, 0.]])
cmap.set_over((1., 0., 0.))
cmap.set_under((0., 0., 1.))

bounds = [-1., -.5, 0., .5, 1.]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
cb3 = mpl.colorbar.ColorbarBase(ax3, cmap=cmap,
                                norm=norm,
                                boundaries=[-10] + bounds + [10],
                                extend='both',
                                # Make the length of each extension
                                # the same as the length of the
                                # interior colors:
                                extendfrac='auto',
                                ticks=bounds,
                                spacing='uniform',
                                orientation='horizontal')
cb3.set_label('Custom extension lengths, some other units')

plt.show()
```

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: astroshim <hsshim@zepl.com>
Author: Lee moon soo <moon@apache.org>
Author: HyungSung <hsshim@nflabs.com>

Closes apache#2106 from astroshim/py4jPythonInterpreter and squashes the following commits:

c9b195b [HyungSung] Merge pull request #16 from Leemoonsoo/py4jdocker
e511ebe [Lee moon soo] add PythonDockerInterpreter to interpreter-setting.json
a76b0d8 [Lee moon soo] fix test on python3
2eb5de7 [Lee moon soo] Fix PythonDockerInterpreterTest.java test
9fcf144 [Lee moon soo] Make python docker interpreter work using py4j
8a016c9 [astroshim] Merge branch 'master' into py4jPythonInterpreter
aad7ee8 [astroshim] fix testcase
ac92cdb [astroshim] fix python interpreter testcase
e8570d2 [astroshim] fix ci for pandassql
be5db4d [astroshim] fix pandas sql testcase
f8e19be [astroshim] fix matplotlib testcase
046db88 [astroshim] add testcase
e49ad24 [astroshim] add pandas
60e9820 [astroshim] bug fix about copying library
574bd21 [astroshim] fix interpreter-setting error
a48df58 [astroshim] Merge branch 'master' into py4jPythonInterpreter
3c9585f [astroshim] update interpreter-setting.json
a50179e [astroshim] add conda interpreter
cbbc15c [astroshim] fix py4j path
5ae5120 [astroshim] fix interpreter-setting
f17bff4 [astroshim] fix testcase failure.
af097ac [astroshim] add testcase
c3f5b78 [astroshim] Merge branch 'master' of https://github.com/apache/zeppelin into py4jPythonInterpreter
1395875 [astroshim] removed unnecessary code.
276011e [astroshim] add py4j lib
7304919 [astroshim] initialize python interpreter using py4j

(cherry picked from commit 287ffd5)
Signed-off-by: Jongyoul Lee <jongyoul@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants