Rewrite PythonInterpreter. #2106
@astroshim Have you checked #1495, which also rewrites the Python interpreter? Strongly +1 to rewriting the Python interpreter. It would also be better to consider extensibility, so that other places such as the PySpark interpreter can reuse it.
@zjffdu Thank you for your opinion!
…PythonInterpreter
I just added the conda interpreter.
What a nice fix! Let me test it out and give you feedback!
Tested and it works well.
I left some comments. Also, the binary file python/src/main/resources/python/py4j-0.9-src.zip
shouldn't be in the source tree. It needs to be downloaded at build time.
@@ -135,6 +135,7 @@ private void changePythonEnvironment(String envName)

  private void restartPythonProcess() {
    PythonInterpreter python = getPythonInterpreter();
    logger.info("-----------> " + python);
logger.debug would be more appropriate here.
 *
 * Match experience of %sparpk.sql over Spark DataFrame
 */
public class PythonInterpreterPandasSql extends Interpreter {
Shall we keep this PandasSql interpreter?
 * mvn -Dpython.test.exclude='' test -pl python -am
 * </code>
 */
public class PythonInterpreterMatplotlibTest {
Is there any reason for removing the test?
@Leemoonsoo Thanks for reviewing. Let me fix the issues you commented on and add PandasSql.
@Leemoonsoo I fixed the issues you commented on and added PandasSql.
Tested. To make it work, I think we need to let the Docker container mount the directory where py4j exists using the -v option, and then set the PYTHONPATH environment variable to load the py4j library using the -e option. So this line
setPythonCommand("docker run -i --rm " + image + " python -iu");
can be updated to something like
String py4jDir = ZEPPELIN_HOME + "/interpreter/python";
String py4jFile = "py4j-{version}.zip"; // read directory py4jDir and get the file that starts with py4j
setPythonCommand("docker run -i --rm -v " + py4jDir + ":/zeppelin_python -e PYTHONPATH=/zeppelin_python/" + py4jFile + " " + image + " python -iu");
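The suggestion above leaves "read the directory and get the file that starts with py4j" as a comment. A minimal sketch of that lookup and of the resulting command is shown here in Python for illustration only; the interpreter itself is written in Java. The function names, the `/zeppelin_python` default argument, and the example image and file names are assumptions made for this sketch, not code from the PR.

```python
import os
import tempfile


def find_py4j_archive(py4j_dir):
    """Return the first file name in py4j_dir that starts with 'py4j', or None."""
    for name in sorted(os.listdir(py4j_dir)):
        if name.startswith("py4j"):
            return name
    return None


def build_docker_command(image, py4j_dir, mount_point="/zeppelin_python"):
    """Build the docker argument list: mount py4j_dir and point PYTHONPATH at the archive."""
    archive = find_py4j_archive(py4j_dir)
    if archive is None:
        raise FileNotFoundError("no py4j archive found in " + py4j_dir)
    return [
        "docker", "run", "-i", "--rm",
        # -v mounts the host directory that holds py4j into the container
        "-v", py4j_dir + ":" + mount_point,
        # -e sets PYTHONPATH so the container's python can import py4j
        "-e", "PYTHONPATH=" + mount_point + "/" + archive,
        image, "python", "-iu",
    ]


if __name__ == "__main__":
    # Hypothetical directory and image name, only to show the shape of the command.
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "py4j-example.zip"), "w").close()
        print(" ".join(build_docker_command("example/python-image", tmp)))
```

Returning an argument list rather than one concatenated string avoids quoting problems if the directory path contains spaces.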
I tested with the first two scripts and they worked well.
@astroshim I made a pull request to this branch to bring Docker support back. Please take a look and merge it if you think it's okay.
Make python docker interpreter work using py4j Thank you!!
### What is this PR for?
I've been testing the Python interpreter and found at least four major issues in the current implementation.
1. Streaming output does not work.
 - https://issues.apache.org/jira/browse/ZEPPELIN-2225
2. "..." is printed when the Python code contains indentation.
 - https://issues.apache.org/jira/browse/ZEPPELIN-1929
3. Matplotlib output is very slow.
 - https://issues.apache.org/jira/browse/ZEPPELIN-1894
 - https://issues.apache.org/jira/browse/ZEPPELIN-1360
4. Matplotlib produces unexpected output.
 - https://issues.apache.org/jira/browse/ZEPPELIN-2107

So I changed the Python interpreter to use py4j, based on the PySpark interpreter, which should fix the issues above.
I am going to recreate the conda and Docker interpreters for the Python interpreter ASAP.

### What type of PR is it?
Bug Fix | Hot Fix | Refactoring

### How should this be tested?
1. Streaming output does not work.
```
import time
for x in range(0, 5):
    print(x)
    time.sleep(1)
```

2. "..." is printed when the Python code contains indentation.
```
def fn():
    print("hi")

fn()
```

3. Matplotlib output is very slow.
```
import matplotlib
import sys
import matplotlib.pyplot as plt
plt.plot([1,2,3])
```

4. Matplotlib produces unexpected output.
```
import matplotlib.pyplot as plt
import matplotlib as mpl

# Make a figure and axes with dimensions as desired.
fig = plt.figure(figsize=(8, 3))
ax1 = fig.add_axes([0.05, 0.80, 0.9, 0.15])
ax2 = fig.add_axes([0.05, 0.475, 0.9, 0.15])
ax3 = fig.add_axes([0.05, 0.15, 0.9, 0.15])

# Set the colormap and norm to correspond to the data for which
# the colorbar will be used.
cmap = mpl.cm.cool
norm = mpl.colors.Normalize(vmin=5, vmax=10)

# ColorbarBase derives from ScalarMappable and puts a colorbar
# in a specified axes, so it has everything needed for a
# standalone colorbar. There are many more kwargs, but the
# following gives a basic continuous colorbar with ticks
# and labels.
cb1 = mpl.colorbar.ColorbarBase(ax1, cmap=cmap,
                                norm=norm,
                                orientation='horizontal')
cb1.set_label('Some Units')

# The second example illustrates the use of a ListedColormap, a
# BoundaryNorm, and extended ends to show the "over" and "under"
# value colors.
cmap = mpl.colors.ListedColormap(['r', 'g', 'b', 'c'])
cmap.set_over('0.25')
cmap.set_under('0.75')

# If a ListedColormap is used, the length of the bounds array must be
# one greater than the length of the color list. The bounds must be
# monotonically increasing.
bounds = [1, 2, 4, 7, 8]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
cb2 = mpl.colorbar.ColorbarBase(ax2, cmap=cmap,
                                norm=norm,
                                # to use 'extend', you must
                                # specify two extra boundaries:
                                boundaries=[0] + bounds + [13],
                                extend='both',
                                ticks=bounds,  # optional
                                spacing='proportional',
                                orientation='horizontal')
cb2.set_label('Discrete intervals, some other units')

# The third example illustrates the use of custom length colorbar
# extensions, used on a colorbar with discrete intervals.
cmap = mpl.colors.ListedColormap([[0., .4, 1.], [0., .8, 1.],
                                  [1., .8, 0.], [1., .4, 0.]])
cmap.set_over((1., 0., 0.))
cmap.set_under((0., 0., 1.))

bounds = [-1., -.5, 0., .5, 1.]
norm = mpl.colors.BoundaryNorm(bounds, cmap.N)
cb3 = mpl.colorbar.ColorbarBase(ax3, cmap=cmap,
                                norm=norm,
                                boundaries=[-10] + bounds + [10],
                                extend='both',
                                # Make the length of each extension
                                # the same as the length of the
                                # interior colors:
                                extendfrac='auto',
                                ticks=bounds,
                                spacing='uniform',
                                orientation='horizontal')
cb3.set_label('Custom extension lengths, some other units')

plt.show()
```

### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? no

Author: astroshim <hsshim@zepl.com>
Author: Lee moon soo <moon@apache.org>
Author: HyungSung <hsshim@nflabs.com>

Closes #2106 from astroshim/py4jPythonInterpreter and squashes the following commits:

c9b195b [HyungSung] Merge pull request #16 from Leemoonsoo/py4jdocker
e511ebe [Lee moon soo] add PythonDockerInterpreter to interpreter-setting.json
a76b0d8 [Lee moon soo] fix test on python3
2eb5de7 [Lee moon soo] Fix PythonDockerInterpreterTest.java test
9fcf144 [Lee moon soo] Make python docker interpreter work using py4j
8a016c9 [astroshim] Merge branch 'master' into py4jPythonInterpreter
aad7ee8 [astroshim] fix testcase
ac92cdb [astroshim] fix python interpreter testcase
e8570d2 [astroshim] fix ci for pandassql
be5db4d [astroshim] fix pandas sql testcase
f8e19be [astroshim] fix matplotlib testcase
046db88 [astroshim] add testcase
e49ad24 [astroshim] add pandas
60e9820 [astroshim] bug fix about copying library
574bd21 [astroshim] fix interpreter-setting error
a48df58 [astroshim] Merge branch 'master' into py4jPythonInterpreter
3c9585f [astroshim] update interpreter-setting.json
a50179e [astroshim] add conda interpreter
cbbc15c [astroshim] fix py4j path
5ae5120 [astroshim] fix interpreter-setting
f17bff4 [astroshim] fix testcase failure.
af097ac [astroshim] add testcase
c3f5b78 [astroshim] Merge branch 'master' of https://github.com/apache/zeppelin into py4jPythonInterpreter
1395875 [astroshim] removed unnecessary code.
276011e [astroshim] add py4j lib
7304919 [astroshim] initialize python interpreter using py4j

(cherry picked from commit 287ffd5)
Signed-off-by: Jongyoul Lee <jongyoul@apache.org>
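The first test case in the description above checks that each number appears as it is printed rather than all at once when the loop ends. As a generic illustration of that behaviour, and not the py4j mechanism this PR uses, the sketch below runs a child Python process unbuffered and yields each line as it arrives. The function name `stream_lines` and the shortened loop in `CHILD_CODE` are assumptions made for this example.

```python
import subprocess
import sys

# A shortened version of the first test script (three iterations, short sleep).
CHILD_CODE = (
    "import time\n"
    "for x in range(3):\n"
    "    print(x)\n"
    "    time.sleep(0.1)\n"
)


def stream_lines(code):
    """Run `code` in a child Python process and yield each output line as it arrives."""
    proc = subprocess.Popen(
        # -u disables the child's output buffering so lines are not held back
        [sys.executable, "-u", "-c", code],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    for received in stream_lines(CHILD_CODE):
        print("received:", received)
```

If the `-u` flag is dropped, the child may buffer its output when writing to a pipe, which is one common way streaming output appears to be broken.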
### What is this PR for?
#2106 rewrote the Python interpreter, but the dynamic form feature was not rewritten correctly.

### What type of PR is it?
Hot Fix

### Todos
* [x] - Bring dynamic form back

### What is the Jira issue?
#2106

### How should this be tested?
Run

```
%python
print("Hello "+z.input("name", "sun"))
```

```
%python
print("Hello "+z.select("day", [("1","mon"),
                                ("2","tue"),
                                ("3","wed"),
                                ("4","thurs"),
                                ("5","fri"),
                                ("6","sat"),
                                ("7","sun")]))
```

```
%python
options = [("apple","Apple"), ("banana","Banana"), ("orange","Orange")]
print("Hello "+ " and ".join(z.checkbox("fruit", options, ["apple"])))
```

### Questions:
* Do the license files need an update? no
* Are there breaking changes for older versions? no
* Does this need documentation? no

Author: Lee moon soo <moon@apache.org>

Closes #2155 from Leemoonsoo/python_get_interpreter_context and squashes the following commits:

c5e584a [Lee moon soo] fix matplotlib display error on python 3.4
3e6603b [Lee moon soo] correctly handle zeppelin.python property.
5be8db4 [Lee moon soo] Expose a method to get InterpreterOutput, so user can call InterpreterOutput.clear()
a405a93 [Lee moon soo] implement dynamic form

(cherry picked from commit 1972a58)
Signed-off-by: Lee moon soo <moon@apache.org>
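The dynamic form test paragraphs above depend on the `z` object that Zeppelin injects, so they cannot run in a plain Python session. To try the string handling outside Zeppelin, a stand-in along these lines may help. The class `ZeppelinContextStub` is hypothetical and is not Zeppelin's implementation: it renders no form and simply returns the defaults, assuming that is what a paragraph would see before the user changes anything.

```python
class ZeppelinContextStub:
    """Hypothetical stand-in for Zeppelin's `z`; renders no form, returns defaults."""

    def input(self, name, default_value=""):
        # A text box would normally appear; here the default is returned directly.
        return default_value

    def select(self, name, options, default_value=""):
        # `options` is a list of (value, display_name) pairs; only the default is used.
        return default_value

    def checkbox(self, name, options, default_checked=None):
        # Returns the list of checked values, or an empty list when none are given.
        return list(default_checked) if default_checked is not None else []


z = ZeppelinContextStub()

if __name__ == "__main__":
    print("Hello " + z.input("name", "sun"))
    options = [("apple", "Apple"), ("banana", "Banana"), ("orange", "Orange")]
    print("Hello " + " and ".join(z.checkbox("fruit", options, ["apple"])))
```

With the stub, the first and third paragraphs evaluate to "Hello sun" and "Hello apple", which matches what one would expect the real forms to show with their defaults untouched.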
What is this PR for?
I've been testing the Python interpreter and found at least four major issues in the current implementation.
So I changed the Python interpreter to use py4j, based on the PySpark interpreter, which should fix these issues.
I am going to recreate the conda and Docker interpreters for the Python interpreter ASAP.
What type of PR is it?
Bug Fix | Hot Fix | Refactoring
How should this be tested?
Questions: