Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LIVY-322. Catch JsonParseExceptions in PythonInterpreter on rawtext response from fake_shell. #304

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ghost
Copy link

@ghost ghost commented Mar 7, 2017

subprocess.call() commands in a PySpark snippet can potentially insert raw text into the sys_stdout in the fake_shell main(). This will then fail to be correctly parsed by PythonInterpreter in the sendRequest, as it will trigger a JsonParseException that is not caught. Added code to catch the JsonParseException and then retry reads of stdout until a valid line of JSON is reached, or 100 retries have been attempted.

…ially insert raw text into the sys_stdout in the fake_shell main(). This will then fail to be correctly parsed by PythonInterpreter in the sendRequest, as it will trigger a JsonParseException that is not caught. Added code to catch the JsonParseException and then retry reads of stdout until a valid line of JSON is reached, or 100 retries have been attempted.
@codecov-io
Copy link

Codecov Report

Merging #304 into master will decrease coverage by -0.21%.
The diff coverage is 20%.

@@             Coverage Diff              @@
##             master     #304      +/-   ##
============================================
- Coverage     70.43%   70.22%   -0.21%     
+ Complexity      685      683       -2     
============================================
  Files            93       93              
  Lines          4843     4850       +7     
  Branches        727      728       +1     
============================================
- Hits           3411     3406       -5     
- Misses          943      955      +12     
  Partials        489      489
Impacted Files Coverage Δ Complexity Δ
...ala/com/cloudera/livy/repl/PythonInterpreter.scala 51% <20%> (-2.52%) 1 <0> (ø)
...la/com/cloudera/livy/scalaapi/ScalaJobHandle.scala 52.94% <0%> (-2.95%) 0% <0%> (ø)
...n/java/com/cloudera/livy/rsc/driver/RSCDriver.java 77.15% <0%> (-1.3%) 39% <0%> (-1%)
...in/scala/com/cloudera/livy/server/LivyServer.scala 34% <0%> (-0.67%) 7% <0%> (ø)
.../java/com/cloudera/livy/rsc/driver/JobWrapper.java 80.64% <0%> (ø) 7% <0%> (-1%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 126b57e...1796ccb. Read the comment docs.

@jerryshao
Copy link
Contributor

@rickbernotas Can you please fix the title?

@jerryshao
Copy link
Contributor

I'm not sure if it is a thorough fix or bandaid for this specific issue, because your scenario is little different.

@zjffdu can you please take a look?

@ghost ghost changed the title LIVY-322 - subprocess.call() commands in a PySpark snippet can potent… LIVY-322 - Catch JsonParseExceptions in PythonInterpreter on rawtext response from fake_shell Mar 8, 2017
@ghost
Copy link
Author

ghost commented Mar 8, 2017

Title updated.

@alex-the-man alex-the-man changed the title LIVY-322 - Catch JsonParseExceptions in PythonInterpreter on rawtext response from fake_shell LIVY-322. Catch JsonParseExceptions in PythonInterpreter on rawtext response from fake_shell. Mar 10, 2017
@alex-the-man
Copy link
Contributor

I can't help but think our current implementation in fake_shell.py should be revamped. I don't think fake_shell.py should use stdout/stderr to communicate with livy-sever.

@jerryshao @zjffdu Do you 2 happen to know how Zeppelin handles this?

@alex-the-man
Copy link
Contributor

alex-the-man commented Mar 15, 2017

@rickbernotas I'm so sorry for the late reply.
Could you educate me on the issue? Do processes started by subprocess.call() inherit the original stdin/out/error? Not the fake one created by fake_shell.py?

If the answer is yes, I think we can fix this problem in another way:

  • Modify PythonInterpreter to create a temp input/output files or 2 sets of mkfifo. Pass their path to ProcessBuilder as a command line arugment.
  • Modify fake_shell.py to use the temp input/output files instead of stdin/stdout.
  • Modify ProcessInterpreter to send requests and read output from temp input/output files rather than stdin/stdout.

@ghost
Copy link
Author

ghost commented Mar 15, 2017

@alex-the-man I suspect that is what is happening, although I don't know for sure the exact behavior of subprocess.call(). subprocess.call() will return a result (return code like 0 or 1 based on success or failure of the command) but then also starts dumping output into stdout. The problem is that output from subprocess.call() ends up in this sys_stdout without ever getting caught and parsed to JSON:

https://github.com/cloudera/livy/blob/master/repl/src/main/resources/fake_shell.py#L631

...so you end up with non-JSON formatted rawtext in the flushed response to the PythonInterpreter.

The problem we were having was specific to "hadoop fs -rm" subprocess calls on Hadoop 2.7. In Hadoop 2.7 the response from those commands includes control characters in the output (like tabs, newlines, etc) that I also suspect might have been a contributing factor. If you do the same thing on Hadoop 2.8 (which does not have control chars in the response to hadoop fs commands) the subprocess.call() works fine with Livy. It is as though the combination of how subprocess.call() handles stdout, and the inclusion of control characters in the stdout, creates the problem with the response. I didn't have time to dig deeper on the issue with the stdout in the fake_shell, which is why I ended up just catching the JsonParseException in the PythonInterpreter.

@alex-the-man
Copy link
Contributor

Would you mind sharing how are you calling subprocess.call()?

@ghost
Copy link
Author

ghost commented Mar 16, 2017

Statement 1 (works fine):

import subprocess

Statement 2 (works fine):

print(1)

Statement 3 (works fine):

subprocess.call(["hadoop", "fs", "-touchz", "foo.tmp"])
print(subprocess.check_output(["hadoop", "fs", "-ls", "foo.tmp"]))

Statement 4 (JsonParseException):

subprocess.call(["hadoop", "fs", "-rm", "foo.tmp"])

Statement 5 (fails to return 1, instead returns nothing, so the responses are messed up going forward after statement 4):

print(1)


This is only reproducible with Hadoop 2.7 due to the formatting of the response from hadoop fs -rm in Hadoop 2.7 (which includes control characters like tab and newline). In Hadoop 2.8 the formatting of the response from hadoop fs commands changed and you will not get the JsonParseException. Note that the hadoop fs -touchz also works fine on Hadoop 2.7, but that response is not breaking the fake_shell response like the -rm is. subprocess.check_output() also works as expected, i.e. if you give the hadoop fs -rm to subprocess.check_output() on Hadoop 2.7, it's okay...but the whole response from subprocess.check_output() is returned as the result as a string, which is different than the behavior of subprocess.call() (which returns a return code as the result, and then dumps the text reponse straight into stdout).

Theoretically, however, any command run by subprocess.call() that dumps text output straight to stdout, and also includes control characters in the output, will break the parser in the same fashion.

@alex-the-man
Copy link
Contributor

Can you change your code to the following just to root cause the problem?

subprocess.call(["hadoop", "fs", "-touchz", "foo.tmp"], stdout=open("out", "w"), stderr=open("err", "w"))

@ghost
Copy link
Author

ghost commented Mar 17, 2017

Yes, that works, the problematic line was the hadoop fs -rm, but it does work with that change to the call() and returns 0 as result. No JsonParseException seen.

subprocess.call(["hadoop", "fs", "-rm", "foo.tmp"], stdout=open("out", "w"), stderr=open("err", "w"))

@alex-the-man
Copy link
Contributor

@rickbernotas Cool. Processes started by subprocess.call() inherit the original stdin/out/error then.

Is this workaround good enough for now?

@ghost
Copy link
Author

ghost commented Mar 21, 2017

Yes, the workarounds are sufficient for us, we were also using subprocess.check_output alternatively (as it is also a workaround). I'm leaving the contents of the PR in place on my end just because it minimally catches users who go ahead and use subprocess.call anyways, and prevents the JsonParseException. But if there is a change in how stdin/out/err is handled by the fake_shell in the future I'll definitely be interested in that.

@alex-the-man
Copy link
Contributor

alex-the-man commented Mar 21, 2017

I think the proper fix is:

  • Modify PythonInterpreter to create a temp input/output files or 2 sets of mkfifo. Pass their path to ProcessBuilder as a command line arugment.
  • Modify fake_shell.py to use the temp input/output files instead of stdin/stdout.
  • Modify ProcessInterpreter to send requests and read output from temp input/output files rather than stdin/stdout.

If you are interested, I will do my best to help you on this :).

@johnchenghk01
Copy link

Any update on this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants