
ZEPPELIN-1197. Should print output directly without invoking function print in pyspark interpreter #1232

Closed
wants to merge 2 commits into from

zjffdu
Contributor

@zjffdu zjffdu commented Jul 27, 2016

What is this PR for?

Currently, users must call print explicitly for output to be displayed in the notebook. This behavior is unnatural and inconsistent with other notebooks. This PR makes the pyspark interpreter in Zeppelin behave the same way as other notebooks. There are two main changes:

  • Use single mode to compile the last statement, so that the evaluation result of the last statement is printed to stdout. This is consistent with other notebooks (such as Jupyter).
  • Make SparkOutputStream extend LogOutputStream so that we can see the output of the inner process (Python/R), which is helpful for diagnosis.
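
The first change can be illustrated with a minimal, standalone Python 3 sketch. It is not the code from this PR: the helper names `run_paragraph` and `run_and_capture` are hypothetical, and the sketch only assumes that a paragraph is split into statements with `ast`, that all statements but the last are compiled in `exec` mode, and that the last one is compiled in `single` mode so its value goes through `sys.displayhook`.

```python
import ast
import contextlib
import io


def run_paragraph(source, namespace=None):
    """Execute source; echo the last statement's value if it is an expression.

    Hypothetical helper for illustration, not the interpreter's actual code.
    """
    namespace = {} if namespace is None else namespace
    tree = ast.parse(source)
    if not tree.body:
        return namespace  # empty paragraph, nothing to run
    *head, last = tree.body
    if head:
        # Everything before the last statement runs in "exec" mode,
        # which never echoes expression values.
        module = ast.Module(body=head, type_ignores=[])
        exec(compile(module, "<paragraph>", "exec"), namespace)
    # The last statement is compiled in "single" mode, so an expression
    # statement is passed to sys.displayhook, which prints its repr.
    interactive = ast.Interactive(body=[last])
    exec(compile(interactive, "<paragraph>", "single"), namespace)
    return namespace


def run_and_capture(source, namespace=None):
    """Run a paragraph and return whatever it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        run_paragraph(source, namespace)
    return buffer.getvalue()
```

With a stand-in variable `version` in place of `sc.version`, `run_and_capture("1+1\nversion", {"version": "1.6.1"})` returns only the repr of the last expression, and an assignment as the last statement produces no output. Note that in single mode, expression statements nested in a top-level compound statement (for example the body of a `for` loop) are echoed as well, as in the native Python REPL.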

What type of PR is it?

[Bug Fix]

What is the Jira issue?

  • https://issues.apache.org/jira/browse/ZEPPELIN-1197

How should this be tested?

Tested manually. Enter the following text in a pyspark paragraph:

1+1
sc.version

The output is:

u'1.6.1'

Questions:

  • Do the license files need an update? No
  • Are there breaking changes for older versions? Users no longer need to call print explicitly.
  • Does this need documentation? Yes

@Leemoonsoo
Member

Thanks @zjffdu for the great improvement.

I have tested it a bit and I see some inconsistent behavior.

[screenshot]

Is this the expected result?

@zjffdu
Contributor Author

zjffdu commented Jul 28, 2016

@Leemoonsoo Thanks for the careful check. I compared it with Jupyter, and only the second case is different. Let me investigate how to fix it.

@zjffdu
Contributor Author

zjffdu commented Jul 28, 2016

I also compared it with the native Python REPL, and the second case is consistent with it. So I think this behavior is fine, although it differs from Jupyter.
[screenshot: 2016-07-29_0746]

@Leemoonsoo
Member

Thank you for the explanation. LGTM

@Leemoonsoo
Member

Merging into master if there is no further discussion.

@asfgit asfgit closed this in b885f43 Aug 3, 2016
@Leemoonsoo
Member

@zjffdu @bzz How about bringing this change to the python interpreter as well?

@zjffdu
Contributor Author

zjffdu commented Aug 3, 2016

Sure, let me do it for the python interpreter as well.

@zjffdu
Contributor Author

zjffdu commented Aug 4, 2016

I just took a look at the python interpreter. It works differently from the pyspark interpreter, so I might need more time to investigate.

PhilippGrulich pushed a commit to SWC-SENSE/zeppelin that referenced this pull request Aug 8, 2016
… print in pyspark interpreter

### What is this PR for?
Currently, users must call print explicitly for output to be displayed in the notebook. This behavior is unnatural and inconsistent with other notebooks. This PR makes the pyspark interpreter in Zeppelin behave the same way as other notebooks. There are two main changes:
* Use single mode to compile the last statement, so that the evaluation result of the last statement is printed to stdout. This is consistent with other notebooks (such as Jupyter).
* Make SparkOutputStream extend LogOutputStream so that we can see the output of the inner process (Python/R), which is helpful for diagnosis.

### What type of PR is it?
[Bug Fix]

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-1197

### How should this be tested?
Tested manually. Enter the following text in a pyspark paragraph:
```
1+1
sc.version
```
The output is:
```
u'1.6.1'
```

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? Users no longer need to call print explicitly.
* Does this need documentation? Yes

Author: Jeff Zhang <zjffdu@apache.org>

Closes apache#1232 from zjffdu/ZEPPELIN-1197 and squashes the following commits:

3771245 [Jeff Zhang] fix and add test
10182e6 [Jeff Zhang] ZEPPELIN-1197. Should print output directly without invoking function print in pyspark interpreter
asfgit pushed a commit that referenced this pull request Aug 10, 2016
… print in pyspark interpreter

Author: Jeff Zhang <zjffdu@apache.org>

Closes #1232 from zjffdu/ZEPPELIN-1197 and squashes the following commits:

3771245 [Jeff Zhang] fix and add test
10182e6 [Jeff Zhang] ZEPPELIN-1197. Should print output directly without invoking function print in pyspark interpreter

(cherry picked from commit b885f43)
Signed-off-by: Mina Lee <minalee@apache.org>