[ZEPPELIN-2582][DOCS] docs for interpreter binding modes #2437
CI failed, but the failure is unrelated to this change.
Thanks for doing this. I looked at the screenshots above. The diagrams are very helpful! Are you also going to add the table I created at https://issues.apache.org/jira/browse/ZEPPELIN-2582?focusedCommentId=16024796&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16024796 or do you think it doesn't add anything useful?
@cacti77 Yes, it would be a good summary. Let me add it.
CI failed, but the failure is unrelated to this change.
It would be better to mention which dimension these modes apply to (per user or per note).
@zjffdu Thanks. Let me add the description to the doc.
@zjffdu I enhanced the description according to your suggestion. Could you check it?
</div>
<br/>

In Scoped mode, Zeppelin still runs single interpreter JVM process but multiple Interpreter Group serve each Note.
Zeppelin still runs single interpreter JVM process but multiple Interpreter Group serve each Note.
This is only correct when scoped mode is set per note. As in my other comment, we should mention which dimension the scope applies to, and what Per User and Per Note mean. The same goes for isolated mode.
@zjffdu Thanks for the review. I emphasized per note in the description and added the code perspective to the isolated mode as well. Now it looks like
CI failed, but the failure is unrelated to this change.
Is this correct? AFAIK, only the SparkContext is shared in scoped mode. A variable defined in session 1 cannot be used in session 2.
Sorry, I think we can remove this until we find a better way to explain it. This might confuse users too.
@zjffdu Regarding that, I have a question as well.
We can, but this is via
Yes, correct. Except
@jongyoul Thanks! I fixed the incorrect description. Now it looks like
Thanks for the review. @felixcheung Resolved all comments.
Mode | Each notebook... | Benefits | Disadvantages | Sharing objects
--- | --- | --- | --- | ---
**shared** | Shares a single sessions in a single interpreter process (JVM) | Low resource utilization and Easy to share data between notebooks | All notebooks are affected if Interpreter Process dies | can share directly
**scoped** | Has its own note sessions in the same Interpreter Process (JVM) | Less resource utilization than isolated mode | All notebooks are affected if Interpreter Process dies | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
scoped | Has its own session in the same interpreter process (JVM) | Less resource utilization than isolated mode | All notebooks are affected if the interpreter process dies | Can't share directly, but it's possible to share objects via ResourcePool
Fixed.
--- | --- | --- | --- | ---
**shared** | Shares a single sessions in a single interpreter process (JVM) | Low resource utilization and Easy to share data between notebooks | All notebooks are affected if Interpreter Process dies | can share directly
**scoped** | Has its own note sessions in the same Interpreter Process (JVM) | Less resource utilization than isolated mode | All notebooks are affected if Interpreter Process dies | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
**isolated** | Has its own Interpreter Process | One notebook not affected directly by other notebooks (**per note**) | Can't share data between notebooks easily (**per note**) | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
isolated | Has its own Interpreter Process | One notebook is not affected directly by other notebooks (per note) | Can't share data between notebooks easily (per note) | Can't share directly, but it's possible to share objects via ResourcePool
Fixed.
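For readers following the table rows discussed here, this is a minimal sketch of how the three modes could map notes onto interpreter processes and sessions under the per note scope. It is a toy model, not Zeppelin code: `Placement`, `placement`, and the `process-…`/`session-…` labels are hypothetical names chosen for illustration.

```python
from collections import namedtuple

# Hypothetical illustration only: none of these names come from Zeppelin.
Placement = namedtuple("Placement", ["process", "session"])


def placement(mode, note_id):
    """Return the toy process and session that serve a note (per note scope)."""
    if mode == "shared":
        # One process and one session for every note.
        return Placement("process-0", "session-0")
    if mode == "scoped":
        # One process, but a dedicated session for each note.
        return Placement("process-0", f"session-{note_id}")
    if mode == "isolated":
        # A dedicated process (and therefore session) for each note.
        return Placement(f"process-{note_id}", f"session-{note_id}")
    raise ValueError(f"unknown mode: {mode}")


for mode in ("shared", "scoped", "isolated"):
    print(mode, placement(mode, "A"), placement(mode, "B"))
```

Comparing the two placements printed for each mode shows the trade-off in the table: shared notes collide on everything, scoped notes share only the process, and isolated notes share nothing.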
<br/>

Interpreter is a JVM process that communicates with Zeppelin daemon using thrift.
Each Interpreter process can have a interpreter group, and each interpreter instance belongs to this interpreter group.
Each interpreter process has a single interpreter group, and this interpreter group can have one or more instances of an interpreter.
Fixed.
(See [here](../../development/writing_zeppelin_interpreter.html) to understand more about its internal structure.)

Zeppelin provides 3 different modes to run interpreter process: **shared**, **scoped** and **isolated**.
Also, user can specify the scope of these mode as well: **per user** or **per note**.
Also, the user can specify the scope of these modes: per user or per note.
Fixed.
Also, user can specify the scope of these mode as well: **per user** or **per note**.
These 3 modes give flexibility to fit Zeppelin into any type of use cases.

In this documentation, we mainly discuss the combination of **per note** mode with **shared**, **scoped** and **isolated** modes for explanation.
In this documentation, we mainly discuss the per note scope in combination with the shared, scoped and isolated modes.
Fixed.
<br/>

In Scoped mode, Zeppelin still runs single interpreter JVM process but multiple sessions serve each note. (in case of **per note**)
So, each note have their own dedicated session. (but still possible to share objects via [ResourcePool](../../interpreter/spark.html#object-exchange))
In Scoped mode, Zeppelin still runs a single interpreter JVM process but, in the case of per note scope, each note runs in its own dedicated session. (Note it is still possible to share objects between these notes via ResourcePool.)
Fixed.
<br/>

**Isolated** mode runs separate interpreter process for each note. (in case of **per note**)
So, each note have absolutely isolated session. (but still possible to share objects via [ResourcePool](../../interpreter/spark.html#object-exchange))
Isolated mode runs a separate interpreter process for each note in the case of per note scope.
So, each note has an absolutely isolated session. (But it is still possible to share objects via ResourcePool.)
Fixed.
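The point made in this thread, that sessions do not see each other's variables but can still exchange objects through a shared pool, can be sketched as follows. This is a toy model under stated assumptions: `ToyResourcePool` and `ToySession` are hypothetical classes and do not reflect Zeppelin's actual ResourcePool API.

```python
class ToyResourcePool:
    """Hypothetical stand-in for a pool that outlives individual sessions."""

    def __init__(self):
        self._objects = {}

    def put(self, name, value):
        self._objects[name] = value

    def get(self, name):
        # Returns None when nothing was stored under this name.
        return self._objects.get(name)


class ToySession:
    """Hypothetical per-note session with its own private variables."""

    def __init__(self, pool):
        self.variables = {}
        self.pool = pool


pool = ToyResourcePool()
note_a = ToySession(pool)
note_b = ToySession(pool)

# A variable defined in note A is invisible to note B.
note_a.variables["numbers"] = [1, 2, 3]
print("numbers" in note_b.variables)  # prints False

# Publishing it to the pool makes it reachable from note B.
note_a.pool.put("numbers", note_a.variables["numbers"])
print(note_b.pool.get("numbers"))  # prints [1, 2, 3]
```

The explicit `put` and `get` steps are the design point: sharing is opt-in, so a note cannot accidentally overwrite another note's state.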

Mode | Each notebook... | Benefits | Disadvantages | Sharing objects
--- | --- | --- | --- | ---
**shared** | Shares a single sessions in a single interpreter process (JVM) | Low resource utilization and Easy to share data between notebooks | All notebooks are affected if Interpreter Process dies | can share directly
shared | Shares a single session in a single interpreter process (JVM) | Low resource utilization and it's easy to share data between notebooks | All notebooks are affected if the interpreter process dies | Can share directly
Fixed.
**scoped** | Has its own note sessions in the same Interpreter Process (JVM) | Less resource utilization than isolated mode | All notebooks are affected if Interpreter Process dies | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
**isolated** | Has its own Interpreter Process | One notebook not affected directly by other notebooks (**per note**) | Can't share data between notebooks easily (**per note**) | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))

In case of **per user** (available on multi-user environment), Zeppelin manages interpreter sessions per user. For example,
In the case of the per user scope (available in a multi-user environment), Zeppelin manages interpreter sessions on a per user basis rather than a per note basis. For example:
Fixed.
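The difference between the per user and per note scopes described in this thread comes down to which identifier a session is keyed on. A minimal sketch, assuming scoped mode and using hypothetical helper names (`session_key`, `count_sessions`) and made-up users and notes:

```python
def session_key(scope, user, note_id):
    """Pick the identifier that a toy session is keyed on."""
    if scope == "per user":
        return user
    if scope == "per note":
        return note_id
    raise ValueError(f"unknown scope: {scope}")


def count_sessions(scope, runs):
    """Count distinct toy sessions needed for a list of (user, note) runs."""
    return len({session_key(scope, user, note_id) for user, note_id in runs})


# Hypothetical activity: alice works in three notes, bob opens one of them.
runs = [
    ("alice", "note1"),
    ("alice", "note2"),
    ("alice", "note3"),
    ("bob", "note1"),
]

print(count_sessions("per user", runs))  # prints 2
print(count_sessions("per note", runs))  # prints 3
```

With the per user scope, all of alice's notes land in one session, while with the per note scope alice and bob end up in the same session whenever they run the same note.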
<br/>

Each Interpreter implementation may have different characteristics depending on the back end system that they integrate. And 3 interpreter modes can be used differently.
Let’s take a look how Spark Interpreter implementation uses these 3 interpreter modes with **per note** mdoe, as an example.
Let’s take a look how Spark Interpreter implementation uses these 3 interpreter modes with per note mode, as an example.
Fixed.

In Scoped mode, each note has its own Scala REPL.
So variable defined in a note can not be read or overridden in another note.
However, still single SparkContext serves all the sessions.
However, a single SparkContext still serves all the sessions.
Fixed.
In Scoped mode, each note has its own Scala REPL.
So variable defined in a note can not be read or overridden in another note.
However, still single SparkContext serves all the sessions.
And all the jobs are submitted to this SparkContext and fair scheduler schedules the job.
And all the jobs are submitted to this SparkContext and the fair scheduler schedules the jobs.
Fixed.
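The scoped-mode behavior discussed in this thread (a separate REPL per note, one context serving all of them) can be imitated in plain Python. This is only an analogy: `ToyContext` stands in for the shared SparkContext and `ToyRepl` for a per-note Scala REPL; both are hypothetical, and the fair scheduler is not modeled.

```python
class ToyContext:
    """Hypothetical stand-in for a shared context; it only records jobs."""

    def __init__(self):
        self.jobs = []

    def submit(self, note_id, description):
        self.jobs.append((note_id, description))


class ToyRepl:
    """Hypothetical per-note REPL: private namespace, shared context."""

    def __init__(self, note_id, context):
        self.namespace = {"sc": context, "note_id": note_id}

    def run(self, source):
        # Each REPL evaluates code against its own namespace only.
        exec(source, self.namespace)


sc = ToyContext()
repl_a = ToyRepl("A", sc)
repl_b = ToyRepl("B", sc)

repl_a.run("x = 42\nsc.submit(note_id, 'count rows')")
repl_b.run("sc.submit(note_id, 'sum column')")

# A variable defined in note A cannot be read from note B.
try:
    repl_b.run("print(x)")
except NameError as err:
    print("note B cannot see x:", err)

# Both notes submitted their jobs to the same context.
print(sc.jobs)
```

Both jobs end up in the single `jobs` list even though `x` exists only in note A's namespace, which mirrors the "own REPL, shared SparkContext" description above.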
@1ambda Thanks for finishing the diagrams. I would probably put the SparkContext inside the interpreter process but outside the InterpreterGroup; but if @zjffdu is happy with the diagrams I guess it doesn't matter. I've gone through interpreter_binding_mode.md, tidying up the English some more. I'm done with my edits now, thanks.
Apologies @1ambda for not interacting with this PR properly; I may have overlooked some of your recent edits in the version of interpreter_binding_mode.md I commented on yesterday.
Resolved all of @cacti77's comments.

<br/>

Interpreter is a JVM process that communicates with Zeppelin daemon using thrift.
Interpreter -> Interpreter Process
fixed.
Resolved @zjffdu's comment.
Please let me know if we need to add / improve / modify something.
LGTM
@1ambda And it looks good to me too; thanks for being patient and making all those changes!
It was an honor to make the documentation more correct and concise with your help. Merge if there is no more discussion.
@1ambda One question please: Do these docs constitute a new web page in Zeppelin's docs? I'm just wondering how a user would navigate to this new information on binding modes from the Docs home page. Will the new page be linked to from under http://zeppelin.apache.org/docs/0.7.2/manual/interpreters.html#interpreter-binding-mode for example? Or will it become part of the interpreters.html page itself (i.e., Interpreter > Overview)?
@cacti77 It's provided as of 0.8.0-SNAPSHOT. Here is the background
Thanks @1ambda. So, looking at #2371, it looks like a user would simply click on the dropdown and then Interpreter > Interpreter Binding Mode to access the new web page. Is that correct? Basically, as long as all the info about interpreter binding modes is in one place and easy to find, that's all I care about :).
Sure, users can easily access the doc.
What is this PR for?
Updated interpreter_binding_mode.md since users are sometimes confused about what these modes mean, and there is already an open JIRA issue. This documentation will be helpful to Zeppelin users.
Disclaimer: content was copied from here with the author's consent.
What type of PR is it?
[Documentation]
Todos
DONE
What is the Jira issue?
ZEPPELIN-2582
How should this be tested?
docs/
Screenshots (if appropriate)
Questions: