[ZEPPELIN-2582][DOCS] docs for interpreter binding modes #2437

Closed

1ambda (Member) commented Jun 26, 2017

What is this PR for?

Updated interpreter_binding_mode.md because users are sometimes confused about what these modes mean, and there is already an open JIRA issue about it. This documentation should be helpful to Zeppelin users.

disclaimer: content was copied from here with author's consent.

What type of PR is it?

[Documentation]

Todos

DONE

What is the Jira issue?

ZEPPELIN-2582

How should this be tested?

  1. setup rvm 2.1.0+ and install required bundles for docs/
  2. bundle exec jekyll serve --watch
  3. http://localhost:4000/usage/interpreter/interpreter_binding_mode.html

Screenshots (if appropriate)

[6 screenshots]

Questions:

  • Do the license files need an update? - NO
  • Are there breaking changes for older versions? - NO
  • Does this need documentation? - This PR is about docs.

1ambda (Member Author) commented Jun 26, 2017

CI failed, but the failure is unrelated to this change.

cacti77 commented Jun 26, 2017

Thanks for doing this. I looked at the screenshots above. The diagrams are very helpful! Are you also going to add the table I created at https://issues.apache.org/jira/browse/ZEPPELIN-2582?focusedCommentId=16024796&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16024796 or do you think it doesn't add anything useful?

1ambda (Member Author) commented Jun 26, 2017

@cacti77 Yes, it would be a good summary. Let me add it.

1ambda (Member Author) commented Jun 27, 2017

I added your summary table.

image

1ambda (Member Author) commented Jun 27, 2017

CI failed, but the failure is unrelated to this change.

zjffdu (Contributor) commented Jun 27, 2017

It would be better to mention the dimension in which these modes apply (per user or per note).
Also, from the code perspective, there is only one InterpreterGroup in scoped mode, but multiple sessions within that InterpreterGroup.

1ambda (Member Author) commented Jun 27, 2017

@zjffdu Thanks. Let me add the description into the doc.

1ambda closed this Jun 27, 2017
1ambda reopened this Jun 27, 2017

1ambda (Member Author) commented Jun 27, 2017

@zjffdu I enhanced the description according to your suggestion. Could you check it?

image

image

</div>
<br/>

In Scoped mode, Zeppelin still runs single interpreter JVM process but multiple Interpreter Group serve each Note.
Contributor:

Zeppelin still runs single interpreter JVM process but multiple Interpreter Group serve each Note.

This is only correct when scoped mode is set per note. As mentioned in another comment, we should state the dimension in which the scope applies, and what Per User and Per Note mean. The same goes for isolated mode.

Member Author:

image

Here is the description for that. If it needs to be repeated there, I will add it. Is that necessary?

1ambda (Member Author) commented Jun 27, 2017

@zjffdu Thanks for the review.

I emphasized per note in the description and added the code perspective to the isolated mode as well.

Now it looks like

image

image

1ambda (Member Author) commented Jun 28, 2017

CI failed, but the failure is unrelated to this change.

zjffdu (Contributor) commented Jun 28, 2017

So, each Note have their own dedicated session but still it’s possible to share objects between different Interpreter Groups while they’re in the same JVM process.

Is this correct? AFAIK, only the SparkContext is shared in scoped mode. A variable defined in session 1 cannot be used in session 2.

From the code perspective, there is only one InterpreterGroup for the scoped mode, but multiple sessions in one InterpreterGroup.

Sorry, I think we can remove this until we find a better way to explain it. This might confuse users too.

1ambda (Member Author) commented Jun 28, 2017

@zjffdu Regarding that, I have a question as well.

  • In isolated mode / shared mode, can't we share variables via z.put and z.get?

zjffdu (Contributor) commented Jun 28, 2017

We can, but this goes through the ResourcePool. We might need to say that objects can be shared via the resource pool; otherwise users might be confused.

1ambda (Member Author) commented Jun 28, 2017

@zjffdu Yes, that would be clearer.

@jongyoul As zjffdu said, variables (e.g. Scala variables) in scoped mode can't be shared directly. Is that correct?

jongyoul (Member):

Yes, correct. Except in shared mode, users cannot share anything directly.
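
To make the distinction in this exchange concrete, here is a minimal toy model in plain Python. It is not Zeppelin's actual ResourcePool or its `z.put`/`z.get` implementation: `ToyResourcePool` and `ToySession` are hypothetical names invented for illustration. The sketch only shows the idea discussed above, namely that session-local variables stay private while objects explicitly placed in a common pool are visible to every session.

```python
class ToyResourcePool:
    """Hypothetical stand-in for a pool of named objects shared across sessions."""

    def __init__(self):
        self._objects = {}

    def put(self, name, value):
        # Publish an object under a name so other sessions can look it up.
        self._objects[name] = value

    def get(self, name):
        # Returns None when nothing was stored under this name.
        return self._objects.get(name)


class ToySession:
    """Hypothetical interpreter session with its own private variables."""

    def __init__(self, pool):
        self.variables = {}  # private to this session
        self.pool = pool     # the same pool object is handed to every session


pool = ToyResourcePool()
session1 = ToySession(pool)
session2 = ToySession(pool)

session1.variables["x"] = 42        # defined only in session 1
session1.pool.put("shared_x", 42)   # explicitly published to the pool

print("x" in session2.variables)      # False: not visible directly
print(session2.pool.get("shared_x"))  # 42: visible through the pool
```

In this model, direct sharing fails because each session owns its own `variables` dictionary, while the pool works because both sessions hold a reference to the same object.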

1ambda (Member Author) commented Jun 28, 2017

@jongyoul Thanks!

I fixed the incorrect description. Now it looks like

image

1ambda (Member Author) commented Jul 3, 2017

Thanks for the review.

@felixcheung Resolved all comments.
@zjffdu @cacti77 Updated the images and description. Please check the last 2 commits.

Mode | Each notebook... | Benefits | Disadvantages | Sharing objects
--- | --- | --- | --- | ---
**shared** | Shares a single sessions in a single interpreter process (JVM) | Low resource utilization and Easy to share data between notebooks | All notebooks are affected if Interpreter Process dies | can share directly
**scoped** | Has its own note sessions in the same Interpreter Process (JVM) | Less resource utilization than isolated mode | All notebooks are affected if Interpreter Process dies | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
cacti77 commented Jul 3, 2017

scoped | Has its own session in the same interpreter process (JVM) | Less resource utilization than isolated mode | All notebooks are affected if the interpreter process dies | Can't share directly, but it's possible to share objects via ResourcePool

Member Author:

Fixed.

--- | --- | --- | --- | ---
**shared** | Shares a single sessions in a single interpreter process (JVM) | Low resource utilization and Easy to share data between notebooks | All notebooks are affected if Interpreter Process dies | can share directly
**scoped** | Has its own note sessions in the same Interpreter Process (JVM) | Less resource utilization than isolated mode | All notebooks are affected if Interpreter Process dies | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
**isolated** | Has its own Interpreter Process | One notebook not affected directly by other notebooks (**per note**) | Can't share data between notebooks easily (**per note**) | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
cacti77 commented Jul 3, 2017

isolated | Has its own Interpreter Process | One notebook is not affected directly by other notebooks (per note) | Can't share data between notebooks easily (per note) | Can't share directly, but it's possible to share objects via ResourcePool

Member Author:

Fixed.

<br/>

Interpreter is a JVM process that communicates with Zeppelin daemon using thrift.
Each Interpreter process can have a interpreter group, and each interpreter instance belongs to this interpreter group.

Each interpreter process has a single interpreter group, and this interpreter group can have one or more instances of an interpreter.

Member Author:

Fixed.

(See [here](../../development/writing_zeppelin_interpreter.html) to understand more about its internal structure.)

Zeppelin provides 3 different modes to run interpreter process: **shared**, **scoped** and **isolated**.
Also, user can specify the scope of these mode as well: **per user** or **per note**.
cacti77 commented Jul 3, 2017

Also, the user can specify the scope of these modes: per user or per note.

Member Author:

Fixed.

Also, user can specify the scope of these mode as well: **per user** or **per note**.
These 3 modes give flexibility to fit Zeppelin into any type of use cases.

In this documentation, we mainly discuss the combination of **per note** mode with **shared**, **scoped** and **isolated** modes for explanation.

In this documentation, we mainly discuss the per note scope in combination with the shared, scoped and isolated modes.

Member Author:

Fixed.

<br/>

In Scoped mode, Zeppelin still runs single interpreter JVM process but multiple sessions serve each note. (in case of **per note**)
So, each note have their own dedicated session. (but still possible to share objects via [ResourcePool](../../interpreter/spark.html#object-exchange))

In Scoped mode, Zeppelin still runs a single interpreter JVM process but, in the case of per note scope, each note runs in its own dedicated session. (Note it is still possible to share objects between these notes via ResourcePool.)

Member Author:

Fixed.

<br/>

**Isolated** mode runs separate interpreter process for each note. (in case of **per note**)
So, each note have absolutely isolated session. (but still possible to share objects via [ResourcePool](../../interpreter/spark.html#object-exchange))

Isolated mode runs a separate interpreter process for each note in the case of per note scope.
So, each note has an absolutely isolated session. (But it is still possible to share objects via ResourcePool.)

Member Author:

Fixed.


Mode | Each notebook... | Benefits | Disadvantages | Sharing objects
--- | --- | --- | --- | ---
**shared** | Shares a single sessions in a single interpreter process (JVM) | Low resource utilization and Easy to share data between notebooks | All notebooks are affected if Interpreter Process dies | can share directly
cacti77 commented Jul 3, 2017

shared | Shares a single session in a single interpreter process (JVM) | Low resource utilization and it's easy to share data between notebooks | All notebooks are affected if the interpreter process dies | Can share directly

Member Author:

Fixed.

**scoped** | Has its own note sessions in the same Interpreter Process (JVM) | Less resource utilization than isolated mode | All notebooks are affected if Interpreter Process dies | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
**isolated** | Has its own Interpreter Process | One notebook not affected directly by other notebooks (**per note**) | Can't share data between notebooks easily (**per note**) | can't share directly, but possible to share objets via [ResourcePool](../../interpreter/spark.html#object-exchange))
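
As a rough illustration of the table rows quoted in this hunk, the sketch below models how a (mode, scope) pair could map a note and a user to an interpreter process and a session. The function name `binding_keys` and the key strings are hypothetical and are not Zeppelin's actual identifiers or code; only the shared/scoped/isolated behavior follows the table.

```python
from typing import Tuple


def binding_keys(mode: str, scope: str, user: str, note_id: str) -> Tuple[str, str]:
    """Return a hypothetical (process_key, session_key) pair for one run.

    mode  -- "shared", "scoped" or "isolated"
    scope -- "per_note" or "per_user" (the dimension the mode applies to)
    """
    if scope not in ("per_note", "per_user"):
        raise ValueError("unknown scope: " + scope)
    # The scope decides which identity separates sessions or processes.
    owner = note_id if scope == "per_note" else user

    if mode == "shared":
        # One session in one process serves everything.
        return ("shared_process", "shared_session")
    if mode == "scoped":
        # One process, but a dedicated session per note (or per user).
        return ("shared_process", "session_" + owner)
    if mode == "isolated":
        # A dedicated process, and therefore a dedicated session.
        return ("process_" + owner, "session_" + owner)
    raise ValueError("unknown mode: " + mode)


notes = ["note1", "note2", "note3"]
for mode in ("shared", "scoped", "isolated"):
    keys = [binding_keys(mode, "per_note", "alice", n) for n in notes]
    processes = {process for process, _ in keys}
    sessions = set(keys)
    print(mode, len(processes), "process(es),", len(sessions), "session(s)")
```

For three notes in the per note scope, this model yields one process and one session for shared, one process and three sessions for scoped, and three processes with three sessions for isolated. Switching `scope` to `"per_user"` keys the same decisions on the user instead of the note.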

In case of **per user** (available on multi-user environment), Zeppelin manages interpreter sessions per user. For example,
cacti77 commented Jul 3, 2017

In the case of the per user scope (available in a multi-user environment), Zeppelin manages interpreter sessions on a per user basis rather than a per note basis. For example:

Member Author:

Fixed.

<br/>

Each Interpreter implementation may have different characteristics depending on the back end system that they integrate. And 3 interpreter modes can be used differently.
Let’s take a look how Spark Interpreter implementation uses these 3 interpreter modes with **per note** mdoe, as an example.

Let’s take a look how Spark Interpreter implementation uses these 3 interpreter modes with per note mode, as an example.

Member Author:

Fixed.


In Scoped mode, each note has its own Scala REPL.
So variable defined in a note can not be read or overridden in another note.
However, still single SparkContext serves all the sessions.

However, a single SparkContext still serves all the sessions.

Member Author:

Fixed.

In Scoped mode, each note has its own Scala REPL.
So variable defined in a note can not be read or overridden in another note.
However, still single SparkContext serves all the sessions.
And all the jobs are submitted to this SparkContext and fair scheduler schedules the job.

And all the jobs are submitted to this SparkContext and the fair scheduler schedules the jobs.

Member Author:

Fixed.
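
The scoped-mode behavior described in this hunk can be mimicked with a small Python analogy. This is not Spark or Zeppelin code: `FakeContext` is a hypothetical stand-in for the single SparkContext, and each dictionary plays the role of one note's Scala REPL. Job scheduling (the fair scheduler) is not modeled.

```python
class FakeContext:
    """Hypothetical stand-in for the one SparkContext shared by all notes."""

    def __init__(self):
        self.submitted_jobs = []

    def submit(self, note, job):
        # Every note's jobs end up in the same context.
        self.submitted_jobs.append((note, job))


sc = FakeContext()

# One namespace per note, like one REPL per note; both reference the same sc.
repl_note1 = {"sc": sc}
repl_note2 = {"sc": sc}

# Each note runs its code inside its own namespace.
exec("a = 10\nsc.submit('note1', 'count')", repl_note1)
exec("sc.submit('note2', 'collect')", repl_note2)

print("a" in repl_note1)       # True: defined in note 1
print("a" in repl_note2)       # False: not visible in the other note
print(len(sc.submitted_jobs))  # 2: both notes submitted to the same context
```

The variable `a` exists only in the first namespace, while both submissions are recorded on the one shared context object.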

cacti77 commented Jul 3, 2017

@1ambda Thanks for finishing the diagrams. I would probably put the SparkContext inside the interpreter process but outside the InterpreterGroup; but if @zjffdu is happy with the diagrams I guess it doesn't matter. I've gone through interpreter_binding_mode.md, tidying up the English some more. I'm done with my edits now, thanks.

cacti77 commented Jul 4, 2017

Apologies @1ambda for not interacting with this PR properly; I may have overlooked some of your recent edits in the version of interpreter_binding_mode.md I commented on yesterday.

1ambda (Member Author) commented Jul 5, 2017

Resolved all of @cacti77's comments.


<br/>

Interpreter is a JVM process that communicates with Zeppelin daemon using thrift.
Contributor:

Interpreter -> Interpreter Process

Member Author:

fixed.

1ambda (Member Author) commented Jul 5, 2017

resolved @zjffdu's comment

1ambda (Member Author) commented Jul 5, 2017

Please let me know if we need to add / improve / modify something.

zjffdu (Contributor) commented Jul 5, 2017

LGTM

cacti77 commented Jul 5, 2017

@1ambda And it looks good to me too; thanks for being patient and making all those changes!

1ambda (Member Author) commented Jul 5, 2017

It was an honor to make the documentation more correct and concise with your help.

Merge if no more discussion.

cacti77 commented Jul 6, 2017

@1ambda One question please: Do these docs constitute a new web page in Zeppelin's docs? I'm just wondering how a user would navigate to this new information on binding modes from the Docs home page. Will the new page be linked to from under http://zeppelin.apache.org/docs/0.7.2/manual/interpreters.html#interpreter-binding-mode for example? Or will it become part of the interpreters.html page itself (i.e., Interpreter > Overview)?

1ambda (Member Author) commented Jul 6, 2017

@cacti77 it's provided as of 0.8.0-SNAPSHOT.

here is the background

cacti77 commented Jul 6, 2017

Thanks @1ambda . So, looking at #2371, it looks like a user would simply click on the dropdown and then Interpreter > Interpreter Binding Mode to access the new web page. Is that correct? Basically, as long as all the info about interpreter binding modes is in one place and easy to find, that's all I care about:).

1ambda (Member Author) commented Jul 7, 2017

Sure, users can easily access the doc.

@asfgit asfgit closed this in 86bc933 Jul 7, 2017
1ambda (Member Author) commented Jul 7, 2017

5 participants