Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZEPPELIN-2685. Improvement on Interpreter class #2592

Closed
wants to merge 2 commits into from

Conversation

zjffdu
Copy link
Contributor

@zjffdu zjffdu commented Sep 19, 2017

What is this PR for?

Follow up of #2577. Main changes on Interpreter

  • Add throw InterpreterException which is checked exception for the abstract methods of Interpreter, this would enforce the interpreter implementation to throw InterpreterException.

  • field name refactoring.

    • property -> properties
    • getProperty() --> getProperties()
  • Introduce launcher layer for interpreter launching. Currently we only use shell script to launch interpreter, but it could be any other service or component to launch interpreter, such as livy server , other 3rd party tools or even we may create a separate module for interpreter launcher

    • abstract cass InterpreterLauncher
    • For now, only 2 implementation: ShellScriptLauncher & SparkInterpreterLauncher. We could add method in class Interpreter to allow interpreter to specify its own launcher class, but it could be future work.

What type of PR is it?

[Improvement | Refactoring]

Todos

  • - Task

What is the Jira issue?

How should this be tested?

Unit test is covered. ShellScriptLauncherTest & SparkInterpreterLauncherTest

Screenshots (if appropriate)

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

@zjffdu zjffdu force-pushed the ZEPPELIN-2685 branch 8 times, most recently from 2421b85 to 3ce8544 Compare September 21, 2017 11:10
@zjffdu
Copy link
Contributor Author

zjffdu commented Sep 21, 2017

@Leemoonsoo @jongyoul @felixcheung Could you help review it ? Thanks. It is followup of #2577

@jongyoul
Copy link
Member

Launcher concept looks good to me.

Copy link
Member

@felixcheung felixcheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two comments:

  • IMO it's useful to have a interpreter.sh script for interpreters. Even for Spark - I've seen a lot of cases where people are injecting dependencies, CLASS_PATH etc with it.
  • Looks like a few of these changes will break the interpreter interface - for custom interpreters not committed or released with Zeppelin, doesn't this break all of them? I have seen a few such interpreters having their own git repo and getting installed as jar or similar.

try {
repl.cancel(getInterpreterContextWithoutRunner(null));
} catch (InterpreterException e) {
throw new RuntimeException(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why rewrap the exception here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because InterpreterException is checked exception. If I throw it, I have to change signature of the caller method, this would cause lots of code change. In order to keep the PR small, I just throw it as RuntimeException.

try {
notebook.getInterpreterSettingManager().restart(setting.getId());
} catch (InterpreterException e) {
logger.warn("Fail to resetart interpreter: " + setting.getId(), e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resetart -> restart

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be an error...

} else if (master.equals("yarn-cluster")) {
return "cluster";
} else if (master.startsWith("local")) {
return "client";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is "local" == "client"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this never going to hit? isYarnMode is true before calling this method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's only 2 valid value for spark.submit.deployMode (client or cluster). It is client for local mode IIUC.

if (master == null) {
master = properties.getProperty("spark.master");
if (master == null) {
master = System.getenv("SPARK_HOME");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait? master != SPARK_HOME?!?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my mistake. will fix it

setupPropertiesForPySpark(sparkProperties);
setupPropertiesForSparkR(sparkProperties, properties.getProperty("SPARK_HOME"));
if (isYarnMode() && getDeployMode().equals("cluster")) {
env.put("SPARK_YARN_CLUSTER", "true");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking more about this, I think we should avoid Zeppelin specific env with a SPARK_ prefix, to avoid potential conflict

int connectTimeout =
zConf.getInt(ZeppelinConfiguration.ConfVars.ZEPPELIN_INTERPRETER_CONNECT_TIMEOUT);
if (option.isExistingProcess()) {
// TODO(zjffdu) remove the existing process approach seems no one is using this.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is no one using existing process?

Copy link
Contributor Author

@zjffdu zjffdu Sep 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this todo, as someone is using the existing process for restarting scenario [ZEPPELIN-2879]

@@ -353,7 +353,7 @@ private DepInterpreter getDepInterpreter() {
public boolean isYarnMode() {
String master = getProperty("master");
if (master == null) {
master = getProperty().getProperty("spark.master", "local[*]");
master = getProperty("spark.master", "local[*]");
}
return master.startsWith("yarn");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this be refactored to share code with SparkInterpreterLauncher.java?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are in different module (spark, zeppelin-zengine). so I am afraid there's no chance to reuse code.

@zjffdu
Copy link
Contributor Author

zjffdu commented Sep 22, 2017

Looks like a few of these changes will break the interpreter interface - for custom interpreters not committed or released with Zeppelin, doesn't this break all of them? I have seen a few such interpreters having their own git repo and getting installed as jar or similar.

It would not break the existing interpreter. I introduce Launcher concept to allow customize interpreter to launch. But for now all the interpreter still use interpreter.sh to launch it even for spark. But I have a plan to create a specific script to launch spark, because there's lots of specific logic for spark interpreter.

Looks like a few of these changes will break the interpreter interface - for custom interpreters not committed or released with Zeppelin, doesn't this break all of them? I have seen a few such interpreters having their own git repo and getting installed as jar or similar.

Yes, it would break custom interpreter not committed with zeppelin. But it should be very easy to update the code.

Copy link
Member

@felixcheung felixcheung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public void open() throws InterpreterException will break the signature of any inheriting classes right?

@zjffdu
Copy link
Contributor Author

zjffdu commented Oct 10, 2017

@Leemoonsoo Do you have any concern about the compatibility break in this PR ?

  • Add throw InterpreterException which is checked exception for the abstract methods of Interpreter, this would enforce the interpreter implementation to throw InterpreterException.

  • field name refactoring.
    ** property -> properties
    ** getProperty() --> getProperties()

@Leemoonsoo
Copy link
Member

@zjffdu I don't have much concern about compatibility break you have mentioned. Left few comments, please check.

@@ -626,6 +640,7 @@ public void clearNoteIdAndParaMap() {
}
return interpreters;
}
<<<<<<< 0c64d9ca676e48a749db9879fa3cebc06eb78b54
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is unexpected line

@@ -266,6 +272,14 @@ public InterpreterSetting(InterpreterSetting o) {
this.conf = o.getConf();
}

private void createLauncher() {
if (group.equals("spark")) {
this.launcher = new SparkInterpreterLauncher(this.conf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zjffdu Do you think it's possible to make this part more generic? For example, read class name of launcher from interpreter-setting.json and instantiate it using reflection. So any interpreter can define their own launcher without changing zeppelin core modules?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah,I think about it before. But it may involve more work. Because I don't want to make interpreter jar mix with zeppelin server jar. Although I can use URLClassLoader, still feel a little risky to do it right now because Interpreter may not be instantiated correctly in zeppelin server process if it has some other dependencies, it requires interpreter do initialization in method open instead of its constructor.

@zjffdu
Copy link
Contributor Author

zjffdu commented Oct 13, 2017

Will merge it if no more discussion.

@asfgit asfgit closed this in 9812e26 Oct 14, 2017
jongyoul pushed a commit to jongyoul/zeppelin that referenced this pull request Oct 24, 2017
### What is this PR for?
Fix build interpreters (apark and R) which broken after apache#2592 and
apache#2596

### What type of PR is it?
[Hot Fix]

### How should this be tested?
build interpreters

### Questions:
* Does the licenses files need update? no
* Is there breaking changes for older versions? no
* Does this needs documentation? no

Author: tinkoff-dwh <tinkoff.dwh@gmail.com>

Closes apache#2630 from tinkoff-dwh/fix_r_interpretator and squashes the following commits:

0c9828b [tinkoff-dwh] [HOTFIX] fix build spark and R interpreters
asfgit pushed a commit that referenced this pull request Oct 26, 2017
### What is this PR for?

This is for the bug hotfix introduced in #2592 . The issue is that new interpreter created can not run properly because the incorrect interpreter dir. Thanks tinkoff-dwh for reporting this issue.

### What type of PR is it?
[ Hot Fix]

### Todos
* [ ] - Task

### How should this be tested?
* First time? Setup Travis CI as described on https://zeppelin.apache.org/contribution/contributions.html#continuous-integration
* Strongly recommended: add automated unit tests for any new or changed behavior
* Outline any manual steps to test the PR here.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Jeff Zhang <zjffdu@apache.org>

Closes #2632 from zjffdu/HotFix_Interpreter and squashes the following commits:

2d7ab03 [Jeff Zhang] [HotFix] - Incorrect interpreter dir
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants