Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ZEPPELIN-2149] Each interpreter should have a seperate log file #2107

Closed
wants to merge 4 commits into from

Conversation

prabhjyotsingh
Copy link
Contributor

What is this PR for?

In the current implementation, both spark and spark2 uses spark interpreter, both phoenix and hive use jdbc interpreter log file.
Each interpreter should have separate log file. The same log file for multiple interpreters confuses the user and complicated debugging.

What type of PR is it?

[Improvement]

What is the Jira issue?

How should this be tested?

Create multiple interpreters (say shell interpreter) with different names (say sh1, sh2, etc.).
Now on running these different interpreters should log to different files.

Questions:

  • Does the licenses files need update? N/A
  • Is there breaking changes for older versions? N/A
  • Does this needs documentation? N/A

@@ -86,6 +89,9 @@ ZEPPELIN_SERVER=org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer
INTERPRETER_ID=$(basename "${INTERPRETER_DIR}")
ZEPPELIN_PID="${ZEPPELIN_PID_DIR}/zeppelin-interpreter-${INTERPRETER_ID}-${ZEPPELIN_IDENT_STRING}-${HOSTNAME}.pid"
ZEPPELIN_LOGFILE="${ZEPPELIN_LOG_DIR}/zeppelin-interpreter-"
if [[ ! -z "$INTERPRETER_NAME" ]]; then
ZEPPELIN_LOGFILE+="${INTERPRETER_NAME}-"
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INTERPRETER_NAME and INTERPRETER_ID are used together ? I assume the spark1 log file should be something like zeppelin-interpreter-spark1-jzhang-jzhangMBPr.local.log and spark2 log is zeppelin-interpreter-spark2-jzhang-jzhangMBPr.local.log

Copy link
Contributor

@zjffdu zjffdu Mar 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we could pass interpreter name as environment variable INTERPRETER_ID to the interpreter process, so that we just need a minor change on interpreter.sh

Copy link
Contributor Author

@prabhjyotsingh prabhjyotsingh Mar 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, even I initially thought about removing INTERPRETER_ID but then no harm in it, and I thought about the case where the user would want to tail -f *jdbc* for reading all the interpreters log created as HIVE, PHOENIX or POSTGRES, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zjffdu Any thoughts on this ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prabhjyotsingh I verified the PR, seems the log file name format is zeppelin-interpreter-<interpeter_name>-<interpreter_id>-<host>.log. But I still think we don't need to differentiate interpreter_name and interpreter_id as interperter_name should already been unique, zeppelin-interpreter-<interpeter_name>-<host>.log should be enough. The scenario you mentioned about jdbc, I don't think user would use this command tail -f *jdbc* to read all the jdbc logs.
Let's hear more comments from others @felixcheung @Leemoonsoo @jongyoul
BTW, we need to document for this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we may also need to consider to include username into log file name if isolation is enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, username does get included if impersonation in is enabled https://github.com/apache/zeppelin/blob/master/bin/interpreter.sh#L90.

@r-kamath
Copy link
Member

r-kamath commented Mar 8, 2017

Tested. LGTM.

@felixcheung
Copy link
Member

Is name the right name for this? I seem to recall we call it something else in the code for the %thing

@prabhjyotsingh
Copy link
Contributor Author

I picked up "Interpreter Name" from UI while creating new interpreter, let me rename this to InterpreterGroup.

@prabhjyotsingh
Copy link
Contributor Author

Any thoughts on this ?

@prabhjyotsingh
Copy link
Contributor Author

Merging this if no more discussion.

@asfgit asfgit closed this in 1ec6e12 Mar 15, 2017
asfgit pushed a commit that referenced this pull request Mar 22, 2017
### What is this PR for?
This fixes the typo is #2107.

### What type of PR is it?
[Bug Fix]

### How should this be tested?
Create multiple interpreters (say shell interpreter) with different names (say sh1, sh2, etc.).
Now on running these different interpreters should log to different files.

### Screenshots (if appropriate)

### Questions:
* Does the licenses files need update? N/A
* Is there breaking changes for older versions? N/A
* Does this needs documentation? N/A

Author: Prabhjyot Singh <prabhjyotsingh@gmail.com>

Closes #2170 from prabhjyotsingh/ZEPPELIN-2149_v2 and squashes the following commits:

7baf81a [Prabhjyot Singh] [ZEPPELIN-2149] correct variable name used in interpreter.sh
prabhjyotsingh added a commit to prabhjyotsingh/zeppelin that referenced this pull request Sep 1, 2017
### What is this PR for?
In the current implementation, both spark and spark2 uses spark interpreter, both phoenix and hive use jdbc interpreter log file.
Each interpreter should have separate log file. The same log file for multiple interpreters confuses the user and complicated debugging.

### What type of PR is it?
[Improvement]

### What is the Jira issue?
* [ZEPPELIN-2149](https://issues.apache.org/jira/browse/ZEPPELIN-2149)

### How should this be tested?
Create multiple interpreters (say shell interpreter) with different names (say sh1, sh2, etc.).
Now on running these different interpreters should log to different files.

### Questions:
* Does the licenses files need update? N/A
* Is there breaking changes for older versions? N/A
* Does this needs documentation? N/A

Author: Prabhjyot Singh <prabhjyotsingh@gmail.com>

Closes apache#2107 from prabhjyotsingh/ZEPPELIN-2149 and squashes the following commits:

4566037 [Prabhjyot Singh] rename interpreterName to  interpreterGroupName
a989f5e [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into ZEPPELIN-2149
0cf514f [Prabhjyot Singh] fix tests
934f274 [Prabhjyot Singh] ZEPPELIN-2149: Each interpreter should have a seperate log file

# Conflicts:
#	zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreter.java
#	zeppelin-interpreter/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcessTest.java
#	zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/InterpreterFactory.java
prabhjyotsingh added a commit to prabhjyotsingh/zeppelin that referenced this pull request Oct 23, 2017
### What is this PR for?
In the current implementation, both spark and spark2 uses spark interpreter, both phoenix and hive use jdbc interpreter log file.
Each interpreter should have separate log file. The same log file for multiple interpreters confuses the user and complicated debugging.

### What type of PR is it?
[Improvement]

### What is the Jira issue?
* [ZEPPELIN-2149](https://issues.apache.org/jira/browse/ZEPPELIN-2149)

### How should this be tested?
Create multiple interpreters (say shell interpreter) with different names (say sh1, sh2, etc.).
Now on running these different interpreters should log to different files.

### Questions:
* Does the licenses files need update? N/A
* Is there breaking changes for older versions? N/A
* Does this needs documentation? N/A

Author: Prabhjyot Singh <prabhjyotsingh@gmail.com>

Closes apache#2107 from prabhjyotsingh/ZEPPELIN-2149 and squashes the following commits:

4566037 [Prabhjyot Singh] rename interpreterName to  interpreterGroupName
a989f5e [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into ZEPPELIN-2149
0cf514f [Prabhjyot Singh] fix tests
934f274 [Prabhjyot Singh] ZEPPELIN-2149: Each interpreter should have a seperate log file

# Conflicts:
#	zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreter.java
#	zeppelin-interpreter/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcessTest.java
#	zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/InterpreterFactory.java
@prabhjyotsingh prabhjyotsingh deleted the ZEPPELIN-2149 branch February 25, 2018 03:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants