-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ZEPPELIN-2149] Each interpreter should have a seperate log file #2107
Conversation
bin/interpreter.sh
Outdated
@@ -86,6 +89,9 @@ ZEPPELIN_SERVER=org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer | |||
INTERPRETER_ID=$(basename "${INTERPRETER_DIR}") | |||
ZEPPELIN_PID="${ZEPPELIN_PID_DIR}/zeppelin-interpreter-${INTERPRETER_ID}-${ZEPPELIN_IDENT_STRING}-${HOSTNAME}.pid" | |||
ZEPPELIN_LOGFILE="${ZEPPELIN_LOG_DIR}/zeppelin-interpreter-" | |||
if [[ ! -z "$INTERPRETER_NAME" ]]; then | |||
ZEPPELIN_LOGFILE+="${INTERPRETER_NAME}-" | |||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INTERPRETER_NAME
and INTERPRETER_ID
are used together ? I assume the spark1 log file should be something like zeppelin-interpreter-spark1-jzhang-jzhangMBPr.local.log
and spark2 log is zeppelin-interpreter-spark2-jzhang-jzhangMBPr.local.log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we could pass interpreter name as environment variable INTERPRETER_ID
to the interpreter process, so that we just need a minor change on interpreter.sh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, even I initially thought about removing INTERPRETER_ID
but then no harm in it, and I thought about the case where the user would want to tail -f *jdbc*
for reading all the interpreters log created as HIVE, PHOENIX or POSTGRES, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zjffdu Any thoughts on this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prabhjyotsingh I verified the PR, seems the log file name format is zeppelin-interpreter-<interpeter_name>-<interpreter_id>-<host>.log
. But I still think we don't need to differentiate interpreter_name
and interpreter_id
as interperter_name
should already been unique, zeppelin-interpreter-<interpeter_name>-<host>.log
should be enough. The scenario you mentioned about jdbc, I don't think user would use this command tail -f *jdbc*
to read all the jdbc logs.
Let's hear more comments from others @felixcheung @Leemoonsoo @jongyoul
BTW, we need to document for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also we may also need to consider to include username into log file name if isolation is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, username does get included if impersonation in is enabled https://github.com/apache/zeppelin/blob/master/bin/interpreter.sh#L90.
Tested. LGTM. |
Is |
I picked up "Interpreter Name" from UI while creating new interpreter, let me rename this to InterpreterGroup. |
Any thoughts on this ? |
Merging this if no more discussion. |
### What is this PR for? This fixes the typo is #2107. ### What type of PR is it? [Bug Fix] ### How should this be tested? Create multiple interpreters (say shell interpreter) with different names (say sh1, sh2, etc.). Now on running these different interpreters should log to different files. ### Screenshots (if appropriate) ### Questions: * Does the licenses files need update? N/A * Is there breaking changes for older versions? N/A * Does this needs documentation? N/A Author: Prabhjyot Singh <prabhjyotsingh@gmail.com> Closes #2170 from prabhjyotsingh/ZEPPELIN-2149_v2 and squashes the following commits: 7baf81a [Prabhjyot Singh] [ZEPPELIN-2149] correct variable name used in interpreter.sh
### What is this PR for? In the current implementation, both spark and spark2 uses spark interpreter, both phoenix and hive use jdbc interpreter log file. Each interpreter should have separate log file. The same log file for multiple interpreters confuses the user and complicated debugging. ### What type of PR is it? [Improvement] ### What is the Jira issue? * [ZEPPELIN-2149](https://issues.apache.org/jira/browse/ZEPPELIN-2149) ### How should this be tested? Create multiple interpreters (say shell interpreter) with different names (say sh1, sh2, etc.). Now on running these different interpreters should log to different files. ### Questions: * Does the licenses files need update? N/A * Is there breaking changes for older versions? N/A * Does this needs documentation? N/A Author: Prabhjyot Singh <prabhjyotsingh@gmail.com> Closes apache#2107 from prabhjyotsingh/ZEPPELIN-2149 and squashes the following commits: 4566037 [Prabhjyot Singh] rename interpreterName to interpreterGroupName a989f5e [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into ZEPPELIN-2149 0cf514f [Prabhjyot Singh] fix tests 934f274 [Prabhjyot Singh] ZEPPELIN-2149: Each interpreter should have a seperate log file # Conflicts: # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreter.java # zeppelin-interpreter/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcessTest.java # zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/InterpreterFactory.java
### What is this PR for? In the current implementation, both spark and spark2 uses spark interpreter, both phoenix and hive use jdbc interpreter log file. Each interpreter should have separate log file. The same log file for multiple interpreters confuses the user and complicated debugging. ### What type of PR is it? [Improvement] ### What is the Jira issue? * [ZEPPELIN-2149](https://issues.apache.org/jira/browse/ZEPPELIN-2149) ### How should this be tested? Create multiple interpreters (say shell interpreter) with different names (say sh1, sh2, etc.). Now on running these different interpreters should log to different files. ### Questions: * Does the licenses files need update? N/A * Is there breaking changes for older versions? N/A * Does this needs documentation? N/A Author: Prabhjyot Singh <prabhjyotsingh@gmail.com> Closes apache#2107 from prabhjyotsingh/ZEPPELIN-2149 and squashes the following commits: 4566037 [Prabhjyot Singh] rename interpreterName to interpreterGroupName a989f5e [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into ZEPPELIN-2149 0cf514f [Prabhjyot Singh] fix tests 934f274 [Prabhjyot Singh] ZEPPELIN-2149: Each interpreter should have a seperate log file # Conflicts: # zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreter.java # zeppelin-interpreter/src/test/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterProcessTest.java # zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/InterpreterFactory.java
What is this PR for?
In the current implementation, both spark and spark2 uses spark interpreter, both phoenix and hive use jdbc interpreter log file.
Each interpreter should have separate log file. The same log file for multiple interpreters confuses the user and complicated debugging.
What type of PR is it?
[Improvement]
What is the Jira issue?
How should this be tested?
Create multiple interpreters (say shell interpreter) with different names (say sh1, sh2, etc.).
Now on running these different interpreters should log to different files.
Questions: