[ZEPPELIN-1320] Run zeppelin interpreter process as web front end user #1322

Closed
wants to merge 12 commits into from

@prabhjyotsingh
Contributor

prabhjyotsingh commented Aug 11, 2016

What is this PR for?

When a notebook runs paragraphs with the shell, spark, or python interpreters, the interpreter process runs as the same OS user as the Zeppelin server. This means these interpreters have the same file system permissions as the Zeppelin server.
IMO, for a complete security model, interpreters should be able to run as (impersonate) the logged-in web user.

What type of PR is it?

[Improvement]

Todos

  • - Update doc
  • - FIX NPEs
  • - FIX CI

What is the Jira issue?

  • ZEPPELIN-1320 (https://issues.apache.org/jira/browse/ZEPPELIN-1320)

How should this be tested?

  • Enable shiro auth in shiro.ini
  • Create the OS user you want to impersonate (say user1) and set up passwordless ssh to localhost for that user:
adduser user1
ssh-keygen
ssh user1@localhost mkdir -p .ssh
cat ~/.ssh/id_rsa.pub | ssh user1@localhost 'cat >> .ssh/authorized_keys'
  • Start the Zeppelin server.
  • Go to the interpreter setting page and enable "User Impersonate" for any of the interpreters (in my example it's the shell interpreter).
  • Run the following in a paragraph in a notebook:
%sh
whoami

Check that the paragraph runs as the new user, i.e. the output is "user1".

Screenshots (if appropriate)

[Image: user impersonate (animated GIF)]

Questions:

  • Does the licenses files need update? no
  • Is there breaking changes for older versions? no
  • Does this needs documentation? yes
@felixcheung

Member

felixcheung commented Aug 11, 2016

shouldn't interpreter process be impersonating the user logging onto the web front end?

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 12, 2016

@felixcheung Fair point, let me try and do it, will change the title to WIP for now.

@prabhjyotsingh prabhjyotsingh changed the title [ZEPPELIN-1320] Security fix for Shell/Spark and Python Interpreter [WIP] [ZEPPELIN-1320] Security fix for Shell/Spark and Python Interpreter Aug 12, 2016
@@ -1094,14 +1075,15 @@ private String getInterpreterClassFromInterpreterSetting(InterpreterSetting sett
     return null;
   }

-  private Interpreter getInterpreter(String noteId, InterpreterSetting setting, String name) {
+  private Interpreter getInterpreter(String noteId, InterpreterSetting setting, String name,
+                                     String userName) {

@bzz

bzz Aug 12, 2016

Member

Probably a nitpick, but horizontal alignment is a controversial idea and is generally discouraged by the style guide, as it creates a "blast radius" of re-formatting in case of future changes, e.g. renaming a function.

# Conflicts:
#	zeppelin-web/src/app/interpreter/interpreter.controller.js
#	zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/InterpreterFactory.java
#	zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/InterpreterOption.java
@prabhjyotsingh prabhjyotsingh force-pushed the prabhjyotsingh:ZEPPELIN-1320 branch from f449fc0 to 590032e Aug 16, 2016
@prabhjyotsingh prabhjyotsingh changed the title [WIP] [ZEPPELIN-1320] Security fix for Shell/Spark and Python Interpreter [ZEPPELIN-1320] Run zeppelin interpreter process as web front end user Aug 16, 2016
@prabhjyotsingh prabhjyotsingh force-pushed the prabhjyotsingh:ZEPPELIN-1320 branch from cd62bc5 to 787a108 Aug 18, 2016
@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 19, 2016

CI green! Ready for review.

@prabhjyotsingh prabhjyotsingh force-pushed the prabhjyotsingh:ZEPPELIN-1320 branch from ac00c5f to 55ccce3 Aug 19, 2016
@@ -42,6 +42,9 @@ while getopts "hp:d:l:v" o; do
       . "${bin}/common.sh"
       getZeppelinVersion
       ;;
+    u)
+      ZEPPELIN_SSH_COMMAND="ssh ${OPTARG}@localhost "
+      ;;
   esac
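
A minimal sketch of how a `-u` option like the one in this hunk could be parsed into an ssh prefix. The `u:` entry in the option string is an assumption (the hunk does not show that change), and the wrapper function `parse_impersonation_user` is hypothetical; this is not the actual interpreter.sh.

```shell
# Hypothetical stand-alone sketch, not the real interpreter.sh.
# Parses "-u <user>" and stores an ssh prefix in ZEPPELIN_SSH_COMMAND.
parse_impersonation_user() {
  ZEPPELIN_SSH_COMMAND=""
  OPTIND=1   # reset so the function can be called more than once
  while getopts "u:" o "$@"; do
    case "${o}" in
      u)
        ZEPPELIN_SSH_COMMAND="ssh ${OPTARG}@localhost "
        ;;
      *)
        return 1
        ;;
    esac
  done
}

parse_impersonation_user -u user1
# The prefix is simply prepended to whatever command would be launched.
echo "${ZEPPELIN_SSH_COMMAND}whoami"
```

With no `-u` argument the prefix stays empty, so the command that follows it would run as the server's own user.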

@zjffdu

zjffdu Aug 19, 2016

Contributor

This requires that the login user exist as an OS account and be able to ssh to localhost. I am not sure whether this is a good approach; it feels a little strange compared to the impersonation implementation in Hadoop.

@prabhjyotsingh

prabhjyotsingh Aug 19, 2016

Author Contributor

@zjffdu Yes, I agree, it's not like the implementation in Hadoop. Would you recommend something else?

@jongyoul

Member

jongyoul commented Aug 19, 2016

I agree that using ssh is a simple way to support impersonation, but I'm worried about it. First, we should consider not relying on an ssh server on the local machine: it's disabled on Mac by default, and Windows users might not have any ssh server. Second, even if all users can connect to their machine via ssh, every Zeppelin user name would have to match a system user name. AFAIK, in some Zeppelin use cases the system admin uses virtual users as well. What do you think?

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 19, 2016

Yes, I thought about usage on Mac and Windows, and initially started off with RUNAS ${userName} for Windows and su - ${userName} for *nix systems, but that requires the Zeppelin server to run as root. Hence, I implemented it with ssh ${userName}@localhost.

I have not thought about the cases in which the system admin uses virtual users.

Since this change propagates the end web user to RemoteInterpreterManagedProcess.start, we can choose some other mechanism in interpreter.sh/interpreter.cmd instead of "ssh", or maybe make it configurable through an extra setting in "zeppelin-env.sh".

What would you recommend as a secure, foolproof mechanism for running the interpreter as a different user?

@@ -36,8 +36,7 @@
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;
import java.util.*;

@bzz

bzz Aug 19, 2016

Member

Probably a nitpick, but Zeppelin's Java code conventions discourage the use of wildcard imports.

Could you please check all the other changed files to follow this convention as well?

@prabhjyotsingh

prabhjyotsingh Aug 22, 2016

Author Contributor

Sure, I'll revert this and check that my editor (IntelliJ IDEA) is also configured properly.

@jongyoul

Member

jongyoul commented Aug 22, 2016

@prabhjyotsingh Actually, I don't know how to fully support different users' environments. But I think it's better to use RUNAS ~ and su - ~, since using ssh without a password raises some security issues. Mesos, for example, uses that approach to restrict resources. I have never seen ssh without a password used for this. What do you think?

@jongyoul

Member

jongyoul commented Aug 22, 2016

@prabhjyotsingh Apart from the issues above, could you check that this PR supports scoped as well, which uses multiple threads in one process?

@Leemoonsoo

Member

Leemoonsoo commented Aug 22, 2016

If I may add one more point:
What do you think about adding an "Impersonate" option to the interpreter setting in the GUI?

That would let users choose between the current behavior (without impersonation) and the new behavior. Otherwise, this PR introduces an incompatible change in user-facing behavior.

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 22, 2016

> It's better to use RUNAS ~ and su - ~

@jongyoul How about I use RUNAS ~ and su - ~ by default, but if a property in zeppelin-env.sh, say USE_SSH_IMPERSONATION, is set to true, it uses ssh web-user@localhost instead? That way users get to decide what best suits their use case.

> Could you check this PR support scoped as well which uses multiple threads in one process?

Yes, I've checked this with the Shell and Python interpreters, and it worked as expected.

@Leemoonsoo Yes, agreed, I too think this option should be there, and I have implemented it as well. If you take a look at the GIF attached to this PR description, it does what you are asking for :)
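
A minimal sketch of the USE_SSH_IMPERSONATION toggle proposed in this comment. The variable name is the one floated here and the helper `impersonation_prefix` is hypothetical; neither is confirmed to exist in Zeppelin. The helper only builds the command prefix as a string and runs nothing. The Windows RUNAS branch is left out.

```shell
# Hypothetical helper: prints the prefix used to launch an interpreter
# as the given web user. Names are illustrative only.
impersonation_prefix() {
  user="$1"
  if [ "${USE_SSH_IMPERSONATION:-false}" = "true" ]; then
    # Opt-in path: passwordless ssh to localhost.
    printf 'ssh %s@localhost' "${user}"
  else
    # Default path: su (noted earlier in this thread as needing a root server).
    printf 'su - %s -c' "${user}"
  fi
}

USE_SSH_IMPERSONATION=true
echo "$(impersonation_prefix user1) whoami"
```

Keeping su as the default and ssh as an explicit opt-in means an admin has to make a deliberate choice before passwordless ssh is used.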

@echarles

Member

echarles commented Aug 22, 2016

Whether su or ssh is used, I feel the main catch is user provisioning on the host running the interpreter. Until now, the shiro authentication system had no impact on user provisioning. This PR changes that.

I guess we all agree and are aware that adding user foo to shiro.ini, and enabling impersonation, will require running adduser foo manually.

We should make this clear in the doc but also stress it in the UI (with a hover, or clear text or a link near the User Impersonate checkbox).
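
The provisioning requirement described above can be checked up front. A minimal sketch, assuming a hypothetical helper `os_user_exists` (not part of Zeppelin) that uses the standard `id` utility to test whether a shiro user name also exists as an OS account:

```shell
# Hypothetical pre-flight check: succeeds only if the OS account exists.
os_user_exists() {
  id -u "$1" >/dev/null 2>&1
}

if os_user_exists "foo"; then
  echo "OS user foo exists; impersonation can be attempted"
else
  echo "OS user foo is missing; run 'adduser foo' first"
fi
```

A check like this only covers local accounts visible to `id`; it says nothing about whether the account can log in via ssh or su.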

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 22, 2016

@echarles Yes, agreed. I will need to update the doc and add an extra tooltip near the check box where the user can enable User Impersonate.

@echarles

Member

echarles commented Aug 22, 2016

To make ZEPPELIN-1337 ("Umbrella for multiple user support for zeppelin") more readable, should we rename the following:

  • ZEPPELIN-1340: "Run Hadoop-based interpreter process on Kerberos as web front end user"
  • ZEPPELIN-1320: "Run zeppelin interpreter process as web front end user"
@echarles

Member

echarles commented Aug 22, 2016

... and make ZEPPELIN-1320 a subtask of ZEPPELIN-1337?

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 22, 2016

Yes, you are right, let me do it right away.

@jongyoul

Member

jongyoul commented Aug 22, 2016

@prabhjyotsingh I agree with @echarles's idea. The interpreter tries to find Hadoop dependencies first and, if that succeeds, it uses doAs. Otherwise, let's talk about how to do it. What do you think?

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 23, 2016

Sure. In this PR I was only thinking about the "otherwise" case, i.e. environments where Hadoop dependencies are not present, and hence starting the interpreter as the end web user.

@echarles

Member

echarles commented Aug 23, 2016

Btw, for the Hadoop case (or Spark on YARN case), this PR may cause an issue for doAs.

Typically, you configure hadoop.proxyuser.foo.hosts and hadoop.proxyuser.foo.groups, foo being the OS/Kerberos user under which you run the Java code that calls doAs.

If we run ssh/su as the front-end user, we will not fulfill what the Hadoop/YARN cluster is expecting.

We thus should have two checkboxes:

  • One for the OS/Kerberos impersonation (this PR only addresses OS).
  • The other for Hadoop impersonation.

If you select one, I would expect the other one to be disabled.

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 23, 2016

Agreed @echarles, the doAs part will be a problem until ZEPPELIN-1340 is resolved. Until then, for security, we may have to run some interpreters with "User Impersonate" enabled from the UI (for example, the shell and python interpreters) and, for the others, use the standard doAs that is already implemented (like livy, spark, jdbc).

@Leemoonsoo

Member

Leemoonsoo commented Aug 24, 2016

Instead of USE_SSH_IMPERSONATION, how about letting the user customize the impersonation method?
For example,

ZEPPELIN_INTERPRETER_IMPERSONATION_CMD="su - ${ZEPPELIN_USER_NAME}"

by default, but the user can override this env variable, like

ZEPPELIN_INTERPRETER_IMPERSONATION_CMD="ssh -p12345 ${ZEPPELIN_USER_NAME}@localhost"

It gives more flexibility, I think (e.g. passing additional options like -p, or using a different command to impersonate).
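
A minimal sketch of how a launcher script might consume such a variable. It assumes the `ZEPPELIN_INTERPRETER_IMPERSONATION_CMD` and `ZEPPELIN_USER_NAME` names proposed in this comment; the helper `expand_impersonation_cmd` and `DEFAULT_IMPERSONATION_TEMPLATE` are hypothetical. Unlike the double-quoted examples above, the sketch stores the template in single quotes so that `${ZEPPELIN_USER_NAME}` stays a literal placeholder until the web user is known.

```shell
# Hypothetical default, kept as a literal template (single quotes on purpose).
DEFAULT_IMPERSONATION_TEMPLATE='su - ${ZEPPELIN_USER_NAME}'

# Hypothetical helper: substitutes the placeholder with the given user name.
# Uses sed rather than eval so the template is never executed as shell code.
# User names containing "/" or "&" would need escaping first.
expand_impersonation_cmd() {
  user="$1"
  template="${ZEPPELIN_INTERPRETER_IMPERSONATION_CMD:-${DEFAULT_IMPERSONATION_TEMPLATE}}"
  printf '%s\n' "${template}" | sed "s/\${ZEPPELIN_USER_NAME}/${user}/g"
}

# Admin override, as it might appear in zeppelin-env.sh.
ZEPPELIN_INTERPRETER_IMPERSONATION_CMD='ssh -p12345 ${ZEPPELIN_USER_NAME}@localhost'
expand_impersonation_cmd user1   # prints: ssh -p12345 user1@localhost
```

Deferring the substitution means the same setting works for every logged-in user, rather than being fixed to whatever `ZEPPELIN_USER_NAME` held when the env file was sourced.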

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Aug 24, 2016

@Leemoonsoo Yes, that's a good suggestion. Let me try and do it.

@astroshim

Contributor

astroshim commented Oct 16, 2016

I got the following checkstyle error while building the source.

[INFO] There are 1 checkstyle errors.
[ERROR] NotebookServer.java[1381] (sizes) LineLength: Line is longer than 100 characters (found 102).

@prabhjyotsingh Could you fix this?

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Oct 23, 2016

Closing this, will open a new one with merge of #1265.

asfgit pushed a commit that referenced this pull request Nov 18, 2016
Have recreated this from #1322
### What is this PR for?

When a notebook runs paragraphs with the shell, spark, or python interpreters, the interpreter process runs as the same OS user as the Zeppelin server. This means these interpreters have the same file system permissions as the Zeppelin server.
IMO, for a complete security model, interpreters should be able to run as (impersonate) the logged-in web user.
### What type of PR is it?

[Improvement]
### Todos
- [x] - Update doc
- [x] - FIX NPEs
- [x] - FIX CI
### What is the Jira issue?
- [ZEPPELIN-1320](https://issues.apache.org/jira/browse/ZEPPELIN-1320)
### How should this be tested?
- Enable shiro auth in shiro.ini
- Add ssh key for the same user you want to try and impersonate (say user1).

```
adduser user1
ssh-keygen
ssh user1@localhost mkdir -p .ssh
cat ~/.ssh/id_rsa.pub | ssh user1@localhost 'cat >> .ssh/authorized_keys'
```
- Start the Zeppelin server.
- Go to the interpreter setting page and enable "User Impersonate" for any of the interpreters (in my example it's the shell interpreter).
- Run the following in a paragraph in a notebook:

```
%sh
whoami
```

Check that the paragraph runs as the new user, i.e. the output is "user1".
### Screenshots (if appropriate)

![user impersonate](https://cloud.githubusercontent.com/assets/674497/20213127/f32fdc52-a82c-11e6-8e33-aebd6a943c5f.gif)

### Questions:
- Does the licenses files need update? no
- Is there breaking changes for older versions? no
- Does this needs documentation? yes

Author: Prabhjyot Singh <prabhjyotsingh@gmail.org>

Closes #1554 from prabhjyotsingh/ZEPPELIN-1320-2 and squashes the following commits:

dc69c9d [Prabhjyot Singh] @Leemoonsoo review comment: making ZEPPELIN_SSH_COMMAND configurable
1b26cc0 [Prabhjyot Singh] add doc
5a76839 [Prabhjyot Singh] show User Impersonate only when interpreter setting is "per user" and "isolated"
02c3084 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into ZEPPELIN-1320-2
03b2f20 [Prabhjyot Singh] use user instead of ""
0ff80ec [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into ZEPPELIN-1320-2
dd0731d [Prabhjyot Singh] fix missing test cases
aff1bf0 [Prabhjyot Singh] user should have option to run these interpreters as different user.
@zjffdu

Contributor

zjffdu commented Nov 23, 2016

Sorry for the late comment; I was on vacation for the last 2 weeks. I found this doesn't work for the spark interpreter. @prabhjyotsingh Did you try it with the spark interpreter and other interpreters?

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Nov 23, 2016

@zjffdu Yes, you are right, with SPARK_HOME/SPARK_SUBMIT it doesn't work.

@zjffdu

Contributor

zjffdu commented Nov 23, 2016

Then I think we should either revert this PR or fix it for the spark interpreter as well, because the spark interpreter is the most important interpreter of Zeppelin IMO.

@prabhjyotsingh

Contributor Author

prabhjyotsingh commented Nov 23, 2016

Sure, makes sense. I'll try to fix it ASAP. https://issues.apache.org/jira/browse/ZEPPELIN-1701

tae-jun added a commit to tae-jun/zeppelin that referenced this pull request Nov 23, 2016
Have recreated this from apache#1322
@prabhjyotsingh prabhjyotsingh deleted the prabhjyotsingh:ZEPPELIN-1320 branch Feb 25, 2018
8 participants