
Add basic RBAC #172

Merged 4 commits into pangeo-data:master on Mar 23, 2018

Conversation

@tjcrone (Contributor) commented Mar 21, 2018

As per suggestions by @jacobtomlinson, add basic RBAC setup.
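
For reference, the dask-kubernetes-serviceaccount.yaml added in this PR defines a service account plus a namespaced Role and RoleBinding along roughly these lines (a sketch; the exact rules in the PR may differ):

    # dask-kubernetes-serviceaccount.yaml (sketch; actual rules may differ)
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: daskkubernetes
      namespace: pangeo
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: daskkubernetes
      namespace: pangeo
    rules:
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["get", "list", "watch", "create", "delete"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: daskkubernetes
      namespace: pangeo
    subjects:
      - kind: ServiceAccount
        name: daskkubernetes
        namespace: pangeo
    roleRef:
      kind: Role
      name: daskkubernetes
      apiGroup: rbac.authorization.k8s.io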

@mrocklin (Member)

Do we need to change the rbac entry in the jupyter-config.yaml file as well?

@tjcrone (Contributor, author) commented Mar 21, 2018

Yes, thanks for catching that. I'll fix.

@tjcrone (Contributor, author) commented Mar 21, 2018

If you would like me to squash into a single commit, I can do that. Please let me know. Would also be good to have @jacobtomlinson review if he has time.

@mrocklin (Member)

We can always squash with the green button. No reason to do so manually.

@rabernat (Member)

🙌

@jacobtomlinson (Member) left a review

This looks good! I've made a few comments.

One thing to note: I still haven't managed to get access to worker logs in dask-kubernetes with this Role, so it will require some tweaking once I figure that out.


 hub:
   extraConfig: |
-    c.KubeSpawner.singleuser_service_account = 'default'
+    c.KubeSpawner.singleuser_service_account = 'daskkubernetes'
Member:

You could move this into the helm config itself instead of the extraConfig code as this config option is supported in z2jh.

singleuser:
    serviceAccountName: daskkubernetes

Contributor (author):

I was unable to launch worker pods when the service account was set within the singleuser section. I never figured out why. Should I do more testing?

Member:

Perhaps. This is working on our stack.

To be honest I'm not worried either way.

Member:

Just a thought: did you remove the c.KubeSpawner.singleuser_service_account = 'default' line when setting this within the singleuser section? Things in extraConfig will override the helm config values.
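
Concretely, the end state would set the account only via the helm values, roughly like this (a sketch; other keys omitted):

    # jupyter-config.yaml (sketch; other keys omitted)
    singleuser:
      serviceAccountName: daskkubernetes
    # with no c.KubeSpawner.singleuser_service_account line left in hub.extraConfig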

Contributor (author):

Pretty sure I did, but I can try again.

@@ -1,5 +1,5 @@
 # Start cluster on Google cloud
-gcloud container clusters create pangeo --num-nodes=10 --machine-type=n1-standard-2 --zone=us-central1-b --cluster-version=1.9.3-gke.0
+gcloud container clusters create pangeo --no-enable-legacy-authorization --num-nodes=10 --machine-type=n1-standard-2 --zone=us-central1-b --cluster-version=1.9.3-gke.0
@jacobtomlinson (Member) commented Mar 23, 2018

Yay! 🎉

@@ -17,6 +17,10 @@ helm repo update
 # Install JupyterHub and Dask on the cluster
 helm install jupyterhub/jupyterhub --version=v0.6.0-9701a90 --name=jupyter --namespace=pangeo -f jupyter-config.yaml -f secret-config.yaml

+# create the daskkubernetes service account and role bindings
+echo "Installing service account for daskkubernetes."
+kubectl create -f dask-kubernetes-serviceaccount.yaml
@jacobtomlinson (Member) commented Mar 23, 2018

TL;DR This is an anti-pattern, but probably fine for now.

I'm uncomfortable with additional manifests being added manually, as you can no longer simply run helm delete <release> --purge to remove Pangeo. It would be better to have the whole deployment contained in a single helm installation. This could be a good opportunity to create a pangeo helm chart.

We currently have a jadepangeo helm chart which has jupyterhub/jupyterhub as a dependency and adds some extra manifests. I would like to get to a position where there is an official chart (maybe called pangeo/pangeo) which depends on jupyterhub/jupyterhub, and then we can make jadepangeo depend on pangeo/pangeo. However, this is beyond the scope of this PR, so I'll raise a new issue and start work on a PR.
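
To illustrate, such a pangeo chart would declare jupyterhub as a dependency in its requirements.yaml, roughly like this (a sketch; the chart layout is hypothetical and the version is just the one pinned in this repo's install script):

    # requirements.yaml of a hypothetical pangeo chart (sketch)
    dependencies:
      - name: jupyterhub
        version: v0.6.0-9701a90
        repository: https://jupyterhub.github.io/helm-chart/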

We'll probably rename jadepangeo to metoffice/pangeo or something more sensible as the Pangeo project has superseded the Jade project now.

Member:

See #178

@@ -24,11 +24,11 @@ singleuser:
   enabled: true

 rbac:
-  enabled: false
+  enabled: true
Member:

This defaults to true so you could omit this.

@tjcrone (Contributor, author) commented Mar 23, 2018

@jacobtomlinson, one question I have: does the Role as currently defined limit the pod operations to only the pods that the user owns, or can these operations (verbs) be applied to any pod in the cluster?

@jacobtomlinson (Member)

They can be applied to any pod in the namespace. Therefore it might be sensible to have a separate namespace for dask workers.
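
To sketch what that would look like, the worker namespace would get its own Role and RoleBinding, with the notebook service account bound into it (the namespace and role names here are hypothetical):

    # sketch: pod permissions scoped to a dedicated worker namespace
    apiVersion: v1
    kind: Namespace
    metadata:
      name: pangeo-workers   # hypothetical name
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: dask-worker-manager
      namespace: pangeo-workers
    rules:
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["get", "list", "watch", "create", "delete"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: dask-worker-manager
      namespace: pangeo-workers
    subjects:
      - kind: ServiceAccount
        name: daskkubernetes
        namespace: pangeo      # the service account used by notebook pods
    roleRef:
      kind: Role
      name: dask-worker-manager
      apiGroup: rbac.authorization.k8s.io

dask-kubernetes would then need to be told to launch workers into that namespace, so the pod verbs above no longer touch other users' notebook pods.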

@tjcrone (Contributor, author) commented Mar 23, 2018

Okay. A separate namespace for workers sounds like it would be a significant increase in security. Another question: what are the steps to running the containers unprivileged?

@jacobtomlinson (Member) commented Mar 23, 2018

Well, it looks like the only current requirement for privileged containers is allowing users to mount FUSE filesystems within their notebook containers.

On our AWS deployment we are using custom FUSE flex volume drivers, which means that S3 is mounted onto the host instead and then volume-mounted into an unprivileged container. We could create equivalent drivers for the Google platform, and then the containers could be run unprivileged.

The other option is to stop using FUSE. Our goal is for our tools to work directly with object stores and remove the requirement for FUSE altogether, but we are a way off from that currently.

@tjcrone (Contributor, author) commented Mar 23, 2018

Is it a crazy idea to launch every notebook pod into its own unique namespace, where it has control over the pods in that namespace but zero control over any other pods in any other namespace?

@jacobtomlinson (Member)

Not crazy at all. I have considered this before.

It will require changes to the KubeSpawner, as it will need to be able to create and remove namespaces too. It also makes assumptions about how the Kubernetes cluster is being managed. For example, some people may have been given access to a single namespace on a shared cluster in their org, while others may have dedicated clusters like those used in this project.

Being able to delete a namespace is a pretty big deal, as it deletes all resources within it, so giving the KubeSpawner that power is not something to do lightly.

@mrocklin (Member)

It will require changes to the KubeSpawner as it will need to be able to create and remove namespaces too

cc @yuvipanda

@jacobtomlinson (Member)

This PR is definitely a step in the right direction. Even more can be done going forwards.

I'm keen for it to be merged ASAP so I can incorporate it into my helm PR.

@mrocklin (Member)

@jacobtomlinson both you and @tjcrone should have merge privileges. I encourage you both to use them liberally. This system is still experimental. I think it's more important to move quickly than to keep things from breaking, especially in this direction.

If folks are waiting on me to merge things like this, we'll probably end up waiting a while. I'll be especially busy in the coming month, so I am keen to ensure that others feel empowered to make large changes here.

@jacobtomlinson (Member)

@mrocklin I'm happy to merge but I am currently unable to test on GCE or update the demo platform.

I'm not sure how you want to handle that. Perhaps it would be good to set up Travis CI or some other CI/CD to run tests and keep the platform up to date?
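
As a very rough illustration of what that might look like (entirely hypothetical and not part of this PR; the deploy step assumes helm and cluster credentials are provisioned in the CI environment):

    # .travis.yml (hypothetical sketch)
    language: python
    install:
      - pip install pyyaml
    script:
      # sanity check that the config file parses as YAML
      - python -c "import yaml; yaml.safe_load(open('jupyter-config.yaml'))"
    deploy:
      provider: script
      script: helm upgrade jupyter jupyterhub/jupyterhub --namespace=pangeo -f jupyter-config.yaml -f secret-config.yaml
      on:
        branch: master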

@tjcrone (Contributor, author) commented Mar 23, 2018

Roger that @mrocklin. This PR has had enough discussion and review that I am comfortable merging. Cheers.

@jacobtomlinson merged commit 129a482 into pangeo-data:master on Mar 23, 2018
@mrocklin (Member)

@jacobtomlinson I have just given jacob.tomlinson@informaticslab.co.uk edit rights on the project.

@jacobtomlinson mentioned this pull request on Mar 23, 2018
@tjcrone deleted the rbac branch on March 24, 2018
@rabernat (Member)

This PR has been merged. But has the cluster been re-deployed with the new settings?

@jacobtomlinson (Member)

Not as far as I'm aware.

@rabernat (Member)

@jacobtomlinson how do you manage deployment at the Met Office? Do you manually update your cluster every time you change the configuration, or do you have some automated way to do this? (see #131)

I worry that we are making updates to this configuration without actually testing it live.

@jacobtomlinson (Member)

I do upgrades manually using helm. As @mrocklin suggests in #131, this could be automated in CI/CD.
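
For this deployment that upgrade would look roughly like the following, mirroring the install command in the setup script (release name, namespace, and version taken from there; exact flags may differ):

    helm repo update
    helm upgrade jupyter jupyterhub/jupyterhub --version=v0.6.0-9701a90 --namespace=pangeo -f jupyter-config.yaml -f secret-config.yaml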
