User cannot list pods in the namespace #167

Closed
tjcrone opened this issue Mar 17, 2018 · 13 comments

@tjcrone
Contributor

tjcrone commented Mar 17, 2018

When trying to execute the following within my own GCP deployment:

from daskernetes import KubeCluster
cluster = KubeCluster()
cluster.scale_up(2)

I get the following error:

---------------------------------------------------------------------------
ApiException                              Traceback (most recent call last)
<ipython-input-4-caa8db8248c6> in <module>()
      1 from daskernetes import KubeCluster
      2 cluster = KubeCluster()
----> 3 cluster.scale_up(2)

/opt/conda/lib/python3.6/site-packages/daskernetes/core.py in scale_up(self, n, pods, **kwargs)
    373         >>> cluster.scale_up(20)  # ask for twenty workers
    374         """
--> 375         pods = pods or self.pods()
    376 
    377         for i in range(3):

/opt/conda/lib/python3.6/site-packages/daskernetes/core.py in pods(self)
    319         return self.core_api.list_namespaced_pod(
    320             self.namespace,
--> 321             label_selector=format_labels(self.pod_template.metadata.labels)
    322         ).items
    323 

/opt/conda/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py in list_namespaced_pod(self, namespace, **kwargs)
  12289             return self.list_namespaced_pod_with_http_info(namespace, **kwargs)
  12290         else:
> 12291             (data) = self.list_namespaced_pod_with_http_info(namespace, **kwargs)
  12292             return data
  12293 

/opt/conda/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py in list_namespaced_pod_with_http_info(self, namespace, **kwargs)
  12392                                         _preload_content=params.get('_preload_content', True),
  12393                                         _request_timeout=params.get('_request_timeout'),
> 12394                                         collection_formats=collection_formats)
  12395 
  12396     def list_namespaced_pod_template(self, namespace, **kwargs):

/opt/conda/lib/python3.6/site-packages/kubernetes/client/api_client.py in call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, async, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
    319                                    body, post_params, files,
    320                                    response_type, auth_settings,
--> 321                                    _return_http_data_only, collection_formats, _preload_content, _request_timeout)
    322         else:
    323             thread = self.pool.apply_async(self.__call_api, (resource_path, method,

/opt/conda/lib/python3.6/site-packages/kubernetes/client/api_client.py in __call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
    153                                      post_params=post_params, body=body,
    154                                      _preload_content=_preload_content,
--> 155                                      _request_timeout=_request_timeout)
    156 
    157         self.last_response = response_data

/opt/conda/lib/python3.6/site-packages/kubernetes/client/api_client.py in request(self, method, url, query_params, headers, post_params, body, _preload_content, _request_timeout)
    340                                         _preload_content=_preload_content,
    341                                         _request_timeout=_request_timeout,
--> 342                                         headers=headers)
    343         elif method == "HEAD":
    344             return self.rest_client.HEAD(url,

/opt/conda/lib/python3.6/site-packages/kubernetes/client/rest.py in GET(self, url, headers, query_params, _preload_content, _request_timeout)
    229                             _preload_content=_preload_content,
    230                             _request_timeout=_request_timeout,
--> 231                             query_params=query_params)
    232 
    233     def HEAD(self, url, headers=None, query_params=None, _preload_content=True, _request_timeout=None):

/opt/conda/lib/python3.6/site-packages/kubernetes/client/rest.py in request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
    220 
    221         if not 200 <= r.status <= 299:
--> 222             raise ApiException(http_resp=r)
    223 
    224         return r

ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'b976032d-5ed4-43e0-9dc8-e0c9fe349519', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Sat, 17 Mar 2018 15:36:18 GMT', 'Content-Length': '304'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods is forbidden: User \"system:serviceaccount:pangeo:default\" cannot list pods in the namespace \"pangeo\": Unknown user \"system:serviceaccount:pangeo:default\"","reason":"Forbidden","details":{"kind":"pods"},"code":403}

I'm guessing this is because of some permission issue or missing service account setting on GCP. Any ideas what might be going on?
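The useful part of the traceback above is the JSON `Status` object in the response body, which names both the account making the request and the action that was denied. A minimal sketch for pulling those fields out, using only the standard library and the body copied from the error above; the helper name `summarize_status` is hypothetical, not part of daskernetes or the kubernetes client:

```python
import json

# Response body copied verbatim from the traceback above. A raw string keeps
# the JSON escape sequences (backslash + quote) intact.
RESPONSE_BODY = r'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods is forbidden: User \"system:serviceaccount:pangeo:default\" cannot list pods in the namespace \"pangeo\": Unknown user \"system:serviceaccount:pangeo:default\"","reason":"Forbidden","details":{"kind":"pods"},"code":403}'


def summarize_status(body: str) -> dict:
    """Extract the fields of a Kubernetes Status object that explain a failure."""
    status = json.loads(body)
    return {
        "code": status.get("code"),
        "reason": status.get("reason"),
        "resource": status.get("details", {}).get("kind"),
        "message": status.get("message"),
    }


summary = summarize_status(RESPONSE_BODY)
for key, value in summary.items():
    print(f"{key}: {value}")
```

In a live session the body would presumably come from the caught exception's `body` attribute rather than a constant; that depends on your setup and is not shown in the thread.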

@tjcrone
Contributor Author

tjcrone commented Mar 17, 2018

output of

tjc@europa:~$ kubectl get clusterrolebinding
NAME                                           AGE
cluster-admin                                  1h
cluster-admin-binding                          1h
event-exporter-rb                              1h
gce:beta:kubelet-certificate-bootstrap         1h
gce:beta:kubelet-certificate-rotation          1h
heapster-binding                               1h
kube-apiserver-kubelet-api-admin               1h
kubelet-cluster-admin                          1h
nginx-jupyter                                  1h
npd-binding                                    1h
system:basic-user                              1h
system:controller:attachdetach-controller      1h
system:controller:certificate-controller       1h
system:controller:cronjob-controller           1h
system:controller:daemon-set-controller        1h
system:controller:deployment-controller        1h
system:controller:disruption-controller        1h
system:controller:endpoint-controller          1h
system:controller:generic-garbage-collector    1h
system:controller:horizontal-pod-autoscaler    1h
system:controller:job-controller               1h
system:controller:namespace-controller         1h
system:controller:node-controller              1h
system:controller:persistent-volume-binder     1h
system:controller:pod-garbage-collector        1h
system:controller:replicaset-controller        1h
system:controller:replication-controller       1h
system:controller:resourcequota-controller     1h
system:controller:route-controller             1h
system:controller:service-account-controller   1h
system:controller:service-controller           1h
system:controller:statefulset-controller       1h
system:controller:ttl-controller               1h
system:discovery                               1h
system:kube-controller-manager                 1h
system:kube-dns                                1h
system:kube-dns-autoscaler                     1h
system:kube-scheduler                          1h
system:node                                    1h
system:node-proxier                            1h
tiller                                         1h

Does this mean that the system:serviceaccount:pangeo:default user does not exist? Any suggestions on how to create this account with the CLI if so? Thanks!
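For context on the question above: `system:serviceaccount:<namespace>:<name>` is the username Kubernetes assigns to a service account, so the name in the error refers to the `default` service account in the `pangeo` namespace. `kubectl get clusterrolebinding` lists bindings, not accounts, so an account's absence from that list says nothing about whether the account exists (`kubectl get serviceaccounts -n pangeo` would list the accounts themselves). A small sketch that splits such a username into its parts; the helper and type names are hypothetical:

```python
from typing import NamedTuple, Optional

PREFIX = "system:serviceaccount:"


class ServiceAccountRef(NamedTuple):
    namespace: str
    name: str


def parse_service_account(username: str) -> Optional[ServiceAccountRef]:
    """Return the namespace and name encoded in a service account username.

    Returns None when the username does not follow the
    system:serviceaccount:<namespace>:<name> pattern.
    """
    if not username.startswith(PREFIX):
        return None
    parts = username[len(PREFIX):].split(":")
    if len(parts) != 2 or not all(parts):
        return None
    return ServiceAccountRef(namespace=parts[0], name=parts[1])


ref = parse_service_account("system:serviceaccount:pangeo:default")
print(ref)
```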

@mrocklin
Member

I'm not currently able to reproduce. You might want to try with the newer dask_kubernetes library

from dask_kubernetes import KubeCluster

cluster = KubeCluster()
cluster.scale(2)
cluster

@tjcrone
Contributor Author

tjcrone commented Mar 17, 2018

Thanks @mrocklin. I tried your suggestion and got the same result. One thing that may be related is that I cannot get

rbac:
    enabled: false

to work, so I commented it out, and that may mean that I need a service account that is not there. Do you have system:serviceaccount:pangeo:default in your clusterrolebinding? If not, maybe I need to troubleshoot the rbac setting in jupyter-config.yaml.

@mrocklin
Member

mrocklin commented Mar 17, 2018 via email

@tjcrone
Contributor Author

tjcrone commented Mar 17, 2018

I don't know why, but jupyterhub will not install without rbac enabled in my yaml file. Helm throws a strangely worded "timed out waiting for the condition" error. It's not clear that anyone else is having this problem, so I will keep working on the rbac issue and keep you posted. Please let me know if you think of any reason why this might not be working for me. I will also look into role definitions as a possible alternative fix. Thanks!

@mrocklin
Member

mrocklin commented Mar 18, 2018 via email

@tjcrone
Contributor Author

tjcrone commented Mar 18, 2018

Issue solved. The latest version of gcloud now creates clusters with the --no-enable-legacy-authorization option set by default. To create a cluster that allows legacy authorization, and thus allows jupyterlab to be installed with rbac.enabled=false, it is necessary to create the cluster with the --enable-legacy-authorization flag. As you noted, operating without RBAC is super insecure, so we will eventually need to create a single-user service account with the appropriate access. Not my area of expertise, but I'd be happy to look into how it might work.

@jacobtomlinson
Member

I strongly advise against disabling RBAC. It is rather tedious to get everything configured correctly but once you do you will be in a much more secure position.

Here is the service account config we are using, to get you started. I am actively tweaking it, so it isn't perfect yet; for example, I can't seem to get pod logs through dask_kubernetes yet.

dask-kubernetes-serviceaccount.yaml

kind: ServiceAccount
apiVersion: v1
metadata:
  name: daskkubernetes
  namespace: jupyter


---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: daskkubernetes
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create", "delete"]

---

kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: daskkubernetes
subjects:
- kind: ServiceAccount
  name: daskkubernetes
  namespace: jupyter
roleRef:
  kind: Role
  name: daskkubernetes
  apiGroup: rbac.authorization.k8s.io

Zero to jupyterhub config

singleuser:
  serviceAccountName: daskkubernetes

@mrocklin
Member

@jacobtomlinson if you have any interest in contributing something like this to upstream that would be very welcome.

@jacobtomlinson
Member

jacobtomlinson commented Mar 19, 2018

Happy to do so. I'll put a PR into z2jh to allow easy configuration.

In our instance we are talking about dask_kubernetes but this gives anything in the notebook these permissions, so perhaps it should be named something more generic?

@tjcrone
Contributor Author

tjcrone commented Mar 19, 2018

@jacobtomlinson, thanks, this is great. I managed to get this working mostly as is, using:

 kubectl create -f dask-kubernetes-serviceaccount.yaml

but I had to add the namespace into the metadata section of the role and rolebinding because otherwise these get added to the default namespace.
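The namespace fix described above can be sketched as a small generator that puts the same namespace on all three objects, so the Role and RoleBinding cannot end up in `default`. This is an illustration under assumptions, not the configuration either poster actually used: the function name, the `include_pod_logs` option (a guess at what reading pod logs would need, namely `get` on the `pods/log` subresource), and the default `rbac.authorization.k8s.io/v1` API version are all mine. The thread used `v1beta1`, which newer Kubernetes releases no longer serve.

```python
import json


def build_manifests(namespace, name="daskkubernetes", include_pod_logs=False,
                    api_version="rbac.authorization.k8s.io/v1"):
    """Build ServiceAccount, Role and RoleBinding objects in one namespace."""
    rules = [{
        "apiGroups": [""],  # "" indicates the core API group
        "resources": ["pods"],
        "verbs": ["get", "list", "watch", "create", "delete"],
    }]
    if include_pod_logs:
        # Assumption: log access is granted through the pods/log subresource.
        rules.append({"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get"]})

    def metadata():
        # Every object gets an explicit namespace, including Role and RoleBinding.
        return {"name": name, "namespace": namespace}

    return [
        {"kind": "ServiceAccount", "apiVersion": "v1", "metadata": metadata()},
        {"kind": "Role", "apiVersion": api_version, "metadata": metadata(),
         "rules": rules},
        {"kind": "RoleBinding", "apiVersion": api_version, "metadata": metadata(),
         "subjects": [{"kind": "ServiceAccount", "name": name,
                       "namespace": namespace}],
         "roleRef": {"kind": "Role", "name": name,
                     "apiGroup": "rbac.authorization.k8s.io"}},
    ]


manifests = build_manifests("jupyter")
# kubectl accepts JSON as well as YAML; a v1 List wraps several objects in one document.
print(json.dumps({"kind": "List", "apiVersion": "v1", "items": manifests}, indent=2))
```

If the script were saved as, say, `make_rbac.py` (a hypothetical filename), its output could be piped into `kubectl create -f -`.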

@stale

stale bot commented Jun 25, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 25, 2018
@stale

stale bot commented Jul 2, 2018

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.

@stale stale bot closed this as completed Jul 2, 2018