Can't deploy Pach to GKE default k8s version #2787
Comments
Note that in the cases where Pachyderm fails to deploy and I get the above error, I have checked that the sa (service account) exists, and it does:
This issue is not GKE specific. It's related to the Kubernetes version and RBAC settings. |
Thanks for the additional info @DSchmidtDev. Just to clarify for the notes here, I didn't deploy with a Helm chart. I used |
@DSchmidtDev do you know which Kubernetes version and RBAC settings create this issue? The ClusterRole has, among other rules, this:
which seems like it should give the required permissions. In addition, this works on both 1.8.0 and 1.9.0 Kubernetes clusters with RBAC enabled.
I'm not sure. I thought RBAC was officially introduced (as stable) with Kubernetes v1.8, but whether it is active depends on the deployment args. GKE disabled the old authorization method when it made v1.8 the default, so since then you have to configure your roles yourself when they are needed. With v1.6 and v1.7, RBAC and the old authorization could be used in parallel. Your snippet does not cover all permissions.
This issue comes up if you use the --no-rbac flag when deploying to a version of GKE later than 1.8. Using this flag removes the rolebindings for the pachyderm service account from the manifest. The docs can be a little misleading in this regard, so we are going to update them to reflect this information. |
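One way to check the situation described above is to ask the API server directly whether the pachyderm service account exists and what it is allowed to do. The sketch below is not from the thread: the helper name `check_pachyderm_sa`, the `default` namespace, and the `list pods` action are illustrative assumptions, and the permissions Pachyderm actually needs may differ.

```shell
# Hypothetical helper: inspect the pachyderm service account in a namespace.
# The namespace default and the "list pods" check are assumptions for illustration.
check_pachyderm_sa() {
  ns="${1:-default}"
  # Confirm the service account object exists in the namespace.
  kubectl get serviceaccount pachyderm --namespace "$ns"
  # Ask whether that account may perform an example action.
  kubectl auth can-i list pods \
    --as="system:serviceaccount:${ns}:pachyderm" \
    --namespace "$ns"
}
```

Calling `check_pachyderm_sa` with no argument checks the `default` namespace. If the second command answers "no" while the first succeeds, the account exists but has no role binding that grants the action, which would be consistent with a deployment made with --no-rbac.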
As described above, I get the following error with the --no-rbac option.
However, without the --no-rbac option, I get the following error.
Then, when setting the cluster version for kubernetes to
This looks like you do not have the permissions to create roles in your cluster. You can make yourself cluster admin with this command:
You will probably want to avoid the --no-rbac flag.
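The exact command from this comment is not preserved in this copy of the thread. As a sketch of what granting yourself cluster admin typically looks like on GKE, the helper below binds a user to the built-in `cluster-admin` ClusterRole. The function name `grant_cluster_admin` and the binding name `cluster-admin-binding` are illustrative choices, not taken from the thread.

```shell
# Hypothetical helper: bind a user to the built-in cluster-admin ClusterRole.
# With no argument it falls back to the account currently active in gcloud.
grant_cluster_admin() {
  user="${1:-$(gcloud config get-value account)}"
  # "cluster-admin-binding" is an arbitrary name chosen for this sketch.
  kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole=cluster-admin \
    --user="$user"
}
```

Run `grant_cluster_admin` once before deploying. The binding is cluster-wide, so it may be more than you want on a shared cluster.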
That didn't fix it either.
Did you switch to using a 1.7 version of GKE? If you did, you would need the --no-rbac flag because role-based access control is not the default for that version. The error message in your second deployment attempt appeared because the deployment was trying to create the pachyderm service account and grant it privileges that you yourself did not have.
I'm using the 1.8.8-gke.0 default version. I'm following these steps: http://docs.pachyderm.io/en/latest/deployment/google_cloud_platform.html With the following |
Okay. If you have already tried a clean deployment, you might want to jump into our Slack users channel and lay out your situation. Making yourself cluster admin should have gotten you past the issue you were having in the original post.
See also #2787 I resorted to starting against |
This is the workaround that worked for me with the current default version [1] http://pachyderm.readthedocs.io/en/latest/deployment/google_cloud_platform.html The workaround is as follows, after running the Pachyderm deployment steps:
The key thing is that the user is set to |
I got past this stage on GKE by following the above. Now, when following the tutorial, I can create a repo, but when attempting to add the blah.txt file I get:
$ pachctl put-file myrepo master -c -f blah.txt
From the web GUI I have manually added a file to the bucket I used when deploying Pachyderm. How do I update the permissions for the service account?
Hmm. This may be my problem, as I configured the Kubernetes cluster from the GUI and didn't set any scopes :-( https://stackoverflow.com/questions/29837531/changing-permissions-of-google-container-engine-cluster
So I needed to delete the previous K8s cluster and create another one. This time I clicked on the More tab and set Full access to the Storage API, and all is ok :-)
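The fix above was done through the web GUI. A rough command-line equivalent, as a sketch only, is to pass a storage scope when creating the cluster. The helper name `create_pach_cluster` and the choice of the `storage-full` scope alias are assumptions made to mirror "Full access to the Storage API". Check the gcloud documentation for any other scopes your workloads need.

```shell
# Hypothetical helper: create a GKE cluster whose nodes can access Cloud Storage.
# $1 is the cluster name; "storage-full" mirrors the GUI's full Storage API access.
create_pach_cluster() {
  name="$1"
  gcloud container clusters create "$name" --scopes storage-full
}
```

For example, `create_pach_cluster pach-cluster` would create a cluster named pach-cluster (a placeholder name).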
@stevef1uk the most likely cause here is that the IAM role for the cluster doesn't have the proper permissions. You can change that in the GCE web console. |
We should update the docs with the steps needed to ensure the GCP service account has the appropriate permissions for Pachyderm to deploy properly.
This is no longer an issue |
Pachyderm won't deploy to the latest default version of k8s in GKE. This was reported by a user, and I reproduced the issue. The default version in GKE is 1.8.8-gke.0. For this version or greater, the pachd pod errors and goes into CrashLoopBackOff with the following serviceaccount related errors:
However, if you use --cluster-version 1.7.14-gke.1 or earlier, everything seems to be ok.
To reproduce, follow the GCP docs for deployment with Pach version 1.7.0rc2.