Helm chart: update to traefik v2.5.x #431
Notes from debugging version bump failure

I figured I'd do some general maintenance and bump traefik. It turns out that between v2.1.9 and v2.2.0 there is a breaking change related to k8s RBAC permissions. We still mount the same kind of service account with its associated permissions, but traefik reacts differently and concludes that it now lacks permissions. I think what's happening is that Traefik starts to access k8s resources it previously didn't know about, and/or tries to access them across all namespaces instead of just the local one, or similar. Perhaps because of a change in a flag or config etc.

Here is the changelog: https://github.com/traefik/traefik/blob/master/CHANGELOG.md#v220-2020-03-25

This can be relevant for example:

These are the errors as reported by the traefik pod.

Fixing this, we got the following similar errors to be fixed in the same way - by adding permissions to read such resources in our ClusterRole that is coupled to a ServiceAccount via a ClusterRoleBinding.

Fixing that, we got errors about those resources not being known, so we had to install the associated CRDs as well.
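For readers unfamiliar with the RBAC wiring mentioned above, here is a minimal sketch of a ClusterRole that grants read access to custom resources and is bound to a ServiceAccount through a ClusterRoleBinding. It is not the chart's actual template: the `traefik-example` names and the `default` namespace are hypothetical, and the resource list is only an assumed sample of kinds from Traefik v2's `traefik.containo.us` API group, not the list added in this PR.

```yaml
# Illustrative only: names and the resource list are assumptions,
# not copied from the dask-gateway chart.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: traefik-example            # hypothetical name
rules:
  - apiGroups: ["traefik.containo.us"]
    resources:                     # assumed sample of Traefik CRD kinds
      - ingressroutes
      - middlewares
      - tlsoptions
    verbs: ["get", "list", "watch"]  # read-only access
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: traefik-example            # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-example            # must match the ClusterRole above
subjects:
  - kind: ServiceAccount
    name: traefik-example          # hypothetical ServiceAccount used by the traefik pod
    namespace: default             # hypothetical namespace
```

Granting the rule is not enough on its own: if the corresponding CustomResourceDefinitions are not installed in the cluster, the API server does not know the resource kinds at all, which matches the second round of errors described above.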
I can't really comment on the changes here, but I'm happy to test things out (either on PRs if necessary, or prior to a release)
@jcrist I just force pushed this with a rebase on master, fixing a trivial merge conflict in values.yaml
Have you tested this locally at all? It generally looks good, but our k8s tests may not be sufficient to test everything.
Not beyond confirming that the traefik instance actually starts and runs properly with these changes and version 2.5. I've not created dask-gateway clusters via k8s etc., so if that is missing from the tests, it could be relevant to check manually as well, which I haven't done.
No, we do create clusters as part of the test. I'm mainly wondering if the traefik logs are filling up with errors or warnings or anything - sometimes traefik will complain about issues but still keep working. I'll need to do a full QA before release anyway to clean up some things, so I'll check this then. Fine to merge as is. Thanks!
Ah, the logs didn't fill up with errors at least. I had also monitored the logs, since I had to debug lots of errors before getting things to a functional, running state. Thanks for the review and merge!!
I bumped Traefik to the latest version and unpinned the patch version of the image to v2.5, which I find reasonable at least until we have more regular releases.

To make traefik start and function properly, I also had to install a few CRDs and grant Traefik permission to inspect those CRDs. Specifically, I've added the following CRDs and associated read permissions (get, list, watch).