
Custom Scheduler Incompatible with Kubernetes 1.27 #3128

Closed
xrl opened this issue May 30, 2023 · 5 comments


xrl commented May 30, 2023

Bug description

AWS EKS recently released Kubernetes 1.27 for general availability. I upgraded, and today JupyterHub users reported that their singleuser notebook servers couldn't boot. The pods appeared in the correct namespace but never left Pending, with no Events attached to help debug.
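
For anyone hitting the same symptom, a rough sketch of how to trace a Pending pod back to the scheduler logs (the user-scheduler deployment name is the chart's default and the pod name is a placeholder, so adjust both to your install):

# A pod stuck in Pending with no Events usually means no scheduler ever picked it up
kubectl -n datascience describe pod jupyter-someuser

# The chart runs its custom scheduler as a Deployment; its logs are where the errors below came from
kubectl -n datascience logs deploy/user-scheduler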

Expected behaviour

Notebook servers boot on 1.27.

Actual behaviour

Turns out the pods used a custom scheduler, and that scheduler had an unrecoverable error:

W0530 17:41:37.219051       1 reflector.go:324] k8s.io/client-go/informers/factory.go:134: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource
E0530 17:41:37.219068       1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.CSIStorageCapacity: failed to list *v1beta1.CSIStorageCapacity: the server could not find the requested resource

which makes sense: on Kubernetes 1.27 the storage.k8s.io/v1beta1 CSIStorageCapacity API has been removed (it graduated to storage.k8s.io/v1 back in 1.24), so the old scheduler image can no longer list or watch it.

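A quick way to double-check what the upgraded cluster actually serves (a sketch, assuming kubectl access to the 1.27 cluster):

# csistoragecapacities is now only served under storage.k8s.io/v1;
# the v1beta1 version the old scheduler image expects is gone
kubectl api-resources --api-group=storage.k8s.io
kubectl api-versions | grep storage.k8s.io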

After manually patching the scheduler deployment to use the 1.27.1 image:

    image: registry.k8s.io/kube-scheduler:v1.27.1
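
One way to apply that in place (a sketch; the deployment and container names below are the chart's defaults, so double-check yours with kubectl -n datascience get deploy first):

# bump the user scheduler image in place; deployment/container names assumed
kubectl -n datascience set image deployment/user-scheduler \
  kube-scheduler=registry.k8s.io/kube-scheduler:v1.27.1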

The patched scheduler then hit another error, about the leader-election resource lock type, which was fixed by updating the scheduler ConfigMap to change resourceLock from endpoints to leases:

data:         
  config.yaml: |
    apiVersion: kubescheduler.config.k8s.io/v1beta3
    kind: KubeSchedulerConfiguration
    leaderElection:
      resourceLock: leases # used to be endpoints
      resourceName: jupyterhub-user-scheduler-lock
      resourceNamespace: "datascience"
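
The scheduler only reads this configuration at startup, so the pods need a restart to pick up the ConfigMap change; something along these lines (deployment name assumed):

# roll the user scheduler so it re-reads the updated config
kubectl -n datascience rollout restart deployment/user-scheduler
kubectl -n datascience rollout status deployment/user-scheduler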

How to reproduce

  1. Use Kubernetes 1.27

  2. Run the latest stable helm chart with:

    helm upgrade --install \
      jupyter-hub jupyterhub/jupyterhub \
      --namespace datascience \
      --version 2.0.0 \
      --values config.yaml

Your personal set up

EKS 1.27

@xrl xrl added the bug label May 30, 2023
Member

manics commented May 30, 2023

Is this bug still present in the latest dev release of this chart?
https://hub.jupyter.org/helm-chart/#development-releases-jupyterhub

@consideRatio
Member

Yeah, this is fixed in the latest dev release of the chart; there are breaking changes to review in #3113.

Author

xrl commented May 31, 2023

Yep, fixed! Closing the ticket. Hopefully this issue helps folks searching for the same error. My invocation:

helm upgrade --install \
  jupyter-hub jupyterhub/jupyterhub \
  --namespace datascience  \
  --version 3.0.0-0.dev.git.6175.hf9af31a3 \
  --values config.yaml

and I was off to the races.

@xrl xrl closed this as completed May 31, 2023
@seanturner026

@consideRatio Any insight into when this change will land in a non-dev release of the chart?

@consideRatio
Member

3.0.0-beta.1 is out, changelog written for it!
