Applying cluster.yaml on v1.13.8: failed calling webhook "cephcluster-wh-rook-ceph-admission-controller-rook-ceph.rook.io": connect: connection refused #14116
Comments
My |
Operator shows no errors or warnings. |
@subhamkrai What are the steps to manually disable the admission controller? I can't seem to find it from previous issues. |
I don't remember exactly, but setting the value
|
Thank you for your replies.
Are those supposed to be pods? I don't have any of those. I'm currently at v1.13.8: there is no |
https://github.com/rook/rook/blob/release-1.12/deploy/examples/operator.yaml#L509 It was there until 1.12; we removed it in 1.13.
|
@subhamkrai Thank you for pointing me in the right direction. I can see those resources:
So none of the ones you mentioned, or?
So no chance to set it to |
@maon-fp could you also share the svc list in the rook-ceph namespace? |
Also, could you share the top 10 lines of the rook operator pod's logs? |
Yes, of course. List of services:
First lines of operator log:
|
The logs didn't help much, but yes, delete the following resources (probably in the rook-ceph namespace)
Also, could you share the -o yaml output of the certificate and issuer mentioned above, to make sure that you are deleting the right resources? But yes, we need to clean up the three resources above. |
rook-admission-controller-cert:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  creationTimestamp: "2022-04-23T18:45:33Z"
  generation: 1
  name: rook-admission-controller-cert
  namespace: rook-ceph
  resourceVersion: "301286319"
  uid: 22aa348f-e223-4f98-870e-aab4ef1f71a9
spec:
  dnsNames:
  - rook-ceph-admission-controller
  - rook-ceph-admission-controller.rook-ceph.svc
  - rook-ceph-admission-controller.rook-ceph.svc.cluster.local
  issuerRef:
    kind: Issuer
    name: selfsigned-issuer
  secretName: rook-ceph-admission-controller
status:
  conditions:
  - lastTransitionTime: "2022-04-23T18:45:34Z"
    message: Certificate is up to date and has not expired
    observedGeneration: 1
    reason: Ready
    status: "True"
    type: Ready
  notAfter: "2024-07-11T18:45:34Z"
  notBefore: "2024-04-12T18:45:34Z"
  renewalTime: "2024-06-11T18:45:34Z"
  revision: 13
selfsigned-issuer:
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  creationTimestamp: "2022-04-23T18:45:32Z"
  generation: 1
  name: selfsigned-issuer
  namespace: rook-ceph
  resourceVersion: "138597982"
  uid: 68162730-aade-4670-b830-1cf97005ef5c
spec:
  selfSigned: {}
status:
  conditions:
  - lastTransitionTime: "2022-04-23T18:45:32Z"
    observedGeneration: 1
    reason: IsReady
    status: "True"
    type: Ready
rook-ceph-admission-controller:
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2022-04-23T18:45:34Z"
  name: rook-ceph-admission-controller
  namespace: rook-ceph
  resourceVersion: "214711462"
  uid: b62cac4d-ce0c-4f3d-aa19-ff2f9d9d553c
spec:
  clusterIP: 10.99.221.127
  clusterIPs:
  - 10.99.221.127
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - port: 443
    protocol: TCP
    targetPort: 9443
  selector:
    app: rook-ceph-operator
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {} |
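For anyone following along, removing the three resources dumped above (the Certificate, the Issuer, and the Service) might look roughly like the sketch below. The function name is hypothetical and not from this thread; the resource names and the rook-ceph namespace are taken from the YAML output above. It assumes the cert-manager CRDs are installed, which is why the fully qualified resource names are used.

```shell
# Hypothetical helper (name is illustrative, not part of Rook).
# Deletes the leftover admission controller resources shown in the YAML above.
# Assumption: they live in the rook-ceph namespace unless another one is passed.
delete_admission_controller_leftovers() {
  ns="${1:-rook-ceph}"
  # --ignore-not-found makes a re-run harmless if a resource is already gone.
  kubectl -n "$ns" delete certificates.cert-manager.io rook-admission-controller-cert --ignore-not-found
  kubectl -n "$ns" delete issuers.cert-manager.io selfsigned-issuer --ignore-not-found
  kubectl -n "$ns" delete service rook-ceph-admission-controller --ignore-not-found
}
```

Calling `delete_admission_controller_leftovers` with no argument targets rook-ceph. As the rest of the thread shows, this alone was not enough to clear the webhook error.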
I deleted those resources but still get (a slightly different) error:
I've also listed all resources in the namespace (list_rook_ceph.txt) and can find some admission controller resources:
$ grep admission list_rook_ceph.txt
secret/rook-ceph-admission-controller kubernetes.io/tls 3 2y3d
secret/rook-ceph-admission-controller-token-s47d8 kubernetes.io/service-account-token 3 3y105d
serviceaccount/rook-ceph-admission-controller 1 3y105d |
Try deleting the resources mentioned above. |
As stated before: the resources are already deleted. But now it complains about: |
Run kubectl get validatingwebhookconfigurations -A (search across all namespaces once). Also, I'm on holiday today, so I will look on Monday. Edit: I hope it's not something blocking you |
Thank you. Take your free time! I'm not really blocked.
|
I see the issue: you need to delete the rook-ceph-webhook (I forgot that webhooks are cluster-scoped resources). Also, here is the code: rook/pkg/operator/ceph/webhook-config.go, lines 258 to 282 in b32948c
|
Alright. I'm not into Go but I'll figure it out. Thank you for your help! |
Just to be 100% sure. Are you asking to run:
? I'm a bit worried as I can see 5 webhooks there. |
Yes, delete rook-ceph-webhook only |
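As a rough sketch of that last step: the snippet below assumes rook-ceph-webhook is a ValidatingWebhookConfiguration, which is what the earlier `kubectl get validatingwebhookconfigurations` hint suggests. The function name is hypothetical. Since the resource is cluster-scoped, no namespace flag is passed, and only the named configuration is deleted, so any other webhooks in the list stay untouched.

```shell
# Hypothetical helper (name is illustrative, not part of Rook).
# Assumption: the stale webhook is a ValidatingWebhookConfiguration
# named rook-ceph-webhook, as discussed in this thread.
remove_stale_rook_webhook() {
  name="${1:-rook-ceph-webhook}"
  # List first so the other webhook configurations can be reviewed.
  kubectl get validatingwebhookconfigurations
  # Cluster-scoped resource: delete by name only, no -n needed.
  kubectl delete validatingwebhookconfiguration "$name"
}
```

Running `remove_stale_rook_webhook` prints the current list and then removes only rook-ceph-webhook.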
It worked. Thanks a lot for the quick and competent answers! 🙇 |
Good to know it is working now @maon-fp |
I've upgraded rook from v1.10.11 to v1.13.8 step by step (v1.10.11 -> v1.11.11 -> v1.12.11 -> v1.13.8). On https://rook.github.io/docs/rook/v1.13/Upgrade/rook-upgrade/ I've read that the admission controller is gone (it was enabled in my setup by ROOK_DISABLE_ADMISSION_CONTROLLER: "false"). So I changed this to ROOK_DISABLE_ADMISSION_CONTROLLER: "true" while still running v1.12.11.
The upgrade to v1.13.8 went smoothly. Now I want to upgrade to Reef and try to apply the cluster.yaml. But this gives me:
Environment:
OS: Ubuntu 20.04.6 LTS (Focal Fossa)
Kernel (uname -a): 5.15.0-105-generic
Rook version (rook version inside of a Rook Pod): v1.13.8
Ceph version (ceph -v): ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
Kubernetes version (kubectl version): v1.29.2
Ceph health (ceph health in the Rook Ceph toolbox): HEALTH_OK