[BUG] Rancher can no longer provision harvester machines after restart #44912
Comments
maybe related to #44929 ? |
Seems to occur even after the fix for #44929, both when scaling and when creating a new cluster |
And I am on rancher v2.8.2 |
Looking at the created job (for a worker node scaleup):
The first thing the driver tries is to delete the non-existing pod, and that fails. I would expect a create instead. I just don't know where this command is generated. |
I could manually fix it:
|
@bpedersen2 do you have Rancher running inside a nested VM, or in the same Kubernetes cluster as Harvester itself? |
Following the manual fix steps by getting the kubeconfig and manually updating the secret in Rancher worked for me! |
No, it is running standalone. |
What I observe is that the token in harvester changes. Rancher is configured to use OIDC, and in the rancher logs I get
With a local user, it seems to work |
I re-registered the Harvester cluster using a non-OIDC admin account, and now the connection seems to be stable again. It looks like a problem with token expiration to me. |
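To check the "token changes" observation without pasting secrets into a thread, one option is to compare a fingerprint of the token in the kubeconfig Rancher has stored against a freshly downloaded one. The following is only a sketch: it assumes the kubeconfig carries a plain `token:` field under the user entry, and the function names and the sample token value are made up for illustration.

```python
import hashlib
import re

# Assumption: the kubeconfig stores the bearer token on a "token:" line.
_TOKEN_LINE = re.compile(r"^\s*token:\s*[\"']?([^\s\"']+)", re.MULTILINE)


def token_fingerprint(kubeconfig_text):
    """Return a short SHA-256 fingerprint of the first token, or None if absent."""
    match = _TOKEN_LINE.search(kubeconfig_text)
    if match is None:
        return None
    digest = hashlib.sha256(match.group(1).encode("utf-8")).hexdigest()
    return digest[:12]  # short enough to share, does not reveal the token


def token_changed(stored_kubeconfig, fresh_kubeconfig):
    """True when the two kubeconfigs carry different tokens.

    Note: also returns False when neither file contains a token line.
    """
    return token_fingerprint(stored_kubeconfig) != token_fingerprint(fresh_kubeconfig)


if __name__ == "__main__":
    # Hypothetical kubeconfig fragments; the token values are invented.
    stored = "users:\n- name: u\n  user:\n    token: kubeconfig-user-abc:secret1\n"
    fresh = stored.replace("secret1", "secret2")
    print("token changed:", token_changed(stored, fresh))
```

If the fingerprints differ after a restart or after the OIDC session ends, that would be consistent with the token expiration theory, although it does not prove it.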
I have the same problem:
Failed creating server [fleet-default/rke2-rc-control-plane-2aae5bdf-2m48z] of kind (HarvesterMachine) for machine rke2-rc-control-plane-5b74797746x4dpcs-ncdxf in infrastructure provider: CreateError: Downloading driver from https://HOST/assets/docker-machine-driver-harvester
Doing /etc/rancher/ssl
docker-machine-driver-harvester
docker-machine-driver-harvester: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
Trying to access option which does not exist
THIS ***WILL*** CAUSE UNEXPECTED BEHAVIOR
Type assertion did not go smoothly to string for key
Running pre-create checks...
Error with pre-create check: "the server has asked for the client to provide credentials (get settings.harvesterhci.io server-version)"
The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
Rancher v2.8.2 |
I have been stuck in this loop for many hours: |
OK, that worked for me. I have Rancher with users provided by Active Directory. |
Now I have this error:
Failed deleting server [fleet-default/rke2-rc-control-plane-3fba9236-dxptf] of kind (HarvesterMachine) for machine rke2-rc-control-plane-77f9455c9dx9xgsk-4kcwf in infrastructure provider: DeleteError: Downloading driver from https://HOST/assets/docker-machine-driver-harvester
Doing /etc/rancher/ssl
docker-machine-driver-harvester
docker-machine-driver-harvester: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
About to remove rke2-rc-control-plane-3fba9236-dxptf
WARNING: This action will delete both local reference and remote instance.
Error removing host "rke2-rc-control-plane-3fba9236-dxptf": the server has asked for the client to provide credentials (get virtualmachines.kubevirt.io rke2-rc-control-plane-3fba9236-dxptf) |
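Both the create and the delete failures in this thread end in the same message, "the server has asked for the client to provide credentials", which is what the Kubernetes client prints for an HTTP 401 Unauthorized response. So the driver reaches Harvester, but its credentials are rejected. A small sketch for pulling those lines out of a long provisioning log follows; the function name is hypothetical, and the pattern is derived only from the two messages quoted in this thread.

```python
import re

# Text the Kubernetes client emits for an HTTP 401 Unauthorized response.
UNAUTHORIZED = "the server has asked for the client to provide credentials"

# Captures the "(verb resource [name])" detail that follows the message.
_DETAIL = re.compile(
    re.escape(UNAUTHORIZED) + r"\s*\((\w+)\s+(\S+?)(?:\s+(\S+))?\)"
)


def find_credential_errors(log_text):
    """Return (verb, resource, name) for each rejected request found in the log.

    name is None when the message does not include an object name.
    """
    return [
        (m.group(1), m.group(2), m.group(3))
        for m in _DETAIL.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        'Error with pre-create check: "the server has asked for the client '
        'to provide credentials (get settings.harvesterhci.io server-version)"'
    )
    for verb, resource, name in find_credential_errors(sample):
        print(verb, resource, name)
```

If every failing request shows up here regardless of the resource, the problem is more likely the stored credential itself than a permission on one specific resource.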
Hi, |
I am on Harvester 1.2.1 and Rancher 2.8.3 (and waiting for 1.2.2 to be able to upgrade to 1.3.x eventually) |
Rancher Server Setup
Information about the Cluster
User Information
Describe the bug
After one of my Harvester nodes was unexpectedly rebooted, Rancher is no longer able to provision machines in the upstream Harvester HCI infrastructure.
Trying to scale up an existing managed RKE2 cluster from rancher gets the following error:
And creating a brand new cluster has a different error:
Looks like the connection between Rancher and Harvester is broken?