Rack update stuck #422
Comments
If it helps, the full URL to the rack is https://console.convox.com/organizations/583f0d00-02a5-41e4-badf-1815b7623eda/racks#cea49617-08ca-4afe-a6b1-96635fbfaca7
Hello Nick, can you confirm in the EKS UI the status of all node groups and of the cluster itself? Next time, feel free to use our forum: https://community.convox.com/
@heronrs Thanks for the quick reply. In the EKS dashboard, all node groups show as Ready, all workloads are green, and the k8s version is 1.17. I checked the update log in the Convox Console and it's still showing the same state, stuck on that last line:
@nickfishman Thanks for all the information. The lock error you see is at the console level, so I removed the lock and you should be able to try the update again.
(Apologies in advance if this isn't the right place to report this. Please let me know if there's a better place!)
This evening I tried to update a v3 rack running a fairly old version (3.0.38). I first ran
convox rack update 3.0.54
to bring it to the latest version running k8s 1.17. This ran quickly and succeeded without issues. I then ran
convox rack update 3.2.5
(the last version before k8s 1.19). Unfortunately, this update has been stuck for several hours with no new output. Here are the last lines from the Terraform run that show up in the https://console.convox.com logs:
The rack itself also appears to be in a stuck state:
After several hours, I tried a variety of approaches to try to unstick the update (including killing the underlying EC2 instances) but none have been successful.
How can I cancel and retry this update? Is there something I can do to ensure the update succeeds next time?
For reference, I am able to use kubectl to run various k8s commands according to the docs here: https://docs.convox.com/management/direct-k8s-access/. I also have access to the EKS cluster info through the AWS web console (I followed the instructions at https://community.convox.com/t/resolved-how-can-i-get-permission-to-access-the-eks-cluster-from-the-aws-console/828/2).
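Since kubectl access to the cluster is available, node readiness can also be checked from the command line rather than only in the EKS dashboard. The following is a minimal sketch, not an official Convox procedure: the helper name not_ready_nodes is hypothetical, and it assumes the default column layout of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION).

```shell
#!/bin/sh
# Hypothetical helper (not part of Convox or kubectl).
# Reads the output of `kubectl get nodes --no-headers` on stdin and prints
# the name of every node whose STATUS column is not exactly "Ready".
# Cordoned nodes ("Ready,SchedulingDisabled") are reported as well.
not_ready_nodes() {
  awk 'NF && $2 != "Ready" { print $1 }'
}

# Usage, assuming kubectl is already configured for the rack's cluster:
#   kubectl get nodes --no-headers | not_ready_nodes
# No output means every node reports Ready.
```

This only inspects node status; it does not touch the update lock mentioned in the reply, which lives at the console level.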