Rack update stuck #422

Closed
nickfishman opened this issue Feb 23, 2022 · 4 comments
@nickfishman

(Apologies in advance if this isn't the right place to report this. Please let me know if there's a better place!)

This evening I tried to update a v3 rack running a fairly old version (3.0.38). I first ran convox rack update 3.0.54 to bring it to the latest version running k8s 1.17. This ran quickly and succeeded without issues.

I then ran convox rack update 3.2.5 (last version before k8s 1.19). Unfortunately, this update has been stuck for several hours with no new updates. Here are the last lines from the terraform run that show up in the https://console.convox.com logs:

module.system.module.rack.module.api.data.aws_iam_policy_document.assume_api: Refreshing state...
module.system.module.rack.module.router.module.nginx.kubernetes_config_map.nginx-configuration: Refreshing state... [id=convoxprod-system/nginx-configuration]
module.system.module.rack.module.router.module.nginx.kubernetes_config_map.tcp-services: Refreshing state... [id=convoxprod-system/tcp-services]
module.system.module.rack.module.router.module.nginx.kubernetes_config_map.udp-services: Refreshing state... [id=convoxprod-system/udp-services]
module.system.module.rack.module.router.module.nginx.kubernetes_horizontal_pod_autoscaler.router: Refreshing state... [id=convoxprod-system/nginx]
module.system.module.rack.module.router.module.nginx.kubernetes_cluster_role_binding.ingress-nginx: Refreshing state... [id=ingress-nginx]
module.system.module.rack.module.router.module.nginx.kubernetes_deployment.ingress-nginx: Refreshing state... [id=convoxprod-system/ingress-nginx]
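When a log like the one above stops advancing, it can help to pull out the last resource Terraform reported refreshing. The following is a minimal sketch, assuming the log has been saved as plain text; `last_refreshed` and `REFRESH_RE` are hypothetical names, not part of Convox or Terraform. Note that Terraform refreshes resources concurrently, so the last line printed is only a hint and not necessarily the resource that is blocking.

```python
import re

# Matches Terraform refresh lines such as:
#   module.a.kubernetes_deployment.x: Refreshing state... [id=ns/name]
# The "[id=...]" suffix is optional (data sources are logged without it).
REFRESH_RE = re.compile(
    r"^(?P<address>\S+): Refreshing state\.\.\.(?: \[id=(?P<id>[^\]]*)\])?\s*$"
)


def last_refreshed(log_text):
    """Return (address, id) for the last 'Refreshing state...' line, or None.

    `id` is None when the line carries no [id=...] suffix.
    """
    last = None
    for line in log_text.splitlines():
        match = REFRESH_RE.match(line.strip())
        if match:
            last = (match.group("address"), match.group("id"))
    return last
```

Applied to the excerpt above, this would point at the `kubernetes_deployment.ingress-nginx` resource in the `convoxprod-system` namespace, which is a reasonable first place to look with kubectl.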

It's been like this for several hours. It looks like the rack is in a stuck state as well:

$ cx rack params
ERROR: state is locked for rack: <rackname>

After several hours, I tried a variety of approaches to unstick the update (including killing the underlying EC2 instances), but none were successful.

How can I cancel and retry this update? Is there something I can do to ensure the update succeeds next time?

For reference, I am able to run kubectl and run various k8s commands according to the docs here: https://docs.convox.com/management/direct-k8s-access/. I also have access to the EKS cluster info through the AWS web console (I followed the instructions at https://community.convox.com/t/resolved-how-can-i-get-permission-to-access-the-eks-cluster-from-the-aws-console/828/2).

@heronrs
Contributor

heronrs commented Feb 23, 2022

Hello Nick, can you confirm in the EKS UI whether all node groups and the cluster itself have status Active, and which k8s version it is currently displaying?

Next time feel free to use our forum https://community.convox.com/
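The status check requested above can also be scripted instead of read from the EKS UI. The following is a rough sketch, assuming the rack uses EKS managed node groups and that `client` is a boto3 EKS client (`boto3.client("eks")`); `summarize_eks` is a hypothetical helper name and the cluster name is a placeholder you would need to supply.

```python
def summarize_eks(client, cluster_name):
    """Collect cluster status, Kubernetes version, and node group statuses.

    `client` is expected to behave like boto3.client("eks"); only
    describe_cluster, list_nodegroups, and describe_nodegroup are called.
    """
    cluster = client.describe_cluster(name=cluster_name)["cluster"]

    nodegroups = {}
    token = None
    while True:
        # list_nodegroups is paginated; follow nextToken until exhausted.
        kwargs = {"clusterName": cluster_name}
        if token:
            kwargs["nextToken"] = token
        page = client.list_nodegroups(**kwargs)
        for name in page.get("nodegroups", []):
            detail = client.describe_nodegroup(
                clusterName=cluster_name, nodegroupName=name
            )
            nodegroups[name] = detail["nodegroup"]["status"]
        token = page.get("nextToken")
        if not token:
            break

    all_active = cluster["status"] == "ACTIVE" and all(
        status == "ACTIVE" for status in nodegroups.values()
    )
    return {
        "cluster_status": cluster["status"],
        "version": cluster["version"],
        "nodegroups": nodegroups,
        "all_active": all_active,
    }


# Example usage (requires boto3 and AWS credentials; the name is a placeholder):
#   import boto3
#   print(summarize_eks(boto3.client("eks"), "my-cluster-name"))
```

Passing the client in as an argument keeps the helper independent of how credentials and region are configured.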

@nickfishman
Author

@heronrs Thanks for the quick reply.

The EKS dashboard shows all node groups as Ready, all workloads with green status, and the k8s version as 1.17.

I checked the update log in the Convox Console and it's still showing the same state, stuck on that last line:

module.system.module.rack.module.router.module.nginx.kubernetes_deployment.ingress-nginx: Refreshing state... [id=convoxprod-system/ingress-nginx]

@heronrs
Contributor

heronrs commented Feb 24, 2022

@nickfishman thanks for all the information. The lock error you see is at the console level, so I removed the lock and you should be able to try the update again.
Meanwhile, we'll investigate what might have happened, although I can't give you an ETA sadly.
I'm closing this for now but if you still experience problems, feel free to create a thread in our forum https://community.convox.com/

heronrs closed this as completed Feb 24, 2022
heronrs self-assigned this Feb 24, 2022