
Deploy the support chart to meom-ige #1320

Merged · 6 commits · May 26, 2022

Conversation

@GeorgianaElena (Member) commented May 19, 2022

  • This adds config for deploying the support chart for meom-ige
  • Enables prometheus authentication
  • Removes the LoadBalancer + autohttps config since we'll be using nginx-ingress

The following manual steps are required:

  • deploy the support chart with:
    python3 deployer deploy-support meom-ige
  • retrieve the external IP address for the nginx-ingress load balancer.
    kubectl --namespace support get svc support-ingress-nginx-controller
  • point the DNS record to this IP instead of proxy-public
  • delete the proxy-public service (a sketch of these two steps follows this list)
  • do a deploy of the cluster
    python3 deployer deploy meom-ige
  • merge this PR
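
For the two steps that don't come with explicit commands above (repointing DNS and deleting proxy-public), here is a minimal sketch; the hub namespace "staging" is an assumption, and the DNS change itself is provider-specific and not part of this PR:

    # Look up the external IP of the nginx-ingress controller (same command as above)
    kubectl --namespace support get svc support-ingress-nginx-controller

    # Point the cluster's DNS A record at that IP instead of proxy-public's IP
    # (done in the DNS provider's console/CLI; not shown here)

    # Delete the old proxy-public LoadBalancer service from the hub namespace;
    # "staging" is a placeholder for whichever namespace(s) the hub runs in
    kubectl --namespace staging delete svc proxy-public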

@2i2c-org/tech-team, can you please confirm that the manual steps mentioned above ⬆️ are ok?

Ref: #1278 and #594

@sgibson91 (Member)

Manual steps also sound reasonable to me! 🙌🏻

@GeorgianaElena (Member, Author)

@2i2c-org/tech-team, note that I plan to run the manual steps today, as soon as there are no users on the hub.

@yuvipanda, I would really appreciate it if you could confirm whether the resource limits I chose for prometheus are sensible enough for this cluster 👀

@GeorgianaElena (Member, Author)

@2i2c-org/tech-team, I believe I need some help debugging an issue during the support chart deploy.

The support-ingress-nginx-admission-create pod is hanging and I see these events:

 Normal   Scheduled               11m    gke.io/optimize-utilization-scheduler  Successfully assigned support/support-ingress-nginx-admission-create-xx5r7 to gke-meom-ige-cluster-core-pool-d987f2c6-jw1c
  Warning  FailedCreatePodSandBox  11m    kubelet                                Failed to create pod sandbox: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: no such file or directory"
  Normal   Pulled                  8m17s  kubelet                                Container image "docker.io/jettech/kube-webhook-certgen:v1.5.1" already present on machine
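
(Side note, not from the thread itself: pod events like these are typically pulled with kubectl describe; the pod name below is copied from the event message above:)

    kubectl --namespace support describe pod support-ingress-nginx-admission-create-xx5r7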

@yuvipanda (Member)

@GeorgianaElena can you try deleting the pod and trying again? I've seen similar issues caused by race conditions go away on restart.
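
A sketch of that, using the pod name from the events above; the admission-create pod belongs to a Job, so a fresh one gets created on the next deploy:

    kubectl --namespace support delete pod support-ingress-nginx-admission-create-xx5r7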

@GeorgianaElena (Member, Author)

Deleted that pod, then ran the deploy again, and now I'm getting an x509: certificate signed by unknown authority error from validate.nginx.ingress.kubernetes.io

@yuvipanda (Member) commented May 26, 2022

@GeorgianaElena delete all pods in the support namespace? 😁
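
Roughly, that would be (a blunt reset; every controller in the namespace recreates its own pods afterwards):

    kubectl --namespace support delete pods --all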

@choldgraf (Member)

Cattle not pets!

@GeorgianaElena (Member, Author)

Deleted all of them, then ran deploy-support again and got the same error. I'm pasting its full text here:

Error: UPGRADE FAILED: cannot patch "support-prometheus-server" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://support-ingress-nginx-controller-admission.support.svc:443/networking/v1beta1/ingresses?timeout=10s": x509: certificate signed by unknown authority
Traceback (most recent call last):

I keep seeing suggestions online to delete the support-ingress-nginx-admission webhook, but there has to be a better way?

@yuvipanda (Member)

@GeorgianaElena if you delete the hook, helm should just recreate it next time, right? It's a job, so it might work ok.
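
For the record, deleting the hook would look roughly like this; the resource name is an assumption based on the release name (support) and ingress-nginx's usual naming:

    # ValidatingWebhookConfiguration is cluster-scoped, so no --namespace flag is needed
    kubectl delete validatingwebhookconfiguration support-ingress-nginx-admission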

@GeorgianaElena (Member, Author)

Deleting the hook did the trick! I feel like I still need to wrap my head around how, in the k8s world, the first approach to solving something most of the time seems to be to delete things until they come back up healthy and a certain dependency order is achieved 🤯

Thanks a lot for the help @yuvipanda <3

@choldgraf (Member)

For what it's worth, @GeorgianaElena, my strategy for debugging literally any issue in Binder when we were first operating it was "Something wrong? Delete everything in Kubernetes except for the helm installation" 😂

@GeorgianaElena (Member, Author)

This is now done, so I'll merge it and then add this cluster to the central Grafana.

@GeorgianaElena merged commit 4db823c into 2i2c-org:master on May 26, 2022
@GeorgianaElena (Member, Author)

🚀 Thanks everybody

@github-actions

🎉🎉🎉🎉

Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/workflows/deploy-hubs.yaml?query=branch%3Amaster
