Deploy support chart for the openscapes hub #827
Conversation
LGTM, although I think there are two follow-ups:
- DNS entry for the Grafana domain you picked: Setup prometheus + grafana for carbonplan #533 (comment)
- The grafana dashboards you mentioned
Thanks @damianavila! :D
Waa, didn't know we had steps for this written somewhere! This is great <3! TY! Update: I'm manually deploying the support chart now!
@damianavila, I deployed the support chart manually, but it finished with a timeout error.
Any ideas on how to spin up a node for the support pods?
Mmm... I was expecting most of the support-related pods to be scheduled on the master node, which also serves as the core node in our kops-based deployments. @yuvipanda might have more details/thoughts/ideas... Also, wondering if we would face this one as well: #594
Btw, if things get really complicated here, my inclination would be to default to the current stability and keep the status quo, even if that means no support (or Grafana) stuff in the openscapes cluster. We are super close to the event start date, and changes should be as minimal and simple as possible if we want to avoid surprises during the event...
Yeah, you definitely need the same extra tolerations the CoreDNS pod gets on all the pods we schedule on the master nodes. Might help to timebox the support node deployment setup. You don't have to use the nginx ingress for the hubs, so the load balancer can just stay for the hubs; we don't have to touch that. However, we must make sure that the extra pods present on the master node, particularly Prometheus, don't put undue extra stress on the master node components. So let's make sure they have CPU and memory limits set, and maybe put them on a node by themselves?
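A minimal sketch of what the tolerations and limits above could look like in the support chart's values, assuming the prometheus-community chart's `server` key layout; the taint key matches the standard kops master taint, but the resource numbers are illustrative assumptions, not values from this thread:

```yaml
# Hedged sketch: let the Prometheus server pod tolerate the kops master
# taint (the same one CoreDNS tolerates), and cap its resource usage so
# it cannot starve the master node components.
prometheus:
  server:
    tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
    resources:
      requests:
        cpu: 100m
        memory: 512Mi
      limits:
        cpu: "1"
        memory: 1Gi
```

The same `tolerations`/`resources` pattern would need to be repeated for each support-chart component (Grafana, exporters, etc.) that should land on the master node.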
But let's think of this as 'adding Prometheus and Grafana' rather than 'migrating a kops-based deployment to use the support chart'. Let's migrate off kops soon instead!
From these conversations, how about this proposal:
This will definitely take more than 1 hour, IMHO (in fact, I have already spent more than 1 hour looking at some of these details). More importantly, it implies modifications big enough that I am not comfortable pushing forward on a Friday evening before the Monday start of the event. I would suggest closing this PR now and focusing on the transition to EKS after the event.
This sounds like a good plan to me. I agree this feels like we are doing too many things "for the first time and with uncertainty about whether it'll work" just before an event... |
Closing this one now to avoid confusion about what things are really active for the upcoming event. Thanks for opening the PR, @GeorgianaElena!! |
After this PR is merged and the support chart is deployed, we should manually trigger the grafana deployer action in order to have the grafana dashboards available for this hub.
Ref: #810
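For triggering the deployer action from the CLI instead of the GitHub UI, something along these lines could work with the GitHub CLI; note that the workflow file name and the `hub` input are hypothetical placeholders, not taken from this repo:

```bash
# Hypothetical workflow file name and input; check the repo's
# .github/workflows/ directory for the actual grafana deployer action.
gh workflow run grafana-deploy.yaml -f hub=openscapes

# Follow the triggered run until it completes.
gh run watch
```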
Note:
(from jupyterhub/grafana-dashboards)
NOTE: ANY CHANGES YOU MAKE VIA THE GRAFANA UI WILL BE OVERWRITTEN NEXT TIME YOU RUN deploy.bash. TO MAKE CHANGES, EDIT THE JSONNET FILE AND DEPLOY AGAIN
So just a heads-up that if we deploy the dashboards through the action, any changes made to the other hubs' Grafanas will be overwritten.
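For reference, a manual run of the upstream deploy script looks roughly like this; treat the token variable and the hostname argument as assumptions and double-check the jupyterhub/grafana-dashboards README before running, since the script overwrites dashboards as noted above:

```bash
git clone https://github.com/jupyterhub/grafana-dashboards
cd grafana-dashboards

# Assumed: an admin-level Grafana API key exposed via GRAFANA_TOKEN.
export GRAFANA_TOKEN="<api-key>"

# Hostname is illustrative; point it at the target hub's Grafana.
./deploy.bash grafana.openscapes.example.org
```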