Deploy mctc to the cluster via our internal ArgoCD instance #14

Closed
Tracked by #12
david-martin opened this issue Jan 24, 2023 · 11 comments
No description provided.

david-martin mentioned this issue Jan 24, 2023
david-martin changed the title from "Deploy mctc to the cluster via our internal instance" to "Deploy mctc to the cluster via our internal ArgoCD instance" Jan 24, 2023
roivaz self-assigned this Jan 26, 2023
roivaz commented Jan 31, 2023

@david-martin I have some questions regarding the deployment of mctc in our HCG cluster:

  • The controller currently uses defaultCtrlNS = "argocd", so I'm assuming argocd also needs to be installed in the HCG cluster as part of the service, is that right?
  • Do we want to re-use the DNS zones and Let's Encrypt accounts that glbc was using in unstable/stable in kcp?

david-martin commented:

  • The controller currently uses defaultCtrlNS = "argocd", so I'm assuming argocd also needs to be installed in the HCG cluster as part of the service, is that right?

I suspect the reason for this was simplicity of setup and reuse of the argocd secrets.
There should be no hard dependency on argocd.
It might be good practice to have mctc in a different namespace to avoid an accidental dependency.
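To make that concrete, here is a minimal sketch of how the controller namespace could be made configurable instead of hard-coded; the env var name and the fallback are assumptions for illustration, not the actual mctc code:

```go
package main

import (
	"fmt"
	"os"
)

// controllerNamespace resolves the namespace the controller uses for its own
// resources. CONTROLLER_NAMESPACE is a hypothetical env var; "argocd" mirrors
// the current defaultCtrlNS fallback.
func controllerNamespace() string {
	if ns := os.Getenv("CONTROLLER_NAMESPACE"); ns != "" {
		return ns
	}
	return "argocd" // current default, kept only as a fallback
}

func main() {
	fmt.Println("controller namespace:", controllerNamespace())
}
```

With something like this, the deployment could set CONTROLLER_NAMESPACE to a dedicated mctc namespace and drop the implicit argocd coupling.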

  • Do we want to re-use the DNS zones and Let's Encrypt accounts that glbc was using in unstable/stable in kcp?

Yes, for simplicity and to re-use the LE accounts (rate limits are tied to the account).

roivaz commented Feb 3, 2023

  • cert-manager installed using kustomize
  • initial deploy of mctc to "unstable" namespace. Things to work on:
    • gateway-api is a requirement
    • cert-manager issuer is hardcoded, need to parameterize this

roivaz commented Feb 9, 2023

@david-martin @maleck13 @pmccarthy I have been progressing with the deployment of the unstable environment to the HCG cluster and there is already an initial setup. It does not work yet as there are still things to fix, but I wanted to share some thoughts about the problems we will have if we run both the stable and unstable instances of HCG in the same cluster:

  • cert-manager needs to be shared by both instances. This is not much of a concern as we can have different Issuers/ClusterIssuers. The only downside is that we need to use the same cert-manager release for both envs, so there is no way to test a cert-manager upgrade in unstable first.
  • We will face the same problem with the gateway API; we can only have one installation per cluster. This case is probably more problematic than cert-manager, since gateway API is a less mature technology that will change more over time.
  • The mctc CRDs are shared and there is no way around it. An upgrade to unstable could potentially break stable if the changes to the API are not backwards compatible.
  • The mctc controller right now watches all namespaces, which means both controllers would be watching the same set of cluster secrets, ingresses, etc. I'm not sure if the idea is to keep the cluster-wide watch or to change it to watch specific namespaces. Let me know.
  • Lastly, I'm not sure what the webhook setup is right now as I think it is being worked on, so I'm not sure whether that could also be problematic.

We could just deploy unstable for the time being and decide later what to do, but to me the shared CRD item in particular calls for having one cluster per environment.

david-martin commented:

just deploy unstable for the time being

+1 to this.

  • The mctc controller right now watches all namespaces, which means both controllers would be watching the same set of cluster secrets, ingresses, etc. I'm not sure if the idea is to keep the cluster-wide watch or to change it to watch specific namespaces. Let me know.

The controller could be changed to watch for secrets in specific namespaces so there is no overlap.
Regardless, the model will soon change so that a syncer component talks back to the control plane, meaning those cluster secrets won't be used.
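For reference, a rough controller-runtime sketch of what a namespace-scoped watch could look like (the namespace names are made up, and this is not the actual mctc wiring); cache.MultiNamespacedCacheBuilder was the usual way to do this in controller-runtime releases of that era:

```go
package main

import (
	"os"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

func main() {
	// Restrict the manager's cache to the namespaces this instance owns, so a
	// stable and an unstable controller in the same cluster would not watch
	// each other's cluster secrets. Namespace names here are illustrative only.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		NewCache: cache.MultiNamespacedCacheBuilder([]string{
			"mctc-unstable",
			"argocd",
		}),
	})
	if err != nil {
		os.Exit(1)
	}

	// Controllers would be registered against mgr here before starting it.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```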

maleck13 commented Feb 9, 2023

@roivaz I agree with the items you outlined. I think a second cluster makes sense, but for now we can just deploy unstable as you suggest.

roivaz commented Feb 10, 2023

@maleck13 @david-martin I'm reviewing the errors I get in the mctc controller logs and there is a lot of noise in there caused by the kcp-glbc envs: there are secrets in the cluster pointing to those kcp api endpoints and mctc is trying to pick them up and failing miserably (or at least I think that's the problem...). This might be a good time to do some cleanup; it would help me debug what's going on and get the mctc-unstable env to a healthier state.

Is there any reason not to undeploy the kcp-glbc envs and clean up other test namespaces (some also contain cluster secrets)? I would also suggest removing ACM if we are not using it right now, as it also deploys some cluster secrets and other resources. We can always bring back anything we need in the future.

david-martin commented:

Any reason to not undeploy the kcp-glbc envs

No. That makes sense.

removing ACM if we are not using it right now

Agreed. It has served its purpose already.

roivaz commented Feb 10, 2023

Cleaned up ACM and kcp-glbc envs and some empty demo namespaces. I haven't deleted the kcp-* namespaces in case any cleanup is required first/also in the kcp api servers.

roivaz commented Feb 10, 2023

TODO:

  • Fix the webhook setup after the changes made in the mctc upstream repo
  • Parameterize the cert-manager issuer. Also, go back to a ClusterIssuer now that only the unstable env will live in the cluster (at least for some time); see the sketch below.
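As a rough sketch of the issuer parameterization (the flag names and defaults are assumptions for illustration, not the actual mctc flags):

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// Hypothetical flags: let the deployment choose which cert-manager issuer
	// the controller references instead of hard-coding it.
	issuerName := flag.String("cert-issuer", "mctc-ca", "name of the cert-manager issuer to reference")
	issuerKind := flag.String("cert-issuer-kind", "ClusterIssuer", "Issuer or ClusterIssuer")
	flag.Parse()

	fmt.Printf("using %s/%s for certificate requests\n", *issuerKind, *issuerName)
}
```

The unstable overlay could then pass a ClusterIssuer while another environment points at a namespaced Issuer.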

maleck13 modified the milestones: m3 Control Plane Tenancy and Policy, m2 Control Plane Gateways Mar 22, 2023
roivaz commented Mar 23, 2023

david-martin closed this as not planned May 18, 2023