
feat!: monitoring multiple clusters #266

Merged · 4 commits · Jan 24, 2024
Conversation

@prometherion (Contributor) commented on Nov 16, 2023

Closes #265

📑 Description

⚠️ This PR introduces breaking changes; please read it carefully

  • A new optional API field has been introduced, k8sgpt.spec.kubeconfig: when it is defined and a valid kubeconfig is provided, the k8sgpt server will target the configured cluster
  • For each k8sgpt instance, a Deployment matching its name will be created
  • Consequently, multiple Deployments mean multiple Pods, which must be labelled with different selectors
  • For previous installations, the Sync function will fail unless the old Deployment is manually deleted, since labelSelector is an immutable field
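To illustrate the bullets above, a K8sGPT manifest using the new field might look roughly like this. This is a hedged sketch: everything other than the presence of spec.kubeconfig (the apiVersion, names, the exact shape of the secret reference, and the AI settings) is an assumption to be checked against the project's samples directory.

```yaml
apiVersion: core.k8sgpt.ai/v1alpha1
kind: K8sGPT
metadata:
  name: remote-cluster            # the Deployment will match this name
  namespace: k8sgpt-operator-system
spec:
  kubeconfig:                     # optional: omit to target the local cluster
    name: remote-kubeconfig       # Secret containing the kubeconfig (assumed shape)
    key: kubeconfig               # key inside the Secret (assumed)
  ai:
    backend: openai               # illustrative AI settings
    model: gpt-3.5-turbo
    secret:
      name: k8sgpt-openai
      key: openai-api-key
```

Each K8sGPT instance then gets its own Deployment and label selector, which is what makes monitoring multiple clusters from one operator possible.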

✅ Checks

  • My pull request adheres to the code style of this project
  • My code requires changes to the documentation
  • I have updated the documentation as required
  • All the tests have passed

ℹ Additional Information

There's no need to bump the k8sgpt Custom Resource Definition version: fields were only added, not reorganized. Note, however, that if the Operator was installed through Helm, the user must manually update the generated Custom Resource Definition to take full advantage of this feature.
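Since Helm installs CRDs only on first install and skips them on upgrade, the manual CRD update mentioned above might look like the following. The manifest path and flags are illustrative assumptions, not taken from the project's docs:

```
# Helm does not upgrade CRDs on `helm upgrade`; re-apply the updated
# CRD manifest from the chart or repository source manually, e.g.:
kubectl apply --server-side -f config/crd/bases/core.k8sgpt.ai_k8sgpts.yaml
```

This is cluster-dependent and should be run against the cluster where the Operator is installed.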

@prometherion prometherion requested review from a team as code owners November 16, 2023 20:58
@prometherion prometherion force-pushed the issues/265 branch 5 times, most recently from 3af4434 to af7786d Compare November 20, 2023 16:06
AlexsJones previously approved these changes Nov 30, 2023

@AlexsJones (Member) commented:

Overall looks good
A) I'd like to see an example in the readme and samples of how to use this
B) I think there needs to be some thought around how a user would add a kubeconfig to an existing deployment, or remove one, and whether doing so would have any impact or cause damage.

We will need to think about how to release this

@prometherion (Contributor, Author) replied:

Thanks for the review, Alex.

I just provided a brief overview of the feature and explained the use cases we're building on our infrastructure.

@AlexsJones (Member) replied:

Apologies for the delay, @prometherion. If you can resolve the conflicts, we will merge this and inform the community of the change.

AlexsJones
AlexsJones previously approved these changes Dec 29, 2023
@prometherion (Contributor, Author) replied:

Thanks, @AlexsJones, just rebased.

@matthisholleville (Contributor) left a review comment:


Thank you for your contribution! Very nice enhancement!

controllers/k8sgpt_controller.go (comment resolved)
@@ -273,6 +274,35 @@ func GetDeployment(config v1alpha1.K8sGPT) (*appsv1.Deployment, error) {
},
},
}
if outOfClusterMode {
A Contributor commented on the diff:

Shouldn't we standardize the code based on the Spec? For example:

if config.Spec.Kubeconfig {

@prometherion (Contributor, Author) replied, quoting the review:

> Shouldn't we handle all these changes by updating the version of the k8sgpt resource?

It depends on the API governance of the project. Sticking to the best practices of Kubernetes Operator development, as agreed on several Cluster API projects and in Kubernetes itself, adding fields to a specification doesn't require a version bump, since it's a feature addition that only requires an update of the deployed Operator.

> This would allow us to maintain two versions of the controller,

In Operator development, it's highly discouraged to have two controllers for the same resource, even though it has multiple API versions. Rather, one version is elected as the Hub: it is the one stored in etcd, and it is used as the "hub" to convert back and forth from the other versions.

> preventing any breaking changes and avoiding complicating the migration process

As I stated before, adding fields is not considered a breaking change. The migration process can be achieved by leveraging the Operator SDK/controller-runtime machinery, such as the conversion webhook, which translates API objects between versions transparently for the end user. Adding a conversion requires a lot more code, as well as webhook management (which requires CA handling): it should be put in place if and only if it is definitely required, and this doesn't seem to be the case here.
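The hub-and-spoke conversion pattern described above can be sketched in self-contained Go. This is an illustrative toy, not the k8sgpt-operator's code: the interfaces loosely mirror controller-runtime's conversion.Hub and conversion.Convertible, but all type and field names here are hypothetical.

```go
package main

import "fmt"

// Hub marks the elected "hub" version: the one stored in etcd.
type Hub interface{ Hub() }

// Convertible is implemented by spoke versions, which only know how
// to convert to and from the hub (never spoke-to-spoke).
type Convertible interface {
	ConvertTo(dst Hub) error
	ConvertFrom(src Hub) error
}

// K8sGPTv1beta1 plays the hub; it carries the newer optional field.
type K8sGPTv1beta1 struct {
	Kubeconfig string
}

func (*K8sGPTv1beta1) Hub() {}

// K8sGPTv1alpha1 plays a spoke that predates the Kubeconfig field.
type K8sGPTv1alpha1 struct{}

// ConvertTo fills the hub; fields absent in this version get zero values.
func (s *K8sGPTv1alpha1) ConvertTo(dst Hub) error {
	h := dst.(*K8sGPTv1beta1)
	h.Kubeconfig = ""
	return nil
}

// ConvertFrom drops hub fields unknown to this older version.
func (s *K8sGPTv1alpha1) ConvertFrom(src Hub) error {
	_ = src.(*K8sGPTv1beta1)
	return nil
}

func main() {
	var hub K8sGPTv1beta1
	spoke := &K8sGPTv1alpha1{}
	if err := spoke.ConvertTo(&hub); err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Printf("converted v1alpha1 -> hub, kubeconfig=%q\n", hub.Kubeconfig)
}
```

In controller-runtime, a real conversion webhook dispatches exactly this kind of to/from-hub call, which is why it needs serving certificates: the extra machinery the comment argues against introducing for a purely additive field.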

Signed-off-by: Dario Tranchitella <dario@tranchitella.eu>
@prometherion (Contributor, Author) commented:

@AlexsJones I'd like to do my best to get this merged, and I tested it thoroughly.

In the case of a previous k8sgpt-operator installation, the previous Deployment won't be modified, since the change introduced with this PR requires a label selector named 1:1 after the K8sGPT instance.

With that said, although some errors will appear in the Operator's logs, the previous behaviour is preserved until the cluster administrator manually deletes the k8sgpt instance's Deployment, letting it be redeployed according to the new template.

Furthermore, since we only have an additional field, there's no need to bump the CRD version: it's a backward-compatible change. I'm raising the seniority flag here, since it's the same review comment I received from Vince on the AWS Cluster API provider.

Implementing a new CRD version would be overkill, since it would require a conversion webhook, which in turn requires a TLS certificate; also, the current version (v1alpha1) still accepts breaking changes, although this is not the case here.
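The manual migration step described above might look like the following sketch. The namespace and the Deployment name (assumed to match the K8sGPT instance name) are illustrative assumptions:

```
# Label selectors are immutable, so the operator cannot patch the old
# Deployment in place. Delete it and let the Sync loop recreate it
# with the new per-instance selector:
kubectl -n k8sgpt-operator-system delete deployment <k8sgpt-instance-name>
```

This is cluster-dependent; until it is done, the old Deployment keeps running with the previous behaviour, as noted above.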

@arbreezy (Member) commented:

It looks good, and I agree @prometherion, we don't need a conversion hook yet.
Let me test it a little more, given the recent changes that have been merged, and I will update the review on Wednesday.
Thanks for your patience :)

@arbreezy (Member) left a review comment:

LGTM; I think we are good to merge @prometherion @AlexsJones

Apologies for the delay again!

pkg/resources/k8sgpt.go (comment resolved)
@arbreezy arbreezy merged commit 95a67a0 into k8sgpt-ai:main Jan 24, 2024
6 checks passed
Closes: [Feature]: Monitoring multiple clusters (#265)
4 participants