Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split Metric and Lifecycle into two distinct operators #823

Closed
11 of 13 tasks
thisthat opened this issue Feb 13, 2023 · 4 comments
Closed
11 of 13 tasks

Split Metric and Lifecycle into two distinct operators #823

thisthat opened this issue Feb 13, 2023 · 4 comments
Labels
enhancement New feature or request epic
Milestone

Comments

@thisthat
Copy link
Member

thisthat commented Feb 13, 2023

Goal

Split the current operator into two separate operators, one for each CRD group.

Technical Details

There are two main reasons why this change is beneficial for end users:

  • we can scale resources/replicas independently
  • if we use only one feature and not the other

Furthermore, we can restrict the permissions for each operator with more fine granularity.

This split can cause issues during upgrades. In the case the new metric operator is up-and-running while the old single-operator runs, there are two controllers on the same resource that try to write. Bumping the Metric CRD version should suffice to avoid this problem.

General action plan

  • split operators
  • remove KeptnEvaluationProvider in favor of having a single provider that points to the metrics operator
  • create new KeptnMetricsProvider CRD which references external o11y providers (prometheus, dynatrace,...) in the new metrics operator
  • move KeptnMetrics to metrics operator
  • have KLT request metric values from the KeptnMetrics CR

Acceptance Criteria

DoD

  • Metric CRDs version is bumped to v1alpha2
  • Two operators are available
  • Tests are adapted

Tasks

  1. enhancement
    odubajDT
  2. enhancement
    bacherfl
  3. enhancement
    odubajDT
  4. enhancement
    odubajDT
  5. enhancement
    bacherfl
  6. enhancement
    odubajDT
  7. enhancement
    mowies
  8. thschue
  9. enhancement
    thisthat
@thisthat thisthat added the enhancement New feature or request label Feb 13, 2023
@thisthat thisthat added this to the 0.7 milestone Feb 13, 2023
@thisthat thisthat added the status: ready-for-refinement Issue is relevant for the next backlog refinment label Feb 14, 2023
@bacherfl
Copy link
Member

Looking into this I found some questions that we need to address:

  • With the KeptnEvaluationProvider CRD being used by both of the operators, we probably need to move this into the new metrics-operator as well to avoid cyclic dependencies between both operators (since the previous operator will also refer to KeptnMetric CRDs). The problem here is however that the KeptnEvaluationProvider CRD is part of the lifecycle api group, so we'd have part of this group in the new metrics-operator as well, which feels unclean. Probably we'll have to move this into a common package, but given that the EvaluationProvider structs represent a CRD that's set up with kubebuilder, we'll have to find a best practise for sharing CRDs between the two modules containing the operators using them

@mowies
Copy link
Member

mowies commented Feb 14, 2023

Sharing CRDs is generally not a best practise I think 😄
But didn't we plan to have some generic endpoint in the metrics operator to query metrics from? I think that would also be the place for KLT to fetch metrics from. And then the rest is handled in the metrics operator. That way we could achieve clear separation of concerns.

In any case I would suggest to keep API groups in one operator and not split them over 2 operators.

@thisthat
Copy link
Member Author

If we can't easily have a shared "CRD", I agree with @mowies. The Providers shall be moved into the metric-operator. The lifecycle-operator should only use keptn-metric for evaluations.
However, we have a shared read and not write of the KeptnEvaluationProvider

@bacherfl
Copy link
Member

I think there was a plan for retrieving metrics from that generic endpoint. However, right now we also support KeptnEvaluationDefinitions referring to e.g. Prometheus and Dynatrace EvaluationProviders directly as well (i.e. not only the KeptnMetrics). So the problem is that in the metrics.KeptnMetrics we currently depend on lifecycle.KeptnEvaluationProvider, which is also a dependency of lifecycle.KeptnEvaluationDefinition.

In my opinion, the cleanest approach would be to introduce something like metrics.KeptnMetricsProvider that can be used by the metrics.KeptnMetrics controller, and only refer to metrics.KeptnMetrics in the lifecycle.KeptnEvaluations - here we could then also go via the generic endpoint, as @mowies suggested

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request epic
Projects
Archived in project
Development

No branches or pull requests

3 participants