
Add helm prometheus metrics support and handle MultiNamespace scenario for metrics #2603

Conversation

@camilamacedo86 (Contributor) commented Feb 27, 2020

Description of the change:

  • Add support for metrics in the MultiNamespace scenario: the kube metrics are exported from the operator namespace by default; however, if WATCH_NAMESPACE contains a list of namespaces, then all of them will be used to export the metrics
  • Add Prometheus metrics support to Helm-based operators (create the ServiceMonitor as it should be)
  • Change the default scaffold to export metrics accordingly, fix issues found, and clean up the code by centralizing the shared logic
  • Keep the Ansible, Helm and Go operators on the same logic to export metrics
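The multi-namespace behaviour described in the first bullet boils down to plain string handling of the WATCH_NAMESPACE value. A minimal, dependency-free sketch (function and parameter names here are illustrative, not the PR's actual identifiers):

```go
package main

import (
	"fmt"
	"strings"
)

// getNamespacesForMetrics sketches the behaviour described above: kube
// metrics are exported from the operator namespace by default, but when
// WATCH_NAMESPACE holds a comma-separated list, every listed namespace
// is used instead.
func getNamespacesForMetrics(operatorNs, watchNamespace string) []string {
	// Default: export the kube metrics from the operator's own namespace.
	ns := []string{operatorNs}

	// Multi-namespace scenario: WATCH_NAMESPACE contains a list.
	if strings.Contains(watchNamespace, ",") {
		ns = strings.Split(watchNamespace, ",")
	}
	return ns
}

func main() {
	// Cluster-scoped (empty WATCH_NAMESPACE) and namespaced operators
	// fall back to the operator namespace.
	fmt.Println(getNamespacesForMetrics("operator-ns", ""))
	// Multi-namespace operators export metrics for every watched namespace.
	fmt.Println(getNamespacesForMetrics("operator-ns", "ns1,ns2,ns3"))
}
```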

Motivation for the change:

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 27, 2020
@camilamacedo86 camilamacedo86 changed the title Improvements in the scaffolded serveCRMetrics Metrics for MultiNamespace scenario Feb 27, 2020
camilamacedo86 pushed a commit to camilamacedo86/operator-sdk that referenced this pull request Feb 27, 2020
@camilamacedo86 camilamacedo86 removed the request for review from lilic February 27, 2020 12:31
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 27, 2020
@camilamacedo86 camilamacedo86 added kind/feature Categorizes issue or PR as related to a new feature. kind/bug Categorizes issue or PR as related to a bug. labels Feb 27, 2020
@camilamacedo86 camilamacedo86 added kind/bug Categorizes issue or PR as related to a bug. and removed kind/bug Categorizes issue or PR as related to a bug. labels Feb 28, 2020
@openshift-ci-robot openshift-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 6, 2020
@camilamacedo86 camilamacedo86 removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 10, 2020
@camilamacedo86 (Contributor, Author) commented Mar 10, 2020

Hi @lilic,

Thanks a lot for your review. All your suggestions have been addressed; feel free to check it again.

@camilamacedo86 camilamacedo86 changed the title Fix Helm metrics implementation and add support for Metrics handle the MultiNamespace scenario Add helm prometheus metrics support and handle MultiNamespace scenario for metrics Mar 10, 2020
@camilamacedo86 camilamacedo86 removed kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. labels Mar 10, 2020
@jmrodri (Member) left a comment:

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 10, 2020
// Generate and serve custom resource specific metrics.
err = kubemetrics.GenerateAndServeCRMetrics(cfg, ns, filteredGVK, metricsHost, operatorMetricsPort)
if err != nil {
return err
}
return nil
}

// getNamespacesForMetrics will return all namespaces which will be used to export the metrics
Member:
I may be missing something, why is this function duplicated here?

camilamacedo86 (Contributor, Author):
The cmd_test.go needs to match cmd.go, since it verifies the scaffold implementation.

@fabianvf (Member) left a comment:
Seems like a lot of the metrics code is duplicated. Is there a way we can centralize the shared pieces of logic to make this easier to maintain in the future? If that sort of refactor is out of scope, I'd still like to track it somewhere.

@@ -285,3 +283,20 @@ func getAnsibleDebugLog() bool {
}
return val
}

// getNamespacesForMetrics will return all namespaces which will be used to export the metrics
Member:
Is there a way we can centralize this logic? It seems like it's duplicated several times

@camilamacedo86 (Contributor, Author) commented Mar 10, 2020:
Moved to the lib.

pkg/helm/run.go Outdated
}
return nil
}

// getNamespacesForMetrics will return all namespaces which will be used to export the metrics
func getNamespacesForMetrics(operatorNs string) ([]string, error) {
Member:
fourth time this is implemented

camilamacedo86 (Contributor, Author):
The same implementation to generate the metrics is used for Ansible, Helm and Go.
The other occurrences are in the tests, which were updated to match these changes.

// Generate and serve custom resource specific metrics.
err = kubemetrics.GenerateAndServeCRMetrics(cfg, ns, filteredGVK, metricsHost, operatorMetricsPort)
if err != nil {
return err
}
return nil
}

// getNamespacesForMetrics will return all namespaces which will be used to export the metrics
func getNamespacesForMetrics(operatorNs string) ([]string, error) {
Member:
can we move this to a library function and just call it repeatedly?

camilamacedo86 (Contributor, Author):
Done.


// serveCRMetrics gets the Operator/CustomResource GVKs and generates metrics based on those types.
// It serves those metrics on "http://metricsHost:operatorMetricsPort".
func serveCRMetrics(cfg *rest.Config, operatorNs string, gvks []schema.GroupVersionKind) error {
Member:
Does the implementation of this function vary by operator type or can we use a single implementation?

camilamacedo86 (Contributor, Author):
The implementation is quite similar for all types.
However, Ansible and Helm pass the GVKs directly, while in Go we filter them.
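To illustrate the difference being discussed: Ansible and Helm already know their watched GVKs, while the Go scaffold has to filter the types registered in the scheme down to the operator's own CRs before serving CR metrics. A dependency-free sketch of that filtering step, using a local stand-in type and hypothetical helper names:

```go
package main

import "fmt"

// GVK is a local stand-in for k8s.io/apimachinery's schema.GroupVersionKind,
// used here to keep the sketch dependency-free.
type GVK struct {
	Group, Version, Kind string
}

// filterOwnCRs is a hypothetical helper: it keeps only the GVKs whose API
// group belongs to the operator, dropping core/third-party types that are
// also registered in the scheme.
func filterOwnCRs(all []GVK, ownGroups map[string]bool) []GVK {
	var out []GVK
	for _, gvk := range all {
		if ownGroups[gvk.Group] {
			out = append(out, gvk)
		}
	}
	return out
}

func main() {
	all := []GVK{
		{Group: "", Version: "v1", Kind: "Pod"},                      // core type, filtered out
		{Group: "app.example.com", Version: "v1alpha1", Kind: "App"}, // the operator's own CR
	}
	own := map[string]bool{"app.example.com": true}
	for _, gvk := range filterOwnCRs(all, own) {
		fmt.Printf("%s/%s, Kind=%s\n", gvk.Group, gvk.Version, gvk.Kind)
	}
}
```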

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Mar 10, 2020

// Generate metrics from the WATCH_NAMESPACES value if it contains multiple namespaces
if strings.Contains(watchNamespace, ",") {
ns = strings.Split(watchNamespace, ",")
Member:
Do we want to always export metrics in the operatorNs or is it intentional that we leave it out if specific watchNamespaces are set?

@camilamacedo86 (Contributor, Author) commented Mar 10, 2020:
WATCH_NAMESPACE can be the operator namespace (namespaced scope), empty in the cluster-scoped case, or a list of namespaces in the multi-namespace scenario.

So, the idea here is: by default, export the metrics from the operator namespace, unless it is the multi-namespace case, where we know the list of namespaces that the operator will be dealing with.
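The three WATCH_NAMESPACE cases described above can be classified with a small switch. This is an illustrative sketch, not the PR's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// watchScope classifies a WATCH_NAMESPACE value into the cases described
// above (hypothetical names, for illustration only).
func watchScope(watchNamespace, operatorNs string) string {
	switch {
	case watchNamespace == "":
		// Empty value: the operator watches the whole cluster.
		return "cluster-scoped"
	case strings.Contains(watchNamespace, ","):
		// Comma-separated list: the multi-namespace scenario.
		return "multi-namespace"
	case watchNamespace == operatorNs:
		// Single value equal to the operator's own namespace.
		return "namespaced (operator namespace)"
	default:
		return "namespaced"
	}
}

func main() {
	fmt.Println(watchScope("", "op-ns"))        // cluster-scoped
	fmt.Println(watchScope("ns1,ns2", "op-ns")) // multi-namespace
	fmt.Println(watchScope("op-ns", "op-ns"))   // namespaced (operator namespace)
}
```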

@fabianvf (Member) left a comment:
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 10, 2020
CHANGELOG.md (Outdated; resolved)
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Mar 10, 2020
@openshift-ci-robot commented:
New changes are detected. LGTM label has been removed.

@camilamacedo86 camilamacedo86 merged commit 829bfa2 into operator-framework:master Mar 10, 2020
@camilamacedo86 camilamacedo86 deleted the improve-metrics-namespaces branch March 10, 2020 16:38
camilamacedo86 added a commit that referenced this pull request Mar 10, 2020
…lded files. #2625

- Add info about the changes required to the default scaffold's `main.go` file to address bug fixes and improvements made in metrics. Related to PRs #2606, #2603 and #2601