[Docs] Collecting metrics from kapp-controller to observe cluster's app resources #789

Open
kartiklunkad26 opened this issue Jul 14, 2022 · 6 comments
Labels
carvel-accepted This issue should be considered for future work and that the triage process has been completed documentation This issue indicates a change to the docs should be considered good first issue An issue that will be a good candidate for a new contributor priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done.

Comments

@kartiklunkad26

Context
Users want to monitor the deployed app resources in their cluster and consume the metrics in their dashboard tool of choice (e.g., Prometheus).

We have the capability for it, but haven't exposed a meaningful way for users to consume these metrics.

Acceptance Criteria
Given I want to see the metrics about app resources in my dashboard
Then I can follow a guide on carvel.dev to collect those metrics

@kartiklunkad26 kartiklunkad26 added the carvel-triage This issue has not yet been reviewed for validity label Jul 14, 2022
@joe-kimmel-vmw
Contributor

Thanks for filing this issue @kartiklunkad26! Can you give us some specific examples of how you're using kapp-controller metrics? We would welcome a contribution on this topic.

@joe-kimmel-vmw joe-kimmel-vmw added good first issue An issue that will be a good candidate for a new contributor carvel-accepted This issue should be considered for future work and that the triage process has been completed priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed carvel-triage This issue has not yet been reviewed for validity labels Jul 18, 2022
@ProNibs

ProNibs commented Aug 14, 2022

I was thinking about a similar thing:

I would love to be able to make a Grafana dashboard that displays every App I have in my cluster and its status. However, the only port(s) currently used in the kapp-controller deployment are strictly for the Kubernetes API server.

It looks like App metrics are already configured to be collected in the Prometheus format in the app_metrics.go file. However, I cannot get Prometheus to scrape them, as they are not externally exposed in any way that I know of.

@joe-kimmel-vmw
Contributor

Thanks @ProNibs -

a Grafana Dashboard that displays every App I have in my cluster and its status

Can you help us understand your goals with this dashboard? Who is going to refer to it, when, and why? What kinds of questions will it help them answer?

@ProNibs

ProNibs commented Aug 18, 2022

The dashboard would mainly be for Kubernetes operators, so they can see overall health and trends without having to go through multiple clusters and run kubectl get apps -A each time.

Most App CRDs are based on either OCI pulling or Git pulling, so a wrong push could break one or many App CRDs. It could also be every App CRD breaking at once, which would indicate that the OCI registry is down.

ArgoCD provides a nice dashboard that gives insight into how often items are in flux over time, and we would like something similar: https://grafana.com/grafana/dashboards/14584-argocd/

We've had cases where App CRDs sat broken for a long time without anyone knowing, because we didn't check that cluster often.

@neil-hickey neil-hickey added the documentation This issue indicates a change to the docs should be considered label Feb 22, 2023
@sjentzsch

As mentioned in the Carvel Slack (https://kubernetes.slack.com/archives/CH8KCCKA5/p1688053039988119), we worked around this by using kube-state-metrics to build metrics from the kapp-controller custom resources (https://github.com/kubernetes/kube-state-metrics/blob/main/docs/customresourcestate-metrics.md) and came up with the following dashboard.

The custom-resource-state-config used for kube-state-metrics looks as follows:

            - --custom-resource-state-config
            - |
              spec:
                resources:
                  - groupVersionKind:
                      group: packaging.carvel.dev
                      version: "*"
                      kind: PackageRepository
                    labelsFromPath:
                      name: [metadata, name]
                      namespace: [metadata, namespace]
                    metricNamePrefix: kapp
                    metrics:
                      - name: packagerepository_info
                        help: PackageRepository info on status and fetch target
                        each:
                          type: Info
                          info:
                            labelsFromPath:
                              type: [status, conditions, "0", type]
                              status: [status, conditions, "0", status]
                              oci_image: [spec, fetch, imgpkgBundle, image]
                              git_url: [spec, fetch, git, url]
                              git_ref: [spec, fetch, git, ref]
                  - groupVersionKind:
                      group: packaging.carvel.dev
                      version: "*"
                      kind: PackageInstall
                    labelsFromPath:
                      name: [metadata, name]
                      namespace: [metadata, namespace]
                    metricNamePrefix: kapp
                    metrics:
                      - name: packageinstall_info
                        help: PackageInstall info on status and fetch target
                        each:
                          type: Info
                          info:
                            labelsFromPath:
                              type: [status, conditions, "0", type]
                              status: [status, conditions, "0", status]
                              package_name: [spec, packageRef, refName]
                              package_version: [status, version]
                  - groupVersionKind:
                      group: kappctrl.k14s.io
                      version: "*"
                      kind: App
                    labelsFromPath:
                      name: [metadata, name]
                      namespace: [metadata, namespace]
                    metricNamePrefix: kapp
                    metrics:
                      - name: app_info
                        help: App info on status and fetch target
                        each:
                          type: Info
                          info:
                            labelsFromPath:
                              type: [status, conditions, "0", type]
                              status: [status, conditions, "0", status]
                              oci_image: [spec, fetch, "0", imgpkgBundle, image]
                              git_url: [spec, fetch, "0", git, url]
                              git_ref: [spec, fetch, "0", git, ref]

Note that this config is somewhat opinionated: for the App, for example, we read data only from the first item in the fetch list (in our case there is only one). For this to work, you also have to extend the RBAC of the kube-state-metrics ServiceAccount with the following permissions so that it can fetch the data in the first place:

  - apiGroups: ["packaging.carvel.dev"]
    resources:
    - packagerepositories
    - packageinstalls
    verbs: ["list", "watch"]
  - apiGroups: ["kappctrl.k14s.io"]
    resources:
    - apps
    verbs: ["list", "watch"]
  - apiGroups: ["apiextensions.k8s.io"]
    resources:
    - customresourcedefinitions
    verbs: ["list", "watch"]
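
The rules above are only a fragment of a role. As a rough sketch, they could be wrapped in a dedicated ClusterRole and bound to the kube-state-metrics ServiceAccount as shown here. The object name kube-state-metrics-carvel, the ServiceAccount name, and the kube-system namespace are assumptions for illustration and are not taken from the original setup; adjust them to match your installation.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics-carvel   # hypothetical name
rules:
  # Read access to the Carvel packaging resources
  - apiGroups: ["packaging.carvel.dev"]
    resources: ["packagerepositories", "packageinstalls"]
    verbs: ["list", "watch"]
  # Read access to kapp-controller Apps
  - apiGroups: ["kappctrl.k14s.io"]
    resources: ["apps"]
    verbs: ["list", "watch"]
  # Read access to CRDs themselves
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics-carvel   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics-carvel
subjects:
  - kind: ServiceAccount
    name: kube-state-metrics        # assumption: your ServiceAccount name may differ
    namespace: kube-system          # assumption: your namespace may differ
```

Using a separate ClusterRole rather than editing the one shipped with kube-state-metrics is one possible design choice; it should keep the extra permissions from being overwritten on upgrade.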

[Image: screenshot of the resulting dashboard]
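
Beyond a dashboard, the same metrics could drive an alert for Apps that stay broken unnoticed. The following is a minimal sketch of a Prometheus alerting rule, assuming the custom-resource-state-config above is in place, so that Apps are exposed as kapp_app_info with the type and status labels taken from the first status condition. It also assumes that a failing App reports a condition of type ReconcileFailed with status True. The group name, alert name, 30m duration, and severity label are hypothetical.

```yaml
groups:
  - name: kapp-controller-apps            # hypothetical group name
    rules:
      - alert: KappAppReconcileFailed     # hypothetical alert name
        # Info-type metrics have the value 1; filter on the condition labels.
        expr: kapp_app_info{type="ReconcileFailed", status="True"} == 1
        # Assumption: only alert once the App has been failing for a while.
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "App {{ $labels.namespace }}/{{ $labels.name }} has been failing to reconcile for 30 minutes"
```

Depending on the scrape configuration (for example, when honor_labels is not enabled), Prometheus may rename the namespace label from the metric to exported_namespace, in which case the annotation template would need adjusting.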

@cppforlife
Contributor

Cross-linking this repo from @vrabbi: https://github.com/vrabbi-tap/tap-kube-state-metrics (the values YAML contains a similar pattern for extracting CR status).

Projects
Status: Unprioritized
Development

No branches or pull requests

6 participants