metrics to show the current GitRepository #3106

Closed
1 task done
qudongfang opened this issue Sep 13, 2022 · 8 comments

Comments

@qudongfang

qudongfang commented Sep 13, 2022

Describe the bug

People may flip GitRepository::spec.ref.branch to a dev branch to test or verify something and then forget to revert it afterwards.

It would be great if flux2 could export a metric showing the current GitRepository::spec.ref.branch,
so that we can alert (warn) when GitRepository::spec.ref.branch is not set to the main/master/prod branch.

Steps to reproduce

N/A

Expected behavior

N/A

Screenshots and recordings

No response

OS / Distro

N/A

Flux version

N/A

Flux check

N/A

Git provider

No response

Container Registry provider

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@makkes
Member

makkes commented Sep 13, 2022

The GitRepository manifest should rather itself live in Git and be protected by e.g. PRs or MRs.

@kingdonb
Member

kingdonb commented Sep 15, 2022

I think we're suggesting that you would use a static analysis to prevent this configuration before it reaches the cluster. I don't know if that means we shouldn't expose this as a metric, there are a lot of different things we could expose as a metric.

This one is very specific to the GitRepository kind. The branch name is not something I think we would usually consider as a metric to export.

More examples of metrics we might collect:

  • How many cross-namespace access references are in use
  • How many gitrepos are using submodules
  • How many sources are verified by cryptography
  • How many are suspended
  • How many use the go-git backend / how many are using libgit2

Maybe we can provide a more generic way to determine which metrics are exported, or maybe we should just spend some focus time on this and build the right metrics in so all use cases are happy 👍

Or, we could build some kind of external reporter that monitors the Flux objects through GitOps Toolkit API and provides those metrics on-demand. That would be a really good use of the Flux APIs!
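As a rough illustration of the external reporter idea above, here is a minimal sketch of the check itself. It assumes the GitRepository objects have already been fetched as a parsed JSON list (for example from `kubectl get gitrepositories -A -o json`); the function name, the allowed-branch set, and the sample data are hypothetical and not part of Flux.

```python
# Assumption: the "good" branches are the ones named in the issue description.
ALLOWED_BRANCHES = {"main", "master", "prod"}


def off_branch_repositories(gitrepo_list, allowed=ALLOWED_BRANCHES):
    """Return (namespace, name, branch) for GitRepositories tracking a non-allowed branch."""
    findings = []
    for item in gitrepo_list.get("items", []):
        meta = item.get("metadata", {})
        ref = item.get("spec", {}).get("ref") or {}
        branch = ref.get("branch")
        # Objects pinned by tag, semver or commit carry no branch; they are skipped here.
        if branch is not None and branch not in allowed:
            findings.append((meta.get("namespace"), meta.get("name"), branch))
    return findings


# Hypothetical sample shaped like a Kubernetes list response.
sample = {
    "items": [
        {"metadata": {"namespace": "flux-system", "name": "platform"},
         "spec": {"ref": {"branch": "main"}}},
        {"metadata": {"namespace": "flux-system", "name": "apps"},
         "spec": {"ref": {"branch": "dev-test"}}},
        {"metadata": {"namespace": "flux-system", "name": "pinned"},
         "spec": {"ref": {"semver": ">=1.0.0"}}},
    ]
}

findings = off_branch_repositories(sample)
print(findings)
```

A reporter built this way could run on a schedule and feed its findings into whatever alerting is already in place, without touching the controllers' own metrics.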

@dmichel1

> The GitRepository manifest should rather itself live in Git and be protected by e.g. PRs or MRs.

In our current setup, terraform is managing the GitRepository manifest on the cluster with DataFluxSync during bootstrap. I don't see any immediate issues with moving away from doing that and having terraform push the GitRepository manifest into git instead.

My only concern is losing the ability to do pre-merge testing. As part of our workflow we often flip a cluster to a custom branch to test out changes. That workflow can be changed to opening a PR and merging it to update the ref, but then we still have the issue of folks forgetting to flip that ref back to the original one.

> Or, we could build some kind of external reporter that monitors the Flux objects through GitOps Toolkit API and provides those metrics on-demand. That would be a really good use of the Flux APIs!

This is what I was thinking about doing too but wanted to see what the others thought about adding the ref as a label or its own metric first.

And not to get off-topic, but since our GitRepository is managed outside of git right now, I was actually thinking about building a deployment pipeline to roll out changes in a staggered fashion using Spinnaker (which we use for app deploys). That pipeline would take the new git ref, slowly roll it out, and check metrics along the way. I could build a service to do automatic commits to git too, but that would be a bit more work than using the tooling I have access to today.

@stefanprodan
Member

stefanprodan commented Sep 19, 2022

I'm not in favor of adding the ref to metrics. A ref in spec can be a semver range or a commit SHA; if we report the resolved ref, our metrics will suffer from high cardinality: each commit will result in a unique metric, breaking our dashboards and all the current alerts people are using. Also, having a different set of metrics for Git means we would need to drop our generic metric and come up with dedicated ones for each custom resource kind and all their combined fields.
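The cardinality concern above can be made concrete with a small counting sketch. The data is hypothetical (one repository receiving 500 commits on a single branch), and the label names and revision format are assumptions chosen for illustration only.

```python
def series_count(samples, label_keys):
    """Count the distinct time series produced when labelling by label_keys."""
    return len({tuple(sample[key] for key in label_keys) for sample in samples})


# Hypothetical history: one repository, one branch, a new commit on every reconcile.
history = [
    {"name": "apps", "branch": "main", "revision": f"main@sha1:{i:040x}"}
    for i in range(500)
]

by_branch = series_count(history, ("name", "branch"))
by_revision = series_count(history, ("name", "revision"))
print(by_branch, by_revision)
```

Labelling by the branch in spec keeps a single series for this repository, while labelling by the resolved revision creates a new series for every commit.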

@dmichel1

I was thinking this could be implemented as an additional metric and not added as a label to existing metrics because of all the reasons you listed (and I 100% agree).

kube and istio do this with the kube_node_info and istio_build metrics which have all the version information.

kube_node_info{container_runtime_version="containerd://1.4.13", internal_ip="10.3.67.211", job="kube-state-metrics", kernel_version="5.4.202+", kubelet_version="v1.21.14-gke.2100", kubeproxy_version="v1.21.14-gke.2100", node="gke-kube-1", os_image="Container-Optimized OS from Google", pod_cidr="10.3.73.0/26", provider_id="gce://kube-1", system_uuid="1faaf414-4b03-bc45"}

e.g. we could have something like gotk_gitrepository_info
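To show what such an info-style metric could look like, here is a minimal sketch that renders one sample line in the Prometheus text exposition format, with a constant value of 1 and the details carried in labels, in the same way as kube_node_info. The label names (name, namespace, branch) and the helper functions are assumptions for illustration; this is not an existing Flux metric.

```python
def escape_label_value(value):
    """Escape backslash, newline and double quote, as the text exposition format requires."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def render_info_metric(metric_name, labels):
    """Render a single info-style sample: constant value 1, details in labels."""
    body = ",".join(
        f'{key}="{escape_label_value(str(value))}"'
        for key, value in sorted(labels.items())
    )
    return f"{metric_name}{{{body}}} 1"


# Hypothetical labels taken from the object's metadata and spec, not the resolved commit.
line = render_info_metric(
    "gotk_gitrepository_info",
    {"name": "apps", "namespace": "flux-system", "branch": "dev-test"},
)
print(line)
```

Because the branch label comes from spec rather than from the resolved revision, the series only changes when someone edits the GitRepository, which keeps the cardinality bounded.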

@mtparet

mtparet commented May 5, 2023

Why not add the current SHA-1 (not the ref), so that it is always the same kind of value?

@kingdonb
Member

Please refer to:

This has been addressed: you can now create a metric to report on any information that is in the CRD spec or status.

See also:

https://fluxcd.io/flux/monitoring/custom-metrics/

@mtparet

mtparet commented Aug 31, 2023

Wonderful, will try it soon! Thanks !
