Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metrics: Add go_* metrics #19153

Merged
merged 1 commit into from
Mar 17, 2022
Merged

Conversation

chancez
Copy link
Contributor

@chancez chancez commented Mar 15, 2022

Prometheus provides metrics collectors that expose go runtime and go build
information, which can be useful to server administrators, lets expose
them.

metrics: Add go_* metrics

@chancez chancez requested a review from a team as a code owner March 15, 2022 18:45
@chancez chancez requested a review from sayboras March 15, 2022 18:45
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 15, 2022
@chancez chancez added release-note/minor This PR changes functionality that users may find relevant to operating Cilium. and removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Mar 15, 2022
@chancez chancez requested a review from a team as a code owner March 15, 2022 18:55
@chancez chancez requested a review from rolinh March 15, 2022 18:55
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. In the back of my mind, I'm wondering if there's a performance cost (increased mem; of course CPU will increase slightly as there's more to "process") to enabling these in general or is it negligible?

@chancez
Copy link
Contributor Author

chancez commented Mar 15, 2022

LGTM. In the back of my mind, I'm wondering if there's a performance cost (increased mem; of course CPU will increase slightly as there's more to "process") to enabling these in general or is it negligible?

They should be negligible overall. These are enabled by default normally, and are used in most Go apps, including etcd, prometheus, etc which as DBs likely care quite a bit if there was a performance cost.

Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💯

For other reviewers, please find below sample output.

Also, there is a small discussion about capability to disable any metrics in general #18692, in my opinion, this will require some refactor, which can be tackled seperately.

$ ksysex ds/cilium -- cilium metrics list -p "go_*"
Defaulted container "cilium-agent" out of: cilium-agent, ebpf-mount (init), clean-cilium-state (init)
Metric                             Labels                                                          Value
go_build_info                      checksum="" path="github.com/cilium/cilium" version="(devel)"   1.000000
go_gc_duration_seconds                                                                             0.002105
go_goroutines                                                                                      203.000000
go_info                            version="go1.17.8"                                              1.000000
go_memstats_alloc_bytes                                                                            16713944.000000
go_memstats_alloc_bytes_total                                                                      237617752.000000
go_memstats_buck_hash_sys_bytes                                                                    1629976.000000
go_memstats_frees_total                                                                            2678372.000000
go_memstats_gc_cpu_fraction                                                                        0.000007
go_memstats_gc_sys_bytes                                                                           6083000.000000
go_memstats_heap_alloc_bytes                                                                       16713944.000000
go_memstats_heap_idle_bytes                                                                        17219584.000000
go_memstats_heap_inuse_bytes                                                                       25313280.000000
go_memstats_heap_objects                                                                           90959.000000
go_memstats_heap_released_bytes                                                                    11173888.000000
go_memstats_heap_sys_bytes                                                                         42532864.000000
go_memstats_last_gc_time_seconds                                                                   1647398585.866132
go_memstats_lookups_total                                                                          0.000000
go_memstats_mallocs_total                                                                          2769331.000000
go_memstats_mcache_inuse_bytes                                                                     19200.000000
go_memstats_mcache_sys_bytes                                                                       32768.000000
go_memstats_mspan_inuse_bytes                                                                      459272.000000
go_memstats_mspan_sys_bytes                                                                        507904.000000
go_memstats_next_gc_bytes                                                                          29935344.000000
go_memstats_other_sys_bytes                                                                        3715023.000000
go_memstats_stack_inuse_bytes                                                                      3604480.000000
go_memstats_stack_sys_bytes                                                                        3604480.000000
go_memstats_sys_bytes                                                                              58106015.000000
go_threads                                                                                         25.000000

pkg/metrics/metrics.go Show resolved Hide resolved
pkg/metrics/metrics.go Outdated Show resolved Hide resolved
Prometheus provides metrics collectors that expose go runtime and go build
information, which can be useful to server administrators, lets expose
them.

Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
@sayboras sayboras added the area/metrics Impacts statistics / metrics gathering, eg via Prometheus. label Mar 16, 2022
@sayboras sayboras changed the title metrics: Add go_* metrics and go_build_info metrics metrics: Add go_* metrics Mar 16, 2022
@sayboras
Copy link
Member

The changes in this PR was testing by github action (i.e. checking prom metric), I don't think full test is required here :shipit:

@sayboras sayboras added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 16, 2022
@ldelossa ldelossa merged commit 6c5e2d6 into cilium:master Mar 17, 2022
@chancez chancez deleted the register_go_collectors branch March 17, 2022 17:47
@christarazi
Copy link
Member

Marking for backport to all stable branches as it's useful for debugging info and is low-risk.

@christarazi
Copy link
Member

Backporting to v1.9 actually causes other dependency bumps that are too risky. Here's the error msg:

❯ go mod tidy && go mod vendor
go: downloading go.etcd.io/etcd v0.5.0-alpha.5.0.20211015134708-72d3e382e73c
go: downloading github.com/tmc/grpc-websocket-proxy v0.0.0-20200427203606-3cfed13b9966
go: downloading github.com/golang-jwt/jwt v3.2.1+incompatible
go: finding module for package github.com/prometheus/client_golang/prometheus/collectors
go: found github.com/prometheus/client_golang/prometheus/collectors in github.com/prometheus/client_golang v1.12.1
go: finding module for package google.golang.org/grpc/examples/helloworld/helloworld
go: finding module for package google.golang.org/grpc/naming
go: found google.golang.org/grpc/examples/helloworld/helloworld in google.golang.org/grpc/examples v0.0.0-20220413171549-7567a5d96538
go: finding module for package google.golang.org/grpc/naming
github.com/cilium/cilium/pkg/kvstore imports
        go.etcd.io/etcd/clientv3 tested by
        go.etcd.io/etcd/clientv3.test imports
        go.etcd.io/etcd/integration imports
        go.etcd.io/etcd/proxy/grpcproxy imports
        google.golang.org/grpc/naming: module google.golang.org/grpc@latest found (v1.46.0), but does not contain package google.golang.org/grpc/naming

Removing v1.9 from backport.

@joestringer
Copy link
Member

I've dropped the backport to v1.10 for this due to breakage during backport. If you'd like it to be on v1.10, please prepare the backport PR and address the breakages.

@joestringer
Copy link
Member

@chancez for reference I see that this was marked for v1.9 and v1.10 backport, but the backport criteria typically does not cover changes like this on the older branches. I recognize that it can be useful to backport observability changes, so it's always a trade-off of whether it's worthwhile or not. In this case it appears that there's some other unsatisfied dependency & potential risk for introducing the backport to older branches.

@christarazi
Copy link
Member

christarazi commented Apr 27, 2022

@joestringer I was the one who added the labels for backport given its potential usefulness for observability. I didn't anticipate a simple change to cause problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/metrics Impacts statistics / metrics gathering, eg via Prometheus. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants