monitoring: metrics enhancements and proposal for dropping expvar #351

Merged
merged 8 commits into from Oct 21, 2023

Conversation

shipperizer
Contributor

@shipperizer shipperizer commented Oct 12, 2023

The aim of this PR is to introduce new metrics for the TCP response time and to propose a new way of pulling metrics from the LDAP server struct, without using expvar.

ldap_metric is fetched on a schedule from the ldap.Server.Stats struct, which is how the underlying library instruments its calls; the polling interval is 15s.

tcp_response_time_seconds, instead, is recorded by using named return parameters and running a deferred function in each LDAP operation handler.
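
A minimal sketch of the named-return-plus-defer technique, assuming a simplified handler. `bind`, `Recorder` and `Observation` are hypothetical names, not the PR's API; the recorder stands in for a histogram's observe call with `operation` and `status` labels.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Observation is one recorded sample; operation and status would become
// histogram labels in a real setup.
type Observation struct {
	Operation string
	Status    int
	Seconds   float64
}

// Recorder is a hypothetical stand-in for a histogram's observe call.
type Recorder func(Observation)

// LDAP result codes that also appear in the example metrics of this PR.
const (
	resultSuccess            = 0
	resultInvalidCredentials = 49
)

// bind uses a named return value, code, so the deferred closure can read
// the final result code after the function body has set it.
func bind(user, password string, record Recorder) (code int, err error) {
	start := time.Now()
	defer func() {
		record(Observation{
			Operation: "bind",
			Status:    code,
			Seconds:   time.Since(start).Seconds(),
		})
	}()

	// Placeholder credential check, for illustration only.
	if password != "secret" {
		return resultInvalidCredentials, errors.New("invalid credentials")
	}
	return resultSuccess, nil
}

func main() {
	record := func(o Observation) {
		fmt.Printf("operation=%s status=%d seconds=%g\n", o.Operation, o.Status, o.Seconds)
	}
	bind("alice", "secret", record)
	bind("alice", "wrong", record)
}
```

Because the deferred function runs after the return values are assigned, every exit path is timed and labelled without repeating the recording call.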

Example of new metrics added below

# HELP ldap_metric ldap_metric
# TYPE ldap_metric gauge
ldap_metric{library="github.com/glauth/glauth",type="binds"} 21
ldap_metric{library="github.com/glauth/glauth",type="conns"} 21
ldap_metric{library="github.com/glauth/glauth",type="searches"} 20
ldap_metric{library="github.com/glauth/glauth",type="unbinds"} 21

# HELP tcp_response_time_seconds tcp_response_time_seconds
# TYPE tcp_response_time_seconds histogram
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="0.005"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="0.01"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="0.025"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="0.05"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="0.1"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="0.25"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="0.5"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="1"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="2.5"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="5"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="10"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="0",le="+Inf"} 20
tcp_response_time_seconds_sum{library="github.com/glauth/glauth",operation="bind",status="0"} 0.001742719
tcp_response_time_seconds_count{library="github.com/glauth/glauth",operation="bind",status="0"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="0.005"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="0.01"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="0.025"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="0.05"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="0.1"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="0.25"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="0.5"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="1"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="2.5"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="5"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="10"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="bind",status="49",le="+Inf"} 1
tcp_response_time_seconds_sum{library="github.com/glauth/glauth",operation="bind",status="49"} 6.9e-05
tcp_response_time_seconds_count{library="github.com/glauth/glauth",operation="bind",status="49"} 1
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="0.005"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="0.01"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="0.025"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="0.05"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="0.1"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="0.25"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="0.5"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="1"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="2.5"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="5"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="10"} 20
tcp_response_time_seconds_bucket{library="github.com/glauth/glauth",operation="search",status="0",le="+Inf"} 20
tcp_response_time_seconds_sum{library="github.com/glauth/glauth",operation="search",status="0"} 0.0027715150000000004
tcp_response_time_seconds_count{library="github.com/glauth/glauth",operation="search",status="0"} 20

  • feat: introduce new prometheus monitor object
  • feat: add LDAPMonitorWatcher as a potential replacement for v0 Collector
  • feat: pass monitor object as dependency and instrument core operations
  • feat: instantiate monitor in main


@shipperizer shipperizer changed the title monitoring enhancements Monitoring: metrics enhancements and proposal for dropping expvar Oct 12, 2023
@shipperizer shipperizer changed the title Monitoring: metrics enhancements and proposal for dropping expvar monitoring: metrics enhancements and proposal for dropping expvar Oct 12, 2023
@shipperizer shipperizer force-pushed the prometheus branch 4 times, most recently from 78e4d4e to 64dd452 Compare October 13, 2023 14:07
@Fusion
Collaborator

Fusion commented Oct 14, 2023

Some more healthy separation of concerns.
Glad I'm using Codesee to track everything!

@Fusion
Collaborator

Fusion commented Oct 14, 2023

@shipperizer It looks like, at CI time, we have a failure because mockgen was not run.

@Fusion
Collaborator

Fusion commented Oct 14, 2023

Also, it looks like we should be moving to Uber's fork of the Mock framework.

The reason is the following errors, raised for the GetStats method, which returns the ldap.Stats struct:

```
internal/monitoring/mock_interfaces.go:92:13: assignment copies lock value to ret0: (github.com/nmcclain/ldap.Stats, bool) contains github.com/nmcclain/ldap.Stats contains sync.Mutex
internal/monitoring/mock_interfaces.go:93:9: return copies lock value: github.com/nmcclain/ldap.Stats contains sync.Mutex
internal/monitoring/ldap_test.go:23:56: call of mockLDAPServer.EXPECT().GetStats().MinTimes(1).Return copies lock value: github.com/nmcclain/ldap.Stats contains sync.Mutex
```
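
The errors above come from copying a struct that contains a sync.Mutex. One common way to avoid them, sketched here under assumptions, is to expose a plain snapshot type instead of the mutex-bearing struct. `lockedStats` only mimics the assumed layout of ldap.Stats, and `StatsSnapshot` and `Snapshot` are hypothetical names; this is not what the PR or the library does.

```go
package main

import (
	"fmt"
	"sync"
)

// lockedStats mimics a counters struct guarded by its own mutex
// (assumed layout, for illustration only).
type lockedStats struct {
	mu       sync.Mutex
	binds    int
	searches int
}

// StatsSnapshot holds only plain values, so it is safe to copy, to return
// by value, and to pass to a generated mock's Return call.
type StatsSnapshot struct {
	Binds    int
	Searches int
}

// countBind increments the bind counter under the lock.
func (s *lockedStats) countBind() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.binds++
}

// Snapshot copies the counters while holding the lock; the mutex itself
// is never copied.
func (s *lockedStats) Snapshot() StatsSnapshot {
	s.mu.Lock()
	defer s.mu.Unlock()
	return StatsSnapshot{Binds: s.binds, Searches: s.searches}
}

func main() {
	s := &lockedStats{}
	s.countBind()
	s.countBind()
	fmt.Printf("%+v\n", s.Snapshot())
}
```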
@shipperizer
Contributor Author

shipperizer commented Oct 14, 2023

> @shipperizer It looks like, at CI time, we have a failure because mockgen was not run.

Managed to fix that; I forgot to hook in the go generate ./... step.

> Also, it looks like we should be moving to Uber's fork of the Mock framework.

Happy to move to any mock framework. The main advantage of the built-in one is the mockgen command; if Uber's fork has the same, I'm happy to swap.

I noticed that there is a vendored folder that clashes with the proper functioning of the go mod vendor command. I will look into whether that can be fixed: see #355.

@Fusion
Collaborator

Fusion commented Oct 16, 2023

Actually, I was referring to Uber's fork of mockgen as it appears that the original, which you have been using, was EOL'd in July of this year and they recommend switching to that maintained fork.

@sonarcloud

sonarcloud bot commented Oct 16, 2023

SonarCloud Quality Gate failed.

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 3 (rating A)
  • Coverage: no coverage information
  • Duplication: 5.5%

@shipperizer
Contributor Author

shipperizer commented Oct 16, 2023

> Actually, I was referring to Uber's fork of mockgen as it appears that the original, which you have been using, was EOL'd in July of this year and they recommend switching to that maintained fork.

@Fusion my bad, I didn't really follow that update.

should be fully addressed in here

@Fusion Fusion merged commit 94fe62c into glauth:master Oct 21, 2023
7 of 9 checks passed
@SuperQ

SuperQ commented Oct 23, 2023

Hmm, this metric setup is a bit odd and doesn't really follow best practices.

For Go applications, it's discouraged to create metrics in a separate package. Rather you define metrics to be package local. This makes it easier to understand the use and avoids leaking data between packages.

@Fusion
Collaborator

Fusion commented Oct 24, 2023

Hi @SuperQ thanks for sharing your knowledge. Could you point us to a bit of literature on this topic?
In the document you linked, I can see "Instantiate the metric classes in the same file you use them" for Prometheus instrumentation but this does not address Go recommendations.

@SuperQ

SuperQ commented Oct 24, 2023

"Instantiate the metric classes in the same file you use them"

This applies exactly to Go as well as to other languages. You can see a good example in Prometheus itself. Note how the package vars are not exported. This avoids package-level metric leaking. It's much easier for both users and developers to know that, by default, the scope of a metric is within a specific section of the code.
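
A minimal sketch of the package-local pattern described above, using only the standard library so it runs without dependencies. `bindsTotal`, `handleBind` and `bindsServed` are hypothetical names; with the Prometheus Go client, the unexported variable would be a metric registered by the same package instead of a bare integer.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// bindsTotal is unexported: only code in this package can touch it, so
// the scope of the metric is obvious from its declaration.
var bindsTotal int64

// handleBind is a placeholder handler that instruments itself next to
// the code it measures.
func handleBind() {
	atomic.AddInt64(&bindsTotal, 1)
	// The actual bind logic would go here.
}

// bindsServed exposes a read-only view of the counter.
func bindsServed() int64 {
	return atomic.LoadInt64(&bindsTotal)
}

func main() {
	handleBind()
	handleBind()
	fmt.Println("binds served:", bindsServed())
}
```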

I also noticed that there's a 15 second loop, this is also against the implementation practices. If you use a gauge metric where it's used you can avoid having the goroutine updating the data internally.

@shipperizer shipperizer deleted the prometheus branch October 30, 2023 13:59
@shipperizer
Contributor Author

> I also noticed that there's a 15 second loop, this is also against the implementation practices. If you use a gauge metric where it's used you can avoid having the goroutine updating the data internally.

@SuperQ agreed, unfortunately i tried to reuse the original struct where those "metrics" are created https://github.com/nmcclain/ldap/blob/master/server.go#L66

Since that struct lives in another package, the only way I can expose those values is this one; otherwise I would have to stick with the previous approach of owning internal counters.

Anyway, point taken, I will try to improve on that.
