Fix format of Prometheus metrics #24

Closed
jakubgs opened this issue Jun 1, 2020 · 10 comments

@jakubgs
Contributor

jakubgs commented Jun 1, 2020

Currently you have metrics like these:

dropped_benign_duplicate_envelopes_total 7.0
dropped_expired_envelopes_total 0.0
dropped_from_future_envelopes_total 0.0
dropped_full_queue_new_envelopes_total 0.0
dropped_full_queue_old_envelopes_total 0.0
dropped_low_pow_envelopes_total 0.0
...

This is not how it should be done in Prometheus. The correct way is:

dropped_envelopes_total{reason="benign_duplicate"} 7.0
dropped_envelopes_total{reason="expired"} 0.0
dropped_envelopes_total{reason="from_future"} 0.0
dropped_envelopes_total{reason="full_queue_new"} 0.0
dropped_envelopes_total{reason="full_queue_old"} 0.0
dropped_envelopes_total{reason="low_pow"} 0.0
...

Variations of the same metrics should be handled using labels.
https://prometheus.io/docs/practices/naming/#labels
https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
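
To illustrate the mapping between the two forms, here is a minimal Python sketch that rewrites the old flat sample lines into the labeled form. The regex and the `to_labeled` helper are hypothetical and only cover the `dropped_<reason>_envelopes_total` names listed above; the real fix belongs in the code that registers the metrics, not in post-processing.

```python
import re

# Assumed pattern for the flat names shown in this issue:
#   dropped_<reason>_envelopes_total <value>
FLAT = re.compile(r"^dropped_(?P<reason>\w+)_envelopes_total\s+(?P<value>\S+)$")


def to_labeled(line: str) -> str:
    """Rewrite one flat sample line into the labeled form; pass other lines through."""
    m = FLAT.match(line.strip())
    if m is None:
        return line
    return f'dropped_envelopes_total{{reason="{m["reason"]}"}} {m["value"]}'


flat_lines = [
    "dropped_benign_duplicate_envelopes_total 7.0",
    "dropped_expired_envelopes_total 0.0",
    "dropped_low_pow_envelopes_total 0.0",
]

for flat in flat_lines:
    print(to_labeled(flat))
```

With a single metric name and a `reason` label, a dashboard can sum over all reasons or break them down without having to know every metric name in advance.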

@kdeme
Contributor

kdeme commented Jun 1, 2020

Was not aware of the labels, great, I’ll adjust!

@jakubgs
Contributor Author

jakubgs commented Jun 2, 2020

Also, one more note: envelopes_dropped_total is a better name than dropped_envelopes_total, because the subject of the metric comes first. If you later add other envelope-related metrics, like envelopes_processed_total or envelopes_archived_total, you can find all of them with something like:

 > curl -s localhost:9090/metrics | grep -e '^envelopes_'

Here's how it works for status-go and Whisper envelopes:

 > curl -s localhost:9305/metrics | grep -e '^whisper_envelopes'
whisper_envelopes_cache_failures_total{type="expired"} 1
whisper_envelopes_cached_total{cache="clear"} 10449
whisper_envelopes_cached_total{cache="hit"} 36904
whisper_envelopes_cached_total{cache="miss"} 10451
whisper_envelopes_received_total 47356
whisper_envelopes_size_bytes_bucket{le="256"} 0
whisper_envelopes_size_bytes_bucket{le="1024"} 805
...
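
The same prefix lookup can be sketched in Python for use in a script or a test. This is only an illustration of why subject-first names group well; the `metrics_with_prefix` helper and the sample text are made up for this example.

```python
from typing import List


def metrics_with_prefix(exposition: str, prefix: str) -> List[str]:
    """Return the lines that start with `prefix`, like grep '^prefix'."""
    return [line for line in exposition.splitlines() if line.startswith(prefix)]


# Hypothetical exposition text mixing subject-first and subject-last names.
sample = """\
# TYPE envelopes_dropped counter
envelopes_dropped_total{reason="expired"} 1.0
envelopes_valid_total 5.0
connected_peers 3.0
dropped_envelopes_total{reason="expired"} 1.0
"""

for line in metrics_with_prefix(sample, "envelopes_"):
    print(line)
```

Only the subject-first names match the prefix; the `dropped_envelopes_total` line would need a separate pattern.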

@kdeme
Contributor

kdeme commented Jun 2, 2020

Yup, realized that one also a while back.

@jakubgs
Contributor Author

jakubgs commented Jun 8, 2020

Any plans on this? I'd like to close status-im/infra-nimbus#13 but first I'd like to finish the work on the metrics dashboard for that fleet.

@kdeme
Contributor

kdeme commented Jun 9, 2020

I'll fix that now.

kdeme added a commit that referenced this issue Jun 9, 2020
kdeme closed this as completed in 37d8720 Jun 9, 2020

@jakubgs
Contributor Author

jakubgs commented Jun 9, 2020

Currently I don't see any counts at all:

admin@node-01.do-ams3.waku.test:~ % curl -s localhost:8008/metrics
# HELP process_info CPU and memory usage
# TYPE process_info gauge
process_virtual_memory_bytes 60432384.0
process_resident_memory_bytes 37400576.0
process_start_time_seconds 1591704399.04
process_cpu_seconds_total 0.48
process_max_fds 1048576.0
process_open_fds 13.0

# HELP nim_runtime_info Nim runtime info
# TYPE nim_runtime_info gauge
nim_gc_mem_bytes 1052672.0
nim_gc_mem_occupied_bytes 499360.0
nim_gc_heap_instance_occupied_bytes{type_name="string"} 1706823.0
nim_gc_heap_instance_occupied_bytes{type_name="seq[AsyncCallback]"} 341280.0
nim_gc_heap_instance_occupied_bytes{type_name="seq[Message]"} 311328.0
nim_gc_heap_instance_occupied_bytes{type_name="Future[system.void]"} 311040.0

# HELP connected_peers number of peers in the pool
# TYPE connected_peers gauge
connected_peers 3.0
connected_peers_created 1591704400.0

# HELP envelopes_valid Received & posted valid envelopes
# TYPE envelopes_valid counter
envelopes_valid_total 0.0
envelopes_valid_created 1591704400.0

# HELP envelopes_dropped Dropped envelopes
# TYPE envelopes_dropped counter

I guess because there's no envelopes dropped yet. How could I cause this artificially?
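
For context, a labeled counter with no samples is what you would expect if the metrics library creates each label series lazily, on the first increment. The Python sketch below models that behaviour under that assumption; `LabeledCounter` is a hypothetical toy class, not the API of the library used by this project, and it leaves out the `_created` samples.

```python
class LabeledCounter:
    """Toy counter with one label; a series exists only after its first inc()."""

    def __init__(self, name: str, help_text: str, label: str) -> None:
        self.name = name
        self.help_text = help_text
        self.label = label
        self.children = {}  # label value -> running count

    def inc(self, label_value: str, amount: float = 1.0) -> None:
        self.children[label_value] = self.children.get(label_value, 0.0) + amount

    def render(self) -> str:
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for value, count in self.children.items():
            lines.append(f'{self.name}_total{{{self.label}="{value}"}} {count}')
        return "\n".join(lines)


dropped = LabeledCounter("envelopes_dropped", "Dropped envelopes", "reason")
print(dropped.render())          # only the HELP and TYPE lines so far
dropped.inc("duplicate", 200)
print(dropped.render())          # now includes a reason="duplicate" sample
```

Under this model the HELP and TYPE lines appear as soon as the metric is registered, but no sample line can exist until some label value has been seen.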

@jakubgs
Contributor Author

jakubgs commented Jun 9, 2020

Also, it would be nice to have an envelopes_received_total metric too, to compare against dropped and valid.

@kdeme
Contributor

kdeme commented Jun 9, 2020

> I guess because there's no envelopes dropped yet. How could I cause this artificially?

I can generate some traffic

@jakubgs
Contributor Author

jakubgs commented Jun 9, 2020

Nice, it works:

# HELP envelopes_dropped Dropped envelopes
# TYPE envelopes_dropped counter
envelopes_dropped_total{reason="duplicate"} 200.0
envelopes_dropped_created{reason="duplicate"} 1591705726.0

Lovely.

@jakubgs
Contributor Author

jakubgs commented Jun 9, 2020

Even better:

admin@node-01.do-ams3.waku.test:~ % curl -s localhost:8008/metrics | grep '^envelopes_dropped'
envelopes_dropped_total{reason="duplicate"} 8446.0
envelopes_dropped_created{reason="duplicate"} 1591705726.0
envelopes_dropped_total{reason="benign_duplicate"} 743.0
envelopes_dropped_created{reason="benign_duplicate"} 1591706036.0
envelopes_dropped_total{reason="expired"} 651.0
envelopes_dropped_created{reason="expired"} 1591706037.0
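
As a rough sketch of what the labeled format makes easy, the Python below parses the output above into a per-reason dictionary and sums it. The regex and the `dropped_by_reason` helper are illustrative assumptions; in practice a dashboard would do this with a query against Prometheus rather than by parsing text.

```python
import re

# Matches only the counter samples, so the *_created timestamps are skipped.
SAMPLE = re.compile(
    r'^envelopes_dropped_total\{reason="(?P<reason>[^"]*)"\}\s+(?P<value>\S+)$'
)


def dropped_by_reason(exposition: str) -> dict:
    """Map each reason label to its counter value."""
    counts = {}
    for line in exposition.splitlines():
        m = SAMPLE.match(line.strip())
        if m:
            counts[m["reason"]] = float(m["value"])
    return counts


output = """\
envelopes_dropped_total{reason="duplicate"} 8446.0
envelopes_dropped_created{reason="duplicate"} 1591705726.0
envelopes_dropped_total{reason="benign_duplicate"} 743.0
envelopes_dropped_created{reason="benign_duplicate"} 1591706036.0
envelopes_dropped_total{reason="expired"} 651.0
envelopes_dropped_created{reason="expired"} 1591706037.0
"""

counts = dropped_by_reason(output)
total = sum(counts.values())
print(counts)
print(total)  # 9840.0 for the sample above
```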

staheri14 pushed a commit that referenced this issue Oct 29, 2020