Add extra labels to prometheus report (warn and error level only) #68

sky-philipalmeida · 2020-01-08T00:34:43Z

Basic details about issues for prometheus report.
Afaik it can be largely improved.

Capabilities:

Expose detail messages for the outcome issues
Fine tune the output via lint level flag

Atm the final result on the Prometheus POV would be (example for service):
$popeye -o prometheus --pushgateway-address http://localhost:9091 -l error -n testns --out-detail verbose

popeye_sanitizer_reports_count{cluster="testcluster",issues="testns/kafka [POP-1100] No pods match service selector,testns/kafka [POP-1105] No associated endpoints",job="popeye",level="error",namespace="testns/kafka",resource="service"} | 1

Basic details about issues. Can be largely improved just a POC

derailed · 2020-01-08T14:48:22Z

@sky-philipalmeida Thank you for the PR and thoughts!!

One of the thing I love most about prometheus is it's all about that one thing ie gather metrics and not trying or pretending to be for anything else. IMHO this edges a bit off this essence as the labels are only there to support queries for series. I feel that Popeye prom output should all be about gathering signals and not shoehorn sanitization details. ie labels should be used for dimensionality and not kitchen drawers. Perhaps storing in a label a sanitizer report ID might be a better approach here?

That said, I am open for debate here of course, but these are my initial thoughts.

sky-philipalmeida · 2020-01-08T15:20:28Z

@derailed I understand your pov.

So your are thinking, eg., to use the save option and later on alertmanager expose the saved file using a url by report id? That makes sense for sure and it is the proper way I think...

...but, there are some caveats afaik, I just checked and the current save options do not play well with HTML browser view, also prometheus output format does not support file save. So if this is the case afaik there should be a new HTML compatible report and support for the -s option on prometheus output format (that would also save to HTML if -s enabled).

Sincerely I think both approaches are valid.

Have a flag that the user could force a label with the issue list (basically my initial approach), this would make possible queries to specific namespaces, or only to specifc error codes which also makes sense.

If you are ok with this I have time to refactor in order to overcome the current Codesense as this is a quick win for my goal.

What do you think?

derailed · 2020-01-08T15:47:25Z

@sky-philipalmeida Thank you so much for voicing your thoughts!

I think I agree with your approach so let me make sure we're on the same page.

Currently a user can generate and push prom metrics using the following command:

popeye -o prometheus --pushgateway-address=xxx

We want to store the associated sanitizer report location as a label along with the prom metrics.
So the user could issue the following command

popeye -o prometheus,json --pushgateway-address=xxx --save

So this command would allow the user to generate multiple report outputs and save them to disk and in the case the output is prometheus we would add a new label to the metrics with the location of the sanitizer report. This does also allow for generating different outputs in a single command no matter if prom is specified or not which I think could be useful. We generate the sanitization once and output to different formats.

The -o option could be limited to 2 outputs formats or if more than 2 are specified the first non prometheus formatter would be the location for the metric labels.
If only -o prometheus is specified and --save is given we would default to a json sanitizer report.

Would this satisfy your use cases? or can you think of a better way to achieve this?

Thank you Phil!!

pkg/popeye.go

internal/report/prometheus.go

sky-philipalmeida · 2020-01-09T01:20:04Z

@derailed enabling multiple output formats will require careful refactor and maybe should be addressed in another feature request IMHO.

The current proposal enables:

save option on prometheus output format with default json output.
new label target if --save is present that will have the report path to the json file
new verbose option "--out-detail verbose" that enables new PromSQL queries and labels issues

Having the path on the target label would need maybe a webserver exposing the files in order to be usefull in Alertmanager links but that is just an ideia.

current implementation status:

popeye -o prometheus --pushgateway-address http://localhost:9091 -n test -l error --save

popeye_sanitizer_reports_count{cluster="test",job="popeye",level="error",namespace="test",resource="clusterrole",target="/var/folders/3n/14849dl537vb3ts_x_95l9_mfm4j0j/T/popeye/sanitizer_test_1578531141195068000.json"}

popeye -o prometheus --pushgateway-address http://localhost:9091 -n test -l error --out-detail verbose

popeye_sanitizer_reports_count{cluster="test",issues="test/test [POP-1100] No pods match service selector,test/test [POP-1105] No associated endpoints",job="popeye",level="error",namespace="test",resource="service"}

derailed · 2020-01-10T22:27:56Z

@sky-philipalmeida Thank you for review this! I think you are right we can tackle multiple reports in a separate push. As for the report location, I think we might need a different option. The report url could be provided in many shapes ie via http or others. So I am thinking perhaps --save does not quiet fit the bill. Looping in @guusvw here) are he owns the PR for s3 support.

NOTE: Not keen on the --out-details what happens if you reports are generating 10k+ issues per runs?
I think this should be out. The details are in the reports not the metrics.

I am kind of burnt this week but we need to think about --save being a flag to indicate how we generate the report url. ie file/http/s3 etc.. So I think perhaps --save file|s3|etc... might be more inline with what we're trying to achieve here??

popeye -o prometheus --pushgateway-address http://localhost:9091  --save=s3 -s3bucket=xxx

guusvw · 2020-01-13T08:27:49Z

I really like the idea of defining a "save storage" via --save and also have a flag for the concrete destination (s3 bucket, local file storage, FTP, ...), also be able to report Prometheus metrics in the same run would be awesome. (I'm currently doing this via two jobs)
I would also limit it to one "save report" & "report metrics" combination, anything else would make it to complicated and don't see a real use case for that.
It would also agree @sky-philipalmeida to do the refactoring not in this PR 👍
Maybe writing first a proposal and collect input and maybe we could afterwards split up the work.

guusvw · 2020-01-13T08:28:56Z

pkg/popeye.go

@@ -226,6 +234,7 @@ func (p *Popeye) ensureOutput() error {
 	ext := "txt"
 	switch *p.flags.Output {
 	case "json":
+	case "prometheus":


Suggested change

case "prometheus":

case "json", "prometheus":

sky-philipalmeida · 2020-01-15T09:16:11Z

@guusvw @derailed I will not proceed with the current change as we opted by another way. Using the json report parsing ourselfs into metric and exposing.
Thanks for all your help please feel free to close or change this PR as you see most fitted.

Update prometheus.go

bec7978

Basic details about issues. Can be largely improved just a POC

sky-philipalmeida changed the title ~~Update prometheus.go~~ Add extra labels to prometheus report - Issues Jan 8, 2020

sky-philipalmeida changed the title ~~Add extra labels to prometheus report - Issues~~ Add extra labels to prometheus report (warn and error level only) Jan 8, 2020

Almeida added 2 commits January 8, 2020 09:32

Cli flag --out-detail

ab57c9d

Dinamic lint level specification

53703e0

sky-philipalmeida marked this pull request as ready for review January 8, 2020 10:37

Merge branch 'master' into master

a676fe9

sky-philipalmeida mentioned this pull request Jan 8, 2020

Add outcome issue detail to the prometheus report. #69

Open

golangcibot reviewed Jan 9, 2020

View reviewed changes

pkg/popeye.go Outdated Show resolved Hide resolved

pkg/popeye.go Outdated Show resolved Hide resolved

internal/report/prometheus.go Outdated Show resolved Hide resolved

sky-philipalmeida force-pushed the master branch from 5b51fe2 to 12aece9 Compare January 9, 2020 00:59

Report file in json format

4fc1c4b

sky-philipalmeida force-pushed the master branch from 12aece9 to 4fc1c4b Compare January 9, 2020 01:02

guusvw reviewed Jan 13, 2020

View reviewed changes

sky-philipalmeida closed this Feb 10, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add extra labels to prometheus report (warn and error level only) #68

Add extra labels to prometheus report (warn and error level only) #68

sky-philipalmeida commented Jan 8, 2020 •

edited

Loading

derailed commented Jan 8, 2020

sky-philipalmeida commented Jan 8, 2020

derailed commented Jan 8, 2020

sky-philipalmeida commented Jan 9, 2020 •

edited

Loading

derailed commented Jan 10, 2020

guusvw commented Jan 13, 2020

guusvw Jan 13, 2020

sky-philipalmeida commented Jan 15, 2020

Add extra labels to prometheus report (warn and error level only) #68

Add extra labels to prometheus report (warn and error level only) #68

Conversation

sky-philipalmeida commented Jan 8, 2020 • edited Loading

derailed commented Jan 8, 2020

sky-philipalmeida commented Jan 8, 2020

derailed commented Jan 8, 2020

sky-philipalmeida commented Jan 9, 2020 • edited Loading

derailed commented Jan 10, 2020

guusvw commented Jan 13, 2020

guusvw Jan 13, 2020

Choose a reason for hiding this comment

sky-philipalmeida commented Jan 15, 2020

sky-philipalmeida commented Jan 8, 2020 •

edited

Loading

sky-philipalmeida commented Jan 9, 2020 •

edited

Loading