Prometheus Metrics Endpoint #346

Closed
computeralex92 opened this issue Dec 29, 2019 · 25 comments
Labels
good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@computeralex92
Contributor

In a server/client setup, it would be great if Trivy exposed some metrics about the scans that happen on the central server.
Some useful metrics for my implementation:

  • Last DB Update (timestamp)
  • Last DB Update Attempt (timestamp)
  • Sum of Issues found
  • Sum of Issues found, split by severity
  • Sum of Issues found, split by source (OS, Python, Node, etc.)

As Trivy is built to scan Docker images, I would suggest providing such metrics via a Prometheus metrics endpoint, because Prometheus is quite popular in the Docker / Kubernetes community.
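As a rough illustration of what such an endpoint could return, the sketch below renders the requested metrics in the Prometheus text exposition format. The metric names, label names, and values are hypothetical and were not agreed on in this thread, and label values are not escaped here.

```python
def render_metrics(samples):
    """Render (name, labels, value) samples in the Prometheus text format."""
    lines = []
    for name, labels, value in samples:
        if labels:
            # Sort labels so the output is stable between calls.
            label_text = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric names and values, for illustration only.
samples = [
    ("trivy_db_last_update_timestamp_seconds", {}, 1593184501),
    ("trivy_db_last_update_attempt_timestamp_seconds", {}, 1593188101),
    ("trivy_issues_total", {"severity": "HIGH", "source": "os"}, 12),
    ("trivy_issues_total", {"severity": "LOW", "source": "python"}, 3),
]
print(render_metrics(samples), end="")
```

Keeping severity and source as labels on one metric, rather than as separate metric names, would let a dashboard sum or filter by either dimension.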

@computeralex92 computeralex92 added the kind/deprecation Categorizes issue or PR as related to a feature/enhancement marked for deprecation. label Dec 29, 2019
@knqyf263 knqyf263 added the good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. label Dec 30, 2019
@knqyf263
Collaborator

Nice suggestion. I think this improvement can be done step by step. It is not difficult to add a Prometheus metrics endpoint. PRs are welcome!

@knqyf263 knqyf263 added help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed kind/deprecation Categorizes issue or PR as related to a feature/enhancement marked for deprecation. labels Apr 30, 2020
@yashvardhan-kukreja
Contributor

yashvardhan-kukreja commented May 8, 2020

Hi @knqyf263 , I am new to open source dev although I have had experience working with git extensively and some experience with golang too. So, can I hop on to developing a PR for this issue considering the "good first issue" label?

@knqyf263
Collaborator

Hi @yashvardhan-kukreja, thank you for your interest! Yes, it would be helpful. As a first step, we can just return the database information such as Last DB Update as he mentioned.

Here is the server mux.
https://github.com/aquasecurity/trivy/blob/master/pkg/rpc/server/listen.go#L61-L79

You can get the database metadata like the following.
https://github.com/aquasecurity/trivy/blob/master/internal/operation/operation.go#L84-L93

@yashvardhan-kukreja
Contributor

Hi @knqyf263, sorry, I was caught up with some crucial work for the past month. Now I am back on this.

@yashvardhan-kukreja
Contributor

@knqyf263, I made some mistakes in pull request #540, so I closed it and opened a new PR (#542) for this issue.
If you find it suitable, please delete #540.
Sorry for the inconvenience.

@knqyf263
Collaborator

Hi @yashvardhan-kukreja, this is an OSS project, so you don't have to apologize for not having time to work on this issue. I'm so grateful for your contribution! AFAIK, we can't delete a PR on GitHub. It is enough to close the PR.

@yashvardhan-kukreja
Contributor

yashvardhan-kukreja commented Jun 23, 2020

@knqyf263 , @computeralex92 , I have a few basic doubts with this issue. Please clarify them:

  1. So, first of all, in the first line, what exactly does the "central server" mean? Like does it mean the server/host/computer where the trivy server --listen command got executed?
  2. So, here are we looking to setup a GET /metrics endpoint which would return (respond with) metrics like "Last DB Update" for prometheus?
  3. Finally, to implement these custom metrics, the way I look at it, it seems that I would need to utilise the "promauto" and "prometheus" packages. Am I right?

@computeralex92
Contributor Author

computeralex92 commented Jun 23, 2020

@yashvardhan-kukreja First of course thank you for implementing this idea.
Unfortunately I had no time in the last months to do it on my own.

Regarding your questions:

1. So, first of all, in the first line, what exactly does the "central server" mean? Like does it mean the server/host/computer where the `trivy server --listen` command got executed?

Correct.
Use case:
As part of a CI/CD pipeline, I want to monitor the performed scans and the trivy setup e.g. via Grafana.
Since the client (within the pipeline) should not download the DB etc., the scan happens on a Trivy server started with `trivy server`.

2. So, here are we looking to setup a `GET /metrics` endpoint which would return (respond with) metrics like "Last DB Update" for prometheus?

Correct.
The idea is to monitor the status of the DB and, for example, be alerted if the DB gets too old or can no longer update.
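The alerting idea above can be sketched as two small checks over the "Last DB Update" and "Last DB Update Attempt" timestamps. The thresholds and function names are assumptions for illustration. In practice this logic would more likely live in a Prometheus alerting rule that compares `time()` with the exported timestamp than in application code.

```python
import time

MAX_DB_AGE_SECONDS = 24 * 60 * 60  # assumed threshold: one day


def db_is_stale(last_update_ts, now=None, max_age=MAX_DB_AGE_SECONDS):
    """Return True when the last successful DB update is older than max_age."""
    now = time.time() if now is None else now
    return (now - last_update_ts) > max_age


def update_is_failing(last_update_ts, last_attempt_ts, grace=3600):
    """Return True when an attempt happened well after the last success."""
    return (last_attempt_ts - last_update_ts) > grace


# Example: last success two days before "now", last attempt one hour ago.
now = 1593184501
print(db_is_stale(now - 2 * 86400, now=now))
print(update_is_failing(now - 2 * 86400, now - 3600))
```

Exporting both timestamps is what makes the second check possible: a recent attempt combined with an old success suggests updates are being tried but are not succeeding.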

3. Finally, to implement these custom metrics, the way I look at it, it seems that I would need to utilise the "promauto" and "prometheus" packages. Am I right?

No clue, sorry.

@yashvardhan-kukreja
Contributor

@computeralex92 thanks for the quick and descriptive reply. It cleared everything up.
No worries regarding 3rd question, I mainly wanted to confirm the first two questions.
I'll start working on implementing this, @knqyf263 😄

@yashvardhan-kukreja
Contributor

yashvardhan-kukreja commented Jun 26, 2020

@computeralex92 @knqyf263, while thinking about how to export metrics for Last DB Update, I came up with this idea:

On GET /metrics, this would be the output:
DBUpdate{time="2020-06-26 14:54:38.198245437 +0000 UTC"} 1
DBUpdate{time="2020-06-26 14:54:38.698289119 +0000 UTC"} 1
DBUpdate{time="2020-06-26 14:54:39.198286756 +0000 UTC"} 1

Here, the DBUpdate metric is a counter with "time" as a label, so a separate counter is created for every timestamp.

If I implement this, then whenever a DB update occurs in Trivy, for example at 2020-06-26 14:54:38, an entry DBUpdate{time="2020-06-26 14:54:38"} 1 will be added to the existing DB Update metrics.

With that, I believe we could easily fetch the Last DB Update. We could also plot every time a DB update happened, or find things like the first DB update, because all DB updates for that session would be stored in the metrics.

Should I go ahead and implement this? If not, would you suggest another way of storing the DB Update metrics and exposing them at the /metrics endpoint?

@strowi

strowi commented Jun 26, 2020

Hi,

Nice work so far. If I may, a suggestion from the Prometheus standpoint:
We had something very similar implemented at work. The problem with putting metric values inside the labels is that it might (or most definitely will) blow up your TSDB.
If possible, it might be better to put the timestamp in the metric value, like:

trivy{action="dbupdate"} 1593184501

You could still see from the metrics when the updates did happen?

PS: you might also want to check the Prometheus guide on naming conventions, but that's probably more cosmetic ;)
https://prometheus.io/docs/practices/naming/
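To make the difference concrete, here is a toy model of a time-series store in which every unique label set is its own series. The `SeriesStore` class is a hypothetical stand-in, not Prometheus or Trivy code, and the timestamps are example Unix times.

```python
class SeriesStore:
    """Toy stand-in for a time-series store: one entry per unique label set."""

    def __init__(self):
        self.series = {}

    def set(self, name, labels, value):
        key = (name, tuple(sorted(labels.items())))
        self.series[key] = value

    def count(self, name):
        return sum(1 for (n, _labels) in self.series if n == name)


store = SeriesStore()
update_times = [1593183278, 1593183279, 1593183280]

for ts in update_times:
    # Timestamp as a label: every update creates a brand-new series.
    store.set("DBUpdate", {"time": str(ts)}, 1)
    # Timestamp as the value: the same series is overwritten each time.
    store.set("trivy", {"action": "dbupdate"}, ts)

print(store.count("DBUpdate"))  # 3
print(store.count("trivy"))     # 1
```

With the timestamp as a label, the number of series grows with every update. With the timestamp as the value, there is always exactly one series, and its history is still available from the scraped samples.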

@yashvardhan-kukreja
Copy link
Contributor

yashvardhan-kukreja commented Jun 26, 2020

Thanks for the suggestion, @strowi. So, just to confirm: every time a DB update happens, Trivy will just overwrite trivy{action="dbupdate"}, so whenever we GET /metrics, we can simply look at trivy{action="dbupdate"} to see the latest DB update (because its value would be the overwritten timestamp of the latest DB update).

I hope I am right?

@strowi

strowi commented Jun 26, 2020

@yashvardhan-kukreja yes, you will always get the latest Unix timestamp in a single metric, which gets overwritten. Otherwise, if the labels change, Prometheus sees it as a different time series.

For Example:
This comes especially into play if you want to get metrics for images + count of vulnerabilities:

Using a tagged build, you will get a metric for a specific image:

trivy_container_issues{image="dr.cooking.net/something/nginx:build-master-777",instance="production",job="trivy_scan",monitor="production",namespace="sth"} 123

But if you update the image (maybe fixing the vulnerabilities), you create another metric:

trivy_container_issues{image="dr.cooking.net/something/nginx:build-master-777",instance="production",job="trivy_scan",monitor="production",namespace="sth"} 123
trivy_container_issues{image="dr.cooking.net/something/nginx:build-master-778",instance="production",job="trivy_scan",monitor="production",namespace="sth"} 10

If you have an alert on this, you will still get the alerts for the previous image..

Same principle for DB-updates.
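The effect described above can be reproduced with a plain dictionary keyed by label values. The second half shows one possible mitigation, keying on the repository without the tag, which is an assumption of this sketch and not something proposed in the thread.

```python
def split_image(image):
    """Split 'repo:tag' into (repo, tag); the tag may be missing."""
    repo, sep, tag = image.rpartition(":")
    # A ':' inside a registry host:port part (with no tag) is not handled here.
    return (repo, tag) if sep else (image, "")


scans = [
    ("dr.cooking.net/something/nginx:build-master-777", 123),
    ("dr.cooking.net/something/nginx:build-master-778", 10),
]

# Full image reference as a label: the old build stays around as its own series.
by_image = {}
for image, issues in scans:
    by_image[("trivy_container_issues", image)] = issues

# Repository only as a label: the newer scan overwrites the older one.
by_repo = {}
for image, issues in scans:
    repo, _tag = split_image(image)
    by_repo[("trivy_container_issues", repo)] = issues

print(len(by_image), len(by_repo))
```

An alert on the first layout would keep firing for build 777 until that series disappears, while the second layout only ever reflects the most recent scan of the repository.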

@yashvardhan-kukreja
Contributor

yashvardhan-kukreja commented Jun 26, 2020

This seems like a fabulous approach to me @strowi , thanks a lot for this.
@computeralex92 , @knqyf263 this seems perfect to me, to be honest.
What do you think, should I start moving on to implementing this?

@knqyf263
Collaborator

@yashvardhan-kukreja It looks fine to me!

yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Jun 29, 2020
yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Jul 30, 2020
yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Jul 31, 2020
yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Aug 5, 2020
yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Aug 5, 2020
yashvardhan-kukreja pushed a commit to yashvardhan-kukreja/trivy that referenced this issue Aug 12, 2020
parent 4b57c0d
author Simarpreet Singh <simar@linux.com> 1594135002 -0700
committer Yashvardhan Kukreja <yash.kukreja.98@gmail.com> 1597228077 +0530

# This is a combination of 6 commits.
# This is the 1st commit message:

db: Update trivy-db to include CVSS score info (aquasecurity#530)

* mod: Update trivy-db to include CVSS score info

Signed-off-by: Simarpreet Singh <simar@linux.com>

* mod: Update go.mod

Signed-off-by: Simarpreet Singh <simar@linux.com>

* mod: Update trivy-db to latest

Signed-off-by: Simarpreet Singh <simar@linux.com>
# This is the commit message aquasecurity#2:

Adding contrib/junit.tpl to docker image (aquasecurity#554)


# This is the commit message aquasecurity#3:

Fixing `Error retrieving template from path` when --format is not template but template is provided (aquasecurity#556)


# This is the commit message aquasecurity#4:

added: display last db update whenever trivy server is started in trivy client/server setup

# This is the commit message aquasecurity#5:

Added: entry for prometheus/client_golang package

# This is the commit message aquasecurity#6:

Added: prometheus metrics endpoint support for Last DB Update and Last DB Update Attempt metric

# This is the commit message aquasecurity#7:

Added: entry for prometheus/client_golang package

# This is the commit message aquasecurity#8:

Added: prometheus metrics endpoint support for Last DB Update and Last DB Update Attempt metric

# This is the commit message aquasecurity#9:

Refactored: Shifted the GaugeVec global var to config.go . Removed unnecessarily repeated vars. Added nil check for GaugeVec

# This is the commit message aquasecurity#10:

Added: Nil GaugeVec Fail check

# This is the commit message aquasecurity#11:

Added: nil check for metrics registry

# This is the commit message aquasecurity#12:

Modified: tests with respect to nil metrics registry

# This is the commit message aquasecurity#13:

Merge with master

# This is the commit message aquasecurity#14:

Merge branch 'master' into issue-aquasecurity#346

# This is the commit message aquasecurity#15:

Resolved merge conflicts

# This is the commit message aquasecurity#16:

Resolved merge conflicts

# This is the commit message aquasecurity#17:

feat(vulnerability): add CWE-ID (aquasecurity#561)

* chore(mod): update dependency

* test(vulnerability): add CweIDs
yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Aug 12, 2020
yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Sep 15, 2020
yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Sep 27, 2020
yashvardhan-kukreja added a commit to yashvardhan-kukreja/trivy that referenced this issue Oct 5, 2020
@krol3 krol3 added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Mar 21, 2021
@dzabel

dzabel commented Aug 13, 2021

Hi,

I'm very interested in this feature.
What exactly is the current state of the issue, and how will it proceed?

Cheers,

Daniel

@bygui86

bygui86 commented Nov 3, 2021

hi guys, what's the status of this?

@DracoBlue

Ping! :)

@andrisro

Ping! :)

liamg pushed a commit that referenced this issue Jun 7, 2022
liamg pushed a commit that referenced this issue Jun 7, 2022
josedonizetti pushed a commit to josedonizetti/trivy that referenced this issue Jun 24, 2022
…quasecurity#346)

* chore: remove general rules to prepare for tfsec scanner decoupling
@bygui86

bygui86 commented Jul 19, 2022

hi guys, still no updates on this? :( it would be a really helpful feature!

@DracoBlue

We are interested in this too. Maybe one of our Endava Go developers can create a PR for it.

@nthienan

Ping! :)

@jc16180

jc16180 commented Nov 3, 2022

Ping !

@knqyf263
Collaborator

It is probably not the answer you want, but at the moment we don't have enough maintainers, so we are concentrating our resources on Trivy Operator rather than extending the Trivy server. The operator supports Prometheus. You can use it. We hope for your kind understanding.

@strowi

strowi commented May 15, 2023

For anyone stumbling on this: I threw together a small bash script that checks all images running in a cluster and pushes the metrics to a Pushgateway.
It can be adapted for CI and should be pretty straightforward: https://gitlab.com/strowi/trivy-check
Maybe it helps someone.
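For readers unfamiliar with the push model, the sketch below builds (but does not send) a request to a Prometheus Pushgateway using its `/metrics/job/<job>/<label>/<value>` URL scheme. It is not taken from the linked script. The gateway address, job name, grouping label, and metric are all hypothetical, and label values containing `/` are not handled.

```python
import urllib.request


def build_push_request(gateway_url, job, body, grouping=None):
    """Build (but do not send) a PUT request for a Prometheus Pushgateway."""
    path = f"/metrics/job/{job}"
    for label, value in (grouping or {}).items():
        path += f"/{label}/{value}"
    return urllib.request.Request(
        gateway_url.rstrip("/") + path,
        data=body.encode("utf-8"),  # the body must end with a newline
        method="PUT",
    )


# Hypothetical gateway address, job name, and metric.
req = build_push_request(
    "http://pushgateway.example:9091",
    "trivy_scan",
    'trivy_container_issues{image="nginx"} 10\n',
    grouping={"instance": "ci"},
)
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req)
```

A PUT replaces all metrics in the given grouping, which suits a periodic scan job: each run overwrites the previous run's results.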
