Add monitoring and stats for the raster server #3679

Open
paulmelnikow opened this issue Jul 8, 2019 · 4 comments
Labels
operations Hosting, monitoring, and reliability for the production badge servers

Comments

@paulmelnikow
Member

It would be good to know how many requests are hitting the server, maybe get errors in Sentry, and have a frequent test that hits the server and ensures it's still responding with valid PNGs.
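A minimal sketch of the body check such a test could run, assuming the probe already has the response bytes in hand. The name `looks_like_png` is hypothetical. It only inspects the 8-byte PNG signature and the leading IHDR chunk; it does not verify CRCs or decode the image.

```python
import struct

# Fixed 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# signature (8) + chunk length (4) + chunk type (4) + IHDR data (13) + CRC (4)
MIN_PNG_PREFIX = 8 + 4 + 4 + 13 + 4


def looks_like_png(body: bytes) -> bool:
    """Return True if body starts with a PNG signature and a plausible IHDR chunk.

    Hypothetical helper: a shallow structural check, not a full decode.
    """
    if len(body) < MIN_PNG_PREFIX:
        return False
    if body[:8] != PNG_SIGNATURE:
        return False
    # The first chunk must be IHDR and its data is always 13 bytes long.
    length, chunk_type = struct.unpack(">I4s", body[8:16])
    if chunk_type != b"IHDR" or length != 13:
        return False
    # Width and height are the first two big-endian 32-bit fields of IHDR.
    width, height = struct.unpack(">II", body[16:24])
    return width > 0 and height > 0
```

A scheduled job could fetch a known badge URL, pass the body to this function, and alert when it returns `False`.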

@paulmelnikow paulmelnikow added the operations Hosting, monitoring, and reliability for the production badge servers label Jul 8, 2019
@paulmelnikow paulmelnikow changed the title Add monitoring and stats for svg-to-image-proxy Add monitoring and stats for the raster server Jul 8, 2019
@platan
Member

platan commented Oct 15, 2019

I started to experiment with https://github.com/prometheus/blackbox_exporter. I installed it on metrics.shields.io and added two HTTP targets to Prometheus: https://shields.io/ and https://raster.shields.io/badge/foo-bar-blue.png. Checks are performed every 15 seconds (we can change it). This way we gather a lot of data. I also added https://grafana.com/grafana/dashboards/7587 to our Grafana.
A list of targets being checked is stored in: https://github.com/platan/metrics-shields-io-config/blob/3759b07de7fb1883506a3cededd2023944f00ca2/shields-io-metrics.yml#L63-L64
Final dashboard is available here: https://metrics.shields.io/d/xtkCtBkiz/prometheus-blackbox-exporter?orgId=1&refresh=10s

AFAIK it's not possible to verify the response body using blackbox exporter. But we do have stats for the status code and content length of the response, and we can use this data (expecting status code = 200 and content length = 972, the current response size of foo-bar-blue.png).
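The status-code and content-length check described above can be sketched as follows. `ProbeSample` and `probe_ok` are hypothetical names, and the values are assumed to have been read already, for example from the exporter's `probe_http_status_code` and `probe_http_content_length` metrics. The 972-byte figure comes from this comment and would need updating whenever the badge rendering changes.

```python
from dataclasses import dataclass

EXPECTED_STATUS = 200
# Size of foo-bar-blue.png quoted in this comment; an assumption that
# goes stale as soon as the rendered badge changes.
EXPECTED_LENGTH = 972


@dataclass(frozen=True)
class ProbeSample:
    """One probe result (hypothetical container for the two values)."""
    status_code: int
    content_length: int


def probe_ok(sample: ProbeSample, expected_length: int = EXPECTED_LENGTH) -> bool:
    """Return True when the probe saw a 200 response of the expected size."""
    return (
        sample.status_code == EXPECTED_STATUS
        and sample.content_length == expected_length
    )
```

An exact length match is strict by design: an error page or an SVG served by mistake would almost certainly have a different size, at the cost of a false alarm after any intentional rendering change.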

@platan
Member

platan commented Nov 4, 2019

https://metrics.shields.io/d/xtkCtBkiz/prometheus-blackbox-exporter now has stats for the raster badge covering the last two weeks. The average response time is 458 ms.
[Screenshot, 2019-11-04: Prometheus Blackbox Exporter dashboard in Grafana]

Today I've added probing of the most popular SVG badges: static_badge, npm_version, travis_build, npm_downloads, github_stars. Results can be seen at https://metrics.shields.io/d/xtkCtBkiz/prometheus-blackbox-exporter.

@paulmelnikow
Member Author

Nice! I'm guessing that request is hitting the Now CDN cache?

@platan
Member

platan commented Nov 4, 2019

I think you are right. Response times for new badges not requested before are 2x-3x longer than response times for badges requested several times.
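A rough sketch of how that comparison could be reproduced, assuming a caller-supplied `fetch` callable that requests one badge URL. `cold_vs_warm` is a hypothetical helper. For the first call to be a real cache miss, the URL would have to be one never requested before, for example a static badge with a random label.

```python
import time
from typing import Callable, Tuple


def cold_vs_warm(
    fetch: Callable[[], object],
    warm_requests: int = 3,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Time the first call to fetch, then average the following calls.

    Returns (cold_seconds, warm_average_seconds). Hypothetical helper:
    whether the first call actually misses the CDN cache depends on the URL.
    """
    if warm_requests < 1:
        raise ValueError("warm_requests must be at least 1")

    # First request: expected to miss the cache if the URL is new.
    start = clock()
    fetch()
    cold = clock() - start

    # Repeated requests: expected to be served from the cache.
    warm_total = 0.0
    for _ in range(warm_requests):
        start = clock()
        fetch()
        warm_total += clock() - start

    return cold, warm_total / warm_requests
```

A ratio `cold / warm_average` of roughly 2 to 3 would match the observation above.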
