Question: performance data and metrics? #271

Closed
imjasonh opened this issue Feb 18, 2021 · 2 comments · Fixed by #288
Labels
enhancement New feature or request

Comments

@imjasonh

As an operator, it would be useful to see how much time/storage/bandwidth my users are saving by enabling stargz-snapshotter. I'm excited to see the benchmarks in the README, but those might not be representative of my own images. As future improvements to the snapshotter are released and adopted, I'd also like to see whether the time/storage/bandwidth savings are improving or regressing over time.

I see the state directory tracks some of this information, and that's a reasonable start. It'd be great to scrape this and emit metrics that can be more easily digested by monitoring tools, to make pretty graphs 📉

Real-world usage information could also be collected and fed into future optimizations, like prioritizing files that actually get fetched in production. Per-file fetch data could help users identify unnecessary bloat in their images and help even non-stargz-snapshotter users.

@ktock
Member

ktock commented Feb 18, 2021

Thanks for the suggestion.
+1 for having metrics that can be fed into other tools. Do you have any suggestions for monitoring tools and/or data formats?

Maybe we can start from the data currently exposed in the state directory and extend it to cover other information (accessed files, etc.).

@imjasonh
Author

I don't know enough about containerd's existing metrics, whether there's anything you can reuse or piggyback off of.

I saw the [metrics] section of the containerd config docs:

[metrics] : Section to enable and configure a metrics listener. Contains two properties:

address (Default: "") Metrics endpoint does not listen by default
grpc_histogram (Default: false) Turn on or off gRPC histogram metrics
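
For reference, the two properties quoted above would look roughly like this in containerd's config.toml. The listen address is an arbitrary example value, not a recommended default:

```toml
[metrics]
  # Example address only; the endpoint does not listen unless this is set.
  address = "127.0.0.1:1338"
  # Leave gRPC histogram metrics off, matching the documented default.
  grpc_histogram = false
```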

That section seems to be for container-level metrics like CPU and memory usage, but maybe there's an option to extend it with snapshotter metrics. If not, the snapshotter binary could at least emit its own metrics in a similar way, for Prometheus etc. to scrape.

2 participants