Add additional metrics reporting size of artifacts stored #4603
Comments
In thinking over the details here, there is an accuracy issue with this plan: while the bytes of artifacts new to this repository version are indeed new to this repository version, they may already be part of another repository version in this domain, so they aren't really new to Pulp. Say rpm Foo is 10MB. Repo A gets rpm Foo in version 1, and then repo B also gets rpm Foo in its version 1. Both repo versions would show 10MB, and summing them would suggest 20MB of storage used, but in reality the artifact is de-duplicated and only 10MB is used. That's problematic if the goal is to sum these values and have them be correct.
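The de-duplication arithmetic above can be sketched directly. This is a hypothetical illustration only; the digest keys and repo contents are made-up stand-ins, not Pulp's actual data model:

```python
# Hypothetical illustration of the de-duplication problem described above.
# Each repo version lists the artifacts it contains, keyed by digest.
artifact_sizes = {"sha256:foo": 10 * 1024 * 1024}  # rpm Foo, 10MB

repo_a_v1 = {"sha256:foo"}
repo_b_v1 = {"sha256:foo"}

# Summing per-repo-version totals double-counts shared artifacts:
naive_total = sum(
    artifact_sizes[d] for repo in (repo_a_v1, repo_b_v1) for d in repo
)

# On disk the artifact is stored once, so real usage is the union:
actual_total = sum(artifact_sizes[d] for d in repo_a_v1 | repo_b_v1)

print(naive_total)   # 20971520 (20MB), misleading
print(actual_total)  # 10485760 (10MB), what storage actually uses
```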
There is a desire to implement these two use cases:
@ipanova I think your summary of the discussion is good. @ALL I'd like to add the observation that only the operator of Pulp (not the operator of a domain) consumes the OpenTelemetry data. Given that this ticket is for OpenTelemetry data only, I propose we only report the total domain storage size, not anything at the individual repo or repo-version level.

When to report it? Ideally we'd emit the OTEL metric any time the storage used by a domain changes. That would include sync, orphan cleanup, and the pulp-content app streaming data (while also storing it); really, any operation that can change the amount of data stored for a domain.

How to report it? To me, reporting the total in bytes as an integer, along with the domain's "name" and its "pulp_href", would be sufficient. This would leave the reporting tooling that helps API users answer how much space specific repos or repository versions use as a completely separate topic for another time. What do others think?
This is no longer a valid statement. We decided to report metrics periodically.
At the moment, it is not possible to destroy instruments that send metrics. Therefore, when a user removes a domain, metrics about it may still be emitted. A temporary workaround is to restart the pulpcore-api process to reload meters. Ref: open-telemetry/opentelemetry-specification#2232 closes pulp#4603
Feature
When running Pulp on cloud services, data storage costs can be expensive, and understanding these costs is key. It would be excellent to know the answers to these questions:
Proposal
Have the saving of a RepositoryVersion emit this OpenTelemetry data.
With this data, aggregating reports (not part of this ticket) could be run in Prometheus to do things like: