mgr/telemetry: handle empty device report when "send" is triggered #44994
Looks like the |
Thanks @ljflores it works now!
@ljflores looks good; I had a couple of comments
adding a note here that this should not be backported since the assert was introduced with the |
let's squash the fixes
On certain environments, such as the "ceph-dev-docker" environment (https://github.com/ricardoasmarques/ceph-dev-docker), the mgr module is unable to fetch device metrics. As a result, the device report generated by "gather_device_report()" returns an empty dict. This causes an AssertionError when the "send" function is triggered (i.e. by running `ceph telemetry status` or `ceph telemetry send`), and the module crashes.

The fix in this commit checks that the generated device report contains metrics before trying to send it. If the device report does not contain metrics (it returns an empty dict), the module will log an appropriate message in the mgr log and not send the device report.

If this scenario happens when running the `ceph telemetry send` command, the user will additionally see this message:

```
Ceph report sent to https://telemetry.ceph.com/report
Unable to send device report: channel is on, but generated report was empty.
```

I also added a few more debug messages in gather_device_report() to make future debugging easier.

Fixes: https://tracker.ceph.com/issues/54250
Signed-off-by: Laura Flores <lflores@redhat.com>
Failures, unrelated: Details: |
On certain environments, such as the "ceph-dev-docker" environment
(https://github.com/ricardoasmarques/ceph-dev-docker), the mgr
module is unable to fetch device metrics. As a result, the device
report generated by "gather_device_report()" in telemetry returns an empty dict.
This causes an AssertionError when the "send" function is triggered
(i.e. by running `ceph telemetry status` or `ceph telemetry send`), and the module crashes.
The fix in this commit checks that the generated device report
contains metrics before trying to send it. If the device report
does not contain metrics (it returns an empty dict), the module
will log an appropriate message in the mgr log and not send the
device report.
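
The guard described above can be sketched roughly as follows. This is a minimal, standalone illustration, not the module's actual code: `send_reports`, its parameters, and the `post` callback are hypothetical names invented for this example; only the user-facing message text is taken from the commit message.

```python
import logging
from typing import Any, Callable, Dict, List

log = logging.getLogger("telemetry_sketch")

# Message text as quoted in the commit message.
EMPTY_DEVICE_REPORT_MSG = (
    "Unable to send device report: channel is on, "
    "but generated report was empty."
)


def send_reports(
    report: Dict[str, Any],
    device_report: Dict[str, Any],
    device_channel_on: bool,
    post: Callable[[str, Dict[str, Any]], None],
) -> List[str]:
    """Hypothetical send routine: skip the device report when it is empty.

    `post` stands in for whatever actually uploads a payload; it is an
    assumption made for this sketch.
    """
    messages: List[str] = []

    # The main report is sent regardless of the device report's state.
    post("report", report)
    messages.append("Ceph report sent")

    if not device_channel_on:
        return messages

    # Check for an empty dict instead of asserting, so an environment that
    # cannot fetch device metrics no longer crashes the caller.
    if not device_report:
        log.warning(EMPTY_DEVICE_REPORT_MSG)
        messages.append(EMPTY_DEVICE_REPORT_MSG)
        return messages

    post("device", device_report)
    messages.append("Device report sent")
    return messages
```

Calling `send_reports({"a": 1}, {}, True, post)` in this sketch uploads only the main report and returns the "Unable to send device report" message alongside the success message, instead of raising.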
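If this scenario happens when running the `ceph telemetry send` command, the user will additionally see this message:
    Ceph report sent to https://telemetry.ceph.com/report
    Unable to send device report: channel is on, but generated report was empty.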
Fixes: https://tracker.ceph.com/issues/54250
Signed-off-by: Laura Flores <lflores@redhat.com>