common: counter dump command revision
#51947
ff4913b to 691fc31
counter dump command revision
53f83f6 to 448d445
@idryomov thank you for the review and the patch to refactor the change I made in If everything looks fine I'll squash the newest commit and add the exporter patch @avanthakkar is working on when it's done. We want that change in this PR right? |
You adjusted existing unit tests, but a case with more than one item in the array -- the entire point of this PR -- is not covered at all...
Also, see the conversation I left unresolved regarding SimpleTest.
Just one nit, looks good to squash! @avanthakkar How is the ceph-exporter adjustment coming?
I have the patch ready, should I commit it to this PR itself? @idryomov @alimaredia |
I'd just point Ali at it, he would be making the final fixup/squashing and can pick your commit then. |
`counter dump` now emits an array of <labels,counters> pairs for each individual key. Commit includes revisions to perf counters unit test. Fixes: https://tracker.ceph.com/issues/61587 Signed-off-by: Ali Maredia <amaredia@redhat.com>
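A minimal sketch of the shape this commit message describes, assuming each key maps to a list of objects with `labels` and `counters` members. The sample key, labels, and values are borrowed from the rbd-mirror output quoted later in this conversation; the helper `iter_counters` is hypothetical and is not part of Ceph.

```python
import json

# Hypothetical dump: one key holding two <labels, counters> pairs.
SAMPLE_DUMP = json.loads("""
{
  "rbd_mirror_snapshot_image": [
    {
      "labels": {"image": "image1", "namespace": "", "pool": "data"},
      "counters": {"last_sync_time": 0.961036955}
    },
    {
      "labels": {"image": "image2", "namespace": "", "pool": "data"},
      "counters": {"last_sync_time": 1.398823385}
    }
  ]
}
""")


def iter_counters(dump):
    """Yield (key, labels, counter_name, value) for every counter in the dump."""
    for key, pairs in dump.items():
        for pair in pairs:
            for name, value in pair["counters"].items():
                yield key, pair["labels"], name, value


rows = list(iter_counters(SAMPLE_DUMP))
for key, labels, name, value in rows:
    print(key, labels["image"], name, value)
```

The sample deliberately holds two items under one key, since a consumer that only reads the first array element would silently drop `image2`.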
ca894cf to 7c2c6c2
@avanthakkar for any modifications to your exporter commit you can just push them to this PR. |
7c2c6c2 to d7bdc3c
Commit "exporter: adopt counter dump command output" has a stray "cherry picked from" annotation -- this is the original submission to main, not a backport.
d7bdc3c to 058647f
Prometheus endpoint output seems wrong, plus the stray "cherry picked from" annotation is still there.
Yeah seems like when refactoring it missed the extra |
/metrics endpoint: |
058647f to f760f40
The change itself looks good but I don't understand why
"remote_timestamp": 1686811917.260234241,
"local_timestamp": 1686811917.260234241,
"last_sync_time": 0.961036955,
and
"remote_timestamp": 1686811921.763860194,
"local_timestamp": 1686811921.763860194,
"last_sync_time": 1.398823385,
became
ceph_rbd_mirror_snapshot_image_local_timestamp{ceph_daemon="client.admin.322498",image="image1",namespace="",pool="data"} 1.686812
ceph_rbd_mirror_snapshot_image_remote_timestamp{ceph_daemon="client.admin.322498",image="image1",namespace="",pool="data"} 1.686812
ceph_rbd_mirror_snapshot_image_last_sync_time{ceph_daemon="client.admin.322498",image="image1",namespace="",pool="data"} 0.000000
and
ceph_rbd_mirror_snapshot_image_local_timestamp{ceph_daemon="client.admin.322498",image="image2",namespace="",pool="data"} 1.686812
ceph_rbd_mirror_snapshot_image_remote_timestamp{ceph_daemon="client.admin.322498",image="image2",namespace="",pool="data"} 1.686812
ceph_rbd_mirror_snapshot_image_last_sync_time{ceph_daemon="client.admin.322498",image="image2",namespace="",pool="data"} 0.000000
respectively. Note the last_sync_time discrepancy for image2 in particular.
@avanthakkar This is the third time in a row that you are presenting something that is obviously broken as ready to merge. Please review the code and carefully compare the outputs yourself before pushing/posting.
I suspect that this is the problem. Why is the value getting divided by 10^9 here?
Well it's converting |
jenkins test api |
The manager is getting perf counters over the wire, so there is binary encoding involved. Here, ceph/src/common/perf_counters.cc Lines 480 to 484 in 73567f9
If you are assuming a nanosecond unit for "time" values, then the output is telling you that the rbd-mirror daemon managed to transfer 50M of data in 0.961 nanoseconds for image1 and 1.399 nanoseconds for image2. That would be very nice, but such networks haven't been invented yet :-)
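To illustrate why the division produces the `/metrics` values quoted above, here is a small sketch. It assumes the dumped time values are already in seconds and that the exporter prints them with six decimal places; the formatting is inferred from the quoted output, not from the exporter source, and `render` is a hypothetical helper.

```python
# Values as they appear in the quoted `counter dump` output (already in seconds).
dumped = {
    "remote_timestamp": 1686811917.260234241,
    "last_sync_time": 0.961036955,
}


def render(value, divide_by_1e9):
    """Format a value with six decimals, optionally applying the extra division."""
    if divide_by_1e9:
        value = value / 1e9
    return f"{value:.6f}"


for name, value in dumped.items():
    # Second column: with the extra division; third column: value as dumped.
    print(name, render(value, True), render(value, False))
```

With the division, the timestamp collapses to `1.686812` and any sync time of about a second collapses to `0.000000`, which matches the discrepancy pointed out for both images.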
Yeah, makes sense. I'll remove the transformation. Thanks for pointing that out.
>> counter dump >>> /metrics endpoint |
a3ef4ca to 8013655
I've removed the condition now and pasted the output of the exported metrics.
The format change for counter dump/schema LGTM!
Adopt the counter dump format changes in the exporter for extracting the counters. Removed the condition for `PERFCOUNTER_TIME`, as counter dump already does the transformation internally.
Signed-off-by: Avan Thakkar <athakkar@redhat.com>
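A rough sketch of what consuming the new format could look like on the exporter side, written in Python for brevity; it is not the exporter's actual code. The function name `to_metric_lines`, the `ceph_` prefix handling, the label ordering, and the six-decimal formatting are assumptions modelled on the `/metrics` lines quoted earlier in this conversation. Values are passed through with no time conversion.

```python
def to_metric_lines(dump, daemon):
    """Render a dict of key -> [<labels, counters>, ...] as Prometheus text lines."""
    lines = []
    for key, pairs in dump.items():
        for pair in pairs:
            # Merge the daemon name with the per-item labels, sorted by label name.
            labels = {"ceph_daemon": daemon, **pair["labels"]}
            label_text = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            for name, value in pair["counters"].items():
                # No division for time counters: the value is used as dumped.
                lines.append(f"ceph_{key}_{name}{{{label_text}}} {value:.6f}")
    return lines


example = {
    "rbd_mirror_snapshot_image": [
        {
            "labels": {"image": "image2", "namespace": "", "pool": "data"},
            "counters": {"last_sync_time": 1.398823385},
        }
    ]
}
metric_lines = to_metric_lines(example, "client.admin.322498")
print("\n".join(metric_lines))
```

The sketch does not escape special characters in label values, which a real exporter would need to handle.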
8013655 to 3e8ef70
counter dump command revision
>> counter dump >> /metrics endpoint |
jenkins test windows |
@idryomov Are you running qa job? |
No, I just set |
This is for @ceph/core to decide. |
From @kamoltat
ref: https://trello.com/c/7y6uj4bo
RADOS approved, all failures/dead jobs are unrelated and known failures.
Failures:
7332480 -> Bug #58946: cephadm: KeyError: 'osdspec_affinity' - Dashboard - Ceph -> cephadm: KeyError: 'osdspec_affinity' - Ceph - Mgr - Dashboard
Deads:
7332357 -> Bug #61164: Error reimaging machines: reached maximum tries (100) after waiting for 600 seconds - Infrastructure - Ceph -> Error reimaging machines: reached maximum tries (100) after waiting for 600 seconds
`counter dump` now emits an array of <labels,counters> pairs for each individual key.
Fixes: https://tracker.ceph.com/issues/61587
Example of format change:
Becomes:
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Available Jenkins commands:
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows