reef: msgr: AsyncMessenger add faulted connections metrics #53033

Merged
jmolmo merged 2 commits into ceph:reef from wip-62025-reef on Oct 10, 2023

pereman2
Contributor

pereman2 commented Aug 17, 2023

pereman2 requested a review from jmolmo August 17, 2023 09:03
pereman2 requested review from a team as code owners August 17, 2023 09:03
pereman2 requested review from nizamial09 and Pegonzal and removed request for a team August 17, 2023 09:03
github-actions bot added this to the reef milestone Aug 17, 2023
@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

pereman2 and others added 2 commits August 17, 2023 11:14
Add msgr_connection_idle_timeouts and msgr_connection_ready_timeouts
labeled perfcounters to keep track of failed connections with prometheus metrics.

Signed-off-by: Pere Diaz Bou <pdiazbou@redhat.com>
Fixes: https://tracker.ceph.com/issues/59076
(cherry picked from commit 587ee42)
Signed-off-by: Pere Diaz Bou <pere-altea@hotmail.com>
(cherry picked from commit 4e89ce7)
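
The commit message above describes labeled counters that track failed connections. As a rough illustration of that idea only, here is a minimal, self-contained sketch. It does not use Ceph's perf counter API: the `LabeledCounters` class, the `worker` label name, and the text rendering are all hypothetical, and only the two metric names come from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a labeled counter registry. This is NOT Ceph's
// perf counter API; it only illustrates keeping one counter per
// (metric name, label value) pair.
class LabeledCounters {
public:
  // Increment the counter identified by (metric name, label value).
  void inc(const std::string& metric, const std::string& label) {
    ++counts_[std::make_pair(metric, label)];
  }

  // Read a counter; a counter that was never incremented reads as 0.
  std::uint64_t get(const std::string& metric, const std::string& label) const {
    auto it = counts_.find(std::make_pair(metric, label));
    return it == counts_.end() ? 0 : it->second;
  }

  // Render all counters as "name{worker="label"} value" lines, loosely
  // modeled on the Prometheus text format. The "worker" label name is an
  // assumption made for this sketch.
  std::string render() const {
    std::string out;
    for (const auto& entry : counts_) {
      out += entry.first.first + "{worker=\"" + entry.first.second + "\"} " +
             std::to_string(entry.second) + "\n";
    }
    return out;
  }

private:
  std::map<std::pair<std::string, std::string>, std::uint64_t> counts_;
};
```

In this sketch, a connection that hits its idle timeout would call `inc("msgr_connection_idle_timeouts", "worker-0")`, and `render()` would then produce one line per labeled counter for a scraper to read.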
@idryomov
Contributor

@pereman2 Please note the edited description. If you are not using the ceph-backport.sh script, the description needs to be produced manually.

@pereman2
Contributor Author

Thanks!

@nizamial09
Member

@pereman2 should this go through a QA run, or can it be merged directly?

@ljflores
Contributor

ljflores commented Sep 8, 2023

@ljflores
Contributor

ljflores commented Sep 8, 2023

The jobs will likely all finish running by Monday of next week.

@pereman2
Contributor Author

@ljflores the test failures look unrelated. Can you confirm?

@ljflores
Contributor

Hi @pereman2, taking a look

@ljflores
Contributor

ljflores commented Sep 26, 2023

@pereman2 please see this new failure, which came up three times in the run. I don't think it's related, but see what you think:
Heartbeat crash in osd - https://tracker.ceph.com/issues/62992

Otherwise, everything else is for sure unrelated. If you deem the above tracker to be unrelated, feel free to merge (on behalf of rados).

Failures, unrelated:

  1. https://tracker.ceph.com/issues/61774
  2. https://tracker.ceph.com/issues/58946
  3. https://tracker.ceph.com/issues/59196

Details:

  1. centos 9 testing reveals rocksdb "Leak_StillReachable" memory leak in mons - Ceph - RADOS
  2. cephadm: KeyError: 'osdspec_affinity' - Ceph - Orchestrator
  3. ceph_test_lazy_omap_stats segfault while waiting for active+clean - Ceph - RADOS

@pereman2
Contributor Author

Thanks @ljflores. I don't think it is related so we might go forward with merging it.

@aaSharma14
Contributor

@jmolmo, can you please merge this PR after a final review? Thanks

jmolmo merged commit 4638ce2 into ceph:reef Oct 10, 2023
12 of 13 checks passed
jmolmo deleted the wip-62025-reef branch October 10, 2023 09:04