quincy: osd: Handle oncommits and wait for future work items from mClock queue #47490

Merged 1 commit into ceph:quincy on Aug 9, 2022

@sseshasa sseshasa commented Aug 8, 2022

backport tracker: https://tracker.ceph.com/issues/57052


backport of #47216
parent tracker: https://tracker.ceph.com/issues/56530

this backport was staged using ceph-backport.sh version 16.0.0.6848
find the latest version at https://github.com/ceph/ceph/blob/main/src/script/ceph-backport.sh

When the worker thread with the smallest thread index waits for future work
items from the mClock queue, the oncommit callbacks are called. After the
callbacks run, the thread must continue waiting instead of returning to
the ShardedThreadPool::shardedthreadpool_worker() loop. Returning causes
the smallest-index thread of every shard to busy loop, resulting in very
high CPU utilization.

The fix reacquires the shard_lock and waits on sdata_cond until notified
or until the wait period elapses. After this, the thread with the smallest
index repopulates the oncommit queue from the context_queue if there were
any additions.

Fixes: https://tracker.ceph.com/issues/56530
Signed-off-by: Sridhar Seshasayee <sseshasa@redhat.com>
(cherry picked from commit 180a5a7)
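
For readers who have not seen the diff, the wait pattern described in the commit message can be sketched roughly as follows. This is a simplified standalone model, not the actual OSD code: the `Shard` struct, the `wait_for_future_item` function, and its return value are hypothetical, and only the names `shard_lock`, `sdata_cond`, and `context_queue` are taken from the description above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical stand-in for one OSD shard; field names echo the PR text only.
struct Shard {
  std::mutex shard_lock;
  std::condition_variable sdata_cond;
  std::deque<std::function<void()>> context_queue;  // queued oncommit callbacks
};

// Wait until `ready_at` (the time the future work item becomes runnable),
// running oncommit callbacks along the way instead of returning to the caller.
// The caller must hold `lock` on shard.shard_lock; it is held again on return.
// Returns the number of callbacks that were run.
inline int wait_for_future_item(Shard& shard,
                                std::unique_lock<std::mutex>& lock,
                                std::chrono::steady_clock::time_point ready_at) {
  assert(lock.owns_lock());
  int ran = 0;
  std::vector<std::function<void()>> oncommits;
  while (std::chrono::steady_clock::now() < ready_at) {
    // (Re)populate the oncommit queue from context_queue.
    while (!shard.context_queue.empty()) {
      oncommits.push_back(std::move(shard.context_queue.front()));
      shard.context_queue.pop_front();
    }
    if (!oncommits.empty()) {
      lock.unlock();  // run callbacks without holding the shard lock
      for (auto& cb : oncommits) {
        cb();
        ++ran;
      }
      oncommits.clear();
      lock.lock();  // reacquire shard_lock before waiting again
      if (!shard.context_queue.empty()) {
        continue;  // more callbacks arrived while unlocked
      }
    }
    // Keep waiting: wake on notify or when the deadline passes,
    // then loop to pick up any new additions.
    shard.sdata_cond.wait_until(lock, ready_at);
  }
  return ran;
}
```

The point of the sketch is that the loop only exits once the deadline has passed. Running callbacks does not end the wait, which is what avoids the busy loop described in the commit message. Callbacks queued just before the deadline stay in `context_queue` for the caller to handle.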
@sseshasa sseshasa requested a review from a team as a code owner August 8, 2022 04:50
@sseshasa sseshasa added this to the quincy milestone Aug 8, 2022
@neha-ojha
Member

jenkins test make check

@sseshasa
Contributor Author

sseshasa commented Aug 9, 2022

Teuthology Test Result
https://pulpito.ceph.com/yuriw-2022-08-08_22:19:32-rados-wip-yuri4-testing-2022-08-08-1009-quincy-distro-default-smithi/

Summary
309/318 Jobs passed. 8 jobs failed. 1 job was not scheduled.

Unrelated Failures

  1. https://tracker.ceph.com/issues/52420 - 'wait for operator' reached maximum tries (90) after waiting for 900 seconds
  2. https://tracker.ceph.com/issues/45721 - FAIL: test_rados.TestWatchNotify.test
  3. https://tracker.ceph.com/issues/56951 - Error EINVAL: Failed to update cephclusters/rook-ceph: (403)
  4. https://tracker.ceph.com/issues/47589 - reached maximum tries (800) after waiting for 4800 seconds
  5. https://tracker.ceph.com/issues/54307 - Test cls_rgw.index_list_delimited timesout in qa/workunits/cls/test_cls_rgw.sh
  6. https://tracker.ceph.com/issues/52124 - Valgrind failures in osd.

@neha-ojha neha-ojha merged commit 07a9635 into ceph:quincy Aug 9, 2022
11 checks passed