rgw: fix index suggest #22798
Since nothing is done to an index entry that does not exist, just continue processing the other index entries so that the test case passes.

Fixes: http://tracker.ceph.com/issues/24640
Signed-off-by: Tianshan Qu <tianshan@xsky.com>
A special sequence can cause this new situation.

IO sequence:
1. put index prepare
2. list, get stale index
3. check_disk_state, find the head obj not exist
4. write head obj
5. index complete
6. aio_operate dir_suggest_changes CEPH_RGW_REMOVE

Step 6 will delete the index.

Fixes: http://tracker.ceph.com/issues/24744
Signed-off-by: Tianshan Qu <tianshan@xsky.com>
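The six-step sequence in this commit message can be replayed in a toy model. The sketch below is not Ceph code: `ToyIndex`, `Entry`, the `ver` stamp, and `run` are all hypothetical names made up for illustration, and the version-stamp guard is only one possible way to reject a stale suggestion, not necessarily the fix proposed in this PR. It only assumes that a suggested change is applied when the entry has no pending ops.

```python
from dataclasses import dataclass, field

REMOVE = "CEPH_RGW_REMOVE"  # op name taken from the commit message


@dataclass
class Entry:
    exists: bool = False                       # entry is visible to listings
    pending: set = field(default_factory=set)  # tags of in-flight ops
    ver: int = 0                               # hypothetical stamp, bumped on every change


class ToyIndex:
    """Hypothetical stand-in for a bucket index plus head objects."""

    def __init__(self):
        self.entries = {}
        self.heads = set()  # stands in for head objects on disk

    def prepare(self, key, tag):
        e = self.entries.setdefault(key, Entry())
        e.pending.add(tag)
        e.ver += 1

    def complete(self, key, tag):
        e = self.entries[key]
        e.pending.discard(tag)
        e.exists = True
        e.ver += 1

    def list_entry(self, key):
        # Return the version the listing observed (its snapshot of the entry).
        return self.entries[key].ver

    def suggest(self, key, op, seen_ver=None):
        e = self.entries.get(key)
        if e is None or e.pending:
            return False  # nothing to do, or an op is still in flight
        if seen_ver is not None and e.ver != seen_ver:
            return False  # hypothetical guard: entry changed since the list
        if op == REMOVE:
            del self.entries[key]
            return True
        return False


def run(guarded):
    """Replay steps 1-6; return True if the index entry survives."""
    idx = ToyIndex()
    idx.prepare("obj", "t1")                # 1. put index prepare
    seen = idx.list_entry("obj")            # 2. list, get stale index
    head_missing = "obj" not in idx.heads   # 3. check_disk_state
    idx.heads.add("obj")                    # 4. write head obj
    idx.complete("obj", "t1")               # 5. index complete
    if head_missing:                        # 6. suggest CEPH_RGW_REMOVE
        idx.suggest("obj", REMOVE, seen if guarded else None)
    return "obj" in idx.entries
```

Calling `run(False)` replays the race without the guard and the entry is deleted even though the put succeeded; `run(True)` rejects the stale suggestion because the entry changed after the list observed it.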
1. recover index from put crash after complete
2. list raced with put, index_suggest should not delete index
3. list raced with delete, index_suggest should not recover index

Signed-off-by: Tianshan Qu <tianshan@xsky.com>
@ivancich please help review
@tianshan Will do. Thank you.
The fix for issue 2 seems to miss a situation: a delete races with some other op, but the other op is canceled, so the stale index can never be deleted.
@tianshan Do I understand correctly that you'll be updating this PR with a fix that: a) handles the race condition, and b) is clearer? If that's true, I should wait for the update, right?
I verified that this version of the PR does not generate the unit test failure.
I ran the race condition inducing test (https://github.com/ivancich/rgw-race-inducer) for about 20 hours and the race did not take place. @tianshan I remain unclear as to whether you plan on following up with an update to the PR that addresses the issues you raised above.
@ivancich The first commit fixes the unit test issue. The second race condition is a new issue I found while writing some new unit test cases, so I proposed a fix. But after talking with my colleague, we think the fix may introduce a new issue, and we discussed a new idea for a fix.
This PR has been replaced by #22937. Closing.
This comes from a commit submitted by Tianshan Qu <tianshan@xsky.com>, which he later pulled. Nonetheless, due to extensive testing by Mark Kogan <mkogan@redhat.com>, this commit seems necessary to address customer's issue. See PR 22798 (ceph#22798).

The message for the original commit is:

====
rgw: fix list op raced with put op maybe cause index delete

a special sequence can cause this new situation.

IO sequence:
1. put index prepare
2. list, get stale index
3. check_disk_state, find the head obj not exist
4. write head obj
5. index complete
6. aio_operate dir_suggest_changes CEPH_RGW_REMOVE

step 6 will delete the index

Fixes: http://tracker.ceph.com/issues/24744
====

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
This comes from a commit submitted by Tianshan Qu <tianshan@xsky.com>, which he later pulled. Nonetheless, due to extensive testing by Mark Kogan <mkogan@redhat.com>, this commit seems necessary to address customer's issue. See PR 22798 (ceph#22798).

The message for the original commit is:

====
a special sequence can cause this new situation.

IO sequence:
1. put index prepare
2. list, get stale index
3. check_disk_state, find the head obj not exist
4. write head obj
5. index complete
6. aio_operate dir_suggest_changes CEPH_RGW_REMOVE

step 6 will delete the index

Fixes: http://tracker.ceph.com/issues/24744
====

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
This comes from a commit submitted by Tianshan Qu <tianshan@xsky.com>, which he later pulled. Nonetheless, due to extensive testing by Mark Kogan <mkogan@redhat.com>, this commit seems necessary to address customer's issue. See PR 22798 (ceph#22798).

The message for the original commit is:

====
a special sequence can cause this new situation.

IO sequence:
1. put index prepare
2. list, get stale index
3. check_disk_state, find the head obj not exist
4. write head obj
5. index complete
6. aio_operate dir_suggest_changes CEPH_RGW_REMOVE

step 6 will delete the index

Fixes: http://tracker.ceph.com/issues/24744
====

Signed-off-by: J. Eric Ivancich <ivancich@redhat.com>
resolves: rhbz#1584220
1. Fix the test failure: http://tracker.ceph.com/issues/24640
2. Fix the race issue: http://tracker.ceph.com/issues/24744
3. Add test cases for the race issues