This PR is based on #7325
The original PR #7325 has some problems; this PR fixes them, based on v0.94.6.
It also fixes an upgrade compatibility problem: a new config option, osd_recovery_partial (default false), controls partial recovery. While the cluster is being upgraded, leave it false so that recovery pushes whole objects. Once every OSD has been upgraded, change osd_recovery_partial to true to enable partial recovery.
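The gating described above can be sketched roughly as follows. This is an illustrative Python model, not the Ceph C++ code: the function `extents_to_push` and the `dirty_extents` argument are hypothetical names, and only the option name osd_recovery_partial and its default of false come from this PR.

```python
def extents_to_push(object_size, dirty_extents, osd_recovery_partial=False):
    """Return the (offset, length) ranges a primary would push to a replica.

    Hypothetical model: with the option off (the default), the whole
    object is pushed, which is safe for replicas running older code.
    With the option on, only the modified ranges are pushed.
    """
    if not osd_recovery_partial:
        # Upgrade-safe behaviour: always recover the whole object.
        return [(0, object_size)]
    # Partial recovery: push only the ranges known to have changed.
    return sorted(dirty_extents)


if __name__ == "__main__":
    dirty = [(512, 128)]
    print(extents_to_push(4096, dirty))                             # during upgrade
    print(extents_to_push(4096, dirty, osd_recovery_partial=True))  # after upgrade
```

Defaulting to whole-object recovery means a mixed-version cluster behaves as it did before the upgrade until the operator opts in.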
Here is the situation this solves.
Consider upgrading a cluster to this can_recover_partial version:
e.g. pg 3.67 on OSDs [0, 1, 2]
1) First, we upgrade osd.0 (service ceph restart osd.0); it recovers normally and everything works.
2) A write request (req1, which writes to obj1) is sent to the primary (osd.0), and the pg log records it.
3) Then we upgrade osd.1. Sending req1 to osd.1 fails, but it is still sent to osd.2. While osd.2 is handling the request (just inside do_request), pg 3.67 starts peering, so osd.2 calls can_discard_request, which decides that req1 should be dropped.
4) So req1 is written successfully only on osd.0; because min_size=2, osd.0 re-enqueues req1.
5) During peering, the primary finds that req1's object obj1 is missing on osd.1 and osd.2, so it recovers the object.
6) Because osd.0 and osd.1 are already upgraded, osd.0 calculates the partial data in prep_push_to_replica, and osd.1 handles the partial data correctly.
7) But osd.2 has not been upgraded. Its code (submit_push_data) removes the original object first and then writes the partial data from osd.0, so the original data of the object is lost.
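Steps 6) and 7) can be illustrated with a small simulation. This is a simplified Python model under stated assumptions, not the Ceph implementation: objects are plain byte strings, a push is a list of (offset, data) extents, and `apply_push_legacy` and `apply_push_partial` are hypothetical stand-ins for what an old and an upgraded replica do when they receive a push.

```python
def _overlay(base, extents):
    """Write each (offset, data) extent into base, zero-filling any gap."""
    obj = bytearray(base)
    for offset, data in extents:
        end = offset + len(data)
        if len(obj) < end:
            obj.extend(b"\0" * (end - len(obj)))
        obj[offset:end] = data
    return bytes(obj)


def apply_push_legacy(stored, extents):
    """Old replica (model): remove the original object, then write the push."""
    return _overlay(b"", extents)


def apply_push_partial(stored, extents):
    """Upgraded replica (model): keep the original object, overwrite the extents."""
    return _overlay(stored, extents)


if __name__ == "__main__":
    stale = b"AAAAxxxxCCCC"            # obj1 on the replica, before req1
    primary = b"AAAABBBBCCCC"          # obj1 on osd.0, after req1
    partial_push = [(4, b"BBBB")]      # only the range req1 modified
    whole_push = [(0, primary)]        # whole-object recovery

    # Upgraded replica (osd.1) ends up identical to the primary.
    print(apply_push_partial(stale, partial_push) == primary)
    # Old replica (osd.2) loses everything outside the pushed range.
    print(apply_push_legacy(stale, partial_push) == primary)
    # Pushing the whole object is safe for the old replica.
    print(apply_push_legacy(stale, whole_push) == primary)
```

In this model the old replica only ends up correct when it receives the whole object, which is why partial recovery is kept off until every OSD has been upgraded.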