-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
osd/PG: introduce all_missing_unfound helper #27205
Conversation
We use pg_log.missing to track each peer's missing objects separately, whereas missing_loc records the location of all (probably existing) good copies for both primary and replicas' missing objects. Hence an item from pg_log.missing or missing_loc is of different meaning and is not comparable. During recovery, we can skip recovering primary only if - primary is good, e.g., has no missing at all - or all of the primary's missing objects do exist in missing_loc and are currently unfound Obviously, the current "all missing objects are unfound" checker is broken. Fix by introducing an independent all_missing_unfound helper to make the count of missing objects that are currently unfound correct. Fixes: http://tracker.ceph.com/issues/38784 Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
@neha-ojha I end up fixing http://tracker.ceph.com/issues/38784 in a slightly different way. If this fixer is good, would you like me to add your Signed-off-by line too (since it mainly bases on your analysis :-) |
if (num_missing == num_unfound) { | ||
// All of the missing objects we have are unfound. | ||
if (!missing.have_missing() || // Primary does not have missing | ||
all_missing_unfound()) { // or all of the missing objects are unfound. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xiexingguo Considering the scenario described in http://tracker.ceph.com/issues/38784, this will help to recover the missing object on the primary, before calling recover_replicas(), but wouldn't all_missing_unfound()
(when we have 1 unfound object that is not accounted for in the missing) return false, when checking d949713#diff-d43117703c33bebe41fc39224c8025adR1614?
Thanks for putting up a PR to fix this, it has been on my to-do list :) Don't worry about adding my signature. |
at first, primary has two missing objects (both are delete)
primary starts to recover 3:caaf3368:::benchmark_data_smithi131_76364_object3032:head first, because it is old
recovery of 3:caaf3368:::benchmark_data_smithi131_76364_object3032:head is done
primary should continue to recover 3:c8460e07:::benchmark_data_smithi131_76364_object3062:head in its own missing list, e.g, because it is also a delete, which is obviously recoverable. But it then switches to recover replicas... if (num_missing == num_unfound) {
// All of the missing objects we have are unfound.
// Recover the replicas.
started = recover_replicas(max, handle, &recovery_started);
} Since we still have one missing object in primary's missing list, then the only possibility is that we now have a unfound object in primary's missing_loc. The log continues, until we finally hit the following assert: ceph_assert(attrs || !pg_log.get_missing().is_missing(soid) || (it_objects != pg_log.get_log().objects.end() && it_objects->second->op == pg_log_entry_t::LOST_REVERT)) which declares that the object in recovering must not be missing on primary too or else it must be a lost revert object.
Continue to check the log above, and finally we can see:
@neha-ojha I think this patch should make the above bug go away, because we never count a recovery_delete object as unfound. Show me the code, again 👻 bool is_unfound(const hobject_t &hoid) const {
auto it = needs_recovery_map.find(hoid);
if (it == needs_recovery_map.end()) {
return false;
}
if (it->second.is_delete()) {
return false;
}
auto mit = missing_loc.find(hoid);
return mit == missing_loc.end() || !(*is_recoverable)(mit->second);
} So if there is a recovery_delete object in primary's missing list, the new all_missing_unfound helper should instead return false and hence make primary continue to recovery itself first. Right? |
@xiexingguo your analysis looks right and aligns with mine, I guess what I was suggesting is the following
might want to rename all_missing_unfound() to something else in that case. Your version looks fine too! |
In purge_strays(), we'll aggressively clear stray_set and add all related peers into peer_purged. However, if the corrsponding peer is down and becomes up again, (unconditionally) adding it to peer_purged will prevent primary from re-purging it. (See Active::react(const MNotifyRec& notevt)) On consuming a new osdmap, let's move any down peers out from peer_purged simutaneously. This way we can lower the risk of leaving any leftover PGs behind. Related-to: http://tracker.ceph.com/issues/38931 Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
We use pg_log.missing to track each peer's missing objects separately,
whereas missing_loc records the location of all (probably existing) good copies
for both primary and replicas' missing objects. Hence an item from
pg_log.missing or missing_loc is of different meaning and is not comparable.
During recovery, we can skip recovering primary only if
currently unfound
Obviously, the current "all missing objects are unfound" checker is broken.
Fix by introducing an independent all_missing_unfound helper to make the
count of missing objects that are currently unfound correct.
Fixes: http://tracker.ceph.com/issues/38784
Signed-off-by: xie xingguo xie.xingguo@zte.com.cn