osd: stray pg should/could not send notify anymore #25589
Conversation
retest this please
I'm confused by step 4. If the interval did not change, why does the stray send the notify? Also, with step 5, is this because the notify and the full log query pass each other on the wire?
We should open a tracker ticket to track this issue.
Step 4 is the case where the stray pg handles an advanced osdmap and then handle_activate_map() in OSD::advance_pg(); the default action when a stray pg reacts to ActMap is to send a notify (see PG::RecoveryState::Stray::react(const ActMap&)). Currently the stray pg only stops sending once it goes into the active state. Step 5 happens because the primary runs proc_replica_log() first, and only then handles the notify in PG::RecoveryState::Primary::react(const MNotifyRec& notevt).
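For context, here is a toy model of that path (illustration only, not the Ceph source; the real reaction is the PG::RecoveryState::Stray::react(const ActMap&) mentioned above, reached from OSD::advance_pg() via handle_activate_map()):

```cpp
// Toy model (illustration only, not the Ceph source): every osdmap
// advance ends in handle_activate_map(), which delivers ActMap to the PG
// state machine, and a PG in Stray answers ActMap by re-sending its
// notify -- even when the peering interval is unchanged.
#include <cstdio>

struct StrayPG {
  unsigned last_update = 1005;  // self-reported view; the primary's rebuilt
                                // peer info may already be rewound past this
  bool active = false;          // today, only activation stops the notify

  void react_act_map() {        // models Stray::react(const ActMap&)
    if (!active)
      printf("stray -> primary: notify(last_update=%u)\n", last_update);
  }
};

int main() {
  StrayPG stray;
  stray.react_act_map();  // step 4: same-interval advance, old info re-sent
  return 0;
}
```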
@liewegas I opened a new tracker ticket, and then found that we already have the same failed assert tracked there; see http://tracker.ceph.com/issues/15373.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
@athanatos mind taking a look?
retest this please
This PR is the suspected cause of http://tracker.ceph.com/issues/39441, so putting it on hold for the moment.
The root cause for http://tracker.ceph.com/issues/39441 was not this PR, and the bug has now been fixed. Let's run this PR through another round of testing.
@Aran85 needs rebase
let us say:

1. the primary pg queries the full log from a stray pg
2. the primary rewinds the stray pg's log in proc_replica_log() and generates a new peer info
3. an osdmap update arrives that does not change the peering interval
4. the stray pg reacts to ActMap and sends its old info
5. the new peer info is replaced with the older info in Primary::react(const MNotifyRec& notevt)
6. when activating, the primary pg uses the old peer info to search for missing objects

Fixes: http://tracker.ceph.com/issues/37679
Signed-off-by: Zengran Zhang <zhangzengran@sangfor.com.cn>
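A hypothetical sketch of the direction the title points in (the flag and the exact condition here are assumptions, not the actual patch): within an unchanged peering interval, re-notifying can only replace the primary's rebuilt peer info with the stray's stale self-view, so the stray could simply suppress the repeat notify.

```cpp
// Hypothetical suppression (flag name and condition are assumptions, not
// the actual patch): skip the ActMap notify when the peering interval has
// not changed since the last one, so the peer info the primary rebuilt in
// proc_replica_log() cannot be clobbered by our stale self-view.
#include <cstdio>

struct StrayPG {
  unsigned interval = 42;      // current peering interval (toy value)
  unsigned last_notified = 0;  // interval of our last notify

  void react_act_map() {       // models Stray::react(const ActMap&)
    if (last_notified == interval)
      return;                  // same interval: primary already has our info
    printf("stray -> primary: notify (interval %u)\n", interval);
    last_notified = interval;
  }
};

int main() {
  StrayPG stray;
  stray.react_act_map();  // first advance in the interval: notify sent
  stray.react_act_map();  // later same-interval advance: suppressed
  return 0;
}
```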
@tchaikov rebased, thanks.
rados run: http://pulpito.ceph.com/nojha-2019-05-07_17:20:56-rados-fix-pg-notify-distro-basic-smithi/
@Aran85 I noticed http://pulpito.ceph.com/nojha-2019-05-07_17:20:56-rados-fix-pg-notify-distro-basic-smithi/3937720/, which we have not seen before. Can you take a look to make sure it's not related?
hi @neha-ojha
It seems that some EC shard reads failed but the whole stripe decoded successfully, so I think it is related to the read failure. Can we get the osd log?
Should be in the log directory in http://qa-proxy.ceph.com/teuthology/nojha-2019-05-07_17:20:56-rados-fix-pg-notify-distro-basic-smithi/3937720/remote/
@neha-ojha @Aran85 There is a pg [7,4,3]/[7,4,2147483647]p7(0) with async=[3(2)]. The read of 3(0) is getting ENOENT: read_result_t(r=0, errors={3(0)=-2}). Everything looks OK to me except that I'm not sure why 3(0) is missing. There is also a missing shard 7(0), causing a pull that reads 3(0), which is the easiest way to fix the primary. Keep in mind the async recovery target is 3(2), a different shard that happens to also be stored on osd.3. So after 3(0) fails, the primary is able to read 4(1) and 6(2) to reconstruct shard 0 for osd.7. Is this what the fix does, make 6(2), a stray, available? If so, maybe we need to add "enough copies available" to the log-whitelist.
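To make the shard arithmetic concrete, here is a toy reconstruction assuming a k=2, m=1 XOR-parity layout (an assumption; the real pool's EC profile may differ): losing shard 0 is survivable because shards 1 and 2 suffice to rebuild it, which is why finding the stray 6(2) matters.

```cpp
// Toy k=2,m=1 XOR reconstruction (an assumption about the EC profile):
// parity shard 2 = shard 0 XOR shard 1, so any two shards rebuild the third.
#include <array>
#include <cstdio>

int main() {
  std::array<unsigned char, 4> shard0{{'o', 'b', 'j', '1'}};  // lost (ENOENT)
  std::array<unsigned char, 4> shard1{{'d', 'a', 't', 'a'}};
  std::array<unsigned char, 4> shard2;  // parity
  for (int i = 0; i < 4; ++i) shard2[i] = shard0[i] ^ shard1[i];

  // 3(0) returned -2, so rebuild shard 0 from 4(1) and the stray 6(2):
  std::array<unsigned char, 4> rebuilt;
  for (int i = 0; i < 4; ++i) rebuilt[i] = shard1[i] ^ shard2[i];
  printf("rebuilt shard 0: %.4s\n",
         reinterpret_cast<const char*>(rebuilt.data()));  // prints "obj1"
  return 0;
}
```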
Thanks @dzafman for helping to debug. 6(2) being an available stray is not what the fix does; it is because the missing search found the 6(2) shard. I checked the osd log and have no idea why 3(0) is missing. 3(0)'s last update while it was in the acting set is 87'1005; 87'1004 is a modify and 87'1005 is a delete, both on the same object. Then the interval changed and the auth log rewound to 87'1004, so 3(0) treated 87'1005 as a divergent log entry and rolled back to 87'1004. When 7(0) read the shard from 3(0) for recovery, the read error occurred on 3(0). I see it's because get_onode returned -2, but I also see some get_onode errors during the above process. Related log lines for 3(0):
- 87'1005 delete applied
- 87'1005 rolled back to 87'1004
- sub read at 87'1004 error
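A minimal sketch of the divergent-entry rule at work in that timeline, with toy types rather than the real PGLog API:

```cpp
// Toy types (not the real PGLog API) showing the rule 3(0) applied:
// entries newer than the auth log head are divergent and get rolled back.
#include <cstdio>
#include <tuple>
#include <utility>
#include <vector>

struct eversion {
  unsigned epoch, version;
  bool operator>(const eversion& o) const {
    return std::tie(epoch, version) > std::tie(o.epoch, o.version);
  }
};

int main() {
  eversion auth_head{87, 1004};  // auth log rewound to 87'1004
  std::vector<std::pair<eversion, const char*>> local{
      {{87, 1004}, "modify obj"},   // kept
      {{87, 1005}, "delete obj"}};  // divergent: rolled back
  for (const auto& [v, op] : local)
    printf("%u'%u %s -> %s\n", v.epoch, v.version, op,
           v > auth_head ? "divergent, roll back" : "keep");
  return 0;
}
```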
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
unstale
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!