osd/PG: store osd_async_recovery_min_cost in pg_info_t #43534
A PG can peer forever when async recovery is triggered and a single OSD's osd_async_recovery_min_cost value differs from that of its peer OSDs.

For example, take up [9, 8, 27], acting [9, 8, 27], where osd.9 has osd_async_recovery_min_cost = 20 but osd.8 and osd.27 have osd_async_recovery_min_cost = 100. If osd.9 goes down and restarts after some time, osd.9 chooses acting [8, 27] because it is selected for async recovery. osd.8 then becomes primary, but osd.9 is not erased from the want set on osd.8, so osd.8 reports pg_temp [9, 8, 27] to the mon. The PG then peers forever.

Fix this by selecting the minimum osd_async_recovery_min_cost across all peer shards when choosing async recovery.

Fixes: https://tracker.ceph.com/issues/52925
Signed-off-by: Yite Gu <ess_gyt@qq.com>
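The flip-flop described above can be illustrated with a toy model. This is only a sketch and not Ceph code: the function names `choose_want` and `simulate`, the recovery cost of 50 for osd.9, and `min_size = 2` are all assumptions made for illustration. It assumes each primary drops a shard from its want set when that shard's estimated recovery cost exceeds the primary's own threshold, and that the first OSD in the acting set is the primary.

```python
def choose_want(up, threshold, recovery_cost, min_size):
    """Acting set a primary would want, given the threshold it applies."""
    want = list(up)
    # Shards whose estimated recovery cost exceeds the threshold are
    # candidates for async recovery, most expensive first.
    candidates = sorted(
        (osd for osd in up if recovery_cost.get(osd, 0) > threshold),
        key=lambda osd: recovery_cost[osd],
        reverse=True,
    )
    for osd in candidates:
        if len(want) <= min_size:
            break  # never shrink the want set below min_size
        want.remove(osd)
    return want


def simulate(up, thresholds, recovery_cost, min_size=2, max_rounds=6,
             use_min=False):
    """Re-peer until the primary's want set matches the acting set.

    Returns (history of acting sets, whether peering converged).
    """
    acting = list(up)
    history = [tuple(acting)]
    for _ in range(max_rounds):
        primary = acting[0]  # toy assumption: first acting OSD is primary
        if use_min:
            # Proposed behaviour: every primary uses the smallest threshold
            # configured on any peer, so all primaries agree.
            threshold = min(thresholds[osd] for osd in up)
        else:
            # Current behaviour in this model: each primary uses its own value.
            threshold = thresholds[primary]
        want = choose_want(up, threshold, recovery_cost, min_size)
        if want == acting:
            return history, True
        acting = want
        history.append(tuple(acting))
    return history, False


# Hypothetical numbers matching the example: osd.9 is 50 "cost units" behind.
up = [9, 8, 27]
thresholds = {9: 20, 8: 100, 27: 100}
cost = {9: 50}

print(simulate(up, thresholds, cost))                # alternates, never settles
print(simulate(up, thresholds, cost, use_min=True))  # settles on (8, 27)
```

In this model the acting set alternates between [9, 8, 27] and [8, 27] when each primary applies its own threshold, and it settles once every primary applies the same minimum.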
@neha-ojha Hi, could you take a look at my patch when you have time? :)
I want to add a new value to the PG class:
From the description in the commit message and in the tracker, it sounds like we are trying to make this change to address a very specific use case, where the min cost on one OSD is intentionally set to a different value. The result observed in the tracker is a known behavior. Ideally we do not recommend changing the default value of osd_async_recovery_min_cost (it is an advanced config option), and if one needs to change it, they should apply the change to all OSDs.
Are there any other advantages of this PR?
osd_async_recovery_min_cost controls how much asynchronous recovery is done. A higher value means less asynchronous recovery, whereas a lower value means more. In some cases, users want to adjust this value at run time to further reduce the impact of normal recovery. Applying any change to all OSDs is a good idea. With this PR I only want to prevent accidents.
I really did set it to a different value intentionally :( and I didn't expect it to have such a heavy impact. I think that even if the values differ, there should be a way for things to work normally. What do you think, professor ojha?
After a quick investigation: ceph tell osd.* config set osd_async_recovery_min_cost can apply the value to all OSDs, but if one OSD restarts, its value again differs from the other OSDs. As you say, changing osd_async_recovery_min_cost is not recommended, so I'm wondering: could we remove the osd_async_recovery_min_cost option and hard-code a fixed value instead? @neha-ojha
Can you please write a test case to demonstrate the problems this could cause?
Not sure I understand what you mean by this
up [30, 11, 16] acting [30, 11, 16]
Conclusion: if osd_force_auth_primary_missing_objects differs between peers, the PG also peers forever. So what I mean is
change to
so that even if a user makes osd_async_recovery_min_cost inconsistent, it has no impact. @neha-ojha
@neha-ojha Hmm, after thinking about it carefully: config options such as osd_async_recovery_min_cost and osd_force_auth_primary_missing_objects should not be modified carelessly. Administrators must know what they are doing. Please close this PR.
Exactly, thanks for your patience!