Commit
pybind/mgr/balancer: fix pool-deletion vs auto-optimization race
This patch fixes the error below:

```
  File "/usr/lib/ceph/mgr/balancer/module.py", line 722, in optimize
    return self.do_crush_compat(plan)
  File "/usr/lib/ceph/mgr/balancer/module.py", line 781, in do_crush_compat
    pe = self.calc_eval(ms, plan.pools)
  File "/usr/lib/ceph/mgr/balancer/module.py", line 570, in calc_eval
    objects_by_osd[osd] += ms.pg_stat[pgid]['num_objects']
KeyError: ('5.1b',)
```

The root cause is that the balancer collects cluster information from two separate maps (OSDMap and PGMap), so there is a small window in which the pool statistics can diverge. E.g.:

1) auto-optimization begins
2) get osdmap
3) a pool is gone (deleted by admin); pg_dump refreshed
4) get pg_dump (balancer now holds both the newest pg_dump and an obsolete osdmap)
5) execute optimization; balancer complains that some PGs are missing from the pg_dump map

Fix the above problem by tracing pools existing in both maps only.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
(cherry picked from commit a57b803)
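The idea of restricting the evaluation to pools present in both maps can be illustrated with a minimal sketch. This is not the patch itself: the helper names (`pool_of`, `pools_in_both_maps`, `count_objects_by_osd`) and the plain-dict inputs are hypothetical stand-ins for the balancer's internal state, and the sketch assumes PG ids of the form `<pool>.<seed>` such as `5.1b`.

```python
from collections import defaultdict


def pool_of(pgid):
    """Extract the numeric pool id from a PG id such as '5.1b' (hypothetical helper)."""
    return int(pgid.split(".", 1)[0])


def pools_in_both_maps(osdmap_pool_ids, pg_stat):
    """Return the pool ids known to the OSDMap that also have PG stats.

    osdmap_pool_ids: pool ids taken from a (possibly stale) OSDMap.
    pg_stat: dict mapping PG id -> stats dict, taken from the PG dump.
    """
    return set(osdmap_pool_ids) & {pool_of(pgid) for pgid in pg_stat}


def count_objects_by_osd(pg_up, pg_stat, pools):
    """Sum num_objects per OSD, skipping PGs whose pool is not in `pools`.

    pg_up: dict mapping PG id -> list of OSD ids, derived from the OSDMap.
    """
    objects_by_osd = defaultdict(int)
    for pgid, osds in pg_up.items():
        if pool_of(pgid) not in pools:
            # Pool was deleted after the OSDMap snapshot; the PG dump has no stats for it.
            continue
        for osd in osds:
            objects_by_osd[osd] += pg_stat[pgid]["num_objects"]
    return dict(objects_by_osd)
```

With a stale OSDMap that still lists pool 5 and a PG dump that no longer does, filtering through `pools_in_both_maps` first means the PG `5.1b` is skipped rather than looked up in `pg_stat`, which is where the `KeyError` in the traceback came from.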