storage: scatter right after split leads to poor balance #35907
Run this script:
See this kind of graph:
The expectation is that the SCATTER leaves the leaseholders roughly balanced. The graph shows a >10x difference.
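To put a number on "roughly balanced", one option is the ratio between the largest and smallest per-node lease counts. The following is a minimal sketch, not code from CockroachDB: the function name `lease_imbalance` and the sample counts are hypothetical, and the counts are not read off the graph in this issue.

```python
import math


def lease_imbalance(lease_counts):
    """Return the max/min ratio of per-node lease counts.

    1.0 means perfectly balanced; larger values mean worse balance.
    lease_counts maps a node ID to the number of leases it holds.
    """
    if not lease_counts:
        raise ValueError("need at least one node")
    hi = max(lease_counts.values())
    lo = min(lease_counts.values())
    if lo == 0:
        # A node with no leases at all: unbounded imbalance,
        # unless every node is empty.
        return math.inf if hi > 0 else 1.0
    return hi / lo


# Hypothetical per-node lease counts, for illustration only.
counts = {1: 4, 2: 48, 3: 50, 4: 46}
print(f"imbalance: {lease_imbalance(counts):.1f}x")  # imbalance: 12.5x
```

By this measure, a ratio close to 1 would match the expectation, while anything above 10 corresponds to the kind of skew described here.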
We think that much of the variability in the duration of bulk I/O operations (RESTORE/IMPORT) is due to this phenomenon.
When I last looked at this, I thought it was caused by the allocator receiving updated replica counts only at some interval. But I just inserted a 20s sleep before the scatter, and the result is just as bad. Ditto with a 70s sleep:
SCATTER seems to just not be doing the right thing. It appears to drain the local node, giving equal shares of its leases to the followers.
cc @danhhz this is much worse than I thought