MDS: make popular counter decay at proper rate #18776
Merged
Signed-off-by: Jianyu Li <joannyli@tencent.com>
Currently the pop_auth_subtree counter does not decay at the right rate: it uses the default DecayRate (whose rate is zero) instead of mds->mdcache->decayrate, which pop_auth_subtree_nested uses. As a result, the pop_auth_subtree_nested of a top-level dir drops faster than the pop_auth_subtree of a low-level dir. After a while, the mds_load computed from the / directory becomes much smaller than the load of low-level directories, which makes it hard for MDBalancer to find a suitable subtree to export.
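To make the effect of a zero decay rate concrete, here is a minimal sketch of an exponentially decaying value. It is not the actual Ceph DecayRate/DecayCounter code: `SketchDecayRate` and `decayed_value` are hypothetical names, and the model assumes the rate constant is derived from a half-life as k = ln(0.5) / half_life, with a default-constructed rate of zero as described above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a decay rate (not the Ceph class).
// Assumption: k = ln(0.5) / half_life, so k is negative for any real half-life.
// A default-constructed rate has k == 0, which means "never decay".
struct SketchDecayRate {
    double k = 0.0;

    SketchDecayRate() = default;

    explicit SketchDecayRate(double half_life_sec) {
        assert(half_life_sec > 0.0);
        k = std::log(0.5) / half_life_sec;
    }
};

// Value of a counter after `elapsed_sec` seconds without any new hits.
inline double decayed_value(double value, double elapsed_sec,
                            const SketchDecayRate& rate) {
    // With k == 0 this is value * exp(0) == value: nothing is ever shed.
    return value * std::exp(elapsed_sec * rate.k);
}
```

With a default-constructed rate, `decayed_value(1000.0, 60.0, SketchDecayRate())` stays at 1000, while a rate built from a 5-second half-life (a value picked only for illustration) halves the same counter to 500 after 5 seconds. Under this model, a counter on the zero rate keeps its accumulated popularity while a counter on the configured rate fades, which is the mismatch described above.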
For example, start 12 mdtest processes, each creating and deleting 50000 files separately under /mnt/cephfs/test[20-30]. The log snippet below illustrates the problem:
2017-11-02 20:43:26.349331 7fe7cff81700 0 mds.0.bal mds.0 mdsload<[8640.6,3290.68 15221.9]/[108733,24090.1 156913], req 1.72793e+06, hr 0, qlen 1, cpu 16.66> = 1.7715e+06 ~ 15221.9 <--- the pop_auth_subtree_nested of / is [8640.6,3290.68 15221.9]
2017-11-02 20:43:26.349357 7fe7cff81700 0 mds.0.bal mds.1 mdsload<[0,0 0]/[0,0 0], req 881655, hr 0, qlen 5, cpu 16.37> = 884380 ~ 7599.19
2017-11-02 20:43:26.349380 7fe7cff81700 5 mds.0.bal prep_rebalance: my load 15221.9 target 11410.6 total 22821.1
2017-11-02 20:43:26.349388 7fe7cff81700 5 mds.0.bal i am sufficiently overloaded
2017-11-02 20:43:26.349400 7fe7cff81700 5 mds.0.bal - mds.0 exports 3811.38 to mds.1
2017-11-02 20:43:26.349414 7fe7cff81700 5 mds.0.bal want to send 3811.38 to mds.1 -> 3811.38
...
2017-11-02 20:43:26.349646 7fe7cff81700 7 mds.0.bal find_exports in 40428 [dir 0x10002d05443 /test31/#test-dir.0/ [2,head] auth v=687 cv=634/634 ap=0+58+59 state=1610612738|complete f(v0 m2017-11-02 18:00:45.117433 1=0+1) n(v76 rc2017-11-02 20:43:25.847469 35384=0+35384) hs=1+0,ss=0+0 dirty=1 | child=1 replicated=0 dirty=1 authpin=0 0x7fe8004bf500] need 3811.38 (3049.1 - 4573.65)
2017-11-02 20:43:26.389502 7fe7cff81700 7 mds.0.bal find_exports in 40490.8 [dir 0x10002d05442 /test28/#test-dir.0/ [2,head] auth v=674 cv=621/621 ap=0+58+59 state=1610612738|complete f(v0 m2017-11-02 18:00:45.107886 1=0+1) n(v74 rc2017-11-02 20:43:25.846067 35356=0+35356) hs=1+0,ss=0+0 dirty=1 | child=1 replicated=0 dirty=1 authpin=0 0x7fe7f7b33a80] need 3811.38 (3049.1 - 4573.65)
<--- but the pop_auth_subtree counters of the /test* dirs are much greater than that, so this mds cannot choose any subtree to migrate even though it is already overloaded
...
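The numbers in the log can be checked with a small sketch. It is a simplification, not the balancer's code: `ExportPlan`, `plan_export`, and `fits_window` are hypothetical names, the target is assumed to be the total load split evenly across ranks, and the 0.8 and 1.2 window factors are inferred from the logged range (3049.1 - 4573.65) around the needed 3811.38.

```cpp
#include <cassert>

// Hypothetical summary of one rebalance decision (names are illustrative).
struct ExportPlan {
    double target;    // load each rank should carry
    double amount;    // load this rank wants to shed
    double need_min;  // smallest subtree popularity accepted as a match
    double need_max;  // largest subtree popularity accepted as a match
};

// Assumption: the target is an even split of the total load, and the
// acceptance window is amount * [min_factor, max_factor].
inline ExportPlan plan_export(double my_load, double total_load, int num_ranks,
                              double min_factor = 0.8, double max_factor = 1.2) {
    assert(num_ranks > 0);
    ExportPlan p{};
    p.target = total_load / num_ranks;
    p.amount = my_load > p.target ? my_load - p.target : 0.0;
    p.need_min = p.amount * min_factor;
    p.need_max = p.amount * max_factor;
    return p;
}

// True when a subtree's popularity falls inside the acceptance window.
inline bool fits_window(double subtree_pop, const ExportPlan& p) {
    return subtree_pop >= p.need_min && subtree_pop <= p.need_max;
}
```

Calling `plan_export(15221.9, 22821.1, 2)` gives a target of about 11410.6 and an export amount of about 3811.4, matching the log up to rounding of the printed inputs. A subtree reporting a popularity of 40428, as /test31/#test-dir.0/ does, lies far above the upper bound of the window, so `fits_window` returns false for it.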