
Change the threshold of dict expand, shrink and rehash #12948

Merged (6 commits) · Jan 19, 2024

Conversation

lyq2333
Contributor

@lyq2333 lyq2333 commented Jan 15, 2024

Before this change (most recently modified in #12850 (comment)), the trigger threshold for a normal expand was 100% utilization and the trigger threshold for a normal shrink was 10% (HASHTABLE_MIN_FILL).
During fork (DICT_RESIZE_AVOID), when we want to avoid rehashing, the trigger thresholds were multiplied by 5 (dict_force_resize_ratio), meaning 500% for expand and 2% (100/10/5) for shrink.

However, in dictRehash (the incremental rehashing), the rehashing threshold for shrinking during fork (DICT_RESIZE_AVOID) was 20% by mistake.
This meant that if shrinking was triggered while dict_can_resize was DICT_RESIZE_ENABLE (where the threshold is 10%), the rehashing could continue even after dict_can_resize changed to DICT_RESIZE_AVOID.
This would cause unwanted copy-on-write damage.

It would make sense for the thresholds of the rehash trigger and the thresholds of the incremental rehashing to be the same; however, in one we compare the size of the hash table to the number of records, and in the other we compare the size of ht[0] to the size of ht[1], so the formula is not exactly the same.

To make things easier we change all the thresholds to powers of 2: the normal shrinking threshold is changed from 100/10 (i.e. 10%) to 100/8 (i.e. 12.5%), and the threshold during forks is changed from 5 to 4, i.e. from 500% to 400% for expand, and from 2% (100/10/5) to 3.125% (100/8/4).
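The resulting trigger conditions can be sketched as a standalone model (a simplified illustration only, not the actual dict.c code; `should_expand`/`should_shrink` are made-up names):

```c
#include <stdbool.h>

/* Illustrative constants mirroring the values adopted in this PR:
 * fill thresholds expressed as powers of two. */
#define HASHTABLE_MIN_FILL      8  /* shrink below 1/8 = 12.5% fill */
#define DICT_FORCE_RESIZE_RATIO 4  /* extra factor applied during fork */

/* Expand when fill reaches 100%, or 400% while resizing is avoided. */
static bool should_expand(unsigned long used, unsigned long size, bool avoid) {
    if (!avoid) return used >= size;
    return used >= size * DICT_FORCE_RESIZE_RATIO;
}

/* Shrink when fill drops below 12.5%, or 3.125% while resizing is avoided. */
static bool should_shrink(unsigned long used, unsigned long size, bool avoid) {
    if (!avoid) return used * HASHTABLE_MIN_FILL < size;
    return used * HASHTABLE_MIN_FILL * DICT_FORCE_RESIZE_RATIO < size;
}
```

For a 1024-bucket table this triggers a shrink below 128 entries normally, and below 32 entries during fork.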

@oranagra
Member

that's actually a bug in #12276 (downsize rehashing at 20% instead of 2%), right?
but i suppose it's not that severe, considering it'll only happen if a resize is triggered (which will happen at 2% or 10% depending on the policy). @sundb @enjoy-binbin PTAL.
so maybe we don't need to rush a backport of such a change (which is complicated considering all the recent changes to dict.c).

@soloestoy @lyq2333 isn't it better to have dictRehash use the same conditions as in _dictExpandIfNeeded and _dictShrinkIfNeeded? i.e. if we decided to expand or shrink in a specific scenario, we should also do the rehashing (and eventually release the extra LUT)

@enjoy-binbin
Collaborator

that's actually a bug in #12276, (downsize rehashing at 20% instead of 2%), right?
but i suppose it's not that severe, considering it'll only happen if a resize is triggered (which will happen at 2% or 10% depending on the policy)

yes, that is right.

i also feel the same conditions should be used, to avoid introducing other ones.

@soloestoy
Collaborator

isn't it better to have dictRehash use the same conditions as in _dictExpandIfNeeded and _dictShrinkIfNeeded?

We can't, _dictExpandIfNeeded compares ht_used[0] and DICTHT_SIZE[0], while dictRehash compares DICTHT_SIZE[0] and DICTHT_SIZE[1]. When performing dictRehash, the total ht_used (sum of ht_used[0] and ht_used[1]) may have changed, and it may no longer meet the threshold for _dictExpandIfNeeded or _dictShrinkIfNeeded. Therefore, we need to use DICTHT_SIZE to infer the conditions used by _dict*IfNeeded at that time.
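Since only the two table sizes are available mid-rehash, the "keep rehashing during DICT_RESIZE_AVOID?" decision can be inferred from their ratio. A minimal sketch of that size-ratio check (illustrative, not the exact dict.c code; `rehash_should_pause` is a made-up name):

```c
#include <stdbool.h>

#define HASHTABLE_MIN_FILL      8
#define DICT_FORCE_RESIZE_RATIO 4

/* During DICT_RESIZE_AVOID, pause rehashing unless the resize was "forced":
 * an expand to at least 4x, or a shrink to at most 1/32 of the old size.
 * s0/s1 are the bucket counts of the old/new tables (both powers of two). */
static bool rehash_should_pause(unsigned long s0, unsigned long s1) {
    if (s1 > s0)  /* expanding: pause unless the new table is >= 4x larger */
        return s1 < s0 * DICT_FORCE_RESIZE_RATIO;
    /* shrinking: pause unless the new table is <= 1/32 of the old one */
    return s0 < s1 * HASHTABLE_MIN_FILL * DICT_FORCE_RESIZE_RATIO;
}
```

Because all sizes are powers of two, both ratio tests reduce to exact comparisons with no rounding.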

@lyq2333
Contributor Author

lyq2333 commented Jan 17, 2024

isn't it better to have dictRehash use the same conditions as in _dictExpandIfNeeded and _dictShrinkIfNeeded

I agree with @soloestoy.

Because shrinking is triggered at 2% or 10%, no shrinking will ever be stopped by this 20% condition. I think this condition should be able to limit some shrinking, just as it does for expanding.

@sundb
Collaborator

sundb commented Jan 17, 2024

@soloestoy `d->ht_used[0] / DICTHT_SIZE(d->ht_size_exp[0]) > dict_force_expand_rehash_ratio`
What about this? This way the expand threshold goes from 10% to 12.5%, and both sides can be kept the same.

src/dict.c (outdated)

```c
/* If dict_can_resize is DICT_RESIZE_AVOID, we want to avoid rehashing.
 * We can restrict the condition because both s0 and s1 are powers of 2.
 * - If expanding, the _dictNextExp of dict_force_resize_ratio is 8.
 * - If shrinking, the _dictNextExp of (HASHTABLE_MIN_FILL / dict_force_resize_ratio) is 1/32. */
```
Collaborator

I've just been trying to understand why it's 32, not 64, and I think others will have that understanding cost too if we don't make both sides consistent.

Collaborator

#12950 can explain : )

@soloestoy
Collaborator

dictRehash is executed after dictResize, at which point ht_table[1] has already been created. The rehashing process is progressive, meaning that some buckets may have already been moved to ht_table[1] during each dictRehash execution. Therefore, ht_used[0] cannot be used for calculations. In fact, neither ht_used[0] nor ht_used[1] is suitable for determining the threshold for dictRehash. I believe that during the progressive rehashing phase, which occurs after the resize, it is more appropriate to look at the bucket counts of both the old and new tables, rather than the number of used entries.

@oranagra
Member

isn't it better to have dictRehash use the same conditions as in _dictExpandIfNeeded and _dictShrinkIfNeeded?

We can't, _dictExpandIfNeeded compares ht_used[0] and DICTHT_SIZE[0], while dictRehash compares DICTHT_SIZE[0] and DICTHT_SIZE[1]. When performing dictRehash, the total ht_used (sum of ht_used[0] and ht_used[1]) may have changed, and it may no longer meet the threshold for _dictExpandIfNeeded or _dictSrinkIfNeeded. Therefore, we need to use DICTHT_SIZE to infer the conditions used by _dict*IfNeeded at that time.

you're right.

I believe that during the progressive rehashing phase, which occurs after the resize, it is more appropriate to look at the bucket counts of both the old and new tables, rather than the number of used entries.

i agree.

but still i don't understand why we need different thresholds, rather than re-using the same ones.
it may require some adjustments since we're now using powers of two, but maybe we can tweak the original thresholds or slightly adjust the formula instead of maintaining two thresholds?

@soloestoy
Collaborator

it'll maybe require some adjustments since we're now using power of two, but maybe we can tweak the original thresholds or slightly adjust the formula instead of maintaining two thresholds?

Yes, the reason is as follows: for expand, the next power of 2 after 5 is 8; for shrink, rounding the 2% threshold to a power-of-two fraction gives 1/32.

@oranagra
Member

oranagra commented Jan 18, 2024

so let's change dict_force_expand_rehash_ratio to 4 instead of 5.
and maybe we should also change HASHTABLE_MIN_FILL from 10 to 12.5 (i.e. 100 / 8).

then avoid introducing the 2 new thresholds this PR currently adds, and the formulas and comments will make much more sense?

@soloestoy
Collaborator

How about changing HASHTABLE_MIN_FILL to 8? That means the fill threshold becomes 1/8, and then multiplying HASHTABLE_MIN_FILL by dict_force_expand_rehash_ratio gives 32, i.e. 1/32.

@oranagra
Member

isn't that what i said..?
today it's 10%, which is both x/100*10 and also just x/10, so it's confusing.
doing x/8 brings us to 12.5%.
dividing further by 4 (dict_force_expand_rehash_ratio) gets us to x/8/4 (i.e. 3.125%), aka 100/32
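The arithmetic above can be sanity-checked with a tiny helper (purely illustrative; `shrink_threshold` is a made-up name, not a dict.c function):

```c
/* Shrink thresholds as entry counts for a table of x buckets:
 * x/8 (12.5%) normally, x/8/4 (3.125%, i.e. x/32) during fork. */
static unsigned long shrink_threshold(unsigned long x, int avoid) {
    return avoid ? x / 8 / 4 : x / 8;
}
```

Since x is always a power of two, both divisions are exact.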

@soloestoy
Collaborator

and maybe we should also change HASHTABLE_MIN_FILL from 10 to 12.5 (i.e. 100 / 8).

I thought you wanted 12.5, haha

@lyq2333
Contributor Author

lyq2333 commented Jan 19, 2024

@oranagra @soloestoy I have changed dict_force_resize_ratio to 4 and HASHTABLE_MIN_FILL to 8 to avoid introducing the new thresholds. Please take a look and check that it matches your discussion. Thanks.

@enjoy-binbin Because of the change to dict_force_resize_ratio and HASHTABLE_MIN_FILL, I also adjusted the test code so that future changes to these two thresholds won't require further code modifications. PTAL. Thanks.

@oranagra
Member

@lyq2333 thanks. please update the title and top comment to match the new code and make sure they explain the problem and solution.

Member

@oranagra oranagra left a comment


LGTM. waiting for acks from others before i merge.

Collaborator

@enjoy-binbin enjoy-binbin left a comment


changes and tests LGTM

src/dict.c (outdated review thread, resolved)
@oranagra
Member

@lyq2333 we have a build failure in the 32bit builds: https://github.com/redis/redis/actions/runs/7584286940/job/20657641562

@lyq2333 lyq2333 changed the title Strengthen restriction in rehashing if dict_can_resize is DICT_RESIZE_AVOID Change the threshold of dict expand, shrink and rehash Jan 19, 2024
@oranagra
Member

@lyq2333 i edited the top comment. please ack or edit

@lyq2333
Contributor Author

lyq2333 commented Jan 19, 2024

@lyq2333 i edited the top comment. please ack or edit

@oranagra Thanks very much! LGTM.

@oranagra oranagra merged commit b07174a into redis:unstable Jan 19, 2024
13 checks passed
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
oranagra added a commit that referenced this pull request Feb 11, 2024
Failing CI:
https://github.com/redis/redis/actions/runs/7837608438/job/21387609715

## Why defragment tests only failed under 32-bit

First of all, under 32-bit jemalloc will allocate more small bins and
fewer large bins, which also leads to more external fragmentation.
Therefore, the fragmentation ratio is higher in 32-bit than in 64-bit,
so the defragment tests (`Active defrag eval scripts: cluster` and
`Active defrag big keys: cluster`) always fail in 32-bit.

## Why defragment tests only failed with cluster
The following is the result of the `Active defrag eval scripts: cluster`
test.

1) Before #11695, the fragmentation ratio is 3.11%.

2) After #11695, the fragmentation ratio grew to 4.58%.
Since we are using per-slot dictionary to manage slots, we will only
defragment the contents of these dictionaries (keys, values), but not
the dictionaries' struct and ht_table, which means that frequent
shrinking and expanding of the dictionaries, will make more fragments.

3) After #12850 and #12948, in cluster mode, a large number of cluster
slot dicts will be shrunk, creating additional fragmentation, and the
dictionaries themselves will not be defragged.

## Solution
* Add defragmentation of the per-slot dictionary's own structures, dict
struct and ht_table.

## Other change
* Increase floating point print precision of `frags` and `rss` in debug
logs for defrag

---------

Co-authored-by: Oran Agra <oran@redislabs.com>