Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shrink dict when deleting dictEntry #12850

Merged
merged 14 commits into from Jan 15, 2024
Merged

Conversation

lyq2333
Copy link
Contributor

@lyq2333 lyq2333 commented Dec 8, 2023

When we insert entries into dict, it may autonomously expand if needed. However, when we delete entries from dict, it doesn't shrink to the proper size. If there are few entries in a very large dict, it may cause huge waste of memory and inefficiency when iterating.

The main keyspace dicts (keys and expires), are shrinked by cron (tryResizeHashTables calls htNeedsResize and dictResize),
And some data structures such as zset and hash also do that (call htNeedsResize) right after a loop of calls to dictDelete,
But many other dicts are completely missing that call (they can only expand).

In this PR, we provide the ability to automatically shrink the dict when deleting. The conditions triggering the shrinking is the same as htNeedsResize used to have. i.e. we expand when we're over 100% utilization, and shrink when we're below 10% utilization.

Additionally:

  • Add dictPauseAutoResize so that flows that do mass deletions, will only trigger shrinkage at the end.
  • Rename dictResize to dictShrinkToFit (same logic as it used to have, but better name describing it)
  • Rename _dictExpand to _dictResize (same logic as it used to have, but better name describing it)

related to discussion #12819 (comment)

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally, i'd like to proceed with this direction, but i have some concerns.

src/dict.c Outdated Show resolved Hide resolved
src/t_set.c Show resolved Hide resolved
src/dict.c Outdated Show resolved Hide resolved
src/dict.c Outdated Show resolved Hide resolved
src/dict.c Outdated Show resolved Hide resolved
src/dict.c Outdated Show resolved Hide resolved
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code LGTM, minor comment edit.

src/dict.c Outdated Show resolved Hide resolved
src/dict.c Outdated Show resolved Hide resolved
src/dict.c Outdated Show resolved Hide resolved
Co-authored-by: Oran Agra <oran@redislabs.com>
src/dict.c Show resolved Hide resolved
src/t_zset.c Show resolved Hide resolved
@oranagra
Copy link
Member

oranagra commented Jan 2, 2024

this PR was discussed in a core-team meeting and we agreed we can proceed and merged it.

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyq2333 i edited the top comment, please check if it looks ok to you.

src/t_zset.c Outdated Show resolved Hide resolved
@lyq2333
Copy link
Contributor Author

lyq2333 commented Jan 4, 2024

@lyq2333 i edited the top comment, please check if it looks ok to you.

@oranagra Thanks. Looks good to me.

src/dict.c Show resolved Hide resolved
src/dict.c Show resolved Hide resolved
src/dict.c Outdated Show resolved Hide resolved
@oranagra
Copy link
Member

oranagra commented Jan 8, 2024

@soloestoy waiting for your ack and response in one of the above threads.

src/dict.c Show resolved Hide resolved
src/dict.c Show resolved Hide resolved
src/dict.c Show resolved Hide resolved
src/dict.c Show resolved Hide resolved
@oranagra oranagra merged commit e2b7932 into redis:unstable Jan 15, 2024
13 checks passed
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Jan 15, 2024
The new shrink was added in redis#12850.
Also updated outdated comments, see redis#11692.
oranagra pushed a commit that referenced this pull request Jan 15, 2024
The new shrink was added in #12850.
Also updated outdated comments, see #11692.
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Jan 18, 2024
Before redis#12850, we will only try to shrink the dict in serverCron,
which we can control by using a child process, but now every time
we delete a key, the shrink check will be called.

In these test (added in redis#12802), we meant to disable the resizing,
but druing the delete, the dict will meet the force shrink, like
2 / 128 = 0.015 < 0.2, the delete will trigger a force resize and
will cause the test to fail.

In this commit, we try to keep the load factor at 3 / 128 = 0.023,
that is, do not meet the force shrink.
oranagra pushed a commit that referenced this pull request Jan 18, 2024
Before #12850, we will only try to shrink the dict in serverCron,
which we can control by using a child process, but now every time
we delete a key, the shrink check will be called.

In these test (added in #12802), we meant to disable the resizing,
but druing the delete, the dict will meet the force shrink, like
2 / 128 = 0.015 < 0.2, the delete will trigger a force resize and
will cause the test to fail.

In this commit, we try to keep the load factor at 3 / 128 = 0.023,
that is, do not meet the force shrink.
oranagra pushed a commit that referenced this pull request Jan 19, 2024
Before this change (most recently modified in
#12850 (comment)), The
trigger for normal expand threshold was 100% utilization and the trigger
for normal shrink threshold was 10% (HASHTABLE_MIN_FILL).
While during fork (DICT_RESIZE_AVOID), when we want to avoid rehash, the
trigger thresholds were multiplied by 5 (`dict_force_resize_ratio`),
meaning 500% for expand and 2% (100/10/5) for shrink.

However, in `dictRehash` (the incremental rehashing), the rehashing
threshold for shrinking during fork (DICT_RESIZE_AVOID) was 20% by
mistake.
This meant that if a shrinking is triggered when `dict_can_resize` is
`DICT_RESIZE_ENABLE` which the threshold is 10%, the rehashing can
continue when `dict_can_resize` is `DICT_RESIZE_AVOID`.
This would cause unwanted CopyOnWrite damage.

It'll make sense to change the thresholds of the rehash trigger and the
thresholds of the incremental rehashing the same, however, in one we
compare the size of the hash table to the number of records, and in the
other we compare the size of ht[0] to the size of ht[1], so the formula
is not exactly the same.

to make things easier we change all the thresholds to powers of 2, so
the normal shrinking threshold is changed from 100/10 (i.e. 10%) to
100/8 (i.e. 12.5%), and we change the threshold during forks from 5 to
4, i.e. from 500% to 400% for expand, and from 2% (100/10/5) to 3.125%
(100/8/4)
oranagra pushed a commit that referenced this pull request Jan 29, 2024
)

The function `tryResizeHashTables` only attempts to shrink the dicts
that has keys (change from #11695), this was a serious problem until the
change in #12850 since it meant if all keys are deleted, we won't shrink
the dick.
But still, both dictShrink and dictExpand may be blocked by a fork child
process, therefore, the cron job needs to perform both dictShrink and
dictExpand, for not just non-empty dicts, but all dicts in DBs.

What this PR does:

1. Try to resize all dicts in DBs (not just non-empty ones, as it was
since #12850)
2. handle both shrink and expand (not just shrink, as it was since
forever)
3. Refactor some APIs about dict resizing (get rid of `htNeedsShrink`
`htNeedsShrink` `dictShrinkToFit`, and expose `dictShrinkIfNeeded`
`dictExpandIfNeeded` which already contains all the code of those
functions we get rid of, to make APIs more neat)
4. In the `Don't rehash if redis has child process` test, now that cron
would do resizing, we no longer need to write to DB after the child
process got killed, and can wait for the cron to expand the hash table.
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
When we insert entries into dict, it may autonomously expand if needed.
However, when we delete entries from dict, it doesn't shrink to the
proper size. If there are few entries in a very large dict, it may cause
huge waste of memory and inefficiency when iterating.

The main keyspace dicts (keys and expires), are shrinked by cron
(`tryResizeHashTables` calls `htNeedsResize` and `dictResize`),
And some data structures such as zset and hash also do that (call
`htNeedsResize`) right after a loop of calls to `dictDelete`,
But many other dicts are completely missing that call (they can only
expand).

In this PR, we provide the ability to automatically shrink the dict when
deleting. The conditions triggering the shrinking is the same as
`htNeedsResize` used to have. i.e. we expand when we're over 100%
utilization, and shrink when we're below 10% utilization.

Additionally:
* Add `dictPauseAutoResize` so that flows that do mass deletions, will
only trigger shrinkage at the end.
* Rename `dictResize` to `dictShrinkToFit` (same logic as it used to
have, but better name describing it)
* Rename `_dictExpand` to `_dictResize` (same logic as it used to have,
but better name describing it)
 
related to discussion
redis#12819 (comment)

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Co-authored-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
The new shrink was added in redis#12850.
Also updated outdated comments, see redis#11692.
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
Before redis#12850, we will only try to shrink the dict in serverCron,
which we can control by using a child process, but now every time
we delete a key, the shrink check will be called.

In these test (added in redis#12802), we meant to disable the resizing,
but druing the delete, the dict will meet the force shrink, like
2 / 128 = 0.015 < 0.2, the delete will trigger a force resize and
will cause the test to fail.

In this commit, we try to keep the load factor at 3 / 128 = 0.023,
that is, do not meet the force shrink.
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
Before this change (most recently modified in
redis#12850 (comment)), The
trigger for normal expand threshold was 100% utilization and the trigger
for normal shrink threshold was 10% (HASHTABLE_MIN_FILL).
While during fork (DICT_RESIZE_AVOID), when we want to avoid rehash, the
trigger thresholds were multiplied by 5 (`dict_force_resize_ratio`),
meaning 500% for expand and 2% (100/10/5) for shrink.

However, in `dictRehash` (the incremental rehashing), the rehashing
threshold for shrinking during fork (DICT_RESIZE_AVOID) was 20% by
mistake.
This meant that if a shrinking is triggered when `dict_can_resize` is
`DICT_RESIZE_ENABLE` which the threshold is 10%, the rehashing can
continue when `dict_can_resize` is `DICT_RESIZE_AVOID`.
This would cause unwanted CopyOnWrite damage.

It'll make sense to change the thresholds of the rehash trigger and the
thresholds of the incremental rehashing the same, however, in one we
compare the size of the hash table to the number of records, and in the
other we compare the size of ht[0] to the size of ht[1], so the formula
is not exactly the same.

to make things easier we change all the thresholds to powers of 2, so
the normal shrinking threshold is changed from 100/10 (i.e. 10%) to
100/8 (i.e. 12.5%), and we change the threshold during forks from 5 to
4, i.e. from 500% to 400% for expand, and from 2% (100/10/5) to 3.125%
(100/8/4)
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
…is#12819)

The function `tryResizeHashTables` only attempts to shrink the dicts
that has keys (change from redis#11695), this was a serious problem until the
change in redis#12850 since it meant if all keys are deleted, we won't shrink
the dick.
But still, both dictShrink and dictExpand may be blocked by a fork child
process, therefore, the cron job needs to perform both dictShrink and
dictExpand, for not just non-empty dicts, but all dicts in DBs.

What this PR does:

1. Try to resize all dicts in DBs (not just non-empty ones, as it was
since redis#12850)
2. handle both shrink and expand (not just shrink, as it was since
forever)
3. Refactor some APIs about dict resizing (get rid of `htNeedsShrink`
`htNeedsShrink` `dictShrinkToFit`, and expose `dictShrinkIfNeeded`
`dictExpandIfNeeded` which already contains all the code of those
functions we get rid of, to make APIs more neat)
4. In the `Don't rehash if redis has child process` test, now that cron
would do resizing, we no longer need to write to DB after the child
process got killed, and can wait for the cron to expand the hash table.
oranagra added a commit that referenced this pull request Feb 11, 2024
Fail CI:
https://github.com/redis/redis/actions/runs/7837608438/job/21387609715

## Why defragment tests only failed under 32-bit

First of all, under 32-bit jemalloc will allocate more small bins and
less large bins, which will also lead to more external fragmentation,
therefore, the fragmentation ratio is higher in 32-bit than in 64-bit,
so the defragment tests(`Active defrag eval scripts: cluster` and
`Active defrag big keys: cluster`) always fails in 32-bit.

## Why defragment tests only failed with cluster
The fowllowing is the result of `Active defrag eval scripts: cluster`
test.

1) Before #11695, the fragmentation ratio is 3.11%.

2) After #11695, the fragmentation ratio grew to 4.58%.
Since we are using per-slot dictionary to manage slots, we will only
defragment the contents of these dictionaries (keys, values), but not
the dictionaries' struct and ht_table, which means that frequent
shrinking and expanding of the dictionaries, will make more fragments.

3) After #12850 and #12948, In cluster mode, a large number of cluster
slot dicts will be shrunk, creating additional fragmention, and the
dictionary will not be defragged.

## Solution
* Add defragmentation of the per-slot dictionary's own structures, dict
struct and ht_table.

## Other change
* Increase floating point print precision of `frags` and `rss` in debug
logs for defrag

---------

Co-authored-by: Oran Agra <oran@redislabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

6 participants