
Conversation

@yoav-steinberg yoav-steinberg commented Mar 23, 2021

Description

A mechanism for disconnecting clients when the sum of memory used by all connected clients is above a configured limit. This prevents eviction or OOM caused by memory accumulated across all clients. It's a complementary mechanism to the client-output-buffer-limit mechanism: it takes into account not just a single client, and not just output buffers, but all memory used by all clients.

Design

The general design is as follows:

  • We track memory usage of each client, taking into account all memory used by the client (query buffer, output buffer, parsed arguments, etc...). This is kept up to date after reading from the socket, after processing commands and after writing to the socket.
  • Based on the used memory we sort all clients into buckets. Each bucket contains all clients using up to 2x the memory of the clients in the bucket below it. For example: up to 1m clients, up to 2m clients, up to 4m clients, ...
  • Before processing a command and before sleep we check if we're over the configured limit. If we are, we start disconnecting clients from the larger buckets downwards until we're under the limit.
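As a toy model of that last step (all names and structures below are illustrative, not the actual Redis code), the eviction pass walks the buckets from the largest downwards:

```c
#include <stddef.h>

#define NUM_BUCKETS 4
#define MAX_CLIENTS_PER_BUCKET 8

/* Toy model: each bucket holds the memory usage of its clients. */
typedef struct {
    size_t client_mem[MAX_CLIENTS_PER_BUCKET];
    int nclients;
} mem_bucket;

static size_t total_clients_mem(const mem_bucket *b, int n) {
    size_t sum = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < b[i].nclients; j++)
            sum += b[i].client_mem[j];
    return sum;
}

/* Disconnect ("drop") clients from the largest bucket downwards until the
 * total memory of all clients is back under the configured limit.
 * Returns the number of evicted clients. */
static int evict_clients_if_needed(mem_bucket *b, int n, size_t limit) {
    int evicted = 0;
    if (limit == 0) return 0; /* 0 means no limit */
    for (int i = n - 1; i >= 0 && total_clients_mem(b, n) > limit; i--) {
        while (b[i].nclients > 0 && total_clients_mem(b, n) > limit) {
            b[i].nclients--; /* "disconnect" the last client in this bucket */
            evicted++;
        }
    }
    return evicted;
}
```

Note how only as many clients as needed are dropped, and always from the biggest bucket first, so small well-behaved clients survive.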

Config

maxmemory-clients: the maximum memory all clients are allowed to consume; above this threshold we disconnect clients.
This config can either be set to 0 (meaning no limit), a size in bytes (possibly with MB/GB suffix),
or as a percentage of maxmemory by using the % suffix (e.g. setting it to 10% would mean 10% of maxmemory).
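For illustration, the three accepted forms might look like this in redis.conf (the values here are arbitrary examples):

```
# No limit (the default):
maxmemory-clients 0

# Absolute limit, bytes with an optional size suffix:
maxmemory-clients 1gb

# Percentage of maxmemory:
maxmemory-clients 10%
```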

Important code changes

  • During development I encountered yet more situations where our io-threads access global vars, and needed to fix them. I also had to keep the clients sorted into the memory buckets (which are global) while their memory usage changes in the io-thread. To achieve this I decided to simplify how we check if we're in an io-thread and make it much more explicit. I removed the CLIENT_PENDING_READ flag used for checking if the client is in an io-thread (it wasn't used for anything else) and just used the global io_threads_op variable the same way to check during writes.
  • I optimized the cleanup of the client from the clients_pending_read list on client freeing. We now store a pointer in the client struct to this list so we don't need to search in it (pending_read_list_node).
  • Added evicted_clients stat to INFO command.
  • Added CLIENT NO-EVICT ON|OFF sub command to exclude a specific client from the client eviction mechanism. Added the corresponding 'e' flag in the client info string.
  • Added multi-mem field in the client info string to show how much memory is used up by buffered multi commands.
  • Client tot-mem now accounts for buffered multi-commands, pubsub patterns and channels (partially), tracking prefixes (partially).
  • CLIENT_CLOSE_ASAP flag is now handled in a new beforeNextClient() function so clients will be disconnected between processing different clients and not only before sleep. This new function can be used in the future for work we want to do outside the command processing loop but don't want to wait for all clients to be processed before we get to it. Specifically I wanted to handle output-buffer-limit related closing before we process client eviction in case the two race with each other.
  • Added a DEBUG CLIENT-EVICTION command to print out info about the client eviction buckets.
  • Each client now holds a pointer to the client eviction memory usage bucket it belongs to and a listNode to itself in that bucket for quick removal.
  • The global io_threads_op variable can now contain an IO_THREADS_OP_IDLE value indicating no io-threading is currently being executed.
  • In order to track memory used by each client in real time we can't rely on updating these stats in clientsCron() alone anymore. So now I call updateClientMemUsage() (previously clientsCronTrackClientsMemUsage()) after command processing, after writing data to pubsub clients, after writing the output buffer and after reading from the socket (and maybe other places too). The function is written to be fast.
  • Clients are evicted if needed (with appropriate log line) in beforeSleep() and before processing a command (before performing oom-checks and key-eviction).
  • All clients memory usage buckets are grouped as follows:
    • All clients using less than 64k.
    • 64K..128K
    • 128K..256K
    • ...
    • 2G..4G
    • All clients using 4g and up.
  • Added client-eviction.tcl with a bunch of tests for the new mechanism.
  • Extended maxmemory.tcl to test the interaction between maxmemory and maxmemory-clients settings.
  • Added an option to flag a numeric configuration variable as a "percent", meaning that if we encounter a '%' after the number in the config file (or CONFIG SET command) we consider it valid. Such a number is stored internally as a negative value. This way an integer value can be interpreted as either a percent (negative) or an absolute value (positive). This is useful, for example, if some numeric configuration can optionally be set to a percentage of something else.
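The doubling bucket boundaries listed above (64K, 128K, ..., 4G and up) map naturally to a floor(log2) computation. A hedged sketch of that mapping (hypothetical names; the real Redis code differs) using __builtin_clzl, whose use this PR introduced, assuming 64-bit unsigned long:

```c
#include <stddef.h>

/* Hypothetical sketch of mapping a client's memory usage to a bucket index:
 * bucket 0 holds clients under 64K, then each bucket covers a doubling
 * range (64K..128K, 128K..256K, ...), with one last bucket for 4G and up. */
#define MIN_BUCKET_LOG 16 /* 64K = 2^16 */
#define MAX_BUCKET_LOG 32 /* 4G  = 2^32 */

static int mem_usage_bucket(size_t mem) {
    if (mem < ((size_t)1 << MIN_BUCKET_LOG)) return 0;
    /* floor(log2(mem)) via count-leading-zeros */
    int log2floor = (int)(sizeof(unsigned long) * 8 - 1 - __builtin_clzl(mem));
    /* everything at 4G and above shares the last bucket */
    if (log2floor > MAX_BUCKET_LOG) log2floor = MAX_BUCKET_LOG;
    return log2floor - MIN_BUCKET_LOG + 1;
}
```

With these constants there are 18 buckets in total: one below 64K, sixteen doubling ranges, and one for 4G and up.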

Original PR description:

See #7676

Description

We track how much memory each client uses. The value is updated after each processInputBuffer and after writing to the socket. When the sum over all clients exceeds the configured limit we disconnect the fat clients. This is checked after each command.

This is more or less done given the current state of my discussions with @oranagra.
We need a larger forum to review this solution and decide if it's good enough.

Important/Open issues:

  1. @oranagra and I thought it won't be right to evict clients based on some rolling avg of memory usage, rather to look at the current state. This is because we evict clients until we're back under some threshold and looking at past memory usage while aiming to go below some current memory usage might cause unexpected results. Originally we thought some past avg will be needed to avoid disconnecting a bursting client which uses up a lot of memory for a short time, but in reality such a client might eventually also be disconnected and it might not be what we want if not disconnecting it will cause all other clients to disconnect as well. There's no one right solution here, but I think adding a past based rolling avg just complicates things and makes them less predictable and tougher to tune for the user.
  2. How does this relate to the already existing single-client COB (client output buffer) threshold? Do we still need it, or is it redundant and should be deprecated? From the user's point of view it's easier to simply define a threshold for maxmemory-clients which is also enforced on the individual client level (and also includes the query buffer size), and there might be no real reason to configure the client-output-buffer-limit. Also need to consider how this relates to replication client limits; currently maxmemory-clients ignores these clients.
  3. Soft vs hard thresholds: the old client-output-buffer-limit had a timer based soft threshold. What is it really good for? Isn't this just over complicating things? If not perhaps we want something like this for maxmemory-clients as well? I have a feeling this is an overkill and can probably be deprecated.
  4. One difference between the client-output-buffer-limit implementation and this is that here we check and evict clients only before processing a command and before sleep but not during the command processing. We might want to change this and add the code checking the limit and evicting clients inside _addReplyProtoToList where asyncCloseClientOnOutputBufferLimitReached is called. I think this won't affect performance that much. If we're thinking this is the future replacement for client-output-buffer-limit then we need to do this.
  5. Should client no-evict flag also guard against client-output-buffer-limit protection?

TODO:

  • remove client loop in info command: Client eviction #8687 (comment)
  • remove big todo comment in server.h: Client eviction #8687 (comment)
  • Account MULTI command buffer size as clients used memory.
  • There's the issue of accounting watched keys: they are kind of per-client but not really because in reality there's a list of clients per watched key. How do we handle this? Do we add another mechanism for limiting memory used by watched keys?

Tests to be added:

  • Decrease maxmemory-clients in runtime causes client eviction.
  • Only the required number of clients are evicted to achieve maxmemory-clients
  • First larger clients are evicted and then smaller ones.
  • Client eviction works on large query buffers, large args, large output buffers, large multi buffers and watched keys lists.

@yoav-steinberg yoav-steinberg added the state:needs-design (the solution is not obvious and some effort should be made to design it) and state:major-decision (requires core team consensus) labels Mar 31, 2021
oranagra previously approved these changes Apr 28, 2021

@oranagra oranagra left a comment


conceptual approval (just asked for some minor cleanup)

@oranagra oranagra added the approval-needed (waiting for core team approval to be merged) and release-notes (indication that this issue needs to be mentioned in the release notes) labels May 4, 2021

oranagra commented May 4, 2021

@redis/core-team please take a look at this new feature for redis 7.0 (details at the top)

@yoav-steinberg

@yossigo @oranagra There's the issue of tracking memory usage of watched keys:

  • A WATCH command adds the key name to a global dict of watched keys. Each entry in the dict contains a list of clients watching that key. This means that this isn't a per-client memory consumption. So we need to think of a mechanism of limiting how much memory watched keys can consume. Another config?
  • We also don't have any reporting of these global dicts. So mem overhead reporting should be updated accordingly.
  • In addition, each client contains a list of pointers to all the keys it's watching. This can be accounted for per-client, reported in CLIENT LIST and used for client eviction. This is already implemented in my last commits.

Any thoughts?

@yoav-steinberg

After talking with @oranagra about how to handle io-threads-do-reads, we came up with the following concept (to be tested):
To handle the eviction buckets being global and update them when filling data per client in the read threads, we can simply make sure all updates are atomic decrements or increments (we need decrements when moving a client from one bucket to another). We can also check (and update) the total memory usage sum. If we pass maxmemory-clients we can stop processing the client or even abort the thread. When we're back in the main thread we can safely assume all sums in the buckets are valid because of eventual consistency. And at this point handle any client evictions if needed.
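A minimal sketch of that idea using C11 atomics (all names here are hypothetical; the actual implementation differs): io-threads only touch the shared bucket sums and the global total through atomic add/sub, so once all io-threads have joined, the main thread sees consistent totals and can evict safely.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Per-bucket total memory of the clients it contains. */
typedef struct {
    atomic_size_t mem_usage_sum;
} mem_bucket;

static atomic_size_t total_clients_mem;

/* Called (possibly from an io-thread) when a client's memory usage changes
 * and it moves from one bucket to another: an atomic decrement of the old
 * bucket plus an atomic increment of the new one. */
static void account_client_mem(mem_bucket *from, mem_bucket *to,
                               size_t old_mem, size_t new_mem) {
    atomic_fetch_sub(&from->mem_usage_sum, old_mem);
    atomic_fetch_add(&to->mem_usage_sum, new_mem);
    /* unsigned wrap-around makes this correct for a negative delta too */
    atomic_fetch_add(&total_clients_mem, new_mem - old_mem);
}
```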

@oranagra

regarding the watched keys: i don't think the client eviction mechanism needs to be perfect and count all per-client overheads. it's ok that we solve the output buffer problem and other painful problems, and some edge cases remain unsolved (it's not a security feature).
So the things that are truly per client, and are easy to count, we'll count (no reason not to), but things that are shared between clients, we can skip.

we can however improve the total overhead reported in INFO MEMORY, and the detailed report in MEMORY STATS to include these WATCH, and maybe CSC (client side caching / tracking) overheads (for manual troubleshooting).

@madolson

Regarding the memory usage, I agree with oran that right now a best effort will catch most of the issues. If we get to the point in the future where we see issues, we can iterate on this solution.

@oranagra

@madolson i didn't understand your comment about the valid_fn (for some reason i can't respond to that comment)


madolson commented Jun 14, 2021

@oranagra I'm not entirely sure how that comment ended up there, it was a response to a sundb comment but somehow got duplicated as its own comment. I can't respond to it either, so I deleted it.

@yoav-steinberg yoav-steinberg added the state:needs-doc-pr (requires a PR to redis-doc repository) label Aug 12, 2021
@yoav-steinberg yoav-steinberg marked this pull request as ready for review August 17, 2021 12:15

@oranagra oranagra left a comment


the top comment needs an update. the current discussion in it can remain below some separator, but i'd like to add a better description at the top that explains what the PR eventually does.
i.e. purpose, design, and most importantly interface changes and any unrelated changes.

the ones i listed during my review are these:

  • new config
  • explain the refactor of CLIENT_PENDING_READ and io_threads_op (why and how)
  • c->pending_read_list_node
  • evicted_clients stat in INFO
  • new multi-mem in CLIENT LIST and CLIENT INFO, and also other existing client memory fields (previously untracked)
  • pubsub_patterns and pubsub_channels and parts of client_tracking_prefixes memory tracked in the above
  • CLIENT NO-EVICT sub-command
  • CLIENT_CLOSE_ASAP handled in beforeNextClient rather than beforeSleep
  • anything else i missed?

oranagra pushed a commit that referenced this pull request Sep 26, 2021
Fixing CI test issues introduced in #8687
- valgrind warnings in readQueryFromClient when client was freed by processInputBuffer
- adding DEBUG pause-cron for tests not to be time dependent.
- skipping a test that depends on socket buffers / events not compatible with TLS
- making sure client got subscribed by not using deferring client
enjoy-binbin added a commit to enjoy-binbin/redis-doc that referenced this pull request Jan 12, 2022
multi-mem: added in redis/redis#8687
resp: added in redis/redis#9508

Also adjust the order of fields to better match the output

The current output format is (redis unstable branch):
```
id=3 addr=127.0.0.1:50188 laddr=127.0.0.1:6379 fd=8 name= age=7 idle=0
flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=20448 argv-mem=10
multi-mem=0 obl=0 oll=0 omem=0 tot-mem=40986 events=r cmd=client|list
user=default redir=-1 resp=2
```
itamarhaber pushed a commit to redis/redis-doc that referenced this pull request Jan 23, 2022
oranagra pushed a commit that referenced this pull request Mar 15, 2022
In a benchmark we noticed we spend a relatively long time updating the client
memory usage leading to performance degradation.
Before #8687 this was performed in the client's cron and didn't affect performance.
But since introducing client eviction we need to perform this after filling the input
buffers and after processing commands. This also led me to write this code to be
thread safe and perform it in the i/o threads.

It turns out that the main performance issue here is related to atomic operations
being performed while updating the total clients memory usage stats used for client
eviction (`server.stat_clients_type_memory[]`). This update needed to be atomic
because `updateClientMemUsage()` was called from the IO threads.

In this commit I make sure to call `updateClientMemUsage()` only from the main thread.
In case of threaded IO I call it for each client during the "fan-in" phase of the read/write
operation. This also means I could chuck the `updateClientMemUsageBucket()` function
which was called during this phase and embed it into `updateClientMemUsage()`.

Profiling shows this makes `updateClientMemUsage()` (on my x86_64 linux) roughly x4 faster.
@oranagra

@yoav-steinberg i got some failure with valgrind. maybe you have time to look into it

*** [err]: avoid client eviction when client is freed by output buffer limit in tests/unit/client-eviction.tcl
Expected 'obuf-client1' to match 'no client named obuf-client1 found*' (context: type eval line 38 cmd {assert_match {no client named obuf-client1 found*} $e} proc ::test) 

@yoav-steinberg

Not sure, the test seems fine. If it recreates, you can check the server logs to see why the two clients aren't being disconnected for reaching their output buffer limit.

oranagra pushed a commit that referenced this pull request Dec 28, 2022
…ors (#11657)

This call is introduced in #8687, but became irrelevant in #11348, and is currently a no-op.
The fact is that #11348 had an unintended side effect: even if the client eviction config
is enabled, there are certain types of clients for which memory consumption is not accurately
tracked, and so, unlike normal clients, their memory isn't reported correctly in INFO.
oranagra pushed a commit that referenced this pull request Jan 16, 2023

(cherry picked from commit af0a4fe)
oranagra pushed a commit that referenced this pull request Jul 25, 2023
A bug introduced in #11657 (7.2 RC1), causes client-eviction (#8687)
and INFO to have inaccurate memory usage metrics of MONITOR clients.

Because the type in `c->type` and the type in `getClientType()` are confusing
(in the latter, `CLIENT_TYPE_NORMAL` not `CLIENT_TYPE_SLAVE`), the comment
we wrote in `updateClientMemUsageAndBucket` was wrong, and in fact that function
didn't skip monitor clients.
And since it doesn't skip monitor clients, it was wrong to delete the call for it from
`replicationFeedMonitors` (it wasn't a NOP).
That deletion could mean that the monitor client memory usage is not always up to
date (updated less frequently, but still a candidate for client eviction).
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
oranagra pushed a commit to redis/redis-doc that referenced this pull request Oct 24, 2023
no-evict added in redis/redis#8687
no-touch added in redis/redis#11483

Co-authored-by: Binbin <binloveplau1314@qq.com>
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Nov 28, 2023
In the past, we did not call _dictNextExp frequently. It was only
called when the dictionary was expanded.

Later, dictTypeExpandAllowed was introduced in redis#7954, which is 6.2.
For the data dict and the expire dict, we can check maxmemory before
actually expanding the dict. This is a good optimization to avoid
maxmemory being exceeded due to the dict expansion.

And in redis#11692, we moved the dictTypeExpandAllowed check before the
threshold check, which caused a bit of performance degradation: every
time a key is added to the dict, dictTypeExpandAllowed is called to check.

The main reason for degradation is that in a large dict, we need to
call _dictNextExp frequently, that is, every time we add a key, we
need to call _dictNextExp once. Then the threshold is checked to see
if the dict needs to be expanded. We can see that the order of checks
here can be optimized.

So we moved the dictTypeExpandAllowed check back to after the threshold
check in redis#12789. In this way, before the dict is actually expanded (that
is, before the threshold is reached), we will not do anything extra
compared to before, that is, we will not call _dictNextExp frequently.

But note we'll still hit the degradation when we go over the thresholds.
When the threshold is reached, because of redis#7954, we may delay the dict
expansion due to maxmemory limitations. In this case, we will call
_dictNextExp every time we add a key during this period.

This PR uses CLZ in _dictNextExp to get the next power of two. CLZ (count
leading zeros) can easily give you the next power of two. It should be
noted that we actually introduced the use of __builtin_clzl in redis#8687,
which is 7.0. So I suppose all the platforms we use have it (even if the
CPU doesn't have an instruction).

We build 67108864 (2**26) keys through DEBUG POPULATE, which will use
approximately 5.49G memory (used_memory:5898522936). If expansion is
triggered, the additional hash table will consume approximately 1G
memory (2 ** 27 * 8). So we set maxmemory to 6871947673 (that is, 6.4G),
which will be less than 5.49G + 1G, so we will delay the dict rehash
while adding the keys.

After that, each time an element is added to the dict, an allow check
will be performed, that is, we can frequently call _dictNextExp to test
the comparison before and after the optimization. Using DEBUG HTSTATS 0 to
check and make sure that our dict expansion is delayed.

Using `./src/redis-benchmark -P 100 -r 1000000000 -t set -n 5000000`,
After ten rounds of testing:
```
unstable:           this PR:
769585.94           816860.00
771724.00           818196.69
775674.81           822368.44
781983.12           822503.69
783576.25           828088.75
784190.75           828637.75
791389.69           829875.50
794659.94           835660.69
798212.00           830013.25
801153.62           833934.56
```

We can see there is about 4-5% performance improvement in this case.
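As a rough illustration of the CLZ trick the commit above describes (the real _dictNextExp returns an exponent; this toy returns the rounded-up size itself, and assumes 64-bit unsigned long):

```c
#include <stddef.h>

/* Smallest power of two >= size, computed via count-leading-zeros instead
 * of a shift loop. The highest set bit of (size - 1) sits at position
 * 63 - __builtin_clzl(size - 1); one position above it is the answer. */
static unsigned long next_power_of_two(unsigned long size) {
    if (size <= 1) return 1;
    return 1UL << (sizeof(unsigned long) * 8 - __builtin_clzl(size - 1));
}
```

The `size - 1` keeps exact powers of two fixed (e.g. 64 stays 64 rather than rounding up to 128).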
oranagra pushed a commit that referenced this pull request Nov 29, 2023
oranagra pushed a commit that referenced this pull request Jan 9, 2024
(cherry picked from commit 22cc9b5)
oranagra pushed a commit that referenced this pull request Jan 9, 2024
(cherry picked from commit 22cc9b5)
warrick1016 pushed a commit to ctripcorp/Redis-On-Rocks that referenced this pull request Aug 29, 2025
A mechanism for disconnecting clients when the sum of all connected clients' memory usage is above a
configured limit. This prevents eviction or OOM caused by memory accumulated
across all clients. It's a complementary mechanism to the `client-output-buffer-limit`
mechanism, which considers only a single client and only its output buffers; this one
takes into account all memory used by all clients.

The general design is as follows:
* We track the memory usage of each client, taking into account all memory used by the
  client (query buffer, output buffer, parsed arguments, etc.). This is kept up to date
  after reading from the socket, after processing commands and after writing to the socket.
* Based on the used memory we sort all clients into buckets. Each bucket contains all
  clients using up to twice the memory of the clients in the bucket below it, for example
  clients using up to 1MB, up to 2MB, up to 4MB, ...
* Before processing a command and before sleep we check if we're over the configured
  limit. If we are, we start disconnecting clients from the larger buckets downwards until
  we're under the limit.

`maxmemory-clients` is the maximum memory all clients combined are allowed to consume;
above this threshold we disconnect clients.
This config can either be set to 0 (meaning no limit), a size in bytes (possibly with an
MB/GB suffix), or a percentage of `maxmemory` by using the `%` suffix (e.g. setting it
to `10%` means 10% of `maxmemory`).
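As a config fragment this looks like the following (illustrative values, following the syntax described above):

```
# redis.conf
maxmemory 4gb
maxmemory-clients 10%     # cap all clients at 10% of maxmemory (~410MB here)
# maxmemory-clients 1gb   # or an absolute size
# maxmemory-clients 0     # 0 disables the limit
```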

* During the development I encountered yet more situations where our io-threads access
  global vars and needed to fix them. I also had to keep the clients sorted into the
  memory buckets (which are global) while their memory usage changes in the io-thread.
  To achieve this I decided to simplify how we check if we're in an io-thread and make it
  much more explicit. I removed the `CLIENT_PENDING_READ` flag that was used for checking
  if the client is in an io-thread (it wasn't used for anything else) and instead use the
  global `io_threads_op` variable for the same check during writes as well.
* I optimized the cleanup of the client from the `clients_pending_read` list on client freeing.
  We now store a pointer in the `client` struct to this list so we don't need to search in it
  (`pending_read_list_node`).
* Added `evicted_clients` stat to `INFO` command.
* Added `CLIENT NO-EVICT ON|OFF` sub command to exclude a specific client from the
  client eviction mechanism. Added a corresponding `e` flag in the client info string.
* Added `multi-mem` field in the client info string to show how much memory is used up
  by buffered multi commands.
* Client `tot-mem` now accounts for buffered multi-commands, pubsub patterns and
  channels (partially), tracking prefixes (partially).
* The `CLIENT_CLOSE_ASAP` flag is now handled in a new `beforeNextClient()` function so
  clients will be disconnected between processing different clients and not only before sleep.
  This new function can be used in the future for work we want to do outside the command
  processing loop but don't want to wait for all clients to be processed before we get to it.
  Specifically I wanted to handle output-buffer-limit related closing before we process client
  eviction in case the two race with each other.
* Added a `DEBUG CLIENT-EVICTION` command to print out info about the client eviction
  buckets.
* Each client now holds a pointer to the client eviction memory usage bucket it belongs to
  and a listNode pointing to itself in that bucket for quick removal.
* The global `io_threads_op` variable can now contain an `IO_THREADS_OP_IDLE` value
  indicating no io-threading is currently being executed.
* In order to track the memory used by each client in real time we can't rely on updating
  these stats in `clientsCron()` alone anymore. So now I call `updateClientMemUsage()`
  (formerly `clientsCronTrackClientsMemUsage()`) after command processing, after
  writing data to pubsub clients, after writing the output buffer and after reading from the
  socket (and maybe other places too). The function is written to be fast.
* Clients are evicted if needed (with appropriate log line) in `beforeSleep()` and before
  processing a command (before performing oom-checks and key-eviction).
* The client memory usage buckets are grouped as follows:
  * All clients using less than 64KB.
  * 64KB..128KB
  * 128KB..256KB
  * ...
  * 2GB..4GB
  * All clients using 4GB and up.
* Added client-eviction.tcl with a bunch of tests for the new mechanism.
* Extended maxmemory.tcl to test the interaction between maxmemory and
  maxmemory-clients settings.
* Added an option to flag a numeric configuration variable as a "percent", meaning that
  if we encounter a '%' after the number in the config file (or CONFIG SET command) we
  consider it valid. Such a number is stored internally as a negative value. This way an
  integer value can be interpreted as either a percent (negative) or an absolute value
  (positive). This is useful, for example, if some numeric configuration can optionally be
  set to a percentage of something else.

Co-authored-by: Oran Agra <oran@redislabs.com>
warrick1016 pushed a commit to ctripcorp/Redis-On-Rocks that referenced this pull request Aug 29, 2025
…10401)

In a benchmark we noticed we spend a relatively long time updating the client
memory usage, leading to performance degradation.
Before redis#8687 this was performed in the client's cron and didn't affect performance.
But since introducing client eviction we need to perform this after filling the input
buffers and after processing commands. This also led me to write this code to be
thread safe and perform it in the i/o threads.

It turns out that the main performance issue here is related to atomic operations
being performed while updating the total clients memory usage stats used for client
eviction (`server.stat_clients_type_memory[]`). This update needed to be atomic
because `updateClientMemUsage()` was called from the IO threads.

In this commit I make sure to call `updateClientMemUsage()` only from the main thread.
In case of threaded IO I call it for each client during the "fan-in" phase of the read/write
operation. This also means I could chuck the `updateClientMemUsageBucket()` function
which was called during this phase and embed it into `updateClientMemUsage()`.

Profiling shows this makes `updateClientMemUsage()` (on my x86_64 linux) roughly x4 faster.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus state:needs-design the solution is not obvious and some effort should be made to design it state:needs-doc-pr requires a PR to redis-doc repository state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[NEW] Client "eviction" - drop clients when their total buffer overhead is over a limit