
introduce dynamic client reply buffer size - save memory on idle clients #9822

Merged: 47 commits merged into redis:unstable on Feb 22, 2022

Conversation

@ranshid (Collaborator) commented Nov 22, 2021

In the current implementation, an idle client which serves no traffic still uses ~17KB of memory. This is mainly due to a fixed-size reply buffer,
currently set to 16KB here: https://github.com/redis/redis/blob/unstable/src/server.h#L157
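
For reference, the fixed buffer in question is a char array embedded at the end of the client struct (paraphrased from server.h; surrounding fields omitted):

```c
#define PROTO_REPLY_CHUNK_BYTES (16*1024) /* 16k output buffer */

typedef struct client {
    /* ... many other fields ... */
    list *reply;            /* list of reply objects to send to the client */
    int bufpos;             /* how much of buf[] is currently in use */
    char buf[PROTO_REPLY_CHUNK_BYTES]; /* fixed-size static output buffer */
} client;
```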

We have encountered cases in which the server operates in low-memory environments.
In such cases a user who wishes to create a large connection pool to support potential burst periods
will exhaust a large amount of memory just to maintain connected idle clients.
Some users may choose to "sacrifice" performance in order to save memory.

This commit introduces a dynamic mechanism to shrink and expand the client reply buffer, based on a periodically observed peak.
The algorithm works as follows (a sketch of the cron-side logic follows the list):

  1. each time a client reply buffer has been fully written, the last recorded peak is updated:
    new peak = MAX(last peak, current written size)
  2. during the clients cron we check, for each client, whether the last observed peak was:
    a. equal to the current buffer size - in which case we expand (resize) the buffer by 100%
    b. less than half the buffer size - in which case we shrink the buffer by 50%
  3. in any case we will not resize the buffer when:
    a. the current buffer peak is less than the current buffer usable size and higher than 1/2 the current buffer usable size
    b. the value of (current buffer usable size / 2) is less than 1KiB
    c. the value of (current buffer usable size * 2) is larger than 16KiB
  4. the peak value is reset to the current buffer position once every 5 seconds. we maintain a new field in the client structure (buf_peak_last_reset_time) which is used to keep track of how much time has passed since the last buffer peak reset.
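
For illustration, a minimal sketch of the cron-side resize logic described above. The field names buf_peak, buf_usable_size and buf_peak_last_reset_time are taken from this discussion; the function name, constants and helper calls are illustrative assumptions rather than the exact code merged here:

```c
/* Illustrative sketch only -- not the exact implementation of this PR. */
#define REPLY_BUF_MIN_SIZE (1024)       /* never shrink below 1KiB */
#define REPLY_BUF_MAX_SIZE (16*1024)    /* never expand above 16KiB */
#define REPLY_BUF_PEAK_RESET_SECS 5

static void cronResizeReplyBuffer(client *c, time_t now) {
    size_t usable = c->buf_usable_size;

    if (c->buf_peak >= usable && usable * 2 <= REPLY_BUF_MAX_SIZE) {
        /* Peak reached the buffer size: expand by 100%. */
        c->buf = zrealloc(c->buf, usable * 2);
        c->buf_usable_size = zmalloc_usable_size(c->buf);
    } else if (c->buf_peak < usable / 2 && usable / 2 >= REPLY_BUF_MIN_SIZE) {
        /* Peak stayed below half the buffer: shrink by 50%. */
        c->buf = zrealloc(c->buf, usable / 2);
        c->buf_usable_size = zmalloc_usable_size(c->buf);
    }
    /* Otherwise (peak between 1/2 and the full usable size) leave the buffer as is. */

    if (now - c->buf_peak_last_reset_time >= REPLY_BUF_PEAK_RESET_SECS) {
        c->buf_peak = c->bufpos;    /* reset the peak to the current buffer position */
        c->buf_peak_last_reset_time = now;
    }
}
```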

Interface changes:

CLIENT LIST - now contains 2 new extra fields:
rbs=<the current size in bytes of the client reply buffer>
rbp=<the current value in bytes of the last observed buffer peak position>

INFO STATS - now contains 2 new statistics:
reply_buffer_shrinks = <total number of buffer shrinks performed>
reply_buffer_expands = <total number of buffer expands performed>
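
For illustration (hypothetical values), the new fields show up roughly like this:

```
127.0.0.1:6379> CLIENT LIST
id=7 addr=127.0.0.1:60302 ... rbs=1024 rbp=212 ...

127.0.0.1:6379> INFO stats
# Stats
...
reply_buffer_shrinks:12
reply_buffer_expands:3
```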

Results:

The main concern in this case was the potential performance degradation as a result of buffer dereferencing.
In order to verify the change we have performed multiple benchmarking tests.
All tests were performed on:

  • m5.2xlarge instance
  • 3 io-threads (main + 2 threads)
  • 468750 keys with 4K values
  • 15 c5n.2xlarge client machines running single-threaded redis-benchmark
  • each redis-benchmark running 50 clients issuing only GET requests for 900 seconds

Comparing the results of the static vs. dynamic case:

|                 | DYNAMIC | STATIC |
|-----------------|---------|--------|
| TPS             | 164353  | 164667 |
| P90 Latency(ms) | 6.1     | 6.1    |

Also, looking into cache-miss stats using perf verified that TLB and L1 cache misses increased by only ~1%.

sudo perf stat -a -g --pid $(pidof redis-server) -e cache-misses -- sleep 60

Static buffer:  3,313,413,581 cache-misses
Dynamic buffer: 3,351,794,484 cache-misses

[flame graph: cache-misses-fg]

@oranagra (Member) commented:
@ranshid thanks.
I like the idea of saving memory when there are many clients that don't actually need it.
but i don't like to add an extra config.
i would like to look into a possibility of making a better redis by default, without users needing to spot the problem and tune it.

in theory, we can realloc the client from time to time. i.e. create a small one initially, and make it grow when we see it normally consumes more than that first buffer.
but then i'm worried about a usage pattern that creates disposable clients.
i.e. a client connects (created small), runs some traffic, gets resized, and then disconnects.
if we're flooded by such a usage pattern, we could cause a significant regression.
@yoav-steinberg i'd like to hear your thoughts.

@oranagra oranagra added this to Backlog in 7.0 via automation Nov 22, 2021
@oranagra oranagra moved this from Backlog to In progress in 7.0 Nov 22, 2021
@ranshid (Collaborator, Author) commented Nov 22, 2021

> @ranshid thanks. I like the idea of saving memory when there are many clients that don't actually need it. but i don't like to add an extra config. i would like to look into a possibility of making a better redis by default, without users needing to spot the problem and tune it.
>
> in theory, we can realloc the client from time to time. i.e. create a small one initially, and make it grow when we see it normally consumes more than that first buffer. but then i'm worried about a usage pattern that creates disposable clients. i.e. a client connects (created small), runs some traffic, gets resized, and then disconnects. if we're flooded by such a usage pattern, we could cause a significant regression. @yoav-steinberg i'd like to hear your thoughts.

Thank you @oranagra for your quick reply!
I agree that adding a new config is restricting, and an automatic adjustment mechanism would better serve the general case.
I do think there is a variety of use cases: sometimes users would prefer to save memory at the cost of performance, and in some cases the other way around, while identifying the use case automatically might sometimes be problematic.
We can consider introducing occasional memory relocation for clients, but usually once the relevant memory is already allocated for user data, there is little spare left for resizing clients (unless evictions are made).
We can also come up with a formula to choose the optimal client size based on the maxclients and maxmemory configurations, but again this might be restrictive and sub-optimal in some user setups.

Having said all that, I would be happy to follow up and consider a better alternative :)

@yoav-steinberg (Contributor) commented:
Since the client output buffer already consists of an initial static buffer of 16k and then a linked list of dynamically allocated buffers, I'm guessing that any usage pattern that needs large buffers will cause dynamic allocations anyway, so doing this dynamically doesn't make sense to me.

This should be either a compile-time constant (like it is today) or a config (like proposed here). Both are good solutions.
I looked a bit at the history of this 16k size: I see it started out as 4k, then grew to ~7.5k and then grew to 16k. From the logs I couldn't really understand why. I think we should be willing to consider making this smaller (let's say 8k) to save memory.

@ranshid (Collaborator, Author) commented Nov 22, 2021

Thank you @yoav-steinberg!
I also agree regarding the dynamic reply allocations once that fixed-size buffer is filled (which is why reducing the size will result in performance degradation).
I believe this change was first introduced by @antirez in this commit: 65330ba
I can start an effort to test the performance of different reply buffer sizes and check the degradation, but as I said before - I suspect this is mainly intended for users to opt in their Redis to specific use cases (which are not that common IMO).

@oranagra (Member) commented:
As you said, this buffer is needed for performance (it avoids allocations, and also the indirection and possibly worse cache locality we'd have if it was a pointer).
As was pointed out, it was smaller in the past and then increased, so obviously decreasing it is not a wise idea.
I generally want to avoid adding configs for such tuning. too low level, and i don't think users need to worry about it.
also i'd like to improve memory usage even for the ones who didn't bother to dig in and realize what's eating their ram.

you can proceed to run some performance tests, maybe we'll realize something new.
but i'd also like to try to think of some automatic mechanism that can be put here instead of that config.

@yoav-steinberg (Contributor) commented Nov 23, 2021

> but i'd also like to try to think of some automatic mechanism that can be put here instead of that config.

How about this (a rough sketch of the grow step follows the list):

  • Clients are allocated with some minimum fixed buf size (let's say 4K).
  • If this buffer is ever maxed out (meaning we added a dynamic allocation to the reply list) then we turn on some CLIENT_BUF_WAS_FULL flag in the client struct.
  • In clientsCron() we periodically check if this flag is on. If it is, we reallocate the client struct so its buf is doubled and replace it in the clients list (this means we can't have this client object referenced anywhere else). After reallocation we zero the flag.
  • We need a maximum too (let's say 16K) and stop growing the client struct after we reach this max. So we need to keep track of the buf size in the client struct.
  • Things to consider:
    • We need a shrinking mechanism to reallocate to a smaller size if between crons we don't use more than half the buffer. So we need another flag, or perhaps the flag should always just record half the buffer size?
    • This design might not be possible because of client struct references. An alternative can be to drop the preallocated buf at the end of the client struct and just make sure we always have an initial entry in the reply list, which we can manage with similar cron-based growing and shrinking logic.
    • The cron intervals might not be good enough? Do we need to add some timestamps to the client struct signifying "last grow/shrink time"?
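
A rough sketch of the grow step of this proposal (not what was eventually merged). CLIENT_BUF_WAS_FULL comes from the comment above; everything else is an illustrative assumption, including declaring buf as a flexible array member at the end of the client struct so the whole struct can be reallocated, and assuming the only external reference to the client is its node in server.clients:

```c
#define CLIENT_BUF_MAX_SIZE (16*1024)

/* Grow path only; returns the (possibly moved) client pointer. */
client *clientsCronGrowReplyBuffer(client *c) {
    if (!(c->flags & CLIENT_BUF_WAS_FULL)) return c;
    c->flags &= ~CLIENT_BUF_WAS_FULL;
    if (c->buf_size >= CLIENT_BUF_MAX_SIZE) return c;

    size_t newsize = c->buf_size * 2;
    /* buf is assumed to be `char buf[]` at the end of the struct, so the
     * whole client can be reallocated to hold a larger embedded buffer. */
    client *nc = zrealloc(c, sizeof(client) + newsize);
    nc->buf_size = newsize;
    /* The struct may have moved; update its reference in server.clients. */
    listNodeValue(nc->client_list_node) = nc;
    return nc;
}
```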

@ranshid (Collaborator, Author) commented Nov 24, 2021

> but i'd also like to try to think of some automatic mechanism that can be put here instead of that config.
>
> How about this:
>
> • Clients are allocated with some minimum fixed buf size (let's say 4K).
> • If this buffer is ever maxed out (meaning we added a dynamic allocation to the reply list) then we turn on some CLIENT_BUF_WAS_FULL flag in the client struct.
> • In clientsCron() we periodically check if this flag is on. If it is, we reallocate the client struct so its buf is doubled and replace it in the clients list (this means we can't have this client object referenced anywhere else). After reallocation we zero the flag.
> • We need a maximum too (let's say 16K) and stop growing the client struct after we reach this max. So we need to keep track of the buf size in the client struct.
> • Things to consider:
>   • We need a shrinking mechanism to reallocate to a smaller size if between crons we don't use more than half the buffer. So we need another flag, or perhaps the flag should always just record half the buffer size?
>   • This design might not be possible because of client struct references. An alternative can be to drop the preallocated buf at the end of the client struct and just make sure we always have an initial entry in the reply list, which we can manage with similar cron-based growing and shrinking logic.
>   • The cron intervals might not be good enough? Do we need to add some timestamps to the client struct signifying "last grow/shrink time"?

Thank you @yoav-steinberg!
I was thinking in the same direction: we can keep track of cases where the buffer was full.
As you said, we can check this counter in clientsCron() and enlarge the buffer when the counter crosses some boundary.
We can also apply some decay function to the counter of buffer-full occasions, and in the same cron job reduce the buffer size when this counter falls below some boundary.
What I am more concerned about is the effect of making this a dynamic buffer in terms of cache. We can relocate the entire client struct, but that would introduce much complexity.
I will now work to produce benchmark results for this buffer size with different payload sizes so that we can understand better and take the correct actions before continuing.

@yoav-steinberg (Contributor) commented Nov 24, 2021

> we can keep track of cases where the buffer was full. as you said we can check this counter in clientsCron() and enlarge the
> buffer when the counter crosses some boundary...

I'm not sure a counter is the right direction here. When are we going to increase this counter? Different traffic patterns will produce different results, and a higher value doesn't necessarily mean a larger buffer is required. My thought was that if the buffer was maxed out it means we will benefit from a larger buffer. This means a simple flag might be enough.

> What I am more concerned about is the effect of making this a dynamic buffer in terms of cache. We can relocate the entire client struct, but that would introduce much complexity.

I'm not worried about cache, because my assumption is that we do the reallocation only periodically, and this interval needs to be large enough to minimize the performance implications, which include both the cache invalidation and the memcpy itself that happens during the reallocation. I think we just need large enough intervals, and in the case of redis between 0.1-1 seconds is probably more than enough (these are the in-practice intervals you'll get with the clientsCron implementation).

> I will now work to produce benchmark...

Excellent! Waiting for the results.

@ranshid (Collaborator, Author) commented Nov 25, 2021

I have run several performance tests.
All tests were done on an m5.2xlarge instance on AWS.
I only wanted to measure GET performance, so I initially filled the data set with 512-byte values over 3750000 keys.
Each test ran for 60 seconds using 5 clients and different pipeline and reply buffer sizes.

| Pipeline | Reply buffer | tps    | p50 (ms) | p95 (ms) |
|----------|--------------|--------|----------|----------|
| P=1      | 1k           | 111097 | 0.039    | 0.047    |
| P=1      | 4k           | 109655 | 0.039    | 0.047    |
| P=1      | 8k           | 113081 | 0.039    | 0.047    |
| P=1      | 16k          | 111977 | 0.039    | 0.047    |
| P=1      | 32k          | 111933 | 0.039    | 0.047    |
| P=5      | 1k           | 229835 | 0.103    | 0.119    |
| P=5      | 4k           | 311151 | 0.071    | 0.087    |
| P=5      | 8k           | 314887 | 0.071    | 0.079    |
| P=5      | 16k          | 315922 | 0.071    | 0.079    |
| P=5      | 32k          | 313676 | 0.071    | 0.079    |
| P=10     | 1k           | 359264 | 0.119    | 0.143    |
| P=10     | 4k           | 357449 | 0.119    | 0.143    |
| P=10     | 8k           | 418340 | 0.103    | 0.119    |
| P=10     | 16k          | 420073 | 0.103    | 0.119    |
| P=10     | 32k          | 417734 | 0.103    | 0.119    |
| P=15     | 1k           | 422487 | 0.151    | 0.175    |
| P=15     | 4k           | 427107 | 0.151    | 0.175    |
| P=15     | 8k           | 482625 | 0.135    | 0.159    |
| P=15     | 16k          | 486129 | 0.135    | 0.151    |
| P=15     | 32k          | 488663 | 0.135    | 0.151    |

@oranagra (Member) commented:
a few random notes:

  1. regarding the benchmark, i think the objects are too small: 512 bytes with a pipeline of 15 is only ~7kb, which is why you can only see a performance difference between buffer sizes of 4k and 8k. (btw, which benchmark tool did you use, and did you make sure the CPU is not saturated on either host in this test?)
  2. we can't just watch whether the reply list was populated, since there are cases of deferred replies, which will use the list even if the reply is small.
  3. a static buffer is faster than a pointer because of the indirection; we must keep a static buffer as part of the client struct and relocate the whole struct when needed, but we can maybe skip that if the client is listed in some list (like blocked, etc).
  4. when resizing, we wanna avoid a spike of many clients resized at the same time, i.e. many clients that connected at the same time and will reach the threshold in the same second.
  5. keep it simple! i think a boolean may be enough; anyway, let's not make it too complicated.

@yoav-steinberg (Contributor) commented:
I think the benchmark does show us how significant splitting the reply between the static buffer and the dynamic list is. Once the reply is split into two buffers we see a major performance impact. This is enough to convince us the original idea of having a big pre-allocated buffer is a good idea.
It might be a good idea to try the tests again, but change the code so the pre-allocated buffer is a dynamic buffer allocated in createClient() and not part of the client struct (in other words, make buf a char*). This will help us assess how important it is to keep the static buffer implementation, which complicates things if the client is referenced elsewhere.
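
A rough sketch of that variant (an assumption for discussion, not this PR's code): buf becomes a heap allocation owned by the client, created in createClient() and released in freeClient():

```c
/* In the client struct, the embedded array
 *     char buf[PROTO_REPLY_CHUNK_BYTES];
 * is replaced by a pointer plus its usable size:
 *     char *buf;
 *     size_t buf_usable_size;
 */

client *createClient(connection *conn) {
    client *c = zmalloc(sizeof(client));
    /* ... existing initialization ... */
    c->buf = zmalloc(PROTO_REPLY_CHUNK_BYTES);
    c->buf_usable_size = zmalloc_usable_size(c->buf);
    c->bufpos = 0;
    /* ... */
    return c;
}

void freeClient(client *c) {
    /* ... existing cleanup ... */
    zfree(c->buf);
    zfree(c);
}
```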

@ranshid (Collaborator, Author) commented Nov 28, 2021

> a few random notes:
>
> 1. regarding the benchmark, i think the objects are too small: 512 bytes with a pipeline of 15 is only ~7kb, which is why you can only see a performance difference between buffer sizes of 4k and 8k. (btw, which benchmark tool did you use, and did you make sure the CPU is not saturated on either host in this test?)
> 2. we can't just watch whether the reply list was populated, since there are cases of deferred replies, which will use the list even if the reply is small.
> 3. a static buffer is faster than a pointer because of the indirection; we must keep a static buffer as part of the client struct and relocate the whole struct when needed, but we can maybe skip that if the client is listed in some list (like blocked, etc).
> 4. when resizing, we wanna avoid a spike of many clients resized at the same time, i.e. many clients that connected at the same time and will reach the threshold in the same second.
> 5. keep it simple! i think a boolean may be enough; anyway, let's not make it too complicated.

thank you @oranagra
1/ I agree the value size has a major effect on the results. I chose 512 bytes since in our internal statistics this reflects the majority of value sizes used. Also note that in my current implementation the reply list still grows in at least 16K increments (which is also something to consider), so I did expect to see only a single performance drop point for each test case. I can repeat the test with much bigger value sizes, but I think the general concept should be clear from these results.
Per your question - I used redis-benchmark -P -c 5 -t get -r 3750000 -n 3750000 and an additional test-duration parameter which was introduced in the benchmark code to limit the test time. The CPU was operating at a high rate, but I still think we can get a clear grasp of the performance degradation from these results; however, I can perform more tests in order to satisfy our view of the potential degradation.
2/ Also agree here: the main trigger should consider the buffer state and not the existence of the reply list. When I get to the detailed design I will address all the issues.
3/ That was also my earlier callout regarding reallocating the client struct; however, this might greatly complicate the implementation, which raises the question of whether it is worth it.
Keeping special logic for clients that are referenced elsewhere will be very hard to maintain and will introduce a risk of bugs in some scenarios.
I really support avoiding the indirection, but let's first establish its cost.
4+5/ 100% agree

@ranshid (Collaborator, Author) commented Nov 28, 2021

> I think the benchmark does show us how significant splitting the reply between the static buffer and the dynamic list is. Once the reply is split into two buffers we see a major performance impact. This is enough to convince us the original idea of having a big pre-allocated buffer is a good idea. It might be a good idea to try the tests again, but change the code so the pre-allocated buffer is a dynamic buffer allocated in createClient() and not part of the client struct (in other words, make buf a char*). This will help us assess how important it is to keep the static buffer implementation, which complicates things if the client is referenced elsewhere.

I agree with you @yoav-steinberg. I did perform a comparative test of dynamic vs. static buffer,
again with preallocated 512-byte values in the DB:
redis-benchmark -P 30 -c 5 -t get -r 3750000 -n 3750000

| Variant | Reply buffer | tps    | p50 (ms) | p95 (ms) |
|---------|--------------|--------|----------|----------|
| Static  | 4k           | 542377 | 0.239    | 0.271    |
| Static  | 16k          | 586762 | 0.223    | 0.255    |
| Dynamic | 4k           | 540891 | 0.239    | 0.271    |
| Dynamic | 16k          | 585845 | 0.223    | 0.255    |

As you can see, the results currently do not reflect any issue with the dynamic buffer (less than 1% TPS degradation).
But I might want to run a more "realistic" test with many client machines and 3K clients in order to better support this assumption.

@yoav-steinberg (Contributor) commented Nov 28, 2021

> As you can see, the results currently do not reflect any issue with the dynamic buffer (less than 1% TPS degradation).
> But I might want to run a more "realistic" test with many client machines and 3K clients in order to better support this assumption.

This is good news. This means we might be able to have a single preallocated buf which we can grow/shrink periodically without worrying about the client struct being referenced in other places.

> when resizing, we wanna avoid a spike of many clients resized at the same time, i.e. many clients that connected at the same time and will reach the threshold in the same second.

Regarding reallocating many buffers at the same time, this is mitigated by the fact that clientsCron() doesn't handle all clients in the same cycle, precisely to avoid such spikes for any per-client cron work. And if this isn't enough, we can limit the number of clients processed per cron cycle even further. This should probably be enough.
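
For context, this is roughly how clientsCron() spreads per-client work across cycles in the existing code (a simplified sketch from memory; details omitted and approximate):

```c
/* clientsCron() is called server.hz times per second and processes at most
 * numclients/server.hz clients per call (with a small minimum), so every
 * client is visited roughly once per second and expensive per-client work,
 * such as a buffer realloc, is naturally spread out over time. */
void clientsCron(void) {
    int numclients = listLength(server.clients);
    int iterations = numclients / server.hz;
    if (iterations < CLIENTS_CRON_MIN_ITERATIONS)
        iterations = (numclients < CLIENTS_CRON_MIN_ITERATIONS) ?
                     numclients : CLIENTS_CRON_MIN_ITERATIONS;

    while (listLength(server.clients) && iterations--) {
        /* Rotate the list and take the head, so over time all clients are visited. */
        listRotateTailToHead(server.clients);
        listNode *head = listFirst(server.clients);
        client *c = listNodeValue(head);
        /* ... existing per-client checks (timeouts, query buffer resize, ...) use c ... */
    }
}
```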

@oranagra (Member) commented:
I want to stress two points from my past discussions with Salvatore.

  1. when i suggested getting rid of the static buffer and replacing it with a pointer, IIRC he argued that this would degrade performance (i think he mentioned cache efficiency, which i'm not sure is right, but it might be the indirection). so my point is that if we go this way we need to really test this carefully, maybe even on smaller replies, like a pipeline of PINGs.
  2. IIRC (you can validate that in git blame), clientsCronResizeQueryBuffer has some limit not to try to resize buffers that are smaller than 4kb. this condition was added after complaints that this function caused latency issues (despite the fact we only process a certain amount of clients per cron).

@ranshid (Collaborator, Author) commented Nov 29, 2021

> I want to stress two points from my past discussions with Salvatore.
>
> 1. when i suggested getting rid of the static buffer and replacing it with a pointer, IIRC he argued that this would degrade performance (i think he mentioned cache efficiency, which i'm not sure is right, but it might be the indirection). so my point is that if we go this way we need to really test this carefully, maybe even on smaller replies, like a pipeline of PINGs.
> 2. IIRC (you can validate that in git blame), clientsCronResizeQueryBuffer has some limit not to try to resize buffers that are smaller than 4kb. this condition was added after complaints that this function caused latency issues (despite the fact we only process a certain amount of clients per cron).

Thank you @oranagra!
1/ I was originally also concerned about the TLB/cache effect of the indirection. I totally agree that we need to validate that moving to a dynamically allocated pointer buffer will not introduce degradation due to the cache/indirection effects.
In AWS we have a specific benchmark setup which I have already started to adjust for this testing purpose. I will circle back with the results ASAP.
2/ ACK - maybe we will keep a lower limit of 4KB. I will make sure to test small buffer sizes in the tests.

@ranshid (Collaborator, Author) commented Dec 14, 2021

Hi @oranagra and @yoav-steinberg,
I am sorry for the long delay - I was held back due to some personal + work obligations.

In order to support the dynamic reply buffer I have implemented a very simple POC which does the following (a rough sketch follows the list):

  1. allocate an initial 1024-byte buffer.
  2. when the reply buffer is filled, we set a flag to increase the buffer size x2 (done in the cron job)
  3. when the reply buffer is fully written and the bytes written are at most 1/2 the buffer size, we set a flag to shrink it by 1/2 (done in the cron job)
  4. when the client is idle for IDLE-TIME secs, we set a flag to shrink it by 1/2 (done in the cron job)
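
A minimal sketch of this POC logic (flag and field names here are illustrative assumptions, not the code actually pushed in this PR):

```c
#define CLIENT_REPLY_BUF_GROW   (1<<0)
#define CLIENT_REPLY_BUF_SHRINK (1<<1)

/* Called whenever a write into the fixed reply buffer completes. */
void markReplyBufferUsage(client *c) {
    if ((size_t)c->bufpos >= c->buf_usable_size)
        c->reply_buf_flags |= CLIENT_REPLY_BUF_GROW;    /* buffer filled: grow x2 */
    else if ((size_t)c->bufpos <= c->buf_usable_size / 2)
        c->reply_buf_flags |= CLIENT_REPLY_BUF_SHRINK;  /* at most half used: shrink by 1/2 */
}

/* Called from clientsCron() for each visited client. */
void clientsCronAdjustReplyBuffer(client *c, time_t now, time_t idle_secs) {
    if (now - c->lastinteraction >= idle_secs)          /* idle client: request a shrink too */
        c->reply_buf_flags |= CLIENT_REPLY_BUF_SHRINK;
    if (c->reply_buf_flags & CLIENT_REPLY_BUF_GROW)
        c->buf = zrealloc(c->buf, c->buf_usable_size * 2);
    else if (c->reply_buf_flags & CLIENT_REPLY_BUF_SHRINK)
        c->buf = zrealloc(c->buf, c->buf_usable_size / 2);
    c->buf_usable_size = zmalloc_usable_size(c->buf);
    c->reply_buf_flags = 0;
}
```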

I have performed some benchmark tests to validate the dynamic buffer overhead.
All tests were performed on:

  • m5.2xlarge instance
  • 3 io-threads (main + 2 threads)
  • 468750 keys with 4K values
  • 15 c5n.2xlarge client machines running single-threaded redis-benchmark
  • each redis-benchmark running 50 clients issuing only GET requests for 900 seconds

In the first test I compared 3 variations:

  1. STATIC - the current static buffer
  2. NON-STATIC - same as the current implementation, only the buffer is pre-allocated on the heap during client creation
  3. DYNAMIC - the POC version described above.
|                 | NON-STATIC | DYNAMIC | STATIC |
|-----------------|------------|---------|--------|
| TPS             | 163985     | 163945  | 162103 |
| P90 Latency(ms) | 6.62       | 6.09    | 6.3    |

In the second test I ran the STATIC vs. DYNAMIC variations, this time using a pipeline of 5 commands,
and set IDLE-TIME to one of {2,5,7,10} seconds.
The goal was to locate the optimal IDLE-TIME for shrinking the buffer.
The results are:

|                 | TPS    | P90 Latency(ms) |
|-----------------|--------|-----------------|
| STATIC          | 253523 | 13.103          |
| DYNAMIC/10 secs | 263910 | 13.96           |
| DYNAMIC/7 secs  | 247233 | 16.56           |
| DYNAMIC/5 secs  | 237149 | 17.103          |
| DYNAMIC/2 secs  | 231911 | 17.5            |

IMO the pipeline scenario illustrates a very intensive workload; however, I think there is no special reason not to define a high idle time for shrinking the buffer.
I think the results indicate the dynamic buffer will not introduce a performance degradation, though the dynamic shrinking might introduce some in specific workloads (large values + many clients).
I do not see any reason why the idle time should not be high, since the main purpose is to support long-term idle connections.

Please share your thoughts, and if we agree to continue I can introduce a more structured version of my POC.

@yoav-steinberg (Contributor) commented:
Thank you @ranshid, this looks promising. A few questions:

  • I'm not sure I understand the logic behind the buffer shrinkage. Between cron intervals there might be many cases where we've written 1/2 the buffer, but also many cases where we filled it completely, so you'll have both flags set. What do you do? I think what should be done is that if at the beginning of the cron interval the buffer isn't half full, we can turn on a flag saying "POTENTIAL_TO_SHRINK"; if during the interval we pass half the buffer then we turn off the flag. If by the next interval the flag is still on, we can shrink.
  • I didn't understand what you mean by "IDLE_TIME". Is this the time of no activity on that client? If so, how does this relate to your benchmarks, because surely there's some activity for all clients within the given intervals you checked (2-10 sec). Why do we need an idle-time shrinking mechanism if we can shrink based on half-buffer usage anyway?

@ranshid (Collaborator, Author) commented Dec 15, 2021

> Thank you @ranshid, this looks promising. A few questions:
>
> • I'm not sure I understand the logic behind the buffer shrinkage. Between cron intervals there might be many cases where we've written 1/2 the buffer, but also many cases where we filled it completely, so you'll have both flags set. What do you do? I think what should be done is that if at the beginning of the cron interval the buffer isn't half full, we can turn on a flag saying "POTENTIAL_TO_SHRINK"; if during the interval we pass half the buffer then we turn off the flag. If by the next interval the flag is still on, we can shrink.

[ranshid] @yoav-steinberg, currently for this POC I only allowed one flag to override the other, so the last flag set before the cron run was serviced and cleared after the buffer adjustment. This was mainly for the POC, as I knew all key values in my POC were the same. It is true that a potential logic would be to always clear the shrink flag in case we write more than half the buffer (or maybe some other logic).

> • I didn't understand what you mean by "IDLE_TIME". Is this the time of no activity on that client? If so, how does this relate to your benchmarks, because surely there's some activity for all clients within the given intervals you checked (2-10 sec). Why do we need an idle-time shrinking mechanism if we can shrink based on half-buffer usage anyway?

[ranshid] Well, I think shrinking the buffer for an idle client is important: for example, in case the last write on the client used more than half the buffer, there will be no more writes and the buffer will never be shrunk. I think the query buffer acts the same after a client is idle for 2 seconds.
I have also checked the statistics I added for the POC to monitor the number of expansions and shrinks during the IDLE-TIME=2 and IDLE-TIME=7 tests.
For IDLE-TIME=7:

reply_buffer_shrink:132
reply_buffer_idle_shrink:10
reply_buffer_idle_expends:0

while for the IDLE-TIME=2:

reply_buffer_shrink:314
reply_buffer_idle_shrink:192
reply_buffer_expends:186

I think that even though the numbers are not high, given that the tests are identical in terms of setup and workload, we should expect no expansions at all.

BTW: I was wrong before when claiming the initial buffer size is 1KiB. In my POC it is 16KiB...

@yoav-steinberg (Contributor) commented:
According to this design proposal:

> I think what should be done is that if at the beginning of the cron interval the buffer isn't half full, we can turn on a flag saying
> "POTENTIAL_TO_SHRINK"; if during the interval we pass half the buffer then we turn off the flag. If by the next interval the flag is still on, we can shrink.

You won't need the idle mechanism because shrinking will be handled in the cron even when there's no traffic.

In any case I still don't understand why you had any idle time in your benchmarks. I also think that your benchmark might be shrinking the buffer too often in case both flags contradict. I might be wrong, because I have a feeling I don't totally understand your logic in the POC. But since the results seem promising I suggest you create a (possibly draft) PR of your code so we can review the algorithm. @oranagra does this make sense?

@ranshid (Collaborator, Author) commented Dec 20, 2021

@yoav-steinberg @oranagra - I pushed the latest POC implementation of the dynamic reply buffer size.
The main changes are:

  1. maintain a per-client buf_peak - I think a better option than observing the last write position is to maintain the peak observed between cron runs. I reset the peak after each cron iteration on the client, and thus I am able to drop the IDLE check.
  2. the shrink/expand logic runs only in the cron job, according to the following rules:
    • in case the last observed peak was equal to the buffer size - double the buffer size
    • in case the last observed peak was less than half the buffer size - shrink the buffer by half
    • otherwise do nothing
  3. I added counters for the number of expands and shrinks (global counters) and added them to the info stats.
  4. added some reply buffer info to CLIENT LIST.

I have retested the benchmark results compared to the static buffer case:

|                 | DYNAMIC | STATIC |
|-----------------|---------|--------|
| TPS             | 164353  | 164667 |
| P90 Latency(ms) | 6.1     | 6.1    |

@yoav-steinberg (Contributor) commented:
@ranshid Can you update the PR with your new design (instead of the config option for buffer size) and update the title accordingly?

@ranshid ranshid changed the title from "introduce dynamic client size config" to "introduce dynamic client reply buffer size" Dec 22, 2021
@ranshid (Collaborator, Author) commented Dec 22, 2021

> @ranshid Can you update the PR with your new design (instead of the config option for buffer size) and update the title accordingly?

@yoav-steinberg done.
I also made some cache-miss analysis and added it to the PR.
Thank you for reviewing this!

@oranagra (Member) left a comment:

some random comments (can be ignored), and a small indentation fix.
i think this one is ready to be merged.

@oranagra oranagra added the state:major-decision (Requires core team consensus) label Feb 21, 2022
@oranagra oranagra changed the title from "introduce dynamic client reply buffer size" to "introduce dynamic client reply buffer size - save memory on idle clients" Feb 21, 2022
@oranagra (Member) commented:

@redis/core-team please approve.
there are some interface (introspection) changes, in addition of course to the new mechanism, which has no interface but is a major change with some potential for issues.

@oranagra oranagra added the release-notes (indication that this issue needs to be mentioned in the release notes) label Feb 21, 2022
@oranagra oranagra moved this from In Review to Awaits merge in 7.0 Feb 21, 2022
@yossigo (Member) left a comment:

LGTM, some typos.

ranshid and others added 8 commits on February 21, 2022 (co-authored by Yossi Gottlieb <yossigo@gmail.com>)
@oranagra oranagra merged commit 47c51d0 into redis:unstable Feb 22, 2022
@oranagra oranagra moved this from Awaits merge to Unreleased in 7.0 Feb 22, 2022
ranshid added a commit to ranshid/redis that referenced this pull request Feb 28, 2022
After introducing redis#9822 need to prevent client reply buffer shrink
to maintain correct client memory math.
oranagra added a commit that referenced this pull request Feb 28, 2022
…10354)

After introducing #9822 need to prevent client reply buffer shrink
to maintain correct client memory math.

add needs:debug missing on one test.

Co-authored-by: Oran Agra <oran@redislabs.com>
@oranagra oranagra mentioned this pull request Feb 28, 2022
@oranagra oranagra moved this from Unreleased to Done in 7.0 Mar 1, 2022
oranagra added a commit that referenced this pull request Mar 8, 2022
since #9822, the static reply buffer is no longer part of the client structure, so we need to dismiss it.