[FR] [STREAMS] Control Stream Size via Config or Command #10270
Comments
A few random notes:
@oranagra Thanks for the reply.
Actually, this is exactly my use case: we are using Debezium Server to push data into Redis, and there are thousands of events coming in every hour, of all different sizes. So having this global config would be really nice (even if it sacrifices performance, it's better than going OOM).
Are you using the Redis Debezium sink to push the events into Redis? We just pushed a PR for the Debezium Redis sink that pauses writing to Redis when memory is full and resumes once memory is available; see debezium/debezium#3185. Would that address your specific use case?
@zalmane No. Basically, I think there should be a feature that automatically controls the size/length of streams. For example, you would set something like a maximum stream size in the config. So it's more about avoiding hitting the memory limit in the first place, through configuration, rather than deciding what to do when the memory limit is hit.
To be clear, I think this should be added:
In addition, I also think these should be added:
Or something similar. @oranagra Please let me know if there should be a separate feature request for these.
A MAXSIZE feature for XADD and XTRIM sounds reasonable to me (it's basically an additional trimming threshold). A global config that controls the maximum size of each stream is essentially the same as the above, i.e. XADD can implicitly use that value when its MAXSIZE argument is missing (assuming no one expects us to scan all streams if/when the config changes retroactively). A global setting that affects the total size of all streams is far more problematic, specifically if we also expect "eviction" to be able to find some stream (based on some criteria) and trim it.
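For reference, a minimal redis-py sketch (assuming a local Redis and a hypothetical stream key mystream) of the trimming thresholds XADD already supports today; the proposed MAXSIZE would sit alongside them as a byte-based threshold:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Today's trimming thresholds on XADD: MAXLEN (entry count) and, on Redis >= 6.2,
# MINID. The proposed MAXSIZE would be an additional, byte-based threshold.
# approximate=True ("~") lets Redis trim lazily, node by node, which is cheaper.
r.xadd("mystream", {"sensor": "42"}, maxlen=10000, approximate=True)
```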
I fully agree that a top-level config isn't aligned with the current design. Assuming that the stream's entries are (more or less) equal in size, this is just syntactic sugar for MAXLEN with the right factor. If the entries' size variance is significant, a MAXSIZE could have negative impacts. /cc @guybe7
Most use cases of trimming are about limiting the amount of memory used (I assume), so wouldn't using memory as the factor make more sense than trying to guess/calculate the appropriate length? Can you elaborate on the negative effects? I don't quite understand.
Given a stream capped to 100 memory units, where entries' sizes vary between 1 and 100 units, a new entry will require evicting an unknown number of entries.
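A tiny illustrative simulation (plain Python, made-up numbers) of that point: with a 100-unit cap and entry sizes varying between 1 and 100, the number of entries evicted to admit one new entry is unpredictable.

```python
import random
from collections import deque

CAP = 100          # hypothetical stream cap, in "memory units"
stream = deque()   # (entry_id, size) pairs, oldest first
used = 0

for entry_id in range(1000):
    size = random.randint(1, 100)  # entries vary wildly in size
    evicted = 0
    # To admit one new entry we may have to evict anywhere from zero entries
    # to everything currently in the stream -- the count is unknown up front.
    while stream and used + size > CAP:
        _, old_size = stream.popleft()
        used -= old_size
        evicted += 1
    stream.append((entry_id, size))
    used += size
    if evicted > 1:
        print(f"entry {entry_id} (size {size}) forced {evicted} evictions")
```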
I don't think limiting stream size can avoid hitting OOM or evicting keys on the Redis side.
So, IMHO, limiting a stream by size does not have a significant benefit compared to limiting it by the number of entries (of course, limiting the stream by the number of entries cannot avoid OOM either). All OOM-related issues should be handled at the infrastructure layer, not by Redis itself (for example, an early warning about memory usage).
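A minimal sketch of that kind of infrastructure-level early warning with redis-py (the host, the 80% threshold, and the alerting mechanism are all assumptions):

```python
import redis

r = redis.Redis(host="localhost", port=6379)
WARN_RATIO = 0.8  # assumed alert threshold

info = r.info("memory")
used = info["used_memory"]
maxmemory = info["maxmemory"]  # 0 means no maxmemory limit is configured

if maxmemory and used / maxmemory >= WARN_RATIO:
    # Replace with real alerting (Slack, PagerDuty, logs, ...).
    print(f"WARNING: Redis at {used / maxmemory:.0%} of maxmemory "
          f"({used} / {maxmemory} bytes)")
```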
Why would we need to re-configure the size when we add a new stream? This would be a global setting, so it's not stream-specific.
This would never happen. Whenever you add new messages to a stream, I would expect some (async or sync) job to check the stream size and evict from the back of the largest stream. This depends on how it's implemented, but that situation should never happen.

@itamarhaber @yossigo @oranagra My primary complaint is that Redis Streams should have a better way to control the size of streams internally. Redis stores all data in memory; if you have millions of events per hour coming through a Redis Stream (like most enterprise use cases), there should be a way to keep events flowing through Redis by deleting old events based on the size of the stream, not its length.

Currently, I use a script to get memory usage and XTRIM based on that (I also heard RedisGears can do this), but it definitely shouldn't be done at the application level, because the application doesn't know (and shouldn't know) how much memory Redis has. And say I want to increase or decrease Redis memory and/or stream memory: this would need to be done at the application layer, which is not good.
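For context, such a workaround script could look roughly like the following redis-py sketch (the stream key, the byte budget, and the assumption that entries are roughly equal in size are all mine, not from the thread):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

STREAM = "events"                 # assumed stream key
TARGET_BYTES = 512 * 1024 * 1024  # assumed per-stream budget: 512 MB

used = r.memory_usage(STREAM) or 0   # MEMORY USAGE of the stream key
length = r.xlen(STREAM)

if used > TARGET_BYTES and length:
    # Rough estimate: assume entries are about equal in size, then trim the
    # oldest entries so the stream should fit the budget again.
    avg_entry = used / length
    keep = max(int(TARGET_BYTES / avg_entry), 1)
    r.xtrim(STREAM, maxlen=keep, approximate=True)
```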
@itamarhaber @yossigo @oranagra Any updates on this? I want to know whether you see my point about why I think Redis Streams is missing a really useful feature.
@bubbajoe If we assume the stream gets its own instance, then a possible solution could be an eviction policy that trims the streams rather than deleting whole keys. I'm not sure how common this use case is, though.
We might want to consider supporting more advanced types of "eviction" policies in future versions. I've actually heard a number of AWS customers grumble that a global LRU/LFU is limiting; for example, they want to evict all items starting with X first, and only once that is done consider the remaining items in LRU fashion. This will require more thought on how to make it efficient. I'm not sure I like the idea of an XADD variant that isn't marked with use-memory and will trim the stream to make memory available, and it also doesn't seem to match the outlined use case very well. I would actually prefer we figure out MAXSIZE on XADD. Reminds me of #10152.
This is exactly my point: once a stream hits the maximum size (whether you are using it for caching, streaming, or both, this issue will happen at some point, whether you have 4 GB or 32 GB), how can you free up that memory? By XTRIM or XADD with trimming, right? There should be another way to do this that doesn't require clients to implement this confusing logic in their producers.

In a perfect world, XADD with trimming would be enough. But take Kafka, for example: there are many integrations built by many people, and you can't always expect them to expose XADD trim options. Kafka offers many configurations, including retention policies, which are really important, especially for Redis, considering all the data needs to fit in memory.

As I mentioned before, having a client do this is not good, especially when scaling Redis horizontally across internal services. I think someone also mentioned that this is possible with RedisGears, but I believe this functionality is a missing feature in Redis Streams.
I don't know about the best implementation; I just think there should be a feature to automatically remove messages from a stream to free up memory. And maybe allow a different max-memory setting per stream as well as one for all streams?
I think I have the same requirement: when maxmemory is reached, instead of removing the whole stream (because eviction removes whole keys), just remove the earliest messages from the streams. The use case is:
I ended up writing a script for this, but I really wish someone from the Redis project would understand that stream configuration like trimming shouldn't be handled by a consumer (or even a producer, IMHO). It's just weird; a consumer/producer shouldn't need to care about stuff like that. It should be able to blindly push/pull data, and the configuration for the stream should be set elsewhere. They need to take a look at how Kafka does it, while keeping in mind the limitations of Redis for streaming.
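For illustration, a Kafka-style, time-based retention job that runs outside of the producers and consumers could look roughly like this redis-py sketch (the stream key and retention window are assumptions; MINID trimming requires Redis 6.2+ and a recent redis-py):

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)

STREAM = "events"                  # assumed stream key
RETENTION_SECONDS = 24 * 60 * 60   # assumed retention window: 24 hours

# Stream IDs begin with a millisecond timestamp, so a time-based retention
# policy can be expressed as "drop everything older than this ID".
cutoff_ms = int((time.time() - RETENTION_SECONDS) * 1000)
r.xtrim(STREAM, minid=cutoff_ms, approximate=True)
```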
We understand that for a database mainly composed of (large) streams, it makes sense to ask to limit memory by trimming the streams rather than evicting a complete key.
The problem/use-case that the feature addresses
We use Redis for streaming and caching, but the issue is that once you have millions of events going through Redis, Redis will start to delete keys from memory. This behavior is not wanted; what if we could control the size (sum in bytes) of all streams?

Description of the feature
How about adding this to the config?
stream-maxmemory 1gb - all streams combined are limited to this size
stream-maxmemory -1 - no limit (what is currently being done)
stream-maxmemory-policy delete-keys - delete keys when the memory limit is hit (what is currently being done)
stream-maxmemory-policy delete-messages - delete messages from the tail of the stream when the memory limit is hit
stream-maxmemory-policy restrict - XADD is disabled when the memory limit is hit

Alternatives you've considered
This could also be a command for a specific stream, for example:
XSIZE MAXSIZE 500mb POLICY (restrict|messages|keys) STREAMS stream1 stream2
Also, we have spun up a second Redis instance to avoid this issue, but I think this approach is better. Potentially, this could replace XTRIM for most use cases.