I have a question regarding how (or if) tiered Redis caches can be configured, and what the word "cache" means. I have done some searching and LLM-summary reading, but this is a keyword-dense question and I found no suitable answers.
I'm coming from the world of DNS caches and web caches, so apologies if my nomenclature is not consistent.
I have used trivial Redis instances for many years, but running Redis at this scale is fairly new to me. I have an application that requires a layered "cache" model. By this I mean that an application sends a query against a SET to a local Redis instance. Each SET holds a fairly large number of members, anywhere from a few dozen to a few million. There are thousands of SETs today, and that will grow to tens or hundreds of thousands as this scales.
I have:
- limited memory on the edge nodes, such that I cannot store all SETs in every distributed cache
- small numbers of users who are only interested in a small sub-section of the full list of SETs that exist (I don't need all SETs in all locations)
- a requirement for very fast answers to SISMEMBER queries against a SET (0.1 ms or less)
- no way to tell what the needs are at any site until the actual questions are being asked of Redis (highly diverse and non-predictable data access)
- a strong interest in using a single API (Redis) for all parts of this system, instead of customizing my own code to provide efficiency
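To make the memory constraint concrete, here is a minimal sketch of what "only keep the SETs that are in use, within a fixed budget" could look like on an edge node. It is plain Python with no Redis connection; `BoundedSetCache` and `max_members` are hypothetical names chosen for illustration, and counting members is only a rough stand-in for real memory accounting.

```python
from collections import OrderedDict


class BoundedSetCache:
    """Holds whole sets and evicts the least recently used ones
    once the total member count exceeds a budget (hypothetical helper)."""

    def __init__(self, max_members):
        self.max_members = max_members
        self._sets = OrderedDict()  # key -> frozenset, least recently used first
        self._used = 0              # total members currently held

    def get(self, key):
        """Return the cached set for key, or None on a miss."""
        members = self._sets.get(key)
        if members is not None:
            self._sets.move_to_end(key)  # mark as most recently used
        return members

    def put(self, key, members):
        """Store a full copy of a set, evicting older sets if over budget."""
        members = frozenset(members)
        if key in self._sets:
            self._used -= len(self._sets.pop(key))
        self._sets[key] = members
        self._used += len(members)
        # Evict least recently used sets, but never the one just stored.
        while self._used > self.max_members and len(self._sets) > 1:
            _, evicted = self._sets.popitem(last=False)
            self._used -= len(evicted)

    def keys(self):
        return list(self._sets)


cache = BoundedSetCache(max_members=5)
cache.put("a", {1, 2, 3})
cache.put("b", {4, 5})
cache.get("a")            # touch "a" so that "b" becomes the oldest entry
cache.put("c", {6, 7})    # over budget, so "b" is evicted
```

A set larger than the whole budget is still stored by this sketch (it just pushes everything else out); whether that is acceptable depends on the deployment.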
I'm wondering if there has ever been any consideration of doing partial replication between Redis instances based on query patterns, not as a universal duplication of all data. This would look similar to how DNS caches work today: a request for an object comes in, and the local cache is examined. If no answer exists in the cache, then that single query is sent onwards to (probably) another cache. If that cache does not have the answer, it sends the query onwards to other caches or to authoritative servers (this is only one diagram of many possible layerings of DNS caches; let's use it for this discussion). Only what is needed is stored, by all members in the query path.
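The DNS-style lookup path described above can be sketched as a chain of tiers, each of which answers from its own copy or asks the next tier up and keeps what comes back. This is an in-memory illustration only, not a Redis feature: `Tier` is a hypothetical class, and its `sismember`/`smembers` methods merely borrow the names of the real Redis commands.

```python
class Tier:
    """One layer in a read-through chain (edge -> regional -> origin).

    A tier with no upstream is treated as authoritative.
    """

    def __init__(self, name, upstream=None, data=None):
        self.name = name
        self.upstream = upstream
        self.sets = {k: frozenset(v) for k, v in (data or {}).items()}
        self.upstream_fetches = 0  # how many times this tier had to ask upstream

    def smembers(self, key):
        if key not in self.sets:
            if self.upstream is None:
                # Authoritative tier: an unknown key is an empty set.
                return frozenset()
            # Miss: fetch the whole set from the next tier and keep a copy.
            # An empty result is cached too (a form of negative caching).
            self.upstream_fetches += 1
            self.sets[key] = frozenset(self.upstream.smembers(key))
        return self.sets[key]

    def sismember(self, key, member):
        return member in self.smembers(key)


origin = Tier("origin", data={"s1": {"a", "b"}, "s2": {"c"}})
regional = Tier("regional", upstream=origin)
edge = Tier("edge", upstream=regional)

edge.sismember("s1", "a")   # first query walks edge -> regional -> origin
edge.sismember("s1", "z")   # answered from the edge's own copy of s1
```

After these two queries the edge and regional tiers each hold `s1` and nothing else; `s2` stays only at the origin. The sketch ignores freshness and eviction entirely.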
What I'm missing in my understanding of the Redis model is that there seems to be no "cache" feature between Redis instances. There is only "replicate," and "replicate" implies that 100% of the data on one primary server is transmitted to the subsidiary servers. This seems highly inefficient. Why not just wait until there is a query at the edge node, then relay that to the central system, which would reply (possibly with a "fast response" to that exact question) and then start replicating the data required to provide that answer? In this way the Redis instances at the edge of the diagram would hold exactly as much data as they needed to answer the clients connecting to them, and no more.
The replication could have a TTL, such that the primary server would keep sending data updates to the subsidiary edge server until a timer expired. At that point the cached data in the edge server would be deleted, the replication would stop, and resources would be spent only on actively used data that required replication.
The downside is that the first queries to a SET might be a bit slow, as they would hit an empty cache, or would be answered by the central Redis instance before the full replication became available to the edge node. This may be acceptable, or may be mostly mitigated by some priming queries; that is a minor detail for application-layer implementation.
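The TTL idea above amounts to a lease: the edge asks for a SET, the primary hands over a snapshot and pushes later changes until the lease runs out. A rough in-memory sketch follows, under several assumptions: `Primary`, `Edge`, `subscribe`, and `apply_add` are hypothetical names, not Redis APIs; only additions are pushed; the lease is fixed-length rather than renewed on access; and the clock is injected so the behaviour can be exercised without waiting.

```python
class Primary:
    """Authoritative store that pushes updates to edges holding a live lease."""

    def __init__(self, clock):
        self.clock = clock
        self.sets = {}    # key -> set of members
        self.leases = {}  # key -> {edge: lease expiry time}

    def subscribe(self, key, edge, expiry):
        """Register a lease and return a snapshot copy of the set."""
        self.leases.setdefault(key, {})[edge] = expiry
        return set(self.sets.get(key, ()))

    def sadd(self, key, member):
        self.sets.setdefault(key, set()).add(member)
        now = self.clock()
        holders = self.leases.get(key, {})
        for edge, expiry in list(holders.items()):
            if expiry > now:
                edge.apply_add(key, member)  # lease still live: push the update
            else:
                del holders[edge]            # lease expired: stop replicating


class Edge:
    """Edge node that holds a set only while its lease is valid."""

    def __init__(self, primary, lease_seconds, clock):
        self.primary = primary
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.sets = {}     # key -> local copy of the set
        self.expires = {}  # key -> lease expiry time

    def sismember(self, key, member):
        now = self.clock()
        if key in self.sets and self.expires[key] <= now:
            self.drop(key)  # lease ran out: discard the local copy
        if key not in self.sets:
            expiry = now + self.lease_seconds
            self.sets[key] = self.primary.subscribe(key, self, expiry)
            self.expires[key] = expiry
        return member in self.sets[key]

    def apply_add(self, key, member):
        if key in self.sets:
            self.sets[key].add(member)

    def drop(self, key):
        self.sets.pop(key, None)
        self.expires.pop(key, None)


now = [0.0]                      # fake clock, advanced by hand
clock = lambda: now[0]
primary = Primary(clock)
primary.sadd("s", "a")
edge = Edge(primary, lease_seconds=10, clock=clock)
edge.sismember("s", "a")         # miss: subscribes and pulls a snapshot
primary.sadd("s", "b")           # pushed to the edge while the lease is live
```

In this sketch the edge only notices an expired lease on its next read, so the stale copy occupies memory until then; a real design would need a sweeper or an eviction policy, plus handling for removals and for lost pushes.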
Again: my concept of "cache" perhaps is different. A DNS server does not cache the entire DNS - it only caches things that have been recently asked. A web cache does not store the entire Internet - it just caches things that have been recently requested. Cannot this model work with Redis?
Note that this is slightly different from caching the answer to a query, which can be done in the application and outside of Redis. With very large SETs, one would need to duplicate the underlying data that provides the answers (and keep it fresh and updated), rather than just caching the individual answers that came from queries against that data.
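The distinction between caching answers and caching the underlying data can be shown with a small in-memory comparison. All names here (`Origin`, `AnswerCache`, `DataCache`) are hypothetical; `Origin` stands in for the central Redis instance and simply counts round trips.

```python
class Origin:
    """Stand-in for the central store; counts how often it is contacted."""

    def __init__(self, sets):
        self.sets = sets
        self.round_trips = 0

    def sismember(self, key, member):
        self.round_trips += 1
        return member in self.sets.get(key, set())

    def smembers(self, key):
        self.round_trips += 1
        return set(self.sets.get(key, set()))


class AnswerCache:
    """Caches individual (key, member) -> bool answers."""

    def __init__(self, origin):
        self.origin = origin
        self.answers = {}

    def sismember(self, key, member):
        if (key, member) not in self.answers:
            self.answers[(key, member)] = self.origin.sismember(key, member)
        return self.answers[(key, member)]


class DataCache:
    """Caches a full copy of each set that has been asked about."""

    def __init__(self, origin):
        self.origin = origin
        self.sets = {}

    def sismember(self, key, member):
        if key not in self.sets:
            self.sets[key] = self.origin.smembers(key)
        return member in self.sets[key]


origin_a = Origin({"blocked": {"a", "b"}})
origin_d = Origin({"blocked": {"a", "b"}})
answers = AnswerCache(origin_a)
data = DataCache(origin_d)
for m in ("a", "b", "zzz"):
    answers.sismember("blocked", m)   # one round trip per new member
    data.sismember("blocked", m)      # one round trip for the whole set
```

With three distinct members queried, the answer cache contacts the origin three times and the data cache once, at the cost of holding the whole set locally. Neither class here deals with keeping its copy fresh.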
I see how I could write some tool to mimic the Redis API and then shuffle data between Redis instances but ... that's terribly ugly when it seems that this should be a native Redis function and not some bodged-up glue code.
Is it possible to configure Redis in such a cascading, memory-sparse fashion? Am I missing the obvious in the documentation? Do I need to forget the way the word "cache" has been used in my experience up until now?
JT