[BUG] Client Side Caching and Redirect inconsistency #13082

Open
sjpotter opened this issue Feb 21, 2024 · 2 comments
sjpotter commented Feb 21, 2024

Describe the bug

Redis supports two ways of sending invalidation messages. It can either send them to the connection (client ID) that cached the data itself via push messages (RESP3 only), or send the invalidation to a separate connection via the REDIRECT option.

REDIRECT can be useful not just for a single control/client pair, but also to implement a pooled client, where one has a cache that is shared amongst multiple connections/clients and each connection redirects to a single control connection.
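
For instance, the two modes are set up roughly like this (the client ID 2 below is illustrative; use whatever CLIENT ID actually returns):

```
# Default mode (RESP3 only): invalidations arrive as push messages on the
# same connection that cached the data.
CLIENT TRACKING ON
GET x                           # x is now tracked for this connection

# REDIRECT mode: invalidations go to a separate connection.
# On the receiving connection (with RESP2 it must subscribe to the
# special pub/sub channel):
CLIENT ID                       # e.g. (integer) 2
SUBSCRIBE __redis__:invalidate
# On the caching connection:
CLIENT TRACKING ON REDIRECT 2
```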

Conceptually, if one wanted to implement a shared cache, there are 3 possible approaches:

  1. No knowledge that the cache is shared: when any connection drops, drop the whole cache (as anything cached by that connection can no longer be invalidated).

  2. Track which connection added each cache entry; on retrieval, validate that the connection that created it still exists, and if not, invalidate that entry.

  3. All connections redirect to a single control connection. If that control connection drops, one has to drop the whole cache, much like 1); but if only a regular connection drops, conceptually that shouldn't matter (see the sketch after this list).
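
A minimal sketch of option 3, assuming a hypothetical thin connection interface (this is not the node-redis API): every pooled connection redirects its invalidations to one control connection, and the control connection is the only one that touches the shared cache on invalidation.

```ts
// Hypothetical minimal client interface, for illustration only.
interface RedisConn {
  id(): Promise<number>;                          // wraps CLIENT ID
  send(...args: string[]): Promise<unknown>;      // sends a raw command
  onInvalidate(cb: (key: string) => void): void;  // invalidation messages
}

class PooledCache {
  private cache = new Map<string, string>();

  constructor(private control: RedisConn, private pool: RedisConn[]) {}

  async start(): Promise<void> {
    const controlId = await this.control.id();
    // Each pooled connection redirects to the single control connection.
    for (const conn of this.pool) {
      await conn.send('CLIENT', 'TRACKING', 'ON', 'REDIRECT', String(controlId));
    }
    // Invalidations for keys cached by any pooled connection arrive here.
    this.control.onInvalidate((key) => this.cache.delete(key));
  }
}
```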

However, it does matter. The way REDIRECT works, each connection itself knows whom it redirects to, but Redis only sends the invalidation to that redirect connection if the caching client itself is still alive.

Why?

Redis stores the ID of the connection that caused the entry to be cached. On invalidation, it looks up that ID and checks whether the client is still alive; if not, Redis decides there is nothing to do. Only if the client is still connected does Redis check whether it is redirected. So even if the redirect connection is still alive, no invalidation message is sent when the connection that caused the entry to be cached is gone.
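
In pseudocode, the flow described above looks roughly like this (illustrative names only, not the actual Redis source, which is C):

```ts
// Stand-ins for server-side state; names are hypothetical.
declare const trackingTable: { clientsFor(key: string): number[] };
declare function lookupClientById(id: number): { id: number; redirectId?: number } | null;
declare function sendInvalidate(target: { id: number }, key: string): void;

function invalidate(key: string): void {
  for (const cachingId of trackingTable.clientsFor(key)) {
    const cachingClient = lookupClientById(cachingId);
    if (!cachingClient) continue; // caching conn gone: message silently dropped,
                                  // even if its redirect target is still alive
    const target = cachingClient.redirectId !== undefined
      ? lookupClientById(cachingClient.redirectId)
      : cachingClient;
    if (target) sendInvalidate(target, key);
  }
}
```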

Now, why might one want the current behavior? Because one can change which connection gets the invalidation messages.

Why is that pointless? If one is redirecting to connection ID 10 and that connection dies, it doesn't matter that we can update each client: invalidations could occur before a new connection is up and all the clients are updated, so we have to treat the cache as no longer valid at all.

Therefore, instead of storing the ID of the connection that caused the key to be cached, Redis should store the ID of the connection that will receive the invalidation message (it's possible to store both, but I'm unsure of any need for the caching connection's ID at this point). This way, the connection that expects to receive invalidations will always receive them for as long as it remains connected.
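
The proposed change, in the same illustrative pseudocode as above: resolve the redirect at caching time, so delivery depends only on the receiving connection staying alive.

```ts
// Same hypothetical stand-ins as in the previous sketch.
declare const trackingTable: {
  add(key: string, clientId: number): void;
  clientsFor(key: string): number[];
};
declare function lookupClientById(id: number): { id: number } | null;
declare function sendInvalidate(target: { id: number }, key: string): void;

function trackKey(key: string, caching: { id: number; redirectId?: number }): void {
  // Record the connection that should *receive* the invalidation.
  trackingTable.add(key, caching.redirectId ?? caching.id);
}

function invalidate(key: string): void {
  for (const receiverId of trackingTable.clientsFor(key)) {
    const receiver = lookupClientById(receiverId);
    if (receiver) sendInvalidate(receiver, key); // survives the caching conn dying
  }
}
```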

To reproduce

1. Make 3 connections to Redis.

2. Have client 1 issue CLIENT TRACKING ON REDIRECT to client 2.

3. Have client 1 GET x (x should now be cached).

4. Have client 3 SET x to some value (an invalidate message should be seen on client 2).

5. Have client 1 GET x (again, cached).

6. Disconnect client 1 (typo fixed, see comment below).

7. Have client 3 SET x to some value (no invalidate message will be seen on client 2).
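
For concreteness, the steps above with three redis-cli sessions (the client ID 2 is whatever CLIENT ID actually returns on the receiving connection):

```
# Terminal 2 (invalidation receiver)
127.0.0.1:6379> CLIENT ID
(integer) 2
127.0.0.1:6379> SUBSCRIBE __redis__:invalidate

# Terminal 1 (caching client)
127.0.0.1:6379> CLIENT TRACKING ON REDIRECT 2
OK
127.0.0.1:6379> GET x

# Terminal 3 (writer)
127.0.0.1:6379> SET x 1      <- terminal 2 receives an invalidate for "x"

# Terminal 1
127.0.0.1:6379> GET x        <- re-tracks x
# ...then quit terminal 1

# Terminal 3
127.0.0.1:6379> SET x 2      <- no invalidate arrives on terminal 2
```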

Expected behavior

While the above works as Redis is built, I describe above why this seems wrong: as long as client 2 remains alive, it should get invalidations, independent of client 1 remaining connected.

zuiderkwast commented Feb 22, 2024

It's an interesting topic. A shared cache could make the whole client side tracking idea more useful.

If the tracked keys were tied to the control connection instead of the connection initiating the tracking, how would that affect the behaviour of CLIENT TRACKING OFF? Would it be OFF only for new keys from now on, while keys already tracked remain tracked? (Similarly for changing the control connection: does that affect only new keys accessed from now on?)

Under "To reproduce", is "disconnect client 2" a typo? Do you mean "disconnect client 1"?

sjpotter (Contributor, Author) commented:
Yes, that's a typo: it should be "disconnect client 1". Will fix.

Yes, CLIENT TRACKING OFF (or redirecting to a new client ID) would apply only to future keys. That makes sense in this context, as the control connection is the one that owns the cached keys, not the pooled client that caused the keys to be cached. If the control connection got disconnected (i.e. client 2 in the above example, not the typo), then the cache has to be treated as effectively gone (i.e. the same situation as today with a 1:1 redirect when the caching client disconnects).

So what's the story there? The control connection dies; we reconnect it and therefore get a new ID; we update all the pooled clients with the new ID and have them issue a new CLIENT TRACKING ON REDIRECT.

In node-redis, the POC implementation I have for pooled caching clears and disables the cache from the moment the connection drops until it reconnects (and CLIENT TRACKING ON REDIRECT ... has been executed against all the existing pooled clients). Disabling means preventing pooled clients from inserting; in practice, since the cache is cleared, retrievals don't have to be prevented, as there won't be anything there.
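
A minimal sketch of that reconnect handling, assuming the same kind of hypothetical connection interface as in the earlier sketch (not the actual node-redis API):

```ts
// Hypothetical minimal client interface, for illustration only.
interface RedisConn {
  id(): Promise<number>;
  send(...args: string[]): Promise<unknown>;
}

let cacheEnabled = false;

async function onControlReconnect(
  control: RedisConn,
  pool: RedisConn[],
  cache: Map<string, string>,
): Promise<void> {
  cacheEnabled = false;                     // block inserts while invalidations are lost
  cache.clear();                            // existing entries can no longer be invalidated
  const newControlId = await control.id();  // fresh connection => fresh CLIENT ID
  for (const conn of pool) {
    await conn.send('CLIENT', 'TRACKING', 'ON', 'REDIRECT', String(newControlId));
  }
  cacheEnabled = true;                      // safe to cache again
}
```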
