[NEW] Feature proposal: async "block" callbacks #12716
Comments
Very interesting. Intuitively, it seems like it could have broader applications beyond just block operations. For example, introducing the concept of sessions to decouple the server and connection. It's just an idea, and I haven't thought about it in depth yet : ) |
It's a very good idea. In Erlang, it is common that multiple lightweight processes share one connection, where all commands are streamed to redis and the client lib keeps track of the replies and delivers them back to the right caller. Blocking commands are very problematic with this usage. I believe it's the same problem with async-await style programming in other languages. I don't think it should be a new argument to every blocking command. It's better that it is one command to enable this, like |
@zuiderkwast Doesn't this Erlang behavior result in undesired head-of-line blocking and excess latency when command latency is not uniform? @soloestoy's point about decoupling sessions from connections would also mean the ordering of replies can be arbitrary and head-of-line blocking can be avoided. |
@yossigo Yes, potentially it can, but it's good for throughput. We keep track of how many commands are pending on each connection and start throttling (dropping commands) if Redis can't keep up the pace. (Using more connections doesn't help if Redis is the bottleneck, though scaling to more cluster shards does help.) If some code needs to use a blocking command or a slow command, they'd need to use separate instance of the client (or another client).
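To make the shared-connection approach concrete, here is a minimal sketch of in-order reply tracking with a cap on pending commands. It is a toy model, not the Erlang client's actual code: the class name, `max_pending`, and the caller labels are all assumptions made for illustration, and nothing is written to a real socket.

```python
from collections import deque


class PipelinedConnection:
    """Toy model of one shared connection whose replies arrive in request order."""

    def __init__(self, max_pending):
        self.max_pending = max_pending  # hypothetical throttle threshold
        self.pending = deque()          # callers awaiting replies, oldest first
        self.delivered = []             # (caller, reply) pairs, kept for inspection

    def send(self, caller, command):
        # Drop the command when too many replies are still outstanding.
        # (The command itself is not transmitted anywhere in this toy model.)
        if len(self.pending) >= self.max_pending:
            return False
        self.pending.append(caller)
        return True

    def on_reply(self, reply):
        # Replies come back in request order, so the oldest caller owns this one.
        caller = self.pending.popleft()
        self.delivered.append((caller, reply))
        return caller


conn = PipelinedConnection(max_pending=2)
accepted = [
    conn.send("p1", "GET a"),
    conn.send("p2", "BLPOP q 5"),
    conn.send("p3", "GET b"),  # over the limit, so it is dropped
]
first_owner = conn.on_reply("value-a")
```

Because the only routing information is the position in the queue, a blocking command such as the BLPOP above holds up every reply queued behind it, which is the problem this proposal targets.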
I don't understand exactly how decoupling sessions from connections would work. @soloestoy's idea seemed vague. Maybe you have a more specific idea? Should every command and response be accompanied by a session ID? Or something like making all commands return in the ASYNC way? I think that would be excessive. If Redis has a response straight away, it's better to deliver it immediately. But if some slow command can be computed incrementally (let's say KEYS) and the client can handle async results, then sure, we can send an ASYNC response for this kind of command too and expect the client to handle it just like a blocking command.
The blocking commands have a nice property when it comes to transactions: they can't block inside transactions. Thus, we don't need to worry about async responses for blocking commands inside transactions. @mgravell In which ways do clients and users minimize the number of connections? Apart from what we do (multiple lightweight threads sharing the same connection, could also be coroutines), I believe an obvious way is with an async API, like |
@zuiderkwast sorry, didn't see that update; in SE.Redis this is all hidden inside the library, so from the customer's perspective: they just spin up the library API at the start and make requests; the library deals with routing (cluster, replicas, etc) and ordering of commands on individual connections; since it is .NET, async continuations work with the
var val = await db.StringGetAsync(key);
The library deals with all the concurrency concerns, i.e. we fully expect that multiple code-paths (independent requests, whatever) could be issuing requests to the same
My intention would be that we could implement blocking operations in entirely the same way, i.e.
var val = await db.SomeBlockingMethod(...); // for example XREAD BLOCK
so: from their perspective, it works 100% identically; behind the scenes the library would deal with hooking things up such that we can signal the pending operation as completed (whether via success, timeout, or something worse)
@mgravell Right, async programming can be done in many ways. Any client that internally reuses the same connection for multiple "users" needs to take care of a lot of special cases, such as commands that affect the state of a connection (like WATCH). Automatic reconnects are another tricky topic if any keys are watched when they happen. Just a few examples. That said, I don't see any new problem with this async-blocking feature. I think it's straightforward to implement in a client. @yossigo Speaking of head-of-line blocking, the worst example of head-of-line blocking is caused by blocking commands. :) So this feature would eliminate that, assuming all other commands are reasonably fast.
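The head-of-line effect can be shown with a small calculation. This is only an illustrative sketch: the millisecond figures are made up, and `in_order_delivery` is a hypothetical helper that models a connection where a reply cannot be delivered before all earlier replies.

```python
from itertools import accumulate


def in_order_delivery(ready_times):
    # With strictly ordered replies, each reply waits for every earlier one,
    # so its delivery time is the running maximum of the ready times.
    return list(accumulate(ready_times, max))


# Times (ms) at which the server could answer: a blocking read that times out
# after 5000 ms, followed by two fast commands.
blocking_first = in_order_delivery([5000, 1, 2])

# With the proposed async reply, the blocking command is acknowledged at once
# and its payload arrives later, out of band, so it no longer delays the others.
async_ack_first = in_order_delivery([1, 1, 2])
```

In the first case both fast commands are held until the 5000 ms mark; in the second they are delivered as soon as they are ready.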
My idea is that the "async callback" can be used not only for blocking commands but for all Redis commands. Additionally, we can achieve decoupling of command execution and connection by adding a session layer, enabling connection reuse in unordered scenarios.
For example, when multiple clients share a TCP connection, even though some clients already have similar implementations, Redis cannot differentiate between different clients behind a single TCP connection. Therefore, connection reuse relies on strict order consistency for command sending and receiving replies. Furthermore, operations with states such as
If we add a session layer, we can achieve complete connection reuse for clients, regardless of command order and state. For example, when multiple clients share a connection, each client can initialize by requesting and obtaining session information from Redis, including an
For example, let's consider two clients, A and B, using a shared TCP connection to access Redis. During initialization, they each obtain their respective sessions:
Furthermore, with the presence of sessions, clients no longer need to use the same TCP connection to asynchronously retrieve command results. As long as they provide the correct session information, they can use any available connection to fetch the results, effectively decoupling the connection. In this scenario, even if a network anomaly causes a disconnection, it does not affect the interaction between the client and Redis. Both the client and Redis have recorded the session information and no longer rely on maintaining the connection state.
Client-side command retries are also safer in this case. The client can determine whether its previously sent commands were received based on the id and sequence number recorded in Redis' current sessions. On the Redis side, even if the client sends duplicate commands, they can be rejected based on the session's
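As a rough illustration of the retry-safety argument, the sketch below rejects duplicate commands by tracking a per-session sequence number. The comment above does not spell out the session format, so everything here (the `SessionTable` class, integer session ids, and a strictly increasing sequence number starting at 1) is an assumption rather than a proposed Redis API.

```python
class SessionTable:
    """Hypothetical server-side record of sessions and their last sequence number."""

    def __init__(self):
        self._last_seq = {}  # session id -> highest sequence number accepted
        self._next_id = 1

    def open_session(self):
        # Hand out a fresh session id; no command has been accepted yet.
        session_id = self._next_id
        self._next_id += 1
        self._last_seq[session_id] = 0
        return session_id

    def accept(self, session_id, seq):
        # Reject anything at or below the last accepted sequence number,
        # which covers a client retrying a command that was already received.
        if seq <= self._last_seq[session_id]:
            return False
        self._last_seq[session_id] = seq
        return True

    def last_seq(self, session_id):
        # Lets a reconnecting client check which commands were received.
        return self._last_seq[session_id]


sessions = SessionTable()
a = sessions.open_session()
b = sessions.open_session()
results = [
    sessions.accept(a, 1),  # first command from client A
    sessions.accept(b, 1),  # client B has its own sequence space
    sessions.accept(a, 1),  # A retries after a disconnect: rejected as duplicate
    sessions.accept(a, 2),  # A's next command
]
```

Since the state is keyed by session rather than by socket, the same check works no matter which TCP connection carries the retry.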
@soloestoy Interesting. This idea is very similar to #12873. |
I like the idea of the async block. This looks simple and allows reusing a connection among blocking commands. I wish this could be done with a few modifications: there is no need to specify "ASYNC" as part of the command, since I don't want library users to be able to turn async off.
@soloestoy I agree that there are some merits to a fully multiplexed protocol, but this is a huge upheaval. It is a much, much bigger change than RESP3, and support for that is still patchy now. My concern is whether we can get most of the benefits without that huge level of complexity. Again: full multiplexing is an order of magnitude bigger change than RESP3.
This is beyond the scope of this proposal, but I wonder if this will allow us to hand over slow commands (e.g. KEYS) to another thread, and continue operating without those commands blocking the other operations. |
At the moment, there are 3 categories of redis interactions:
As a library author, we have conflicting requests from people, to
These are conflicting and mutually exclusive.
I would like to propose a new category of API usage, to bridge this gap: activated callbacks.
Consider the following scenario:
The client issues XREAD BLOCK 5000 ASYNC ... or similar.
The server replies +ASYNC 162748, where the second part is a connection-specific token issued by the server.
Later, the server sends an out-of-band ASYNC message, including the specific token and the result payload.
The result of this is that "blocking" operations can now be issued efficiently without tying up a connection entirely. Multiple "blocking" async operations could be pending on a single connection.
Obviously the out-of-band nature here demands either RESP3 or a mechanism to specify an auxiliary connection id to use for the callbacks. In reality, I'm tempted to say "make this RESP3 only", to avoid inter-connection complications.
This feature would ideally apply similarly to all "blocking" operations. I think it can be applied to the existing commands as an argument, although if there is confusion maybe it also makes sense as a prefix to commands, like the client caching prefix.
Clients would be expected to store the server-issued token and use that to issue any responses. All async tokens would be single-shot only, meaning: they expect at most one reply (zero if the connection dies before it becomes activated, which the client should handle in some way).
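A client-side sketch of this token handling, using Python's asyncio, might look as follows. It is not an existing library API: `AsyncTokenTable` and its method names are assumptions, the second token value is made up, and the wire parsing that would call these methods is left out.

```python
import asyncio


class AsyncTokenTable:
    """Maps server-issued tokens to futures that each expect at most one reply."""

    def __init__(self):
        self._pending = {}

    def register(self, token):
        # Called when the server answers a command with "+ASYNC <token>".
        future = asyncio.get_running_loop().create_future()
        self._pending[token] = future
        return future

    def on_push(self, token, payload):
        # Called when the out-of-band message for this token arrives.
        future = self._pending.pop(token, None)  # single-shot: forget the token
        if future is None or future.done():
            return False  # unknown or already-used token
        future.set_result(payload)
        return True

    def on_disconnect(self):
        # Tokens are connection-specific, so pending ones can never complete.
        for future in self._pending.values():
            if not future.done():
                future.set_exception(ConnectionError("connection lost"))
        self._pending.clear()


async def demo():
    table = AsyncTokenTable()
    first = table.register(162748)
    second = table.register(162749)  # hypothetical second token
    table.on_push(162748, ["stream-entry"])
    duplicate_accepted = table.on_push(162748, ["again"])
    table.on_disconnect()
    result = await first
    try:
        await second
        second_outcome = "completed"
    except ConnectionError:
        second_outcome = "failed"
    return result, duplicate_accepted, second_outcome


outcome = asyncio.run(demo())
```

Failing the outstanding futures on disconnect is one way to handle the "zero replies" case; a library could equally choose to reissue the command on a new connection.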