Several issues regarding pipeline, as well as what Redis can do. #12743

Open
soloestoy opened this issue Nov 9, 2023 · 4 comments

@soloestoy
Collaborator

soloestoy commented Nov 9, 2023

As is well known, pipelining improves the efficiency of interaction between clients and Redis compared to ping-pong mode. But it is not a one-size-fits-all solution, and it is not a silver bullet.

First, pipelines do not guarantee atomicity, and many users overlook this fact. Some users mistakenly believe that sending multiple commands to Redis in one pipeline makes Redis execute them "atomically", which is simply wrong.

If pipeline were both efficient and guaranteed atomicity, we would only need to implement the basic primitives such as SET/GET/EXPIRE. There would be no need to implement additional commands like SETEX/GETEX and MULTI/EXEC.
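A minimal sketch of the non-atomicity point, using a toy in-memory server rather than a real Redis connection. `ToyServer` and its strict round-robin scheduling are assumptions made for illustration; real Redis often executes several pipelined commands from one connection back to back, but nothing guarantees that commands from another connection will not run in between.

```python
from collections import deque

class ToyServer:
    """Toy single-threaded server: executes one command per connection per turn."""
    def __init__(self):
        self.data = {}

    def execute(self, cmd):
        op, key, *args = cmd
        if op == "GET":
            return self.data.get(key)
        if op == "SET":
            self.data[key] = args[0]
            return "OK"
        raise ValueError(f"unsupported command: {op}")

    def run(self, pipelines):
        """Round-robin over the pipelines, one command at a time."""
        queues = [deque(p) for p in pipelines]
        replies = [[] for _ in pipelines]
        while any(queues):
            for i, q in enumerate(queues):
                if q:
                    replies[i].append(self.execute(q.popleft()))
        return replies

server = ToyServer()
client_a = [("SET", "k", "a1"), ("GET", "k")]
client_b = [("SET", "k", "b1"), ("GET", "k")]
replies_a, replies_b = server.run([client_a, client_b])
print(replies_a)  # client A's GET observes client B's write
print(replies_b)
```

Wrapping each client's commands in MULTI/EXEC is what would prevent this interleaving; the pipeline alone does not.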

Even more critically, pipelines have other usage flaws. Unlike ping-pong mode, where the client needs to parse and handle the response to each command before sending the next one, a pipeline operates in a streaming fashion: the client sends commands and receives replies without waiting for the reply to the current command. This makes it difficult to handle certain special replies, especially errors or unexpected situations such as receiving "-MOVED" in cluster mode, where the client needs to redirect to the correct node and retry the current command.
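As a rough illustration of what a pipelining client has to do with such replies, the sketch below pairs commands with replies by position and picks out redirections. The helper names `parse_redirect` and `match_replies` are hypothetical, and the slot and node address are illustrative values; only the `-MOVED <slot> <host>:<port>` shape (shared by `-ASK`) comes from the cluster protocol.

```python
def parse_redirect(reply):
    """Return (kind, slot, host, port) for a MOVED/ASK error line, else None."""
    if not reply.startswith(("-MOVED ", "-ASK ")):
        return None
    kind, slot, addr = reply[1:].split(" ", 2)
    host, _, port = addr.rpartition(":")
    return kind, int(slot), host, int(port)

def match_replies(commands, replies):
    """Pair each pipelined command with its reply by position (FIFO order)."""
    if len(commands) != len(replies):
        raise ValueError("pipeline replies out of sync with commands")
    return [(cmd, reply, parse_redirect(reply)) for cmd, reply in zip(commands, replies)]

commands = ["LPUSH {a}list xxx", "SET {a}string 1", "SET {a}string 2"]
replies = ["-MOVED 15495 10.0.0.2:6379", "-MOVED 15495 10.0.0.2:6379", "+OK"]
for cmd, reply, redirect in match_replies(commands, replies):
    print(cmd, "->", redirect or reply)
```

By the time the client sees the first redirection, the later commands have already been sent, which is why a per-command "redirect and retry" is not straightforward.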

For example, suppose a client is using pipeline mode and sends three commands in sequence to the Redis node where key "a" is located: LPUSH {a}list xxx; SET {a}string 1; SET {a}string 2. Normally all commands should be executed correctly. However, if the slot is migrated to another node while the first and second commands (LPUSH {a}list xxx and SET {a}string 1) are executing, and then migrated back while the third command (SET {a}string 2) is executing, the client will receive three replies: MOVED, MOVED and OK. Pipeline does not provide atomicity guarantees, so this kind of situation is indeed possible.
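The three commands land on the same node because they share the hash tag `{a}`. A small sketch of the slot calculation, assuming the rule from the Redis Cluster specification (CRC16/XMODEM of the hash tag, or of the whole key when there is no non-empty tag, modulo 16384); the function name `key_slot` is made up for this example.

```python
import binascii

def key_slot(key: bytes) -> int:
    """Hash slot for a key: CRC16 (XMODEM) of the hash tag or whole key, mod 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:          # non-empty tag between the braces
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % 16384

# Both keys hash only the tag "a", so they map to the same slot.
print(key_slot(b"{a}list") == key_slot(b"{a}string"))
```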

There are usually two methods to handle these replies:

  1. The client does not redirect or handle MOVED and returns it directly to the user. In this case, the user needs to judge and handle the results themselves, which can make the logic more complex (LPUSH {a}list xxx can be retried, but SET {a}string 1 should not be retried since SET {a}string 2 is OK). Furthermore, during switchover and slot migration, a large number of errors can occur. AFAIK JedisCluster adopts this approach.
  2. The client redirects and retries when encountering MOVED. Although this approach shields users from internal redirect errors in the cluster, it may lead to data inconsistencies. For example, retrying the SET {a}string 1 command may result in the final value of {a}string being 1, whereas the correct result should be 2. AFAIK Lettuce takes this approach.
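The inconsistency in the second approach can be reproduced with a toy model. This is only a sketch: the dictionary `store` stands in for whichever node owns the slot, and the `first_pass` replies are hard-coded to match the MOVED, MOVED, OK scenario above; no real client or server is involved.

```python
store = {}

def apply(cmd):
    """Apply a command to a toy key-value store standing in for the slot's owner."""
    op, key, value = cmd
    if op == "SET":
        store[key] = value
    elif op == "LPUSH":
        store.setdefault(key, []).insert(0, value)
    return "OK"

pipeline = [("LPUSH", "{a}list", "xxx"), ("SET", "{a}string", "1"), ("SET", "{a}string", "2")]
# Hypothetical replies from the scenario: the slot is away for the first two commands.
first_pass = ["MOVED", "MOVED", "OK"]

retry = []
for cmd, reply in zip(pipeline, first_pass):
    if reply == "OK":
        apply(cmd)            # the server executed it
    else:
        retry.append(cmd)     # approach 2: queue a transparent retry

for cmd in retry:             # retries run after SET {a}string 2 already succeeded
    apply(cmd)

print(store["{a}string"])     # the stale SET ... 1 has overwritten 2
```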

Here I would like to get some feedback from client developers, such as:

  1. Does the client support pipelines?
  2. Under what conditions does the client allow or disallow the use of pipelines? Cluster mode, for example, is a special scenario.
  3. Have client developers encountered issues similar to the MOVED error when implementing pipelines, and if so, how were they resolved?
  4. From the client's perspective, how do you expect Redis to address these issues regarding pipelines?
@rueian

rueian commented Nov 9, 2023

  1. rueidis, a go redis client, supports pipeline and client-side caching.
  2. It treats pipelining as a natural property of TCP and automatically pipelines concurrent requests onto the underlying TCP connections.
  3. When it encounters a MOVED error, it just retries the command.
  4. I hope commands such as SUBSCRIBE/UNSUBSCRIBE can have a reply, so that writing a pipeline parser would be much easier.

@zuiderkwast
Contributor

hiredis-cluster is a cluster client in C using hiredis for the individual connections. It has a sync API and an async API, just like hiredis.

  • In the sync API, the client does not follow redirects in pipelines. -MOVED, -ASK and -TRYAGAIN are simply returned to the caller. (Pipelines are done using repeated "append command" and then repeated "get reply" calls.)
  • In the async API, the user can send more commands before the previous command has returned. The client sends each command to the right connection based on its cluster slot, essentially creating an implicit pipeline per node. The client keeps track of the number of commands sent to each node and their callback functions. When the replies are received, each reply is passed to the right callback function. Redirects are followed, including -ASK and -TRYAGAIN, and when -MOVED is seen, we schedule an update of the slot mapping. We don't support pubsub (mainly because the hiredis async API is very problematic with pubsub).
  • MULTI, EXEC and any other commands that don't map to a cluster slot are only possible in a low-level API where you fetch a reference to the node for a given key or slot, then perform the commands on this node. Redirects are not followed in this mode.
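The "implicit pipeline per node" idea from the async API can be sketched as one FIFO of pending callbacks per node. This is a toy model in Python, not hiredis-cluster's code: the class `ImplicitPipelines`, the slot-to-node mapping and the node names are all hypothetical, and no sockets are involved.

```python
from collections import defaultdict, deque

class ImplicitPipelines:
    """Toy model: one FIFO of pending callbacks per node."""
    def __init__(self, slot_owner):
        self.slot_owner = slot_owner          # hypothetical slot -> node-name mapping
        self.pending = defaultdict(deque)     # node name -> callbacks awaiting replies

    def send(self, slot, command, callback):
        node = self.slot_owner[slot]
        self.pending[node].append((command, callback))
        return node                           # a real client would write to the socket here

    def on_reply(self, node, reply):
        # Replies on one connection arrive in send order, so the oldest callback matches.
        command, callback = self.pending[node].popleft()
        callback(command, reply)

seen = []
client = ImplicitPipelines({1: "node-a", 2: "node-b"})
client.send(1, "GET x", lambda c, r: seen.append((c, r)))
client.send(2, "GET y", lambda c, r: seen.append((c, r)))
client.send(1, "GET z", lambda c, r: seen.append((c, r)))
client.on_reply("node-b", "y-val")   # nodes may answer in any order relative to each other
client.on_reply("node-a", "x-val")
client.on_reply("node-a", "z-val")
print(seen)
```

Ordering is preserved only within a node's queue; replies from different nodes can complete in any relative order.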

@nihohit
Contributor

nihohit commented Nov 12, 2023

redis-rs is a Rust Redis client, supporting both sync and async APIs for cluster and standalone servers. All of these modalities support pipelines, in different scenarios.

In general, we treat a pipeline as a single command: the user creates a pipeline and sends it on the connection. The pipeline is routed as a single unit, and it is retried and rerouted as a single unit. So if you send a collection of commands as a pipeline, the collection is routed according to the first command with concrete routing. If the user has accidentally included two commands with different routings in the pipeline, they are likely to be bounced between the nodes with MOVED errors. For example, with two such commands, the pipeline is sent to node A, which returns OK for the first command and MOVED B for the second. Then both commands are sent to node B, which returns MOVED A for the first command and OK for the second, and so on until the retries are exhausted. This puts the responsibility for creating pipelines correctly on the user.
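The bouncing described above can be modelled in a few lines. This is a sketch of the behaviour as described in this comment, written in Python rather than Rust and not taken from redis-rs; `send_pipeline`, the `owner` mapping and the `max_retries` value are assumptions for illustration.

```python
def send_pipeline(commands, owner, start_node, max_retries=3):
    """Send the whole pipeline to one node; on any MOVED, resend it all to that node."""
    node, history = start_node, []
    for _ in range(max_retries + 1):
        replies = ["OK" if owner[key] == node else f"MOVED {owner[key]}"
                   for _, key in commands]
        history.append((node, replies))
        moved = [r for r in replies if r.startswith("MOVED")]
        if not moved:
            return history, True
        node = moved[0].split()[1]       # follow the first redirect as a unit
    return history, False                # retries exhausted

owner = {"k1": "A", "k2": "B"}           # hypothetical: the two keys live on different nodes
history, ok = send_pipeline([("SET", "k1"), ("SET", "k2")], owner, "A")
for node, replies in history:
    print(node, replies)
print("succeeded:", ok)
```

Note that in this model the command that succeeds on each node is executed again on every pass, so a pipeline with mixed routings is not only slow to fail but also repeats its side effects.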

@madolson
Contributor

I would like to raise one more issue, somewhat specific to cluster mode: how clients expect responses from pipelines that span multiple nodes to be returned. At AWS we've seen two rather major customer events in which a user moved from standalone Redis to Redis Cluster, assuming everything would work because they weren't issuing cross-slot commands, but then had issues with client pipeline behavior. Most clients have either a "pipe"-like interface (put items on, then get responses as if it were a single stream) or a fake-atomic "send X commands and get Y responses back" interface. Both suffer from single-node failures: if a single node fails, the entire application goes down until the failover completes, which isn't the desired behavior.
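To make the failure mode concrete, here is a toy sketch of a multi-node pipeline that reports results per command, so that a failed node only fails its own commands. This is one possible shape and not something proposed in the thread; `run_on_node`, `run_pipeline`, `NodeDown` and the node names are all hypothetical.

```python
class NodeDown(Exception):
    pass

def run_on_node(node, commands, down):
    """Toy stand-in for sending one sub-pipeline to one node."""
    if node in down:
        raise NodeDown(node)
    return [f"OK from {node}" for _ in commands]

def run_pipeline(commands, down):
    """Group commands by node and report results per command, isolating failures."""
    by_node = {}
    for index, (node, cmd) in enumerate(commands):
        by_node.setdefault(node, []).append((index, cmd))
    results = [None] * len(commands)
    for node, items in by_node.items():
        try:
            replies = run_on_node(node, [cmd for _, cmd in items], down)
        except NodeDown as exc:
            replies = [exc] * len(items)      # only this node's commands fail
        for (index, _), reply in zip(items, replies):
            results[index] = reply
    return results

commands = [("A", "GET x"), ("B", "GET y"), ("A", "GET z")]
results = run_pipeline(commands, down={"B"})
print(results)
```

An interface that instead raises on the first failed node, or blocks on a single merged reply stream, turns the outage of node B into an outage for the commands routed to node A as well.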
