Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Several network/protocol related fixes. #27

Merged
merged 8 commits into from
Mar 17, 2020

Conversation

yossigo
Copy link
Collaborator

@yossigo yossigo commented Mar 17, 2020

Fixes multiple issues that result with corrupted replies, stuck clients and/or consistency issues:

  • Improved MULTI/EXEC error handling, see issue DISCARD doesn't always discard #25.
  • Follower proxy may fail to deliver error reply to client when not able to proxy (e.g. no leader, no connection, etc.); This may result with a client blocked forever or, if commands are pipelined, out-of-sync responses.
  • Proxied MULTI commands are delivered to the leader as a bundle, but erroneously go through the MULTI/EXEC mechanism again which may interleave them with anything else on the same proxy connection.

yossigo added 8 commits March 17, 2020 11:22
New metrics include:
* proxy_reqs: Total number of requests that were proxied.
* proxy_failed_reqs: Requests that could not be proxied due to
cluster/connection state; client receives an error reply.
* proxy_failed_responses: Requests for which a valid response was not
received; client receives a -TIMEOUT or other error in this case.
* proxy_outstanding_reqs: Number of proxied requests waiting for reply
from the leader.
When a leader receives a pre-bundled MULTI/EXEC transaction, it should
process it immediately rather than attempt to re-bundle the individual
commands.
This makes handling of hiredis contexts cleaner and more defensive, but
it was not confirmed to fix any specific issue.
Error message was delivered on the wrong context, leaving the client
blocked waiting for a reply or, if pipelining was used, getting out of
sync with replies.
MULTI and DISCARD should never fail and always begin or terminate a
transaction state. While in MULTI state, commands are simply +QUEUED and
should never fail due to Raft cluster state either.

Only EXEC itself applies Raft cluster state checks and may respond with
errors such as -NOLEADER or -MOVED.
@yossigo yossigo merged commit e657423 into RedisLabs:master Mar 17, 2020
@yossigo yossigo deleted the wip/network-issues branch March 18, 2020 13:17
sjpotter added a commit to sjpotter/redisraft that referenced this pull request Oct 7, 2021
create a self documenting function name instead of throwing in a bunch of conditions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant