StringGet fails with "No connection available" when a master node fails over #1237

Closed
LeeSanderson opened this issue Sep 26, 2019 · 3 comments

Comments

@LeeSanderson

I have a local test setup with 3 master nodes and 3 replicas and have configured the ConnectionMultiplexer as follows:

ConnectionMultiplexer.Connect("127.0.0.1:7000,127.0.0.1:7001,127.0.0.1:7002,127.0.0.1:7100,127.0.0.1:7101,127.0.0.1:7102")

Nodes 7000-7002 are initially the master nodes and nodes 7100-7102 are the replicas.
If I inspect the slot allocations for this configuration using the ClusterConfiguration I get:

Slot range from 0 - 5460 available via 127.0.0.1:7000
Slot range from 5461 - 10922 available via 127.0.0.1:7001
Slot range from 10923 - 16383 available via 127.0.0.1:7002

In this configuration StringGet works fine and selects the node with the slot allocated to it.
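For readers unfamiliar with how a key ends up on a particular node: Redis Cluster assigns each key to one of 16384 hash slots using CRC16 (XMODEM variant) of the key, modulo 16384, and only the part inside `{...}` is hashed when a non-empty hash tag is present. The following is a minimal, language-neutral sketch in Python of that lookup against the slot ranges listed above. It is not StackExchange.Redis code, and the names `crc16_xmodem`, `key_hash_slot`, `node_for_key` and `SLOT_MAP` are hypothetical helpers introduced only for illustration.

```python
# Slot ranges as reported by ClusterConfiguration before the failover.
SLOT_MAP = [
    (0, 5460, "127.0.0.1:7000"),
    (5461, 10922, "127.0.0.1:7001"),
    (10923, 16383, "127.0.0.1:7002"),
]


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: str) -> int:
    """Return the cluster hash slot (0..16383) for a key."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty hash tag narrows what gets hashed.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return crc16_xmodem(data) % 16384


def node_for_key(key: str, slot_map) -> str:
    """Find the endpoint whose slot range contains the key's slot."""
    slot = key_hash_slot(key)
    for first, last, endpoint in slot_map:
        if first <= slot <= last:
            return endpoint
    raise LookupError(f"slot {slot} is not covered by any node")


print(node_for_key("foo", SLOT_MAP))  # slot 12182 -> 127.0.0.1:7002
```

A client that keeps a cached copy of this table keeps routing a key to the old endpoint until the table is refreshed, which is consistent with the symptom described below.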

If I then kill one of the master nodes (say 7000) and wait for the replica to become the new master (say 7102 takes over from 7000) then inspecting the ClusterConfiguration gives:

Slot range from 0 - 5460 available via 127.0.0.1:7102
Slot range from 5461 - 10922 available via 127.0.0.1:7001
Slot range from 10923 - 16383 available via 127.0.0.1:7002

And running "CLUSTER INFO" from the redis-cli reports that the cluster state is "OK".
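To make the change between the two listings explicit, here is a small sketch in Python that parses the "Slot range from ... available via ..." lines shown above and reports which ranges changed owner. The helper names `parse_slot_map` and `moved_ranges` are hypothetical, and the input text is simply the output pasted from this report, not something produced by a library call.

```python
import re

BEFORE = """
Slot range from 0 - 5460 available via 127.0.0.1:7000
Slot range from 5461 - 10922 available via 127.0.0.1:7001
Slot range from 10923 - 16383 available via 127.0.0.1:7002
"""

AFTER = """
Slot range from 0 - 5460 available via 127.0.0.1:7102
Slot range from 5461 - 10922 available via 127.0.0.1:7001
Slot range from 10923 - 16383 available via 127.0.0.1:7002
"""

LINE = re.compile(r"Slot range from (\d+) - (\d+) available via (\S+)")


def parse_slot_map(text: str) -> dict:
    """Map each (first_slot, last_slot) range to the endpoint serving it."""
    return {(int(a), int(b)): node for a, b, node in LINE.findall(text)}


def moved_ranges(before: dict, after: dict) -> dict:
    """Return the ranges whose owner differs, as range -> (old, new)."""
    return {
        rng: (before[rng], after[rng])
        for rng in before
        if rng in after and before[rng] != after[rng]
    }


moved = moved_ranges(parse_slot_map(BEFORE), parse_slot_map(AFTER))
for (first, last), (old, new) in moved.items():
    print(f"slots {first}-{last}: {old} -> {new}")
```

Only the range 0-5460 changes owner, so only keys hashing into that range should be affected by the failover.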

However, StringGet throws an error:

StackExchange.Redis.RedisConnectionException: No connection is available to service this operation: GET e0f6a26a5dc14b72ab8190ae7330a8e6; UnableToConnect on 127.0.0.1:7000/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.0.601.3402; IOCP: (Busy=0,Free=1000,Min=12,Max=1000), WORKER: (Busy=0,Free=2047,Min=12,Max=2047), Local-CPU: n/a ---> StackExchange.Redis.RedisConnectionException: UnableToConnect on 127.0.0.1:7000/Interactive, Initializing/NotStarted, last: NONE, origin: BeginConnectAsync, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: Connecting, mgr: 10 of 10 available, last-heartbeat: never, global: 0s ago, v: 2.0.601.3402

It looks like it is trying to read the value from the old master (7000) even though the replica (7102) has taken over responsibility for those slots.

Is this a bug? Have I made an error in my configuration? Is there a workaround?

Note also that restarting the failed node (7000) stops the error from happening, even though the promoted replica (7102) remains the master for those slots.

@AlexeiScherbakov

This is related to #1120

@NickCraver
Collaborator

Yep, please see #1120 and, more importantly, #1374. We're evaluating a fix now and expect it in a 2.1.x release shortly.

@roizentner

I'm still experiencing this issue in v2.2.4, and I don't understand how it is related to #1120.
