Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection to newly added node is initially unavailable with validateClusterNodeMembership: false. #340

Closed
oklahomer opened this issue Aug 23, 2016 · 2 comments
Labels
type: bug A general bug
Milestone

Comments

@oklahomer
Copy link

oklahomer commented Aug 23, 2016

This issue was previously discovered in another question-marked issue #339 .
Below shows detailed context and error content.

lettuce version

4.2.1

Contextual information

To maximize the availability, I applied some changes to client options: ClusterClientOptions and ClusterTopologyRefreshOptions.
Current settings are shown below:

ClientResources clientResources = DefaultClientResources.builder().build();
List<RedisURI> redisURIs = ...;
RedisClusterClient redisClusterClient = RedisClusterClient.create(clientResources, redisURIs);

ClusterTopologyRefreshOptions topologyRefreshOptions =
        ClusterTopologyRefreshOptions.builder()
                                     .enablePeriodicRefresh(INTERVAL_IN_SECONDS, TimeUnit.SECONDS)
                                     .enableAllAdaptiveRefreshTriggers()
                                     .build();

ClusterClientOptions clusterClientOptions =
        ClusterClientOptions.builder()
                            .validateClusterNodeMembership(false) // To acknowledge brand new nodes
                            .topologyRefreshOptions(topologyRefreshOptions)
                            .build();

redisClusterClient.setOptions(clusterClientOptions);

In this modification, ClusterClientOptions#validateClusterNodeMembership is set to false to acknowledge brand new nodes with new host:port regarding #109.
However, ClusterClientOptions#closeStaleConnections is left with default value: true. (Just got to have a hunch that this is somehow related...)

Simplest possible steps to reproduce

Create lettuce instance with above settings, add nodes to redis cluster, reshard, and an exception is thrown. It is thrown only once and later operations seem to run normally and as expected.

Thanks in advance.

@mp911de
Copy link
Collaborator

mp911de commented Aug 26, 2016

The reason for the failure: The newly created connection is opened in the same thread (EventLoop) that is assigned to the connection itself. Opening a new connection will assign a different EventLoop which is a different thread and so it eventually works. The solution should be retrying using the computation thread pool.

mp911de added a commit that referenced this issue Aug 28, 2016
Previously, cluster command redirects were executed on the I/O thread. This is most of the time not an issue but in cases where lettuce did not open a connection to the target host, a connection has to be established. This is a blocking operation which blocks the event loop. The new connection is assigned in some cases (depending on the connection/event loop assignment) to the event loop which currently executes the retry. This leads to a deadlock.

Command retries are scheduled now using the event executor thread group.
mp911de added a commit that referenced this issue Aug 28, 2016
Previously, cluster command redirects were executed on the I/O thread. This is most of the time not an issue but in cases where lettuce did not open a connection to the target host, a connection has to be established. This is a blocking operation which blocks the event loop. The new connection is assigned in some cases (depending on the connection/event loop assignment) to the event loop which currently executes the retry. This leads to a deadlock.

Command retries are scheduled now using the event executor thread group.
@mp911de mp911de added this to the Lettuce 4.2.3 milestone Aug 28, 2016
@mp911de
Copy link
Collaborator

mp911de commented Aug 28, 2016

That's fixed now.

@mp911de mp911de closed this as completed Aug 28, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug A general bug
Projects
None yet
Development

No branches or pull requests

2 participants