Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replayed activation commands may fail because of their execution sequence #1255

Closed
robertvazan opened this issue Apr 1, 2020 · 6 comments
Closed
Labels
type: bug A general bug
Milestone

Comments

@robertvazan
Copy link

Bug Report

StatefulRedisPubSubConnectionImpl logs a warning: SELECT failed: ERR only (P)SUBSCRIBE / (P)UNSUBSCRIBE / PING / QUIT allowed in this context when repeatedly reconnecting in a short period of time (due to redis server restart). I assume this is a bug. Lettuce should be smart enough to not issue commands in the wrong order.

Current Behavior

Most relevant logs
Apr 01 08:32:49 [lettuce-eventExecutorLoop-1-3] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis/149.28.236.118:6378
Apr 01 08:32:49 [lettuce-nioEventLoop-4-3] INFO io.lettuce.core.protocol.ReconnectionHandler - Reconnected to myredis:6378
Apr 01 08:33:04 [lettuce-eventExecutorLoop-1-3] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis:6378
Apr 01 08:33:05 [lettuce-nioEventLoop-4-3] WARN io.lettuce.core.protocol.ConnectionWatchdog - Cannot reconnect to [myredis:6378]: Connection refused: myredis/149.28.236.118:6378
Apr 01 08:33:09 [lettuce-eventExecutorLoop-1-3] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis:6378
Apr 01 08:33:09 [lettuce-nioEventLoop-4-3] INFO io.lettuce.core.protocol.ReconnectionHandler - Reconnected to myredis:6378
Apr 01 08:33:09 [lettuce-nioEventLoop-4-3] WARN io.lettuce.core.pubsub.StatefulRedisPubSubConnectionImpl - SELECT failed: ERR only (P)SUBSCRIBE / (P)UNSUBSCRIBE / PING / QUIT allowed in this context
Full logs
Apr 01 08:30:42 [ReredisConnection-1] INFO io.lettuce.core.EpollProvider - Starting without optional epoll library
Apr 01 08:30:42 [ReredisConnection-1] INFO io.lettuce.core.KqueueProvider - Starting without optional kqueue library
Apr 01 08:32:49 [lettuce-eventExecutorLoop-1-1] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis/149.28.236.118:6378
Apr 01 08:32:49 [lettuce-eventExecutorLoop-1-2] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis/149.28.236.118:6378
Apr 01 08:32:49 [lettuce-eventExecutorLoop-1-3] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis/149.28.236.118:6378
Apr 01 08:32:49 [lettuce-nioEventLoop-4-1] INFO io.lettuce.core.protocol.ReconnectionHandler - Reconnected to myredis:6378
Apr 01 08:32:49 [lettuce-nioEventLoop-4-3] INFO io.lettuce.core.protocol.ReconnectionHandler - Reconnected to myredis:6378
Apr 01 08:32:49 [lettuce-nioEventLoop-4-2] INFO io.lettuce.core.protocol.ReconnectionHandler - Reconnected to myredis:6378
Apr 01 08:33:04 [lettuce-eventExecutorLoop-1-2] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis:6378
Apr 01 08:33:04 [lettuce-eventExecutorLoop-1-3] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis:6378
Apr 01 08:33:04 [lettuce-eventExecutorLoop-1-1] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis:6378
Apr 01 08:33:05 [lettuce-nioEventLoop-4-2] WARN io.lettuce.core.protocol.ConnectionWatchdog - Cannot reconnect to [myredis:6378]: Connection refused: myredis/149.28.236.118:6378
Apr 01 08:33:05 [lettuce-nioEventLoop-4-3] WARN io.lettuce.core.protocol.ConnectionWatchdog - Cannot reconnect to [myredis:6378]: Connection refused: myredis/149.28.236.118:6378
Apr 01 08:33:05 [lettuce-nioEventLoop-4-1] WARN io.lettuce.core.protocol.ConnectionWatchdog - Cannot reconnect to [myredis:6378]: Connection refused: myredis/149.28.236.118:6378
Apr 01 08:33:09 [lettuce-eventExecutorLoop-1-2] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis:6378
Apr 01 08:33:09 [lettuce-eventExecutorLoop-1-3] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis:6378
Apr 01 08:33:09 [lettuce-eventExecutorLoop-1-1] INFO io.lettuce.core.protocol.ConnectionWatchdog - Reconnecting, last destination was myredis:6378
Apr 01 08:33:09 [lettuce-nioEventLoop-4-3] INFO io.lettuce.core.protocol.ReconnectionHandler - Reconnected to myredis:6378
Apr 01 08:33:09 [lettuce-nioEventLoop-4-2] INFO io.lettuce.core.protocol.ReconnectionHandler - Reconnected to myredis:6378
Apr 01 08:33:09 [lettuce-nioEventLoop-4-1] INFO io.lettuce.core.protocol.ReconnectionHandler - Reconnected to myredis:6378
Apr 01 08:33:09 [lettuce-nioEventLoop-4-3] WARN io.lettuce.core.pubsub.StatefulRedisPubSubConnectionImpl - SELECT failed: ERR only (P)SUBSCRIBE / (P)UNSUBSCRIBE / PING / QUIT allowed in this context

Input Code

Input Code
var uri = uri = RedisURI.builder()
	.withHost("myredis")
	.withPort(6378)
	.withSsl(true)
	.withPassword(Secrets.password("redis-password"))
	.withDatabase(10)
	.build();
var client = RedisClient.create();
var reads = client.connect(StringCodec.UTF8, uri);
var pubsub = client.connectPubSub(ByteArrayCodec.INSTANCE, uri);
var writes = client.connect(StringCodec.UTF8, uri);

Expected behavior/code

Instead of just silencing the warning (which I can do now), I would prefer to see a fix for whatever underlying bug is causing this.

Environment

  • Lettuce version(s): 5.2.2.RELEASE
  • Redis version: 5.0.5
@robertvazan robertvazan added the type: bug A general bug label Apr 1, 2020
@mp911de
Copy link
Collaborator

mp911de commented Apr 1, 2020

Thanks for the report. I'm currently puzzled how this is even possible as SELECT is sent prior to re-subscribing in StatefulRedisPubSubConnectionImpl.activated(). If you have a reliable reproduced, I'd be happy to inspect what's going on.

If you can reproduce the issue only locally, it would make sense the bit where the SELECT command is issued to figure out why this command is out of order (or what causes that issue).

In general, reconnect performs a connection handshake including authentication (AUTH, SELECT, then all SUBSCRIBE/PSUBSCRIBE commands).

@robertvazan
Copy link
Author

I just rebooted the server where redis runs and I have got the same logs, including the pattern of successful-unsuccessful-successful reconnect followed by the SELECT warning. So the issue is probably reproducible

I don't quite follow the logic in StatefulRedisConnectionImpl, but I guess that the first reconnect schedules AUTH, SELECT, and PSUBSCRIBE commands that are then moved to retry queue due to subsequent connection failure and the second reconnect then issues command pattern AUTH, SELECT, PSUBSCRIBE, AUTH, SELECT, PSUBSCRIBE and the second SELECT fails.

@mp911de
Copy link
Collaborator

mp911de commented Apr 1, 2020

That could be a likely explanation. If you’re able to reproduce the issue, can you attach a debug log so we can see what’s happening on the transport?

@robertvazan
Copy link
Author

Unfortunately the bug is not reproducible in test environment where I can enable debug logging. It apparently depends on timing of communication between the app and redis server.

@mp911de
Copy link
Collaborator

mp911de commented Apr 2, 2020

In any case, it makes sense to not re-queue/replay activation commands.

@mp911de mp911de added this to the 5.3.0 milestone Apr 2, 2020
@mp911de mp911de changed the title StatefulRedisPubSubConnectionImpl - SELECT failed Replayed activation commands may fail because of their execution sequence Apr 2, 2020
mp911de added a commit that referenced this issue Apr 2, 2020
Reformat code.
mp911de added a commit that referenced this issue Apr 2, 2020
Commands issued during connection activation (re-authentication, database re-selection, re-subscribe) are no longer replayed if the connection gets disconnected while the commands didn't complete yet.
mp911de added a commit that referenced this issue Apr 2, 2020
Reformat code.
mp911de added a commit that referenced this issue Apr 2, 2020
Commands issued during connection activation (re-authentication, database re-selection, re-subscribe) are no longer replayed if the connection gets disconnected while the commands didn't complete yet.
@mp911de
Copy link
Collaborator

mp911de commented Apr 2, 2020

That's fixed now for 5.3.0 and 6.0.

@mp911de mp911de closed this as completed Apr 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug A general bug
Projects
None yet
Development

No branches or pull requests

2 participants