Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback to RESP2 hides potential authentication configuration problems #2313

Closed
liyuntao opened this issue Feb 1, 2023 · 3 comments
Closed
Assignees
Labels
type: bug A general bug
Milestone

Comments

@liyuntao
Copy link

liyuntao commented Feb 1, 2023

Bug Report / Help wanted

We have a simple 6-node redis5 cluster: no SSL, no password, bare-metal deployment, can be connected via machine IP: port directly.

And the application(running in Kubernetes environment) used to be normal until I upgrade the lettuce-core to 6.2.X.

Seems the application is stuck at connecting redis-cluster when starting up, until it is killed by Kubernetes liveness probe.

Below are the logs:

application logs under 6.2.X

After the last error='ERR unknown command HELLO output, the timeout error stack was thrown after about 1 minute.

2023-02-01 16:32:24,741 DEBUG io.lettuce.core.protocol.CommandHandler [channel=0x3e3653f5, /10.233.75.132:55582 -> /192.168.99.171:7004, epid=0x5, chid=0x6] Completing command LatencyMeteredCommand [type=HELLO, output=GenericMapOutput [output=null, error='ERR unknown command `HELLO`, with args beginning with: `3`, `AUTH`, `default`, ``, '], commandType=io.lettuce.core.protocol.AsyncCommand]
2023-02-01 16:32:24,741 DEBUG io.lettuce.core.protocol.RedisStateMachine Decode done, empty stack: true
2023-02-01 16:32:24,741 DEBUG io.lettuce.core.protocol.CommandHandler [channel=0x892d4fa0, /10.233.75.132:42120 -> /192.168.99.171:7005, epid=0x6, chid=0x4] Completing command LatencyMeteredCommand [type=HELLO, output=GenericMapOutput [output=null, error='ERR unknown command `HELLO`, with args beginning with: `3`, `AUTH`, `default`, ``, '], commandType=io.lettuce.core.protocol.AsyncCommand]

2023-02-01 16:33:24,606 DEBUG io.lettuce.core.AbstractRedisClient Connecting to Redis at 192.168.99.171/<unresolved>:7000, initialization: 192.168.99.171/<unresolved>:7000
io.lettuce.core.RedisCommandTimeoutException: Connection initialization timed out after 1 minute(s)
	at io.lettuce.core.protocol.RedisHandshakeHandler.lambda$channelRegistered$0(RedisHandshakeHandler.java:67)
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
	at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:1589)
2023-02-01 16:33:24,606 DEBUG io.lettuce.core.AbstractRedisClient Connecting to Redis at 192.168.99.171/<unresolved>:7001, initialization: 192.168.99.171/<unresolved>:7001
io.lettuce.core.RedisCommandTimeoutException: Connection initialization timed out after 1 minute(s)
	at io.lettuce.core.protocol.RedisHandshakeHandler.lambda$channelRegistered$0(RedisHandshakeHandler.java:67)
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
	at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:1589)
2023-02-01 16:33:24,608 DEBUG io.lettuce.core.RedisChannelHandler closeAsync()
@liyuntao

This comment was marked as resolved.

@liyuntao
Copy link
Author

liyuntao commented Feb 14, 2023

Hi @mp911de I feel like I've found the problem. There may exist some reason that hides the actual exception in 6.2.X.

Since we were using redis5, I tried to change the client option to protoVer=RESP2 explicitly, and the error message becomes:

Suppressed: io.lettuce.core.RedisConnectionException: Unable to connect to [192.168.99.171/<unresolved>:7005]: Password must not be empty
at io.lettuce.core.cluster.topology.DefaultClusterTopologyRefresh.lambda$openConnections$12(DefaultClusterTopologyRefresh.java:347)
at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)

and then I tried to pass an empty string to RedisURI.withPassword(), same error again.

Finally, the application can start nicely only when null value being explicitly passed to builder method. e.g.

uriBuilder.withPassword((char[]) null);

So a redis5 cluster without a password and such code below can reproduce the problem.

RedisURI.Builder uriBuilder = RedisURI.Builder.redis(part[0], Integer.parseInt(part[1]));
if (redisPwd != null) {
    uriBuilder.withPassword((CharSequence) redisPwd);
}

Summary:

  • missing doc description for a non-password scenario here
  • a slightly misleading description line-here
  • hiding error problem(logic path) when Auto-discovery + no password protected redis5

@mp911de mp911de added the type: bug A general bug label Feb 14, 2023
@mp911de mp911de changed the title cannot connect to Redis5 cluster under 6.2.X Fallback to RESP2 hides potential authentication configuration problems Feb 14, 2023
@mp911de
Copy link
Collaborator

mp911de commented Feb 14, 2023

You're right, this is a problem on our side where we do not properly surface configuration issues with the password. Since Redis passwords can be empty (although not recommended), we should accept empty passwords.

@mp911de mp911de added this to the 6.2.3.RELEASE milestone Feb 14, 2023
@mp911de mp911de self-assigned this Feb 14, 2023
mp911de added a commit that referenced this issue Feb 14, 2023
…rade #2313

We now correctly propagate exceptions that can happen from RESP2 handshake initiator. Previously, these exceptions didn't bubble up and went unnoticed resulting in a handshake timeout.
@mp911de mp911de closed this as completed Feb 14, 2023
mp911de added a commit that referenced this issue Jul 25, 2023
…rade #2313

We now correctly propagate exceptions that can happen from RESP2 handshake initiator. Previously, these exceptions didn't bubble up and went unnoticed resulting in a handshake timeout.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug A general bug
Projects
None yet
Development

No branches or pull requests

2 participants