Redisson Client not ignoring defunct nodes in Redis Cluster #5754
Comments
EVENTUAL_FAIL state handling is missed. I'll add it.
Can you share Redisson logs?
Are the other states like …
yes
I need the logs to make sure that the issue is caused by the state …
```
{ "timestamp": "2024-04-02T16:29:59,997", "logger": "org.redisson.connection.ServiceManager", "level": "ERROR", "threadID": "104", "threadName": "redisson-netty-2-16", "message": "Unable to resolve rediss://prri0mzh6nsauw6-0015-002.prri0mzh6nsauw6.wcbpsb.euw1.cache.amazonaws.com:6379", "exception": " io.netty.resolver.dns.DnsResolveContext$SearchDomainUnknownHostException: Failed to resolve 'prri0mzh6nsauw6-0015-002.prri0mzh6nsauw6.wcbpsb.euw1.cache.amazonaws.com' [A(1)] and search domain query for configured domains failed as well: [eu-west-1.compute.internal]\n\tat io.netty.resolver.dns.DnsResolveContext.finishResolve(DnsResolveContext.java:1097)\n\tat io.netty.resolver.dns.DnsResolveContext.tryToFinishResolve(DnsResolveContext.java:1044)\n\tat io.netty.resolver.dns.DnsResolveContext.query(DnsResolveContext.java:432)\n\tat io.netty.resolver.dns.DnsResolveContext.onResponse(DnsResolveContext.java:662)\n\tat io.netty.resolver.dns.DnsResolveContext.access$500(DnsResolveContext.java:66)\n\tat io.netty.resolver.dns.DnsResolveContext$2.operationComplete(DnsResolveContext.java:489)\n\tat io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590)\n\tat io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583)\n\tat io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559)\n\tat io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492)\n\tat io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636)\n\tat io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625)\n\tat io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:105)\n\tat io.netty.resolver.dns.DnsQueryContext.trySuccess(DnsQueryContext.java:317)\n\tat io.netty.resolver.dns.DnsQueryContext.finishSuccess(DnsQueryContext.java:309)\n\tat io.netty.resolver.dns.DnsNameResolver$DnsResponseHandler.channelRead(DnsNameResolver.java:1400)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)\n\tat io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)\n\tat io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)\n\tat io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:97)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)\n\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\n" }
```
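For context, the exception above is a hostname-resolution failure rather than a Redis error: the address of a replaced node no longer exists in DNS. A minimal JDK-only sketch (the hostnames are placeholders, not from this cluster) shows the analogous lookup failing:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsCheck {
    // Returns true if the hostname resolves, false on UnknownHostException,
    // which is the JDK counterpart of netty's SearchDomainUnknownHostException.
    static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "localhost" always resolves; the ".invalid" TLD is reserved
        // (RFC 6761) and is guaranteed never to resolve.
        System.out.println(resolves("localhost"));            // true
        System.out.println(resolves("no-such-host.invalid")); // false
    }
}
```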
Can you also help us with how to add configuration for the other failed states?
It's related to DNS and not node status. Here is the solution: #5726 (comment)
The Redis team suggested using the node filter, which will ignore failed nodes. And the issue mentioned above can happen because of node status. Can you help us with the config for it? We are afraid that the config change above could cause some other performance issues on our end.
We also have logs with the message: Unable to parse cluster nodes state got from: 10.0.150.252/10.0.150.252:6379
Our current theory is that the parsePartitions function failed since the failed nodes do not exist in the DNS server. Is there a way not to resolve the addresses of failed nodes?
Can you share the full stack trace?
Fixed. Thanks for the report.
As per the ElastiCache best practices (https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/BestPractices.Clients-lettuce.html), we are advised to use a nodeFilter that ignores failed/defunct nodes while connecting to the Redis Cluster, as follows:
```java
final ClusterClientOptions clusterClientOptions =
    ClusterClientOptions.builder()
        ... // other options
        .nodeFilter(it ->
            ! (it.is(RedisClusterNode.NodeFlag.FAIL)
                || it.is(RedisClusterNode.NodeFlag.EVENTUAL_FAIL)
                || it.is(RedisClusterNode.NodeFlag.HANDSHAKE)
                || it.is(RedisClusterNode.NodeFlag.NOADDR)))
        .validateClusterNodeMembership(false)
        .build();
redisClusterClient.setOptions(clusterClientOptions);
```
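To illustrate what such a node filter accomplishes, here is a JDK-only sketch (not Redisson or Lettuce code; the class and sample data are made up) that filters raw `CLUSTER NODES` lines by their flags field, skipping the same states the Lettuce predicate above excludes, so addresses of failed nodes are never handed to a DNS resolver:

```java
import java.util.*;
import java.util.stream.*;

public class ClusterNodesFilter {
    // Flags that mark a node to skip, mirroring FAIL, EVENTUAL_FAIL
    // ("fail?" / PFAIL in CLUSTER NODES output), HANDSHAKE, and NOADDR.
    static final Set<String> SKIP_FLAGS = Set.of("fail", "fail?", "handshake", "noaddr");

    // Each CLUSTER NODES line starts: <id> <addr> <flags,comma,separated> ...
    // Returns the addresses of nodes carrying none of the skip flags.
    static List<String> healthyAddresses(String clusterNodesOutput) {
        return Arrays.stream(clusterNodesOutput.split("\n"))
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .map(line -> line.split(" "))
                .filter(parts -> {
                    Set<String> flags = Set.of(parts[2].split(","));
                    return Collections.disjoint(flags, SKIP_FLAGS);
                })
                .map(parts -> parts[1])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String sample =
            "a1 10.0.1.1:6379@16379 master - 0 0 1 connected 0-5460\n" +
            "b2 10.0.1.2:6379@16379 slave,fail a1 0 0 1 connected\n" +
            "c3 :0@0 master,noaddr - 0 0 2 connected\n";
        // Only the healthy master survives the filter.
        System.out.println(healthyAddresses(sample)); // [10.0.1.1:6379@16379]
    }
}
```

The same idea applies regardless of client library: filter on node flags before address resolution, so hostnames of replaced nodes (which ElastiCache removes from DNS) are never looked up.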