Connection leak when there is a master/slave node fail over #342

Closed
long-xuan-nguyen opened this issue Aug 25, 2016 · 4 comments
Labels: for: stackoverflow (A question that is better suited to stackoverflow.com)

long-xuan-nguyen commented Aug 25, 2016

I encountered an issue when there is a change in topology

Setup: 20 master nodes, each with 2 slaves, running 4.1.1.Final. ReadFrom is set to read from slaves.

Sequence of events:

  • M1 went down; redis process got killed completely
  • S1 got promoted to be new master
  • S2 now starts syncing from S1, the new master

During the time window when S2 is syncing from S1, many connections are opened from a host to S2, eventually causing Redis to hit its client count limit and reject other connections. I see a bunch of IOException "Connection reset by peer" errors in the log.

Looking into the code (https://github.com/mp911de/lettuce/blob/4.2.x/src/main/java/com/lambdaworks/redis/cluster/PooledClusterConnectionProvider.java#L520), here is what I think is happening:

  • Cluster topology refresh happens
  • Requests for reading against S2 come in; a new connection is initialized
  • Before this new connection is passed back to the LoadingCache, readonly() is invoked on the sync interface.
  • This command fails with the error "LOADING Redis is loading the dataset in memory"
  • The exception is propagated back to the caller, resulting in a failure. The connection created in the second step is now orphaned and not accessible through the cache.
  • The loading process takes a few minutes and, with enough TPS, S2 becomes completely saturated with connections.

Is the fix here to close the connection when readonly() fails?
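
If closing on failure is the direction, a minimal sketch of the idea could look like the following. This is not the actual PooledClusterConnectionProvider code; the helper name and the Supplier-based shape are made up for illustration, only the close-on-failure pattern matters.

    import java.util.function.Supplier;

    import com.lambdaworks.redis.RedisException;
    import com.lambdaworks.redis.api.StatefulRedisConnection;

    class ReadOnlyConnectionGuard {

        // Hypothetical helper: if the READONLY handshake fails (e.g. with a LOADING
        // error while the replica is still syncing), close the freshly created
        // connection instead of letting it leak outside the connection cache.
        static <K, V> StatefulRedisConnection<K, V> connectReadOnly(
                Supplier<StatefulRedisConnection<K, V>> connector) {

            StatefulRedisConnection<K, V> connection = connector.get();
            try {
                connection.sync().readOnly();
                return connection;
            } catch (RedisException e) {
                connection.close(); // without this, the connection is orphaned
                throw e;
            }
        }
    }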

It would be nice if the Redis "cluster nodes" command provided a flag for this state (while Redis is loading the dataset into RAM) so we could exclude the node from the topology. An alternative is to somehow trigger an event when this happens (the first time a Redis LOADING error is seen on a connection) and have the topology refresher exclude the node from the topology. These are just wish-list items, though. I would really appreciate feedback on the bug and the proposed fix.

Note that when S2 is saturated, we're also seeing very weird behavior in another set of servers that purely perform writes and should never establish any connection to slaves except for topology refresh. These hosts have ConnectionWatchdog throwing exceptions like NullPointerException, cannot initialize channel ... Unfortunately, we don't have debug/trace logging enabled, so we can't really debug more than that. I am assuming this issue is related to #278? After a while, both the writer hosts and the reader hosts got into a state where this exception is seen everywhere:

21 Aug 2016 08:59:17,579 [WARN]  (lettuce-nioEventLoop-3-152) com.lambdaworks.redis.protocol.CommandHandler: [/10.91.86.192:49672 -> /10.162.234.169:6379] Unexpected exception during request: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 5133828096, max: 5145362432)
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 5133828096, max: 5145362432)
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:613)
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:567)
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:699)
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:688)
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:237)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:213)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:141)
    at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:262)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:170)
    at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:131)
    at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:73)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:111)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:571)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:512)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:426)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:398)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:805)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:145)
    at java.lang.Thread.run(Thread.java:745)

Thanks,

Make sure that:

  • You have read the contribution guidelines.

  • You specify the lettuce version and environment so it's obvious which version is affected

  • You provide a reproducible test case (either descriptive or as a JUnit test) if it's a bug or the expected behavior differs from the actual behavior.

mp911de added the for: stackoverflow label (A question that is better suited to stackoverflow.com) on Aug 25, 2016

mp911de commented Aug 25, 2016

You're tackling a few issues here. Let's untangle them and see what we have:

  1. Clients exceed server capacity: If a node goes down and the remaining nodes are not able to handle the load, then you might have a capacity issue. If you don't strictly require reading from slaves only, you can either switch to ReadFrom.NEAREST or provide your own ReadFrom implementation (see the sketch after this list).
  2. Orphaned connections: That's a bug that requires a fix; I created Close connections in PooledClusterConnectionProvider on connection failures #343 to address it.
  3. Flag for a slave that is synchronizing: I'm not aware of any flag that could be used for that. The output of CLUSTER NODES is limited to nodeId/endpoint/ping/health/slot details and does not provide any details about lag. Synchronization is an ongoing async process, so it's not possible to determine with high accuracy why a node is syncing and what should be considered a faulty state.
  4. ConnectionWatchdog throws NullPointerException: That's a bug that was fixed with lettuce 4.2.2.Final with ConnectionWatchdog should be the last handler #335 and CommandHandler notifications called out of order #315. Sometimes the reconnect is faster than the disconnect notification, which leads to the mentioned behavior. Please upgrade to 4.2.2.Final and create a ticket if the problem persists.
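
For 1., a slave-preferred ReadFrom could be sketched roughly like this, falling back to all known nodes when no slave is available; exact details of the 4.x ReadFrom SPI may differ slightly:

    import java.util.List;
    import java.util.stream.Collectors;

    import com.lambdaworks.redis.ReadFrom;
    import com.lambdaworks.redis.models.role.RedisInstance;
    import com.lambdaworks.redis.models.role.RedisNodeDescription;

    public class SlavePreferredReadFrom extends ReadFrom {

        @Override
        public List<RedisNodeDescription> select(Nodes nodes) {

            // Prefer slaves for reads, but fall back to every known node of the slot
            // (including the master) when no slave is currently available.
            List<RedisNodeDescription> slaves = nodes.getNodes().stream()
                    .filter(node -> node.getRole() == RedisInstance.Role.SLAVE)
                    .collect(Collectors.toList());

            return slaves.isEmpty() ? nodes.getNodes() : slaves;
        }
    }

The implementation is then activated via setReadFrom(...) on the cluster connection.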

long-xuan-nguyen (Author) commented

  1. We already have our own ReadFrom implementation, which basically prefers slaves over masters.
  2. Thanks for the fix. Is there any other place where this could potentially be an issue?
  3. In the "info" command output, you can tell whether the node is doing a full sync by looking at the flag "master_sync_in_progress". If only this flag were exposed as part of "cluster nodes", we would be able to exclude the node. Alternatively, we could call "info" on each node, but that seems excessive. It would be nice to have Lettuce detect the "LOADING" error the first time it occurs, have the topology react to this event, and automatically exclude the node from further requests until loading is done. This could be achieved with a background process that detects when the node is done syncing by checking this flag from the "info" command every 10 seconds or so (a rough sketch follows this list).
  4. I thought so too. Upgraded our library to 4.2.2 and will report back if there is any issue
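
For 3., a rough sketch of such a probe, using a plain standalone connection per node; the class name and the polling around it are just illustration, not an existing Lettuce facility:

    import com.lambdaworks.redis.RedisClient;
    import com.lambdaworks.redis.RedisURI;
    import com.lambdaworks.redis.api.StatefulRedisConnection;

    public class ReplicaSyncProbe {

        // Returns true if the node at nodeUri reports a full sync in progress,
        // by parsing "master_sync_in_progress:1" out of INFO replication.
        public static boolean isSyncing(RedisClient client, RedisURI nodeUri) {
            StatefulRedisConnection<String, String> connection = client.connect(nodeUri);
            try {
                String replication = connection.sync().info("replication");
                return replication.contains("master_sync_in_progress:1");
            } finally {
                connection.close();
            }
        }
    }

A scheduled task could call this for each node in the current topology every 10 seconds or so and feed the result into the read-candidate selection.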

Thanks for your quick turnaround on this


mp911de commented Aug 25, 2016

1+3. I think these things are related to each other. Removing the node from the topology might fit your case, but not others. Adding further configuration possibilities comes at a cost in complexity, and I'm not convinced yet that this is a common requirement. RedisClusterClient publishes a topology changed event. You can listen to these events and query the nodes yourself. Using the ReadFrom API, you can implement a selector that ignores these nodes in candidate selection (see the sketch below).
2. I need to check MasterSlave to verify the bug does not occur there as well, thanks for the hint.
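
For 1+3, the listening part could be sketched roughly like this. The event class name (ClusterTopologyChangedEvent) is quoted from memory, so it is matched by simple name here; with the confirmed class you would filter by instanceof instead:

    import com.lambdaworks.redis.cluster.RedisClusterClient;
    import com.lambdaworks.redis.resource.ClientResources;
    import com.lambdaworks.redis.resource.DefaultClientResources;

    public class TopologyChangeListener {

        public static void main(String[] args) {

            ClientResources resources = DefaultClientResources.create();
            RedisClusterClient clusterClient = RedisClusterClient.create(resources, "redis://node1:6379");

            // EventBus.get() exposes an rx.Observable<Event> in 4.x. The event type is
            // matched by simple name here because the exact class is an assumption.
            resources.eventBus().get()
                    .filter(event -> "ClusterTopologyChangedEvent".equals(event.getClass().getSimpleName()))
                    .subscribe(event -> {
                        // Re-check the nodes yourself (e.g. with the INFO "replication"
                        // probe above) and update the set of nodes your custom ReadFrom
                        // excludes from candidate selection.
                    });
        }
    }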


mp911de commented Sep 3, 2016

Is there anything I can assist you with within the scope of this ticket, or can I close it?
