When the loadmanager leader is not available, fall through regular least loaded selection by merlimat · Pull Request #3688 · apache/pulsar

merlimat · 2019-02-26T08:03:39Z

Motivation

Under certain conditions, topic failover can take ~30 seconds even during a graceful broker shutdown.

This happens because of a race condition when the load-manager leader is being shut down. Since the ephemeral z-node for the leader election is not explicitly deleted, in some cases it might linger until the old ZooKeeper session expires.
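To make the timing concrete, here is a toy in-memory model of ephemeral nodes. It is not the ZooKeeper client API: the class `EphemeralNodeSketch`, the path `/leader`, and the 30,000 ms session timeout are all assumptions chosen for illustration. It only shows that a node owned by a session that stopped heartbeating stays visible until the timeout elapses, while an explicit delete removes it right away.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of ephemeral nodes (hypothetical; NOT the ZooKeeper API).
public class EphemeralNodeSketch {
    private final long sessionTimeoutMs;
    // path -> id of the session that owns the ephemeral node
    private final Map<String, Long> nodeOwner = new HashMap<>();
    // session id -> timestamp of the last heartbeat seen from that session
    private final Map<Long, Long> lastHeartbeatMs = new HashMap<>();

    public EphemeralNodeSketch(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    public void createEphemeral(String path, long sessionId, long nowMs) {
        lastHeartbeatMs.put(sessionId, nowMs);
        nodeOwner.put(path, sessionId);
    }

    // Explicit deletion: the node disappears immediately.
    public void delete(String path) {
        nodeOwner.remove(path);
    }

    // A node is visible until it is deleted or its owning session expires.
    public boolean exists(String path, long nowMs) {
        nodeOwner.values().removeIf(
                sessionId -> nowMs - lastHeartbeatMs.get(sessionId) >= sessionTimeoutMs);
        return nodeOwner.containsKey(path);
    }

    public static void main(String[] args) {
        EphemeralNodeSketch store = new EphemeralNodeSketch(30_000L);
        store.createEphemeral("/leader", 1L, 0L);

        // The owner stopped heartbeating at t=0, yet the node is still visible at t=10s.
        System.out.println(store.exists("/leader", 10_000L)); // true

        // Deleting it explicitly on shutdown removes it without waiting for expiry.
        store.delete("/leader");
        System.out.println(store.exists("/leader", 10_000L)); // false
    }
}
```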

The error that gets printed in brokers is:

00:07:47.874 [pulsar-client-io-41-1] WARN  org.apache.pulsar.client.impl.BinaryProtoLookupService - [persistent://system/functions-prod/assignments] failed to send lookup request : org.apache.pulsar.client.api.PulsarClientException$LookupException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /loadbalance/brokers/prod-broker-1.prod-broker.default.svc.cluster.local:8080
java.util.concurrent.CompletionException: org.apache.pulsar.client.api.PulsarClientException$LookupException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /loadbalance/brokers/prod-broker-1.prod-broker.default.svc.cluster.local:8080
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_181]
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_181]
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647) ~[?:1.8.0_181]
	at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632) ~[?:1.8.0_181]
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) [?:1.8.0_181]
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) [?:1.8.0_181]
	at org.apache.pulsar.client.impl.ClientCnx.handleLookupResponse(ClientCnx.java:401) [org.apache.pulsar-pulsar-client-original-2.3.0-streamlio-14.jar:2.3.0-streamlio-14]
	at org.apache.pulsar.common.api.PulsarDecoder.channelRead(PulsarDecoder.java:118) [org.apache.pulsar-pulsar-common-2.3.0-streamlio-14.jar:2.3.0-streamlio-14]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:799) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:433) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:330) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

The sequence of events leading to the error is:

1. The broker that is the load-manager leader performs a graceful shutdown.
2. Clients reconnect almost immediately to a new broker.
3. This broker still thinks the "leader" is the old broker and redirects lookup requests to it.
4. When the leader z-node finally gets cleared, the lookups are unblocked and everything goes back to normal.

The solution here is twofold:

1. Proactively clean up the leader z-node on shutdown, to avoid waiting for the session timeout in case the session doesn't get cleaned up properly.
2. Double-check that the load-manager leader is active before trying to forward lookup requests to it.
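The second point can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the code changed by this PR: `LeaderFallbackSketch`, `decide`, `Decision`, and the broker names are hypothetical, and "active" is modeled simply as membership in a map of registered brokers and their loads.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of "verify the leader is active, else fall through".
public class LeaderFallbackSketch {

    // Result of a lookup decision: the chosen broker and whether we redirect to the leader.
    public static final class Decision {
        public final String broker;
        public final boolean redirectedToLeader;

        Decision(String broker, boolean redirectedToLeader) {
            this.broker = broker;
            this.redirectedToLeader = redirectedToLeader;
        }
    }

    // activeBrokerLoads maps each currently registered broker to its load (lower is better).
    public static Decision decide(Optional<String> leader, Map<String, Double> activeBrokerLoads) {
        // Forward to the leader only if it is still among the active brokers.
        if (leader.isPresent() && activeBrokerLoads.containsKey(leader.get())) {
            return new Decision(leader.get(), true);
        }
        // Leader unknown or stale: fall through to regular least-loaded selection.
        String leastLoaded = activeBrokerLoads.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalStateException("no active brokers"));
        return new Decision(leastLoaded, false);
    }

    public static void main(String[] args) {
        Map<String, Double> loads = new HashMap<>();
        loads.put("broker-2", 0.7);
        loads.put("broker-3", 0.2);

        // "broker-1" is still recorded as leader but is no longer registered.
        Decision d = decide(Optional.of("broker-1"), loads);
        System.out.println(d.broker + " redirected=" + d.redirectedToLeader);
        // prints "broker-3 redirected=false"
    }
}
```

With this check in place, a stale leader record no longer blocks lookups; the receiving broker selects a target itself until a new leader is elected.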

merlimat · 2019-02-26T17:05:38Z

run java8 tests
run integration tests

merlimat · 2019-02-27T18:30:04Z

run java8 tests
run integration tests

jiazhai · 2019-02-28T14:57:21Z

run java8 tests

merlimat · 2019-02-28T16:37:22Z

run java8 tests

merlimat · 2019-02-28T20:12:01Z

run java8 tests

merlimat · 2019-04-01T17:39:23Z

Merged in 2.3.1 at 5746db9

When the loadmanager leader is not available, fall through regular least loaded selection

ee296f8

merlimat added the type/bug label (The PR fixed a bug or issue reported a bug) Feb 26, 2019

merlimat added this to the 2.3.1 milestone Feb 26, 2019

merlimat self-assigned this Feb 26, 2019

merlimat requested review from ivankelly, jai1, jerrypeng, rdhabalia, sijie and srkukarni February 26, 2019 08:03

rdhabalia approved these changes Feb 26, 2019

jiazhai approved these changes Feb 26, 2019

Handle exceptions coming from mock zk in tests

8c8a222

merlimat merged commit ccfb949 into apache:master Mar 1, 2019

merlimat merged 2 commits into apache:master from merlimat:master