
Fixes outdated topology when no new leader is assigned #5979

Merged: 3 commits into develop from 2501-gateway-topology-fix on Dec 14, 2020

Conversation

@npepinpe (Member) commented Dec 7, 2020

Description

This PR fixes a bug in the gateway topology. The topology manager keeps track of who is leader and who is follower for each partition. This information is gossiped by all nodes in the cluster. Normally, when the node that was leader for a partition (say partition 1) announces that it is now a follower, another node will announce that it is the new leader. There is an edge case, however, where no other node announces that it is the leader. In that case we end up with a topology in which one node is both leader and follower for the same partition, which means we report the wrong topology and the gateway keeps trying to route requests to that node. The case where no new node becomes leader can happen due to network partitioning, for example, and is an expected situation we should be able to tolerate.

This PR adds more test coverage and fixes the issue by removing the old partition leader if, when adding a new follower, they have the same ID.
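To illustrate the idea (not the actual Zeebe implementation; the class and method names below are hypothetical), a minimal sketch of the follower-update handling could look like this:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the fix described above; not the actual Zeebe gateway
// classes. The key point is that a follower update for a node that is still
// recorded as the partition leader must also clear the stale leader entry.
final class PartitionTopologySketch {

  // partitionId -> nodeId of the node currently believed to be leader
  private final Map<Integer, Integer> leaders = new HashMap<>();
  // partitionId -> nodeIds currently believed to be followers
  private final Map<Integer, Set<Integer>> followers = new HashMap<>();

  void onFollowerUpdate(final int partitionId, final int nodeId) {
    // Without this check, a node that steps down without a successor stays
    // registered as leader and the gateway keeps routing requests to it.
    final Integer currentLeader = leaders.get(partitionId);
    if (currentLeader != null && currentLeader.intValue() == nodeId) {
      leaders.remove(partitionId);
    }
    followers.computeIfAbsent(partitionId, id -> new HashSet<>()).add(nodeId);
  }

  void onLeaderUpdate(final int partitionId, final int nodeId) {
    // The symmetric case: a node that becomes leader is no longer a follower.
    final Set<Integer> partitionFollowers = followers.get(partitionId);
    if (partitionFollowers != null) {
      partitionFollowers.remove(nodeId);
    }
    leaders.put(partitionId, nodeId);
  }
}
```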

Related issues

closes #2501

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatible with previous versions
  • If it fixes a bug, then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/0.25) to the PR; in case that fails, you need to create the backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criteria of the issue
  • New tests are written to ensure backwards compatibility with future versions
  • The behavior is tested manually
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement

@npepinpe (Member, Author) commented Dec 7, 2020

One note: I actually think the fix is incomplete. We might also need to propagate the terms on follower updates to make sure that we aren't removing a leader with a newer term than the follower event we just received.

To check with Deepthi or Miguel.

@npepinpe npepinpe marked this pull request as ready for review December 8, 2020 08:02
@deepthidevaki (Contributor) left a comment

Thanks. Looks good. Just a small comment. 👍

@deepthidevaki (Contributor) commented

> One note: I actually think the fix is incomplete. We might also need to propagate the terms on follower updates to make sure that we aren't removing a leader with a newer term than the follower event we just received.
>
> To check with Deepthi or Miguel.

Just for future reference, as we already discussed: updates from the same node are guaranteed to be delivered in order by our gossip. Hence we don't have to worry about receiving an update from a broker saying it is the follower in a previous term after the update saying it is the leader in a newer term.

@npepinpe (Member, Author) commented Dec 9, 2020

I tightened the conditions a little and fixed some things in TopologyAssert.

@@ -30,7 +32,7 @@ public final TopologyAssert isComplete(final int clusterSize, final int partitio
   final List<BrokerInfo> brokers = actual.getBrokers();

   if (brokers.size() != clusterSize) {
-    failWithMessage("Expected broker count to be <%s> but was <%s>", clusterSize, brokers.size());
+    throw failure("Expected broker count to be <%s> but was <%s>", clusterSize, brokers.size());
@npepinpe (Member, Author) commented on the change:

The javadoc of failWithMessage actually recommends using throw failure(...) instead, as the compiler can then see that we are throwing an error (whereas with failWithMessage it assumes execution will continue).
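
As a minimal illustration of the pattern (a hypothetical custom assertion, not the real TopologyAssert; it assumes a recent AssertJ version where AbstractAssert#failure is available):

```java
import java.util.List;
import org.assertj.core.api.AbstractAssert;

// Hypothetical custom assertion illustrating the failure(...) pattern.
public final class BrokerListAssert extends AbstractAssert<BrokerListAssert, List<String>> {

  private BrokerListAssert(final List<String> actual) {
    super(actual, BrokerListAssert.class);
  }

  public static BrokerListAssert assertThat(final List<String> brokers) {
    return new BrokerListAssert(brokers);
  }

  public BrokerListAssert hasBrokerCount(final int expected) {
    isNotNull();
    if (actual.size() != expected) {
      // failWithMessage(...) also throws, but its void signature hides that
      // from the compiler; throwing the AssertionError returned by failure(...)
      // makes it obvious that execution never continues past a failed check.
      throw failure("Expected broker count to be <%s> but was <%s>", expected, actual.size());
    }
    return this;
  }
}
```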

@deepthidevaki (Contributor) left a comment

👍 Thanks.

@npepinpe (Member, Author) commented Dec 9, 2020

bors r+

zeebe-bors bot added a commit that referenced this pull request Dec 9, 2020
5979: Fixes outdated topology when no new leader is assigned r=npepinpe a=npepinpe

Co-authored-by: Nicolas Pépin-Perreault <nicolas.pepin-perreault@camunda.com>
zeebe-bors bot commented Dec 9, 2020

Build failed:

@npepinpe (Member, Author) commented Dec 10, 2020

Looking at the failed container logs, it looks like we broke backwards compatibility. We were not catching this in the rolling update test before because we did not check in between that a leader was elected, just that the node was removed from/added to the topology. In this instance, it fails (sometimes - not sure why) because node 0 is up (and updated), node 1 is down, and node 2 is up (but outdated). Node 0 then prints a Kryo error saying it cannot deserialize something, and node 2 gets connection timeouts from node 0 while trying to get elected (I can see it switch to candidate, but it can never become leader).

However, I don't get why the test is flaky... I would expect that, if we broke backwards compatibility with serialization, this would always fail. Maybe it depends on who was previously leader? If node 2 was already leader, maybe it doesn't matter? idk

Logs from node 0:
2020-12-09 17:32:15.847 [Broker-0-TopologyManager] [Broker-0-zb-actors-0] DEBUG io.zeebe.broker.clustering - Received REACHABILITY_CHANGED from member 1, was not handled.

2020-12-09 17:32:19.580 [] [netty-messaging-event-epoll-client-7] ERROR io.atomix.cluster.messaging.impl.NettyMessagingService - Exception inside channel handling pipeline

com.esotericsoftware.kryo.KryoException: Unable to find class: �����	"

	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:804) ~[kryo-4.0.2.jar:?]

	at io.atomix.utils.serializer.NamespaceImpl.lambda$deserialize$2(NamespaceImpl.java:209) ~[atomix-utils-0.25.0.jar:0.25.0]

	at com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.run(KryoPoolQueueImpl.java:58) ~[kryo-4.0.2.jar:?]

	at io.atomix.utils.serializer.NamespaceImpl.lambda$deserialize$3(NamespaceImpl.java:206) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.KryoIOPool.run(KryoIOPool.java:47) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.NamespaceImpl.deserialize(NamespaceImpl.java:203) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.NamespaceImpl.deserialize(NamespaceImpl.java:167) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.FallbackNamespace.deserialize(FallbackNamespace.java:100) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.Serializer$1.decode(Serializer.java:75) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.DefaultClusterCommunicationService$InternalMessageResponder.apply(DefaultClusterCommunicationService.java:293) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.DefaultClusterCommunicationService$InternalMessageResponder.apply(DefaultClusterCommunicationService.java:276) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$registerHandler$8(NettyMessagingService.java:218) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.AbstractServerConnection.dispatch(AbstractServerConnection.java:38) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.AbstractServerConnection.dispatch(AbstractServerConnection.java:25) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.NettyMessagingService$MessageDispatcher.channelRead0(NettyMessagingService.java:836) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [netty-codec-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [netty-codec-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.53.Final.jar:4.1.53.Final]

	at java.lang.Thread.run(Unknown Source) [?:?]

Caused by: java.lang.ClassNotFoundException: �����	"

	at jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source) ~[?:?]

	at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source) ~[?:?]

	at java.lang.ClassLoader.loadClass(Unknown Source) ~[?:?]

	at java.lang.Class.forName0(Native Method) ~[?:?]

	at java.lang.Class.forName(Unknown Source) ~[?:?]

	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154) ~[kryo-4.0.2.jar:?]

	... 36 more

2020-12-09 17:32:19.829 [] [netty-messaging-event-epoll-client-4] ERROR io.atomix.cluster.messaging.impl.NettyMessagingService - Exception inside channel handling pipeline

com.esotericsoftware.kryo.KryoException: Unable to find class: ��+�"+���

	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:804) ~[kryo-4.0.2.jar:?]

	at io.atomix.utils.serializer.NamespaceImpl.lambda$deserialize$2(NamespaceImpl.java:209) ~[atomix-utils-0.25.0.jar:0.25.0]

	at com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.run(KryoPoolQueueImpl.java:58) ~[kryo-4.0.2.jar:?]

	at io.atomix.utils.serializer.NamespaceImpl.lambda$deserialize$3(NamespaceImpl.java:206) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.KryoIOPool.run(KryoIOPool.java:47) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.NamespaceImpl.deserialize(NamespaceImpl.java:203) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.NamespaceImpl.deserialize(NamespaceImpl.java:167) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.FallbackNamespace.deserialize(FallbackNamespace.java:100) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.Serializer$1.decode(Serializer.java:75) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.DefaultClusterCommunicationService$InternalMessageResponder.apply(DefaultClusterCommunicationService.java:293) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.DefaultClusterCommunicationService$InternalMessageResponder.apply(DefaultClusterCommunicationService.java:276) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$registerHandler$8(NettyMessagingService.java:218) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.AbstractServerConnection.dispatch(AbstractServerConnection.java:38) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.AbstractServerConnection.dispatch(AbstractServerConnection.java:25) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.NettyMessagingService$MessageDispatcher.channelRead0(NettyMessagingService.java:836) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [netty-codec-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [netty-codec-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.53.Final.jar:4.1.53.Final]

	at java.lang.Thread.run(Unknown Source) [?:?]

Caused by: java.lang.ClassNotFoundException: ��+�"+���

	at jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source) ~[?:?]

	at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source) ~[?:?]

	at java.lang.ClassLoader.loadClass(Unknown Source) ~[?:?]

	at java.lang.Class.forName0(Native Method) ~[?:?]

	at java.lang.Class.forName(Unknown Source) ~[?:?]

	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154) ~[kryo-4.0.2.jar:?]

	... 36 more

2020-12-09 17:32:20.078 [] [netty-messaging-event-epoll-client-0] ERROR io.atomix.cluster.messaging.impl.NettyMessagingService - Exception inside channel handling pipeline

com.esotericsoftware.kryo.KryoException: Unable to find class: ��+�"+���

	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:160) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:693) ~[kryo-4.0.2.jar:?]

	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:804) ~[kryo-4.0.2.jar:?]

	at io.atomix.utils.serializer.NamespaceImpl.lambda$deserialize$2(NamespaceImpl.java:209) ~[atomix-utils-0.25.0.jar:0.25.0]

	at com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.run(KryoPoolQueueImpl.java:58) ~[kryo-4.0.2.jar:?]

	at io.atomix.utils.serializer.NamespaceImpl.lambda$deserialize$3(NamespaceImpl.java:206) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.KryoIOPool.run(KryoIOPool.java:47) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.NamespaceImpl.deserialize(NamespaceImpl.java:203) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.NamespaceImpl.deserialize(NamespaceImpl.java:167) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.FallbackNamespace.deserialize(FallbackNamespace.java:100) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.utils.serializer.Serializer$1.decode(Serializer.java:75) ~[atomix-utils-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.DefaultClusterCommunicationService$InternalMessageResponder.apply(DefaultClusterCommunicationService.java:293) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.DefaultClusterCommunicationService$InternalMessageResponder.apply(DefaultClusterCommunicationService.java:276) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.NettyMessagingService.lambda$registerHandler$8(NettyMessagingService.java:218) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.AbstractServerConnection.dispatch(AbstractServerConnection.java:38) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.AbstractServerConnection.dispatch(AbstractServerConnection.java:25) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.atomix.cluster.messaging.impl.NettyMessagingService$MessageDispatcher.channelRead0(NettyMessagingService.java:836) ~[atomix-cluster-0.25.0.jar:0.25.0]

	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [netty-codec-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [netty-codec-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-transport-native-epoll-4.1.53.Final-linux-x86_64.jar:4.1.53.Final]

	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.53.Final.jar:4.1.53.Final]

	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.53.Final.jar:4.1.53.Final]

	at java.lang.Thread.run(Unknown Source) [?:?]

Logs from node 2:
2020-12-09 17:32:13.257 [GatewayTopologyManager] [Broker-2-zb-actors-0] DEBUG io.zeebe.gateway - Received metadata change from Broker 1, partitions {1=LEADER}, terms {1=1} and health {1=HEALTHY}.

2020-12-09 17:32:15.851 [GatewayTopologyManager] [Broker-2-zb-actors-0] DEBUG io.zeebe.gateway - Received REACHABILITY_CHANGED for broker 1, do nothing.

2020-12-09 17:32:15.851 [Broker-2-TopologyManager] [Broker-2-zb-actors-1] DEBUG io.zeebe.broker.clustering - Received REACHABILITY_CHANGED from member 1, was not handled.

2020-12-09 17:32:19.512 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.FollowerRole - RaftServer{raft-partition-partition-1}{role=FOLLOWER} - Poll request to 1 failed: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: broker-1/192.168.192.2:26502

2020-12-09 17:32:19.542 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{raft-partition-partition-1}{role=CANDIDATE} - io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: broker-1/192.168.192.2:26502

2020-12-09 17:32:19.542 [Broker-2-ZeebePartition-1] [Broker-2-zb-actors-0] DEBUG io.zeebe.broker.system - Partition role transitioning from FOLLOWER to CANDIDATE

2020-12-09 17:32:19.576 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - ConfigureRequest{term=2, leader=2, index=0, timestamp=1607535098419, members=[DefaultRaftMember{id=0, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=1, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=2, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}]} to 1 failed: java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: broker-1/192.168.192.2:26502

2020-12-09 17:32:19.581 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - ConfigureRequest{term=2, leader=2, index=0, timestamp=1607535098419, members=[DefaultRaftMember{id=0, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=1, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=2, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}]} to 1 failed: java.util.concurrent.CompletionException: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: broker-1/192.168.192.2:26502

2020-12-09 17:32:19.604 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=2, leader=2, prevLogIndex=17, prevLogTerm=1, entries=1, checksums=1, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:19.831 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=2, leader=2, prevLogIndex=18, prevLogTerm=2, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:20.080 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=2, leader=2, prevLogIndex=18, prevLogTerm=2, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:20.829 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - ConfigureRequest{term=2, leader=2, index=0, timestamp=1607535098419, members=[DefaultRaftMember{id=0, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=1, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=2, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}]} to 1 failed: java.util.concurrent.CompletionException: io.netty.channel.ConnectTimeoutException: connection timed out: broker-1/192.168.192.2:26502

2020-12-09 17:32:24.582 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=2, leader=2, prevLogIndex=18, prevLogTerm=2, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:24.830 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=2, leader=2, prevLogIndex=18, prevLogTerm=2, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:25.094 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=2, leader=2, prevLogIndex=18, prevLogTerm=2, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:25.095 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - Suspected network partition after 3 failures from 0 over a period of time 5515 > 5000, stepping down

2020-12-09 17:32:25.106 [Broker-2-ZeebePartition-1] [Broker-2-zb-actors-0] DEBUG io.zeebe.broker.system - Partition role transitioning from CANDIDATE to FOLLOWER

2020-12-09 17:32:25.878 [Broker-2-TopologyManager] [Broker-2-zb-actors-1] DEBUG io.zeebe.broker.clustering - Received member removed BrokerInfo{nodeId=1, partitionsCount=1, clusterSize=3, replicationFactor=3, partitionRoles={1=LEADER}, partitionLeaderTerms={1=1}, partitionHealthStatuses={1=HEALTHY}, version=0.25.0} 

2020-12-09 17:32:25.878 [GatewayTopologyManager] [Broker-2-zb-actors-0] DEBUG io.zeebe.gateway - Received broker was removed BrokerInfo{nodeId=1, partitionsCount=1, clusterSize=3, replicationFactor=3, partitionRoles={1=LEADER}, partitionLeaderTerms={1=1}, partitionHealthStatuses={1=HEALTHY}, version=0.25.0}.

2020-12-09 17:32:29.440 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.CandidateRole - RaftServer{raft-partition-partition-1}{role=CANDIDATE} - java.net.ConnectException: Expected to send a message with subject 'raft-partition-partition-1-vote' to member '1', but member is not known. Known members are '[Member{id=2, address=broker-2:26502, properties={brokerInfo=EADJAAAAAwACAAAAAQAAAAMAAAADAAAAAAABCgAAAGNvbW1hbmRBcGkOAAAAYnJva2VyLTI6MjY1MDEFAAEBAAAAAQwAAA8AAAAwLjI2LjAtU05BUFNIT1QFAAEBAAAAAQ==, event-service-topics-subscribed=Af8fAQEDAWpvYnNBdmFpbGFibOU=}}, Member{id=0, address=broker-0:26502, properties={brokerInfo=EADJAAAAAwAAAAAAAQAAAAMAAAADAAAAAAABCgAAAGNvbW1hbmRBcGkOAAAAYnJva2VyLTA6MjY1MDEFAAEBAAAAAQwAAAYAAAAwLjI1LjAFAAA=, event-service-topics-subscribed=Af8fAQEDAWpvYnNBdmFpbGFibOU=}}]'.

2020-12-09 17:32:29.441 [Broker-2-ZeebePartition-1] [Broker-2-zb-actors-0] DEBUG io.zeebe.broker.system - Partition role transitioning from FOLLOWER to CANDIDATE

2020-12-09 17:32:29.460 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - ConfigureRequest{term=3, leader=2, index=0, timestamp=1607535098419, members=[DefaultRaftMember{id=0, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=1, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=2, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}]} to 1 failed: java.util.concurrent.CompletionException: java.net.ConnectException: Expected to send a message with subject 'raft-partition-partition-1-configure' to member '1', but member is not known. Known members are '[Member{id=2, address=broker-2:26502, properties={brokerInfo=EADJAAAAAwACAAAAAQAAAAMAAAADAAAAAAABCgAAAGNvbW1hbmRBcGkOAAAAYnJva2VyLTI6MjY1MDEFAAEBAAAAAQwAAA8AAAAwLjI2LjAtU05BUFNIT1QFAAEBAAAAAQ==, event-service-topics-subscribed=Af8fAQEDAWpvYnNBdmFpbGFibOU=}}, Member{id=0, address=broker-0:26502, properties={brokerInfo=EADJAAAAAwAAAAAAAQAAAAMAAAADAAAAAAABCgAAAGNvbW1hbmRBcGkOAAAAYnJva2VyLTA6MjY1MDEFAAEBAAAAAQwAAAYAAAAwLjI1LjAFAAA=, event-service-topics-subscribed=Af8fAQEDAWpvYnNBdmFpbGFibOU=}}]'.

2020-12-09 17:32:29.462 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - ConfigureRequest{term=3, leader=2, index=0, timestamp=1607535098419, members=[DefaultRaftMember{id=0, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=1, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=2, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}]} to 1 failed: java.util.concurrent.CompletionException: java.net.ConnectException: Expected to send a message with subject 'raft-partition-partition-1-configure' to member '1', but member is not known. Known members are '[Member{id=2, address=broker-2:26502, properties={brokerInfo=EADJAAAAAwACAAAAAQAAAAMAAAADAAAAAAABCgAAAGNvbW1hbmRBcGkOAAAAYnJva2VyLTI6MjY1MDEFAAEBAAAAAQwAAA8AAAAwLjI2LjAtU05BUFNIT1QFAAEBAAAAAQ==, event-service-topics-subscribed=Af8fAQEDAWpvYnNBdmFpbGFibOU=}}, Member{id=0, address=broker-0:26502, properties={brokerInfo=EADJAAAAAwAAAAAAAQAAAAMAAAADAAAAAAABCgAAAGNvbW1hbmRBcGkOAAAAYnJva2VyLTA6MjY1MDEFAAEBAAAAAQwAAAYAAAAwLjI1LjAFAAA=, event-service-topics-subscribed=Af8fAQEDAWpvYnNBdmFpbGFibOU=}}]'.

2020-12-09 17:32:29.464 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=3, leader=2, prevLogIndex=18, prevLogTerm=2, entries=1, checksums=1, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:29.715 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - ConfigureRequest{term=3, leader=2, index=0, timestamp=1607535098419, members=[DefaultRaftMember{id=0, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=1, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}, DefaultRaftMember{id=2, type=ACTIVE, updated=2020-12-09T17:31:38.419097Z}]} to 1 failed: java.util.concurrent.CompletionException: java.net.ConnectException: Expected to send a message with subject 'raft-partition-partition-1-configure' to member '1', but member is not known. Known members are '[Member{id=2, address=broker-2:26502, properties={brokerInfo=EADJAAAAAwACAAAAAQAAAAMAAAADAAAAAAABCgAAAGNvbW1hbmRBcGkOAAAAYnJva2VyLTI6MjY1MDEFAAEBAAAAAQwAAA8AAAAwLjI2LjAtU05BUFNIT1QFAAEBAAAAAQ==, event-service-topics-subscribed=Af8fAQEDAWpvYnNBdmFpbGFibOU=}}, Member{id=0, address=broker-0:26502, properties={brokerInfo=EADJAAAAAwAAAAAAAQAAAAMAAAADAAAAAAABCgAAAGNvbW1hbmRBcGkOAAAAYnJva2VyLTA6MjY1MDEFAAEBAAAAAQwAAAYAAAAwLjI1LjAFAAA=, event-service-topics-subscribed=Af8fAQEDAWpvYnNBdmFpbGFibOU=}}]'.

2020-12-09 17:32:29.723 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=3, leader=2, prevLogIndex=19, prevLogTerm=3, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:29.964 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=3, leader=2, prevLogIndex=19, prevLogTerm=3, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:32.970 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=3, leader=2, prevLogIndex=19, prevLogTerm=3, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:33.216 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=3, leader=2, prevLogIndex=19, prevLogTerm=3, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:33.481 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=3, leader=2, prevLogIndex=19, prevLogTerm=3, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:36.976 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=3, leader=2, prevLogIndex=19, prevLogTerm=3, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:37.219 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - AppendRequest{term=3, leader=2, prevLogIndex=19, prevLogTerm=3, entries=0, checksums=0, commitIndex=17} to 0 failed: java.util.concurrent.CompletionException: java.net.ConnectException

2020-12-09 17:32:37.466 [] [raft-server-2-raft-partition-partition-1] WARN  io.atomix.raft.roles.LeaderAppender - RaftServer{raft-partition-partition-1} - Suspected network partition after 6 failures from 1 over a period of time 8005 > 5000, stepping down

2020-12-09 17:32:37.471 [Broker-2-ZeebePartition-1] [Broker-2-zb-actors-0] DEBUG io.zeebe.broker.system - Partition role transitioning from CANDIDATE to FOLLOWER

This was an unintended side effect here, and it looks like adding the condition caught an unexpected break in our rolling update. I would like to keep this condition, but the fix for it might go into another PR - so we'd need to extract the assert logic and the fix for this test into a different PR before merging this.

@MiguelPires could this be related to the checksum stuff? I can't think of anything else we did, but of course it's possible we broke something else.

@npepinpe (Member, Author) commented

I think I understand the issue - VersionFieldSerializer allows newer versions to read previously written data (i.e. they can receive messages from the older nodes), but older versions cannot read new fields. So the older nodes cannot read data from the newer nodes, and they don't ignore the unknown fields either (why not? good question, it seems like an easy thing to do - just skip a field if its version is higher than what you know).

Can this cause issues during updates? When we update one node, it can receive messages from the other two, and it will probably not be leader. When we update the second node, the first updated node could become leader (which we see here), which will cause issues with the older node. The two updated nodes should be able to work together, but our fault tolerance guarantees are lowered, I guess, since the older node is now "useless" until it's updated.

I don't see an easy solution here - the only thing I can think of is postponing adding checksums to 0.27, as we will most likely be breaking backwards compatibility with the new workflow engine anyway. At that point we can change how we do serialization and ignore the issue. Let me know what you think.

- fixes an issue in the gateway topology when the old leader becomes a follower and no new leader is elected yet, by removing the node as leader when the incoming follower update still identifies it as the leader
- TopologyAssert#isComplete now also checks that all partitions have a leader (a sketch follows below)
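
A rough sketch of that completeness check (hypothetical names, not the actual TopologyAssert, which works on the gateway's BrokerInfo/Topology types):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper showing the "every partition has a leader" idea.
final class LeaderCompletenessSketch {

  /** Returns the ids of partitions that have no reported leader. */
  static Set<Integer> partitionsWithoutLeader(
      final Set<Integer> expectedPartitions,
      final Map<Integer, List<Integer>> reportedLeadersPerPartition) {
    return expectedPartitions.stream()
        .filter(p -> reportedLeadersPerPartition.getOrDefault(p, List.of()).isEmpty())
        .collect(Collectors.toSet());
  }
}
```

The assertion would then fail whenever the returned set is non-empty.
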
@npepinpe (Member, Author) commented

bors r+

zeebe-bors bot commented Dec 14, 2020

Build succeeded:

@zeebe-bors zeebe-bors bot merged commit c4ab987 into develop Dec 14, 2020
@zeebe-bors zeebe-bors bot deleted the 2501-gateway-topology-fix branch December 14, 2020 13:50
github-actions bot commented

The process '/home/runner/work/_actions/zeebe-io/backport-action/master/backport.sh' failed with exit code 4

1 similar comment

zeebe-bors bot added a commit that referenced this pull request Dec 15, 2020
6011: [Backport stable/0.25] Fixes outdated topology when no new leader is assigned r=npepinpe a=npepinpe

# Description
Backport of #5979 to `stable/0.25`. There were some minor conflicts, where I had to bump the AssertJ version, as `failure` did not exist in 3.17.

Co-authored-by: Nicolas Pépin-Perreault <nicolas.pepin-perreault@camunda.com>
github-merge-queue bot pushed a commit that referenced this pull request Mar 14, 2024
…stcontainers dependency versions (#5979)

* chore(backend): update elasticsearch, awssdk, aws-java, opensearch-testcontainers dependency versions

* chore(backend): update spring-boot version to 3.1.6
Successfully merging this pull request may close these issues:

Standalone gateway returns out-of-date topology when brokers go away