TcpIpJoiner throws ConcurrentModificationException: null #10207

Closed
ghost opened this issue Apr 3, 2017 · 4 comments

commented Apr 3, 2017

Hello,

Has anyone seen this behavior before?

I'm using Hazelcast (version 3.8 of the hazelcast-client Maven library - 17Feb2017) in a Spring Boot application on Java 8. I have set up a Hazelcast cluster with 2 nodes, each of which has 1 client. The bootstrap was OK: the nodes discovered each other and the distributed map I have in place was working properly.

Then, suddenly, the following exception appeared in the server logs:

2017-04-03T09:26:04.522700902Z java.util.ConcurrentModificationException: null
2017-04-03T09:26:04.522705868Z 	at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
2017-04-03T09:26:04.522709925Z 	at java.util.ArrayList$Itr.next(ArrayList.java:851)
2017-04-03T09:26:04.522713645Z 	at com.hazelcast.cluster.impl.TcpIpJoiner.getConfigurationMembers(TcpIpJoiner.java:495)
2017-04-03T09:26:04.522717082Z 	at com.hazelcast.cluster.impl.TcpIpJoiner.getMembers(TcpIpJoiner.java:488)
2017-04-03T09:26:04.522728257Z 	at com.hazelcast.cluster.impl.TcpIpJoiner.getPossibleAddresses(TcpIpJoiner.java:404)
2017-04-03T09:26:04.522731619Z 	at com.hazelcast.cluster.impl.TcpIpJoiner.searchForOtherClusters(TcpIpJoiner.java:507)
2017-04-03T09:26:04.522741672Z 	at com.hazelcast.internal.cluster.impl.SplitBrainHandler.searchForOtherClusters(SplitBrainHandler.java:75)
2017-04-03T09:26:04.522763059Z 	at com.hazelcast.internal.cluster.impl.SplitBrainHandler.run(SplitBrainHandler.java:42)
2017-04-03T09:26:04.522767644Z 	at com.hazelcast.spi.impl.executionservice.impl.SkipOnConcurrentExecutionDecorator.run(SkipOnConcurrentExecutionDecorator.java:40)
2017-04-03T09:26:04.522771064Z 	at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:212)
2017-04-03T09:26:04.522773982Z 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
2017-04-03T09:26:04.522776790Z 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
2017-04-03T09:26:04.522779527Z 	at java.lang.Thread.run(Thread.java:745)
2017-04-03T09:26:04.522782361Z 	at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76)
2017-04-03T09:26:04.522785379Z 	at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:92)

I have no idea what could have caused this. Could you please help me?

Thank you in advance. Best regards.

@jerrinot

Contributor

commented Apr 3, 2017

hi @jdiasamaro,

Can you share your bootstrapping code? It looks like you may be reusing a Config object to start multiple Hazelcast instances. Also, the log seems to come from Hazelcast 3.7.5 rather than 3.8, but that's probably unrelated to the issue.

@jerrinot jerrinot added this to the 3.9 milestone Apr 3, 2017

@ghost

Author

commented Apr 3, 2017

Hi @jerrinot, thank you for your fast reply.

To give you more details about my setup:

  • I have a Kubernetes cluster with 2 pods. In each pod, I have 2 replicas of the same Spring Boot application. Each of them creates a Hazelcast instance.
  • To discover the other Hazelcast instances, I'm using a Consul service.
  • When the application starts, it runs the following code to register the current instance with Consul (properties are fetched from a configuration file):
String localNetworkAddress = InetAddress.getByName(InetAddress.getLocalHost().getCanonicalHostName()).getHostAddress();

NewService newService = new NewService();
newService.setId(hazelcastConsultServiceName + "_" + localNetworkAddress);
newService.setName(hazelcastConsultServiceName);
newService.setAddress(localNetworkAddress);
newService.setPort(hazelcastPort);

consul.agentServiceRegister(newService);
  • After registration, each application periodically queries Consul (from a Spring scheduled task) to check whether other Hazelcast instances exist.
  • The code that each application replica (pod) runs is as follows:
@Bean
public HazelcastInstance hazelcastInstance(Config config, ConsulClient consul) throws UnknownHostException {
	HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
	registerHazelcastService(consul);
	return instance;
}

@Bean
public Config config() {
	List<String> currentMembers = discoverHazelcastServices();
	return new Config()
		.setNetworkConfig(new NetworkConfig()
			.setPort(hazelcastPort)
			.setJoin(new JoinConfig()
				.setMulticastConfig(new MulticastConfig().setEnabled(false))
				.setAwsConfig(new AwsConfig().setEnabled(false))
				.setTcpIpConfig(new TcpIpConfig().setEnabled(true)
					.setMembers(currentMembers))))
		.setGroupConfig(new GroupConfig()
			.setName(hazelcastGroupName != null ? hazelcastGroupName : context.getId())
			.setPassword(hazelcastGroupPassword));
}

@Scheduled(fixedDelayString = "${hazelcast.check-for-new-members-period-millis:60000}")
public void checkForNewHazelcastMembers() {
	HazelcastInstance hazelcast = context.getBean(HazelcastInstance.class);
	discoverHazelcastServices().forEach(member -> hazelcast.getConfig().getNetworkConfig().getJoin().getTcpIpConfig().addMember(member));
}

private List<String> discoverHazelcastServices() {
	return consul.getHealthServices(hazelcastConsultServiceName, true, QueryParams.DEFAULT)
			.getValue().stream()
			.map(service -> service.getService().getAddress() + ":" + service.getService().getPort())
			.collect(Collectors.toList());
}

Actually, the Hazelcast Config is created as a singleton bean, since no @Scope is specified, so the object is reused. However, the Hazelcast instance itself is also a singleton, which means it should be created only once (per application).

Moreover, each time the scheduler runs, it fetches the Config object from the Hazelcast instance and adds the discovered members to the TcpIp member list.

Best regards.

@jerrinot

Contributor

commented Apr 3, 2017

This is the problem:

@Scheduled(fixedDelayString = "${hazelcast.check-for-new-members-period-millis:60000}")
public void checkForNewHazelcastMembers() {
	HazelcastInstance hazelcast = context.getBean(HazelcastInstance.class);
	discoverHazelcastServices().forEach(member -> hazelcast.getConfig().getNetworkConfig().getJoin().getTcpIpConfig().addMember(member));
}

It's mutating the Hazelcast configuration AFTER a Hazelcast instance was created, which violates the contract of the Config object. I'm afraid the TcpIpJoiner is not suitable for your use case.
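
To illustrate the failure mode: the stack trace shows TcpIpJoiner.getConfigurationMembers iterating an ArrayList, and ArrayList iterators are fail-fast, so adding a member while another thread is mid-iteration can trip the check. The following is only a minimal sketch that reproduces the same check deterministically in a single thread, using plain JDK classes. The class name, method name, and addresses are made up for illustration and are not part of Hazelcast.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; stands in for the configured TCP/IP member list.
class MemberListMutationDemo {

    // Returns true if the iterator fails after the list is modified mid-iteration.
    static boolean iterationFailsAfterMutation() {
        List<String> members = new ArrayList<>();
        members.add("10.0.0.1:5701"); // made-up addresses
        members.add("10.0.0.2:5701");

        // Plays the role of the joiner walking the configured members.
        Iterator<String> it = members.iterator();
        it.next();

        // Plays the role of the scheduled task calling addMember(...).
        members.add("10.0.0.3:5701");

        try {
            it.next(); // the fail-fast check runs here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + iterationFailsAfterMutation());
    }
}
```

In the real application the two roles run on different threads (the Spring scheduler and the split-brain handler), so the exception shows up only when their timing happens to overlap, which would explain why it appeared "suddenly".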

Hazelcast has a Discovery SPI intended for use in cloud environments. There are various implementations of this SPI, perhaps you want to have a look at this project

@jerrinot

Contributor

commented Apr 3, 2017

I am closing this for now, as everything is working as designed. Feel free to reopen it if you believe it's a bug in Hazelcast.

@jerrinot jerrinot closed this Apr 3, 2017
