Hazelcast Client fails to connect the cluster if the first cluster member is down #11735

Closed
vkakadia opened this issue Nov 6, 2017 · 3 comments

@vkakadia vkakadia commented Nov 6, 2017

Hi Team,

I am testing a failover scenario with the Hazelcast 3.9 client for our new two-node cluster application, and this is what I came across. When Hazelcast nodes A and B are both running, the Hazelcast client connects to both in smart client mode. However, if node A is down and node B is available while the client is starting up, the client keeps trying to connect to node A and never tries node B.

Here is my client configuration.

<bean id="nodes" class="java.util.Arrays" factory-method="asList">
	<constructor-arg>
		<bean class="org.springframework.util.StringUtils" factory-method="commaDelimitedListToStringArray">
			<constructor-arg type="java.lang.String" value="10.87.143.132, 10.87.143.133" />
		</bean>
	</constructor-arg>	
</bean>

<bean id="tcpIpConfig" class="com.hazelcast.config.TcpIpConfig">
	<property name="enabled" value="true" />
	<property name="members" ref="nodes" />
</bean>

<bean id="msClientConfig" class="com.hazelcast.client.config.ClientConfig">
	<property name="groupConfig" ref="msGroupConfig" />
	<property name="networkConfig" ref="clientNetworkConfig" />
	<property name="properties" ref="clientConfigProperties" />
	<property name="connectionStrategyConfig" ref="msClientConnectionStrategy" />		
</bean>

<bean id="msClientConnectionStrategy" class="com.hazelcast.client.config.ClientConnectionStrategyConfig">
	<property name="asyncStart" value="true"></property>
	<property name="reconnectMode" value="ASYNC"></property>
</bean>
	    	
<bean id="clientNetworkConfig" class="com.hazelcast.client.config.ClientNetworkConfig">
	<property name="addresses" ref="nodes" />
	<property name="connectionAttemptLimit" value="1000" />
	<property name="connectionAttemptPeriod" value="10000" />
	<property name="smartRouting" value="true" />
	<property name="redoOperation" value="true" />
</bean>

Client log during startup:

2017-11-06 11:54:40 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5701 as owner member
2017-11-06 11:54:45 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5701, exception com.hazelcast.core.HazelcastException: java.util.concurrent.TimeoutException: Authentication response did not come back in 5000 millis
2017-11-06 11:54:45 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5703 as owner member
2017-11-06 11:54:46 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5703, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:54:46 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5702 as owner member
2017-11-06 11:54:47 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5702, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:54:47 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Unable to get alive cluster connection, try in 735 ms later, attempt 1 of 1000.
2017-11-06 11:54:48 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5703 as owner member
2017-11-06 11:54:49 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5703, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:54:49 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5701 as owner member
2017-11-06 11:54:54 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5701, exception com.hazelcast.core.HazelcastException: java.util.concurrent.TimeoutException: Authentication response did not come back in 5000 millis
2017-11-06 11:54:54 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5702 as owner member
2017-11-06 11:54:55 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5702, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:54:55 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Unable to get alive cluster connection, try in 2991 ms later, attempt 2 of 1000.
2017-11-06 11:55:00 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5702 as owner member
2017-11-06 11:55:01 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5702, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:55:01 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5701 as owner member
2017-11-06 11:55:06 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5701, exception com.hazelcast.core.HazelcastException: java.util.concurrent.TimeoutException: Authentication response did not come back in 5000 millis
2017-11-06 11:55:06 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5703 as owner member
2017-11-06 11:55:08 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5703, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:55:08 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Unable to get alive cluster connection, try in 721 ms later, attempt 3 of 1000.
2017-11-06 11:55:08 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5703 as owner member
2017-11-06 11:55:09 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5703, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:55:09 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5701 as owner member
2017-11-06 11:55:14 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5701, exception com.hazelcast.core.HazelcastException: java.util.concurrent.TimeoutException: Authentication response did not come back in 5000 millis
2017-11-06 11:55:14 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5702 as owner member
2017-11-06 11:55:15 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5702, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:55:15 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Unable to get alive cluster connection, try in 2969 ms later, attempt 4 of 1000.
2017-11-06 11:55:20 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5703 as owner member
2017-11-06 11:55:21 WARN  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Exception during initial connection to [10.87.143.132]:5703, exception com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
2017-11-06 11:55:21 INFO  hz.client_0.cluster- ClientConnectionManager:51 - hz.client_0 [ms] [3.9] Trying to connect to [10.87.143.132]:5701 as owner member
@sancar sancar modified the milestones: 3.10.1, 3.9.1 Nov 7, 2017
@sancar sancar self-assigned this Nov 7, 2017

@sancar sancar (Member) commented Nov 7, 2017

Hi @vkakadia,
I have tried your configuration.
As far as I understand, the problem arises because there is a space character in the address list in your config.
Can you try with
value="10.87.143.132,10.87.143.133" instead of
value="10.87.143.132, 10.87.143.133"?

The related error is logged at finest level as an UnknownHostException, which is probably why you missed it.
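
To illustrate the diagnosis above: a plain comma split keeps the space after the comma as part of the second entry, so the second address becomes " 10.87.143.133" with a leading space, which is not a resolvable host name. The sketch below uses only `String.split` from the JDK. The class and method names (`AddressListSketch`, `splitRaw`, `splitTrimmed`) are hypothetical and belong to neither Hazelcast nor Spring. It also assumes that Spring's `StringUtils.commaDelimitedListToStringArray` does not trim its tokens either, which is worth checking against the Spring version in use.

```java
import java.util.ArrayList;
import java.util.List;

public class AddressListSketch {

    // Splits on commas only; whitespace around each entry is kept as-is.
    static List<String> splitRaw(String csv) {
        List<String> out = new ArrayList<>();
        for (String token : csv.split(",")) {
            out.add(token);
        }
        return out;
    }

    // Splits on commas, trims each entry, and drops entries that end up empty.
    static List<String> splitTrimmed(String csv) {
        List<String> out = new ArrayList<>();
        for (String token : csv.split(",")) {
            String trimmed = token.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String configured = "10.87.143.132, 10.87.143.133";
        // Brackets make the leading space of the second raw entry visible.
        for (String address : splitRaw(configured)) {
            System.out.println("raw:     [" + address + "]");
        }
        for (String address : splitTrimmed(configured)) {
            System.out.println("trimmed: [" + address + "]");
        }
    }
}
```

Removing the space in the Spring XML, as suggested above, is the simplest fix. Trimming is only useful when the address list comes from a source you do not control, such as an external properties file.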

@sancar sancar (Member) commented Nov 7, 2017

In the meantime, I will consider changing this log from finest to warning.
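
For context on why the level matters, here is a minimal sketch using `java.util.logging` from the JDK. It assumes the logging backend applies a level threshold the way `java.util.logging` does; the backend actually used by the client in this report may be configured differently. The class name `LogLevelSketch`, the method `visible`, and the logger name are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogLevelSketch {

    // Returns true if a message at messageLevel passes a logger set to threshold.
    static boolean visible(Level threshold, Level messageLevel) {
        Logger logger = Logger.getLogger("log.level.sketch"); // hypothetical logger name
        logger.setLevel(threshold);
        return logger.isLoggable(messageLevel);
    }

    public static void main(String[] args) {
        // With an INFO threshold, FINEST messages are dropped and WARNING messages pass.
        System.out.println("FINEST at INFO threshold:   " + visible(Level.INFO, Level.FINEST));
        System.out.println("WARNING at INFO threshold:  " + visible(Level.INFO, Level.WARNING));
        // Lowering the threshold to FINEST makes the message visible.
        System.out.println("FINEST at FINEST threshold: " + visible(Level.FINEST, Level.FINEST));
    }
}
```

Under this assumption, a message logged at finest only shows up once the threshold is lowered to finest, whereas a warning is shown at the usual INFO threshold.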

@vkakadia vkakadia (Author) commented Nov 7, 2017
