Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client (2.x) unable to reconnect after killing one cluster member #653

Closed
notz opened this issue Aug 9, 2013 · 4 comments

Comments

Projects
None yet
3 participants
@notz
Copy link
Contributor

commented Aug 9, 2013

Client is initialized with 2 cluster members. After killing the cluster where it is connected, the client is unable to reconnect.

Aug  9 11:39:45 localhost debug: [INFO] com.hazelcast.cluster.ClusterManager Slf4jFactory.java:81 - [192.168.11.159]:5742 [event-test] #012#012Members [2] {#012#011Member [192.168.11.159]:5742 this#012#011Member [192.168.11.159]:5741#012}#012 
Aug  9 11:39:45 localhost debug: [INFO] com.hazelcast.cluster.ClusterManager Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] #012#012Members [2] {#012#011Member [192.168.11.159]:5742#012#011Member [192.168.11.159]:5741 this#012}#012 
Aug  9 11:39:46 localhost debug: [INFO] com.hazelcast.impl.PartitionManager Slf4jFactory.java:81 - [192.168.11.159]:5742 [event-test] Re-partitioning cluster data... Immediate-Tasks: 271, Scheduled-Tasks: 0 
Aug  9 11:39:47 localhost debug: [INFO] com.hazelcast.jmx.ManagementService Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] Hazelcast JMX agent enabled 
Aug  9 11:39:47 localhost debug: [INFO] com.hazelcast.impl.LifecycleServiceImpl Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] Address[192.168.11.159]:5741 is STARTED 
Aug  9 11:40:04 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Connection to Connection [1] [192.168.11.159:5742 -> 192.168.11.159:5742] is lost 
Aug  9 11:40:04 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Connection to Connection [1] [192.168.11.159:5742 -> 192.168.11.159:5742] is lost 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is CLIENT_CONNECTION_LOST 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.Connection Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Connection [Address[192.168.11.159]:5702] lost. Reason: java.io.EOFException[null] 
Aug  9 11:40:04 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Connection to Connection [0] [192.168.11.159:5702 -> 192.168.11.159:5702] is lost 
Aug  9 11:40:04 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Connection to Connection [0] [192.168.11.159:5702 -> 192.168.11.159:5702] is lost 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is CLIENT_CONNECTION_LOST 
Aug  9 11:40:04 localhost debug: [WARN] com.hazelcast.nio.ReadHandler Slf4jFactory.java:83 - [192.168.11.159]:5701 [test] hz._hzInstance_1_test.IO.thread-1 Closing socket to endpoint Address[192.168.11.159]:5702, Cause:java.io.EOFException 
Aug  9 11:40:04 localhost error: [WARN] com.hazelcast.nio.ReadHandler Slf4jFactory.java:83 - [192.168.11.159]:5701 [test] hz._hzInstance_1_test.IO.thread-1 Closing socket to endpoint Address[192.168.11.159]:5702, Cause:java.io.EOFException 
Aug  9 11:40:04 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - API - Connection to Connection [2] [192.168.11.159:5742 -> 192.168.11.159:5742] is lost 
Aug  9 11:40:04 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - API - Connection to Connection [2] [192.168.11.159:5742 -> 192.168.11.159:5742] is lost 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - API - HazelcastClient is CLIENT_CONNECTION_LOST 
Aug  9 11:40:04 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - API - Connection to Connection [1] [192.168.11.159:5702 -> 192.168.11.159:5702] is lost 
Aug  9 11:40:04 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - API - Connection to Connection [1] [192.168.11.159:5702 -> 192.168.11.159:5702] is lost 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketAcceptor Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] 5741 is accepting socket connection from /192.168.11.34:32929 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketAcceptor Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] 5741 is accepting socket connection from /192.168.11.34:32930 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.ConnectionManager Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] 5741 accepted socket connection from /192.168.11.34:32929 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.Connection Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] Connection [Address[192.168.11.159]:5742] lost. Reason: java.io.EOFException[null] 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.ConnectionManager Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] 5741 accepted socket connection from /192.168.11.34:32930 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - API - HazelcastClient is CLIENT_CONNECTION_LOST 
Aug  9 11:40:04 localhost debug: [WARN] com.hazelcast.nio.ReadHandler Slf4jFactory.java:83 - [192.168.11.159]:5741 [event-test] hz._hzInstance_2_event-test.IO.thread-1 Closing socket to endpoint Address[192.168.11.159]:5742, Cause:java.io.EOFException 
Aug  9 11:40:04 localhost error: [WARN] com.hazelcast.nio.ReadHandler Slf4jFactory.java:83 - [192.168.11.159]:5741 [event-test] hz._hzInstance_2_event-test.IO.thread-1 Closing socket to endpoint Address[192.168.11.159]:5742, Cause:java.io.EOFException 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketAcceptor Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] 5701 is accepting socket connection from /192.168.11.34:55080 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.ConnectionManager Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] 5701 accepted socket connection from /192.168.11.34:55080 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketAcceptor Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] 5701 is accepting socket connection from /192.168.11.34:55081 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.ConnectionManager Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] 5701 accepted socket connection from /192.168.11.34:55081 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.impl.ClientHandlerService Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] received auth from Connection [/192.168.11.34:32930 -> null] live=true, client=true, type=CLIENT, successfully authenticated 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.impl.ClientHandlerService Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] received auth from Connection [/192.168.11.34:55080 -> null] live=true, client=true, type=CLIENT, successfully authenticated 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.impl.ClientHandlerService Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] received auth from Connection [/192.168.11.34:32929 -> null] live=true, client=true, type=CLIENT, successfully authenticated 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.impl.ClientHandlerService Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] received auth from Connection [/192.168.11.34:55081 -> null] live=true, client=true, type=CLIENT, successfully authenticated 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.ConnectionManager Slf4jFactory.java:47 - SSE - Client is connecting to Connection [3] [192.168.11.159:5741 -> 192.168.11.159:5741] 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.ConnectionManager Slf4jFactory.java:47 - API - Client is connecting to Connection [3] [192.168.11.159:5741 -> 192.168.11.159:5741] 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - API - HazelcastClient is CLIENT_CONNECTION_OPENING 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.ConnectionManager Slf4jFactory.java:47 - API - Client is connecting to Connection [2] [192.168.11.159:5701 -> 192.168.11.159:5701] 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.ConnectionManager Slf4jFactory.java:47 - SSE - Client is connecting to Connection [1] [192.168.11.159:5701 -> 192.168.11.159:5701] 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is CLIENT_CONNECTION_OPENING 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - API - HazelcastClient is CLIENT_CONNECTION_OPENING 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is CLIENT_CONNECTION_OPENING 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - API - HazelcastClient is CLIENT_CONNECTION_OPENED 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketConnector Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Could not connect to: /192.168.11.159:5702. Reason: ConnectException[Connection refused] 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketConnector Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Could not connect to: /192.168.11.159:5702. Reason: ConnectException[Connection refused] 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketConnector Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] Could not connect to: /192.168.11.159:5742. Reason: ConnectException[Connection refused] 
Aug  9 11:40:04  debug: last message repeated 2 times
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketConnector Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Could not connect to: /192.168.11.159:5702. Reason: ConnectException[Connection refused] 
Aug  9 11:40:04 localhost debug: [WARN] com.hazelcast.nio.ConnectionMonitor Slf4jFactory.java:83 - [192.168.11.159]:5741 [event-test] Removing connection to endpoint Address[192.168.11.159]:5742 Cause => java.net.ConnectException {Connection refused}, Error-Count: 5 
Aug  9 11:40:04 localhost error: [WARN] com.hazelcast.nio.ConnectionMonitor Slf4jFactory.java:83 - [192.168.11.159]:5741 [event-test] Removing connection to endpoint Address[192.168.11.159]:5742 Cause => java.net.ConnectException {Connection refused}, Error-Count: 5 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.cluster.ClusterManager Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] Removing Address Address[192.168.11.159]:5742 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.impl.PartitionManager Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] Starting to send partition replica diffs... 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.cluster.ClusterManager Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] #012#012Members [1] {#012#011Member [192.168.11.159]:5741 this#012}#012 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketConnector Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Could not connect to: /192.168.11.159:5702. Reason: ConnectException[Connection refused] 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.nio.SocketConnector Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Could not connect to: /192.168.11.159:5702. Reason: ConnectException[Connection refused] 
Aug  9 11:40:04 localhost debug: [WARN] com.hazelcast.nio.ConnectionMonitor Slf4jFactory.java:83 - [192.168.11.159]:5701 [test] Removing connection to endpoint Address[192.168.11.159]:5702 Cause => java.net.ConnectException {Connection refused}, Error-Count: 5 
Aug  9 11:40:04 localhost error: [WARN] com.hazelcast.nio.ConnectionMonitor Slf4jFactory.java:83 - [192.168.11.159]:5701 [test] Removing connection to endpoint Address[192.168.11.159]:5702 Cause => java.net.ConnectException {Connection refused}, Error-Count: 5 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.cluster.ClusterManager Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Removing Address Address[192.168.11.159]:5702 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.impl.PartitionManager Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Starting to send partition replica diffs... 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.cluster.ClusterManager Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] #012#012Members [1] {#012#011Member [192.168.11.159]:5701 this#012}#012 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - API - HazelcastClient is CLIENT_CONNECTION_OPENED 
Aug  9 11:40:04 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is CLIENT_CONNECTION_OPENED 
Aug  9 11:40:07 localhost debug: [INFO] com.hazelcast.impl.PartitionManager Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] Total 0 partition replica diffs have been processed. 
Aug  9 11:40:07 localhost debug: [INFO] com.hazelcast.impl.PartitionManager Slf4jFactory.java:81 - [192.168.11.159]:5741 [event-test] Re-partitioning cluster data... Immediate-Tasks: 0, Scheduled-Tasks: 0 
Aug  9 11:40:07 localhost debug: [INFO] com.hazelcast.impl.PartitionManager Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Total 0 partition replica diffs have been processed. 
Aug  9 11:40:07 localhost debug: [INFO] com.hazelcast.impl.PartitionManager Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Re-partitioning cluster data... Immediate-Tasks: 0, Scheduled-Tasks: 0 
Aug  9 11:40:14 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's ping call within 10 seconds! 
Aug  9 11:40:14 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's ping call within 10 seconds! 
Aug  9 11:40:14 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [124] operation=GET_MEMBERS in 5 seconds. 
Aug  9 11:40:19 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [124] operation=GET_MEMBERS in 10 seconds. 
Aug  9 11:40:24 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's ping call within 10 seconds! 
Aug  9 11:40:24 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's ping call within 10 seconds! 
Aug  9 11:40:24 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [126] operation=GET_MEMBERS in 5 seconds. 
Aug  9 11:40:24 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [124] operation=GET_MEMBERS in 15 seconds. 
Aug  9 11:40:29 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [126] operation=GET_MEMBERS in 10 seconds. 
Aug  9 11:40:29 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [124] operation=GET_MEMBERS in 20 seconds. 
Aug  9 11:40:34 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's ping call within 10 seconds! 
Aug  9 11:40:34 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's ping call within 10 seconds! 
Aug  9 11:40:34 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [128] operation=GET_MEMBERS in 5 seconds. 
Aug  9 11:40:34 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [126] operation=GET_MEMBERS in 15 seconds. 
Aug  9 11:40:34 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [124] operation=GET_MEMBERS in 25 seconds. 
Aug  9 11:40:39 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [128] operation=GET_MEMBERS in 10 seconds. 
Aug  9 11:40:39 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [126] operation=GET_MEMBERS in 20 seconds. 
Aug  9 11:40:39 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [124] operation=GET_MEMBERS in 30 seconds. 
Aug  9 11:40:44 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's ping call within 10 seconds! 
Aug  9 11:40:44 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's ping call within 10 seconds! 
Aug  9 11:40:44 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:44 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:44 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Connection to Connection [1] [192.168.11.159:5701 -> 192.168.11.159:5701] is lost 
Aug  9 11:40:44 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Connection to Connection [1] [192.168.11.159:5701 -> 192.168.11.159:5701] is lost 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.nio.Connection Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Connection [Address[192.168.11.34]:55081] lost. Reason: java.io.EOFException[null] 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is CLIENT_CONNECTION_LOST 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.nio.SocketAcceptor Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] 5701 is accepting socket connection from /192.168.11.34:55227 
Aug  9 11:40:44 localhost debug: [WARN] com.hazelcast.nio.ReadHandler Slf4jFactory.java:83 - [192.168.11.159]:5701 [test] hz._hzInstance_1_test.IO.thread-0 Closing socket to endpoint Address[192.168.11.34]:55081, Cause:java.io.EOFException 
Aug  9 11:40:44 localhost error: [WARN] com.hazelcast.nio.ReadHandler Slf4jFactory.java:83 - [192.168.11.159]:5701 [test] hz._hzInstance_1_test.IO.thread-0 Closing socket to endpoint Address[192.168.11.34]:55081, Cause:java.io.EOFException 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.nio.ConnectionManager Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] 5701 accepted socket connection from /192.168.11.34:55227 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.impl.ClientHandlerService Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Client {Connection [null -> Address[192.168.11.34]:55081] live=false, client=true, type=CLIENT} has been removed. 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.impl.ClientHandlerService Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] received auth from Connection [/192.168.11.34:55227 -> null] live=true, client=true, type=CLIENT, successfully authenticated 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.client.ConnectionManager Slf4jFactory.java:47 - SSE - Client is connecting to Connection [2] [192.168.11.159:5701 -> 192.168.11.159:5701] 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is CLIENT_CONNECTION_OPENING 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [128] operation=GET_MEMBERS in 15 seconds. 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [131] operation=GET_MEMBERS in 5 seconds. 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [126] operation=GET_MEMBERS in 25 seconds. 
Aug  9 11:40:44 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [124] operation=GET_MEMBERS in 35 seconds. 
Aug  9 11:40:45 localhost debug: [INFO] com.hazelcast.client.OutRunnable Slf4jFactory.java:47 - SSE - Thread[hz.client.1.OutThread,5,main]: Calling on disconnect Member [192.168.11.159]:5701 
Aug  9 11:40:47  debug: last message repeated 14 times
Aug  9 11:40:47 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:47 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:47 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Connection to Connection [2] [192.168.11.159:5701 -> 192.168.11.159:5701] is lost 
Aug  9 11:40:47 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Connection to Connection [2] [192.168.11.159:5701 -> 192.168.11.159:5701] is lost 
Aug  9 11:40:47 localhost debug: [INFO] com.hazelcast.nio.Connection Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Connection [Address[192.168.11.34]:55227] lost. Reason: java.io.EOFException[null] 
Aug  9 11:40:47 localhost debug: [WARN] com.hazelcast.nio.ReadHandler Slf4jFactory.java:83 - [192.168.11.159]:5701 [test] hz._hzInstance_1_test.IO.thread-1 Closing socket to endpoint Address[192.168.11.34]:55227, Cause:java.io.EOFException 
Aug  9 11:40:47 localhost error: [WARN] com.hazelcast.nio.ReadHandler Slf4jFactory.java:83 - [192.168.11.159]:5701 [test] hz._hzInstance_1_test.IO.thread-1 Closing socket to endpoint Address[192.168.11.34]:55227, Cause:java.io.EOFException 
Aug  9 11:40:47 localhost debug: [INFO] com.hazelcast.impl.ClientHandlerService Slf4jFactory.java:81 - [192.168.11.159]:5701 [test] Client {Connection [/192.168.11.34:55227 -> Address[192.168.11.34]:55227] live=false, client=true, type=CLIENT} has been removed. 
Aug  9 11:40:47 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is CLIENT_CONNECTION_LOST 
Aug  9 11:40:47 localhost debug: [INFO] com.hazelcast.client.ConnectionManager Slf4jFactory.java:47 - SSE - Unable to get alive cluster connection, try in 5,000 ms later, attempt 1 of 1. 
Aug  9 11:40:49 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [131] operation=GET_MEMBERS in 10 seconds. 
Aug  9 11:40:49 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [128] operation=GET_MEMBERS in 20 seconds. 
Aug  9 11:40:49 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [126] operation=GET_MEMBERS in 30 seconds. 
Aug  9 11:40:49 localhost debug: [INFO] com.hazelcast.client.ProxyHelper Slf4jFactory.java:47 - SSE - There is no response for Call [124] operation=GET_MEMBERS in 40 seconds. 
Aug  9 11:40:50 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:50 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:52 localhost debug: [WARN] com.hazelcast.client.OutRunnable Slf4jFactory.java:45 - SSE - Could not restore the connection 
Aug  9 11:40:52 localhost error: [WARN] com.hazelcast.client.OutRunnable Slf4jFactory.java:45 - SSE - Could not restore the connection 
Aug  9 11:40:52 localhost debug: [WARN] com.hazelcast.client.OutRunnable Slf4jFactory.java:45 - SSE - Reconnection: true. Interrupting and shutting down the client! 
Aug  9 11:40:52 localhost error: [WARN] com.hazelcast.client.OutRunnable Slf4jFactory.java:45 - SSE - Reconnection: true. Interrupting and shutting down the client! 
Aug  9 11:40:52 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is SHUTTING_DOWN 
Aug  9 11:40:52 localhost debug: [INFO] com.hazelcast.client.HazelcastClient Slf4jFactory.java:47 - SSE - HazelcastClient[1] is shutting down. 
Aug  9 11:40:52 localhost debug: [INFO] com.hazelcast.client.OutRunnable Slf4jFactory.java:47 - SSE - OutRunnable is finished ok. 
Aug  9 11:40:52 localhost debug: [INFO] com.hazelcast.client.OutRunnable Slf4jFactory.java:47 - SSE - OutRunnable is finished. 
Aug  9 11:40:53 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:53 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:56 localhost debug: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:56 localhost error: [WARN] com.hazelcast.client.ConnectionManager Slf4jFactory.java:45 - SSE - Server didn't respond to client's requests for 30 seconds. Assuming it is dead, closing the connection! 
Aug  9 11:40:57 localhost debug: [INFO] com.hazelcast.client.ConnectionManager Slf4jFactory.java:47 - SSE - ConnectionManager shutdown 
Aug  9 11:40:57 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient shutdown completed in 5110 ms. 
Aug  9 11:40:57 localhost debug: [INFO] com.hazelcast.client.LifecycleServiceClientImpl Slf4jFactory.java:47 - SSE - HazelcastClient is SHUTDOWN 
Aug  9 11:40:58 localhost debug: [INFO] com.hazelcast.client.InRunnable Slf4jFactory.java:47 - SSE - InRunnable is finished ok. 
Aug  9 11:40:58 localhost debug: [INFO] com.hazelcast.client.InRunnable Slf4jFactory.java:47 - SSE - InRunnable is finished.
@notz

This comment has been minimized.

Copy link
Contributor Author

commented Aug 9, 2013

Here a more detailed log what's going on: After reconnection 2 Packets are written ok, then every write takes about 3 seconds, while no response is read. After that the client is unable to recover.

http://pastebin.com/tua4kTW1

Has something todo with NearCaching (#620) and the massive reinitialisation of map entry listeners. It's working without near caching and not working with it. While one client connection with 2 EntryListeners is recovering, the other connection with much more entry listeners don't recover.

@notz notz closed this Aug 12, 2013

@notz notz reopened this Aug 12, 2013

@notz

This comment has been minimized.

Copy link
Contributor Author

commented Aug 12, 2013

Here is a test for 2.6.2-SNAPSHOT:

import java.io.Serializable;
import java.util.concurrent.TimeUnit;

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.ClientConfig;
import com.hazelcast.config.Config;
import com.hazelcast.config.GroupConfig;
import com.hazelcast.config.MapConfig;
import com.hazelcast.config.NearCacheConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;


public class ClientNearCacheReconnectTest {

    private static int USERS = 10000;

    public static void main(String[] args) throws InterruptedException {

        System.setProperty("hazelcast.client.near.cache.enabled", "true");

        Config config = new Config();
        config.setGroupConfig(new GroupConfig("test", "test"));
        config.getNetworkConfig().setPort(6701);

        MapConfig mapConfig = new MapConfig("*");
        NearCacheConfig nearCacheConfig = new NearCacheConfig();
        nearCacheConfig.setInvalidateOnChange(true);
        nearCacheConfig.setTimeToLiveSeconds(3600);
        mapConfig.setNearCacheConfig(nearCacheConfig);
        config.addMapConfig(mapConfig);


        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);

        Hazelcast.newHazelcastInstance(config);

        HazelcastInstance client;
        ClientConfig clientConfig = new ClientConfig();
        clientConfig.addAddress("localhost:6701");
        clientConfig.setGroupConfig(new GroupConfig("test", "test"));
        clientConfig.setReconnectionAttemptLimit(10);
        client = HazelcastClient.newHazelcastClient(clientConfig);

        try {
            Thread.sleep(10000);
        } catch (InterruptedException e1) {
        }

        for (int m=0; m<=30; m++) {
            IMap<Long, Event> map = client.getMap("testIndex" + m);
            map.put(1l, new Event(1, getUserId()));
            System.out.println("map testIndex " + m + " created");
        }


        hazelcastInstance.getLifecycleService().kill();

        Thread.sleep(10000);
        for (int m=0; m<=30; m++) {
            IMap<Long, Event> map = client.getMap("testIndex" + m);
            map.put(1l, new Event(1, getUserId()), 1, TimeUnit.SECONDS);
            Thread.sleep(20);
            map.remove(2l);
            map.put(3l, new Event(3, getUserId()), 1, TimeUnit.SECONDS);
        }

        Thread.sleep(10000);

        hazelcastInstance = Hazelcast.newHazelcastInstance(config);

        Thread.sleep(10000);

        for (int m=0; m<=30; m++) {
            IMap<Long, Event> map = client.getMap("testIndex" + m);
            map.get(1l);
        }

        System.exit(0);
    }

    private static int getUserId() {
        return (int)Math.round(Math.random() * USERS) + 1;
    }

    public static class Event implements Serializable {

        private static final long serialVersionUID = 1L;

        private long id;
        private int userId;

        public Event(long id, int userId) {
            this.setId(id);
            this.setUserId(userId);
        }

        public void setId(long id) {
            this.id = id;
        }

        public long getId() {
            return id;
        }

        public void setUserId(int userId) {
            this.userId = userId;
        }

        public int getUserId() {
            return userId;
        }

    }
}
@notz

This comment has been minimized.

Copy link
Contributor Author

commented Aug 13, 2013

It's also not working if client is under heavy load. Seems the problem are the amount of queued calls on connection phase. Perhaps writing calls should wait until connection is "opened"

This was referenced Aug 13, 2013

@MrEasy

This comment has been minimized.

Copy link
Contributor

commented Aug 14, 2013

I see a similar issue with version 3.0 (at least from reading the log messages).
Will your fix also be targeting 3.0, or is that version too different at this place?

@mdogan mdogan closed this Aug 14, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.