Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[client] client OperationTimeoutException lacks operation details #11241

Closed
pveentjer opened this issue Aug 27, 2017 · 3 comments
Closed

[client] client OperationTimeoutException lacks operation details #11241

pveentjer opened this issue Aug 27, 2017 · 3 comments

Comments

@pveentjer
Copy link
Member

@pveentjer pveentjer commented Aug 27, 2017

So we need to know the data-structure and the operation. E.g. map, employees, put. Currently we see a useless message:

ClientInvocation{clientMessageType=5, target=address [10.174.77.36]:5705, sendConnection=ClientConnection{alive=true, connectionId=1, socketChannel=DefaultSocketChannelWrapper{soc

ketChannel=java.nio.channels.SocketChannel[connected local=/10.174.77.43:51480 remote=/10.174.77.36:5705]}, remoteEndpoint=[10.174.77.36]:5705, lastReadTime=2017-08-26 15:07:23.569, lastWriteTime=2017-08-26 15:07:23.543, closedTime=neve

r, lastHeartbeatRequested=2017-08-26 15:05:18.511, lastHeartbeatReceived=2017-08-26 15:05:18.511, connected server version=3.8.3}

We don't see the:

  • type of data-structure (e.g. IMap/ICache etc)
  • the name of the data-structure e.g. employees
  • the method involved e.g. get

Some of the information can be retrieved by going up the callstack and look for e.g. IMapProxy.get.. But it isn't clear which map is involved. And this approach further breaks down when dealing with an async call because then you get exceptions out of thin air since there is no usable stacktrace and there is no reference to data-structure/name/method. This makes debugging such issues nearly impossible.

@pveentjer pveentjer added this to the 3.8.6 milestone Aug 27, 2017
@pveentjer pveentjer changed the title client OperationTimeoutException should include 'operation' client OperationTimeoutException lacks operation details Aug 27, 2017
@pveentjer pveentjer changed the title client OperationTimeoutException lacks operation details [client] client OperationTimeoutException lacks operation details Aug 27, 2017
@sancar
Copy link
Member

@sancar sancar commented Aug 28, 2017

In the message, we have messageType(tells createProxy is done), target info and how much time passed after timeout threshold. messageType 5 -> createProxy
One thing we can improve to is to log message name instead of message type. We don't have a way to look up to that but can be added easily.
Name of the datastructure(employees) is not available in this context. If we see it really valuable, we are gonna need to add another field to ClientInvocation

@pveentjer
Copy link
Member Author

@pveentjer pveentjer commented Aug 29, 2017

message type for the end user is not useful. There really needs to be a Operation 'class' name that identfies the type of data-structure and and the operation.

There also needs to be a data-structure name. The serverside OperationTimeoutException can serve as a good example how the client message should look like. Adding the name is essential to provide good debugging support; I have wasted time Saturday evening/night to help a customer and I could not figure out quickly what happened due to non informative client exceptions.

@asimarslan asimarslan modified the milestones: 3.8.7, 3.8.6 Sep 13, 2017
@ihsandemir
Copy link
Contributor

@ihsandemir ihsandemir commented Sep 25, 2017

Per PR #11419, we also may need to add invocation information into the HazelcastOverloadException.

@sancar sancar self-assigned this Oct 3, 2017
sancar added a commit to sancar/hazelcast that referenced this issue Oct 4, 2017
An example exception will be as follows:
com.hazelcast.core.OperationTimeoutException: ClientInvocation{clientMessage = ClientMessage{length=180, correlationId=22, operation=ExecutorService_submitToPartition, messageType=905, partitionId=55, isComplete=true, isRetryable=false, isEvent=false, writeOffset=0}, target = partition 55, sendConnection = MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=true, connectionId=1, channel=null, remoteEndpoint=[127.0.0.1]:5001, lastReadTime=2017-10-04 14:38:27.470, lastWriteTime=2017-10-04 14:38:27.468, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.9-SNAPSHOT}}, objectName = test} timed out because exception occurred after client invocation timeout 2000 ms. Current time: 2017-10-04 14:38:27.472. Start time: 2017-10-04 14:38:24.928. Total elapsed time: 2545 ms.

followings are included as new in the exception message
- type of data-structure (e.g. IMap/ICache etc)
- the name of the data-structure e.g. employees
- the method involved e.g. get
- starttime, currenttime, elapsedtime and configured timeout

fixes hazelcast#11241
fixes hazelcast#11236
sancar added a commit to sancar/hazelcast that referenced this issue Oct 4, 2017
An example exception will be as follows:
com.hazelcast.core.OperationTimeoutException: ClientInvocation{clientMessage = ClientMessage{length=180, correlationId=22, operation=ExecutorService_submitToPartition, messageType=905, partitionId=55, isComplete=true, isRetryable=false, isEvent=false, writeOffset=0}, target = partition 55, sendConnection = MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=true, connectionId=1, channel=null, remoteEndpoint=[127.0.0.1]:5001, lastReadTime=2017-10-04 14:38:27.470, lastWriteTime=2017-10-04 14:38:27.468, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.9-SNAPSHOT}}, objectName = test} timed out because exception occurred after client invocation timeout 2000 ms. Current time: 2017-10-04 14:38:27.472. Start time: 2017-10-04 14:38:24.928. Total elapsed time: 2545 ms.

followings are included as new in the exception message
- type of data-structure (e.g. IMap/ICache etc)
- the name of the data-structure e.g. employees
- the method involved e.g. get
- starttime, currenttime, elapsedtime and configured timeout

fixes hazelcast#11241
fixes hazelcast#11236
sancar added a commit to sancar/hazelcast that referenced this issue Oct 4, 2017
An example exception will be as follows:
com.hazelcast.core.OperationTimeoutException: ClientInvocation{clientMessage = ClientMessage{length=180, correlationId=22, operation=ExecutorService_submitToPartition, messageType=905, partitionId=55, isComplete=true, isRetryable=false, isEvent=false, writeOffset=0}, target = partition 55, sendConnection = MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=true, connectionId=1, channel=null, remoteEndpoint=[127.0.0.1]:5001, lastReadTime=2017-10-04 14:38:27.470, lastWriteTime=2017-10-04 14:38:27.468, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.9-SNAPSHOT}}, objectName = test} timed out because exception occurred after client invocation timeout 2000 ms. Current time: 2017-10-04 14:38:27.472. Start time: 2017-10-04 14:38:24.928. Total elapsed time: 2545 ms.

followings are included as new in the exception message
- type of data-structure (e.g. IMap/ICache etc)
- the name of the data-structure e.g. employees
- the method involved e.g. get
- starttime, currenttime, elapsedtime and configured timeout

fixes hazelcast#11241
fixes hazelcast#11236
sancar added a commit to sancar/hazelcast that referenced this issue Oct 5, 2017
An example exception will be as follows:
com.hazelcast.core.OperationTimeoutException: ClientInvocation{clientMessage = ClientMessage{length=180, correlationId=22, operation=ExecutorService_submitToPartition, messageType=905, partitionId=55, isComplete=true, isRetryable=false, isEvent=false, writeOffset=0}, target = partition 55, sendConnection = MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=true, connectionId=1, channel=null, remoteEndpoint=[127.0.0.1]:5001, lastReadTime=2017-10-04 14:38:27.470, lastWriteTime=2017-10-04 14:38:27.468, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.9-SNAPSHOT}}, objectName = test} timed out because exception occurred after client invocation timeout 2000 ms. Current time: 2017-10-04 14:38:27.472. Start time: 2017-10-04 14:38:24.928. Total elapsed time: 2545 ms.

followings are included as new in the exception message
- type of data-structure (e.g. IMap/ICache etc)
- the name of the data-structure e.g. employees
- the method involved e.g. get
- starttime, currenttime, elapsedtime and configured timeout

fixes hazelcast#11241
fixes hazelcast#11236
sancar added a commit to sancar/hazelcast that referenced this issue Oct 5, 2017
An example exception will be as follows:
com.hazelcast.core.OperationTimeoutException: ClientInvocation{clientMessage = ClientMessage{length=180, correlationId=22, operation=ExecutorService_submitToPartition, messageType=905, partitionId=55, isComplete=true, isRetryable=false, isEvent=false, writeOffset=0}, target = partition 55, sendConnection = MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=true, connectionId=1, channel=null, remoteEndpoint=[127.0.0.1]:5001, lastReadTime=2017-10-04 14:38:27.470, lastWriteTime=2017-10-04 14:38:27.468, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.9-SNAPSHOT}}, objectName = test} timed out because exception occurred after client invocation timeout 2000 ms. Current time: 2017-10-04 14:38:27.472. Start time: 2017-10-04 14:38:24.928. Total elapsed time: 2545 ms.

followings are included as new in the exception message
- type of data-structure (e.g. IMap/ICache etc)
- the name of the data-structure e.g. employees
- the method involved e.g. get
- starttime, currenttime, elapsedtime and configured timeout

fixes hazelcast#11241
fixes hazelcast#11236
sancar added a commit to sancar/hazelcast that referenced this issue Oct 5, 2017
An example exception will be as follows:
com.hazelcast.core.OperationTimeoutException: ClientInvocation{clientMessage = ClientMessage{length=180, correlationId=22, operation=ExecutorService_submitToPartition, messageType=905, partitionId=55, isComplete=true, isRetryable=false, isEvent=false, writeOffset=0}, target = partition 55, sendConnection = MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=true, connectionId=1, channel=null, remoteEndpoint=[127.0.0.1]:5001, lastReadTime=2017-10-04 14:38:27.470, lastWriteTime=2017-10-04 14:38:27.468, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.9-SNAPSHOT}}, objectName = test} timed out because exception occurred after client invocation timeout 2000 ms. Current time: 2017-10-04 14:38:27.472. Start time: 2017-10-04 14:38:24.928. Total elapsed time: 2545 ms.

followings are included as new in the exception message
- type of data-structure (e.g. IMap/ICache etc)
- the name of the data-structure e.g. employees
- the method involved e.g. get
- starttime, currenttime, elapsedtime and configured timeout

fixes hazelcast#11241
fixes hazelcast#11236
sancar added a commit to sancar/hazelcast that referenced this issue Oct 5, 2017
An example exception will be as follows:
com.hazelcast.core.OperationTimeoutException: ClientInvocation{clientMessage = ClientMessage{length=180, correlationId=22, operation=ExecutorService_submitToPartition, messageType=905, partitionId=55, isComplete=true, isRetryable=false, isEvent=false, writeOffset=0}, target = partition 55, sendConnection = MockedClientConnection{localAddress=[127.0.0.1]:40001, super=ClientConnection{alive=true, connectionId=1, channel=null, remoteEndpoint=[127.0.0.1]:5001, lastReadTime=2017-10-04 14:38:27.470, lastWriteTime=2017-10-04 14:38:27.468, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.9-SNAPSHOT}}, objectName = test} timed out because exception occurred after client invocation timeout 2000 ms. Current time: 2017-10-04 14:38:27.472. Start time: 2017-10-04 14:38:24.928. Total elapsed time: 2545 ms.

followings are included as new in the exception message
- type of data-structure (e.g. IMap/ICache etc)
- the name of the data-structure e.g. employees
- the method involved e.g. get
- starttime, currenttime, elapsedtime and configured timeout

fixes hazelcast#11241
fixes hazelcast#11236
@sancar sancar closed this in #11511 Oct 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

5 participants
You can’t perform that action at this time.