Timeout handling #472

philib · 2017-11-10T09:52:05Z

Hey,

Is there a way to subscribe to timeout events on the server side?
I want to register whether an ACK packet is sent as a response to an observation from the client or not.

Thanks in advance,
Philip

boaks · 2017-11-10T12:34:52Z

I'm not sure what you assume that timeout should be related to.
If you send a notify as ACK, this is just sent from the CoapServer; there is no timeout related to that transmission (it's not a CON, it's just an ACK).
To send that notify, the CoapServer ends up calling CoapResource.handleGET(). So if you're just interested in whether an attempt was made to send the notify, implement handleGET accordingly.

philib · 2017-11-10T12:52:50Z

I'm interested in whether the notify from the server is acknowledged by the client, to check whether the client is still connected properly.

I'm also interested in handling reconnection on the client side. If a client observes a resource and, e.g., the server crashes and restarts, the client will no longer get updates from the server. If the server doesn't notify the client within a given time, I'd like to check whether the server is still available; otherwise I want to keep requesting a new observation until the server is available again.

boaks · 2017-11-10T14:41:15Z

For the coap-server (the observed):
There are two configuration parameters:
NOTIFICATION_CHECK_INTERVAL (time in milliseconds)
NOTIFICATION_CHECK_INTERVAL_COUNT

If either the count or the time is reached, the coap-server uses a CON notify to check whether the coap-client is still interested.
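The rule described above can be modelled as a small predicate. This is only a sketch of the decision as stated in this thread, not Californium's actual implementation; the class name, the method name, and the threshold values used with it are assumptions made for illustration.

```java
/** Illustrative model of the "count or time reached" check; NOT Californium code. */
final class NotificationCheckPolicy {
    // Assumed to correspond to NOTIFICATION_CHECK_INTERVAL (milliseconds).
    private final long checkIntervalMillis;
    // Assumed to correspond to NOTIFICATION_CHECK_INTERVAL_COUNT.
    private final int checkIntervalCount;

    NotificationCheckPolicy(long checkIntervalMillis, int checkIntervalCount) {
        this.checkIntervalMillis = checkIntervalMillis;
        this.checkIntervalCount = checkIntervalCount;
    }

    /**
     * Returns true if the next notify should be sent as CON, i.e. if either
     * the elapsed time or the number of notifies since the last CON notify
     * has reached its configured limit.
     */
    boolean nextNotifyShouldBeConfirmable(long millisSinceLastCheck, int notifiesSinceLastCheck) {
        return millisSinceLastCheck >= checkIntervalMillis
                || notifiesSinceLastCheck >= checkIntervalCount;
    }
}
```

With hypothetical limits of 60000 ms and 10 notifies, the predicate stays false until one of the two limits is hit, at which point the server would probe the client with a CON notify.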

For the coap-client (the observer):
If you use the CoapClient, the CoapObserveRelation will take care of re-registering if it doesn't receive a notify. That's bound to the MAX_AGE option (default 60s).

philib · 2017-11-13T08:05:37Z

> if either the count or the time is reached, the coap-server uses a CON notify to check, if the coap-client is still interested.

Is there any method which gets invoked after the notify is canceled? I want to handle those disconnects, but I only get console output:

Nov 13, 2017 8:39:29 AM org.eclipse.californium.core.network.stack.ObserveLayer$NotificationController onTimeout INFORMATION: Notification for token [5e1644e98fbb0b50] timed out. Canceling all relations with source [/127.0.0.1:51514]

> If you use the CoapClient there CoapObserveRelation will take care of doing a reregister, if it doesn't receive a notify. That's bound to the MAX_AGE option (default 60s).

On the client, the onError() on the CoapObserveRelation gets invoked, but it takes up to 2 min and doesn't result in a reconnect. Where do I have to set MAX_AGE to speed up the invocation?

boaks · 2017-11-16T12:48:14Z

> Is there any method which will get invoked after the notify gets canceled? I want to handle those disconnects but I only get a console output

Have a look at CoapResource.removeObserveRelation or the ResourceObserver

> On the client the onError() on the CoapObserveRelation gets invoked

OK, if that gets invoked, then you're not using a commit from our repository! Or which commit are you using that provides a CoapObserveRelation.onError()?

> , but it takes up to 2 min and doesn't result in a reconnect. Where do I have to set MAX_AGE to speed up the invocation?

I'm not sure what you mean. Why should an "onError" (assuming you mean CoapHandler.onError()) schedule such a "refresh observation"? If you get an error, why do you think auto-repeat is a good approach? So, I'm not sure what your plan is. That "refresh observation" is intended for cases where you have established an observation but have received neither an error nor a notify for a longer period. Only in that case is a new observation request sent.

philib · 2017-11-17T08:10:03Z

> Have a look at CoapResource.removeObserveRelation or the ResourceObserver

Thanks, that's what I was looking for.

> assuming you mean CoapHandler.onError()

yes, my fault

> So, I'm not sure, what your plan is

Assume a server that a client is connected to loses its connection and is offline for a longer period (e.g. 5 min). The client should recognize such a server-side disconnect immediately and then try to reconnect to the server until it is back again.

boaks · 2017-11-17T12:27:24Z

> The client should recognize such a server-side disconnect immediately and then try to reconnect to the server until it is back again.

So, your coap-client was observing some resource on a coap-server.
Then that "notify timeout" occurred and triggered an "observe reregister",
which fails because the observed coap-server was offline,
which is reported via "onError()".

If you want to do retries in that case, just trigger the retry in "onError()". I'm not sure which pattern would match your situation best, but I hope you know it and can therefore implement a proper retry strategy. Californium only implements the very basic (and safe) functionality for that.
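One possible retry strategy is capped exponential backoff. The following is a minimal sketch using only the JDK, assuming the application passes in a Runnable that issues the new observe request (for example by calling CoapClient.observe(handler) again) and calls the helper from its CoapHandler callbacks. The class name, method names, and delay values are hypothetical and not part of Californium.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: re-issues an observe request with capped exponential backoff. */
final class ObserveRetry {
    private final ScheduledExecutorService scheduler;
    private final Runnable reobserve; // supplied by the caller, e.g. () -> client.observe(handler)
    private final long initialDelayMillis;
    private final long maxDelayMillis;
    private int failedAttempts;

    ObserveRetry(ScheduledExecutorService scheduler, Runnable reobserve,
                 long initialDelayMillis, long maxDelayMillis) {
        this.scheduler = scheduler;
        this.reobserve = reobserve;
        this.initialDelayMillis = initialDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Delay before retry number {@code attempt} (0-based): initial * 2^attempt, capped at max. */
    static long delayMillis(int attempt, long initialDelayMillis, long maxDelayMillis) {
        long delay = initialDelayMillis;
        for (int i = 0; i < attempt && delay < maxDelayMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMillis);
    }

    /** Intended to be called from the application's onError() callback. */
    synchronized void onError() {
        long delay = delayMillis(failedAttempts, initialDelayMillis, maxDelayMillis);
        failedAttempts++;
        scheduler.schedule(reobserve, delay, TimeUnit.MILLISECONDS);
    }

    /** Intended to be called from onLoad(): a response arrived, so reset the backoff. */
    synchronized void onSuccess() {
        failedAttempts = 0;
    }
}
```

The cap keeps the client from waiting arbitrarily long once the server has been down for a while, and the reset in onSuccess() makes the next outage start again from the short initial delay.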

philib · 2017-11-17T14:44:54Z

> just trigger that retry in the "onError()"

I wanted to do that, but it takes too much time for the onError() to be invoked (up to 10 minutes). Is there a way to speed up this invocation by configuring the coap-client?
I couldn't find a suitable NetworkConfig to achieve this.

Thanks in advance

boaks · 2017-11-17T15:52:32Z

Sorry, it's time that you provide some Wireshark and Californium logs, where we can see what happens :-).
And possibly the branch/commit/tag you're using.
Unfortunately, I will not be able to work on this next week, but from the 27.11. I will have a look on your logs.

philib · 2017-11-24T09:44:45Z

Thanks for your patience !!
I'm using the current version of the master branch.
I started the server and the client and killed the server to simulate downtime.

Here is my CoAP Server
Here is my CoAP Client

That's the console output:

That's the Wireshark log:

As you can see from the console and the Wireshark logs, the client starts the first reconnect attempt after about 17 min.

vikram919 · 2017-11-29T13:44:08Z

@philib
From the log information you have provided, I can see that the Max-Age option is set to 1000 seconds.
One question: did you explicitly set the Max-Age option on the response in your implementation?

    @Override
    public void handleGET(CoapExchange exchange) {
        exchange.setMaxAge(1000); // instead, set it to 0
        // respond to the request
        exchange.respond("Hello World!");
    }

Please read this comment:
#479 (comment)

boaks · 2017-11-29T14:07:05Z

So some more details about the timings.

RFC7641, 3.3.1, page 11:

To make sure it has a current representation and/or to re-register
its interest in a resource, a client MAY issue a new GET request with
the same token as the original at any time. All options MUST be
identical to those in the original request except for the set of ETag
Options. It is RECOMMENDED that the client does not issue the
request while it still has a fresh notification/response for the
resource in its cache. Additionally, the client SHOULD at least wait
for a random amount of time between 5 and 15 seconds after Max-Age
expired to reduce collisions with other clients.

Because of the recommendation at the end, the client waits for that MAX-AGE until it re-registers (in your log 9:42:16 to 9:59:00). Then, because your server is still down, the client retries that re-register, which also takes 1 minute to fail. So how long it takes to detect that the server is no longer available depends on the settings you're using.

For most communication techniques, the only way to check "aliveness" is to exchange messages. Some techniques do this within the protocol, others outside of it.

If you need shorter detection times, then you may design your communication using notifies more frequently. If you send such a notify every 30s (and adjust the MAX_AGE accordingly), you may detect faster, that the server is down. Sure, at the cost of more traffic. So you must find the best trade-off between detection time and traffic for your application.
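The timing described above can be worked through with a small calculation. This is a sketch of the RFC 7641 recommendation only (Max-Age plus a random 5 to 15 seconds); the class and method names are made up for illustration, and Californium's internal scheduling may differ in detail.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative arithmetic for the RFC 7641, section 3.3.1 re-registration delay. */
final class ReregistrationTiming {
    static final long MIN_BACKOFF_SECONDS = 5;
    static final long MAX_BACKOFF_SECONDS = 15;

    /** Earliest re-registration, in seconds after the last fresh notification. */
    static long earliestSeconds(long maxAgeSeconds) {
        return maxAgeSeconds + MIN_BACKOFF_SECONDS;
    }

    /** Latest re-registration, in seconds after the last fresh notification. */
    static long latestSeconds(long maxAgeSeconds) {
        return maxAgeSeconds + MAX_BACKOFF_SECONDS;
    }

    /** One randomly chosen delay within the recommended window (both ends inclusive). */
    static long randomDelaySeconds(long maxAgeSeconds) {
        return maxAgeSeconds
                + ThreadLocalRandom.current().nextLong(MIN_BACKOFF_SECONDS, MAX_BACKOFF_SECONDS + 1);
    }
}
```

With Max-Age set to 1000 the window is 1005 to 1015 seconds, which is in line with the roughly 17 minutes seen in the log. With notifies every 30 seconds and Max-Age set to 30, the window shrinks to 35 to 45 seconds. The time the failing re-register request itself needs before it reports an error comes on top of that.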

boaks · 2017-12-05T11:55:58Z

@philib

Do you still have issues related to the timeout handling?
If not, can we close this issue?

philib · 2017-12-05T14:08:52Z

> Because of the recommendation at the end, the client waits for that MAX-AGE until it re-registers (in your log 9:42:16 to 9:59:00). Then, because your server is still down, the client retries that re-register, which also takes 1 minute to fail. So, if it takes long to detect that the server is no longer available, this depends on the settings you're using.

Thanks a lot, reducing the MAX_AGE on the server side solved the problem 👍

boaks added the waiting for feedback label Dec 5, 2017

philib closed this as completed Dec 5, 2017
