-
Notifications
You must be signed in to change notification settings - Fork 424
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Retry timeout do not work as expected #197
Comments
Coap server logs |
With test 1, there is only one outstanding unsolicited Observe response waiting for an ACK from the client, and so the retry counters are as expected. With test 2, the server continues to send additional unsolicited Observe responses at the repeat frequency (in your case every 5 secs) until the logic times things completely out and drops the observer. So in your test 2 you have 4 outstanding Observe responses waiting for an ACK before observer is dropped. The issue here is that coap_insert_node() is biasing the timeouts (making it a delta time between 2 entries) of a queue of outstanding requests, making the re-transmit times shorter as this delta bias not re-adjusted when the head of the queue is taken off and processed (because of the Fixing coap_insert_node() by not making it time deltas fixes your issue, but causes the sendqueue tests to blow up.
Fixing coap_write() to keep the bias and cause it to get added to the first entry to the queue also works, but also the LWIP coap_retransmittimer_execute() function needs to get fixed - I suspect in the same way, but I do not have a LWIP version to test against. The CONTIKI version appears to have no issues here from code inspection.
@obgm What was the reasoning for maintaining retry times as deltas between each entry in the coap_queue_t list? |
The reasoning was that numbers will be smaller and, more important, when you insert a node into the list, you only need to update the t value of the subsequent node (and set the value of the inserted node). I still think that this is a good idea (but have not looked into the issue discussed here). |
The issue is when the node is pop'd off the list - the t value of the next entry does not get adjusted. It is quite easy to reproduce using standard coap-client and coap-server
Then abort coap-client after it has received about 2 time updates. Capture output from coap-server and grep for a tid value to see the re-transmit times. |
The node queue uses time deltas between each entry in the queue for retransmissions. The head entry does not have a time delta. When the head of the head of the queue is taken off for retransmission the next in the queue needs to have the time delta updated to be an absolute value. However, this does not happen if the current head has had the time offset fore set to 0 and so the retransmit timer fires more frequently than expected. There are 2 ways to fix this - not maintain time deltas between queue entries, or not set the head entries time to 0. Doing the latter means that code that relies on the deltas (I can only find the test suite) continue to work before calling coap_retransmit(). See Issue obgm#197 for further information.
@mbaril Can you please confirm the issue has now been fixed in the latest develop code. |
Can anyone confirm this to be fixed? |
Closing as the issue seems to be solved. |
I think the retry timeout do not work correctly when multiple CONF messages are sent in parallel.
Test procedure:
Step 1: I run the CoAP server example (coap-server)
Step 2: I create a CoAP observer on the resource /time.
Step 3: I kill the CoAP observer process
Test #1: OK
I use COAP_RESOURCE_CHECK_TIME of 120 and a COAP_OBS_MAX_FAIL is equal to 0.
Messages sent by server after step 3:
After 4 retries the CoAP server remove the observer entry as expected.
In this test the delay between each retries are correct.
Test #2: Delay between retries are not good
I use COAP_RESOURCE_CHECK_TIME of 5 sec and a COAP_OBS_MAX_FAIL is equal to 0.
Messages sent by server after step 3:
This time, the delay between retries do not seems to be correct.
MsgID= 29557.
Retry 1 at 2.2 sec (OK)
Retry 2 at 7.3 sec (OK)
Retry 3 at 12.1 sec (Not ok. Should be approx. 15.4 sec)
Retry #4 at 15.4 sec (Not ok.. Should approx. 33 sec)
MsgID= 29558.
Retry 1 at 3.0 sec (OK)
Retry 2 at 7.8 sec (Not ok. Should be approx. 9 sec)
Retry 3 at 9.8 sec (Not ok. Should be approx. 21 sec)
Retry #4 at 12.8 sec sec (Not ok.. Should approx. 45 sec)
The text was updated successfully, but these errors were encountered: