TSCH: fix a bug in slot scheduling #2140

atiselsts · 2017-03-14T19:59:44Z

The scheduling of a new TSCH timeslot at the end of tsch_slot_operation() enters a count-to-infinity loop if it is called RTIMER_GUARD ticks or fewer before the start of the next TSCH timeslot (i.e. 1 or 2 ticks on most platforms, where RTIMER_GUARD==2). This patch fixes this behavior by busywaiting until the start of the next slot in that case.

In order to avoid double busywaiting, the busywait code is now executed in TSCH_SCHEDULE_AND_YIELD if and only if it was not done in tsch_schedule_slot_operation().
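To make the decision described above concrete, here is a minimal host-side sketch, not the code from this patch. It assumes a 16-bit tick counter and simulates the hardware clock with a plain variable. Every `sketch_*` name is hypothetical, and `SKETCH_GUARD` merely stands in for RTIMER_GUARD. The scheduler arms a timer only when more than the guard interval remains; otherwise it busywaits to the target and reports that, so the caller knows not to busywait a second time.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t sketch_clock_t;    /* hypothetical 16-bit tick counter */
#define SKETCH_GUARD 2              /* stands in for RTIMER_GUARD */

sketch_clock_t sketch_now;          /* simulated hardware counter */
sketch_clock_t sketch_armed_target; /* last target handed to the "timer" */

/* Wrap-safe "a is strictly before b": the 16-bit difference is read as
 * signed (relies on two's complement conversion, as on common targets). */
int
sketch_clock_lt(sketch_clock_t a, sketch_clock_t b)
{
  return (int16_t)(sketch_clock_t)(a - b) < 0;
}

/* Returns 1 if a timer was armed (the caller should yield),
 * 0 if the target was too close or already passed, in which case
 * any remaining time has been busywaited here. */
int
sketch_schedule_slot(sketch_clock_t ref_time, sketch_clock_t offset)
{
  sketch_clock_t target = (sketch_clock_t)(ref_time + offset);

  /* Arm only if strictly more than SKETCH_GUARD ticks remain. */
  if(sketch_clock_lt((sketch_clock_t)(sketch_now + SKETCH_GUARD), target)) {
    sketch_armed_target = target;
    return 1;
  }
  /* Too close to arm a timer: spin until the target is reached. */
  while(sketch_clock_lt(sketch_now, target)) {
    sketch_now++;                   /* stands in for the hardware ticking */
  }
  return 0;
}

/* Small usage example with the slot starting at tick 110. */
void
sketch_demo(void)
{
  sketch_now = 100;                 /* 10 ticks early: timer is armed */
  assert(sketch_schedule_slot(100, 10) == 1);
  assert(sketch_armed_target == 110 && sketch_now == 100);

  sketch_now = 108;                 /* 2 ticks early: busywait instead */
  assert(sketch_schedule_slot(100, 10) == 0);
  assert(sketch_now == 110);
}
```

The return value is what lets a caller skip its own busywait when the scheduler has already spun up to the slot start.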

This code was developed as part of the SPHERE project (http://irc-sphere.ac.uk/)

simonduq · 2017-03-15T10:31:27Z

Hmm, I have seen something similar, I wonder if that was the same issue or not. I think the problem was when skipping a slot (lock or non-active slot), we would directly try and schedule the next slot, although we had not even reached the real start of the current timeslot (we'd be 1 or 2 ticks before, but now looking at tsch_schedule_slot_operation I'm no longer sure why this was the case).

simonduq · 2017-03-15T10:36:26Z

Now about the fix: I'm worried that in cases where we really did miss the deadline (e.g. join, or any slot operation that took too long), we'll end up busy-waiting (waiting for an rtimer wrap).

atiselsts · 2017-03-15T11:01:14Z

@simonduq that sounds like another thing that would trigger the same behavior: the bug here is triggered because it looks like check_timer_miss() only works correctly if the reference time is not in the future. If there is any doubt that this might happen somewhere, it may be better to add a condition like if(!RTIMER_CLOCK_LT(RTIMER_NOW(), ref_time + RTIMER_GUARD)) in the calling code?

(As a side note, I feel that this sort of thing, the core operation of TSCH, is crying out for randomized brute-force testing...)

> I'm worried that in cases where we really did miss the deadline (e.g. join, or any slot operation that took too long), we'll end up busy-waiting (waiting for an rtimer wrap).

The waiting will happen while RTIMER_CLOCK_LT(RTIMER_NOW(), (t0) + (offset)) returns true. So if the waiting starts more than half the rtimer period after the target time (1 second of the 2-second period on msp430), then indeed it will wait for the wraparound. However, if this huge delay ever happens it just means that there's another bug somewhere and the system is already broken :)
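The half-period limit follows from how a wrap-safe comparison on a fixed-width counter works. The sketch below is an illustration under stated assumptions rather than the Contiki definition: it assumes a 16-bit counter, and `demo_clock_t` and `demo_clock_lt` are hypothetical names. The difference between the two timestamps is reinterpreted as a signed value, so the result is only meaningful while the two are less than half a counter period apart.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t demo_clock_t;  /* hypothetical 16-bit tick counter */

/* Wrap-safe "a is strictly before b": the 16-bit difference is read as
 * signed (relies on two's complement conversion, as on common targets). */
int
demo_clock_lt(demo_clock_t a, demo_clock_t b)
{
  return (int16_t)(demo_clock_t)(a - b) < 0;
}

void
demo_self_check(void)
{
  /* Ordinary case: 100 is before 102, and not the other way round. */
  assert(demo_clock_lt(100, 102));
  assert(!demo_clock_lt(102, 100));

  /* Across the wrap: 65535 is still "before" 1 (two ticks later). */
  assert(demo_clock_lt(65535, 1));

  /* More than half a period late: "now" is 40000 ticks past the target,
   * yet it compares as before it, so a wait loop would keep spinning
   * until the counter wraps. */
  assert(demo_clock_lt((demo_clock_t)(1000 + 40000), 1000));
}
```

With a 16-bit counter, a deadline missed by fewer than 32768 ticks ends the wait immediately; a larger miss falls into the last case.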

simonduq · 2017-03-15T11:15:41Z

> (As a side note, I feel that this sort of thing, the core operation of TSCH, is crying out for randomized brute-force testing...)

I love the idea :)

> However, if this huge delay ever happens it just means that there's another bug somewhere and the system is already broken :)

Right, but the problem remains whenever tsch_schedule_slot_operation is called directly. For instance, when joining, we often have a few missed deadlines.

atiselsts · 2017-03-15T12:09:57Z

I think it's a reasonable assumption to make that the scheduling will not be missed by more than 1 second, even on msp430.

simonduq · 2017-03-15T14:36:41Z

Right, I agree. 👍

TSCH: fix a bug in tsch_schedule_slot_operation scheduling

47962fc

g-oikonomou self-assigned this Mar 14, 2017

This was referenced Oct 13, 2017

cc2650: strange pattern of resets #2357

Open

Pull bugfixes/PRs from contiki-os/contiki contiki-ng/contiki-ng#41

Closed

atiselsts mentioned this pull request Feb 23, 2018

TSCH: fix a bug in slot scheduling contiki-ng/contiki-ng#308

Merged
