cpu/atmega256rfr2: symbol counter based RTT support #12852

Merged
benpicco merged 1 commit into RIOT-OS:master from chudov:atmegarfr2-rtt on Dec 6, 2019

Conversation

@chudov
Contributor

@chudov chudov commented Dec 1, 2019

Contribution description

This adds RTT support to atmega256rfr2 based on the MAC symbol counter. The MAC symbol counter is 32 bits wide and runs at 62.5 kHz, derived from either the 16 MHz XTAL1 oscillator or the 32.768 kHz TOSC1 oscillator. The symbol counter automatically switches from the XTAL1 oscillator to the TOSC1 oscillator when the CPU goes to sleep and switches back when the CPU wakes up.
The current implementation uses the 32.768 kHz oscillator for both modes. The SCOCR2 compare register is used for alarm generation.

The RTT requires a 32.768 kHz oscillator to be connected to TOSC1 and TOSC2 of the MCU. All supported atmega256rfr2-based boards have one.
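
To make the numbers above concrete, here is a minimal host-side sketch of the tick arithmetic for a 32-bit counter running at 62.5 kHz. The names `SC_TICKS_PER_SEC`, `sc_sec_to_ticks()`, `sc_ticks_to_sec()` and `sc_wrap_period_sec()` are hypothetical and not part of the PR; only the 62.5 kHz rate and the 32-bit width come from the description.

```c
#include <stdint.h>

/* Hypothetical name; 62.5 kHz tick rate as stated in the description. */
#define SC_TICKS_PER_SEC    (62500UL)

/* Convert whole seconds to symbol counter ticks. */
static inline uint32_t sc_sec_to_ticks(uint32_t sec)
{
    return (uint32_t)(sec * SC_TICKS_PER_SEC);
}

/* Convert ticks back to whole seconds (rounds down). */
static inline uint32_t sc_ticks_to_sec(uint32_t ticks)
{
    return (uint32_t)(ticks / SC_TICKS_PER_SEC);
}

/* Whole seconds until a 32-bit counter wraps around: 2^32 / 62500. */
static inline uint32_t sc_wrap_period_sec(void)
{
    return (uint32_t)((UINT64_C(1) << 32) / SC_TICKS_PER_SEC);
}
```

With these assumptions a 5 s alarm corresponds to 312500 ticks, and the counter wraps roughly every 68719 s (about 19 hours).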

Testing procedure

Tests were executed on two ConBee I USB dongles that are based on deRFmega256-23M12 modules.
tests/periph_rtt executed successfully.

tests/gnrc_gomach was also adapted for testing. Packets were sent out successfully (checked with an 802.15.4 sniffer), but RIOT on the receiver module crashed with a kernel panic.

Issues/PRs references

Inspired by and based on #12815

@benpicco benpicco added Platform: AVR Platform: This PR/issue effects AVR-based platforms Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation labels Dec 2, 2019
@benpicco
Contributor

benpicco commented Dec 2, 2019

Since this shares a lot of code with atmega_common/periph/rtt.c I think it would make sense to just extend that implementation to use the symbol counter when available and fall back to using TIMER2. (#ifdef SCCR0).

I reckon the advantage here is better sleep performance (no unnecessary wake-ups for overflow interrupts). TBH I don't even know how well the current implementation handles Deep Sleep.

@herjulf
Contributor

herjulf commented Dec 2, 2019

Nice, also to see the RTT support based on TIMER2. It worked.
IMO the advantage of the symbol counter is the hardware synchronization between the RTC and the XTAL. This is very useful for RDC and also for putting the MCU to sleep. I tried this in Contiki but had to give it up due to legacy platforms. Deep sleep should work with both RTT implementations, so the #ifdef can be a good solution.
Output of tests/periph_rtt on the avr-rss2 with debugging enabled:

[2019-12-02 21:13:18] RTT waits until SC not busy
[2019-12-02 21:13:18] RTT initialized
[2019-12-02 21:13:18] RTT now: 247
[2019-12-02 21:13:18] Setting initial alarm to now + 5 s (312747)
[2019-12-02 21:13:18] RTT set alarm SCCNT: 569, SCOCR2: 312747
[2019-12-02 21:13:18] RTT alarm interrupt active
[2019-12-02 21:13:18] Done setting up the RTT, wait for many Hellos
[2019-12-02 21:13:23] RTT set alarm SCCNT: 312747, SCOCR2: 625247
[2019-12-02 21:13:23] RTT alarm interrupt active
[2019-12-02 21:13:23] Hello
[2019-12-02 21:13:28] RTT set alarm SCCNT: 625247, SCOCR2: 937747
[2019-12-02 21:13:28] RTT alarm interrupt active
[2019-12-02 21:13:28] Hello
[2019-12-02 21:13:33] RTT set alarm SCCNT: 937747, SCOCR2: 1250247
[2019-12-02 21:13:33] RTT alarm interrupt active
[2019-12-02 21:13:33] Hello
[2019-12-02 21:13:38] RTT set alarm SCCNT: 1250247, SCOCR2: 1562747
[2019-12-02 21:13:38] RTT alarm interrupt active
[2019-12-02 21:13:38] Hello
[2019-12-02 21:13:43] RTT set alarm SCCNT: 1562747, SCOCR2: 1875247
[2019-12-02 21:13:43] RTT alarm interrupt active
[2019-12-02 21:13:43] Hello
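
The alarm values in this log advance by 312500 ticks, i.e. 5 s at 62.5 kHz. A small host-side sketch of that arithmetic follows; it relies on unsigned 32-bit arithmetic wrapping modulo 2^32, so it stays valid when the counter overflows. The helper names `sc_next_alarm()` and `sc_ticks_until()` are hypothetical and are not taken from the PR.

```c
#include <stdint.h>

/* Hypothetical name; 62.5 kHz tick rate as seen in the log above. */
#define SC_TICKS_PER_SEC    (62500UL)

/* Next alarm = reference value + period. uint32_t arithmetic wraps
 * modulo 2^32, which matches a free-running 32-bit counter. */
static inline uint32_t sc_next_alarm(uint32_t ref, uint32_t period_sec)
{
    return ref + (uint32_t)(period_sec * SC_TICKS_PER_SEC);
}

/* Ticks remaining until the alarm, also valid across a wrap-around. */
static inline uint32_t sc_ticks_until(uint32_t now, uint32_t alarm)
{
    return alarm - now;
}
```

For example, `sc_next_alarm(312747, 5)` gives 625247, which is the SCOCR2 value logged at 21:13:23.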

@herjulf
Contributor

herjulf commented Dec 3, 2019

tests/gnrc_gomach was also adapted for testing. Packets were sent out successfully (checked with an 802.15.4 sniffer), but RIOT on the receiver module crashed with a kernel panic.

@chudov do you get tests/gnrc_gomach running with the rtt based on TIMER2?

@benpicco
Contributor

benpicco commented Dec 3, 2019

The crash also happens with TIMER2 #12857

@herjulf
Contributor

herjulf commented Dec 3, 2019

OK.
Memory use is modest.

text	   data	    bss	    dec	    hex	filename
  61302	   8940	   4174	  74416	  122b0	/home/robert/RIOT/tests/gnrc_gomach/bin/avr-rss2/tests_gnrc_gomach.elf

IRQ race with radio?

One generic problem we have with timer callbacks is that they are running in IRQ context. In Linux we have a softirq.

@chudov
Contributor Author

chudov commented Dec 3, 2019

Since this shares a lot of code with atmega_common/periph/rtt.c I think it would make sense to just extend that implementation to use the symbol counter when available and fall back to using TIMER2. (#ifdef SCCR0).

Reworked as proposed.

I reckon the advantage here is better sleep performance (no unnecessary wake-ups for overflow interrupts). TBH I don't even know how well the current implementation handles Deep Sleep.

Would you recommend a branch with well-working power management for AVR? Then I could try to test it.

@benpicco
Contributor

benpicco commented Dec 3, 2019

Would you recommend a branch with well-working power management for AVR? Then I could try to test it.

That work had been started in #8207, but it hasn't been updated in a while.

Contributor

@benpicco benpicco left a comment

This sure looks doable!
But rtt_set_alarm() and rtt_set_counter() are drowning in a soup of #ifdefs and rtt_init() is just on the edge. Either factor out the differing parts into helper functions (with just one #ifdef) or simply have two versions of those functions.
The 'common code' in those cases is trivial anyway.
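
One way to read this suggestion is sketched below: the backend-specific register access sits behind two small helpers selected by a single `#ifdef`, and the public functions stay free of conditionals. Everything here is an illustrative assumption and not the code from the PR: the `fake_*` variables stand in for hardware registers so the sketch builds on a host, the split of a TIMER2 alarm into an 8-bit hardware part and a software upper part is a guess at how an 8-bit timer might be extended, and the `*_sketch` names are hypothetical.

```c
#include <stdint.h>

#ifdef SCCR0
/* Symbol counter backend: one 32-bit compare value (stand-in for SCOCR2). */
static uint32_t fake_sc_compare;

static inline void hw_alarm_write(uint32_t alarm)
{
    fake_sc_compare = alarm;
}

static inline uint32_t hw_alarm_read(void)
{
    return fake_sc_compare;
}
#else
/* TIMER2 backend: 8-bit hardware compare plus upper bits kept in software
 * (assumed layout, for illustration only). */
static uint8_t fake_t2_compare;
static uint32_t fake_t2_alarm_high;

static inline void hw_alarm_write(uint32_t alarm)
{
    fake_t2_compare = (uint8_t)(alarm & 0xFF);
    fake_t2_alarm_high = alarm >> 8;
}

static inline uint32_t hw_alarm_read(void)
{
    return (fake_t2_alarm_high << 8) | fake_t2_compare;
}
#endif

/* Common code: no #ifdef needed at this level. */
void rtt_set_alarm_sketch(uint32_t alarm)
{
    hw_alarm_write(alarm);
}

uint32_t rtt_get_alarm_sketch(void)
{
    return hw_alarm_read();
}
```

The alternative mentioned in the comment, two complete versions of each function, trades a little duplication for even fewer preprocessor branches.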

@chudov
Contributor Author

chudov commented Dec 3, 2019

I hope it is better now.

@benpicco
Contributor

benpicco commented Dec 3, 2019

If the code gets cleaner by doing so, you can always drop some DEBUG macros 😉

@chudov chudov force-pushed the atmegarfr2-rtt branch 2 times, most recently from c2d2800 to 0a11872 Compare December 4, 2019 18:51
@benpicco benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Dec 4, 2019
@benpicco benpicco requested a review from maribu December 4, 2019 19:13
@herjulf
Contributor

herjulf commented Dec 4, 2019

Hello,

Did you test the last incarnation?

Stack Pointer: 0x81f3
*** RIOT kernel panic:
FAILED ASSERTION.

*** halted.

Also, I think you should include all boards with an ATmega256RFR2 MCU.

@herjulf
Contributor

herjulf commented Dec 4, 2019

Huh,
My fault. I was complaining about boards, but I had not added it myself... Works.

[2019-12-04 21:11:04] main(): This is RIOT! (Version: 2020.01-devel-1239-gbd254d)
[2019-12-04 21:11:04] Help: Press s to start test, r to print it is ready
[2019-12-04 21:11:14] START
[2019-12-04 21:11:14]
[2019-12-04 21:11:14] RIOT RTT low-level driver test
[2019-12-04 21:11:14] This test will display 'Hello' every 5 seconds
[2019-12-04 21:11:14]
[2019-12-04 21:11:14] Initializing the RTT driver
[2019-12-04 21:11:14] RTT now: 0
[2019-12-04 21:11:14] Setting initial alarm to now + 5 s (312500)
[2019-12-04 21:11:14] Done setting up the RTT, wait for many Hellos
[2019-12-04 21:11:19] Hello
[2019-12-04 21:11:24] Hello
[2019-12-04 21:11:29] Hello
[2019-12-04 21:11:34] Hello

Member

@maribu maribu left a comment

Nice to see an RTT implementation for the RFA1 and RFR2 family :-) See inline comments for some preliminary feedback

Comment on lines +31 to +36
* For MCUs with MAC symbol counter (atmega*rfr2):
* The MAC symbol counter is automatically switch to TOSC1 32.768kHz clock
* when transceiver or CPU is going to sleep. The MAC symbol counter is
* sourced by 62.500 kHz derived from 32.768kHz TOSC1 or 16 MHz system clock
* For current implementation symbol counter is always use 32.768kHz TOSC1 clock
* For alarms the SCOCR2 register is used.
Member

How about some rephrasing of this, e.g.:

 * For MCUs with a MAC symbol counter (ATmegaXXXRFA1 and ATmegaXXXRFR2):
 * The MAC symbol counter can be configured to be sourced by the 32.768kHz
 * RTC (TOSC1), or from a 62.500 kHz clock generated by dividing the 16 MHz
 * system clock. When either the CPU or the transceivers is going to sleep,
 * the MAC symbol counter is sourced by the RTC for both options. In order to
 * not have to compensate for a changing clock frequency, this RTT
 * implementation uses the 32.768kHz RTC as source even when both CPU and
 * transceiver are active. The 32 bit comparator in SCOCR2 is used for alarms.

Contributor Author

I will take the proposed text, except for the place where it is slightly incorrect: according to the datasheet "The symbol counter is a 32 bit counter which can be sourced by a 62.5 kHz clock, derived from the 16 MHz system clock or from the RTC (32.768 kHz)."

Member

You're right. I got confused while reading the data sheet. Thanks for pointing this out!

Member

Hmm, on another page the data sheet says "The counter time base can be derived from the 16 MHz crystal or the RTC (32.768 kHz crystal on TOSC) during operation."

Maybe the "62.5 kHz clock, derived from the 16 MHz system clock" does not mean it can be either a 62.5 kHz clock or be derived from the system clock. Maybe it means that the 62.5 kHz clock is derived from the system clock by dividing the 16 MHz by 256?

Contributor Author

@chudov chudov Dec 4, 2019

Actually, when deriving from the 16 MHz clock it is possible to use another divider. See section 10.11.35 (SCCR1), where all possible values are listed. This is nice flexibility, but it is unclear to me how to use it in real 802.15.4 applications.
Even worse, the counter can be clocked from the PG2 pin.

I'm not sure it makes sense to paste datasheet text into the sources...

Member

I'm not sure it makes sense to paste datasheet text into the sources...

No, only the takeaway needed for understanding the code and the design decisions is needed. To me, here it is: There are multiple options for the clock source, but only the 32.768 kHz RTC clock source is available when either the CPU or the transceiver is not active. Therefore, the RTC is chosen as the clock source here.

Contributor Author

@chudov chudov Dec 4, 2019

A great feature of the symbol counter is that in case of deep sleep it is automatically switched from XTAL to TOSC1 and back. But that leads to some delays for oscillator stabilization and some counter resync that are not yet clear to me. This is why 32K is always used :)
How to reflect that in comments?

Member

The comment you just made explains it quite well ;-)

The counter resync would also increase ROM and wake up time. It would increase the complexity of the code. To me, it is just not worth the effort.

Let me try to extend your comment above:

The ATmegaXXXRFA1/RFR2 provide multiple clock sources. But only the 32.768 kHz RTC (TOSC1) is available when the CPU or the transceiver is not active. Thus, it is used automatically as a fallback during sleep modes regardless of the configured clock source. To avoid the calculations needed to compensate for a change of the clock frequency during sleep, this implementation just uses TOSC1 (32.768 kHz) at any point in time.

Comment on lines +64 to +76
__attribute__((always_inline))
static inline uint32_t rg_read32(volatile uint8_t *hh,
                                 volatile uint8_t *hl,
                                 volatile uint8_t *lh,
                                 volatile uint8_t *ll)
{
    le_uint32_t rg;

    rg.u8[0] = *ll;
    rg.u8[1] = *lh;
    rg.u8[2] = *hl;
    rg.u8[3] = *hh;

    return rg.u32;
}
Member

@maribu maribu Dec 4, 2019

The __attribute__((always_inline)) should not be needed, unless someone compiles with -O0. But anyone doing so will expect additional overhead in CPU cycles and ROM size anyway...

This could be simplified further, as the bytes of those 32 bit registers are always laid out consecutively, with the least significant byte at the lowest address.

The current implementation contains a data race: According to the data sheet, a single 32 bit temporary register is used to capture the current contents of all available 32 bit registers, and a read of the least significant byte of any 32 bit register causes the temporary register to be updated. Let's say thread A starts to read a 32 bit register and has read two bytes when an IRQ is triggered. If the ISR now reads a 32 bit register, it will cause the temporary register to be updated. After the ISR completes, thread A will continue to read the remaining two bytes, but those will return the updated contents of the temporary register.

Also, a short summary of the reason the accesses are atomic as a comment would make the life of people reading the code easier. E.g.:

/*
 * Read a 32 bit register as described in section 10.3 of the datasheet: A read
 * of the least significant byte causes the current value to be atomically
 * captured in a temporary 32 bit register. The remaining reads will access this
 * register instead. Only a single 32 bit temporary register is used to provide
 * means to atomically access them. Thus, interrupts must be disabled during the
 * read sequence in order to prevent other threads (or ISRs) from updating the
 * temporary 32 bit register before the reading sequence has completed.
 */
static inline uint32_t reg32_read(volatile uint8_t *reg_ll)
{
    le_uint32_t reg;
    unsigned state = irq_disable();
    reg.u8[0] =  reg_ll[0];
    reg.u8[1] =  reg_ll[1];
    reg.u8[2] =  reg_ll[2];
    reg.u8[3] =  reg_ll[3];
    irq_restore(state);
    return reg.u32;
}

Contributor Author

Thank you.

Contributor

The implementation only calls rg_read32() when interrupts are already disabled though, no need to disable them again.

Member

Wait, in rtt_get_alarm() it does so. But that is the only exception; so adding irq_disable() and irq_restore() there and dropping it here would indeed be better.

Member

It's a pity that we cannot tell the compiler the semantics of irq_disable() and irq_restore() to allow it to just optimize out all inner pairs of irq_disable() and irq_restore() when inlining a function.

Contributor

@benpicco benpicco Dec 4, 2019

But the alarm value is not being modified by the MCU.
Only reading SCCNT is racy because this is continuously changing. But this happens inside _safe_cnt_get() where interrupts are disabled.

Contributor Author

@chudov chudov Dec 4, 2019

It seems these functions should be extracted to another peripheral module named "sc", as they could be used by optimized 802.15.4 network drivers to read other 32-bit registers related to the MAC symbol counter. In that case it makes sense to protect them from interrupts.

Member

But the alarm value is not being modified by the MCU.

Doesn't matter. It is read through the 32 bit temporary register. And if another thread or ISR is executed while the temporary register is being read, the contents of the temporary register might be modified.
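
The race discussed in this thread can be reproduced with a small host-side model. The sketch below only models the behaviour as described in the review comments (one shared temporary register, latched whenever the least significant byte of any 32-bit register is read); it has not been checked against hardware, and all names (`sc_model_t`, `model_read_byte()`, `model_read32()`) are hypothetical.

```c
#include <stdint.h>

/* Model: two 32-bit registers sharing ONE temporary latch. */
typedef struct {
    uint32_t regs[2];   /* e.g. index 0: counter, index 1: compare value */
    uint32_t temp;      /* shared temporary register */
} sc_model_t;

/* Reading byte 0 latches the whole register into the shared temporary
 * register; every byte (including byte 0) is then taken from the latch. */
static uint8_t model_read_byte(sc_model_t *m, unsigned reg, unsigned byte)
{
    if (byte == 0) {
        m->temp = m->regs[reg];
    }
    return (uint8_t)(m->temp >> (8 * byte));
}

/* Read four bytes, LSB first. If isr_reg >= 0, an "ISR" performs its own
 * 32-bit read of register isr_reg after the second byte has been read. */
static uint32_t model_read32(sc_model_t *m, unsigned reg, int isr_reg)
{
    uint32_t val = 0;

    for (unsigned i = 0; i < 4; i++) {
        if ((i == 2) && (isr_reg >= 0)) {
            (void)model_read32(m, (unsigned)isr_reg, -1);
        }
        val |= (uint32_t)model_read_byte(m, reg, i) << (8 * i);
    }
    return val;
}
```

With `regs = { 0x11223344, 0xAABBCCDD }`, an undisturbed read of register 1 returns 0xAABBCCDD, while a read interrupted by an ISR that reads register 0 returns 0x1122CCDD in this model. That is the case where the interrupted register itself never changes, yet the result is still corrupted.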

Contributor

@benpicco benpicco left a comment

Code looks good from my side, just one more thing:
I wanted to test if the Timer2 based code also still works, but all my ATmegas with 32kHz quartz are RF ones.

Turns out disabling RTT_BACKEND_SC is not as simple as it could be, but that's an easy fix.
Move one define and then run a search & replace #ifdef … -> #if ….

#include <avr/interrupt.h>

#ifdef SCCR0
#define RTT_BACKEND_SC (1)
Contributor

Drop this here.

* @name RTT configuration
* @{
*/
#ifdef SCCR0
Contributor

Suggested change
- #ifdef SCCR0
+ #if defined(SCCR0) && !defined(RTT_BACKEND_SC)
+ #define RTT_BACKEND_SC (1)
+ #endif
+ #if RTT_BACKEND_SC

and move it here.

Also s/#ifdef RTT_BACKEND_SC/#if RTT_BACKEND_SC/g, s/#ifndef RTT_BACKEND_SC/#if RTT_BACKEND_SC == 0/g so CFLAGS+=-DRTT_BACKEND_SC=0 works.
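
The reason for switching from `#ifdef` to `#if` is that `-DRTT_BACKEND_SC=0` still defines the macro, so an `#ifdef` check would keep selecting the symbol counter. A minimal sketch of the suggested pattern follows. The `SCCR0` stand-in exists only so the sketch builds on a host (on a real ATmega RFR2 build the macro would come from the AVR register headers), and `rtt_uses_symbol_counter()` is a hypothetical helper.

```c
/* Host stand-in (assumption): pretend the MCU has a symbol counter. */
#ifndef SCCR0
#define SCCR0 (0)
#endif

/* Default to the symbol counter backend when the register exists, unless
 * the build already chose a value, e.g. CFLAGS+=-DRTT_BACKEND_SC=0. */
#if defined(SCCR0) && !defined(RTT_BACKEND_SC)
#define RTT_BACKEND_SC (1)
#endif

/* Returns 1 when the symbol counter backend is selected, 0 for TIMER2. */
static inline int rtt_uses_symbol_counter(void)
{
#if RTT_BACKEND_SC
    return 1;
#else
    return 0;
#endif
}
```

Compiled without extra flags, the sketch selects the symbol counter; compiled with `-DRTT_BACKEND_SC=0`, the `#if` evaluates to false and the TIMER2 path is taken. An undefined `RTT_BACKEND_SC` also evaluates to 0 in `#if`, which covers MCUs without `SCCR0`.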

Contributor Author

Please check if it meets expectations

Contributor

Both backends work fine, please squash!

@herjulf
Contributor

herjulf commented Dec 5, 2019

I wanted to test if the Timer2 based code also still works, but all my ATmegas with 32kHz quartz are RF ones.

TIMER2 can have some merits and complement the symbol counter even on RF ones. No radio is involved, and it can supply the OS with both events and wake-ups at almost no power cost.

No progress with gomach... What is lwmac?

Contributor

@benpicco benpicco left a comment

Code looks good and both code paths still work on avr-rss2.

@benpicco
Contributor

benpicco commented Dec 5, 2019

TIMER2 can have some merits and complement the symbol counter even on RF ones. No radio is involved, and it can supply the OS with both events and wake-ups at almost no power cost.

Does using the symbol counter use more power?
I would actually expect it uses less power because the MCU doesn't have to wake up every 255 iterations for the roll-over.
You can still select the TIMER2 based implementation with #define RTT_BACKEND_SC 0

No progress with gomach... What is lwmac?

Not sure, does it also crash?

@herjulf
Contributor

herjulf commented Dec 5, 2019

Does using the symbol counter use more power?
I would actually expect it uses less power because the MCU doesn't have to wake up every 255 iterations for the roll-over.

Yes, but you can set a prescaler to slow things down. OSes still need some events, otherwise things get weird.

You can still select the TIMER2 based implementation with #define RTT_BACKEND_SC 0

OK. Fine.
Had the boards running for 6 weeks at 1 pkt/min. With LIC supercapacitors, a single charge of 120 s was needed. TIMER2 was the friend here. The symbol counter was useful with TSCH and more demanding applications.

@chudov
Contributor Author

chudov commented Dec 5, 2019

No progress with gomach... What is lwmac?

Not sure, does it also crash?

At least the test does not compile (#12869)

@herjulf
Contributor

herjulf commented Dec 6, 2019

Complementing the symbol counter (SC) vs TIMER2 discussion: the MCU can also be clocked from the internal oscillator (not the 16 MHz crystal) for lower power. I don't think the SC works in this setting... This should be verified.

@herjulf
Contributor

herjulf commented Dec 6, 2019

Happy. Elegant. Constructive collaboration!

@benpicco benpicco merged commit 29a3a7f into RIOT-OS:master Dec 6, 2019
@herjulf
Contributor

herjulf commented Dec 7, 2019

As follow-up work, it could be interesting to see how deep MCU sleep can be implemented in the RIOT thread architecture.

sleep_enable();
set_sleep_mode(SLEEP_MODE_PWR_SAVE);

sleep_cpu();
sleep_disable();
thread_yield();

In Contiki the sleep code is rtimer (xtimer in RIOT), which falls back to TIMER2.

@maribu
Member

maribu commented Dec 7, 2019

🎉

@chudov: Thanks for your work!

@benpicco
Contributor

benpicco commented Dec 7, 2019

@herjulf check out #11874
This should make it possible to use RTT as system timer.

@chudov chudov deleted the atmegarfr2-rtt branch December 8, 2019 00:05
@herjulf
Contributor

herjulf commented Dec 8, 2019

Yes, you mentioned it. It is not crystal clear how to get RTT running. Seems like xtimer is the focus. Need to read...

@fjmolinas fjmolinas added this to the Release 2020.01 milestone Dec 13, 2019
@fjmolinas fjmolinas added the Area: cpu Area: CPU/MCU ports label Jan 27, 2020