Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers/ws281x: add SysTick + GPIO LL backend #20646

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

maribu
Copy link
Member

@maribu maribu commented May 2, 2024

Contribution description

This implements a new ws281x backend that is largely inspired by @chrysn's periph_timer_poll + GPIO LL backend, but uses SysTick instead of periph_timer_poll. The advantage is that the SysTick timer is provided by all Cortex M MCUs and therefore (hopefully) should work on all Cortex M MCUs.

Testing procedure

Run the tests/drivers/ws281x app on a Cortex M MCU; it should now be used as default backend on Cortex M MCUs.

I tested this successfully with

  • nucleo-f072rb
  • nucleo-f103rb
  • nucleo-f303re
  • nucleo-f446re
  • nucleo-g070rb

Issues/PRs references

None

@github-actions github-actions bot added Platform: ARM Platform: This PR/issue effects ARM-based platforms Area: build system Area: Build system Area: drivers Area: Device drivers Area: cpu Area: CPU/MCU ports labels May 2, 2024
maribu added 2 commits May 3, 2024 09:01
This allow checking if a SysTick timer (provided by all Cortex M MCUs)
is present.
This backend is largely inspired by the `periph_timer_poll` + GPIO LL
driver, but uses the SysTick timer instead of the `periph_timer`.

The main advantage is that this (hopefully) works across Cortex M MCUs
with no configuration other than the GPIO pin needed.
@maribu maribu added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label May 3, 2024
@riot-ci
Copy link

riot-ci commented May 3, 2024

Murdock results

✔️ PASSED

1bbce1f fixup! drivers/ws281x: add SysTick + GPIO LL backend

Success Failures Total Runtime
10088 0 10088 13m:26s

Artifacts

drivers/ws281x/systick_gpio_ll.c Show resolved Hide resolved
drivers/ws281x/systick_gpio_ll.c Outdated Show resolved Hide resolved
Comment on lines +102 to +103
_systick_start(ticks_low);
_systick_wait();
Copy link
Contributor

@kfessel kfessel May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh i see where something precise is need (high time) there are only: set; wait and clr. and for the low time the inprecise start; wait; and the next start is placed

instead of reconfiguring the systick each time; i would start the systick once configured to some interval that cover high and low and wait for the count flag to raise to raise n times.

eg.:

set the systick to 100ns
for  each bit b in byte do

set gpio 

if b == 1 
3 times  syst_wait()
else 
6 times syst_wait()
endif 

clear gpio 

if b == 1 
6 times  syst_wait()
else 
3 times syst_wait()
endif

done

this probably makes the whole thing less jittery

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That adds a branch in the timing sensitive phase and waiting for 300 ns on a STM32F072 is already pretty challenging with one time starting the systick timer. I doubt that you can start the systick timer three times within that time budget.

Copy link
Contributor

@kfessel kfessel May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not starting you just have it running and wait for n ticks

this is much faster then what you are currently doing ( and also kindof dumber (a little more the stupid of KISS)

maybe the c version in the next comment clears this up

the 3 and the 6 where guess numbers looking at the manual it is

300ns

bit high low
1 2 waits 2 waits
0 1 wait 3 waits

or 400 ns

(ws2512b version 2 but it seems that that should also work with most valid v1 timings)

bit high low
1 2 waits 1 wait
0 1 wait 2 waits

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the manuals that i found that had tolerances with the highs an lows would also be fine with the 300 ns and the 1: 2/3 1/3 and 0: 2/3 1/3 ( going close to the tolerances)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the argument to split a single wait into three?

In any case: Each of the three waits would have to spin until a flag is set, clear the flag, and spin again. On the nucleo-f072rb the total wait is 13 cycles. Splitting that into three waits means 4 CPU cycles per wait. That is not possible to implement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://godbolt.org/z/4nn3a3e7n
see function t2 the wait loop has 2 instructions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but a conditional branch will take more than one CPU cycle.

And again: Why divide a single wait? It adds lines or code, bytes to .text. And why into three waits? Wouldn't 42 or 1337 be much better numbers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the has timing that are 1/4:3/4 or 1/3:2/3

you can put the branch outside of the time critical path

when i wrote the first suggestion i wasn't aware that we run the nucleo 72 @ 12MHz (I am still not sure but 13 cycles in about a microseconds would fit 12MHz)

but the reconfiguration of the systick is at least 7 instructions (if the register adresses are loaded) and a wait of unknown length

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it about not reconfiguring the systimer

this spreads the waits a little better so that they each have least amount of empty loops

    _systick_start( 4 );
    for (uint8_t cnt = 8; cnt > 0; cnt--) {
        int bit = !!(data & 0x80);
        if(!bit)
        {
            _systick_wait();
            gpio_ll_set();
            _systick_wait();
            gpio_ll_clear();
            _systick_wait();
        }else{
            _systick_wait();
            gpio_ll_set();
            _systick_wait();
            _systick_wait();
            gpio_ll_clear();
        }
        data <<= 1;
        _systick_wait();
    }

Comment on lines +95 to +105
for (uint8_t cnt = 8; cnt > 0; cnt--) {
uint32_t ticks_high = (data & 0x80) ? ticks_one : ticks_zero;
uint32_t ticks_low = ticks_bit - ticks_high;
_systick_start(ticks_high);
gpio_ll_set(port, mask);
_systick_wait();
gpio_ll_clear(port, mask);
_systick_start(ticks_low);
_systick_wait();
data <<= 1;
}
Copy link
Contributor

@kfessel kfessel May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets try to rewrite my pseudo code in pseudo c

Suggested change
for (uint8_t cnt = 8; cnt > 0; cnt--) {
uint32_t ticks_high = (data & 0x80) ? ticks_one : ticks_zero;
uint32_t ticks_low = ticks_bit - ticks_high;
_systick_start(ticks_high);
gpio_ll_set(port, mask);
_systick_wait();
gpio_ll_clear(port, mask);
_systick_start(ticks_low);
_systick_wait();
data <<= 1;
}
_systick_start( 300ns );
for (uint8_t cnt = 8; cnt > 0; cnt--) {
int bit = !!(data & 0x80);
gpio_ll_set(port, mask);
if(!bit){
_systick_wait();
}else{
_systick_wait();_systick_wait();_systick_wait();
}
gpio_ll_clear(port, mask);
if(!bit){
_systick_wait();_systick_wait();
}else{
_systick_wait();_systick_wait();
}
data <<= 1;
}

Copy link
Contributor

@kfessel kfessel May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then there is the potential use of the systick irq

global byte
global count

systick_irq(){
if(count % 4 = 0)  gpio_set();
else if (count % 4 =1 && !(byte & 0x80 >> (count / 4))) gpio_clr;
else if (count % 4 = 2 ) gpio_clr;
count++;
if count >= 4 * 8 load next byte and reset count
)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overhead of entering the IRQ, backing up the state etc. takes way longer than the total time budget

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok the irq thing might not be the right choice if you got 4 cycles

* rather be on the high side by adding a few CPU cycles. */
const uint32_t ticks_one = DIV_ROUND_UP((uint64_t)WS281X_T_DATA_ONE_NS * (uint64_t)coreclk(), NS_PER_SEC) + 16;
/* the low time should rather be on the short side, so rounding down */
const uint32_t ticks_zero = (uint64_t)(WS281X_T_DATA_ZERO_NS - 50) * (uint64_t)coreclk() / NS_PER_SEC;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 48 MHz this is:

(325 - 50) * 48 * 10^6 / 10^9 = 13.2

Copy link
Contributor

@kfessel kfessel May 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so 13 cyles for one wait ( which has (a load, a compare and a jump back if bit 16 isn't set) i think this is plenty of time to do this multiple times.

the short wait (300ns) has exact the same number of wait as with your code but it does not do a setup in the low period but does just three short waits instead.

i thought you where talking about 13cycle for the 1 µsecond

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: build system Area: Build system Area: cpu Area: CPU/MCU ports Area: drivers Area: Device drivers CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: ARM Platform: This PR/issue effects ARM-based platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants