core/thread: Allow for inline thread_yield_higher #15788

bergzand · 2021-01-18T16:17:19Z

Contribution description

This PR modifies core/include/thread.h to allow inlining the thread_yield_higher function, similar to the irq API. Patches for the relevant CPUs are included; each adds a thread_arch.h header that either contains the inlined function or serves as a dummy header.
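A minimal sketch of the opt-in pattern described above, assuming a hypothetical `THREAD_API_INLINED` flag set by the per-CPU header. Both headers are collapsed into one file so the sketch builds on a host, and the `yield_requests` counter and `yield_twice` helper are illustrative stand-ins rather than the code from this PR.

```c
/* Sketch only: both "headers" are shown in one file for brevity. */

/* ---- per-CPU part, e.g. a thread_arch.h header ---- */
/* Hypothetical opt-in flag; a dummy header would simply not define it. */
#define THREAD_API_INLINED

/* Stand-in for the hardware mechanism that requests a context switch. */
static unsigned yield_requests;

#ifdef THREAD_API_INLINED
/* GCC/Clang attribute: force inlining even at low optimization levels. */
static inline __attribute__((always_inline)) void thread_yield_higher(void)
{
    yield_requests++;
}
#endif

/* ---- generic part, e.g. core/include/thread.h ---- */
#ifndef THREAD_API_INLINED
/* No inline version provided: fall back to a regular function that the
 * CPU code defines in a .c file. */
void thread_yield_higher(void);
#endif

/* Example caller: with the flag set, both calls are expanded in place. */
static unsigned yield_twice(void)
{
    thread_yield_higher();
    thread_yield_higher();
    return yield_requests;
}
```

With the flag defined, every call site receives its own copy of the function body, so the call and return overhead goes away at the price of a few extra bytes per call site.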

Initial benchmarks show that this shaves off 13 cycles from the bench_thread_yield_pingpong test on the nrf52dk board.

~~TODO: add headers for arm7, esp8266 and msp430.~~

Testing procedure

~~Benchmarks: TODO~~
The threading tests must still work on all affected platforms.

I can run the tests for the cortex-m and the RISC-V platform myself, but I could use some help with the mips32r2 (pic32-wifire or similar) platform.

Benchmarks

Comparing flash size and "ticks" parameter of tests/bench_thread_yield_pingpong:

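| board | flash pre (bytes) | flash post (bytes) | bench pre (ticks) | bench post (ticks) |
| --- | ---: | ---: | ---: | ---: |
| samr21-xpro | 11760 | 11836 | 575 | 546 |
| nrf52dk | 10812 | 10892 | 414 | 399 |
| msba2 | 19924 | 19948 | 715 | 697 |
| hifive1b | 11108 | 11148 | 468 | 460 |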
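The trade-off in the table can be made explicit with a little arithmetic. The sketch below copies the numbers from the table; the `bench_row` struct and the helper names are illustrative only.

```c
/* One row of the benchmark table: flash in bytes, bench result in ticks. */
struct bench_row {
    const char *board;
    int flash_pre, flash_post;
    int ticks_pre, ticks_post;
};

/* Values copied verbatim from the table above. */
static const struct bench_row rows[] = {
    { "samr21-xpro", 11760, 11836, 575, 546 },
    { "nrf52dk",     10812, 10892, 414, 399 },
    { "msba2",       19924, 19948, 715, 697 },
    { "hifive1b",    11108, 11148, 468, 460 },
};

/* Extra flash used after the change, in bytes. */
static int flash_cost(const struct bench_row *r)
{
    return r->flash_post - r->flash_pre;
}

/* Ticks saved in the benchmark after the change. */
static int ticks_saved(const struct bench_row *r)
{
    return r->ticks_pre - r->ticks_post;
}
```

For these rows the helpers give 76 bytes for 29 ticks on samr21-xpro, 80 bytes for 15 ticks on nrf52dk, 24 bytes for 18 ticks on msba2, and 40 bytes for 8 ticks on hifive1b.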

Issues/PRs references

None

maribu · 2021-01-19T10:33:20Z

Funny, I had exactly this on my wish list too. Thanks for tackling this :-)

(I wonder if we will undo this and the other inline headers again once LTO=1 becomes default 🤣 - but that should be easier.)

bergzand · 2021-01-19T10:36:24Z

> (I wonder if we will undo this and the other inline headers again once LTO=1 becomes default 🤣 - but that should be easier.)

Probably, but as you say, that should be relatively low effort.

maribu · 2021-01-19T10:47:54Z

Would be nice to see benchmarks for at least one board per arch, so that this is motivated.

bergzand · 2021-01-19T10:55:40Z

First for the cortex-m0+ and the cortex-m4 case using bench_thread_yield_idle, using the ticks result:

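| board | flash pre (bytes) | flash post (bytes) | bench pre (ticks) | bench post (ticks) |
| --- | ---: | ---: | ---: | ---: |
| samr21-xpro | 11760 | 11836 | 575 | 546 |
| nrf52dk | 10812 | 10892 | 414 | 399 |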

maribu · 2021-01-19T13:45:56Z

I wonder why this increases ROM noticeably. I smell another missed optimization opportunity in GCC.

bergzand · 2021-01-19T13:59:46Z

> I wonder why this increases ROM noticeably. I smell another missed optimization opportunity in GCC.

On the nrf52dk, the inlined sequence is LDR, MOV.W, STR and ISB. The LDR and STR are 16-bit instructions; the other two are 32-bit. In addition, a 32-bit word at the end of the function stores the memory address of the SCB->ICSR register. That makes 16 bytes in total for the whole sequence, every time thread_yield_higher is used.
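As a rough illustration of where those four instructions come from, here is a host-buildable sketch. It assumes the Cortex-M implementation sets the PENDSVSET bit (bit 28) in SCB->ICSR and then issues an ISB; `mock_scb`, `mock_isb` and `thread_yield_higher_sketch` are stand-ins invented for this sketch, not the code from the PR.

```c
#include <stdint.h>

/* Mock of the single register used here so the sketch builds on a host.
 * On a real target this would be SCB->ICSR from the CMSIS headers. */
static struct {
    volatile uint32_t ICSR;
} mock_scb;

/* PENDSVSET is bit 28 of the ICSR register. */
#define PENDSVSET_MASK (UINT32_C(1) << 28)

/* Stand-in for __ISB(): only a compiler barrier on the host (GCC/Clang). */
static inline void mock_isb(void)
{
    __asm__ volatile ("" ::: "memory");
}

/* Assumed shape of the inlined function on Cortex-M. */
static inline __attribute__((always_inline)) void thread_yield_higher_sketch(void)
{
    /* LDR loads the register address, MOV.W builds the constant, STR writes it */
    mock_scb.ICSR = PENDSVSET_MASK;
    /* ISB: make sure the pending interrupt is taken before continuing */
    mock_isb();
}

/* Size of the sequence as described in the comment above, in bytes. */
enum {
    SEQ_BYTES = 2   /* LDR, 16-bit encoding */
              + 4   /* MOV.W, 32-bit encoding */
              + 2   /* STR, 16-bit encoding */
              + 4   /* ISB, 32-bit encoding */
              + 4   /* literal word holding the address of SCB->ICSR */
};
```

The sum matches the 16 bytes quoted above. A function with several call sites may be able to share the literal word, so the per-call cost can vary slightly.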

cpu/arm7_common/include/thread_arch.h

maribu · 2021-01-20T07:58:12Z

We're only missing tests / benchmarks for the fe310 and MIPS. And the ESP platform has no pseudo header yet.

@aabadie: Would you mind giving tests/bench_thread_yield_pingpong a spin on your Hifive 1?

@francois-berder: Would you mind doing the same on one of your MIPS boards?

bergzand · 2021-01-20T09:38:33Z

> And the ESP platform has no pseudo header yet.

Here it is

bergzand · 2021-01-20T12:03:53Z

And the hifive1b (fe310) case using bench_thread_yield_idle, using the ticks result:

| board | flash pre (bytes) | flash post (bytes) | bench pre (ticks) | bench post (ticks) |
| --- | ---: | ---: | ---: | ---: |
| hifive1b | 11108 | 11148 | 468 | 460 |

Trading 40 bytes for 8 clock ticks :)

maribu · 2021-01-20T12:26:43Z

For now, add only an empty pseudo-header for MIPS to get this in swiftly?

benpicco · 2021-01-20T12:33:59Z

This only moves an existing function into a header, do we really expect any change in behavior because of that?

bergzand · 2021-01-20T12:39:07Z

> This only moves an existing function into a header, do we really expect any change in behavior because of that?

The function is inlined, influencing the performance, so yeah, I would prefer to have hard numbers before changing this.

In terms of flash, I see a 20-byte increase on the pic32-wifire and a 24-byte increase on the 6lowpan-clicker.

benpicco · 2021-01-21T22:14:38Z

native needs an update too.

bergzand · 2021-01-22T08:22:57Z

> native needs an update too.

Of course 😑. Added.

benpicco

looks good to me.

maribu · 2021-01-22T19:01:21Z

I'd say we don't need to wait for MIPS results. Every architecture, without exception, got a speed-up in a hot code path from this, as expected. Odds are good that theory and practice also match for MIPS.

bergzand · 2021-01-22T19:26:07Z

Thanks!

bergzand requested review from aabadie and kaspar030 as code owners January 18, 2021 16:17

core/thread: uncrustify header file

f0ce992

bergzand force-pushed the pr/core/inline_thread_yield_higher branch from d162a25 to c88871f January 19, 2021 09:41

bergzand requested review from gschorcht and maribu as code owners January 19, 2021 09:41

bergzand removed the "State: WIP" label Jan 19, 2021

bergzand added 6 commits January 19, 2021 11:03

core/thread: Allow for inline thread_yield_higher

9d5f87b

Similar to irq.h, this allows inlining the often trivial thread_yield_higher function.

cpu/avr8_common: Add dummy thread_arch.h header

0d43c96

On the avr8 CPU, the thread_yield_higher function is complex enough that it is not inlined.

cpu/cortexm_common: Inline thread_yield_higher function

0129e73

cpu/fe310: Inline thread_yield_higher function

9979646

cpu/mips32r2_common: Inline thread_yield_higher function

0b01a99

cpu/msp430_common: Add dummy thread_arch.h header

bce9e3c

bergzand force-pushed the pr/core/inline_thread_yield_higher branch from c88871f to 9c59580 January 19, 2021 10:03

bergzand added and removed the "CI: ready for build" label Jan 19, 2021

benpicco reviewed Jan 19, 2021

cpu/arm7_common/include/thread_arch.h (resolved)

bergzand force-pushed the pr/core/inline_thread_yield_higher branch from 9c59580 to ede0f17 January 19, 2021 20:39

maribu reviewed Jan 20, 2021

cpu/arm7_common/include/thread_arch.h (resolved)

cpu/arm7_common: Inline thread_yield_higher function

84dfc88

bergzand force-pushed the pr/core/inline_thread_yield_higher branch from ede0f17 to 84dfc88 January 20, 2021 09:38

maribu added the "CI: ready for build" label Jan 20, 2021

cpu/native: Add dummy thread_arch.h header

517fc58

benpicco approved these changes Jan 22, 2021

benpicco merged commit 4c403d6 into RIOT-OS:master Jan 22, 2021

bergzand deleted the pr/core/inline_thread_yield_higher branch January 22, 2021 19:26

kaspar030 added this to the Release 2021.04 milestone Apr 23, 2021
