Skip to content

Conversation

lyakh
Copy link

@lyakh lyakh commented Nov 3, 2020

This PR together with thesofproject/sof#3578 enables building Zephyr with SOF for arbitrary architectures and platforms. This is an RFC at the moment.

lyakh added 4 commits November 5, 2020 12:10
rimage dropped its "-m" parameter and switched over to using "-c"
for a configuration file, including a target name.

Add support for extended manifest for all cAVS versions.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
soc/xtensa/intel_adsp/common/include/cavs/memory.h wend missing from
the SOF update, restore it.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
1. SOF doesn't have to be built in .bin format
2. don't include soc.c and soc_mp.c twice in cmake
3. remove an unused mailbox.h header and unused code in adsp.c

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
This allows building SOF for unsupported platforms to enable
compilation testing and to simplify porting to new platforms.

Signed-off-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
@lyakh lyakh marked this pull request as ready for review November 5, 2020 11:14
@lyakh lyakh requested a review from andyross as a code owner November 5, 2020 11:14
@lyakh
Copy link
Author

lyakh commented Nov 5, 2020

@andyross I updated the first rimage patch: I think python now looks more acceptable, at least I removed the hard-coded path to a configuration file

@lyakh
Copy link
Author

lyakh commented Nov 6, 2020

The SOF counterpart is now thesofproject/sof#3591

andyross pushed a commit that referenced this pull request Nov 9, 2020
The _ldiv5() is an optimized divide-by-5 function that is smaller and
faster than the generic libgcc implementation.

Yet it can be made even smaller and faster with this replacement
implementation based on a reciprocal multiplication plus some tricks.

For example, here's the assembly from the original code on ARM:

_ldiv5:
        ldr     r3, [r0]
        movw    ip, zephyrproject-rtos#52429
        ldr     r1, [r0, #4]
        movt    ip, 52428
        adds    r3, r3, #2
        push    {r4, r5, r6, r7, lr}
        mov     lr, #0
        adc     r1, r1, lr
        adds    r2, lr, lr
        umull   r7, r6, ip, r1
        lsr     r6, r6, #2
        adc     r7, r6, r6
        adds    r2, r2, r2
        adc     r7, r7, r7
        adds    r2, r2, lr
        adc     r7, r7, r6
        subs    r3, r3, r2
        sbc     r7, r1, r7
        lsr     r2, r3, #3
        orr     r2, r2, r7, lsl zephyrproject-rtos#29
        umull   r2, r1, ip, r2
        lsr     r2, r1, #2
        lsr     r7, r1, zephyrproject-rtos#31
        lsl     r1, r2, #3
        adds    r4, lr, r1
        adc     r5, r6, r7
        adds    r2, r1, r1
        adds    r2, r2, r2
        adds    r2, r2, r1
        subs    r2, r3, r2
        umull   r3, r2, ip, r2
        lsr     r2, r2, #2
        adds    r4, r4, r2
        adc     r5, r5, #0
        strd    r4, [r0]
        pop     {r4, r5, r6, r7, pc}

And here's the resulting assembly with this commit applied:

_ldiv5:
        push    {r4, r5, r6, r7}
        movw    r4, zephyrproject-rtos#13107
        ldr     r6, [r0]
        movt    r4, 13107
        ldr     r1, [r0, #4]
        mov     r3, #0
        umull   r6, r7, r6, r4
        add     r2, r4, r4, lsl #1
        umull   r4, r5, r1, r4
        adds    r1, r6, r2
        adc     r2, r7, r2
        adds    ip, r6, r4
        adc     r1, r7, r5
        adds    r2, ip, r2
        adc     r2, r1, r3
        adds    r2, r4, r2
        adc     r3, r5, r3
        strd    r2, [r0]
        pop     {r4, r5, r6, r7}
        bx      lr

So we're down to 20 instructions from 36 initially, with only 2 umull
instructions instead of 3, and slightly smaller stack footprint.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
@lyakh lyakh closed this Jan 8, 2021
@lyakh lyakh deleted the sof_gen_arch branch January 8, 2021 10:05
lyakh pushed a commit to lyakh/zephyr that referenced this pull request Jan 28, 2021
With classic volatile pointer access gcc something generates
access instructions with immediate offset value, like

str     w4, [x1], andyross#4

Such instructions produce invalid syndrome in HSR register when are
trapped by hypervisor. This leads to inability to emulate device access
in hypervisor.

So we need to make sure that any access to device memory is done
with plain str/ldr instructions without offset.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
andyross added a commit that referenced this pull request Feb 6, 2025
1. Mostly complete.  Supports MPU, userspace, PSPLIM-based stack
guards, and FPU/DSP features.  ARMv8-M secure mode "should" work but I
don't know how to test it.

2. Designed with an eye to uncompromising/best-in-industry cooperative
context switch performance.  No PendSV exception nor hardware
stacking/unstacking, just a traditional "musical chairs" switch.
Context gets saved on process stacks only instead of split between
there and the thread struct.  No branches in the core integer switch
code (and just one in the FPU bits that can't be avoided).

3. Minimal assembly use; arch_switch() itself is ALWAYS_INLINE, there
is an assembly stub for exception exit, and that's it beyond one/two
instruction inlines elsewhere.

4. Selectable at build time, interoperable with existing code.  Just
use the pre-existing CONFIG_USE_SWITCH=y flag to enable it.  Or turn
it off to evade regressions as this stabilizes.

5. Exception/interrupt returns in the common case need only a single C
function to be called at the tail, and then return naturally.
Effectively "all interrupts are direct now".  This isn't a benefit
currently because the existing stubs haven't been removed (see #4),
but in the long term we can look at exploiting this.  The boilerplate
previously required is now (mostly) empty.

6. No support for ARMv6 (Cortex M0 et. al.) thumb code.  The expanded
instruction encodings in ARMv7 are a big (big) win, so the older cores
really need a separate port to avoid impacting newer hardware.
Thankfully there isn't that much code to port (see #3), so this should
be doable.

Signed-off-by: Andy Ross <andyross@google.com>
andyross added a commit that referenced this pull request Mar 3, 2025
1. Mostly complete.  Supports MPU, userspace, PSPLIM-based stack
guards, and FPU/DSP features.  ARMv8-M secure mode "should" work but I
don't know how to test it.

2. Designed with an eye to uncompromising/best-in-industry cooperative
context switch performance.  No PendSV exception nor hardware
stacking/unstacking, just a traditional "musical chairs" switch.
Context gets saved on process stacks only instead of split between
there and the thread struct.  No branches in the core integer switch
code (and just one in the FPU bits that can't be avoided).

3. Minimal assembly use; arch_switch() itself is ALWAYS_INLINE, there
is an assembly stub for exception exit, and that's it beyond one/two
instruction inlines elsewhere.

4. Selectable at build time, interoperable with existing code.  Just
use the pre-existing CONFIG_USE_SWITCH=y flag to enable it.  Or turn
it off to evade regressions as this stabilizes.

5. Exception/interrupt returns in the common case need only a single C
function to be called at the tail, and then return naturally.
Effectively "all interrupts are direct now".  This isn't a benefit
currently because the existing stubs haven't been removed (see #4),
but in the long term we can look at exploiting this.  The boilerplate
previously required is now (mostly) empty.

6. No support for ARMv6 (Cortex M0 et. al.) thumb code.  The expanded
instruction encodings in ARMv7 are a big (big) win, so the older cores
really need a separate port to avoid impacting newer hardware.
Thankfully there isn't that much code to port (see #3), so this should
be doable.

Signed-off-by: Andy Ross <andyross@google.com>
andyross added a commit that referenced this pull request Aug 1, 2025
1. Mostly complete.  Supports MPU, userspace, PSPLIM-based stack
guards, and FPU/DSP features.  ARMv8-M secure mode "should" work but I
don't know how to test it.

2. Designed with an eye to uncompromising/best-in-industry cooperative
context switch performance.  No PendSV exception nor hardware
stacking/unstacking, just a traditional "musical chairs" switch.
Context gets saved on process stacks only instead of split between
there and the thread struct.  No branches in the core integer switch
code (and just one in the FPU bits that can't be avoided).

3. Minimal assembly use; arch_switch() itself is ALWAYS_INLINE, there
is an assembly stub for exception exit, and that's it beyond one/two
instruction inlines elsewhere.

4. Selectable at build time, interoperable with existing code.  Just
use the pre-existing CONFIG_USE_SWITCH=y flag to enable it.  Or turn
it off to evade regressions as this stabilizes.

5. Exception/interrupt returns in the common case need only a single C
function to be called at the tail, and then return naturally.
Effectively "all interrupts are direct now".  This isn't a benefit
currently because the existing stubs haven't been removed (see #4),
but in the long term we can look at exploiting this.  The boilerplate
previously required is now (mostly) empty.

6. No support for ARMv6 (Cortex M0 et. al.) thumb code.  The expanded
instruction encodings in ARMv7 are a big (big) win, so the older cores
really need a separate port to avoid impacting newer hardware.
Thankfully there isn't that much code to port (see #3), so this should
be doable.

Signed-off-by: Andy Ross <andyross@google.com>
andyross added a commit that referenced this pull request Aug 4, 2025
1. Mostly complete.  Supports MPU, userspace, PSPLIM-based stack
guards, and FPU/DSP features.  ARMv8-M secure mode "should" work but I
don't know how to test it.

2. Designed with an eye to uncompromising/best-in-industry cooperative
context switch performance.  No PendSV exception nor hardware
stacking/unstacking, just a traditional "musical chairs" switch.
Context gets saved on process stacks only instead of split between
there and the thread struct.  No branches in the core integer switch
code (and just one in the FPU bits that can't be avoided).

3. Minimal assembly use; arch_switch() itself is ALWAYS_INLINE, there
is an assembly stub for exception exit, and that's it beyond one/two
instruction inlines elsewhere.

4. Selectable at build time, interoperable with existing code.  Just
use the pre-existing CONFIG_USE_SWITCH=y flag to enable it.  Or turn
it off to evade regressions as this stabilizes.

5. Exception/interrupt returns in the common case need only a single C
function to be called at the tail, and then return naturally.
Effectively "all interrupts are direct now".  This isn't a benefit
currently because the existing stubs haven't been removed (see #4),
but in the long term we can look at exploiting this.  The boilerplate
previously required is now (mostly) empty.

6. No support for ARMv6 (Cortex M0 et. al.) thumb code.  The expanded
instruction encodings in ARMv7 are a big (big) win, so the older cores
really need a separate port to avoid impacting newer hardware.
Thankfully there isn't that much code to port (see #3), so this should
be doable.

Signed-off-by: Andy Ross <andyross@google.com>
andyross added a commit that referenced this pull request Aug 7, 2025
1. Mostly complete.  Supports MPU, userspace, PSPLIM-based stack
guards, and FPU/DSP features.  ARMv8-M secure mode "should" work but I
don't know how to test it.

2. Designed with an eye to uncompromising/best-in-industry cooperative
context switch performance.  No PendSV exception nor hardware
stacking/unstacking, just a traditional "musical chairs" switch.
Context gets saved on process stacks only instead of split between
there and the thread struct.  No branches in the core integer switch
code (and just one in the FPU bits that can't be avoided).

3. Minimal assembly use; arch_switch() itself is ALWAYS_INLINE, there
is an assembly stub for exception exit, and that's it beyond one/two
instruction inlines elsewhere.

4. Selectable at build time, interoperable with existing code.  Just
use the pre-existing CONFIG_USE_SWITCH=y flag to enable it.  Or turn
it off to evade regressions as this stabilizes.

5. Exception/interrupt returns in the common case need only a single C
function to be called at the tail, and then return naturally.
Effectively "all interrupts are direct now".  This isn't a benefit
currently because the existing stubs haven't been removed (see #4),
but in the long term we can look at exploiting this.  The boilerplate
previously required is now (mostly) empty.

6. No support for ARMv6 (Cortex M0 et. al.) thumb code.  The expanded
instruction encodings in ARMv7 are a big (big) win, so the older cores
really need a separate port to avoid impacting newer hardware.
Thankfully there isn't that much code to port (see #3), so this should
be doable.

Signed-off-by: Andy Ross <andyross@google.com>
andyross added a commit that referenced this pull request Aug 11, 2025
1. Mostly complete.  Supports MPU, userspace, PSPLIM-based stack
guards, and FPU/DSP features.  ARMv8-M secure mode "should" work but I
don't know how to test it.

2. Designed with an eye to uncompromising/best-in-industry cooperative
context switch performance.  No PendSV exception nor hardware
stacking/unstacking, just a traditional "musical chairs" switch.
Context gets saved on process stacks only instead of split between
there and the thread struct.  No branches in the core integer switch
code (and just one in the FPU bits that can't be avoided).

3. Minimal assembly use; arch_switch() itself is ALWAYS_INLINE, there
is an assembly stub for exception exit, and that's it beyond one/two
instruction inlines elsewhere.

4. Selectable at build time, interoperable with existing code.  Just
use the pre-existing CONFIG_USE_SWITCH=y flag to enable it.  Or turn
it off to evade regressions as this stabilizes.

5. Exception/interrupt returns in the common case need only a single C
function to be called at the tail, and then return naturally.
Effectively "all interrupts are direct now".  This isn't a benefit
currently because the existing stubs haven't been removed (see #4),
but in the long term we can look at exploiting this.  The boilerplate
previously required is now (mostly) empty.

6. No support for ARMv6 (Cortex M0 et. al.) thumb code.  The expanded
instruction encodings in ARMv7 are a big (big) win, so the older cores
really need a separate port to avoid impacting newer hardware.
Thankfully there isn't that much code to port (see #3), so this should
be doable.

Signed-off-by: Andy Ross <andyross@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant