-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
arch/arm: optimize context switch speed #5645
Conversation
1e5dce4
to
8bc40ce
Compare
I tried this PR with my latest imx6-netknsh branches (https://github.com/masayuki2009/incubator-nuttx/tree/imx6-netknsh https://github.com/masayuki2009/incubator-nuttx-apps/tree/imx6-netknsh) and found that init and hello work but getprime fails when it exits.
|
@masayuki2009 san, I can reproduce this issue on your branch but without this PR, seems some wrong with syscall clock_gettime:
|
@anchao |
Yes, could you try your latest branch? it seems does not work as expected
|
Are you using the latest my apps branch as well? It seems that you are using the old version (i.e. non thumb build version). |
Yes, apps branch is latest, I reinitialized another repository, but crash still happen.
|
Hmm, it's very strange to me. |
Please check the attachment |
I confirmed that the binaries are slightly different from mine. |
|
I notice that my toolchain is a little bit old.
So let me confirm the latest toolchain. |
I tried the latest toolchain
Now I can reproduce the crash which you are facing now.
|
That's great! thanks for your time, I will do more test on kernel mode. |
09a9999
to
2659327
Compare
4fd5645
to
9405df4
Compare
@masayuki2009 could you try the update? |
I noticed that lm3s6965-ek:qemu-kostest (QEMU) failed with this PR.
|
The original issue with sabre-6quad:netknsh (QEMU) still happens.
The text region of the getprime (elf) starts at 0x8000:0000, Here, r3 is loaded from
|
@masayuki2009 san, There is a issue with the consistency of banked registers in SVC and USR mode. I submitted an separate PR, Please review. |
Great idea, I'll add the corresponding performance tests later |
754e332
to
faa86d3
Compare
@masayuki2009 please merge this PR as soon as you confirm that everything is ok on your side. |
Did you fix the issue with lm3s6965-ek:qemu-kostest (QEMU) ?
|
No, I am still investigating this issue, update will be later if I have any progress. |
The current context save implementation saves registers of each task to xcp context, which is unnecessary because most of the arm registers are already saved in the task stack, this commit replace the xcp context with stack context to improve context switching performance and reduce the tcb space occupation of tcb instance. Signed-off-by: chao.an <anchao@xiaomi.com>
Signed-off-by: chao.an <anchao@xiaomi.com>
@masayuki2009 san, In the previous implementation, I did not consider the signal delivery in the interrupt context without interrupt stack, Therefore, the solution is that if the interrupt stack is disabled, reserve an signal xcpcontext to ensure signal processing can have a separate xcpcontext to handle signal context. dffd8f6#diff-31f02252f0714ae3f268b632eebec6f1dff7369c6caaa8625fcf597612ecd694R203-R230
|
Thanks for your investigation. |
@masayuki2009 san, In the latest commit set, I also added the signal handling if CONFIG_ARCH_INTERRUPTSTACK=0, The following 3 configs works fine with this PR: |
Thanks for your comments. |
I noticed that applications such as NTP daemon now consume more stack with this PR. The following results are from spresense:wifi_smp. Without this PR.
With this PR.
Please note that spresense:wifi_smp assigns the interrupt stack as follows.
|
No, this is not expected behavior, SP is set to the incorrect during up_initial_state(): https://github.com/apache/incubator-nuttx/blob/master/arch/arm/src/armv7-a/arm_initialstate.c#L96 I will raise the fix PR after finish the test case. |
@masayuki2009 san, please review this PR: |
Summary
arch/arm: optimize context switch speed
The current context save implementation saves registers of each task
to xcp context, which is unnecessary because most of the arm registers are
already saved in the task stack, this commit replace the xcp context with
stack context to improve context switching performance and reduce the tcb
space occupation of tcb instance.
Signed-off-by: chao.an anchao@xiaomi.com
Impact
arm context switch
Testing
Qemu:
sabre-6quad/smp, sabre-6quad/netknsh
Cortex-m33: cycle count test
#define DWT_CYCCNT (DWT_BASE + 0x0004) /* Cycle Count Register */
thread 1: (priority 100)
thread 2: (priority 200)
Before:
semaphore wait test: 797
After:
semaphore wait test: 604
+ ~24%