Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_SysFatalErrorHandler not working properly for arc on quark_se_c1000_ss_devboard #8032

Closed
pswarnak opened this issue May 30, 2018 · 9 comments
Assignees
Labels
bug The issue is a bug, or the PR is fixing a bug priority: high High impact/importance bug

Comments

@pswarnak
Copy link
Contributor

pswarnak commented May 30, 2018

The test tests/kernel/mem_protect/stackprot/ hangs after reporting "Stack Check Fail".

Arch: arc
Board: quark_se_c1000_ss_devboard
Zephyr tag: v1.12.0-rc2 - commit d93ecda.

Actual Console Log:

***** Booting Zephyr OS v1.12.0-rc2 *****
Starts alternate_thread
alternate_thread: Input string is too long and stack overflowed!

***** Stack Check Fail! *****
Current thread ID = 0xa8000400

Expected Console Log:

***** Booting Zephyr OS v1.12.0-rc2 *****                                                                             
Starts alternate_thread                                                                                               
alternate_thread: Input string is too long and stack overflowed!

***** Stack Check Fail! *****
Current thread ID = 0xa8007000
eax: 0x00000000, ebx: 0xa8007954, ecx: 0x00000000, edx: 0x0004dd6a
esi: 0x400330dc, edi: 0x4003038a, ebp: 0xa800794c, esp: 0xa8007944
eflags: 0x00000246 cs: 0x0008
eip: 0x40032eb3
Fatal fault in thread 0xa8007000! Aborting.
Running test suite stackprot
===================================================================
starting test - test_stackprot
test_stackprot: Stack ok
test_stackprot: Stack ok
test_stackprot: Stack ok
test_stackprot: Stack ok
test_stackprot: Stack ok
test_stackprot: Stack ok
PASS - test_stackprot
===================================================================
===================================================================
PROJECT EXECUTION SUCCESSFUL

Tried further debugging, the test hangs in _SysFatalErrorHandler function in the file "arch\arc\core\fatal.c".

Steps to reproduce:

**For running stackprot application on the ARC side of quark_c1000:**
cd zephyr.git/tests/kernel/mem_protect/stackprot/
rm -rf build && mkdir build && cd build
cmake -D BOARD=quark_se_c1000_ss_devboard ../
make BOARD=quark_se_c1000_ss_devboard flash

**Additionally, flash the boot stub to enable messages from the ARC processor:**
cd zephyr.git/tests/booting/stub 
mkdir build && cd build 
cmake -D BOARD=quark_se_c1000_devboard  ../  
make BOARD=quark_se_c1000_devboard  flash 

Check the console log.
@andrewboie
Copy link
Contributor

@vonhust can you take a look?

@ruuddw
Copy link
Member

ruuddw commented May 30, 2018

Tried further debugging, the test hangs in _SysFatalErrorHandler function in the file
"arch\x86\core\fatal.c".

The fatal error handling for ARC changed, so there could indeed be a problem.
However, the comment is about hanging in x86 fatal.c
Can you confirm this is about ARC and not Quark? Could it be some interaction between the two processors?

@andrewboie
Copy link
Contributor

We're supposed to get some kind of fault for the stack overflow but it's just hanging on ARC.
He gave expected output on x86 as an example.
I don't think we even turn on the MPU on the ARC processor on arduino 101 so I think we have a regression with how the stack guard registers are set, or possibly how exceptions are handled.

@vonhust
Copy link
Collaborator

vonhust commented May 31, 2018

Hi @andrewboie , i can reproduce it in emsk. It works as expected, because the stack overflow will raise STACK_CHECK exception. In current ARC's code, this will hang the system as shown in the following code

#if defined(CONFIG_ARC_STACK_CHECKING) || defined(CONFIG_STACK_SENTINEL) if (reason == _NANO_ERR_STACK_CHK_FAIL) { goto hang_system; } #endif

There are two ways to let the test pass

  • remove defined(CONFIG_ARC_STACK_CHECKING), but why do we keep defined(CONFIG_STACK_SENTINEL) to hang the system?

  • re-implement the SysFatalErrorHandler as other tests did.

Moreover, in the future, we ask TSC to make it clear which kind of exception should hang the system, which should not.

@ruuddw , what's your opinion?

@pswarnak
Copy link
Contributor Author

The fatal error handling for ARC changed, so there could indeed be a problem.
However, the comment is about hanging in x86 fatal.c
Can you confirm this is about ARC and not Quark? Could it be some interaction between the two processors?

Sorry, my mistake. I was trying to compare the arc/arm/x86 implementation in fatal.c and wrote "x86" by mistake. The test hangs in _SysFatalErrorHandler function in the file "arch\arc\core\fatal.c".

@kumarvikash1 kumarvikash1 added bug The issue is a bug, or the PR is fixing a bug priority: high High impact/importance bug labels May 31, 2018
@ruuddw
Copy link
Member

ruuddw commented May 31, 2018

ARC commit 8da51ee and 3d9ba10 changed the behavior of the default sys_fatal_error_handler to hang the system instead of killing the task and continue. Behavior is not consistent over different architectures, there is an #ifdef that enables the hang for stack errors, but conditions for the #ifdef are different. Bit strange since the error that is triggered is in all cases the same.
My recommendation would be to stick with the conservative 'hang' behavior for all stack errors independent of the cause (sentinel, canary, hw bounds check, MPU, ...), and change tests to overrule the default handler.
@andrewboie @agross-linaro : any other opinions?

vonhust pushed a commit to foss-for-synopsys-dwc-arc-processors/zephyr that referenced this issue Jun 7, 2018
Whether it should hang the system it not decided finally.
But remove it here to let some tests pass.

Fixes zephyrproject-rtos#8032

Signed-off-by: Wayne Ren <wei.ren@synopsys.com>
@ruuddw
Copy link
Member

ruuddw commented Jun 7, 2018

commit 5d58f0f allows the HW stack checking to be non-fatal in the ARC default sys_fatal_handler, so this test passes again (continue after killing the offending task).
Still, not sure this is the default behavior that we want...

@nashif
Copy link
Member

nashif commented Jun 7, 2018

Still, not sure this is the default behavior that we want...

@ruuddw can you open a new issue about default behaviour (enhancement) so we can address that for 1.13?

@ruuddw
Copy link
Member

ruuddw commented Jun 7, 2018

@ruuddw can you open a new issue about default behaviour (enhancement) so we can address that for 1.13?

#8265 opened.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The issue is a bug, or the PR is fixing a bug priority: high High impact/importance bug
Projects
None yet
Development

No branches or pull requests

6 participants