
proteus F7 hard fault on "enable self stimulation" #5354

Closed
rusefillc opened this issue Jun 25, 2023 · 27 comments · Fixed by #5468

@rusefillc
Contributor

  • take F7 snapshot
  • open console, hit "enable self stimulation"
  • result: RED LED :(

on next boot

2023-06-25_01_33_34_046: EngineState: Last boot had hard fault type: 3 addr: 0 CSFR: 10000
2023-06-25_01_33_34_047: EngineState: Fault ctx 0: 29B55C0
2023-06-25_01_33_34_048: EngineState: Fault ctx 1: 0
2023-06-25_01_33_34_049: EngineState: Fault ctx 2: 29B55C0
2023-06-25_01_33_34_050: EngineState: Fault ctx 3: 200303B0
2023-06-25_01_33_34_051: EngineState: Fault ctx 4: 0
2023-06-25_01_33_34_052: EngineState: Fault ctx 5: 255F2D
2023-06-25_01_33_34_053: EngineState: Fault ctx 6: 255F2C
2023-06-25_01_33_34_054: EngineState: Fault ctx 7: 41000042
2023-06-25_01_33_34_055: EngineState: Fault ctx 8: 221401
2023-06-25_01_33_34_055: EngineState: Fault ctx 9: 200290B4
2023-06-25_01_33_34_056: EngineState: Fault ctx 10: 20021690
2023-06-25_01_33_34_057: EngineState: Fault ctx 11: 20029408
2023-06-25_01_33_34_057: EngineState: Fault ctx 12: 200306A8
2023-06-25_01_33_34_058: EngineState: Fault ctx 13: E0000000
2023-06-25_01_33_34_059: EngineState: Fault ctx 14: 0
2023-06-25_01_33_34_059: EngineState: Fault ctx 15: 20020EF0
2023-06-25_01_33_34_060: EngineState: Fault ctx 16: 200306A8
2023-06-25_01_33_34_061: EngineState: Fault ctx 17: E0000000
2023-06-25_01_33_34_062: EngineState: Fault ctx 18: 0
2023-06-25_01_33_34_062: EngineState: Fault ctx 19: 206D5B
2023-06-25_01_33_34_063: EngineState: Fault ctx 20: 20029F00
2023-06-25_01_33_34_064: EngineState: Fault ctx 21: FFFFFFE1
2023-06-25_01_33_34_064: EngineState: Fault ctx 22: 0
2023-06-25_01_33_34_065: EngineState: Fault ctx 23: 24F851
2023-06-25_01_33_34_066: EngineState: Fault ctx 24: 0
2023-06-25_01_33_34_067: EngineState: Fault ctx 25: 200306EC
@rusefillc
Contributor Author

Same using "self stimulate" button in TS

2023-06-25_01_37_19_932: EngineState: Last boot had hard fault type: 3 addr: 0 CSFR: 10000
2023-06-25_01_37_19_932: EngineState: Fault ctx 0: 2A3D198F
2023-06-25_01_37_19_933: EngineState: Fault ctx 1: 0
2023-06-25_01_37_19_934: EngineState: Fault ctx 2: 2A3D198F
2023-06-25_01_37_19_934: EngineState: Fault ctx 3: 200303B0
2023-06-25_01_37_19_935: EngineState: Fault ctx 4: 24EE20
2023-06-25_01_37_19_935: EngineState: Fault ctx 5: 255F2D
2023-06-25_01_37_19_936: EngineState: Fault ctx 6: 255F2C
2023-06-25_01_37_19_937: EngineState: Fault ctx 7: 41000042
2023-06-25_01_37_19_938: EngineState: Fault ctx 8: 221469
2023-06-25_01_37_19_938: EngineState: Fault ctx 9: 20004E40
2023-06-25_01_37_19_939: EngineState: Fault ctx 10: 200007A4
2023-06-25_01_37_19_940: EngineState: Fault ctx 11: 25554F
2023-06-25_01_37_19_940: EngineState: Fault ctx 12: 20029F00
2023-06-25_01_37_19_941: EngineState: Fault ctx 13: 20020E68
2023-06-25_01_37_19_941: EngineState: Fault ctx 14: 1
2023-06-25_01_37_19_942: EngineState: Fault ctx 15: 3CCDF6CB
2023-06-25_01_37_19_943: EngineState: Fault ctx 16: C
2023-06-25_01_37_19_943: EngineState: Fault ctx 17: 20020E98
2023-06-25_01_37_19_944: EngineState: Fault ctx 18: 20028F28
2023-06-25_01_37_19_945: EngineState: Fault ctx 19: 20020E98
2023-06-25_01_37_19_945: EngineState: Fault ctx 20: 0
2023-06-25_01_37_19_946: EngineState: Fault ctx 21: 0
2023-06-25_01_37_19_947: EngineState: Fault ctx 22: 0
2023-06-25_01_37_19_948: EngineState: Fault ctx 23: 24F851
2023-06-25_01_37_19_949: EngineState: Fault ctx 24: 0
2023-06-25_01_37_19_949: EngineState: Fault ctx 25: 200306EC
2023-06-25_01_37_19_950: EngineState: Power cycle count: 3

@rusefillc rusefillc added the bug label Jun 25, 2023
rusefillc pushed a commit that referenced this issue Jun 25, 2023
i feel lucky so I make random changes
@rusefillc
Contributor Author

rusefillc commented Jun 25, 2023

master build https://github.com/rusefi/rusefi/actions/runs/5370417084 https://github.com/rusefi/rusefi/suites/13847069639/artifacts/769103925

AFFECTED - fatal with self-stim

2023-06-25_13_22_01_014: EngineState: *** rusEFI v20230625@54879
2023-06-25_13_22_01_015: EngineState: *** Chibios Kernel:       6.1.4
2023-06-25_13_22_01_016: EngineState: *** Compiled:     Jun 25 2023 - 15:35:47
2023-06-25_13_22_01_016: EngineState: *** COMPILER=11.3.1 20220712
2023-06-25_13_22_01_017: EngineState: *** detected HSE clock 7.92 MHz PLLM = 8

GCC12 NOT AFFECTED WOW @mck1117
https://github.com/rusefi/rusefi/actions/runs/5370487676 https://github.com/rusefi/rusefi/suites/13847203648/artifacts/769112365

2023-06-25_13_25_31_839: EngineState: *** rusEFI v20230625@54879
2023-06-25_13_25_31_839: EngineState: *** Chibios Kernel:       6.1.4
2023-06-25_13_25_31_840: EngineState: *** Compiled:     Jun 25 2023 - 15:53:01
2023-06-25_13_25_31_840: EngineState: *** COMPILER=12.2.1 20221205

GCC11 no optimization AFFECTED - fatal with self-stim
https://github.com/rusefi/rusefi/actions/runs/5368499521 https://github.com/rusefi/rusefi/suites/13843462071/artifacts/768845175

2023-06-25_13_31_55_999: EngineState: *** rusEFI v20230625@54879
2023-06-25_13_31_55_999: EngineState: *** Chibios Kernel:       6.1.4
2023-06-25_13_31_56_000: EngineState: *** Compiled:     Jun 25 2023 - 07:15:32
2023-06-25_13_31_56_000: EngineState: *** COMPILER=11.3.1 20220712

@rusefillc
Contributor Author

@dron0gus says
addr2line 0x081230000 -e build/rusefi.elf

@rusefillc
Contributor Author

2023-06-25_17_35_11_779: EngineState: Last boot had hard fault type: 3 addr: 0 CSFR: 10000
2023-06-25_17_35_11_780: EngineState: Fault ctx 0: 6535130
2023-06-25_17_35_11_781: EngineState: Fault ctx 1: 0
2023-06-25_17_35_11_783: EngineState: Fault ctx 2: 6535130
2023-06-25_17_35_11_783: EngineState: Fault ctx 3: 200303B0
2023-06-25_17_35_11_784: EngineState: Fault ctx 4: 0
2023-06-25_17_35_11_785: EngineState: Fault ctx 5: 255FC9
2023-06-25_17_35_11_785: EngineState: Fault ctx 6: 255FC8
2023-06-25_17_35_11_786: EngineState: Fault ctx 7: 41000042
2023-06-25_17_35_11_787: EngineState: Fault ctx 8: 221469
2023-06-25_17_35_11_788: EngineState: Fault ctx 9: 20004E40
2023-06-25_17_35_11_789: EngineState: Fault ctx 10: 200007A4
2023-06-25_17_35_11_789: EngineState: Fault ctx 11: 2555EB
2023-06-25_17_35_11_790: EngineState: Fault ctx 12: 20029F00
2023-06-25_17_35_11_791: EngineState: Fault ctx 13: 20020E68
2023-06-25_17_35_11_791: EngineState: Fault ctx 14: 1
2023-06-25_17_35_11_792: EngineState: Fault ctx 15: 3D0FF355
2023-06-25_17_35_11_793: EngineState: Fault ctx 16: C
2023-06-25_17_35_11_793: EngineState: Fault ctx 17: 20020E98
2023-06-25_17_35_11_794: EngineState: Fault ctx 18: 20028F28
2023-06-25_17_35_11_794: EngineState: Fault ctx 19: 20020E98
2023-06-25_17_35_11_795: EngineState: Fault ctx 20: 0
2023-06-25_17_35_11_795: EngineState: Fault ctx 21: 0
2023-06-25_17_35_11_796: EngineState: Fault ctx 22: 0
2023-06-25_17_35_11_797: EngineState: Fault ctx 23: 24F8ED
2023-06-25_17_35_11_797: EngineState: Fault ctx 24: 0
2023-06-25_17_35_11_798: EngineState: Fault ctx 25: 200306EC

addr2line 0x20029F00 -e build/rusefi.elf

@dron0gus
Member

ctx 5 = lr
ctx 6 = pc <- this is the most interesting address.
ctx 7 .. ctx 25 - FPU registers - zero interest.

@rusefillc
Contributor Author

Still an issue even with gcc 12 :(

2023-07-22_15_07_49_336: EngineState: *** rusEFI v20230722@54879
2023-07-22_15_07_49_336: EngineState: *** Chibios Kernel:       6.1.4
2023-07-22_15_07_49_337: EngineState: *** Compiled:     Jul 22 2023 - 19:01:08
2023-07-22_15_07_49_338: EngineState: *** COMPILER=12.2.1 20221205
2023-07-22_15_07_49_338: EngineState: *** detected HSE clock 0.00 MHz PLLM = 8

@mck1117
Member

mck1117 commented Jul 22, 2023

why don't you attach a debugger instead of shooting in the dark

@rusefillc
Contributor Author

rusefillc commented Jul 22, 2023

why don't you attach a debugger instead of shooting in the dark

that's exactly what I am currently failing to do! I am failing to compile a debug firmware, see #5354 (comment) and #5432

At the moment I desperately need an ELI5 on how to figure out which features consume the most flash so that I can reduce binary size, at least in a special build. Another option is a true 2MB chip.

@mck1117
Member

mck1117 commented Jul 22, 2023

Lua is huge. Output and config value lookup are both huge.

@rusefillc
Contributor Author

I suspected that the output/config value lookup is part of the deal - let me try that. Still, an ELI5 in general would be great :)

@rusefillc
Contributor Author

rusefillc commented Jul 22, 2023

#5434 created to make Output/config conditional compilation

rusefillc pushed a commit that referenced this issue Jul 24, 2023
making configurations visible for HW tests
rusefillc added a commit that referenced this issue Jul 24, 2023
only:coverage, that's easy!

Co-authored-by: rusefillc <sdfsdfqsf2334234234>
rusefillc pushed a commit that referenced this issue Jul 25, 2023
rusefillc added a commit that referenced this issue Jul 25, 2023
rusefillc pushed a commit that referenced this issue Jul 25, 2023
@rusefillc
Contributor Author

See #5462

rusefillc pushed a commit that referenced this issue Jul 25, 2023
rusefillc added a commit that referenced this issue Jul 25, 2023

@dron0gus
Member

Reproduced with gcc 12.2.rel1. Not reproduced with 11.3.rel1.

Thread 1 "idle" received signal SIGTRAP, Trace/breakpoint trap.
HardFault_Handler_C (sp=<optimized out>) at main_hardfault.c:63
63		bkpt();
(gdb) bt
#0  HardFault_Handler_C (sp=<optimized out>) at main_hardfault.c:63
#1  <signal handler called>
#2  0x00255c7c in Timer::hasElapsedUs (this=0x2003b7bb <_ZL11s_bigBuffer.lto_priv.0+1256>, microseconds=5000000) at ./util/timer.cpp:30
#3  0x002451e6 in Timer::hasElapsedMs (milliseconds=5000, this=0x2003b7bb <_ZL11s_bigBuffer.lto_priv.0+1256>) at ./util/timer.cpp:26
#4  Timer::hasElapsedSec (seconds=5, this=0x2003b7bb <_ZL11s_bigBuffer.lto_priv.0+1256>) at ./util/timer.cpp:22
#5  SetNextCompositeEntry (timestamp=<optimized out>) at ./console/binary/tooth_logger.cpp:225
#6  0x0024526e in LogTriggerTooth (timestamp=142588366, tooth=SHAFT_SECONDARY_RISING) at ./console/binary/tooth_logger.cpp:289
#7  LogTriggerTooth (tooth=<optimized out>, timestamp=<optimized out>) at ./console/binary/tooth_logger.cpp:242
#8  0x00234b94 in handleShaftSignal (signalIndex=1, isRising=<optimized out>, timestamp=<optimized out>) at ./controllers/trigger/trigger_central.cpp:489
#9  0x00230380 in PwmConfig::togglePwmState (this=0x20022310 <triggerEmulatorSignal>) at ./controllers/system/timer/pwm_generator_logic.cpp:215
#10 PwmConfig::togglePwmState (this=0x20022310 <triggerEmulatorSignal>) at ./controllers/system/timer/pwm_generator_logic.cpp:174
#11 timerCallback (state=0x20022310 <triggerEmulatorSignal>) at ./controllers/system/timer/pwm_generator_logic.cpp:255
#12 0x002341b4 in action_s::execute (this=<synthetic pointer>) at ./controllers/system/timer/scheduler.cpp:12
#13 EventQueue::executeOne (now=<optimized out>, this=0x20000d58 <___engine+3416>) at ./controllers/system/timer/event_queue.cpp:220
#14 SingleTimerExecutor::executeAllPendingActions (this=0x20000d40 <___engine+3392>) at ./controllers/system/timer/single_timer_executor.cpp:136
#15 0x00229c36 in SingleTimerExecutor::onTimerCallback (this=<optimized out>) at ./controllers/system/timer/single_timer_executor.cpp:106
#16 globalTimerCallback () at ./controllers/system/timer/single_timer_executor.cpp:39
#17 portMicrosecondTimerCallback () at ./hw_layer/microsecond_timer/microsecond_timer.cpp:100
#18 hwTimerCallback () at ./hw_layer/ports/stm32/microsecond_timer_stm32.cpp:32
#19 0x0024e838 in pwm_lld_serve_interrupt (pwmp=0x2002a7d4 <PWMD5>) at ChibiOS/os/hal/ports/STM32/LLD/TIMv1/hal_pwm_lld.c:1286
#20 0x0020629c in Vector108 () at ChibiOS/os/hal/ports/STM32/LLD/TIMv1/stm32_tim5.inc:116
#21 <signal handler called>
#22 0x0024ba84 in otg_epin_handler.constprop.0 (ep=<optimized out>, usbp=<optimized out>) at ChibiOS/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c:399
#23 0x00206dca in usb_lld_serve_interrupt (usbp=0x2002a640 <USBD1>) at ChibiOS/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c:644
#24 Vector14C () at ChibiOS/os/hal/ports/STM32/LLD/OTGv1/hal_usb_lld.c:684
#25 <signal handler called>
#26 0x0024e878 in port_wait_for_interrupt () at ChibiOS/os/common/ports/ARMCMx/chcore_v7m.h:772
#27 _idle_thread (p=0x0) at ChibiOS/os/rt/src/chsys.c:79
#28 0x00200f46 in _port_thread_start () at ChibiOS/os/common/ports/ARMCMx/compilers/GCC/chcoreasm_v7m.S:201
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@dron0gus
Member

Does not look like stack overflow...

(gdb) info reg
r0             0x4002407c          0x4002407c
r1             0x20020dc8          0x20020dc8
r2             0x20020dc8          0x20020dc8
r3             0x10000             0x10000
r4             0x2003b7bb          0x2003b7bb
r5             0x1                 0x1
r6             0x2003b2d3          0x2003b2d3
r7             0x87fb9ce           0x87fb9ce
r8             0x0                 0x0
r9             0x0                 0x0
r10            0x0                 0x0
r11            0x0                 0x0
r12            0x102               0x102
sp             0x20020d60          0x20020d60
lr             0x23985d            0x23985d
pc             0x21518c            0x21518c <HardFault_Handler_C(void*)+44>
xpsr           0x61000003          0x61000003
fpscr          0x0                 0x0
msp            0x20020d60          0x20020d60
psp            0x20024350          0x20024350 <ch_idle_thread_wa+256>
primask        0x0                 0x0
basepri        0x20                0x20
faultmask      0x0                 0x0
control        0x0                 0x0
(gdb) info symbol 0x20020d60
__main_stack_base__ + 3424 in section .mstack

@mck1117
Member

mck1117 commented Jul 25, 2023

what's the exact pc where it crashed?

@dron0gus
Member

dron0gus commented Jul 25, 2023

what's the exact pc where it crashed?

optimized out

(gdb) info local
ctx = {r0 = 0x7c3af9b, r1 = 0x0, r2 = 0x7c3af9b, r3 = 0xfffffffe, r12 = 0x20022590, lr_thd = 0x228d61, pc = 0x228d64, xpsr = 0x41000042, s0 = 0x2002b7d4, s1 = 0x2, s2 = 0x80000000, s3 = 0x20004db0, 
  s4 = 0x41a0f5cd, s5 = 0x0, s6 = 0x3f800000, s7 = 0x2062a5, s8 = 0x1, s9 = 0x41a0f5cd, s10 = 0xc, s11 = 0x20021eb0, s12 = 0x200239b0, s13 = 0x20021eb0, s14 = 0x0, s15 = 0x7c38427, fpscr = 0x0, 
  reserved = 0x24f68f}
faultType = <optimized out>
faultAddress = <optimized out>
isFaultPrecise = <optimized out>
isFaultImprecise = <optimized out>
isFaultOnUnstacking = <optimized out>
isFaultOnStacking = <optimized out>
isFaultAddressValid = <optimized out>

@rusefillc
Contributor Author

@dron0gus you build locally, right, not master? And that's your EG33 F7 board, not Proteus, right?

I know 4chan4F was affected as long as it's -Os

@dron0gus
Member

Master. Built locally. Only the optimization level is changed.

@dron0gus
Member

Not much detail, even though locals are preserved from optimization

(gdb) info local
ctx = {r0 = 0x94d8079, r1 = 0x0, r2 = 0x94d8079, r3 = 0xfffffffe, r12 = 0xfc, lr_thd = 0x228ded, pc = 0x228df0, xpsr = 0x41000042, s0 = 0x20032898, s1 = 0x25589b, s2 = 0x40012100, s3 = 0x20004db0, 
  s4 = 0x419f4f99, s5 = 0x20004000, s6 = 0x3f800000, s7 = 0x2062fb, s8 = 0x1, s9 = 0x419f4f99, s10 = 0xc, s11 = 0x20021eb0, s12 = 0x200239b0, s13 = 0x20021eb0, s14 = 0x0, s15 = 0x94d7bfe, fpscr = 0x0, 
  reserved = 0x24f71b}
faultType = HardFault
faultAddress = 0x0
isFaultPrecise = 0x0
isFaultImprecise = 0x0
isFaultOnUnstacking = 0x0
isFaultOnStacking = 0x0
isFaultAddressValid = 0x0

@mck1117
Member

mck1117 commented Jul 25, 2023

are you building with -Os -g3 -ggdb?

@dron0gus
Member

are you building with -Os -g3 -ggdb?

Yes.

To reproduce: start TS, press "Enable internal trigger simulation", close TS, open Console. Done, you are in HardFault.

@rusefillc
Contributor Author

Minor related issue #5467

rusefillc pushed a commit that referenced this issue Jul 25, 2023
rusefillc added a commit that referenced this issue Jul 26, 2023
@mck1117
Member

mck1117 commented Jul 26, 2023

so what was the bug?

@rusefillc
Contributor Author

@mck1117 we would have to wait until @dron0gus wakes up tomorrow

@mck1117
Member

mck1117 commented Jul 26, 2023

OK, I went and debugged it myself.

Here's what was happening:

The tooth logger uses an instance of Timer allocated inside the big buffer. The compiler generates Timer with the assumption that it will be allocated aligned properly, which in this case means on a 4 byte boundary. In generating the code for Timer::reset(efitick_t), it emits a single instruction (plus a return): strd r2, r3, [r0]. strd faults on unaligned access.

If you force misalignment of that buffer, you can force this to happen on all ARM processors at any optimization level.
