Description
Describe the bug
I've been running the WhetStone benchmark from the Examples repo on different boards to compare MCU performance, but I got unexpected results. The following results were obtained with the Generic variant of each MCU and no optimization (O2):
- H503 @ 250 MHz: ~38 MIPS
- G474 @ 150 MHz: ~40 MIPS
- F411 @ 100 MHz: ~30 MIPS
The H5 running at a higher frequency is slightly slower than the G4 running at a lower frequency.
To make sure I'm not doing something wrong, I ran the same benchmark in CubeIDE on all three chips, with all three cache configurations for the H5 (disabled, 1-way, 2-way). The CubeIDE results for the F4 and G4 are similar to the Arduino results, and so is the H5 result with the cache disabled. With the cache enabled, the H5 score is much better:
- H503 @ 250 MHz with 2-way cache enabled: ~91 MIPS
This is an improvement of almost 3x over the F4 @ 100 MHz, as expected.
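To make the comparison independent of clock speed, the scores can be normalized to MIPS per MHz. The following is only a minimal sketch of that arithmetic using the approximate numbers listed above; the `Result` struct and the `mipsPerMhz` and `printEfficiency` helpers are names made up for illustration and are not part of the benchmark.

```cpp
#include <cstdio>

// One benchmark result. Values are the approximate scores quoted above.
struct Result {
  const char *label;
  double mips;      // approximate Whetstone score in MIPS
  double clockMhz;  // core clock in MHz
};

// MIPS per MHz: a rough efficiency figure that factors out the clock speed.
inline double mipsPerMhz(const Result &r) {
  return r.mips / r.clockMhz;
}

static const Result kResults[] = {
  {"H503 @ 250 MHz (Arduino)", 38.0, 250.0},
  {"G474 @ 150 MHz (Arduino)", 40.0, 150.0},
  {"F411 @ 100 MHz (Arduino)", 30.0, 100.0},
  {"H503 @ 250 MHz (CubeIDE, 2-way cache)", 91.0, 250.0},
};

// Print one line per result with its normalized score.
inline void printEfficiency() {
  for (const Result &r : kResults) {
    std::printf("%-40s %.3f MIPS/MHz\n", r.label, mipsPerMhz(r));
  }
}
```

Calling `printEfficiency()` from a small `main` on a PC is enough to see the gap: per MHz, the H5 under Arduino comes out at roughly half of the F4, while the cached CubeIDE build comes out above it.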
To Reproduce
Run the WhetStone benchmark from the Examples repo on H5 and F4/G4 MCUs and compare the results, taking into account the CPU frequency.
Expected behavior
I expected the H5 to outperform the G4 and F4 based on clock frequency alone, let alone the improvements of the Cortex-M33 over the Cortex-M4.
Desktop:
- OS: Windows
- Arduino IDE version: 2.3.2
- STM32 core version: main branch (d7019d1)
- Tools menu settings: O0, Newlib Nano + printf, USB CDC, UART disabled
- Upload method: DFU
Board:
- Name: Generic H503CBUx (on WeAct STM32H503CoreBoard)
Additional context
I know that support for the H5 is still relatively new, but the family is supposed to be high performance, so it would be a shame to have it underperform.
I have been looking through the core code and I have seen the cache being enabled for Cortex-M33 MCUs, but some cache defines also seem to be missing from the default config header. I don't know what is happening. Am I missing something?
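One way to check whether the instruction cache is actually on at runtime would be to read the ICACHE enable bit from a sketch. The following is a rough sketch only: it assumes the CMSIS device header pulled in by `Arduino.h` exposes `ICACHE` and `ICACHE_CR_EN` (as the STM32H5 headers do), and `icacheState` is a hypothetical helper name, not something provided by the core.

```cpp
#include <cstdint>

#if defined(ARDUINO)
#include <Arduino.h>  // assumed to pull in the CMSIS device header
#endif

// Returns 1 if ICACHE is enabled, 0 if it is disabled, and -1 if the
// device headers do not define an ICACHE peripheral at all.
inline int icacheState() {
#if defined(ICACHE) && defined(ICACHE_CR_EN)
  return (ICACHE->CR & ICACHE_CR_EN) ? 1 : 0;
#else
  return -1;
#endif
}

#if defined(ARDUINO)
void setup() {
  Serial.begin(115200);
  while (!Serial) {
    // wait for the USB CDC port to be opened
  }
  Serial.print("ICACHE state (1 = on, 0 = off, -1 = not present): ");
  Serial.println(icacheState());
}

void loop() {}
#endif
```

If this prints 0 on the H503, that would suggest the cache is not being enabled by the core for this variant, which would be consistent with the Arduino score matching the CubeIDE score with the cache disabled.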