DISCO_F746NG QSPI WriteEnable might Fail on IAR8 #10049

Closed
offirko opened this issue Mar 12, 2019 · 19 comments

@offirko
Contributor

offirko commented Mar 12, 2019

Description

Following the https://jira.arm.com/browse/IOTSTOR-798 ticket.

When running the storage tests on DISCO_F746NG with IAR8, the following test fails:
features-storage-tests-kvstore-static_tests

The same board and test pass on IAR7, as well as on GCC_ARM and ARM.

The test fails at this line:
https://github.com/ARMmbed/mbed-os/blob/master/features/storage/TESTS/kvstore/static_tests/main.cpp#L296

Drilling down, the failure occurs when sending write_enable to the QSPI flash, which eventually times out:

if (HAL_QSPI_Command(&obj->handle, &st_command, HAL_QPSI_TIMEOUT_DEFAULT_VALUE) != HAL_OK) {

After that, no data can be written to the device until it is reset.

The test uses the KVStore file system to add key/value pairs which hold the values:
"name_a", "name_b", "name_c", ..., "name_z"

For some strange reason, the combination of "name_o" followed by "name_p" triggers the bug.
Even if we skip all the previous entries and set only "name_o" followed by "name_p", it fails.
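
To isolate the failing pair, the sequence can be reduced to a loop that stores only a chosen range of entries. The following is a minimal host-side sketch, not the test's actual code: `make_name`, `set_name_range`, the `store_fn` callback and `recording_store` are hypothetical names, and on the target the callback is assumed to wrap whatever KVStore set call the test uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: returns 0 on success, non-zero on failure.
 * On the target it is assumed to wrap the KVStore set call. */
typedef int (*store_fn)(const char *name, const void *data, size_t size);

/* Build "name_<letter>" into buf; returns 0 on success, -1 if buf is too small. */
static int make_name(char *buf, size_t buf_size, char letter)
{
    assert(letter >= 'a' && letter <= 'z');
    int n = snprintf(buf, buf_size, "name_%c", letter);
    return (n > 0 && (size_t)n < buf_size) ? 0 : -1;
}

/* Store entries for letters first..last (inclusive).
 * Returns 0 if every store succeeded, otherwise the letter that failed. */
static char set_name_range(store_fn store, char first, char last)
{
    char name[16];
    for (char c = first; c <= last; c++) {
        if (make_name(name, sizeof name, c) != 0 ||
            store(name, name, strlen(name)) != 0) {
            return c;
        }
    }
    return 0;
}

/* Host-side stand-in that only records calls; it does not touch any flash. */
static unsigned recorded_calls;
static char last_name[16];

static int recording_store(const char *name, const void *data, size_t size)
{
    (void)data;
    (void)size;
    recorded_calls++;
    strncpy(last_name, name, sizeof last_name - 1);
    last_name[sizeof last_name - 1] = '\0';
    return 0;
}
```

Calling `set_name_range(store, 'o', 'p')` exercises only the two entries named above, while `set_name_range(store, 'a', 'z')` corresponds to the full sequence.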

Issue request type

[ ] Question
[ ] Enhancement
[X] Bug
@offirko
Contributor Author

offirko commented Mar 12, 2019

@jeromecoutant, @adustm: I'd appreciate your input.
Thanks.

@ciarmcom
Member

Internal Jira reference: https://jira.arm.com/browse/MBOCUSTRIA-990

@offirko
Contributor Author

offirko commented Mar 15, 2019

@TuomoHautamaki - my current analysis is that, after several successful program commands to the QSPI flash, a new program command fails on Write Enable. All further program/read/erase commands then fail with HAL_BUSY. We need ST and HAL support on this case.

@offirko
Contributor Author

offirko commented Mar 21, 2019

@ARMmbed/mbed-os-maintainers - Please assign this issue to STM people

@offirko
Contributor Author

offirko commented Mar 21, 2019

@VVESTM, @jeromecoutant, @adustm: the QSPI HAL gets stuck at a certain stage of the test, at:

status = QSPI_WaitFlagStateUntilTimeout(hqspi, QSPI_FLAG_TC, SET, tickstart, Timeout);

Eventually the 5-second timeout expires and the error state is set:

hqspi->State = HAL_QSPI_STATE_ERROR;

The QSPI HAL is then stuck, and no read/program/erase commands can be issued until the device is reset!
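
The behaviour described above (poll for a flag, time out, latch an error state that rejects every later command) can be modelled on a host machine. This is a rough sketch under stated assumptions rather than the ST HAL code: every `model_*` name is hypothetical, time is simulated, and the "hardware" raising the transfer-complete flag is just a countdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model only: these names are hypothetical, not the ST HAL's. */
#define MODEL_FLAG_TC 0x02u

typedef enum { MODEL_READY, MODEL_BUSY, MODEL_ERROR } model_state_t;

typedef struct {
    model_state_t state;
    uint32_t status;      /* simulated status register */
    uint32_t now_ms;      /* simulated millisecond tick */
    int32_t tc_delay_ms;  /* ticks until the "hardware" raises TC; < 1 means never */
    int32_t pending_ms;   /* countdown for the command in flight */
} model_qspi_t;

/* Advance simulated time by 1 ms and raise TC when the countdown expires. */
static uint32_t model_tick(model_qspi_t *q)
{
    q->now_ms++;
    if (q->pending_ms > 0 && --q->pending_ms == 0) {
        q->status |= MODEL_FLAG_TC;
    }
    return q->now_ms;
}

/* Poll until flag is set or timeout_ms elapses. Unsigned subtraction keeps
 * the elapsed-time check correct even if the tick counter wraps. */
static bool model_wait_flag(model_qspi_t *q, uint32_t flag, uint32_t timeout_ms)
{
    const uint32_t start = q->now_ms;
    while ((q->status & flag) == 0u) {
        if ((uint32_t)(model_tick(q) - start) > timeout_ms) {
            return false;
        }
    }
    return true;
}

/* Issue one command; returns true only if TC arrived before the timeout. */
static bool model_command(model_qspi_t *q, uint32_t timeout_ms)
{
    assert(q != NULL);
    if (q->state != MODEL_READY) {
        return false;              /* rejected without touching the bus */
    }
    q->state = MODEL_BUSY;
    q->status &= ~MODEL_FLAG_TC;
    q->pending_ms = q->tc_delay_ms;
    if (!model_wait_flag(q, MODEL_FLAG_TC, timeout_ms)) {
        q->state = MODEL_ERROR;    /* latched: nothing in this model clears it */
        return false;
    }
    q->status &= ~MODEL_FLAG_TC;
    q->state = MODEL_READY;
    return true;
}
```

In this model a single missed flag is enough to make every later `model_command` call fail, even when the simulated hardware would respond again, which mirrors the "stuck until reset" symptom.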

@0xc0170
Contributor

0xc0170 commented Mar 21, 2019

cc @ARMmbed/team-st-mcd

@VVESTM
Contributor

VVESTM commented Mar 21, 2019

There is also something related to the toolchain. Can someone who knows IAR see what the issue might be?
Could it be memory corruption? For information, the problem always occurs at the same place. If we rename a variable or change name_a to name_A, the problem moves or "disappears". The same happens if we remove optimizations in the compiler options.

@VVESTM
Contributor

VVESTM commented Mar 22, 2019

Regarding optimizations, I ran a test with the develop.json file. The problem does not occur if we disable optimizations for the C++ parts (-On instead of -Oh):

    "IAR": {
        "common": [
            "--no_wrap_diagnostics", "-e",
            "--diag_suppress=Pa050,Pa084,Pa093,Pa082", "--enable_restrict",
            "-DMBED_TRAP_ERRORS_ENABLED=1"],
        "asm": [],
        "c": ["--vla", "--diag_suppress=Pe546", "-Oh"],
        "cxx": ["--guard_calls", "--no_static_destruction", "-On"],
        "ld": ["--skip_dynamic_initialization", "--threaded_lib"]
    }

Does this mean that the problem could be in the C++ part?

@jeromecoutant
Collaborator

@kjbracey-arm @pan- Could you have a look at the questions we have around C++ and IAR?
Thanks.

@VVESTM
Contributor

VVESTM commented Mar 22, 2019

One more point: on @LMESTM's side, the test passes. The difference is the IAR version:
Test passing: IAR ELF Linker V8.32.2.178/W32 for ARM (EWARM-CD-8322-19423.exe)
Test failing: IAR ELF Linker V8.32.3.193/W32 for ARM (EWARM-CD-8323-20228.exe)

@offirko
Contributor Author

offirko commented Mar 22, 2019

I've noticed there's a known issue for this device in IAR:
[EWARM-5402, EW26024] Missing FIFO definition for register SPI1->CR2 in the SVD file for ST STM32F746

http://supp.iar.com/FilesPublic/UPDINFO/013240/arm/doc/infocenter/ewarm.ENU.html

@offirko
Contributor Author

offirko commented Mar 24, 2019

@VVESTM - please note that the problem also reproduces in my environment using IAR ELF Linker V8.32.1.169/W32 for ARM.
Also, I've used the "no optimization" cxx setup:
"cxx": ["--guard_calls", "--no_static_destruction", "-On"],

With a small code variation, the problem reproduced again, this time when trying to set "name_b".

@offirko
Contributor Author

offirko commented Mar 25, 2019

CC: @screamerbg

offirko changed the title from "DISCO_F746NG QSPI WriteEnable might Fails on IAR8" to "DISCO_F746NG QSPI WriteEnable might Fail on IAR8" on Mar 25, 2019

@cmonr
Contributor

cmonr commented Mar 25, 2019

@ARMmbed/mbed-os-test @ARMmbed/mbed-os-core @ARMmbed/mbed-os-maintainers

Fyi: #10049 (comment)

@offirko
Contributor Author

offirko commented Mar 26, 2019

@VVESTM - Disabling the data cache with a call to SCB_DisableDCache() at the beginning of the test case resolves the problem.
(The rest of the setup is default.)

(Could it be related to #9934 (comment) after all?)

@kjbracey
Contributor

Although the STM32F7 is vulnerable to cache issues that other boards don't see, I don't believe there's any direct reason for this interface to be vulnerable. It's not being used as a bus-mastering interface like Ethernet, it just has a FIFO you access as programmed/memory-mapped I/O, right? Should be no more problematic than the UART. (On the other hand, #9934 is quite likely a cache issue.)

So the optimisation and cache effects smell to me like a timing issue - maybe you're just slowing it down.

Alternatively, it could be that the cache change is a red herring, and that it's just the act of inserting the call that moves code around again. :/

It's possible there's a compiler bug, or some code triggering undefined behaviour only in this compiler, but we'd need to pin down a bit closer what's actually going wrong.

There must be one initial transfer that times out - for that transfer we'd want to see how the peripheral had been programmed. Did we program incorrect values? If so, where did those incorrect values come from? Is the hardware signalling something that we're missing? We're waiting for the TC flag - is it signalling TE?
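
One way to answer the last question is to capture the controller's status register at the moment the wait times out and classify it. The sketch below is an assumption-laden illustration: the bit positions are taken to match the STM32F7 QUADSPI status register (TEF at bit 0, TCF at bit 1, BUSY at bit 5) and should be checked against the reference manual, and the `diag_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions assumed from the STM32F7 QUADSPI status register layout;
 * verify against the reference manual before relying on them. */
#define SR_TEF  (1u << 0)  /* transfer error */
#define SR_TCF  (1u << 1)  /* transfer complete */
#define SR_BUSY (1u << 5)  /* operation ongoing */

typedef enum {
    DIAG_TRANSFER_ERROR,      /* TE raised: the wait for TC can never succeed */
    DIAG_COMPLETE,            /* TC raised: the timeout was a software miss */
    DIAG_STILL_BUSY,          /* neither flag yet, peripheral still working */
    DIAG_IDLE_NO_COMPLETION   /* neither flag and not busy: nothing started */
} diag_t;

/* Classify a status snapshot; an error flag takes priority over completion. */
static diag_t diag_classify(uint32_t sr)
{
    if (sr & SR_TEF) {
        return DIAG_TRANSFER_ERROR;
    }
    if (sr & SR_TCF) {
        return DIAG_COMPLETE;
    }
    if (sr & SR_BUSY) {
        return DIAG_STILL_BUSY;
    }
    return DIAG_IDLE_NO_COMPLETION;
}

/* Human-readable label for logging a captured snapshot. */
static const char *diag_describe(diag_t d)
{
    switch (d) {
    case DIAG_TRANSFER_ERROR:     return "transfer error flagged";
    case DIAG_COMPLETE:           return "transfer complete flagged";
    case DIAG_STILL_BUSY:         return "busy, no completion yet";
    case DIAG_IDLE_NO_COMPLETION: return "idle, no completion";
    }
    assert(0 && "unknown diag_t value");
    return "unknown";
}
```

Logging `diag_describe(diag_classify(sr))` for the first transfer that times out would distinguish "the hardware reported an error" from "the hardware never started or never finished".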

If there ever is a timeout, as was pointed out above, the state gets locked into "error", so it never works again. Is that reasonable? Is this supposed to be a reliable interface?
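
As a point of comparison for the reliability question, a driver could clear the latched state and retry instead of failing forever. The following host-side sketch uses a deliberately tiny model with hypothetical `demo_*` names; it is not what the HAL or Mbed OS does. Whether retrying is safe after a failed write enable depends on the flash device, and a retry would hide rather than explain the original timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEMO_OK, DEMO_BUSY, DEMO_TIMEOUT } demo_status_t;

/* Hypothetical bus model: fail_next is how many upcoming transfers time out. */
typedef struct {
    bool ready;
    unsigned fail_next;
    unsigned aborts;
} demo_bus_t;

/* One transfer attempt; a timeout latches the bus into a not-ready state. */
static demo_status_t demo_transfer(demo_bus_t *b)
{
    if (!b->ready) {
        return DEMO_BUSY;
    }
    if (b->fail_next > 0) {
        b->fail_next--;
        b->ready = false;
        return DEMO_TIMEOUT;
    }
    return DEMO_OK;
}

/* Clear the latched state so that new transfers are accepted again. */
static void demo_abort(demo_bus_t *b)
{
    b->ready = true;
    b->aborts++;
}

/* Try up to max_attempts times, aborting after each failed attempt.
 * Returns DEMO_OK on success, otherwise the status of the last attempt. */
static demo_status_t demo_transfer_with_recovery(demo_bus_t *b, unsigned max_attempts)
{
    assert(b != NULL && max_attempts > 0);
    demo_status_t st = DEMO_TIMEOUT;
    for (unsigned i = 0; i < max_attempts; i++) {
        st = demo_transfer(b);
        if (st == DEMO_OK) {
            return st;
        }
        demo_abort(b);
    }
    return st;
}
```

The design choice here is that a failed attempt always leaves the bus usable for the next caller, so a single timeout cannot disable the interface until reset.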

@dannybenor

@VVESTM We see that this issue is reproducible but fragile: small changes to the test, such as adding prints or changing the cache configuration, will "fix" the problem. We need your help investigating the root cause of why the QSPI gets stuck.

@VVESTM
Contributor

VVESTM commented Apr 1, 2019

@dannybenor, I am working on this issue. I will come back when I have news.

@jeromecoutant
Collaborator

ST_INTERNAL_REF 64387

VVESTM added a commit to VVESTM/mbed-os that referenced this issue Apr 8, 2019
This patch is missing in F7 HAL.
Fix ARMmbed#10049

Signed-off-by: Vincent Veron <vincent.veron@st.com>
adbridge pushed a commit that referenced this issue Apr 24, 2019
This patch is missing in F7 HAL.
Fix #10049

Signed-off-by: Vincent Veron <vincent.veron@st.com>
Jookia added a commit to Jookia/mbed-os that referenced this issue Mar 11, 2023
On the STM32769NI at least this patch is required for stable QSPI use.
Enable it unconditionally in case other boards need it too.

Further discussions:

ARMmbed#10049
ARMmbed#15108

STMicroelectronics/STM32CubeF7#52
STMicroelectronics/STM32CubeF7#82
multiplemonomials pushed a commit to mbed-ce/mbed-os that referenced this issue Mar 21, 2023
* STM32F7: Unconditionally enable QSPI workarounds

On the STM32769NI at least this patch is required for stable QSPI use.
Enable it unconditionally in case other boards need it too.

Further discussions:

ARMmbed#10049
ARMmbed#15108

STMicroelectronics/STM32CubeF7#52
STMicroelectronics/STM32CubeF7#82

* QSPIF: Attempt 4-byte addressing on Macronix chips

mbed-os PR 11531 introduced 4-byte addressing in the QSPIF block device:

ARMmbed#11531

During testing it was found that this code broke on the NRF52840_DK and
DISCO_F769NI.

The NRF52840_DK controller seems unable to handle 4-byte addressing at
all and has been disabled entirely in another code section.

The DISCO_F769NI breakage was attributed to the flash chip but after more
research I believe this is related to the QSPI controller, not the 4-byte
addressing itself.

Now that the QSPI controller has a workaround, enable 4-byte addressing
again and hope it works fine this time.
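
For context on why 4-byte addressing matters: a 3-byte address carries 24 bits and therefore reaches at most 16 MiB, so larger flash parts need a fourth address byte. The sketch below only illustrates that arithmetic; `addr_bytes_for_size` and `encode_address` are hypothetical helpers, not functions from the QSPIF block device, and a real QSPI controller is typically given the address and its size through registers rather than a byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_3_BYTE_LIMIT (1ull << 24)  /* 16 MiB reachable with 24 address bits */

/* Pick the address width needed to reach every byte of the device. */
static unsigned addr_bytes_for_size(uint64_t flash_size_bytes)
{
    return (flash_size_bytes > ADDR_3_BYTE_LIMIT) ? 4u : 3u;
}

/* Write addr into out, most significant byte first; returns bytes written. */
static size_t encode_address(uint32_t addr, unsigned addr_bytes, uint8_t out[4])
{
    assert(addr_bytes == 3u || addr_bytes == 4u);
    assert(addr_bytes == 4u || addr <= 0x00FFFFFFu);
    for (unsigned i = 0; i < addr_bytes; i++) {
        unsigned shift = 8u * (addr_bytes - 1u - i);
        out[i] = (uint8_t)(addr >> shift);
    }
    return addr_bytes;
}
```

With these helpers a 16 MiB device still fits 3-byte addressing, while anything larger selects 4 bytes.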
Jookia added a commit to Jookia/mbed-os that referenced this issue May 8, 2023
On the STM32769NI at least this patch is required for stable QSPI use.
Enable it unconditionally in case other boards need it too.

Further discussions:

ARMmbed#10049
ARMmbed#15108

STMicroelectronics/STM32CubeF7#52
STMicroelectronics/STM32CubeF7#82

9 participants