DISCO_F746NG QSPI WriteEnable might Fail on IAR8 #10049

Closed
offirko opened this issue Mar 12, 2019 · 19 comments

@offirko
Contributor

commented Mar 12, 2019

Description

Following ticket https://jira.arm.com/browse/IOTSTOR-798

Running the storage tests on DISCO_F746NG with IAR8 fails on this test:
features-storage-tests-kvstore-static_tests

The same board and test pass on IAR7, as well as on GCC_ARM and ARM.

The test fails at this line:
https://github.com/ARMmbed/mbed-os/blob/master/features/storage/TESTS/kvstore/static_tests/main.cpp#L296

Drilling down, the failure occurs when sending write_enable to the QSPI flash, which eventually times out:

if (HAL_QSPI_Command(&obj->handle, &st_command, HAL_QPSI_TIMEOUT_DEFAULT_VALUE) != HAL_OK) {

After that, data cannot be written to the device until it is reset.

The test uses the kvstore file system to add key/value pairs which hold the values:
“name_a”, “name_b”, “name_c”, …, “name_z”

For some strange reason, the combination of “name_o” followed by “name_p” triggers the bug.
Even if we skip all the previous entries and only set “name_o” followed by “name_p”, it fails.

Issue request type

[ ] Question
[ ] Enhancement
[X] Bug
@offirko

Contributor Author

commented Mar 12, 2019

@jeromecoutant, @adustm: I'd appreciate your input.
Thanks.

@ciarmcom

Member

commented Mar 12, 2019

@offirko

Contributor Author

commented Mar 15, 2019

@TuomoHautamaki - my current analysis is that, after several successful program commands to the QSPI flash, a new program command fails at Write Enable. All further program/read/erase commands then fail with HAL_BUSY. We need ST and HAL support on this case.

@offirko

Contributor Author

commented Mar 21, 2019

@ARMmbed/mbed-os-maintainers - Please assign this issue to STM people

@offirko

Contributor Author

commented Mar 21, 2019

@VVESTM, @jeromecoutant, @adustm: the QSPI HAL gets stuck at a certain stage of the test, at:

status = QSPI_WaitFlagStateUntilTimeout(hqspi, QSPI_FLAG_TC, SET, tickstart, Timeout);

Eventually the 5-second timeout expires and the error state is set:

hqspi->State = HAL_QSPI_STATE_ERROR;

The QSPI HAL is then stuck, and no read/program/erase commands can be issued until the device is reset!
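
To illustrate the latching behaviour described above, here is a minimal host-side model. It is only a sketch under stated assumptions: the `drv_*` names and types are hypothetical stand-ins rather than the ST HAL definitions, and the status returned on timeout is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not the ST HAL types. */
typedef enum { DRV_OK, DRV_ERROR, DRV_BUSY } drv_status_t;
typedef enum { DRV_STATE_READY, DRV_STATE_BUSY, DRV_STATE_ERROR } drv_state_t;

typedef struct {
    drv_state_t state;
} drv_handle_t;

/*
 * Issue one command. 'flag_seen' models whether the completion flag was
 * observed before the timeout expired.
 */
drv_status_t drv_command(drv_handle_t *h, bool flag_seen)
{
    assert(h != NULL);

    if (h->state != DRV_STATE_READY) {
        /* Handle is not ready: the command is rejected up front. */
        return DRV_BUSY;
    }

    h->state = DRV_STATE_BUSY;
    if (!flag_seen) {
        /* Timeout: the error state is latched and never cleared here. */
        h->state = DRV_STATE_ERROR;
        return DRV_ERROR;
    }

    h->state = DRV_STATE_READY;
    return DRV_OK;
}

/* Models a full re-initialisation, the only way back to READY in this sketch. */
void drv_reinit(drv_handle_t *h)
{
    assert(h != NULL);
    h->state = DRV_STATE_READY;
}
```

In this model, a single timed-out command makes every later `drv_command()` call return `DRV_BUSY` until `drv_reinit()` is called, which mirrors the "stuck until reset" symptom reported here.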

@0xc0170

Member

commented Mar 21, 2019

@VVESTM

Contributor

commented Mar 21, 2019

There is also something related to the toolchain. Can someone who knows IAR see what the issue could be?
Could it be memory corruption? For information, the problem always occurs at the same place. If we rename a variable or change name_a to name_A, the problem moves or "disappears". The same happens if we remove optimizations in the compiler options.

@VVESTM

Contributor

commented Mar 22, 2019

Regarding optimizations, I ran a test with the develop.json file. We do not see the problem if we remove optimizations on the C++ parts (-On option instead of -Oh):
"IAR": {
    "common": [
        "--no_wrap_diagnostics", "-e",
        "--diag_suppress=Pa050,Pa084,Pa093,Pa082", "--enable_restrict",
        "-DMBED_TRAP_ERRORS_ENABLED=1"],
    "asm": [],
    "c": ["--vla", "--diag_suppress=Pe546", "-Oh"],
    "cxx": ["--guard_calls", "--no_static_destruction", "-On"],
    "ld": ["--skip_dynamic_initialization", "--threaded_lib"]
}
Does this mean the problem could be in the C++ part?

@jeromecoutant

Contributor

commented Mar 22, 2019

@kjbracey-arm @pan- Could you have a look at the questions we have around C++ and IAR?
Thx

@VVESTM

Contributor

commented Mar 22, 2019

One more point: on @LMESTM's side, the test passes. The difference is the IAR version:
Test passes: IAR ELF Linker V8.32.2.178/W32 for ARM (EWARM-CD-8322-19423.exe)
Test fails: IAR ELF Linker V8.32.3.193/W32 for ARM (EWARM-CD-8323-20228.exe)

@offirko

Contributor Author

commented Mar 22, 2019

I've noticed there's a known issue for this device in IAR:
[EWARM-5402, EW26024] Missing FIFO definition for register SPI1->CR2 in the SVD file for ST STM32F746

http://supp.iar.com/FilesPublic/UPDINFO/013240/arm/doc/infocenter/ewarm.ENU.html

@offirko

Contributor Author

commented Mar 24, 2019

@VVESTM - please note that the problem reproduces in my environment using: IAR ELF Linker V8.32.1.169/W32 for ARM.
Also, I used the "no optimization" cxx setup:
"cxx": ["--guard_calls", "--no_static_destruction", "-On"],

With a small code variation, I reproduced the problem again, this time when trying to set "name_b".

@offirko

Contributor Author

commented Mar 25, 2019

@offirko changed the title from "DISCO_F746NG QSPI WriteEnable might Fails on IAR8" to "DISCO_F746NG QSPI WriteEnable might Fail on IAR8" on Mar 25, 2019

@cmonr

@offirko

Contributor Author

commented Mar 26, 2019

@VVESTM - Disabling the data cache with a call to SCB_DisableDCache() at the beginning of the test case resolves the problem.
(The rest of the setup is default.)

(Could it be related to #9934 (comment) after all?)

@kjbracey-arm

Contributor

commented Mar 26, 2019

Although the STM32F7 is vulnerable to cache issues that other boards don't see, I don't believe there's any direct reason for this interface to be vulnerable. It's not being used as a bus-mastering interface like Ethernet, it just has a FIFO you access as programmed memory/mapped I/O, right? Should be no more problematic than the UART. (On the other hand #9934 is quite likely a cache issue).

So the optimisation and cache effects smell to me like a timing issue - maybe you're just slowing it down.

Alternatively, it could be that the cache change is a red herring, and that it's just the act of inserting the call that moves code around again. :/

It's possible there's a compiler bug, or some code triggering undefined behaviour only in this compiler, but we'd need to pin down a bit closer what's actually going wrong.

There must be one initial transfer that times out - for that transfer we'd want to see how the peripheral had been programmed. Did we program incorrect values? If so, where did those incorrect values come from? Is the hardware signalling something that we're missing? We're waiting for the TC flag - is it signalling TE?

If there ever is a timeout, as was pointed out above, the state gets locked into "error", so it never works again. Is that reasonable? Is this supposed to be a reliable interface?
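
The TC-versus-TE question above could be checked with a wait loop that reports which flag ended the wait instead of only timing out. The following is a host-side sketch, not HAL code: the flag bit positions, the `wait_tc_or_te` name, the poll-count timeout and the scripted fake status register are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; not taken from the STM32 headers. */
#define FLAG_TE (1u << 0) /* transfer error */
#define FLAG_TC (1u << 1) /* transfer complete */

typedef enum { WAIT_DONE, WAIT_TRANSFER_ERROR, WAIT_TIMEOUT } wait_result_t;

/* Callback that returns the current status register value. */
typedef uint32_t (*read_status_fn)(void *ctx);

/* Poll until TC or TE is seen, or until max_polls reads have been made. */
wait_result_t wait_tc_or_te(read_status_fn read_status, void *ctx,
                            uint32_t max_polls)
{
    assert(read_status != NULL);

    for (uint32_t i = 0; i < max_polls; i++) {
        uint32_t sr = read_status(ctx);
        if (sr & FLAG_TE) {
            return WAIT_TRANSFER_ERROR; /* error wins over completion */
        }
        if (sr & FLAG_TC) {
            return WAIT_DONE;
        }
    }
    return WAIT_TIMEOUT;
}

/* Fake status register that replays a scripted sequence, then holds the last value. */
typedef struct {
    const uint32_t *values;
    size_t count;
    size_t next;
} scripted_status_t;

uint32_t scripted_read(void *ctx)
{
    scripted_status_t *s = ctx;
    assert(s != NULL && s->count > 0);

    uint32_t v = s->values[s->next];
    if (s->next + 1 < s->count) {
        s->next++;
    }
    return v;
}
```

On the target, the callback would read the real status register; distinguishing `WAIT_TRANSFER_ERROR` from `WAIT_TIMEOUT` in a log is what would answer whether the hardware signals an error that the current wait misses.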

@dannybenor


commented Mar 27, 2019

@VVESTM We see that this issue is reproducible but also fragile: small changes to the test, such as adding prints or changing the cache setup, will "fix" the problem. We need your help investigating the root cause of the QSPI getting stuck.

@VVESTM

Contributor

commented Apr 1, 2019

@dannybenor, I am working on this issue. I will come back when I have news.

@jeromecoutant

Contributor

commented Apr 1, 2019

ST_INTERNAL_REF 64387

VVESTM added a commit to VVESTM/mbed-os that referenced this issue Apr 8, 2019

TARGET_STM32F7: Reset QSPI in default mode on abort for all versions.
This patch is missing in F7 HAL.
Fix ARMmbed#10049

Signed-off-by: Vincent Veron <vincent.veron@st.com>

adbridge added a commit that referenced this issue Apr 24, 2019

TARGET_STM32F7: Reset QSPI in default mode on abort for all versions.
This patch is missing in F7 HAL.
Fix #10049

Signed-off-by: Vincent Veron <vincent.veron@st.com>
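
The patch itself is not quoted in this thread. As a reading aid only, here is a host-side model of what "reset to default mode on abort" could look like. Everything in it is an assumption: the `fake_qspi_*` names, the register layout (a 2-bit functional-mode field at bits 27:26 where 0 means the default mode), and the idea that abort returns the handle to a ready state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed position of the functional-mode field; 0 is taken as the default mode. */
#define CCR_FMODE_SHIFT 26u
#define CCR_FMODE_MASK  (0x3u << CCR_FMODE_SHIFT)

typedef enum { QSTATE_READY, QSTATE_BUSY, QSTATE_ERROR } qstate_t;

/* Hypothetical stand-in for the peripheral and driver handle. */
typedef struct {
    uint32_t ccr;   /* models the communication configuration register */
    qstate_t state; /* models the driver's software state */
} fake_qspi_t;

/* Abort the current transfer and leave the peripheral in its default mode. */
void fake_qspi_abort(fake_qspi_t *q)
{
    assert(q != NULL);

    /* Clear only the functional-mode field; other settings are preserved. */
    q->ccr &= ~CCR_FMODE_MASK;

    /* The driver can accept commands again. */
    q->state = QSTATE_READY;
}
```

The point of the sketch is that abort restores both the software state and the mode field, so a later command does not start from a stale mode left over by the failed transfer.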