Fix the endless loop issue with GCC O0. #1426
Can one of the admins verify this patch? |
@grasci-arm test this please |
For more details, see ARM-software#620. The issue only happens when local variables are on the stack (GCC -O0). If local variables are held in general-purpose registers, the function works correctly. When local variables are on the stack, flush their cache line after disabling the cache to keep the data consistent.
Is the stack alignment to the cache line size really necessary in commit 3070886? Non-optimized builds with the IAR compiler also suffer from the endless loop issue, but the stack cannot be aligned to match the cache line size due to the way the stack works in IAR builds. Is my understanding correct or not? |
Is there a reason why these fixes are only applied to one location |
Hi @Wastus, This was a community contribution from @Masmiseim36. Can you check if the same change would solve your issue in As a last resort, I could only propose implementing cache maintenance routines in assembly specific to your requirements. You could use the disassembly of a highly optimized build as a starting point. |
Hello @Wastus, Cache handling is quite complicated. In PR #1611, the flush for the local variables was also extended to IAR and Keil. Perhaps the behavior of both compilers has changed in the meantime, or perhaps my test was not correct. Because of the complexity, and because my first request in this direction (#620) was rejected with reference to a compiler bug, I concentrated in this PR on the functionality that I could test well and where I had problems in my application. |
Hi All,

Expanding on this topic with some exposition.

Problem One

The problem with For the Cortex-M7, if Reading locations that are still dirty in the cache while Writing locations that are still dirty in the cache while

Further Explanation

For this issue with unoptimized builds using stack variables, a window of opportunity exists in It is circumstantial that the stack is affected, as any dirty cached location could be affected even in optimized builds. This means that even if registers were used for the variables, or even with this mitigation for unoptimized builds, there are still issues that may occur. For example, if an exception is taken before the clean and invalidate operation in The stack variables could also feasibly be in a non-cacheable location, and this specific issue would not be seen. The problem is situational.

Problem Two

The problem of stack variable usage with For Cortex-M7, Cortex-M55, and Cortex-M85, the Invalidation destroys cache lines whether they are dirty or clean. If the stack variables are cached, they will be destroyed by full invalidation or may be partially or fully destroyed by address-based invalidation. This has nothing to do with the This PR and the companion IAR/Keil PR do not mitigate the issue with these functions.

Further Explanation

The nature of invalidation creates this problem. Also, consider that if an exception is taken before the invalidate operation in The stack variables could feasibly be in a non-cacheable location, may be write-through, or may be spared destruction by an address-based invalidation, and this specific issue would not be seen. The problem is situational.
Why Problem One is CM7 Only

The newer Cortex-M55 and Cortex-M85 should not have this issue because of new cache behaviors provided by the For the specific example of For CM7, because both cache allocations and cache lookups stop when A clean and invalidate will thus write the stack variable locations with the dirty cached data, now outdated, which they had before For CM55 and CM85, because only cache allocations stop when A clean and invalidate will thus write the stack variable locations with the correct dirty cached data, when they are finally cleaned from the data cache. This no-allocation, but continued lookup cache behavior solves the edge case for read and write access to dirty locations when

Recent No-Optimization Compilation Results

I checked the disassembly on CMSIS 5.9.0 (prior to this patch) on the following latest compilers and built for a Cortex-M85 target, and all generated similar stack access code in the affected functions that would cause an issue when their optimization was turned off.
The latest IAR did not seem to generate stack access code and was OK.
I assume similar code would be generated for a Cortex-M7 target. Snippet of
Questions

Because I have not tested the Cortex-M7 issue myself, was this truly an "infinite loop" or did the stack variables simply become really large from stale data by coincidence? In theory the stack variables could be overwritten to any value, meaning this issue could have gone unnoticed as an "infinite loop" if the values were overwritten to small values and caused the loop to terminate early, leaving dirty lines in the data cache. I do not see how this could occur for The problem with

Data Cache Functions
I assume the hardware will properly hazard read and write access to locations where an invalidate, clean, or clean and invalidate cache maintenance operation is currently taking place, so long as cache lookup is enabled for the respective Cortex-M type. An application still has to be responsible for knowing if interrupting cache maintenance in any of these functions will be a problem for the application logic. Generally, the data cache invalidate-only functions should not be used and should be replaced with clean-and-invalidate functions, since they eliminate the data loss risk with cache lines that accidentally mix dirty locations and locations to be invalidated on the same cache line(s).

Cache Behavior
References

For Cortex-M7, reference
For Cortex-M55 and Cortex-M85, reference
|
Hello @njankowski-renesas, Thank you for your detailed explanation. The whole cache topic is really complicated. According to my observation, the following happens when the cached local variables are not invalidated first: Precondition:
Process: Incidentally, my original statement that this is an infinite loop was incorrect. The loop is finite, but this can take up to ten minutes on a fast microcontroller. If you clear/invalidate the stack before the clean/invalidate loop, you can avoid this error. In my opinion, this is a valid solution. Regards |
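The sequence described in this thread can be sketched with a host-side toy model. This is only an illustration under stated assumptions: it models a single stack slot with one write-back cache line, and every name in it (`toy_cache_t`, `run_disable`, `TOY_STALE`, `TOY_COUNT`) is hypothetical and not part of CMSIS. The `lookup_when_disabled` flag stands in for the difference described above between the Cortex-M7 (lookups stop when the cache is disabled) and the Cortex-M55/M85 (only allocations stop).

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_STALE 1000000u  /* old value left dirty in the cache (hypothetical) */
#define TOY_COUNT 4u        /* loop counter written after the cache is disabled */

/* Toy model: one memory word (the stack slot) and one cache line caching it. */
typedef struct {
    uint32_t mem;                  /* backing memory */
    uint32_t line;                 /* cached copy */
    bool     valid;
    bool     dirty;
    bool     enabled;              /* stands in for the data cache enable bit */
    bool     lookup_when_disabled; /* false: CM7-like, true: CM55/CM85-like */
} toy_cache_t;

static void toy_write(toy_cache_t *c, uint32_t v)
{
    bool lookup = c->enabled || c->lookup_when_disabled;
    if (lookup && c->valid) {      /* hit: update the cached copy */
        c->line = v;
        c->dirty = true;
    } else if (c->enabled) {       /* miss with cache on: allocate the line */
        c->line = v;
        c->valid = true;
        c->dirty = true;
    } else {                       /* no lookup or no allocation: write memory */
        c->mem = v;
    }
}

static uint32_t toy_read(const toy_cache_t *c)
{
    bool lookup = c->enabled || c->lookup_when_disabled;
    return (lookup && c->valid) ? c->line : c->mem;
}

static void toy_clean_invalidate(toy_cache_t *c)
{
    if (c->valid && c->dirty) {
        c->mem = c->line;          /* write the dirty data back */
    }
    c->valid = false;
    c->dirty = false;
}

/* Returns the counter value seen after the maintenance loop reached its line. */
static uint32_t run_disable(bool lookup_when_disabled, bool flush_locals_first)
{
    toy_cache_t c = { .mem = 0u, .enabled = true,
                      .lookup_when_disabled = lookup_when_disabled };

    toy_write(&c, TOY_STALE);      /* earlier code leaves the slot dirty in cache */
    c.enabled = false;             /* disable the data cache */
    if (flush_locals_first) {
        toy_clean_invalidate(&c);  /* mitigation: flush the locals' line first */
    }
    toy_write(&c, TOY_COUNT);      /* unoptimized build stores the counter */
    toy_clean_invalidate(&c);      /* set/way loop eventually hits this line */
    return toy_read(&c);
}
```

In this model, `run_disable(false, false)` returns `TOY_STALE`: the late write-back overwrites the counter, which matches the "very long loop" observation. With `flush_locals_first` set, or with lookups continuing while the cache is disabled, the counter keeps the value `TOY_COUNT`.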
Hi @Masmiseim36, Thanks for your reply!
Clean and invalidate, that is :)
This is great to hear, because it helps validate my thought experiment on the issue.
Agreed. Unfortunately the invalidate-only functions for data cache are also affected as I described for the second problem, but there is no mitigation that can be done without breaking the expectations of the functions. The fortunate aspect is I would assume running into this scenario with the invalidate-only functions would be unlikely, since they are probably seldom used by most applications, or are used in ways where this problem would not occur. Regardless, problem two persists as an issue to be aware of. |
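Problem two can be illustrated with a similar host-side toy model. It is a sketch only, assuming a single 32-byte write-back line; the names (`toy_line_t`, `local_after_maintenance`) are hypothetical and do not correspond to any CMSIS API. It shows a dirty local variable that shares a cache line with data being invalidated: an invalidate-only operation discards the local, while a clean-and-invalidate writes it back first.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_WORDS 8u  /* 8 x 4 bytes = one 32-byte line (assumed for the sketch) */

typedef struct {
    uint32_t mem[LINE_WORDS];   /* backing memory for the line */
    uint32_t data[LINE_WORDS];  /* cached copy of the line */
    bool     valid;
    bool     dirty;
} toy_line_t;

static void line_fill(toy_line_t *l)
{
    memcpy(l->data, l->mem, sizeof l->data);
    l->valid = true;
    l->dirty = false;
}

static void line_write(toy_line_t *l, unsigned i, uint32_t v)
{
    if (!l->valid) {
        line_fill(l);           /* write-allocate */
    }
    l->data[i] = v;
    l->dirty = true;
}

static uint32_t line_read(toy_line_t *l, unsigned i)
{
    if (!l->valid) {
        line_fill(l);
    }
    return l->data[i];
}

/* Invalidate only: the whole line is dropped, dirty or not. */
static void line_invalidate(toy_line_t *l)
{
    l->valid = false;
    l->dirty = false;
}

/* Clean and invalidate: dirty data is written back before the line is dropped. */
static void line_clean_invalidate(toy_line_t *l)
{
    if (l->valid && l->dirty) {
        memcpy(l->mem, l->data, sizeof l->mem);
    }
    line_invalidate(l);
}

/* Word 5 plays the role of a stack variable sharing the line with other data. */
static uint32_t local_after_maintenance(bool clean_first)
{
    toy_line_t l = {0};

    line_write(&l, 5u, 42u);    /* local is dirty in cache, memory still holds 0 */
    if (clean_first) {
        line_clean_invalidate(&l);
    } else {
        line_invalidate(&l);
    }
    return line_read(&l, 5u);   /* refetched from memory */
}
```

Here `local_after_maintenance(false)` returns 0 because the dirty value was destroyed, while `local_after_maintenance(true)` returns 42. The model deliberately ignores the opposite trade-off, where cleaning a line would overwrite memory that another bus master has just updated.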