Problem clearing software breakpoints with other halted hart on smp #893
Comments
But |
Can you elaborate a bit more on what exactly the problem case is? OpenOCD already tries to execute @TommyMurphyTM1234, you are right, and OpenOCD has always just executed |
In practice that's probably ok until (if it ever happens) someone has an implementation that doesn't support |
RV32G/RV64G are general-purpose platforms, and G includes IMAFD_Zicsr_Zifencei, so general-purpose platforms support the Zifencei extension. We may need to test whether the target platform supports Zifencei, or add an option. But the patch to clear software breakpoints through available cores doesn't just affect RISC-V; I believe other platforms have cache synchronization issues as well. |
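One way the "test if the target platform supports Zifencei" idea could look is a check on an ISA string. This is only a sketch: `isa_string_has_zifencei` is a hypothetical helper, not an OpenOCD function, and it assumes the ISA string is available from somewhere (for example a user-supplied option). It treats `G` as implying Zifencei, as described above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of OpenOCD): returns true if an ISA string
 * such as "rv64gc" or "rv32imac_zicsr_zifencei" implies Zifencei. */
bool isa_string_has_zifencei(const char *isa)
{
    assert(isa != NULL);

    char buf[128];
    size_t n = strlen(isa);
    if (n < 4 || n >= sizeof(buf))
        return false;

    /* Work on a lowercase copy, including the terminating NUL. */
    for (size_t i = 0; i <= n; i++)
        buf[i] = (char)tolower((unsigned char)isa[i]);

    /* Expect an "rv32" / "rv64" style prefix. */
    if (buf[0] != 'r' || buf[1] != 'v')
        return false;
    const char *p = buf + 2;
    while (isdigit((unsigned char)*p))
        p++;

    /* Single-letter extensions run until '_' or a multi-letter prefix
     * (z, s, x). 'g' is treated as shorthand that includes Zifencei. */
    for (; *p && *p != '_' && *p != 'z' && *p != 's' && *p != 'x'; p++) {
        if (*p == 'g')
            return true;
    }

    /* Otherwise look for the multi-letter name in the remainder.
     * Assumption: no other extension name contains "zifencei". */
    return strstr(p, "zifencei") != NULL;
}
```

With this sketch, `isa_string_has_zifencei("rv64gc")` is true and `isa_string_has_zifencei("rv32imac")` is false. Note that `misa` has no bit for Zifencei, so a string or an explicit option is one of the few places this information could come from.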
Although Zifencei is an optional extension, it is commonly implemented; only some simple embedded platforms do not need to implement it. |
Apologies for going off topic on the Edit: I see that the issue was previously mentioned here: And there's a potentially related but wider in scope discussion here: |
What I mean is: this patch introduces bugs and needs to be removed. Whether the Zifencei extension is supported is another question. |
But you haven't addressed @timsifive's question here: |
Assume that the software breakpoint of hart0 is cleared through hart1: hart0 is running, and hart1 modifies the memory. The memory modified by hart1 may not be synchronized to the instruction cache of hart0. hart0 itself needs to run fence.i.
|
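The scenario above can be illustrated with a toy model. Everything here is an assumption made for illustration: the `toy_*` names are invented, stores go straight to shared memory (no D-cache is modeled), and `fence.i` is modeled as invalidating the executing hart's whole I-cache, which is a simplification of what real implementations do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS 16
#define EBREAK 0x00100073u  /* RISC-V ebreak encoding */
#define NOP    0x00000013u  /* addi x0, x0, 0 */

/* Toy model: shared memory plus a private I-cache per hart. */
struct toy_hart {
    uint32_t icache[MEM_WORDS];
    bool     valid[MEM_WORDS];
};

static uint32_t toy_mem[MEM_WORDS];

/* A store updates shared memory only; it never touches any I-cache. */
void toy_store(unsigned idx, uint32_t insn)
{
    assert(idx < MEM_WORDS);
    toy_mem[idx] = insn;
}

/* Instruction fetch: refill from memory only on a miss. */
uint32_t toy_fetch(struct toy_hart *h, unsigned idx)
{
    assert(h != NULL && idx < MEM_WORDS);
    if (!h->valid[idx]) {
        h->icache[idx] = toy_mem[idx];
        h->valid[idx] = true;
    }
    return h->icache[idx];
}

/* fence.i, modeled as invalidating the executing hart's I-cache only. */
void toy_fence_i(struct toy_hart *h)
{
    assert(h != NULL);
    for (unsigned i = 0; i < MEM_WORDS; i++)
        h->valid[i] = false;
}
```

In this model, if hart0 has already fetched an `EBREAK` at some index and the instruction is then restored with `toy_store`, hart0 keeps fetching the stale `EBREAK`. Calling `toy_fence_i` on hart1 changes nothing for hart0; only `toy_fence_i` on hart0 makes the restored instruction visible to it.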
That will only happen in unusual circumstances. We're looking for harts within an SMP group, and OpenOCD tries to keep all harts in a single SMP group in the same state. If it's not successful, then at best gdb is going to get confused. Having one hart in an SMP group running while another is halted I don't think is a supported use case. (It could happen because the debugger doesn't have complete control over a target, but if things change spontaneously on the target there are all kinds of possible problems.) |
39a4f37#diff-926a25c5eb17bac57cb7ba08fb85a02d9ffe1df033fd3336368b906ea3672805R373-R379 |
Yes, and the only way I'm aware of that code actually being executed is if the running hart became running/unavailable without the debugger requesting it. What do you think should happen in such a case? |
Therefore, using an available hart instead of the running hart to clear software breakpoints has a cache synchronization problem. The available hart may only write the instruction to its own D-Cache, while the I-Cache of the running hart is not updated. |
|
The I-Cache, unlike the D-Cache, does not require frequent synchronization with the next level of memory, and only needs to be loaded on a cache miss. When this happens there are two possible results:
|
Could it be the case that such HW configurations should not use SMP configuration at all? I believe such configurations have difficulties even with setting SW breakpoints. The problem is the same - we set SW breakpoint only from 1 hart. |
@aap-sc, when setting SW breakpoints generally all harts are halted. OpenOCD writes the breakpoint through one hart, and then has all of them execute some |
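The sequence described here (write through one hart, then have the harts synchronize) can be sketched as bookkeeping over an SMP group. This is a hypothetical model, not OpenOCD code: the `sketch_*` names are invented, the memory write itself is not modeled, and it assumes the debugger can only make a halted hart execute `fence.i`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_state { SKETCH_HALTED, SKETCH_RUNNING, SKETCH_UNAVAILABLE };

struct sketch_hart {
    enum sketch_state state;
    bool icache_stale;  /* true if this hart may still fetch old bytes */
};

/* Hypothetical: patch an instruction through harts[writer], then have every
 * halted hart execute fence.i. Returns the number of harts whose I-cache may
 * still be stale, or -1 if the writer itself is not halted. */
int sketch_patch_insn(struct sketch_hart *harts, size_t n, size_t writer)
{
    assert(harts != NULL && writer < n);
    if (harts[writer].state != SKETCH_HALTED)
        return -1;

    /* Step 1: the memory write (not modeled) makes every I-cache suspect. */
    for (size_t i = 0; i < n; i++)
        harts[i].icache_stale = true;

    /* Step 2: in this model only halted harts can be made to run fence.i. */
    int stale = 0;
    for (size_t i = 0; i < n; i++) {
        if (harts[i].state == SKETCH_HALTED)
            harts[i].icache_stale = false;
        else
            stale++;
    }
    return stale;
}
```

When all harts in the group are halted the function returns 0. A nonzero return corresponds to the case discussed in this issue: some hart was running or unavailable while the instruction was patched, so its I-cache could not be synchronized.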
But does execution of |
Each hart must perform write and fence.i |
|
Is this article of any relevance in this context? |
RISC-V and caches are still a bit behind feature completeness. This is in part because for a long time the official position was that caches are an implementation detail not visible to the ISA. There are more extensions focused on cache behavior these days. In any case, the problem of downloading a program through a debugger and executing a "process" on a different hart is identical, so the debugger should be able to do whatever software does, because it can execute all the instructions that software can. This is a long way of saying I don't think there will be any changes to the debug spec to handle caching using the program buffer. |
Thanks for the article. However, it focuses mostly on cache-maintenance operations issued by the same core. I guess that the current implementation in OpenOCD is fine if the hart supports fence.i. My understanding is that the issue discussed here is mostly about maintaining some kind of I-cache coherence when we have several cores.
Got it, thanks. Sorry for the ratholing. However, as per @wxjstz's comment:
This means that the current implementation is not quite correct. It looks like when a SW breakpoint is set or cleared, the memory should be updated by all harts. This is not what OpenOCD currently does. Should this be fixed? |
Never mind. My bad. Looks like I'm wrong. Sorry for the potential confusion. If we set a breakpoint, this operation is performed only by the active hart, and this is by design. So the potential issue arises only when we clear the breakpoint while the original hart is unavailable. |
This patch (39a4f37) introduces a problem. After clearing a software breakpoint, fence.i needs to be run to refresh the instruction cache. This instruction can only be run by the hart on which the breakpoint was set.