uVisor prevents LwIP/Hal from properly using ENET hardware when enabled #315
I hooked SVC_Handler and confirmed that this function is not returning when called from the NVIC_EnableIRQ proxy. What might be keeping uVisor from returning, at the very least, an error in this case?
I think I was able to solve the problem related to "A simple program that only reads from the MPU and has the MPU ACL does not work."
@sherrellbc : Please don't access the MPU from unprivileged code. We need to figure out why lwip insists on disabling the MPU - which is clearly a dirty hack that needs to be solved differently.
I agree. Is there a way to read or write this memory without a uVisor fault, or is it strictly disallowed in unprivileged mode? I do not have access to a debugger at the moment, but when I try to even read MPU (or FB) registers I get some sort of error and trap into uVisor, or so I can only assume; execution halts on the read instruction. This behavior is observed regardless of whether the MPU ACL is included. I also observed that LWIP will break if the MPU is not disabled. That is, if that line is commented out and uVisor is disabled, LWIP will not work. It all works as expected, of course, if LWIP is allowed to disable the MPU.
@sherrellbc writing to the MPU is out of the question. We can't allow that for security reasons while uVisor is enabled. We need to find out why lwIP believes disabling the MPU is a good idea and fix the underlying problem without touching the MPU configuration from user mode.
Perhaps it is related to the necessary permissions for the ENET hardware: https://github.com/ARMmbed/uvisor/blob/master/core/system/inc/mpu/vmpu_freescale_k64_mem.h#L30 As I am sure you have seen already, I submitted an issue over at mbed since they own the LWIP implementation.
It looks like on reset, the K64F MPU defaults to a single, full-access memory region covering the entire address space - except for bus master 3, which is dedicated to the ENET hardware. Whoever implemented the original lwip port must have just disabled the MPU as a quick solution, since as far as I'm aware it has no impact on vanilla mbed. Granting access to bus master 3 seems to work as an alternative to disabling the MPU completely: MPU->RGDAAC[0] |= 0x007c0000; If we don't want lwip to make system-wide changes, disabling the MPU could be moved to the k64f's startup sequence. In this case, does disabling the MPU need to be conditionally defined on the presence of uVisor? I'm unfamiliar with how uVisor handles DMA, but is there a method to grant access to bus master 3? This will be needed for the ethernet to operate. k64f mpu datasheet:
This is a fix that I have also been working on with various modes of permissions. I (temporarily) patched the uVisor code to give the above permission and it still does not quite work (with uVisor enabled). It does work, however, when uVisor is disabled. As for changing the permission, I think the default background region will be modified by the inclusion of ACLs. In particular: if the above ACL is not included, then LWIP/Hal will immediately trigger a memory access exception. Still, the below is valid even if the ENET ACL is provided. The observation:
The (a) reason:
Put this before line 457 of fsl_enet.c:
Put this before line 461 of fsl_enet.c:
However, we are still not actually getting any interrupts, regardless of whether the vectors are registered or not. That said, it seems like the permission problem is related to the ENET hardware being allowed to trigger interrupts. Can this be tied back to MPU permissions? I was able to falsely trigger an ENET interrupt by using
To summarize: I cannot find any other failures in the implementation, but uVisor is either preventing interrupts from being routed to the box that enabled them or is keeping data from going over the wire entirely. I will try to find a managed switch so I can observe network traffic to see if anything is getting out.
ARM Internal Ref: IOTSFW-2879
Confirmed with a managed switch that no Ethernet frames are going over the wire when uVisor is enabled. That said, the ENET interrupts are probably working as expected; we are just not being allowed to properly configure the hardware. Unfortunately, this seems to be a silent failure, because no uVisor faults are observed at run-time. How can we determine where uVisor is blocking ENET configuration when there are no obvious errors? Does uVisor fail silently under certain circumstances? The test case:
Using:
PORTA and PORTB ACLs are required due to the design (pg. 7): some data lines are routed through PORTA and management lines are routed through PORTB.
@sherrellbc : we have a patch for uVisor in the works. We'll release it by the end of this week (pending merge into mbed-os).
I tried making the
Actually, since I already went this far to do a bit of testing I went ahead and gave wide-open privileges for DMA as well:
None of that seemed to work, although this was admittedly a brute-force attempt to just "see what happens" without a lot of background investigation. If I may exploit this opportunity to learn something about the architecture of uVisor: to your point, if unprivileged code cannot access certain regions of memory (DMA, MPU, FB), why are they provided in the form of ACLs? Does uVisor take complete ownership of DMA in its current implementation?
@sherrellbc : for portability and practical reasons, uVisor is agnostic of the security impact of an ACL. uVisor's concept is whitelisting - if you give something access, you must know best whether that's justified. uVisor instead deals with conflicts between ACLs - say, two applications wanting access to the same peripheral. The way uVisor is designed, it's impossible to create an ACL at runtime: each application has to commit to the ACLs it requires at boot time by quoting them in its box configuration. This allows a future update server to refuse to sign firmware images that require access to critical components. Another example of why I believe this behaviour is good: imagine the DMA engine being accessible in only one box, but exposing a DMA service via a secure API that verifies whether source/destination pointers are owned by the caller.
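For readers unfamiliar with the boot-time commitment described above, the shape of a box configuration looks roughly like the following. This is a sketch only: the macro and type names (UvisorBoxAclItem, UVISOR_TACLDEF_PERIPH, box_context_t) are based on my reading of the uvisor-lib headers of that era and should be checked against the actual release; only UVISOR_BOX_CONFIG is named elsewhere in this thread.

```c
/* Sketch of compile-time ACL commitment under uVisor. The box quotes its
 * whitelist once, at boot; it cannot add ACL entries at runtime. The exact
 * macro signatures and flags below are assumptions - verify against the
 * uvisor-lib headers for your release. */
static const UvisorBoxAclItem g_enet_acl[] = {
    {ENET,  sizeof(*ENET),  UVISOR_TACLDEF_PERIPH},   /* ENET peripheral  */
    {PORTA, sizeof(*PORTA), UVISOR_TACLDEF_PERIPH},   /* ENET data lines  */
    {PORTB, sizeof(*PORTB), UVISOR_TACLDEF_PERIPH},   /* management lines */
};

/* box_context_t is a hypothetical per-box context type for illustration. */
UVISOR_BOX_CONFIG(box_enet, g_enet_acl, 1024 /* stack size */, box_context_t);
```

Because the list is static data baked into the image, a signing server can inspect it and refuse images whose boxes request critical peripherals, which is the update-server scenario described above.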
@sherrellbc: I have an early version of my solution merged (#320). This allows ENET DMA transactions to and from box-0 (default box) memories only, so uVisor security is no longer affected. You still need to add an ACL for the ENET peripheral itself. We are waiting for another pull request to be merged into mbed-os - but I will prepare an updated example application for you so you have access earlier.
@sherrellbc: The mbed-os-example-uvisor dev branch now points to my mbed-os fork with the required changes.
@sherrellbc : The only thing that uVisor silently ignores is setting an unowned IRQ vector to NULL. If uVisor does not object, then the NVIC function did succeed.
@sherrellbc : For the time being you are forced to use NVIC_SetVector for the ENET-related IRQs. We are considering adding a fallback for legacy drivers residing in box 0, to avoid the need for these changes.
The interrupts are now firing, but RTX will error at run-time (when an ENET IRQ fires): I forked the mbed-os-example-uvisor here with changes to make testing easier. I removed the LED boxes, since all of the memory allocation between the boxes was causing an error at link time. I also hooked the ENET interrupts to make sure they were getting triggered; although they are triggered now, it is at this point that we exit with the error. If I do not hook the ENET IRQs (i.e. register them directly rather than my local versions
It seems it is originating from LwIP. However, it might just be related to the stack overflow mentioned above. Now, for the strangest part: If I do not hook the ENET IRQs, but I do throw print statements into the ENET_*_IRQHandler routines in
From what I could find, it seems error 0x01 indicates a stack overflow. Is there a way to increase the stack size of the main box? It seems that the UVISOR_BOX_CONFIG macro does not directly apply to box 0. Also, your patch in libraries/ needs to be in features/ instead. As I understand it, the
EDIT: I commented out the failure mode (but left the call to osMutexRelease) from the LwIP
Maybe the stack is actually just too small when using LwIP? I would test, but it's not immediately obvious how to change the main box's stack size, and you caveated your uVisor patch with the fact that it only works with box 0.
It seems the DHCP and ENET hardware is actually working.
@sherrellbc box 0 has the default OS-provided stack - uVisor only deals with configuring stacks for secure boxes. This means that you need to increase the stack size using normal mbed methods (a linker script edit). On a hunch, you can try to increase ISR_STACK_SIZE, too. A note on your example code: from all I can see, the NVIC_SetVector/NVIC_GetVector calls need to happen before instantiating the ethernet interface.
I tried increasing ISR_STACK_SIZE but, alas, it does not fix the problem. Considering that we typically exit with a stack overflow only when the first ENET receive interrupt is triggered, you may be on the right track with your hunch. To your point about the location of the NVIC calls: the EthernetInterface implementation uses a default class constructor, so no initialization happens until the
The observed behavior is either a stack overflow (RTX 0x01), an osMutexRelease error - or both. I also tried doubling the stack size in the linker script as you suggested, but this did not work either. It seems that no stack size, however large, avoids the RTX 0x01 error when uVisor is enabled. What could uVisor be doing that could cause either of these to happen? There are no such issues when uVisor is disabled.
Diving a bit further into the ENET Hal layer, I found that it is relatively simple. When an ENET IRQ is triggered (which is working), the call tree looks like (for Rx): ENET_Transmit_IRQHandler
This final call to
The stack overflow (RTX 0x01) is happening before the ISR can post to the semaphore to wake the thread. I cannot find a location in the code from that call stack that could lead to a stack overflow.
Edit: I noticed the stack overflow errors were only occurring when I included
If
At this point I would move this issue outside of uVisor, but the
I updated my fork (of your fork/branch) to print more run-time information without overflowing the stack. I also included a macro to try with and without uVisor enabled. There seems to be some problem with system mutexes when uVisor is enabled that I cannot track down; see above. If it helps at all, I've noticed that the board seems to (almost) always at least send a DHCP Discover. On occasion there is no DHCP activity and the device hangs. The DHCP server will always respond (DHCP Offer), and it's questionable whether or not the device will send a DHCP Request back. A successful DHCP negotiation (with uVisor enabled):
If the device manages to at least send a DHCP Discover, then the result is always the same:
Also, your patch in libraries/ needs to be in features/ instead.
Manually increasing OS_MAINSTKSIZE did not seem to help. Adjusting the DEFAULT_THREAD_STACKSIZE for the LwIP threads running
With a slight modification to LwIP I was able to enable debug messages. I hooked into the ENET interrupts and prevented them from executing the actual ISRs (i.e. immediate return) to test whether the interrupts had anything to do with the above failure mode. I found that the LwIP code continues to fail even if the ISRs are prevented from running. Further, since I enabled LwIP debugging, I found that the final message printed just before the code stops working tends to be from pbuf_alloc. Might uVisor be interfering with heap allocation and/or data alignment? Getting ENET to work under uVisor is central to a project I am currently working on. Do you have any insight into what may be causing the mutex failure? This thread is getting exceedingly long for a git issue, but uVisor is still breaking ENET in some way.
@sherrellbc One thing that we observed recently is that mallocs might fail due to under-dimensioned heap sizes. These kinds of problems turn into weird follow-up bugs that look like uVisor faults. To identify them, I would put an assertion into malloc that breaks into the debugger whenever malloc would return a null pointer. Also, you must run the debug version of uVisor to get uVisor "blue screens" with detailed information on the cause. See our debug documentation for further information.
Although entirely unrelated to this git issue, I was not able to successfully debug uVisor using the documentation provided. I was able to get this working with slightly modified gdb instructions; no other combination of commands worked. I include this only in case the debugging documentation has not been updated for some time. At any rate, the only error I get is:
OR
depending on whether I first jump through local hooks (to set flags for debugging from within the main thread) that call down to the actual ENET IRQs, or directly register the ENET IRQs as the interrupts, respectively. Could you provide more insight into the implications of the information contained in this dump? In the case of the latter (no local ENET hooks), it seems that a call to vmpu_load_boxes is causing the failure after the ENET interrupts have been enabled within the execution context of the main box? It seems the boxes should already be loaded, considering we are executing code from box 0 well before this HALT error. I am having trouble making sense of this. The actual HALT_ERROR message being printed originates here, which does not, as far as I can tell, have a logical pathway from vmpu_load_boxes as reported in the debug error. If the ENET interrupts are never registered, then the system runs perfectly well.
Apparently, development in other areas of uVisor has had the side effect of allowing ENET to work properly; (#320) alone had presented all of the above effects. A lot has changed since the last cut. Confirmed to work as of 0.25.0:
I will continue to test over the coming weeks, but as of now we are able to successfully DHCP and respond via ICMP.
Hi @sherrellbc and @meriac, I have nearly the same problems as described above in #315 (comment). Here is the exact point where uVisor crashes: fsl_enet.c#L476. After I registered the interrupts by adding
On 23 Sep this issue was closed after it worked properly with #320? I already tried that version, but I get the same problems.
I can confirm no functional issues on the K64F platform with the latest mbed-os/uVisor release. Of course, the workaround for the above was developed and is still in effect; you must declare the appropriate ACLs in the public, or main, box. Attempting to isolate all ENET access to a box other than 0 remains broken. You can still use the ENET hardware from other boxes; you just cannot reserve it as a private resource. The associated HALT error is as follows:
The minimum ACL set to support ENET seems to be:
Of course, you must also manually register the ENET interrupts as shown above.
Has there been any further progress on fixing ENET only working from box 0, mentioned at #315 (comment) above? The issue is still present as of the mbed OS release candidate 5.3.4.

EDIT: I changed the issue's name to more appropriately track the purpose of this thread after relevant information has been uncovered surrounding the original problem.
I have been trying to determine the appropriate uVisor ACL such that I can use the Ethernet hardware on the FRDM K64F development board. After a few hours of debugging and tracing the problem I managed to determine two points of failure when using EthernetInterface.cpp and (a dependency thereof) the LWIP implementation; the root of the corresponding source tree can be found here.
The first problem I found was that the LWIP implementation was trying to disable the MPU, which I thought might be problematic for uVisor.
MPU->CESR &= ~MPU_CESR_VLD_MASK;
https://github.com/ARMmbed/mbed-os/blob/master/features/net/FEATURE_IPV4/lwip-interface/lwip-eth/arch/TARGET_Freescale/hardware_init_MK64F12.c#L41
I have defined the MPU as an enabled memory range in my ACL, but that does not prevent the system from stopping on that instruction.
The second problem was related to a call to EnableIRQ in fsl_enet.c. At the highest level, a call to EthernetInterface.connect() will lead to execution of the following: https://github.com/ARMmbed/mbed-os/blob/master/hal/targets/hal/TARGET_Freescale/TARGET_KSDK2_MCUS/TARGET_MCU_K64F/drivers/fsl_enet.c#L453
As it turns out, EnableIRQ() is an inline function defined in fsl_common.h that wraps an NVIC_EnableIRQ call: https://github.com/ARMmbed/mbed-os/blob/master/hal/targets/hal/TARGET_Freescale/TARGET_KSDK2_MCUS/TARGET_MCU_K64F/drivers/fsl_common.h#L182
I could not trace the NVIC definition down any further, as there were a very large number of them. However, I did check the generated code in a disassembler, and it seems that NVIC_EnableIRQ is being translated to an SVC 3 call. Is there special configuration required for this? Does the default SVC_Handler function not handle this case for enabling interrupts? The behavior I am observing is that a call to this function never returns, so I can only guess (since I do not have a debugger) that uVisor is trapping on this call.
As a side note, I tried using UVISOR_PERMISSIVE for my main ACL, but the problem still persists.
Are there any obvious problems with using NVIC_EnableIRQ or writing to the MPU? Are there additional configuration requirements to support either? If I comment out the offending MPU and NVIC_EnableIRQ lines, then everything executes without issue -- except the Ethernet hardware does not work, of course.