uVisor prevents LwIP/Hal from properly using ENET hardware when enabled #315

Open
sherrellbc opened this issue Aug 25, 2016 · 30 comments

@sherrellbc

sherrellbc commented Aug 25, 2016

EDIT: I changed the issue's name to more appropriately track the purpose of this thread after relevant information has been uncovered surrounding the original problem.


I have been trying to determine the appropriate uVisor ACL such that I can use the Ethernet hardware on the FRDM K64F development board. After a few hours of debugging and tracing the problem I managed to determine two points of failure when using EthernetInterface.cpp and (a dependency thereof) the LWIP implementation; the root of the corresponding source tree can be found here.

The first problem I found was that the LWIP implementation was trying to disable the MPU, which I thought might be problematic for uVisor.

MPU->CESR &= ~MPU_CESR_VLD_MASK;

https://github.com/ARMmbed/mbed-os/blob/master/features/net/FEATURE_IPV4/lwip-interface/lwip-eth/arch/TARGET_Freescale/hardware_init_MK64F12.c#L41

I have defined the MPU as an enabled memory range in my ACL, but that does not prevent the system from stopping on that instruction.

const UvisorBoxAclItem g_main_acl[] = {     
         {SIM,    sizeof(*SIM),    UVISOR_TACLDEF_PERIPH}, 
         {OSC,    sizeof(*OSC),    UVISOR_TACLDEF_PERIPH}, 
         {MCG,    sizeof(*MCG),    UVISOR_TACLDEF_PERIPH}, 
         {PORTA,  sizeof(*PORTA),  UVISOR_TACLDEF_PERIPH}, 
         {PORTB,  sizeof(*PORTB),  UVISOR_TACLDEF_PERIPH}, 
         {PORTC,  sizeof(*PORTC),  UVISOR_TACLDEF_PERIPH}, 
         {PORTD,  sizeof(*PORTD),  UVISOR_TACLDEF_PERIPH}, 
         {PORTE,  sizeof(*PORTE),  UVISOR_TACLDEF_PERIPH}, 
         {RTC,    sizeof(*RTC),    UVISOR_TACLDEF_PERIPH}, 
         {LPTMR0, sizeof(*LPTMR0), UVISOR_TACLDEF_PERIPH}, 
         {PIT,    sizeof(*PIT),    UVISOR_TACLDEF_PERIPH}, 
         {SMC,    sizeof(*SMC),    UVISOR_TACLDEF_PERIPH}, 
         {UART0,  sizeof(*UART0),  UVISOR_TACLDEF_PERIPH}, 
         {I2C0,   sizeof(*I2C0),   UVISOR_TACLDEF_PERIPH},
         {SPI0,   sizeof(*SPI0),   UVISOR_TACLDEF_PERIPH},
         {MPU,  sizeof(*MPU), UVISOR_TACLDEF_PERIPH}, 

         /* EthernetInterface */
         {ENET,   sizeof(*ENET),   UVISOR_TACLDEF_PERIPH}, 
         {EWM,    sizeof(*EWM),   UVISOR_TACLDEF_PERIPH}, 
};

The second problem was related to a call to EnableIRQ in fsl_enet.c. At the highest level, a call to EthernetInterface.connect() will lead to execution of the following.

/* Enables Ethernet interrupt and NVIC. */
    ENET_EnableInterrupts(base, config->interrupt);
    if (config->interrupt & (kENET_RxByteInterrupt | kENET_RxFrameInterrupt))
    {
        EnableIRQ(s_enetRxIrqId[instance]);
    }
    if (config->interrupt & (kENET_TxByteInterrupt | kENET_TxFrameInterrupt))
    {
        EnableIRQ(s_enetTxIrqId[instance]);
    }
    if (config->interrupt & (kENET_BabrInterrupt | kENET_BabtInterrupt | kENET_GraceStopInterrupt | kENET_MiiInterrupt |
                             kENET_EBusERInterrupt | kENET_LateCollisionInterrupt | kENET_RetryLimitInterrupt |
                             kENET_UnderrunInterrupt | kENET_PayloadRxInterrupt | kENET_WakeupInterrupt))
    {
        EnableIRQ(s_enetErrIrqId[instance]);
    }

https://github.com/ARMmbed/mbed-os/blob/master/hal/targets/hal/TARGET_Freescale/TARGET_KSDK2_MCUS/TARGET_MCU_K64F/drivers/fsl_enet.c#L453

As it turns out, EnableIRQ() is an inline function defined in fsl_common.h as a link to an NVIC_EnableIRQ call:

static inline void EnableIRQ(IRQn_Type interrupt)
{
#if defined(FSL_FEATURE_SOC_INTMUX_COUNT) && (FSL_FEATURE_SOC_INTMUX_COUNT > 0)
    if (interrupt < FSL_FEATURE_INTMUX_IRQ_START_INDEX)
#endif
    {
        NVIC_EnableIRQ(interrupt);
    }
}

https://github.com/ARMmbed/mbed-os/blob/master/hal/targets/hal/TARGET_Freescale/TARGET_KSDK2_MCUS/TARGET_MCU_K64F/drivers/fsl_common.h#L182

I could not trace the NVIC definition down any further as there were a very large number of them. However, I did check out the generated source code in a disassembler and it seems that NVIC_EnableIRQ is being translated to an SVC 3 call.

Is there a special configuration required for this? Does the default SVC_Handler function not handle this case for enabling interrupts? The behavior I am observing is that a call to this function never returns, so I can only guess (since I do not have a debugger) that uVisor is trapping on this call.


As a side note, I tried using UVISOR_PERMISSIVE for my main ACL, but the problem still persists.

Are there any obvious problems with using NVIC_EnableIRQ or writing to the MPU? Are there additional configuration requirements to support either?

If I comment out the offending MPU and NVIC_EnableIRQ lines then everything executes without issue -- except the Ethernet hardware does not work, of course.

@sherrellbc sherrellbc changed the title uVisor seems to break when using NVIC_EnableIRQ or writing to MPU uVisor breaks when using NVIC_EnableIRQ (supervisor SVC calls) or writing to the MPU Aug 25, 2016
@sherrellbc
Author

sherrellbc commented Aug 26, 2016

I hooked SVC_Handler and confirmed that this function is not returning when called from the NVIC_EnableIRQ proxy. What might be keeping uVisor from returning, at the very least, an error in this case?

@sherrellbc
Author

sherrellbc commented Aug 26, 2016

I think I was able to solve the problem related to NVIC_EnableIRQ, but I am still unable to read the MPU.

A simple program that only reads from the MPU and has the MPU ACL does not work.

@meriac
Contributor

meriac commented Aug 26, 2016

@sherrellbc : Please don't access the MPU from unprivileged code. We need to figure out why lwip insists on disabling the MPU - which is clearly a dirty hack that needs to be solved differently.

@sherrellbc
Author

sherrellbc commented Aug 26, 2016

@meriac,
@geky,

I agree. Is there a way to read or write to this memory without a uVisor fault or is it strictly disallowed in unprivileged mode?

I do not have access to a debugger at the moment, but when I try to even read MPU (or FB) registers I get some sort of error and trap into uVisor, or so I can only assume. Execution will halt on the read instruction. This behavior is observed regardless of whether the MPU ACL is included.

I observed that LWIP will break if the MPU is not disabled. That is, if the line is commented out and uVisor is disabled LWIP will not work. It all works as expected, of course, if LWIP is allowed to disable the MPU.

@meriac
Contributor

meriac commented Aug 26, 2016

@sherrellbc writing to the MPU is out of the question. We can't allow that for security reasons while uVisor is enabled. We need to find out why lwIP believes disabling the MPU is a good idea and fix the underlying problem without touching the MPU configuration from user mode.

@sherrellbc
Author

sherrellbc commented Aug 27, 2016

Perhaps it is related to the necessary permissions for the ENET hardware.

https://github.com/ARMmbed/uvisor/blob/master/core/system/inc/mpu/vmpu_freescale_k64_mem.h#L30

As I am sure you have seen already, I submitted an issue over at mbed since they own the LWIP implementation.

@geky

geky commented Aug 29, 2016

It looks like on reset, the K64F MPU defaults to a single, full-access memory-region covering the entire address space.

That is, except for bus master 3, which is dedicated to the ENET hardware. Whoever implemented the original lwip port must have just disabled the MPU as a quick solution, since as far as I'm aware it has no impact on vanilla mbed.

Granting access to bus master 3 seems to work as an alternative to disabling the MPU completely:

MPU->RGDAAC[0] |= 0x007c0000;
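
As a rough illustration of what that mask covers, the host-side sketch below decodes the per-master fields of a region descriptor access word. It assumes the field layout as I read it in the K64F reference manual for RGDn_WORD2/RGDAACn (for bus masters 0 to 3: three user-mode r/w/x bits, two supervisor-mode bits and one process-identifier bit, six bits per master). The type and function names are made up for this example and are not part of the SDK.

```c
#include <stdint.h>

/* Access rights of one bus master (0..3) in an MPU region descriptor word.
 * Assumed layout (K64F RGDn_WORD2 / RGDAACn), for master m:
 *   bits [6m+2 : 6m]   user-mode rights, r/w/x in bits 2/1/0
 *   bits [6m+4 : 6m+3] supervisor-mode rights (3 = same as user mode)
 *   bit  [6m+5]        process-identifier enable */
typedef struct {
    unsigned user_rwx;   /* 0..7 */
    unsigned super_mode; /* 0..3 */
    unsigned pid_enable; /* 0..1 */
} master_rights_t;

/* Hypothetical helper: extract the fields for bus master m (0..3). */
master_rights_t decode_master(uint32_t word, unsigned m)
{
    uint32_t field = word >> (6u * m);
    master_rights_t rights;
    rights.user_rwx   = field & 0x7u;
    rights.super_mode = (field >> 3) & 0x3u;
    rights.pid_enable = (field >> 5) & 0x1u;
    return rights;
}

/* Hypothetical helper: mask granting master m all user-mode rights,
 * with the supervisor field set to "same as user mode". */
uint32_t full_access_mask(unsigned m)
{
    return (uint32_t)(0x7u | (0x3u << 3)) << (6u * m);
}
```

Under that assumed layout, full_access_mask(3) evaluates to 0x007C0000, i.e. the mask above sets all user-mode rights for bus master 3 and sets its supervisor field to 3 ("same as user mode").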

If we don't want lwip to make system wide changes, disabling the MPU could be moved to the k64f's startup sequence. In this case, does disabling the MPU need to be conditionally defined on the presence of uVisor?

I'm unfamiliar with how uVisor handles DMA, but is there a method to grant access to bus master 3? This will be needed for the ethernet to operate.

k64f mpu datasheet:
https://developer.mbed.org/media/uploads/GregC/k64f_rm_rev2.pdf#d204e3a1310_d819e16

@sherrellbc
Author

sherrellbc commented Aug 30, 2016

This is a fix that I have also been working on with various modes of permissions. I (temporarily) patched the uVisor code to give the above permission and it still does not quite work (with uVisor enabled). It does work, however, when uVisor is disabled.

As far as changing the permission, I think the default background will be modified with the inclusion of ACLs. In particular:

static const UvisorBoxAclItem g_main_box_acls[] = 
{
    {ENET,  sizeof(*ENET),  UVISOR_TACLDEF_PERIPH}
};

If the above ACL is not included then LWIP/Hal will immediately trigger a memory access exception. Still, the below is valid even if the ENET ACL is provided.


The observation:
The ENET IRQs are not being triggered unless the MPU is entirely disabled (or if 0x007c0000 permissions are set in RGDAAC[0] and with uVisor disabled .. as pointed out by @geky), so the implementation is still a bit broken. If the below patches are included then LWIP/Hal just times out rather than faulting.

The (a) reason:
The HAL layer does not register handlers for the ENET interrupts; it just tries to enable them (which will cause a uVisor fault). The default ENET interrupt vectors are set properly in startup_MK64F12.S, but uVisor relocates (and presumably clears) the vector table into SRAM. To verify, I called vIRQ_GetVector() on interrupts 83/84 (ENET Transmit/Receive) and found they were set to 0x00. I have read that it is necessary to register your handlers. As such, I think the fix is to first register them like so:

Put this before line 457 of fsl_enet.c:
NVIC_SetVector(s_enetRxIrqId[instance], (uint32_t) ENET_Receive_IRQHandler);

Put this before line 461 of fsl_enet.c:
NVIC_SetVector(s_enetTxIrqId[instance], (uint32_t) ENET_Transmit_IRQHandler);
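
The ordering constraint described here (a box must own a vector before it may enable the interrupt) can be mimicked on a host with a toy model. Nothing below is uVisor code: the table, the function names and the return codes are invented for illustration, and only the rule itself is taken from this thread.

```c
#include <stddef.h>

#define MODEL_NUM_IRQS 86   /* arbitrary table size for the model */

typedef void (*isr_t)(void);

typedef struct {
    isr_t handler;    /* NULL until a box registers a vector */
    int   owner_box;  /* only meaningful when handler != NULL */
    int   enabled;
} irq_slot_t;

/* Zero-initialised: every IRQ starts unregistered and disabled. */
static irq_slot_t g_irq_table[MODEL_NUM_IRQS];

/* Register handler for irq on behalf of box.
 * Returns 0 on success, -1 on bad arguments or if another box owns the IRQ. */
int model_set_vector(int box, int irq, isr_t handler)
{
    if (irq < 0 || irq >= MODEL_NUM_IRQS || handler == NULL) {
        return -1;
    }
    irq_slot_t *slot = &g_irq_table[irq];
    if (slot->handler != NULL && slot->owner_box != box) {
        return -1;
    }
    slot->handler = handler;
    slot->owner_box = box;
    return 0;
}

/* Enable irq for box. Fails with -1 unless box registered a vector first. */
int model_enable_irq(int box, int irq)
{
    if (irq < 0 || irq >= MODEL_NUM_IRQS) {
        return -1;
    }
    irq_slot_t *slot = &g_irq_table[irq];
    if (slot->handler == NULL || slot->owner_box != box) {
        return -1;
    }
    slot->enabled = 1;
    return 0;
}

/* Stand-in for ENET_Receive_IRQHandler so the model can be exercised. */
void fake_enet_rx_handler(void) {}
```

In the model, enabling IRQ 84 before registering a vector fails, and the same call succeeds once the vector has been set. The real uVisor halts rather than returning an error code.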

However, we are still not actually getting any interrupts regardless if the vectors are registered or not.

Comments
I can confirm that without these patches uVisor will halt with an exception since a box is trying to enable interrupts for which it has not first registered. The interrupts will properly fire if uVisor is disabled and the MPU is disabled (with or without the above patches, but I think they are necessary when uVisor is enabled).

That said, it seems like the permission problem is related to the ENET hardware being allowed to trigger interrupts. Can this be tied back to MPU permissions?

I was able to falsely trigger an ENET interrupt by using vIRQ_SetPendingIRQ, so they appear to now be at least configured properly.

To Summarize

Working:
1) MPU and uVisor disabled
2) MPU->RGDAAC[0] |= 0x007c0000 with uVisor disabled

Partial:
3) uVisor Enabled, ENET ACL included, Above IRQ patch -> No interrupts generated

I cannot find any other failures in the implementation, but uVisor is either preventing interrupts from being routed to the box that enabled them or is keeping data from going over the wire entirely. I will try to find a managed switch so I can observe network traffic to see if anything is getting out.

@sherrellbc sherrellbc changed the title uVisor breaks when using NVIC_EnableIRQ (supervisor SVC calls) or writing to the MPU uVisor prevents LwIP/Hal from properly using ENET hardware when enabled Aug 30, 2016
@ciarmcom
Member

ARM Internal Ref: IOTSFW-2879

@sherrellbc
Author

sherrellbc commented Aug 31, 2016

Confirmed with a managed switch that no Ethernet frames are going over the wire when uVisor is enabled. That said, the ENET interrupts are probably working as expected, we are just not properly being allowed to configure the hardware.

Unfortunately, this seems to be a silent failure because no uVisor faults are observed at run-time. How can we determine where uVisor is blocking ENET configuration when there are no obvious errors? Does uVisor report silently under certain circumstances?


The test case:

#include "mbed.h"
#include "rtos.h"
#include "EthernetInterface.h"
#include "uvisor-lib/uvisor-lib.h"

extern "C" void SVC_Handler(void);
extern "C" void PendSV_Handler(void);
extern "C" void SysTick_Handler(void);

UVISOR_SET_PRIV_SYS_IRQ_HOOKS(SVC_Handler, PendSV_Handler, SysTick_Handler);

/* Main box Access Control List */
static const UvisorBoxAclItem g_main_box_acls[] = { 
    {SIM,   sizeof(*SIM),   UVISOR_TACLDEF_PERIPH},         
    {PIT,   sizeof(*PIT), UVISOR_TACLDEF_PERIPH},       
    {SMC,   sizeof(*SMC),   UVISOR_TACLDEF_PERIPH},     

    /* For messages printed on the serial port */
    {OSC,   sizeof(*OSC),   UVISOR_TACLDEF_PERIPH},    
    {MCG,   sizeof(*MCG),   UVISOR_TACLDEF_PERIPH},
    {UART0, sizeof(*UART0), UVISOR_TACLDEF_PERIPH},  
    {PORTC, sizeof(*PORTC), UVISOR_TACLDEF_PERIPH},   

    /* Ethernet Hardware */
    {ENET,  sizeof(*ENET),  UVISOR_TACLDEF_PERIPH},
    {PORTA, sizeof(*PORTA), UVISOR_TACLDEF_PERIPH},     /* PTA5, 12-17 */
    {PORTB, sizeof(*PORTB), UVISOR_TACLDEF_PERIPH},   /* PTB0, 1 */
};

/* Enable uVisor, using the ACLs we just created. */
UVISOR_SET_MODE_ACL(UVISOR_ENABLED, g_main_box_acls);

int main(){
    EthernetInterface eth;
    printf("\r\n\r\n--------------- New Run ---------------\r\n");

    if(0 == eth.connect())  printf("IP: %s\n", eth.get_ip_address());
    else                    printf("DHCP timed out");
}

Using:
uVisor Enabled, ENET ACL included (Also PORTA/PORTB ACLs), Above IRQ patch

PORTA and PORTB ACLs are required due to the design, pg7. Some data lines are routed through PORTA and management lines are routed through PORTB.

@meriac
Contributor

meriac commented Aug 31, 2016

@sherrellbc : we have a patch for uVisor in the works. We'll release that by the end of this week (pending merge in mbed-os).

@sherrellbc
Author

sherrellbc commented Aug 31, 2016

@meriac

I tried making the MPU->RGDAAC[0] |= 0x007c0000 change in a uVisor K64F MPU header and then rebuilding and manually deploying the generated archives to mbed-os. This change did not seem to work.

#define UVISOR_TACL_BACKGROUND 0x000827D0U | 0x007C0000

Actually, since I already went this far to do a bit of testing I went ahead and gave wide-open privileges for DMA as well:

#define UVISOR_TACL_BACKGROUND 0x000827D0U | 0x007C0000 | 0x0001F000

None of that seemed to work, although this was admittedly a bruteforce attempt to just "see what happens" without a lot of background investigation.


If I may exploit this opportunity to learn something about the architecture of uVisor:

To your point: if unprivileged code cannot access certain regions of memory (DMA, MPU, FB) why are they provided in the form of ACLs? Does uVisor take complete ownership of DMA in its current implementation?

@meriac
Contributor

meriac commented Sep 2, 2016

@sherrellbc : uVisor is, for portability and practical reasons, agnostic of the security impact of an ACL. uVisor's concept is whitelisting - if you give something access, you must know best whether that's justified.

uVisor deals with conflicts between ACLs instead - let's say two applications wanting access to the same peripheral.

The way uVisor is designed, it's impossible to create an ACL during runtime - each application has to commit to the ACLs it requires passively at boot time by quoting them in the box configuration.

This allows a future update server to deny signing firmware images that require access to critical components.

Another example why I believe this behaviour is good: Imagine the DMA engine only being accessible in one box - but exposing DMA-service via a secure API that verifies whether source/destination pointers are owned by the caller.
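
One way to picture such a gatekeeper is a bounds check on both pointers before the transfer is queued. The sketch below is hypothetical throughout: the range type, the example address range and the function names are invented here, and a real service would presumably look up the caller's memory ranges from the box configuration rather than take them as a parameter.

```c
#include <stddef.h>
#include <stdint.h>

/* A contiguous memory range owned by one box. */
typedef struct {
    uintptr_t base;
    size_t    size;
} mem_range_t;

/* Made-up example range: 4 KiB starting at 0x20000000. */
const mem_range_t example_box_ram = { (uintptr_t)0x20000000u, 0x1000u };

/* Returns 1 if [ptr, ptr + len) lies entirely inside owned.
 * The comparison is arranged so that ptr + len is never computed
 * and therefore cannot wrap around. */
int range_is_owned(const mem_range_t *owned, uintptr_t ptr, size_t len)
{
    if (ptr < owned->base) {
        return 0;
    }
    size_t offset = (size_t)(ptr - owned->base);
    if (offset > owned->size) {
        return 0;
    }
    return len <= owned->size - offset;
}

/* Hypothetical gatekeeper: allow a DMA copy only if both the source and
 * the destination buffers belong to the calling box. */
int dma_request_allowed(const mem_range_t *caller,
                        uintptr_t src, uintptr_t dst, size_t len)
{
    return range_is_owned(caller, src, len) &&
           range_is_owned(caller, dst, len);
}
```

The check subtracts before comparing, so a huge length cannot wrap the address and slip past the test.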

@meriac
Contributor

meriac commented Sep 2, 2016

@sherrellbc: I have an early version of my solution merged (#320). This allows ENET DMA transactions just to and from box-0 (default box) memories: Not affecting uVisor security any more. You still need to add an ACL for the ENET peripheral itself.

We are waiting for another pull request to be merged into mbed-os - but I will prepare an updated example application for you so you have access earlier.

@meriac
Contributor

meriac commented Sep 2, 2016

@sherrellbc: The mbed-os-example-uvisor dev branch now points to my mbed-os fork with the required changes.

@meriac
Contributor

meriac commented Sep 2, 2016

@sherrellbc : The only thing that uVisor silently ignores is setting an unowned IRQ vector to NULL. If uVisor does not object, then the NVIC function did succeed.

@meriac
Contributor

meriac commented Sep 2, 2016

@sherrellbc : For the time being you are forced to use NVIC_SetVector for the ENET related IRQs. We are considering adding a fallback for legacy drivers to accommodate making these changes for drivers residing in box 0.

@sherrellbc
Author

sherrellbc commented Sep 2, 2016

@meriac

The interrupts are now firing but RTX will error at run-time (when an ENET IRQ fires):
RTX error code: 0x00000001, task ID: 0x00000000

I forked the mbed-os-example-uvisor here with changes to make testing easier.

I removed the LED boxes; all of the memory allocation between each box was causing an error at link time.

[ERROR] /home/user/.programs/gcc-arm-none-eabi-5_4-2016q2/bin/../lib/gcc/arm-none-eabi/5.4.1/../../../../arm-none-eabi/bin/ld: Warning: alignment 1 of symbol `__uvisor_ps' in ../mbed-os/features/FEATURE_UVISOR/targets/TARGET_UVISOR_SUPPORTED/TARGET_MCU_K64F/TARGET_RELEASE/TARGET_M4/libconfiguration_kinetis_m4_0x1fff0000.a(uvisor-output.o) is smaller than 4 in ../mbed-os/features/FEATURE_UVISOR/targets/TARGET_UVISOR_SUPPORTED/TARGET_MCU_K64F/TARGET_RELEASE/TARGET_M4/libconfiguration_kinetis_m4_0x1fff0000.a(disabled.o)
/home/user/.programs/gcc-arm-none-eabi-5_4-2016q2/bin/../lib/gcc/arm-none-eabi/5.4.1/../../../../arm-none-eabi/bin/ld: region m_data_2 overflowed with stack and heap
collect2: error: ld returned 1 exit status

I also hooked the ENET interrupts to make sure they were getting triggered; although they are getting triggered now, it is at this point that we exit with error.

If I do not hook the ENET IRQs (i.e. register them directly rather than my local versions local_*) then no RTX error is printed as above, but the system still fails with error:

sys_arch_unprotect error

It seems it is originating from LwIP. However, it might just be related to the stack overflow mentioned above.

Now, for the strangest part: If I do not hook the ENET IRQs, but I do throw print statements into the ENET_*_IRQHandler routines in fsl_enet.c then we, again, exit with an RTX error same as before (but no sys_arch_unprotect error):

RTX error code: 0x00000001, task ID: 0x00000000

From what I could find it seems error 0x01 indicates a stack overflow. Is there a way to increase the stack size of the main box? It seems that the UVISOR_BOX_CONFIG macro does not directly apply to box 0.
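
For reference, a small lookup for the code printed in that message. The numeric values are the OS_ERR_* constants as I remember them from RTX's RTX_Conf_CM.c; treat them as an assumption and check them against the copy in your mbed-os tree. The function name is made up.

```c
#include <stdint.h>

/* Values assumed from RTX_Conf_CM.c (CMSIS-RTOS RTX 4.x); verify locally. */
#define OS_ERR_STK_OVF   1u
#define OS_ERR_FIFO_OVF  2u
#define OS_ERR_MBX_OVF   3u
#define OS_ERR_TIMER_OVF 4u

/* Hypothetical helper: map an RTX error code to a short description. */
const char *rtx_error_name(uint32_t code)
{
    switch (code) {
    case OS_ERR_STK_OVF:   return "stack overflow";
    case OS_ERR_FIFO_OVF:  return "ISR FIFO queue overflow";
    case OS_ERR_MBX_OVF:   return "mailbox overflow";
    case OS_ERR_TIMER_OVF: return "user timer callback queue overflow";
    default:               return "unknown";
    }
}
```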


Also, your patch in libraries/ needs to be in features/ instead. As I understand it, the libraries directory is there only for legacy reasons for mbed-os 2; mbed-os 5 ignores it.


EDIT

I commented out the failure mode (but left the call to osMutexRelease) from the LwIP sys_arch_unprotect error and was actually able to get an IP address. This only happened on occasion, though. Most of the time I am met with an RTX error similar to the above, but sometimes it actually works. Perhaps the entire root of the above problem is related to this mutex -- or whatever the underlying problem is.

Maybe the stack is actually just too small when using LwIP? I would test, but it's not immediately obvious how to change the main box's stack size, and you caveated your uVisor patch with the fact that it only works with box 0.

RTX error code: 0x00000001, task ID: 0x2000F7F0

It seems the DHCP and ENET hardware is actually working.

@meriac
Contributor

meriac commented Sep 6, 2016

@sherrellbc box 0 has the default OS-provided stack - uVisor only deals with configuration of stacks for secure boxes. This means that you need to increase the stack size using normal mbed methods (linker script edit). On a hunch you can try to increase the ISR_STACK_SIZE, too.

A note on your example code: From all I can see the NVIC_SetVector/NVIC_GetVector calls need to happen before instantiating the ethernet interface.

@sherrellbc
Author

sherrellbc commented Sep 6, 2016

@meriac

I tried increasing ISR_STACK_SIZE but, alas, it does not fix the problem. Considering that we typically are exiting with a stack overflow only when the first ENET receive interrupt is triggered you may be on the right track with your hunch.

To your point about the location of NVIC: the EthernetInterface implementation uses a default class constructor, so no initialization happens until the connect() method is called. And it simply calls down into LwIP, so the class is just a simple wrapper. I did try your suggestion, though, and it also did not help the problem.

The observed behavior is either a stack overflow (RTX 0x01), an osMutexRelease error -- or both.

I also tried doubling the stack size in the linker script as you suggested, but this did not work either. It seems that no stack size, however large, avoids the RTX 0x01 error when uVisor is enabled.

What could uVisor be doing that could cause either of these to happen? There are no such issues when uVisor is disabled.

@sherrellbc
Author

sherrellbc commented Sep 6, 2016

@meriac

Diving a bit further into the ENET Hal layer I found that it is relatively simple. When an ENET IRQ is triggered (which is working) the call tree looks like (for Rx):

ENET_Transmit_IRQHandler

ENET_Transmit_IRQ_Handler

ethernet_callback

enet_mac_rx_isr

sys_sem_signal

This final call to sys_sem_signal() is on a semaphore that is being sys_arch_sem_waited on by a thread at packet_rx.

The stack overflow (RTX 0x01) is happening before the ISR can post to the semaphore to wake the thread. I cannot find a location in the code from that call-stack that could lead to a stack overflow.


Edit:

I noticed the stack overflow errors were only occurring when I included printf statements. Specifically, I had one in the packet_rx routine mentioned above. Without these statements the code does not seem to exit with RTX 0x01, but rather is afflicted by the failure of osMutexRelease mentioned above.

If the error checking is removed from sys_arch_unprotect (i.e. it is not allowed to fail), then the application code will successfully DHCP and acquire an IP address about 50% of the time. The other 50% it will hang, presumably due to a logic failure that relied on the mutex being mutually exclusive. Interestingly, osMutexRelease will fail with error code 3, which is not defined for type osStatus.

At this point I would move this issue outside of uVisor, but the osMutexRelease only seems to fail when uVisor is enabled. The odd stack overflow issue happens regardless (but only with the print statements included, as mentioned).

@sherrellbc
Author

sherrellbc commented Sep 7, 2016

I updated my fork (of your fork/branch) to print more run-time information without overflowing the stack. I also included a macro to try with and without uVisor enabled. There seems to be some problem with system mutexes when uVisor is enabled that I cannot track down. See above.

If it helps at all, I've noticed that the board seems to (almost) always at least DHCP Discover. On occasion there is no DHCP activity and the device hangs. The DHCP server will always respond (DHCP Offer) and it's questionable whether or not the device will DHCP Request back.

A successful DHCP negotiation (with uVisor enabled):
image

If the device manages to at least DHCP Discover then the result is always the same: sys_arch_unprotect failure.


@meriac,

Also, your patch in libraries/ needs to be in features/ instead.

@sherrellbc
Author

sherrellbc commented Sep 8, 2016

Manually increasing OS_MAINSTKSIZE did not seem to help.

Adjusting the DEFAULT_THREAD_STACKSIZE for the LwIP threads running packet_rx, packet_tx, and k64f_phy_task did not help the osMutexRelease failure, but allows for more diagnostic printf statements to be included. The ENET IRQs are being triggered and are properly waking these threads to handle the RX and TX events.

@sherrellbc
Author

sherrellbc commented Sep 13, 2016

@meriac

With a slight modification to LwIP I was able to enable debug messages. I hooked into the ENET interrupts and prevented them from executing the actual ISRs (i.e. immediate return) to test whether the interrupts had anything to do with the above failure mode. I found that the LwIP code continues to fail even if the ISRs are prevented from running. Further, since I enabled LwIP debugging, I found the final message printed just before the code stops working tends to be from pbuf_alloc. Might uVisor be interfering with heap allocation and/or data alignment?

Getting ENET to work under uVisor is central to a project I am currently working on. Do you have any insight into what may be causing the mutex failure? This thread is getting exceedingly long for a git issue, but uVisor is still breaking ENET in some way.

@meriac
Contributor

meriac commented Sep 19, 2016

@sherrellbc One thing that we observed recently is that mallocs might fail due to under-dimensioned heap sizes. These kinds of problems turn into weird follow-up bugs that look like uVisor faults. To identify them I would put an assertion into malloc that breaks into the debugger whenever malloc would return a null pointer.
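
A minimal sketch of that idea, assuming a wrapper around the allocator rather than a patch to the C library. The names checked_alloc, record_failure and always_fail are invented here. On the target the hook would typically stop in the debugger (for example with the CMSIS __BKPT intrinsic) instead of recording the size, and the wrapper could be interposed with the GNU linker's --wrap=malloc option. Both of those are suggestions, not something taken from the uVisor documentation.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *(*alloc_fn_t)(size_t size);
typedef void (*alloc_fail_hook_t)(size_t requested);

/* Call alloc(size) and invoke hook if a non-empty request returns NULL.
 * The allocator is a parameter so that a failing one can be injected
 * when testing on a host. */
void *checked_alloc(alloc_fn_t alloc, size_t size, alloc_fail_hook_t hook)
{
    void *p = alloc(size);
    if (p == NULL && size != 0 && hook != NULL) {
        hook(size);   /* on target: break into the debugger here */
    }
    return p;
}

/* Demo pieces: a hook that records the failed request and an allocator
 * that always fails, standing in for an exhausted heap. */
size_t g_last_failed_size = 0;

void record_failure(size_t requested)
{
    g_last_failed_size = requested;
}

void *always_fail(size_t size)
{
    (void)size;
    return NULL;
}
```

Calling checked_alloc(always_fail, 64, record_failure) returns NULL and leaves 64 in g_last_failed_size, while checked_alloc(malloc, 16, record_failure) behaves like a plain malloc as long as the host heap has room.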

Also you must run the debug version of uVisor to get uVisor "blue screens" with detailed information on the cause. See our debug documentation for further information.

@sherrellbc
Author

sherrellbc commented Sep 22, 2016

@meriac

Although unrelated to this git issue entirely, I was not able to successfully debug uVisor using the documentation provided. I was able to get this working with slightly modified gdb instructions; no other combination of commands worked. I include this only in case this debugging documentation has not been updated for some time.

file ./build/${target}/source/${your_app}.elf
target remote localhost:2331
monitor reset
monitor semihosting enable
load

At any rate, the only error I get is:

... [cut] ...
visor initialized
IRQ 51 registered to box 0
IRQ 51 enabled
IRQ 83 registered to box 0
IRQ 84 registered to box 0
IRQ 84 enabled
IRQ 83 enabled


                BUS FAULT

  • CFSR : 0x00008200
  • BFAR : 0x009B0394
  • MPU FAULT:
    Slave port: 0
    Address: 0x000B0394
    Faulting regions:
    R00: 0x00000000 0xFFFFFFFF 0x000007D0 0x00000001
    Master port: 0
    Error attribute: Data WRITE (user mode)
  • MEMORY MAP
    Address: 0x009B0394
    Region/Peripheral: [not available]
    Base address: 0x009B0394
    End address: 0x009B0394
  • EXCEPTION STACK FRAME
    Exception from unprivileged code
    psp: 0x20004C60
    lr: 0xFFFFFFFD
    Exception stack frame:
    psp[07]: 0x21000000 | xPSR
    psp[06]: 0x0002071C | pc
    psp[05]: 0x00020839 | lr
    psp[04]: 0x00001D7C | r12
    psp[03]: 0x009B0390 | r3
    psp[02]: 0x00000001 | r2
    psp[01]: 0x00000000 | r1
    psp[00]: 0x009B0390 | r0
  • MPU CONFIGURATION
    CESR: 0x80815101
    Slave 0 Slave 1 Slave 2 Slave 3 Slave 4
    EAR: 0x000B0394 0x00000000 0x00000000 0x00000000 0x00000000
    EDR: 0x80000003 0x00000000 0x00000000 0x00000000 0x00000000
    Start End Perm. Valid
    R00: 0x00000000 0xFFFFFFFF 0x000007D0 0x00000001
    R01: 0x00000000 0x000334DF 0x0010000D 0x00000001
    R02: 0x20000000 0x2003001F 0x00180017 0x00000001
    R03: 0x00000000 0x0000001F 0x00000000 0x00000000
    R04: 0x00000000 0x0000001F 0x00000000 0x00000000
    R05: 0x00000000 0x0000001F 0x00000000 0x00000000
    R06: 0x00000000 0x0000001F 0x00000000 0x00000000
    R07: 0x00000000 0x0000001F 0x00000000 0x00000000
    R08: 0x00000000 0x0000001F 0x00000000 0x00000000
    R09: 0x00000000 0x0000001F 0x00000000 0x00000000
    R10: 0x00000000 0x0000001F 0x00000000 0x00000000
    R11: 0x00000000 0x0000001F 0x00000000 0x00000000

HALT_ERROR(./core/system/src/mpu/vmpu_freescale_k64.c#126): Access to restricted resource denied

OR

...[cut]...
IRQ 84 registered to box 0
IRQ 84 enabled
IRQ 83 enabled
HALT_ERROR(./core/system/src/mpu/vmpu.c#350): This is not the PC (0x61000000) your were searching for

Which of the two I get depends on whether I first jump through local hooks (to set flags for debugging from within the main thread) that call down to the actual ENET IRQ handlers, or directly register the ENET IRQ handlers as the interrupt vectors, respectively.

void local_ENET_Transmit_IRQHandler(void){
    g_enet_flag_byte |= ENET_TX_FLAG;
    return ENET_Transmit_IRQHandler();
}

void local_ENET_Receive_IRQHandler(void){
    g_enet_flag_byte |= ENET_RX_FLAG;
    return ENET_Receive_IRQHandler();
}

//NVIC_SetVector((IRQn_Type) 83, (uint32_t) local_ENET_Transmit_IRQHandler);
//NVIC_SetVector((IRQn_Type) 84, (uint32_t) local_ENET_Receive_IRQHandler);
NVIC_SetVector((IRQn_Type) 83, (uint32_t) ENET_Transmit_IRQHandler);
NVIC_SetVector((IRQn_Type) 84, (uint32_t) ENET_Receive_IRQHandler);

Could you provide more insight into the implications of the information contained in this dump? In the case of the latter (no local ENET hooks) it seems that a call to vmpu_load_boxes is causing the failure after the ENET interrupts have been enabled within the execution context of the main box? It seems the boxes should already be loaded considering we are executing code from box 0 well before this HALT error. I am having trouble making sense of this. The actual HALT_ERROR message that is being printed has origins here, which does not, as far as I can tell, have a logical pathway from vmpu_load_boxes as reported in the debug error.
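
Some of the fields in the first dump can be decoded from the architecture alone. The helpers below use the ARMv7-M bit positions for the bus-fault part of CFSR and for EXC_RETURN; the function names are illustrative, and the sketch says nothing about uVisor's own fault handling.

```c
#include <stdint.h>

/* ARMv7-M CFSR: the BusFault status byte (BFSR) occupies bits [15:8]. */
#define CFSR_IBUSERR     (1u << 8)   /* instruction bus error */
#define CFSR_PRECISERR   (1u << 9)   /* precise data bus error */
#define CFSR_IMPRECISERR (1u << 10)  /* imprecise data bus error */
#define CFSR_UNSTKERR    (1u << 11)  /* fault on exception return unstacking */
#define CFSR_STKERR      (1u << 12)  /* fault on exception entry stacking */
#define CFSR_BFARVALID   (1u << 15)  /* BFAR holds the faulting address */

/* 1 if the fault is a precise data bus error. */
int cfsr_is_precise_bus_fault(uint32_t cfsr)
{
    return (cfsr & CFSR_PRECISERR) != 0;
}

/* 1 if BFAR can be trusted as the faulting address. */
int cfsr_bfar_valid(uint32_t cfsr)
{
    return (cfsr & CFSR_BFARVALID) != 0;
}

/* EXC_RETURN (the lr value on exception entry):
 *   bit 3: 1 = return to Thread mode, 0 = return to Handler mode
 *   bit 2: 1 = frame is on the process stack (PSP), 0 = main stack (MSP) */
int exc_return_from_thread(uint32_t lr)
{
    return (int)((lr >> 3) & 1u);
}

int exc_return_uses_psp(uint32_t lr)
{
    return (int)((lr >> 2) & 1u);
}
```

Applied to the values above, CFSR 0x00008200 has PRECISERR and BFARVALID set, so BFAR (0x009B0394) is the address of the faulting access, and lr 0xFFFFFFFD means the exception was taken from Thread mode on the process stack. The stacked r0 and r3 both hold 0x009B0390, four bytes below BFAR, which would be consistent with a store through a bad pointer, though the dump alone does not prove that.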

If the ENET interrupts are never registered then the system runs perfectly well.

core/system/src/mpu/vmpu.c#350

core/system/src/mpu/vmpu.c#126

@sherrellbc
Author

sherrellbc commented Sep 23, 2016

@meriac,

Apparently, development in other areas of uVisor has had the side effect of allowing ENET to work properly; (#320) alone presented all of the above effects. A lot has changed since the last cut. Confirmed to work as of 0.25.0:

ARMmbed/mbed-os@2a42255

I will continue to test over the coming weeks, but as of now we are able to successfully DHCP and respond via ICMP.

@Iceberg1988

Hi @sherrellbc and @meriac,

I have nearly the same problems as described above #315 (comment)
I can compile and run code which contains a TCPSocket and the EthernetInterface without active uVisor. I get the connection and can communicate with the TCP socket. But with active uVisor I get the following error message at runtime:
HALT_ERROR(./core/system/src/unvic.c#178): IRQ 84 is unregistered; state cannot be changed

Here is the exact point where uVisor crashes: fsl_enet.c#L476
mbed-os is at master: ARMmbed/mbed-os@aeabcc9 with uVisor Version v0.26.1

After I registered the interrupts by adding
NVIC_SetVector(s_enetRxIrqId[instance], ENET_Receive_IRQHandler);
in fsl_enet.c#L476 and
NVIC_SetVector(s_enetTxIrqId[instance], ENET_Transmit_IRQHandler);
fsl_enet.c#L480 I added the prototypes by inserting
void ENET_Transmit_IRQHandler(void);
void ENET_Receive_IRQHandler(void);
in fsl_enet.c#L222
Then uVisor no longer throws an error, but the Ethernet interface doesn't get an IP address, as described above.

On Sep 23 this issue was closed after it worked properly with #320? I already tried it with this version, but I get the same problems.
Can somebody give me a link to a working project with Ethernet and uVisor? Maybe there is some problem with my code? Or is this still an issue with uVisor and mbed?

@meriac meriac reopened this Nov 28, 2016
@meriac meriac added the issue label Nov 28, 2016
@sherrellbc
Author

sherrellbc commented Jan 19, 2017

@Iceberg1988, @meriac

I can confirm no functional issues on the K64F platform with the latest mbed-os/uVisor release. Of course, a workaround for the above was developed and is still in effect; you must declare the appropriate ACLs in the public, or main, box. Attempting to isolate all ENET access to a box other than 0 remains broken. You can still use the ENET hardware from other boxes, you just cannot attempt to reserve it as a private resource. The associated HALT error is as follows:

***********************************************************
                    HARD FAULT
***********************************************************

* FAULT SYNDROME REGISTERS

  HFSR: 0x40000000
  --> FORCED: another priority escalated to hard fault.

* EXCEPTION STACK FRAME
  Exception from privileged code
    msp:     0x1FFF1EF8
    lr:      0xFFFFFFF1
  Exception stack frame:
    msp[07]: 0x0100000F | xPSR
    msp[06]: 0x00001176 | pc
    msp[05]: 0x00026507 | lr
    msp[04]: 0xFFFFFFFF | r12
    msp[03]: 0x00001175 | r3
    msp[02]: 0x00000033 | r2
    msp[01]: 0x0001ED9D | r1
    msp[00]: 0x00000033 | r0

* MPU CONFIGURATION
  CESR: 0x00815101
       Slave 0    Slave 1    Slave 2    Slave 3    Slave 4
  EAR: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
  EDR: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
       Start      End        Perm.      Valid
  R00: 0x00000000 0xFFFFFFFF 0x000007D0 0x00000001
  R01: 0x00000000 0x0003307F 0x0010000D 0x00000001
  R02: 0x20000000 0x2003001F 0x00180017 0x00000001
  R03: 0x1FFF2080 0x1FFF249F 0x0000001E 0x00000000
  R04: 0x1FFF2500 0x1FFF4A9F 0x0000001E 0x00000000
  R05: 0x00000000 0x0000001F 0x00000000 0x00000000
  R06: 0x00000000 0x0000001F 0x00000000 0x00000000
  R07: 0x00000000 0x0000001F 0x00000000 0x00000000
  R08: 0x00000000 0x0000001F 0x00000000 0x00000000
  R09: 0x00000000 0x0000001F 0x00000000 0x00000000
  R10: 0x00000000 0x0000001F 0x00000000 0x00000000
  R11: 0x00000000 0x0000001F 0x00000000 0x00000000

***********************************************************

HALT_ERROR(./core/vmpu/src/kinetis/vmpu_kinetis.c#155): Cannot recover from a hard fault.

The minimum ACL set to support ENET seems to be:

static const UvisorBoxAclItem g_main_box_acls[] = { 
    /* System requirements for timing, monitor calls, etc */
    {SIM,   sizeof(*SIM),   UVISOR_TACLDEF_PERIPH},
    {PIT,   sizeof(*PIT),   UVISOR_TACLDEF_PERIPH},    
    {OSC,   sizeof(*OSC),   UVISOR_TACLDEF_PERIPH},   
    {MCG,   sizeof(*MCG),   UVISOR_TACLDEF_PERIPH},

    /* Ethernet Hardware */
    {ENET,  sizeof(*ENET),  UVISOR_TACLDEF_PERIPH},
    {PORTA, sizeof(*PORTA), UVISOR_TACLDEF_PERIPH}, 
    {PORTB, sizeof(*PORTB), UVISOR_TACLDEF_PERIPH},
    {PORTC, sizeof(*PORTC), UVISOR_TACLDEF_PERIPH},
};

Of course, you must also manually register the ENET interrupts as shown above.

@sherrellbc
Author

sherrellbc commented Feb 9, 2017

@meriac

Has there been any further progress to fix ENET only working from box 0 mentioned at #315 (comment) above? The issue is still present as of the mbedos release candidate 5.3.4.
