Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDC Serial write and flush lock up when interrupts are disabled #803

Closed
matthijskooijman opened this issue Nov 28, 2019 · 5 comments
Closed

Comments

@matthijskooijman
Copy link
Contributor

matthijskooijman commented Nov 28, 2019

In the CDC Serial, when the write() method is called and the buffer is full, it blocks until buffer room becomes available. When flush() is called, it blocks until the buffer is empty.

However, when these methods are called with interrupts disabled (or from an ISR with higher priority than the USB interrupt), the ISR that makes room in the buffer will never run, so this waiting blocks indefinitely, locking up the system.

Of course, one can say that serial printing from an ISR or with interrupts disabled is not a good idea, but it often makes sense for debugging (or e.g. for printing an error on a failed assertion). This is also the usecase that triggered this report: I was trying to print a message from the CPU fault handlers to let me know what was going on.

On AVR, this is fixed by detecting that interrupts are disabled, and if so, manually checking the interrupt flag and run the ISR manually when the interrupt flag is set (code). This makes sure that printing and flushing with interrupts disabled works as expected (there is still the related issue of USB CDC code that can get interrupted by an ISR that accesses CDC Serial, leading to inconsistent state, but that is a separate issue and far less likely to occur in practice, so not so problematic for debugging).

It would be good to implement the same thing in the SMT32 CDC serial. One complication is that while on AVR there is a single "interrupts are disabled" flag, on STM32F4 (Cortex M4) you would have to check:

  • Whether an exception is currently running, and if its priority is higher than the USB priority (Can be done using this method.
  • Whether the PRIMASK bit is set (which disables all interrupts)
  • Whether the minimum exception priority specified in BASEPRI is low enough.

This might vary between different cortex cores, e.g. on M0 BASEPRI is not available, there is no VECTACTIVE (but the same value is in IPSR) and PRIMASK is called PM.

An implementation could look like (pseudocode):

while (wait_for_buffer_space) {
  if (IPSR && GetPriority(IPSR) <= GetPriority(USB_IRQn) || (PRIMASK & PM) || BASEPRI <= GetPriority(USB_IRQn)
	run_isr();
}

This might also need to clear the pending bit if that is not automatic. Also, the if might be good to put into a utility function, so it can be used in other places too.

A complication for implementing this is the layers of code that are involved in the CDC Serial code currently (I think there's at least two, probably three layers of USB code?), but maybe this could be added to the lowest layer and called from the upper layers?

One alternative implementation would be to detect this situation and temporarily raise the USB interrupt priority until the loop completes. However, this will not work when the PRIMASK bit is set, when the current interrupt already has priority 0 (or less, for e.g. NMI or hardfault), so this probably is not a viable option.

A completely different behaviour would be to detect that the ISR cannot run and drop bytes instead of waiting, but that significantly changes the behaviour of these functions (and this deviates from the AVR behaviour), so I don't think that's a good idea.

The workaround I'm currently using is to lower the priority of the current ISR to less than the USB ISR priority, so the USB ISR can interrupt. In general, this is a bad idea, since that will also allow other interrupts to preempt.

I suspect that the hardware serial / UART code suffers from exactly the same problem and should probably be fixed in the same way, but have not tested this.

The same problem also happens with the Arduino SAMD core HardwareSerial, see arduino/ArduinoCore-samd#472 (mostly identical to this issue).

@fpistm
Copy link
Member

fpistm commented Nov 28, 2019

Hi @matthijskooijman
Thanks for this nice detailed issue 👍

About:

Of course, one can say that serial printing from an ISR or with interrupts disabled is not a good idea, but it often makes sense for debugging (or e.g. for printing an error on a failed assertion). This is also the usecase that triggered this report: I was trying to print a message from the CPU fault handlers to let me know what was going on.

In that case I advise to use printf() or core_debug() which print on the UART_DEBUG and not dependent of the interrupt. Else use a debugger 😉

@fpistm fpistm added this to To do in STM32 core based on ST HAL via automation Nov 28, 2019
@matthijskooijman
Copy link
Contributor Author

In that case I advise to use printf() or core_debug() which print on the UART_DEBUG and not dependent of the interrupt. Else use a debugger wink

That won't help us, since that prints to the UART and we have nothing attached to that (so we need the USB serial) :-)

Using a debugger is indeed another option, but we would like to have some logging also when our device is in use by a customer.

But the workaround works for use for now, it would just be nice to fix this generally in the core as well :-)

@fpistm
Copy link
Member

fpistm commented Dec 2, 2019

Using a debugger is indeed another option, but we would like to have some logging also when our device is in use by a customer.

Right. Just for my interest in which case you use this core for having customer?

But the workaround works for use for now, it would just be nice to fix this generally in the core as well :-)

Of course. Any help are greatly appreciated to harden the core.

@matthijskooijman
Copy link
Contributor Author

matthijskooijman commented Dec 10, 2019

Right. Just for my interest in which case you use this core for having customer?

I'm working with 3devo to develop an STM32-based control board for one of their new products. We have previously worked with ATmega2560-based boards, but kept running out of RAM, so decided to switch to something a bit more modern :-)

Of course. Any help are greatly appreciated to harden the core.

I'm happy to give a stab at this later if I can find some time, but certainly not until we have a first completely working version...

@fpistm
Copy link
Member

fpistm commented Jul 13, 2021

Added to [USB] Features request list #687

@fpistm fpistm closed this as completed Jul 13, 2021
STM32 core based on ST HAL automation moved this from To do to Done Jul 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

No branches or pull requests

3 participants