dealing with spurious GPIO notifications #1700

hawkw · 2024-04-01T19:35:53Z

While I was working on #1696, @cbiffle pointed out to me that a task that receives a notification mapped to an EXTI GPIO interrupt cannot rely on just the notification bit to determine whether a GPIO pin has actually changed state, because any task can post any notification to any other task. This means that a buggy (or malicious, but what is malicious if not 'buggy, but on purpose?) task may post us the bit corresponding to the EXTI notification by mistake, and the task can't determine whether that notification actually came from the stm32xx-sys task. Thus, we must still check what state the GPIO pin is actually in if we want to be resilient against spurious notifications.

I can think of a few ways to deal with this:

Make this the user's responsibility --- the "null solution". User code is responsible for reading a GPIO pin on receipt of an EXTI interrupt. If we do this, we should probably update documentation and examples so that anyone handling GPIO IRQ notifications knows they should check the pin's state.
Consider a higher-level API in drv-stm32xx-sys-api. We could also consider adding a higher level wait_for_gpio_pin_state type API in drv-stm32xx-sys-api. This would be implemented on top of the lower-level gpio_irq_control IPCs, and would take a notification mask and GPIO pin set, and provide an API for waiting for that pin to change state, actually check the pin's level, and continue waiting if it hasn't actually transitioned. We could also maybe have a type for doing this, where constructing that type does the appropriate Sys::gpio_configure_input and Sys::gpio_irq_configure calls before returning the value that allows actually waiting for the pin state change.

One potential downside of this approach, though, is that it doesn't compose nicely with code that waits for multiple notifications. For example, in stm32xx-sprot-server, we have code that either waits for an EXTI interrupt or a timeout, which couldn't use an API that just waits for a GPIO interrupt. We could potentially also add a variant of a "wait for pin state" function that also takes a timeout...but this still doesn't compose super well in cases where we want to, for example, wait for any of a set of GPIO notifications...
Have drv-stm32xx-sys read a pin's state before dispatching an interrupt. We could also have the sys task check whether a pin's level has actually transitioned before dispatching a notification to a user task. However, this only guards against one of the two potential failure modes: it handles the case where a task incorrectly posts the exti-wildcard-irq notification to sys, but it does not handle the case where a task incorrectly posts the notification mapped to an EXTI interrupt to the end-user task. So, user tasks still have to look at pin states (since they might be notified by a task other than sys), and we would now be doing the check twice. I'm not sure whether this is actually all that useful.

There are also potential issues around whether a GPIO pin's state transition triggers an IRQ and then it goes back to the previous state between when the sys task is notified and when the end-user task is notified. This is more of a problem for IRQs raised by stuff like buttons than for IRQs raised by a different MCU or similar device, where the IRQ line is often used as a "handshake" that only gets unset when the device receiving the IRQ line actually performs some operation. I'm not sure whether we should be concerned about this case or not...

The text was updated successfully, but these errors were encountered:

cbiffle · 2024-04-01T19:43:19Z

Initial brief thoughts:

On (3), sys already checks for the pending flags in EXTI before dispatching any interrupts, so if you post its irq notification and nothing has happened, it just goes back to sleep. We don't want it to check the pin state before distributing notifications because, as-is, we're relying on the edge detectors in EXTI to catch potentially fleeting edges. If the net fell but then popped back up before sys got scheduled, we'd still like to detect the falling edge.

I'm concerned about the round-trip overhead of (2) and the potential for missing fleeting edges as in (3) (it's slightly worse than (3) in that regard).

Here are two more possibilities for consideration.

(4) add MAC for notification bits. This is a thing we've talked about forever, but haven't done. This would be a kernel change and would have design implications.

(5) add an API to sys that works more like net, where it pokes the task that owns the pin, and the task calls back to retrieve the edge information. That interaction can't be forged or spuriously triggered, because if someone posts to the client task, it'll call sys and sys will say "nada." This is higher-overhead than just responding to the notification but is more obviously correct, I think.

hawkw · 2024-04-01T19:57:11Z

Yeah, I'm also concerned about the potential for missing fleeting edges as I mentioned in the last paragraph; you make a good point that sys already knows whether the EXTI flag is pending, and needn't actually look at the pin.

I'm concerned about the round-trip overhead of (2) and the potential for missing fleeting edges as in (3) (it's slightly worse than (3) in that regard).

In this case, I don't actually think there's additional round-trip overhead --- the new API would be entirely in the client crate, rather than something you have to send another IPC to sys to do. But, you're right that it has the potential to miss fleeting edges and we should probably be pursuing a solution that doesn't have that problem, regardless.

Regarding your proposals, I think (5) seems like the most tactical solution --- I could wire that up this afternoon. The overhead of doing an additional IPC roundtrip is a shame, though. I think (4) is in many ways the Right Thing, but it would require substantially more design work.

An additional thought is that an IPC that asks sys to please ack that a GPIO IRQ is "real", as in (5), could also include an "and also, re-enable my IRQ" flag. That way, when a task wants to receive another notification for that pin, it can elide a subsequent sys.gpio_irq_control call to enable it again, so we're not adding another round trip in that case, at least.

cbiffle · 2024-04-02T17:25:09Z

An additional thought is that an IPC that asks sys to please ack that a GPIO IRQ is "real", as in (5), could also include an "and also, re-enable my IRQ" flag.

That'd be great -- that would actually keep our current IPC round trip count in practice, since clients tend to immediately ask to turn their IPCs back on.

@cbiffle

Currently, there is no mechanism for a task woken by a notification mapped to an EXTI GPIO interrupt to determine whether the pin it cares about has actually changed state, rather than receiving a notification posted by some other task (see #1700). In order to make this possible, @cbiffle [suggested] that the `stm32xx-sys` task provide an IPC interface that a task receiving an EXTI notification can call to ack that the IPC actually was caused by a pin state transition. This branch extends the `Sys::gpio_irq_control` IPC to return a bool if any EXTI IRQs mapped to the notifications in a task's notification maskhave triggered since the last time `gpio_irq_control` was called. Adding this functionality to `gpio_irq_control` --- which a task that wishes to receive another EXTI notification must call anyway --- allows us to keep the number of IPC roundtrips the same when a task receives such notifications in a loop. In order to support other cases, where a task only cares about one such notification, the `gpio_irq_control` interface has been changed to determine whether to enable or disable the IRQs in the mask based on a second argument, rather than taking two masks as it did previously, in order to allow a third operation which *only* checks the state of the interrupt, leaving it enabled if it has not been triggered but not re-enabling it if it has. We no longer have the ability to both enable and disable disjunct sets of IRQs in one IPC, but in practice, no task we have currently even *uses* more than one GPIO IRQ anyway. And, getting rid of the double-mask interface also fixes #1714. I've implemented this by tracking a 16-bit bitmap in the `Sys` task that mirrors the EXTI pending register. This is because `sys` must clear the pending bit in order to receive another interrupt from _any_ EXTI source. Closes #1700 Closes #1714 [suggested]: #1700 (comment)

@cbiffle

Currently, there is no mechanism for a task woken by a notification mapped to an EXTI GPIO interrupt to determine whether the pin it cares about has actually changed state, rather than receiving a notification posted by some other task (see #1700). In order to make this possible, @cbiffle [suggested] that the `stm32xx-sys` task provide an IPC interface that a task receiving an EXTI notification can call to ack that the IPC actually was caused by a pin state transition. This branch extends the `Sys::gpio_irq_control` IPC to return a bool if any EXTI IRQs mapped to the notifications in a task's notification maskhave triggered since the last time `gpio_irq_control` was called. Adding this functionality to `gpio_irq_control` --- which a task that wishes to receive another EXTI notification must call anyway --- allows us to keep the number of IPC roundtrips the same when a task receives such notifications in a loop. In order to support other cases, where a task only cares about one such notification, the `gpio_irq_control` interface has been changed to determine whether to enable or disable the IRQs in the mask based on a second argument, rather than taking two masks as it did previously, in order to allow a third operation which *only* checks the state of the interrupt, leaving it enabled if it has not been triggered but not re-enabling it if it has. We no longer have the ability to both enable and disable disjunct sets of IRQs in one IPC, but in practice, no task we have currently even *uses* more than one GPIO IRQ anyway. And, getting rid of the double-mask interface also fixes #1714. I've implemented this by tracking a 16-bit bitmap in the `Sys` task that mirrors the EXTI pending register. This is because `sys` must clear the pending bit in order to receive another interrupt from _any_ EXTI source. Closes #1700 Closes #1714 [suggested]: #1700 (comment)

hawkw added 🤔 design Forward-looking discussions and proposals stm32 Has specific implications for STM32 processors labels Apr 1, 2024

hawkw mentioned this issue Apr 2, 2024

stm32xx-sys: add interrupt acking to gpio_irq_control #1712

Merged

hawkw closed this as completed in #1712 Apr 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dealing with spurious GPIO notifications #1700

dealing with spurious GPIO notifications #1700

hawkw commented Apr 1, 2024

cbiffle commented Apr 1, 2024

hawkw commented Apr 1, 2024

cbiffle commented Apr 2, 2024

dealing with spurious GPIO notifications #1700

dealing with spurious GPIO notifications #1700

Comments

hawkw commented Apr 1, 2024

cbiffle commented Apr 1, 2024

hawkw commented Apr 1, 2024

cbiffle commented Apr 2, 2024