"Can this task panic" as part of its external interface? #1716

cbiffle · 2024-04-03T18:46:09Z

We're approaching having some core tasks that can't panic. (...other than idle, that is, which isn't an interesting case.)

We've always had one task that de facto can't panic in a working system: the supervisor. Since nobody will be there to restart it, the supervisor is not allowed to panic. (It currently technically can panic for compiler reasons but I'm working on it.)

We have other core tasks that can't panic in practice, such as stm32xx-sys and packrat.

@hawkw pointed out that it might be interesting to make this fact a property of their external (IPC) interface. This would let us simplify idol code generation and remove some illusory error cases / panics from their clients.

To be comfortable removing the error checking from clients, we'd need some assurance that the task can't panic. The userlib/no-panic feature I'm in the process of adding takes one step in that direction by not compiling in a #[panic_handler]. (Sort of; Rust doesn't actually support that but we're faking it using linker trickery.) However, such a task can still access the sys_panic syscall.

If we have some piece of metadata declaring that the task can't panic, it would be nice to close that hole. At minimum we probably want to detect it attempting to call sys_panic and do something about it, like convert it into a fault so the supervisor will record a crash instead of a panic, maybe? This doesn't help the clients, however, which were previously assured that this task could not panic, and generated code relying on this fact.

We could add some sort of post-build analysis step that looks for use of the syscall, but this would only catch deliberate uses of the syscall; in the event of an exploit, a task could wind up calling the syscall through non-obvious means.

We probably do not want to alter the syscall for such tasks such that it returns an error. What is the task to do with such an error? sys_panic is defined as returning ! in Rust, so, it can't return or we're in UB land.

A scary but effective solution here is: convert a panic in a no-panic task into a system restart. This ensures that no client will hit UB by observing the panic, and ensure that whatever sneaky method for calling the syscall we failed to detect statically isn't allowed to do any further damage. While I have some concerns about this as a fault amplifier, I think it's possibly the right decision....

The text was updated successfully, but these errors were encountered:

hawkw · 2024-04-03T18:55:51Z

Another potential concern about doing something with this is that (in theory) the client-server design of Idol IPCs allows us to decouple clients from specific server implementations (although I'm not sure whether we really do this in practice all that much, with the exception of the gimletlet mock stuff). Making what is arguably an implementation detail part of the interface limits other implementations of that interface: they now also may not panic. In practice, I dunno how big a deal that is.

cbiffle · 2024-04-03T19:57:01Z

Given that getting rustc to generate code without panics is currently a somewhat error-prone and frustrating exercise at times, keeping it as an internal implementation detail gives us the ability to say "well, crap, we need to ship feature X and rustc has decided to stop inlining the Index impl for slice for some reason, turn the allow panics flag back on for now."

So, that's kinda nice, I suppose.

We might also allow a weaker "promises not to panic" setting in the IPC for things like Jefe where, seriously, even if the text contains a panic, it's not going to restart.

cbiffle added the 🤔 design Forward-looking discussions and proposals label Apr 3, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

"Can this task panic" as part of its external interface? #1716

"Can this task panic" as part of its external interface? #1716

cbiffle commented Apr 3, 2024

hawkw commented Apr 3, 2024

cbiffle commented Apr 3, 2024

"Can this task panic" as part of its external interface? #1716

"Can this task panic" as part of its external interface? #1716

Comments

cbiffle commented Apr 3, 2024

hawkw commented Apr 3, 2024

cbiffle commented Apr 3, 2024