Tracking: HIL methods that can fail should have ErrorCodes #1052

ppannuto · 2018-06-30T22:30:32Z

Several of the older HILs just drop errors on the floor, and callers will assume they succeeded when they actually failed.

I seeded this tracking issue with a quick skim of the existing HILs. Some of the items phrased as questions may not actually need changes, and some HILs may be missing. Please edit this comment as appropriate.

GPIO: Some pins don't have interrupt support, so enable_interrupt should fail. Maybe also make_[out/in]put.
I2C: Any bus operations that could fail
RNG: Maybe get can fail?
SPI: Why can read_write_bytes fail, but read_write_byte, write_byte, and read_byte can't?
symmetric_encryption: start_message should return EBUSY rather than dropping the new request
time: Are there cases where the tics value for set_alarm should return EINVAL or ESIZE? Should repeat be able to fail if the interval is too small?
Uart: Add ReturnCodes to UART HIL, change abort policy #1049
USB: At a minimum attach can fail
Watchdog: Should start be able to reject impossible periods?
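
As a minimal sketch of the GPIO item above, here is what a fallible enable_interrupt could look like. The ReturnCode enum is a cut-down stand-in for the kernel type, and the InterruptPin trait, the Pin struct, and its has_interrupt field are all hypothetical names used only for illustration, not the actual HIL.

```rust
/// Simplified stand-in for the kernel's ReturnCode; only the variants
/// needed for this sketch are included.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ReturnCode {
    SUCCESS,
    FAIL,
    ENOSUPPORT,
}

/// Hypothetical slice of a GPIO HIL in which enabling interrupts can fail.
pub trait InterruptPin {
    fn enable_interrupt(&self) -> ReturnCode;
}

/// Hypothetical pin. `has_interrupt` models whether the hardware routes
/// this pin to an interrupt line.
pub struct Pin {
    pub has_interrupt: bool,
}

impl InterruptPin for Pin {
    fn enable_interrupt(&self) -> ReturnCode {
        if self.has_interrupt {
            // A real implementation would program the interrupt controller here.
            ReturnCode::SUCCESS
        } else {
            // Report the failure instead of silently doing nothing.
            ReturnCode::ENOSUPPORT
        }
    }
}
```

With this shape, a caller that asks for interrupts on an unsupported pin sees ENOSUPPORT rather than assuming the call succeeded.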


phil-levis · 2018-07-04T00:00:10Z

Generally speaking, almost every call in a HIL can have a failure condition. E.g., suppose that the operation is performed by a separate chip over a bus. E.g., an external encryption accelerator, a GPIO MUX, etc. This will involve an interaction over the bus to turn on/configure the other chip; what if the external chip does not respond correctly, e.g., because there is a power problem? The interfaces to specific implementations might not have failure conditions, but the HIL needs to represent the union of all possible implementations.

Also, callbacks usually need to have ReturnCodes. If the call could fail, then it's possible the failure isn't detected until later, at which point it has to be signaled through the callback. E.g., suppose a call to symmetric encryption might fail because the data is the wrong length. The symmetric encryption engine is virtualized, so the fact that it's the wrong length isn't detected until after the initial call is made (and the call is forwarded to the underlying engine). If the virtualizer can detect it's the wrong length, this means the virtualizer has to repeat encryption-specific logic that's in the encryption implementation.
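
A rough sketch of the deferred-failure case described above, under stated assumptions: the ReturnCode enum is a cut-down stand-in for the kernel type, and Client, crypt_done, Engine, Virtualizer, and service are hypothetical names, not the real symmetric_encryption HIL. The virtualizer only checks its own queue state; the length check lives in the engine, so the error can only reach the client through the callback.

```rust
use core::cell::Cell;

/// Simplified stand-in for the kernel's ReturnCode.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ReturnCode {
    SUCCESS,
    EBUSY,
    ESIZE,
}

/// Hypothetical client callback that carries a ReturnCode.
pub trait Client {
    fn crypt_done(&self, result: ReturnCode);
}

/// Hypothetical engine. It is the only layer that knows the block size.
pub struct Engine {
    pub block_size: usize,
}

impl Engine {
    fn crypt(&self, len: usize) -> ReturnCode {
        if len % self.block_size == 0 {
            ReturnCode::SUCCESS
        } else {
            ReturnCode::ESIZE
        }
    }
}

/// Hypothetical virtualizer. It queues a request without inspecting it.
pub struct Virtualizer<'a> {
    pub engine: &'a Engine,
    pub client: &'a dyn Client,
    pub pending: Cell<Option<usize>>,
}

impl<'a> Virtualizer<'a> {
    /// Accepts the request. Only queue state is checked here, so a
    /// wrong-length message still returns SUCCESS at this point.
    pub fn start_message(&self, len: usize) -> ReturnCode {
        if self.pending.get().is_some() {
            // Reject the new request instead of dropping it.
            return ReturnCode::EBUSY;
        }
        self.pending.set(Some(len));
        ReturnCode::SUCCESS
    }

    /// Runs later, once the engine is free. The engine's verdict reaches
    /// the client through the callback.
    pub fn service(&self) {
        if let Some(len) = self.pending.take() {
            self.client.crypt_done(self.engine.crypt(len));
        }
    }
}
```

Without a ReturnCode on crypt_done, the ESIZE failure in this sketch would have nowhere to go unless the virtualizer duplicated the engine's length check.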

alevy · 2018-07-04T21:30:26Z

> E.g., suppose that the operation is performed by a separate chip over a bus. E.g., an external encryption accelerator, a GPIO MUX, etc.

Maybe there are other examples of this that don't involve a chip over a bus, but in general, whether a particular HIL supports operations for a bus is explicit (e.g. gpio vs. gpio_async), and I think we don't want to change that.

So, for example, I think it's certainly fine for GPIO#set not to have a ReturnCode.

Even for some cases that are over a bus, like an external encryption accelerator, I'm not entirely sure how a client of the HIL would deal, in a reasonable way, with an error like the peripheral chip not responding correctly. I'm not suggesting one way or another whether the encryption HIL's methods or callbacks should have a ReturnCode, just that I don't think it's that clear cut.

In general, having error cases greatly complicates writing client code, so it should be avoided if it's not necessary or useful.

ppannuto · 2018-07-04T22:25:15Z

I've been thinking about this a little, and I wonder if it's suboptimal for HIL methods to return generic ReturnCodes. There are currently 13 different types of failures in enum ReturnCode, but most HIL interfaces specify (in documentation only) that the HIL can only return 1-3 of these.

I don't have this idea totally fleshed out, but perhaps we could add some form of macro that would look something like:

return_codes![RNG::get,
    /// If SUCCESS is returned then the implementation MUST issue a randomness_available callback sometime in the future.
    SUCCESS,

    /// Indicates that the RNG is not currently powered/available. No callback will be generated.
    EOFF,

    /// Indicates a more general failure condition.
    FAIL,
]

pub trait RNG<'a> {
    fn get(&self) -> RNG::get::ReturnCode;
    ...
}

Where the macro enforces that the return codes supplied are all kernel ReturnCode types, and automatically implements from/to methods as appropriate. Now, however, callers could match over the HIL return and guarantee that they're handling all of the error cases.
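
To make the idea concrete, here is a hand-written sketch of what such a macro might expand to for RNG::get. No such macro exists in the thread beyond the proposal, so everything here is an assumption: ReturnCode is a cut-down stand-in for the kernel enum, and RngGetReturn, should_wait_for_callback, and to_syscall_return are hypothetical names.

```rust
/// Simplified stand-in for the kernel's generic ReturnCode.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ReturnCode {
    SUCCESS,
    FAIL,
    EBUSY,
    EOFF,
}

/// Hypothetical per-method type: only the codes RNG::get may return.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RngGetReturn {
    /// A randomness_available callback will follow.
    SUCCESS,
    /// The RNG is not powered/available. No callback will be generated.
    EOFF,
    /// A more general failure condition.
    FAIL,
}

/// Conversion to the generic type, as the macro would generate it.
impl From<RngGetReturn> for ReturnCode {
    fn from(code: RngGetReturn) -> ReturnCode {
        match code {
            RngGetReturn::SUCCESS => ReturnCode::SUCCESS,
            RngGetReturn::EOFF => ReturnCode::EOFF,
            RngGetReturn::FAIL => ReturnCode::FAIL,
        }
    }
}

/// A caller that handles every case. There is no `_` arm, so adding a
/// variant to RngGetReturn makes this function fail to compile.
pub fn should_wait_for_callback(code: RngGetReturn) -> bool {
    match code {
        RngGetReturn::SUCCESS => true,
        RngGetReturn::EOFF | RngGetReturn::FAIL => false,
    }
}

/// A caller that does not care about the details and just forwards the
/// generic code, e.g. toward a syscall return.
pub fn to_syscall_return(code: RngGetReturn) -> ReturnCode {
    code.into()
}
```

The exhaustive match is where the compiler check comes from; the From impl keeps the pass-through path a one-liner.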

Too complex to be useful / worth the effort, or worthwhile to help ease error handling / ensure correctness?

bradjc · 2018-07-06T00:13:54Z

Handling ReturnCodes is hard, but if there is no mechanism to report errors then writing reliable long-running applications is hard too. So in general I think we do need a way to signal errors, but I agree that we don't in the cases where something can't realistically fail.

> Too complex to be useful / worth the effort, or worthwhile to help ease error handling / ensure correctness?

I'm not sure. I wonder how this could be implemented, because I imagine that in a lot of cases the handler of the HIL error is simply going to pass it to userspace.

ppannuto · 2018-07-06T00:28:14Z

I think the pass-through case is actually the easiest. Since the syscall interface already takes the generic ReturnCode, the macros that would generate the HIL-specific types can implement from/to conversions, allowing a cast from the HIL-specific return type to the generic ReturnCode, which can then be passed off to the syscall response.

I think where this would really see use is where callers of the HIL do want to handle errors: they match over the possible HIL return types, and if the HIL changes they get compiler assistance for cleaning up. If you don't match (or even if you do and you include a _ case) then you won't get help because you aren't really using the types. It's a sort of niche aid, but perhaps a useful one for letting people make guarantees about implementing the interface correctly/completely.

phil-levis · 2018-07-06T18:54:48Z

On Jul 4, 2018, at 3:25 PM, Pat Pannuto wrote:

> There are currently 13 different types of failures in enum ReturnCode, but most HIL interfaces specify (in documentation only) that the HIL can only return 1-3 of these.

This is not uncommon; ReturnCode is like errno in UNIX. Any given HIL may only return a subset, but those subsets are somewhat distinct. ENOACK, for example, is a very specific one for radios with synchronous acknowledgments, while ENODEVICE is for system calls.

In Tock’s case, I think there’s an additional problem: in many HILs we have yet to think through all of the error cases and assign them to ReturnCodes. For example, sdcard.rs uses ERESERVE to indicate the card wasn’t initialized and ENOMEM to indicate there’s no buffer available. The console driver, in contrast, uses ERESERVE to indicate that there’s no buffer available.

On one hand, I think your suggestion of more strongly typing the error codes would be useful: it explicitly specifies in code (rather than documentation) which error codes a HIL can return. If I felt that the HILs were fully nailed down, that might make sense: you can match on the ReturnCode and know you hit them all.

The one challenge here is what you do when you decide there needs to be a new error code (the HIL changes). If the return codes are all explicit, then all code calling the HIL will break. If the return codes are not explicit, then your code will have a default match. This is nice, because code that knows about the new error code can match on it, while code that doesn’t will just treat it like a generic failure. You always need to have a FAIL case.

Phil
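
The default-match behavior described above can be sketched as follows. This is only an illustration: ReturnCode is a cut-down stand-in for the kernel enum, and Action and react are hypothetical names for whatever a caller does with the result.

```rust
/// Simplified stand-in for the kernel's generic ReturnCode.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ReturnCode {
    SUCCESS,
    FAIL,
    EBUSY,
    ENOMEM,
    ERESERVE,
}

/// Hypothetical set of reactions a caller might choose between.
#[derive(Debug, PartialEq)]
pub enum Action {
    Proceed,
    RetryLater,
    GiveUp,
}

/// A caller that matches on the generic type with a default arm.
pub fn react(code: ReturnCode) -> Action {
    match code {
        ReturnCode::SUCCESS => Action::Proceed,
        ReturnCode::EBUSY => Action::RetryLater,
        // FAIL, plus any code this caller does not know about. If the HIL
        // later starts returning a new code, this still compiles and the
        // new code is treated like a generic failure.
        _ => Action::GiveUp,
    }
}
```

The trade-off is the one raised in the thread: the `_` arm keeps existing callers compiling when a code is added, at the cost of the compiler no longer pointing out callers that might want to handle it.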

ppannuto mentioned this issue Jun 30, 2018

cortex-m4: move hardfault handler to arch #1051

Merged

2 tasks

ppannuto mentioned this issue Jul 5, 2018

Tracking & RFC: UART HIL Refinements #1072

Closed

9 tasks

bradjc added the HIL label (This affects a Tock HIL interface.) Jul 6, 2018

alevy mentioned this issue Jul 6, 2018

Specify units for ADC HIL #1032

Merged

2 tasks

ppannuto added the tracking label Jul 6, 2018

bradjc changed the title ~~Tracking: HIL methods that can fail should have ReturnCodes~~ Tracking: HIL methods that can fail should have ErrorCodes Apr 23, 2021
