chips: e310: improve interrupt handling #2116

bradjc · 2020-09-18T18:35:17Z

Pull Request Overview

This pull request changes the e310 chip to:

1. Check not only the PLIC but also the MIE register to determine whether interrupts have occurred.
2. Not accidentally disable interrupts when servicing them.

This fixes timer issues on the hifive1b.
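
As a rough illustration of the first point, the sketch below models the pending check on a raw `mip` value. The bit positions (MTIP at bit 7, MEIP at bit 11) come from the RISC-V privileged specification; the function names and the plain `u32` argument are assumptions for illustration and are not Tock's register API. The "PLIC only" variant is only an approximation of the old behaviour.

```rust
/// Bit positions in the RISC-V `mip` CSR (machine-mode privileged spec).
const MIP_MTIP: u32 = 1 << 7; // machine timer interrupt pending
const MIP_MEIP: u32 = 1 << 11; // machine external interrupt pending (PLIC)

/// Hypothetical stand-in for
/// `CSR.mip.matches_any(mip::mext::SET + mip::mtimer::SET)`:
/// true if either the external or the timer bit is set.
fn has_pending_interrupts(mip: u32) -> bool {
    mip & (MIP_MTIP | MIP_MEIP) != 0
}

/// Approximation of the previous check, which only consulted the PLIC
/// and therefore could not see a pending machine timer interrupt.
fn has_pending_plic_only(mip: u32) -> bool {
    mip & MIP_MEIP != 0
}
```

With only the timer bit set, the first function reports pending work while the PLIC-only variant does not, which is consistent with the timer issues seen on the hifive1b.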

Testing Strategy

Running the alarm multi test on hifive1b hardware.

TODO or Help Wanted

n/a

Documentation Updated

Updated the relevant files in /docs, or no updates are required.

Formatting

Ran make prepush.

ppannuto

The interrupt handling change looks good and makes sense. One other question, though.

ppannuto · 2020-09-18T18:47:03Z

chips/e310x/src/chip.rs

    }

    fn has_pending_interrupts(&self) -> bool {
-        unsafe { plic::has_pending() }
+        CSR.mip.matches_any(mip::mext::SET + mip::mtimer::SET)


This change confuses me. Will this match list need to grow as more interrupts are supported?

Specifically, your comment said

Not only look at the plic, but also the MIE register for if interrupts have occurred.

But this change seems to look at MIE instead of, rather than in addition to, the PLIC?

PLIC == mext

This looks good. Can OT have the same change as well?

alistair23 · 2020-09-18T19:40:19Z

chips/e310x/src/chip.rs

+        // Re-enable all MIE interrupts that we care about. Since we looped
+        // until we handled them all, we can re-enable all of them.
+        CSR.mie.modify(mie::mext::SET + mie::mtimer::SET);


This doesn't seem right as we disable the mtimer when we want to disable the alarm: https://github.com/tock/tock/blob/master/arch/rv32i/src/machine_timer.rs#L141

But the e310 platform (hifive) isn't using that SchedulerTimer implementation. I agree they aren't compatible, but we can't have the scheduler timer changing interrupts for the only timer source we have.

If someone adds a RISC-V board that does have other timers, they are going to look at this code (and OT, which is the same) as an example. I don't think we should be enabling timer interrupts if they weren't already enabled AND have code that disables them and expects them to stay disabled.

Ideally, I think it would be best to record that we disabled them while handling an interrupt and then only re-enable them if that's true.

I actually think that is more confusing, since Tock's design is all interrupts are enabled at the NVIC/MIE/PLIC layer, but putting that aside, I don't actually know how to write that code in a way that is both correct and I would be ok with someone else copying.

But this (always re-enabling mext and mtimer) is clearly incorrect -- what if the kernel disabled mtimer (for whatever reason)? When there is only a single interrupt source, the scheduler timer should not be disabling its interrupts; it goes through a virtualizer. The virtualizer may disable underlying interrupts if there are no outstanding alarms. Breaking abstraction boundaries (e.g., who owns mtimer) in random ways due to software engineering issues is bad.

Right now in Tock the kernel cannot dynamically disable the mtimer interrupt (it could choose not to use it at all, and it can disable it when it is also pending, but the design as I understand it is all interrupts at the NVIC/PLIC/CLIC/CLINT/MIE layer are always enabled unless pending).

Even if it is always enabled, blindly force-enabling it still seems wrong. What if someone is trying to debug something and wants the timer disabled? This will force-enable it after it's disabled. There could also be future code changes that allow it to be disabled.

but the design as I understand it is all interrupts at the NVIC/PLIC/CLIC/CLINT/MIE layer are always enabled unless pending).

Nope. This would mean you cannot suppress interrupts and therefore wakeup. Super-bad.

Instead, the current design -- which is fundamentally flawed -- is that you can suppress interrupts from occurring/waking up the MCU but cannot suppress their handling. I.e., #1306.

tock/arch/cortex-m/src/nvic.rs

Line 167 in ad9387a

pub fn disable(&self) {

and

tock/arch/cortex-m/src/nvic.rs

Line 109 in ad9387a

pub unsafe fn disable_all() {

let you disable interrupts in the CortexM, NVIC, for example.

The CortexM semantics are a problem for all kinds of reasons. But it's even more of a problem if the semantics aren't even consistent across architectures!

I agree with @bradjc on this: it is currently how we are treating interrupts. I agree with @phil-levis and @alistair23 that is not necessarily the right way to be dealing with interrupts, but that is the architecture we're using in Tock atm.

And generally, you cannot currently disable interrupts from waking the CPU: nvic::disable is an instance method of a struct that requires unsafe to construct and is handed to nobody and nvic::disable_all is unsafe.

I don't understand why it would be too hard to just record that we disabled the interrupt in interrupt handling and then only re-enable interrupts if that happened. What am I missing here?

alistair23 · 2020-09-18T19:40:40Z

chips/e310x/src/chip.rs

@@ -76,35 +75,32 @@ impl<A: 'static + Alarm<'static>> kernel::Chip for E310x<A> {
    }

    fn service_pending_interrupts(&self) {
-        let mut reenable_intr = FieldValue::<u32, mie::Register>::new(0, 0, 0);


This change also needs to happen to OpenTitan, which uses the same logic.

Let's make that a separate PR?

Fine with me, as long as it doesn't get lost. They are pretty much exactly the same code.

alistair23 · 2020-09-18T19:42:31Z

chips/e310x/src/chip.rs

    }

    fn has_pending_interrupts(&self) -> bool {
-        unsafe { plic::has_pending() }
+        CSR.mip.matches_any(mip::mext::SET + mip::mtimer::SET)


This looks good. Can OT have the same change as well?

hudson-ayers · 2020-09-28T16:15:16Z

This is one of two remaining blockers for 1.6. I propose that we have two options here:

1. Modify this PR to use Alistair's approach of saving and re-enabling interrupts. This is probably a better design, but also important to think about carefully as it has been a source of bugs in the past, and is not necessarily aligned with the spirit of how Tock treats interrupts today.
2. Leave this PR as-is, but delete the lines of the machine timer SchedulerTimer implementation that turn off the machine timer interrupt enable (this is okay, because the scheduler timer trait specifies that arm()/disarm() are optional optimizations). This approach keeps things in line with how the Tock interrupt architecture currently works, and gets us to a release sooner.

I am in favor of option 2, combined with @alistair23 or @phil-levis submitting a separate PR for review after the release that introduces the save/restore approach on all RISC-V platforms, as well as possibly any changes on Cortex-M to allow the kernel to disable interrupts.
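
To make the "optional optimization" argument behind option 2 concrete, here is a minimal sketch with a hypothetical, simplified trait. `SliceTimer`, `PolledTimer`, and `tick()` are invented for illustration and are not Tock's actual `SchedulerTimer` definition. The point is only that if `arm()` and `disarm()` may legally do nothing, a caller that polls for expiry still behaves correctly.

```rust
/// Hypothetical, simplified trait; NOT Tock's actual `SchedulerTimer`.
trait SliceTimer {
    fn start(&mut self, ticks: u32);
    fn tick(&mut self);
    fn has_expired(&self) -> bool;

    /// Optional optimization: enable the timer interrupt.
    /// The default does nothing, so the interrupt enable is never touched.
    fn arm(&mut self) {}

    /// Optional optimization: disable the timer interrupt.
    /// The default does nothing.
    fn disarm(&mut self) {}
}

/// A timer that relies purely on polling `has_expired()`.
struct PolledTimer {
    remaining: u32,
}

impl SliceTimer for PolledTimer {
    fn start(&mut self, ticks: u32) {
        self.remaining = ticks;
    }

    fn tick(&mut self) {
        // Saturating so extra ticks after expiry are harmless.
        self.remaining = self.remaining.saturating_sub(1);
    }

    fn has_expired(&self) -> bool {
        self.remaining == 0
    }
}
```

Under this assumption, removing the bodies of `arm()`/`disarm()` changes when the CPU is woken but not whether expiry is detected.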

bradjc · 2020-09-29T14:57:42Z

I'm writing code that only re-enables interrupts that were serviced, instead of all (both) interrupts unconditionally. I can't say I'm a fan of the code. Here is my function:

fn service_pending_interrupts(&self) {
        let mut serviced_interrupt_mtimer: bool = false;
        let mut serviced_interrupt_mext: bool = false;

        loop {
            let mip = CSR.mip.extract();

            if mip.is_set(mip::mtimer) {
                unsafe {
                    timer::MACHINETIMER.handle_interrupt();
                }
                serviced_interrupt_mtimer = true;
            }
            if mip.is_set(mip::mext) {
                unsafe {
                    Self::handle_plic_interrupts();
                }
                serviced_interrupt_mext = true;
            }

            if !mip.matches_any(mip::mext::SET + mip::mtimer::SET) {
                break;
            }
        }

        // Enable the interrupts that we serviced. Since the interrupts we
        // serviced were pending, we assume they should be re-enabled and are
        // not intended to be disabled for some reason.
        if serviced_interrupt_mtimer {
            CSR.mie.modify(mie::mtimer::SET);
        }
        if serviced_interrupt_mext {
            CSR.mie.modify(mie::mext::SET);
        }
    }

First, this still goes against Tock's current interrupt model where all PLIC/NVIC interrupts and above are always enabled. Second, without a clear description somewhere of what the interrupt model is exactly (if not our current model), this change can lead to some weird behavior if different layers can assume they can disable interrupts. For example, what if the peripheral does the (I would argue reasonable) thing of disabling the interrupt in handle_interrupt() after handling the interrupt? This code will then immediately re-enable it. At least with the current version in this PR there is no pretense that other layers can disable the interrupt.
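
The hazard described in the last paragraph can be reproduced with a small model. The sketch below is an assumption-laden simulation: `Csrs` is a hypothetical struct holding plain `u32` copies of `mip` and `mie` (bit 7 is the machine timer bit in both, per the RISC-V privileged specification), and the handler deliberately disables its own interrupt. It is not Tock code.

```rust
const MIP_MTIP: u32 = 1 << 7; // machine timer interrupt pending
const MIE_MTIE: u32 = 1 << 7; // machine timer interrupt enable

/// Hypothetical model of the two CSRs involved; not Tock's register API.
struct Csrs {
    mip: u32,
    mie: u32,
}

/// A peripheral handler that clears the pending bit and also decides to
/// disable its own interrupt, as in the scenario described above.
fn handle_timer_interrupt(csrs: &mut Csrs) {
    csrs.mip &= !MIP_MTIP;
    csrs.mie &= !MIE_MTIE;
}

/// "Re-enable only what was serviced" policy, reduced to the timer.
fn service_pending_interrupts(csrs: &mut Csrs) {
    let mut serviced_mtimer = false;

    while csrs.mip & MIP_MTIP != 0 {
        handle_timer_interrupt(csrs);
        serviced_mtimer = true;
    }

    // This undoes the handler's decision to leave the interrupt disabled.
    if serviced_mtimer {
        csrs.mie |= MIE_MTIE;
    }
}
```

After one serviced timer interrupt the enable bit is set again even though the handler cleared it, while an idle call leaves `mie` untouched.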

bradjc · 2020-09-29T14:58:33Z

Implementing a new interrupt model for Tock is a bigger change, and one that I don't think this PR should start on.

hudson-ayers · 2020-09-30T16:29:28Z

I submitted #2134 which removes the peripheral code that directly disabled the mtimer interrupt.

2134: riscv mtimer: don't directly modify interrupt enables r=phil-levis a=hudson-ayers

### Pull Request Overview

This pull request removes all direct manipulations of the risc-v mtimer interrupt enable that do not occur from within the top level interrupt handler or a board main file. This is to bring RISC-V chips in line with the *current* Tock interrupt architecture, so that any discussion of allowing peripherals to directly enable/disable interrupts can wait until after the 1.6 release.

Why I think these removals do not affect correctness of the code:

- `SchedulerTimer::arm()` and `SchedulerTimer::disarm()` are optional optimizations
- `set_alarm()` uses the approach recommended in the RISC-V ISA to avoid triggering alarms while setting them, so disabling interrupts while setting the alarm should no longer be required. Enabling interrupts at the end of `set_alarm()` is not required, because with this change timer interrupts should never be disabled.

With this change, #2116 can safely be merged as-is.

### Testing Strategy

This pull request was tested by compiling.

### TODO or Help Wanted

N/A

### Documentation Updated

- [x] No updates are required.

### Formatting

- [x] Ran `make prepush`.

Co-authored-by: Hudson Ayers <hayers@stanford.edu>
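
For readers unfamiliar with the `set_alarm()` point, the sketch below assumes that the "approach recommended in the RISC-V ISA" is the three-store sequence the privileged specification gives for updating a 64-bit `mtimecmp` on a 32-bit hart: write all ones to the low half, write the new high half, then write the new low half. `MTimeCmp` is a hypothetical in-memory model; real code would use volatile MMIO writes, and this is not copied from Tock's implementation.

```rust
/// Hypothetical model of a 64-bit `mtimecmp` exposed as two 32-bit halves.
struct MTimeCmp {
    lo: u32,
    hi: u32,
}

impl MTimeCmp {
    fn value(&self) -> u64 {
        ((self.hi as u64) << 32) | self.lo as u64
    }

    /// Three-store update. Returns the smallest intermediate comparand
    /// seen during the update, purely so the property can be inspected.
    fn set(&mut self, new: u64) -> u64 {
        // 1. Low half to all ones: comparand is no smaller than the old one.
        self.lo = u32::MAX;
        let after_lo = self.value();

        // 2. New high half: comparand is no smaller than the new value,
        //    because the low half is still all ones.
        self.hi = (new >> 32) as u32;
        let after_hi = self.value();

        // 3. New low half: comparand is now exactly `new`.
        self.lo = new as u32;

        after_lo.min(after_hi)
    }
}
```

Because no intermediate comparand drops below both the old and the new value, the update cannot by itself trigger a spurious timer interrupt, which is why masking the interrupt around it is unnecessary under this assumption.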

hudson-ayers · 2020-10-02T16:41:24Z

bors r+

2116: chips: e310: improve interrupt handling r=hudson-ayers a=bradjc

### Pull Request Overview

This pull request changes the e310 chip to:

1. Not only look at the plic, but also the MIE register for if interrupts have occurred.
2. Not accidentally disable interrupts when servicing them.

This fixes timer issues on the hifive1b.

### Testing Strategy

Running the alarm multi test on hifive1b hardware.

### TODO or Help Wanted

n/a

### Documentation Updated

- [x] Updated the relevant files in `/docs`, or no updates are required.

### Formatting

- [x] Ran `make prepush`.

Co-authored-by: Brad Campbell <bradjc5@gmail.com>

Stale review as discussed on the weekly call

alevy · 2020-10-02T20:05:49Z

bors r+

bors · 2020-10-02T20:38:49Z

Build succeeded:

Similar to tock#2116 let's ensure that timer and external interrupts are always enabled. Signed-off-by: Alistair Francis <alistair.francis@wdc.com>

2152: chips: earlgrey: Ensure interrupts are always enabled r=bradjc a=alistair23

### Pull Request Overview

Similar to #2116 let's ensure that timer and external interrupts are always enabled.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>

### Testing Strategy

QEMU

### TODO or Help Wanted

### Documentation Updated

- [X] Updated the relevant files in `/docs`, or no updates are required.

### Formatting

- [X] Ran `make prepush`.

Co-authored-by: Alistair Francis <alistair.francis@wdc.com>

chips: e310: correctly service interrupts

c05a8f1

ppannuto reviewed Sep 18, 2020

ppannuto approved these changes Sep 18, 2020

alistair23 previously requested changes Sep 18, 2020

bradjc mentioned this pull request Sep 21, 2020

Time redesign v3 #2089

Merged

9 tasks

bradjc added the release-blocker Issue or PR that must be resolved before the next release label Sep 25, 2020

hudson-ayers mentioned this pull request Sep 30, 2020

riscv mtimer: don't directly modify interrupt enables #2134

Merged

2 tasks

hudson-ayers approved these changes Sep 30, 2020

alevy approved these changes Oct 2, 2020

bors bot merged commit edc53ee into master Oct 2, 2020

bors bot deleted the hifive1b-interrupt-updates branch October 2, 2020 20:38

alistair23 mentioned this pull request Oct 12, 2020

chips: earlgrey: Ensure interrupts are always enabled #2152

Merged

2 tasks
