Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chips: e310: improve interrupt handling #2116

Merged
merged 1 commit into from Oct 2, 2020
Merged

Conversation

bradjc
Copy link
Contributor

@bradjc bradjc commented Sep 18, 2020

Pull Request Overview

This pull request changes the e310 chip to:

  1. Not only look at the plic, but also the MIE register for if interrupts have occurred.
  2. Not accidentally disable interrupts when servicing them.

This fixes timer issues on the hifive1b.

Testing Strategy

Running the alarm multi test on hifive1b hardware.

TODO or Help Wanted

n/a

Documentation Updated

  • Updated the relevant files in /docs, or no updates are required.

Formatting

  • Ran make prepush.

Copy link
Member

@ppannuto ppannuto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interrupt handling change looks good and makes sense, one other question though

}

fn has_pending_interrupts(&self) -> bool {
unsafe { plic::has_pending() }
CSR.mip.matches_any(mip::mext::SET + mip::mtimer::SET)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change confuses me, will this match list need to grow as more interrupts are supported?

Copy link
Member

@ppannuto ppannuto Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, your comment said

Not only look at the plic, but also the MIE register for if interrupts have occurred.

But this change seems to look at MIE instead of rather than in addition to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PLIC == mext

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Can OT have the same change as well?

Comment on lines +97 to +99
// Re-enable all MIE interrupts that we care about. Since we looped
// until we handled them all, we can re-enable all of them.
CSR.mie.modify(mie::mext::SET + mie::mtimer::SET);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right as we disable the mtimer when we want to disable the alarm: https://github.com/tock/tock/blob/master/arch/rv32i/src/machine_timer.rs#L141

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the e310 platform (hifive) isn't using that SchedulerTimer implementation. I agree they aren't compatible, but we can't have the scheduler timer changing interrupts for the only timer source we have.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone adds a RISC-V board that does have other timers they are going to look at this code (and OT which is the same) as an example. I don't think we should be enabling timer interrupts if they weren't already enabled AND have code that disables them and expects them to stay disabled..

Ideally I think it would be best to record that we disabled them handling an interrupt and then only enable them if that's true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think that is more confusing, since Tock's design is all interrupts are enabled at the NVIC/MIE/PLIC layer, but putting that aside, I don't actually know how to write that code in a way that is both correct and I would be ok with someone else copying.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this (always re-enabling mext and mtimer) is clearly incorrect -- what if the kernel disabled mtimer (for whatever reason)? When there is only a single interrupt source, the scheduler timer should not be disabling its interrupts; it goes through a virtualizer. The virtualizer may disable underlying interrupts if there are no outstanding alarms. Breaking abstraction boundaries (e.g., who owns mtimer) in random ways due to software engineering issues is bad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now in Tock the kernel cannot dynamically disable the mtimer interrupt (it could choose not to use it at all, and it can disable it when it is also pending, but the design as I understand it is all interrupts at the NVIC/PLIC/CLIC/CLINT/MIE layer are always enabled unless pending).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if it is always enabled, blinding force enabling it still seems wrong. What if someone is trying to debug something and wants the timer disabled? This will force enable it after it's disabled. There could also be future code changes that allow it to be disabled in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but the design as I understand it is all interrupts at the NVIC/PLIC/CLIC/CLINT/MIE layer are always enabled unless pending).

Nope. This would mean you cannot suppress interrupts and therefore wakeup. Super-bad.

Instead, the current design -- which is fundamentally flawed -- is that you can suppress interrupts from occurring/waking up the MCU but cannot suppress their handling. I.e, #1306 .

pub fn disable(&self) {
and
pub unsafe fn disable_all() {
let you disable interrupts in the CortexM, NVIC, for example.

The CortexM semantics are a problem for all kinds of reasons. But it's even more of a problem is the semantics aren't even consistent across architectures!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @bradjc on this: it is currently how we are treating interrupts. I agree with @phil-levis and @alistair23 that is not necessarily the right way to be dealing with interrupts, but that is the architecture we're using in Tock atm.

And generally, you cannot currently disable interrupts from waking the CPU: nvic::disable is an instance method of a struct that requires unsafe to construct and is handed to nobody and nvic::disable_all is unsafe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why it would be too hard to just record that we disabled the interrupt in interrupt handling and then only re-enable interrupts if that happened. What am I missing here?

@@ -76,35 +75,32 @@ impl<A: 'static + Alarm<'static>> kernel::Chip for E310x<A> {
}

fn service_pending_interrupts(&self) {
let mut reenable_intr = FieldValue::<u32, mie::Register>::new(0, 0, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change also needs to happen to OpenTitan, which uses the same logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make that a separate PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine with me, as long as it doesn't get lost. They are pretty much exactly the same code.

}

fn has_pending_interrupts(&self) -> bool {
unsafe { plic::has_pending() }
CSR.mip.matches_any(mip::mext::SET + mip::mtimer::SET)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Can OT have the same change as well?

@bradjc bradjc mentioned this pull request Sep 21, 2020
9 tasks
@bradjc bradjc added the release-blocker Issue or PR that must be resolved before the next release label Sep 25, 2020
@hudson-ayers
Copy link
Contributor

This is one of two remaining blockers for 1.6. I propose that we have two options here:

  1. Modify this PR to use Alistair's approach of saving and re-enabling interrupts. This is probably a better design, but also important to think about carefully as it has been a source of bugs in the past, and is not necessarily aligned with the spirit of how Tock treats interrupts today.

  2. Leave this PR as-is, but delete the lines of the machine timer SchedulerTimer implementation that turn off the machine timer interrupt enable (this is okay, because the scheduler timer trait specifies that arm()/disarm() are optional optimizations). This approach keeps things in line with how the Tock interrupt architecture currently works, and gets us to a release sooner.

I am in favor of option 2, combined with @alistair23 or @phil-levis submitting a separate PR for review after the release that introduces the save/restore approach on all risc-v platforms, as well as possibly any changes on Cortex-M to allow the kernel to disable interrupts.

@bradjc
Copy link
Contributor Author

bradjc commented Sep 29, 2020

I'm writing code that only re-enables interrupts that were serviced, instead of all (both) interrupts unconditionally. I can't say I'm a fan of the code. Here is my function:

fn service_pending_interrupts(&self) {
        let mut serviced_interrupt_mtimer: bool = false;
        let mut serviced_interrupt_mext: bool = false;

        loop {
            let mip = CSR.mip.extract();

            if mip.is_set(mip::mtimer) {
                unsafe {
                    timer::MACHINETIMER.handle_interrupt();
                }
                serviced_interrupt_mtimer = true;
            }
            if mip.is_set(mip::mext) {
                unsafe {
                    Self::handle_plic_interrupts();
                }
                serviced_interrupt_mext = true
            }

            if !mip.matches_any(mip::mext::SET + mip::mtimer::SET) {
                break;
            }
        }

        // Enable the interrupts that we serviced. Since the interrupts we
        // serviced were pending, we assume they should be re-enabled and are
        // not intended to be disabled for some reason.
        if serviced_interrupt_mtimer {
            CSR.mie.modify(mie::mtimer::SET);
        }
        if serviced_interrupt_mext {
            CSR.mie.modify(mie::mext::SET);
        }
    }

First, this still goes against Tock's current interrupt model where all PLIC/NVIC interrupts and above are always enabled. Second, without a clear description somewhere of what the interrupt model is exactly (if not our current model), this change can lead to some weird behavior if different layers can assume they can disable interrupts. For example, what if the peripheral does the (I would argue reasonable) thing of disabling the interrupt in handle_interrupt() after handling the interrupt? This code will then immediately re-enable it. At least with the current version in this PR there is no pretense that other layers can disable the interrupt.

@bradjc
Copy link
Contributor Author

bradjc commented Sep 29, 2020

Implementing a new interrupt model for Tock is a bigger change, and one that I don't think this PR should start on.

@hudson-ayers
Copy link
Contributor

I submitted #2134 which removes the peripheral code that directly disabled the mtimer interrupt.

bors bot added a commit that referenced this pull request Oct 2, 2020
2134: riscv mtimer: don't directly modify interrupt enables r=phil-levis a=hudson-ayers

### Pull Request Overview

This pull request removes all direct manipulations of the risc-v mtimer interrupt enable that do not occur from within the top level interrupt handler or a board main file. This is to bring RISC-V chips in line with the *current* Tock interrupt architecture, so that any discussion of allowing peripherals to directly enable/disable interrupts can wait until after the 1.6 release.

Why I think these removals do not affect correctness of the code:

- `SchedulerTimer::arm()` and `SchedulerTimer::disarm()` are optional optimizations

- `set_alarm()` uses the approach recommended in the RISC-V ISA to avoid triggering alarms while setting them, so disabling interrupts while setting the alarm should no longer be required. Enabling interrupts at the end of `set_alarm()` is not required, because with this change timer interrupts should never be disabled.

With this change, #2116 can safely be merged as-is.

### Testing Strategy

This pull request was tested by compiling.


### TODO or Help Wanted

N/A

### Documentation Updated

- [x] No updates are required.

### Formatting

- [x] Ran `make prepush`.


Co-authored-by: Hudson Ayers <hayers@stanford.edu>
@hudson-ayers
Copy link
Contributor

bors r+

bors bot added a commit that referenced this pull request Oct 2, 2020
2116: chips: e310: improve interrupt handling r=hudson-ayers a=bradjc

### Pull Request Overview

This pull request changes the e310 chip to:

1. Not only look at the plic, but also the MIE register for if interrupts have occurred.
2. Not accidentally disable interrupts when servicing them.

This fixes timer issues on the hifive1b.


### Testing Strategy

Running the alarm multi test on hifive1b hardware.


### TODO or Help Wanted

n/a


### Documentation Updated

- [x] Updated the relevant files in `/docs`, or no updates are required.

### Formatting

- [x] Ran `make prepush`.


Co-authored-by: Brad Campbell <bradjc5@gmail.com>
@alevy alevy dismissed alistair23’s stale review October 2, 2020 20:05

Stale review as discussed on the weekly call

@alevy
Copy link
Member

alevy commented Oct 2, 2020

bors r+

@bors bors bot merged commit edc53ee into master Oct 2, 2020
@bors bors bot deleted the hifive1b-interrupt-updates branch October 2, 2020 20:38
alistair23 added a commit to alistair23/tock that referenced this pull request Oct 12, 2020
Similar to tock#2116 let's ensure that
timer and external interrupts are always enabled.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
bors bot added a commit that referenced this pull request Oct 12, 2020
2152: chips: earlgrey: Ensure interrupts are always enabled r=bradjc a=alistair23

### Pull Request Overview

Similar to #2116 let's ensure that
timer and external interrupts are always enabled.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>

### Testing Strategy

QEMU


### TODO or Help Wanted

### Documentation Updated

- [X] Updated the relevant files in `/docs`, or no updates are required.

### Formatting

- [X] Ran `make prepush`.


Co-authored-by: Alistair Francis <alistair.francis@wdc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-blocker Issue or PR that must be resolved before the next release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants