[RFC] user defined monotonic timer #200

Closed
japaric opened this issue Jun 14, 2019 · 10 comments
Labels
RFC This issue needs you input! S-accepted This RFC has been accepted but not yet implemented

@japaric
Collaborator

japaric commented Jun 14, 2019

Current behavior

The schedule API internally makes use of two timers: the DWT_CYCCNT (the
cycle counter) as a monotonic timer / counter and the SysTick (system timer)
to generate timeouts. Currently, these can't be changed and the result is that
the schedule API can't be used on ARMv6-M where DWT_CYCCNT doesn't exist.

Proposal

Require that the user specifies the monotonic timer that will be used to
implement the schedule API. The user will still be able to use DWT_CYCCNT as
the monotonic timer but this will not be the default; the user must specify
this timer, or some other timer, in their application.

This RFC does not propose a mechanism to replace the SysTick timer with a
different timer.

Rationale

The DWT_CYCCNT is not an appropriate monotonic timer for multi-core
applications: each Cortex-M core has its own DWT peripheral and its own cycle
counter, and it's not possible to synchronize these counters (there's no
register to do this). The counters may also be operating at different
frequencies, resulting in one core's Instant having a different meaning than
other cores' Instants. Also, as previously mentioned, the DWT_CYCCNT is not
available on ARMv6-M cores; this limits the devices on which one can use the
schedule API.

In multi-core applications it's better to use a device specific, constant
frequency timer visible to all cores as the monotonic timer. And in single-core
applications one may want to use a 64-bit timer or a prescaled 32-bit timer as
the monotonic timer; this way one can schedule tasks with long periods, e.g.
on the order of seconds (cc @jamesmunns).
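
To make the "long periods" point concrete, the wrap-around period of a free-running counter can be estimated with a little arithmetic. The sketch below is illustrative only: the helper name `wrap_period_ms`, the 64 MHz clock and the 1024 prescaler are assumptions, not values taken from this RFC.

```rust
/// Time, in whole milliseconds, until a free-running counter that is `bits`
/// wide wraps around when it is clocked at `freq_hz` through a `prescaler`.
///
/// Hypothetical helper for illustration only. Returns `None` for a zero
/// frequency or for a counter wider than 64 bits.
pub fn wrap_period_ms(bits: u32, freq_hz: u32, prescaler: u32) -> Option<u128> {
    if freq_hz == 0 || bits > 64 {
        return None;
    }
    // number of distinct counter values before the counter overflows
    let counts = 1u128 << bits;
    // each count lasts `prescaler / freq_hz` seconds; scale to milliseconds
    Some(counts * u128::from(prescaler) * 1_000 / u128::from(freq_hz))
}
```

Under these assumptions a 32-bit counter clocked at 64 MHz wraps after roughly 67 seconds, while the same counter behind a 1024 prescaler wraps after roughly 19 hours.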

Lastly, using a device-specific timer as the monotonic timer lets programmers
use the schedule API on single-core ARMv6-M devices.

Detailed design

The Monotonic trait

The following trait will be added to the cortex-m-rtfm crate.

/// A fraction, `num / den`
pub struct Fraction { pub num: u32, pub den: u32 }

/// A monotonic clock / counter
pub trait Monotonic {
    /// A measurement of this clock
    type Instant: Copy + Ord + Sub;

    /// The ratio between the SysTick (system timer) frequency and the frequency of this clock
    fn ratio() -> Fraction;

    /// Returns the current time
    ///
    /// NOTE this must be an atomic operation
    fn now() -> Self::Instant;

    /// Resets the counter to *zero*
    ///
    /// NOTE the runtime will execute this method with interrupts disabled
    unsafe fn reset();

    /// A `Self::Instant` that represents a count of *zero*
    fn zero() -> Self::Instant;
}

This trait represents a monotonic timer. The trait is meant to be implemented on
global singleton ZSTs.
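
As a rough illustration of what "implemented on a global singleton ZST" could look like, here is a host-side sketch that backs the trait with a software counter instead of a hardware register. `MockTimer`, `Ticks`, `COUNTER` and `advance` are hypothetical names introduced for this example; a real implementation would read a device-specific timer.

```rust
use std::ops::Sub;
use std::sync::atomic::{AtomicU32, Ordering};

pub struct Fraction {
    pub num: u32,
    pub den: u32,
}

/// Copy of the trait proposed above, repeated so the sketch is self-contained
pub trait Monotonic {
    type Instant: Copy + Ord + Sub;
    fn ratio() -> Fraction;
    fn now() -> Self::Instant;
    unsafe fn reset();
    fn zero() -> Self::Instant;
}

/// Software counter standing in for a hardware timer register (assumption)
static COUNTER: AtomicU32 = AtomicU32::new(0);

/// Instant type of the mock timer; the derived `Ord` does not handle
/// counter wrap-around, which is acceptable for this sketch
#[derive(Clone, Copy, Debug, Eq, Ord, PartialEq, PartialOrd)]
pub struct Ticks(pub u32);

impl Sub for Ticks {
    type Output = u32;

    fn sub(self, rhs: Self) -> u32 {
        self.0.wrapping_sub(rhs.0)
    }
}

/// Zero-sized singleton: all state lives in the global `COUNTER`
pub struct MockTimer;

impl MockTimer {
    /// Host-only hook that simulates the passage of `cycles` timer cycles
    pub fn advance(cycles: u32) {
        COUNTER.fetch_add(cycles, Ordering::SeqCst);
    }
}

impl Monotonic for MockTimer {
    type Instant = Ticks;

    fn ratio() -> Fraction {
        Fraction { num: 1, den: 1 }
    }

    // a single atomic load, as required by the NOTE on `now`
    fn now() -> Ticks {
        Ticks(COUNTER.load(Ordering::SeqCst))
    }

    unsafe fn reset() {
        COUNTER.store(0, Ordering::SeqCst)
    }

    fn zero() -> Ticks {
        Ticks(0)
    }
}
```

Because `MockTimer` has no fields, it occupies no memory and only serves as a type-level handle for the associated functions.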

Also, although it's not shown in the trait interface, the RTFM runtime expects
that subtracting two Monotonic::Instants produces a value that implements the
TryInto<u32> trait. This fallible conversion must return a number of
Monotonic clock cycles that when multiplied by Monotonic::ratio() produce
SysTick clock cycles. The conversion is fallible to allow for Monotonic
implementations that use 64-bit counters / timers.
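
The conversion described above could be sketched roughly as follows. This is not the runtime's actual code: `to_systick_cycles` is a hypothetical helper, and it assumes the ratio is applied as `cycles * num / den` with the intermediate product widened to 64 bits.

```rust
use std::convert::{TryFrom, TryInto};

/// Same shape as the `Fraction` struct in the trait definition
#[derive(Clone, Copy)]
pub struct Fraction {
    pub num: u32,
    pub den: u32,
}

/// Hypothetical helper: converts the difference of two instants into SysTick
/// clock cycles. Returns `None` if the duration or the scaled result does not
/// fit in a `u32`, or if the denominator is zero.
pub fn to_systick_cycles<D>(elapsed: D, ratio: Fraction) -> Option<u32>
where
    D: TryInto<u32>,
{
    // fallible step: e.g. a 64-bit duration may not fit in 32 bits
    let monotonic_cycles: u32 = elapsed.try_into().ok()?;
    // widen to 64 bits so that the multiplication cannot overflow
    let scaled = (u64::from(monotonic_cycles) * u64::from(ratio.num))
        .checked_div(u64::from(ratio.den))?;
    u32::try_from(scaled).ok()
}
```

With a 64-bit duration type such as `u64`, a small value converts cleanly while a value above `u32::MAX` is reported as `None` instead of being silently truncated.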

The monotonic argument

The #[rtfm::app] attribute will gain a monotonic argument that takes a path
to a struct that implements the Monotonic trait. This struct must be a
public unit struct, i.e. a struct with no fields. Also, the runtime expects that
the application initializes the specified timer during the initialization
(init) phase.

#[rtfm::app(device = some::path, monotonic = some::other::path)]
const APP: () = {
    // ..
};

The CYCCNT implementation

For ARMv7-M, the cortex-m-rtfm crate will provide an implementation of the
Monotonic trait that uses the cycle counter. The proposal is to place all this
API, which is basically today's Instant + Duration API, in a module named
cyccnt.

// `cortex-m-rtfm` crate

/// Data Watchpoint and Trace (DWT) Unit's CYCle CouNTer
#[cfg(armv7m)]
pub mod cyccnt {
    // omitted: imports

    const DWT_CYCCNT: *mut i32 = 0xE000_1004 as *mut i32;

    #[derive(Clone, Copy, Eq, PartialEq)]
    pub struct Instant { inner: i32 }

    impl Instant {
        pub fn now() -> Self {
            Self {
                inner: unsafe { DWT_CYCCNT.read_volatile() }
            }
        }

        // ..
    }

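    impl Ord for Instant {
        fn cmp(&self, rhs: &Self) -> Ordering {
            // wrapping subtraction keeps the comparison meaningful across
            // counter overflow
            self.inner.wrapping_sub(rhs.inner).cmp(&0)
        }
    }

    // `Ord` requires `PartialOrd`; it must agree with `cmp`
    impl PartialOrd for Instant {
        fn partial_cmp(&self, rhs: &Self) -> Option<Ordering> {
            Some(self.cmp(rhs))
        }
    }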

    impl Sub for Instant {
        type Output = Duration;

        // ..
    }

    pub struct Duration { inner: u32 }

    impl TryInto<u32> for Duration {
        type Error = Infallible;

        fn try_into(self) -> Result<u32, Infallible> {
            Ok(self.inner)
        }
    }

    /// A monotonic timer implemented on top of the CYCle CouNTer
    pub struct CYCCNT;

    impl super::Monotonic for CYCCNT {
        type Instant = Instant;

        // neither the SysTick nor the CYCCNT can be prescaled and both are
        // clocked at the same frequency, so the ratio is 1 / 1
        fn ratio() -> super::Fraction {
            super::Fraction { num: 1, den: 1 }
        }

        unsafe fn reset() {
            DWT_CYCCNT.write_volatile(0)
        }

        fn now() -> Instant {
            Instant::now()
        }

        fn zero() -> Instant {
            Instant { inner: 0 }
        }
    }

    pub trait U32Ext {
        fn cycles(self) -> Duration;
    }

    impl U32Ext for u32 { .. }
}

Also note that U32Ext will no longer be automatically imported in RTFM
applications, so one would need to manually import U32Ext to use the cycles
method / constructor on u32 integers.
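
The wrapping comparison used by the cyccnt Instant can be tried out on a host with a sketch like the one below. The `from_raw` constructor is a hypothetical addition for testing; note that the ordering is only meaningful when the two instants are less than 2^31 cycles apart.

```rust
use std::cmp::Ordering;

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Instant {
    inner: i32,
}

impl Instant {
    /// Builds an `Instant` from a raw 32-bit counter value
    /// (hypothetical constructor, used here instead of reading the hardware)
    pub fn from_raw(count: u32) -> Self {
        Instant { inner: count as i32 }
    }
}

impl Ord for Instant {
    fn cmp(&self, rhs: &Self) -> Ordering {
        // the sign of the wrapped difference tells which instant came first,
        // even if the counter overflowed between the two readings
        self.inner.wrapping_sub(rhs.inner).cmp(&0)
    }
}

impl PartialOrd for Instant {
    fn partial_cmp(&self, rhs: &Self) -> Option<Ordering> {
        Some(self.cmp(rhs))
    }
}
```

A reading taken shortly before the counter overflows (e.g. `0xFFFF_FFF0`) compares as earlier than a reading taken shortly after it (e.g. `0x0000_0010`), which a plain unsigned comparison would get wrong.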

The MultiCore marker trait

Not all Monotonic implementations behave correctly in multi-core contexts.
The cycle counter (CYCCNT) doesn't for example because each core has its own
cycle counter and these counters are not synchronized and may even be running
at different frequencies.

To accommodate this fact we'll also provide a MultiCore marker trait in the
cortex-m-rtfm crate:

pub trait MultiCore {}

This marker trait should be implemented for monotonic timers that can be used in
a multi-core context. The CYCCNT type will not implement this marker trait.

When the schedule API is used in more than one core the #[app] DSL will
check at compile time that the monotonic argument specified by the user
implements the MultiCore trait. This way we'll prevent using CYCCNT in a
multi-core context.
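
One way such a compile-time check could be expressed is with a trait bound on a generic function, as in the sketch below. This is an assumption about the mechanism, not the code the DSL will generate; `assert_multi_core`, `PerCoreCounter` and `SharedTimer` are hypothetical names, and `Monotonic` is reduced to a single method to keep the example short.

```rust
/// Simplified stand-in for the `Monotonic` trait (only `now` is kept)
pub trait Monotonic {
    fn now() -> u32;
}

/// Marker trait for timers that behave correctly in multi-core contexts
pub trait MultiCore {}

/// A per-core counter, similar in spirit to CYCCNT: no `MultiCore` impl
pub struct PerCoreCounter;

/// A timer visible to all cores
pub struct SharedTimer;

impl Monotonic for PerCoreCounter {
    fn now() -> u32 {
        0
    }
}

impl Monotonic for SharedTimer {
    fn now() -> u32 {
        0
    }
}

impl MultiCore for SharedTimer {}

/// Compiles only when `M` implements both traits; does nothing at runtime
pub fn assert_multi_core<M>()
where
    M: Monotonic + MultiCore,
{
}

// `assert_multi_core::<PerCoreCounter>()` would be rejected by the compiler
// because the bound `PerCoreCounter: MultiCore` is not satisfied
```

Calling `assert_multi_core::<SharedTimer>()` compiles, while the same call with `PerCoreCounter` is a compile error, which is the behavior the marker trait is meant to provide.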

Migrating existing code

Existing users of the schedule API can migrate to the new API with just a few
changes:

// cortex-m-rtfm v0.4.x

use rtfm::Instant;

#[rtfm::app(device = some::path)]
const APP: () = {
    #[init(schedule = [foo])]
    fn init(c: init::Context) {
        c.schedule.foo(c.start + 1_000_000.cycles()).ok();

        // ..
    }

    #[task]
    fn foo(c: foo::Context) {
        let now = Instant::now();

        // ..
    }

    // ..
};

// cortex-m-rtfm v0.5.x with this proposal

// (a) change `Instant` import
// (b) import `U32Ext` which is no longer automatically imported
use rtfm::cyccnt::{Instant, U32Ext};

// (c) add the `monotonic` argument
#[rtfm::app(device = some::path, monotonic = rtfm::cyccnt::CYCCNT)]
const APP: () = {
    #[init(schedule = [foo])]
    fn init(c: init::Context) {
        c.schedule.foo(c.start + 1_000_000.cycles()).ok();

        // (d) IMPORTANT: the author needs to initialize the DWT in this
        // function. The DWT / CYCCNT can be initialized at any moment, even
        // after the `schedule` API is used

        // ..
    }

    // ..
};

An example

Applications are likely to implement the Monotonic trait "at the top" because
clock frequencies are selected by the application, not the HAL. Thus
Monotonic::ratio can't be known until the application is written. Implementing
Monotonic at the top also lets one use the core::time::Duration API.

use core::{convert::TryInto, ops, time::Duration};

// some `Instant` API backed by a device-specific timer
use pac::some_timer::Instant;

// For example, in this application:
// - core #0 SysTick frequency = 100 MHz
// - core #1 SysTick frequency = 80 MHz
// - `some_timer` frequency = 20  MHz
struct AppTimer;

// we need a newtype so we can implement the `TryInto` trait
struct AppDuration {
    inner: Duration,
}

// implementing `From` also provides `Into<AppDuration>` for `Duration`
impl From<Duration> for AppDuration {
    fn from(inner: Duration) -> AppDuration {
        AppDuration { inner }
    }
}

impl TryInto<u32> for AppDuration {
    // ..
}

// we need a newtype so we can implement the `Add` and `Sub` traits
#[derive(Clone, Copy, Eq, Ord, PartialEq, PartialOrd)]
struct AppInstant {
    inner: Instant,
}

impl ops::Add<AppDuration> for AppInstant {
    type Output = Self;

    // ..
}

impl rtfm::Monotonic for AppTimer {
    type Instant = AppInstant;

    // this implementation is multi-core friendly
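    fn ratio() -> rtfm::Fraction {
        if cfg!(core = "1") {
            // 80 MHz SysTick / 20 MHz `some_timer`
            rtfm::Fraction { num: 80, den: 20 }
        } else {
            // 100 MHz SysTick / 20 MHz `some_timer`
            rtfm::Fraction { num: 100, den: 20 }
        }
    }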

    fn now() -> AppInstant {
        AppInstant { inner: Instant::now() }
    }

    unsafe fn reset() {
        Instant::reset()
    }

    fn zero() -> AppInstant {
        AppInstant { inner: Instant::ZERO }
    }
}

#[rtfm::app(device = pac, monotonic = AppTimer)]
const APP: () = {
    #[init(schedule = [foo])]
    fn init(c: init::Context) {
        // (yes, this is fine to do before `some_timer` has been initialized)
        c.schedule.foo(c.start + Duration::from_secs(1).into());

        // initialize `some_timer` (e.g. set prescaler)

        // ..
    }
};
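
The fallible conversion from a core::time::Duration newtype into timer cycles might look something like the sketch below. It is one possible shape, not the RFC's implementation: the 20 MHz constant is taken from the comments in the example, while `SOME_TIMER_HZ`, `Overflow` and `ticks` are hypothetical names, and the sketch implements `TryFrom` (which provides `TryInto<u32>` through the standard blanket impl).

```rust
use std::convert::TryFrom;
use std::time::Duration;

/// Assumed frequency of `some_timer`, taken from the comments in the example
const SOME_TIMER_HZ: u128 = 20_000_000;

/// Newtype over `Duration`, as in the example above
pub struct AppDuration {
    inner: Duration,
}

impl From<Duration> for AppDuration {
    fn from(inner: Duration) -> AppDuration {
        AppDuration { inner }
    }
}

/// Error returned when the cycle count does not fit in 32 bits
#[derive(Debug, PartialEq)]
pub struct Overflow;

impl TryFrom<AppDuration> for u32 {
    type Error = Overflow;

    fn try_from(d: AppDuration) -> Result<u32, Overflow> {
        // nanoseconds * Hz / 1e9 = timer cycles, computed in 128 bits
        let cycles = d.inner.as_nanos() * SOME_TIMER_HZ / 1_000_000_000;
        u32::try_from(cycles).map_err(|_| Overflow)
    }
}

/// Hypothetical convenience wrapper: `Duration` -> timer cycles
pub fn ticks(d: Duration) -> Result<u32, Overflow> {
    u32::try_from(AppDuration::from(d))
}
```

At 20 MHz one second corresponds to 20 000 000 cycles, and anything longer than about 214 seconds no longer fits in a `u32`, which is why the conversion has to be fallible.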

Unresolved questions

  • Should we place the Monotonic trait in a separate crate (e.g.
    cortex-m-rtfm-traits) so that HAL authors can implement it without pulling
    the whole cortex-m-rtfm crate, which has lots of dependencies?

  • Should we default to the CYCCNT implementation when no monotonic argument
    is given and the schedule API is used? But note that this will not work when
    the target is ARMv6-M and we won't be able to provide a good error message
    because it's not possible to know the compilation target at macro expansion
    time.


cc #115

@japaric japaric added the RFC This issue needs you input! label Jun 14, 2019
@japaric japaric added this to the v0.5.0 milestone Jun 14, 2019
@jonas-schievink
Contributor

Can you explain why Monotonic needs to be an unsafe trait, and why its reset function is unsafe as well? What are the safety constraints on implementing Monotonic and calling reset?

@japaric
Collaborator Author

japaric commented Jun 19, 2019

@jonas-schievink Hmm, actually I don't think the trait needs to be unsafe; I can't see how it could break memory safety even if it's implemented in a nonsensical way.

reset is unsafe because "it must be called within a critical section and it must be called at most once"; the implementer can rely on these conditions holding when the runtime invokes the function. All the other methods are safe because their implementations must be sound even when invoked concurrently and from different cores.

@japaric
Collaborator Author

japaric commented Jun 24, 2019

I have made two amendments to the RFC:

  • Monotonic is no longer an unsafe trait

  • a MultiCore marker trait has been added. The DSL will check at compile time that the monotonic argument implements this trait when the schedule API is used in more than one core. The CYCCNT monotonic timer will not implement this marker trait -- the CYCCNT doesn't behave correctly in a multi-core context.

@korken89
Collaborator

I quite like this.
However I think ratio should return a fraction to better support clock differences.

Looking into a traits crate could be a good option. Are there other traits we'd like to add?

When it comes to a default timer, I'd keep it explicit, which makes it easier to find issues when CYCCNT does not exist (I'd rather have that than cryptic error messages).

@japaric
Collaborator Author

japaric commented Jun 29, 2019

> However I think ratio should return a fraction to better support clock differences.

Yes, good idea. Do you think a tuple (u32, u32) would do or should we use a struct Fraction { num: u32, den: u32 }?

> Are there other traits we'd like to add?

Likely one more that lets you substitute the SysTick with a user defined timeout timer.

There's also the Mutex trait to write generic code that uses resources but that one will go in the rtfm-core crate (as per RFC #203).

Instead of creating a cortex-m-rtfm-traits crate we could directly add these traits to rtfm-core. The MultiCore marker trait definitely sounds like it should go in rtfm-core. The Monotonic one, I'm not sure; it may be best to wait and see what other RTFM ports use to implement the schedule API.


@TeXitoi think we can move forward with this?

(I'm going to set up the tickbox thing in any case)

Let's FCP merging this:

Note that we still need to decide on the return type for the ratio method.

@TeXitoi
Collaborator

TeXitoi commented Jun 29, 2019

OK for me.

@korken89
Collaborator

korken89 commented Jul 2, 2019

I think a struct Fraction { num: u32, den: u32 } is the better choice; people coming from C++ will feel more at home as well.
All ok from my side!

@japaric
Collaborator Author

japaric commented Jul 8, 2019

🎉 This RFC, with the Fraction amendment, has been formally approved. Implementation is in PR #205 (I'm going to keep this open until the PR lands)

@japaric japaric added S-accepted This RFC has been accepted but not yet implemented and removed disposition-merge labels Jul 8, 2019
@perlindgren
Collaborator

In the original work on RTFM a generic timer implementation was discussed, and initial experiments were conducted to verify feasibility. http://ceur-ws.org/Vol-1464/ewili15_16.pdf

There is discussion on using multiple timers/timer queues to reduce priority inversion. In short, a postponed task with a priority of x will cause priority inversion if it is (put in a queue) handled by a timer with a priority higher (more important) than x. So ideally we should associate a timer with each priority level that postponed tasks hold (in order to minimize priority inversion).

One possible approach is to give RTFM a set of "free" timers (implementing the "dispatch" trait). (We have already a similar approach to allowing RTFM to know about available interrupt handlers for the "soft" tasks, the somewhat "ugly" extern C.)

If the number of available timers is NOT sufficient (more priorities of postponed tasks than available timers), we can think about some heuristics for allocation (by default) or guided by annotations (in the free-timer list the user could say that they want to associate the timer with the queue handled at a particular priority). A sensible heuristic might be to distribute the timers top down (priority-wise), so at least the highest priority tasks won't suffer from the priority inversion.
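
One possible reading of the "top down" heuristic is sketched below. It is only an illustration of the idea, not a proposed RTFM API: `allocate_timers` is a hypothetical function, and the choice to let all remaining priority levels share the last timer is an assumption.

```rust
/// Hypothetical sketch of a "top down" timer allocation.
///
/// `priorities` lists the priority of every postponed task (duplicates are
/// allowed, a higher number means more important) and `timers` is the number
/// of free timers. The result maps each distinct priority level, from highest
/// to lowest, to a timer index. The highest levels get a dedicated timer;
/// once the timers run out, the remaining levels share the last timer.
pub fn allocate_timers(priorities: &[u8], timers: usize) -> Vec<(u8, usize)> {
    if timers == 0 {
        return Vec::new();
    }
    // distinct priority levels, highest first
    let mut levels: Vec<u8> = priorities.to_vec();
    levels.sort_unstable_by(|a, b| b.cmp(a));
    levels.dedup();

    levels
        .into_iter()
        .enumerate()
        .map(|(rank, priority)| (priority, rank.min(timers - 1)))
        .collect()
}
```

With task priorities `[1, 3, 2, 3, 5]` and two timers, priority 5 gets timer 0 to itself, while priorities 3, 2 and 1 share timer 1, so only the lower levels remain exposed to priority inversion.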

Ultimately, for a hard real-time system it boils down to response time analysis (and overall schedulability). Priority inversion (due to the timer queue handling/dispatch) is one piece of the puzzle. To that end, it makes full sense to introduce the timers to the RTFM model, and perform the system wide analysis on the complete model (including the timer tasks). The message queues become just shared resources (here lock-free implementations are of great help in reducing blocking, so we are in a particularly good spot with RTFM).

"Well, why bother? RTFM works fine as is, and my application does not have any hard real-time constraints."

While this is true in general (RTFM is indeed the most efficient real-time scheduler out there, and your application logic may not have any explicit timing constraints), hard constraints typically trickle in from the interaction with the underlying hardware (hidden by HAL/drivers etc.). E.g., we expect the interrupt handler (task) for the UART, etc. to handle communication without overflowing the input buffer. If that task is exposed to excessive interference by the message passing mechanism, the task misses its deadline and data is lost. We can prevent that by a "multiple timer" message passing.

In the Rust RTFM re-implementation, priority assignment is currently manual; in the original RTFM model priorities were assigned based on task deadline information (which allows reasoning on timing/response times etc.). For I/O interaction those deadlines can be derived from the inter-arrival of events (e.g., in the case of a UART, it would relate to the baud rate). A bit off topic perhaps, but wouldn't it be great to have a bit more const evaluation of init, with deadlines derived automatically by the HAL/driver implementations? I fear, however, that Rust is not yet powerful enough for that type of const evaluation during proc-macro execution, but perhaps an extension to HAL could be possible with declarative style init (accessible to the proc-macro ... just dreaming.)

In the short term, let's focus on generic timers, and a reasonable heuristic for timer allocations.
/Per

bors bot added a commit that referenced this issue Sep 15, 2019
205: rtfm-syntax refactor + heterogeneous multi-core support r=japaric a=japaric

this PR implements RFCs #178, #198, #199, #200, #201, #203 (only the refactor
part), #204, #207, #211 and #212.

most cfail tests have been removed because the test suite of `rtfm-syntax`
already tests what was being tested here. The `rtfm-syntax` crate also has tests
for the analysis pass which we didn't have here -- that test suite contains a
regression test for #183.

the remaining cfail tests have been upgraded into UI test so we can more
thoroughly check / test the error message presented to the end user.

the cpass tests have been converted into plain examples

EDIT: I forgot, there are some examples of the multi-core API for the LPC541xx in [this repository](https://github.com/japaric/lpcxpresso54114)

people that would like to try out this API but have no hardware can try out the
x86_64 [Linux port] which also has multi-core support.

[Linux port]: https://github.com/japaric/linux-rtfm

closes #178 #198 #199 #200 #201 #203 #204 #207 #211 #212 
closes #163 
cc #209 (documents how to deal with errors)

Co-authored-by: Jorge Aparicio <jorge@japaric.io>
@japaric
Collaborator Author

japaric commented Sep 15, 2019

Done in PR #205
