Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] The Static wrapper is dead, long live safe &'static mut references! #59

Closed
japaric opened this issue Dec 9, 2017 · 17 comments
Closed
Milestone

Comments

@japaric
Copy link
Collaborator

japaric commented Dec 9, 2017

History: The Static wrapper

If you have been using RTFM claims you probably have noticed this "pattern":

r.FOO.claim_mut(|foo| {
    **foo += 1;
});

Here you need a double dereference because claim returns a &mut Static<T>, instead of a plain
mutable reference (&mut T). Static<T> is a newtype over T that Derefs to T. You normally
won't notice the Static wrapper when using methods because of the Deref implementation, but the
wrapper becomes apparent when you need to assign some new value to a resource.

DMA transfers

So, why is Static being used here? The main reason is that I needed some (zero cost) abstraction
to make DMA transfers memory safe. You can't build a safe DMA API on top of plain (non-static)
references. See below:

impl Serial {
    fn read_exact<'a>(&'a mut self, buf: &'a mut [u8]) -> Transfer<'a> { .. }
}

impl<'a> Transfer<'a> {
    fn wait(self) {
        drop(self)
    }
}

impl<'a> Drop for Transfer<'a> {
    fn drop(&mut self) {
        // waits until the DMA transfer finishes
    }
}

let mut on_the_stack = [0; 16];
{
    let transfer = serial.read_exact(&mut on_the_stack);

    // meanwhile, do some other stuff

    // on_the_stack[0] = 1;
    //~^ error `on_the_stack`

    transfer.wait();
}

// use `on_the_stack`

At first glance, this looks like an OK DMA API. While the DMA transfer is writing to the buffer you
can't access the buffer (on_the_stack is "frozen" by the outstanding borrow). The Transfer value
represents the on-going transfer and upon destruction (when dropped) it waits for the transfer to
finish. You can use the wait method to make the wait operation more explicit.

However, this API is not safe because you can safely mem::forget the Transfer value to gain
access to the buffer while the transfer is on-going:

let mut on_the_stack = [0; 16];
{
    let transfer = serial.read_exact(&mut on_the_stack);

    // destructor not run
    mem::forget(transfer);
}

// the transfer may not be over at this point
on_the_stack[0] = 1;
assert_eq!(on_the_stack[0], 1);

This doesn't look too dangerous but it's a violation of Rust aliasing model and will result in UB.
In the last line two mutable references to on_the_stack exist: one is being used in the indexing
operation, and the other is owned by the DMA (external hardware).

It gets much worse though because this mem::forget hole can be used to corrupt stack memory:

fn foo() {
    let mut on_the_stack = [0; 16];

    mem::forget(serial.read_exact(&mut on_the_stack));
}

fn bar() {
    // do stuff while the DMA transfer is on going
}

foo();
bar();

Here foo starts a DMA transfer that modifies some stack allocation but then immediately returns,
releasing the stack allocation. Next bar starts while the DMA is still on going; the problem is
that the DMA transfer will write into the stack potentially overwriting bar's local variables and
causing undefined behavior.

How does Static help?

We can prevent the memory corruption by having the API only accept references that point into memory
that will never be de-allocated. And that's what the Static wrapper represents: &Static<T> is a
reference into a statically allocated (i.e. stored in a static variable) value of type T. With
this change the API would look like this:

impl Serial {
    fn read_all<'a>(&'a mut self, buf: &'a mut Static<[u8]>) -> Transfer<'a> { .. }
}

(Note that this change is not enough to prevent the aliasing problem caused by mem::forget.
Discussing a solution for that issue is out of scope for this RFC though. The RefCell-like
Buffer abstraction in the blue-pill crate prevents the mem::forget aliasing problem showcased
above but it still has other issues like mem::swap aliasing and that you can e.g. still use
Serial while the transfer is in progress)

A value can't be safely wrapped in Static but RTFM does that for you in every claim and that
lets you use the memory safer DMA API from above.

Changing buf's type to &'static mut would also have worked but there's no way to safely create
&'static mut references, or rather there wasn't a way before this RFC.

Motivation

Being able to safely create &'static mut references.

Why? &'static mut references have interesting properties that I think will enable the creation of
novel APIs:

&'static mut T has move semantics. See below:

fn reborrow<'a, T>(x: &'a mut T) { .. }

fn consume<T>(x: &'static mut T) { .. }

fn foo<T>(x: &'static mut T) {
    reborrow(x);

    // OK to call again
    reborrow(x);

    // actually the compiler is doing this in each `reborrow` call
    reborrow(&mut *x);

    // this is different: this takes ownership of `x`
    consume(x);

    // now you can't use `x` anymore
    //consume(x);
    //~^ error `x` has been already moved

    //reborrow(x);
    //~^ error `x` has been already moved
}

&'static mut T has 'static lifetime (gasp!) so, unlike its non-static cousin, it can be stored
in a static variable and thus we can have a resource that protects a &'static mut T.

&'static mut T is pointer sized. If you need to send (transfer ownership) of a buffer from one
task (execution context) to another then it's cheaper to send &'static mut [u8; 256] than to send
[u8; 256] even though they are both semantically a move.

So &'static mut T is a bit like Box<T>: both have move semantics and are
pointer sized but the former doesn't need dynamic memory allocation (it's statically allocated!) and
we know that T's destructor will never run ('static lifetime).

Use case: memory safe DMA transfer

We can revise the DMA API to make it truly memory safe:

impl Serial {
    fn read_exact(self, buf: &'static mut [u8]) -> Transfer { .. }
}

impl Transfer {
    fn wait(self) -> (Serial, &'static mut [u8]) { .. }
}

let buf: &'static mut [u8] = /* created somehow */;

let transfer = serial.read_exact(&mut on_the_stack);

// can't use Serial while the DMA transfer is in progress
// let byte = serial.read();
//~^ error `serial` has been moved

// can't access `buf` while the transfer is in progress
// buf[0] = 1;
//~^ error `buf` has been moved

// meanwhile, do other stuff

// block until the transfer finishes
let (serial, buf) = transfer.wait();

// now you can use `serial` and access the `buf`fer again

Here if you mem::forget the transfer then you can't never access serial or the buffer again,
which may seem overkill but fulfills the memory safety requirement.

Use case: SPSC ring buffer

This use case prompted the original RFC (cf. #47). Basically a ring buffer queue can be split into
one producer end point and one consumer end point. Each end point can locklessly enqueue or dequeue
items into / from the same queue -- even if the end points are in different execution contexts (e.g.
threads or interrupts).

The API for this already exists in the heapless crate but the Consumer and Producer end
points have a lifetime parameter that matches the lifetime of the ring buffer queue. To put these
end points in resources the lifetime would have to be 'static and that requires a &'static mut RingBuffer, which can't be safely created without this RFC.

Detailed design

init.resources

We add a resources field to app.init. The value of this field is a list of resources, previously
declared in app.resources, that init will own for the rest of the program. The resources in
this list will appear under the init::Resources struct as &'static mut references. Example:

app! {
    device: stm32f103xx,

    resources: {
        static BUFFER: [u8; 16] = [0; 16];
        static SHARED: bool = false;
    },

    init: {
        // NEW!
        resources: [BUFFER],
    },

    idle: {
        resources: [SHARED],
    },

    tasks: {
        EXTI0: {
            path: exti0,
            resources: [SHARED],
        },
    }
}

fn init(p: init::Peripherals, r: init::Resources) {
    // static lifetime
    let buf: &'static mut [u8; 16] = r.BUFFER;

    // non-static lifetime
    let shared: &mut bool = r.SHARED;
}

// ..

Some constraints apply to init owned resources:

  • These resources must have an initial value; i.e. they can't be "late" resources.

  • Resources assigned to init can't appear in idle.resources or in tasks.$T.resources.
    Basically, the resources are owned by init so they can't be shared with other tasks, or with
    idle.

These constraints will be enforced by the app! macro. An error will be thrown before expansion if
any constraint is violated.

Drop the Static wrapper

Since this RFC brings proper support for &'static mut references to the table I think the Static
wrapper is no longer useful -- memory safe DMA APIs can be built without it and that was its main
reason for existing.

This will be implementing by changing all &[mut] Static<T> to &[mut] T. This means you will no
longer need to doubly dereference to assign a new value to a resource.

Downsides

This is a breaking change, but we are breaking things due to #50 so it's not much of a problem.

Alternatives

A pre_init function

A pre_init function with signature fn() -> T could be run before init. The value returned by
this function would be passed as &'static mut T to init. Unlike the main proposal this value
would be created at runtime so const eval limitations would not apply; also the value would be
allocated in the stack (in the first frame, which will never be deallocated), not in .bss /
.data.

With code it would look like this:

app! {
    device: stm32f103xx,

    pre_init: pre_init,
}

struct Root {
    buffer: [u8; 16],
}

fn pre_init() -> Root {
    Root { buffer: [0; 16] }
}

fn init(root: &'static mut Root, p: init::Peripherals, r: init::Resources) { .. }

I think it may make sense to also support this because it potentially lets you use a different
memory region -- think of the case of microcontrollers with two RAM regions the stack could be on
one region and .bss / .data could be on the other. However, if we get better support for multiple
memory regions in cortex-m-rt and support for placing resources in custom linker sections in
cortex-m-rtfm then there is no longer a need for this, I think, because then you can place an
init owned resource in any memory region (in RAM, e.g. .bss1, or in core-coupled RAM, .bss2).

I'm not too concerned about the const eval limitation that affects the main proposal because, in my
limited experience, the T in the &'static mut T references one creates is usually an array ([T; N]) or a thin abstraction over uninitialized arrays (e.g. heapless::RingBuffer).

Implementation

See #58

cc @pftbest @jonas-schievink @hannobraun

@therealprof
Copy link

Now I am confused. Didn't you tell me that &'static mut are a bad idea not too long ago?

@japaric
Copy link
Collaborator Author

japaric commented Dec 9, 2017

What? No. &'static mut is fine. The problem is aliasing the &'static mut reference, which must be unique. Simple, actually memory unsafe, functions like the one below result in mutable aliasing (and break Rust memory model) so those are a bad idea:

fn foo() -> &'static mut u32 {
    static mut FOO: u32 = 0;
    unsafe { &mut FOO }
}

let a: &'static mut u32 = foo();
let b: &'static mut u32 = foo();

// BAD `a` and `b` point to the same thing
assert_ne!(a as *const _ as usize, b as *const _ as usize);

But if you have a mechanism to guarantee the uniqueness of the &'static mut reference, like what's proposed in this RFC, then you are good to go.

@therealprof
Copy link

I neither see how your suggestion automagically fixes the possibility of mutable aliasing nor how that would break Rusts memory model (since you can only initialise them with known types and const functions anyway, Rust will always be aware of size and place). There're plenty of applications where concurrent access is just fine or even expected and/or out of program control (as in your DMA example).

@jonas-schievink
Copy link
Contributor

jonas-schievink commented Dec 9, 2017

@therealprof Under this proposal, mutable references are only handed out by RTFM to the init function (uniqueness is enforced by the app! macro - when a resource is used by init, it may not be used by any other task). Aliasing &muts, static or not, result in insta-UB (also see https://doc.rust-lang.org/nomicon/transmutes.html). (Edit: Maybe, maybe not, who knows? And who needs a memory model anyways?)

(I've only skimmed this proposal, but it looks safe - and a safe and simple abstraction over DMA is totally awesome!)

@therealprof
Copy link

There's no transmute happening in the example brought up by @japaric and the transmute chapter does not handle aliasing at all as far as I can see, that would be https://doc.rust-lang.org/nomicon/aliasing.html . As soon as you want to use shared memory or hand off the pointer for the use of DMA you're accepting the fact that you (or the hardware) is aliasing memory anyway. And for a lot of applications that is just okay.

@jonas-schievink
Copy link
Contributor

@therealprof You might be right about the safety of aliasing. Certainly, as it stands, Rust defers the UB decision to LLVM, which allows aliasing noalias/&mut pointers as long as their uses don't violate aliasing rules. However, Rust doesn't specify LLVM's behaviour. In fact, Rust doesn't really specify anything.

But giving the user aliasing &mut allows UB in safe code, which is a big no-no. I also don't understand your point about DMA - it also needs mutable access to the data, but the program doesn't need to (and isn't allowed to) access the memory while the transfer is in progress, and this is perfectly expressed by this proposal (peripherals and memory involved in a transfer are unusable until the application waits for the transfer to complete - no conflicting access is happening).

@perlindgren
Copy link
Collaborator

perlindgren commented Dec 10, 2017 via email

@therealprof
Copy link

@jonas-schievink A lot of languages implemented on LLVM do allow aliasing pointers so there shouldn't be any problem there.

My point re: aliasing was that aliasing (whether it happens in software an/or hardware is not relevant) is not necessarily a bad thing and sometimes quite expected.

Anyhow, I'm very much in favour of this proposal despite the confusion around the badness of aliased mut statics...

@pftbest
Copy link

pftbest commented Dec 11, 2017

@therealprof Other languages do not set noalias attribute on their pointers, so they don't have any problem with aliasing.

Here is a classical example of UB: https://godbolt.org/g/LrFMjL
Notice how removing pub keyword from function bar changes return value of abc

@therealprof
Copy link

@pftbest Fully understood that you can do really bad stuff with aliased pointers.

@pftbest
Copy link

pftbest commented Dec 11, 2017

you can do really bad stuff with aliased pointers

This is just a trivial example, the real code may be much more subtle. And it's not always about what you write, similar code may be generated after some optimization steps.

The whole idea of Rust is to make it impossible to trigger UB in a safe code. No matter how bad the code is, if it's in safe rust it should either not compile, or panic at runtime.

If you can trigger UB in a safe code that means your unsafe blocks are designed incorrectly.

@hannobraun
Copy link

@japaric Thanks for tagging me. I've looked over your proposal, and everything looks good to me, with two caveats:

  • I'm not intimitely familiar with the workings of &'static mut, so I might be missing something. It looks like you've done your homework though!
  • I'm largely unfamiliar with RTFM, so I might be missing something there, too.

I'm also a bit unclear on how this fits into the larger context. Specifically, how does this proposal relate to rust-embedded/embedded-hal#14? That proposal mentions static_ref::Ref, which seems to be a predecessor of Static. Am I correct in assuming that Static will be completely deprecated, even outside of RTFM?

It seems to me that Static wouldn't be useful, since it can only be created using unsafe, and once unsafe is an option, I can just create a &'static mut, and use that as suggested in this proposal.

@japaric
Copy link
Collaborator Author

japaric commented Dec 11, 2017

@hannobraun

Specifically, how does this proposal relate to rust-embedded/embedded-hal#14?

This proposal is for RTFM. I think, though, that Static, previously static_ref::Ref{,Mut}, is
not enough to create memory safe DMA APIs (I haven't documented the mem::swap problem but I think
Static is not enough to prevent it). Thus I think that the standardized DMA API that will end
in embedded-hal should use &'static mut references, not Static. That raises the question: what
will be the fate of non-RTFM applications where it's not safe to create &'static mut references?

Except that I just thought of a way to safely create &'static mut references in non-RTFM
applications. There is no free lunch though: this approach will incur in non-elidable runtime
checks. Here's the idea: a macro that creates &'static mut references using the singleton check I
introduced in rust-embedded/svd2rust#157:

// this macro could be simplified: for instance `$ident` is kind of useless
macro_rules! alloc {
    (static $ident:ident: $ty:ty = $expr:expr) => {
      cortex_m::interrupt::free(unsafe {
          static mut USED: bool = false;

          if USED {
              None
          } else {
              static mut $ident: $ty = $expr;
              let e: &'static mut $ty = &mut $ident;
              USED = true;
              Some(e)
          }
      })
    }
}

let a = alloc!(static BUFFER: [u8; 16] = [0; 16]).unwrap();
let b = alloc!(static BUFFER: [u8; 16] = [0; 16]).unwrap();

// OK `a` and `b` are not aliases -- they are pointing to different `static` variables
assert_ne!(a.as_ptr(), b.as_ptr());

fn alias() -> &'static mut [u8; 16] {
    alloc!(static BUFFER: [u8; 16] = [0; 16]).unwrap()
}

let c = alias();
// this will panic! if it didn't it would create an alias to the `BUFFER` variable in `alias`
let d = alias();

Am I correct in assuming that Static will be completely deprecated, even outside of RTFM?

I think the only user of Static is RTFM so yeah it will probably vanish into nothingness after
this change.

@therealprof
Copy link

@pftbest We're fully on the same page here and I'm absolutely not suggesting this to be used in any user facing crate.

@hannobraun
Copy link

@japaric Thanks for your reply. Interesting proposal. I don't have a firm opinion on the matter right now. Integrating DMA into one of my projects is on my todo list though, so I assume I'll have more to say then.

@japaric
Copy link
Collaborator Author

japaric commented Dec 23, 2017

Since this has received positive feedback and no objections I'm going to rubber stamp the RFC and land the open PRs.

I'll send another PR to cortex-m to discuss a checked version of this that works without RTFM.

@japaric japaric closed this as completed Dec 23, 2017
japaric pushed a commit that referenced this issue Dec 23, 2017
safe `&'static mut` references via init.resources

see RFC #59 for details
@japaric
Copy link
Collaborator Author

japaric commented Dec 23, 2017

I'll send another PR to cortex-m to discuss a checked version of this that works without RTFM.

See rust-embedded/cortex-m#70. Feedback on the macro syntax is welcome!

japaric pushed a commit that referenced this issue Dec 23, 2017
safe `&'static mut` references via init.resources

see RFC #59 for details
japaric referenced this issue in rust-embedded/cortex-m Jan 15, 2018
safe `&'static mut` references through a runtime checked macro

runtime checked implementation of japaric/cortex-m-rtfm#59 that doesn't depend on RTFM macros

TODO

- [ ] bikeshed macro syntax
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants