Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for std::cell::{Ref, RefMut}::leak #69099

Open
3 tasks
Tracked by #2
HeroicKatora opened this issue Feb 12, 2020 · 4 comments
Open
3 tasks
Tracked by #2

Tracking Issue for std::cell::{Ref, RefMut}::leak #69099

HeroicKatora opened this issue Feb 12, 2020 · 4 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@HeroicKatora
Copy link
Contributor

HeroicKatora commented Feb 12, 2020

#68712 adds methods to convert RefCell guards Ref/RefMut into references with the lifetime of the underlying cell.

The feature gate for the issue is #![feature(cell_leak)].

Unresolved Questions

  • Should similar methods be provided for MutexGuard and RwLockReadGuard/RwLockWriteGuard?
  • Should unsafe methods be added to forcefully revert a leak?
  • For undo_leak, would it make more sense not to return a reference (i.e., separate this from get_mut)?
@HeroicKatora HeroicKatora added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Feb 12, 2020
@jonas-schievink jonas-schievink added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Feb 12, 2020
Centril added a commit to Centril/rust that referenced this issue Mar 15, 2020
…tolnay

Add undo_leak to reset RefCell borrow state

This method is complementary for the feature cell_leak added in an
earlier PR. It allows *safely* reverting the effects of leaking a borrow guard by
statically proving that such a guard could not longer exist. This was
not added to the existing `get_mut` out of concern of impacting the
complexity of the otherwise pure pointer cast and because the name
`get_mut` poorly communicates the intent of resetting remaining borrows.

This is a follow-up to rust-lang#68712 and uses the same tracking issue, rust-lang#69099,
as these methods deal with the same mechanism and the idea came up
[in a review comment](rust-lang#68712 (comment)).

@dtolnay who reviewed the prior PR.
cc @RalfJung
@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Jul 29, 2020
gomain added a commit to mightyiam/bookkeeping that referenced this issue Nov 17, 2020
Came accross the lifetime issue again when `Book::running_balance` returns
`Transaction<'a>` where `'a` is of the book. This too does not cross the
`std::cell::Ref<'_, T>` boundry. However, there is a way to get to the underlying
lifetime through an unstable feature 'cell_leak'

see rust-lang/rust#69099

This allows the borrow checker to life-check!
@Amanieu
Copy link
Member

Amanieu commented Dec 22, 2021

We discussed this during the libs meeting today and feel that there is a lack of a specific use case for these methods. @HeroicKatora (or anyone else) could you describe how you are using these methods in your own code?

In particular we feel that undo_leaks is unlikely to be useful in practice: if you've got a RefCell then you are unlikely to ever be able to get a mutable reference to it (except maybe in drop, but then you can just use get_mut and ignore the ref count).

@HeroicKatora
Copy link
Contributor Author

To expand on my original use case, consider an embedded network stack. More
specifically, a routing table in an almost static network, that is changing
infrequently (i.e. statically configured). This table will contain addresses of
some relevant remote machines, and is initialized by state machines that
discover those remotes in the network and insert their addresses and paths to
the table. For reasons of avoiding allocation during operation, the table
itself is represented by an unsized slice type: OrdSlice<Addr>.

Since multiple independent actors are supposed to access the table by methods
taking &mut OrdSlice<_>, to insert their endpoints, we must employ a form of
runtime borrow checking. I had initially considered an unsafe implementation
until noting that I would just be replicating the exact fields of RefCell and
figuring out it could fit into its types.

Note that each insert operation might modify the whole slice to keep it
sorted so we can't use a simple entry-based API where we split the borrow
upfront in this case. (Technically, some GhostCell variant could maybe be
used as well but at the time it wasn't nearly as established and I did not want
to have too much unsafe proof obligations floating around). The code that
runs after the operation expects a reference to the whole table, mostly
immutable but noalias is good for codegen and there may be modifications
during runtime, and we really did not want the details of the setup or the
RefCell wrapping to leak through to that module. More on evaluation of
alternatives later. Simplified code sketch of the structure involved:

/// A slice of ordered values, with bisect lookup.
/// Note: builtin unsizing by choosing `U = [T; N]`.
pub struct OrdSlice<T, U: ?Sized = [T]>(PhantomData<T>, U);

mod netstack {
    pub struct Runtime<'net> {
        _foo: &'net mut OrdSlice<Addr>,
    }
}

mod embedded_app {
    pub struct FindFoo<'net> {
        target: &'net RefCell<OrdSlice<Addr>>,
    }

    enum Network<'net> {
        Setup {
            target: &'net RefCell<OrdSlice<Addr>>,
            with_foo_service: Option<FindFoo<'net>>,},
        Running {
            net: Runtime<'net>,},
    }
}

There are several key reasons why RefCell (and likewise constructs) are
avoided in the netstack module. Firstly it is used by other applications,
some of which choose to allow dynamic allocations and thus pass around owned,
copied Vec<_> instances instead of the more complex scheme of embedded_app.
By not using Ref in the stack itself we allow both usage patterns without
complex generic parameters. Also note the stability implication of using
&RefCell<T>. What is required to freely utilize the object in this logically
distinct code module would be a cell that has no outstanding borrows. Which is
just &mut T with extra steps, in particular depending on absence of
programmer misuse.

Actual usage (with initialization for completeness) is then as follows:

fn ingress_step<'net>(
    network: &mut Network<'net>,
    queues: &mut,) {
    match network {
        Setup { target, with_foo_service,} => {
            // some setup io..
            if let Some(foo) = &with_foo_service {
                foo.ingress(queues);
            }

            // Done with setup?
            if !with_foo_service.as_ref().map_or(true, FindFoo::is_done) {
                return;
            }
            
            // Switch to active service mode.
            network = Network::Running {
                net: RefMut::leak(target.borrow_mut()),};
        },
        Running { net } => route_packets(net, queues),
    }
}

Which we can run with error recovery by utilizing undo_leak:

// Modified and shortened obviously..
fn run(addr: &RefCell<OrdSlice<Addr>>) -> Result<()> {
    let mut state = Network::new(addr);
    let mut queues = …;

    loop {
        ingress(&mut state, &mut queues);
        egress(&mut state, &mut queues);

        queues.errors()?;
    }

    Ok(())
}

fn main() {
    let addr: OrdSlice<_, [_; 16]> = Default::default();
    let mut addr = RefCell::new(addr);

    loop {
        // Otherwise, second `run` will panic.
        RefCell::undo_leak(&mut addr);

        if let Err(_) = run(&addr) {}
    }
}

Alternatives

… and why they were not chosen.

  • Construct Runtime<'net> transiently from a &'net OrdSlice<Addr> in each
    iteration of the loop body. This sidesteps the lifetime issue by storing a
    reference to the original RefCell instead. However, this is not very clean
    code as Runtime can no longer protect any inner invariants—possibly involving
    correctness checks in each iteration as part of construction.
  • Introduce a different lifetime in Network, just complicates the
    implementation which is duplicated a lot and is quite unreadable. Note that
    this style can introduce unexpected implicit and explicit lifetime bound
    requirements, adding to the effect.
  • Replace the Network state machine with code. Break run into separate
    loops for each of the states, introducing a Ref in the second half, which
    is borrowed for the Runtime. This duplicates the ingress/egress portion,
    error handling and forces all similar state machines to be exposed at that
    level.
  • Without undo_leak we could also just exit and have the outer system deal with
    error recovery and restart. This gives less control over the precise behavior
    during recovery, which might be fine as I'm merely using it for logging.
    We're losing some program state, however, which might also be worked around
    with shared memory files. It's simply more convenient to handle it in the
    program, though.

@DavidVonDerau
Copy link

Note that each insert operation might modify the whole slice to keep it
sorted so we can't use a simple entry-based API where we split the borrow
upfront in this case. (Technically, some GhostCell variant could maybe be
used as well but at the time it wasn't nearly as established and I did not want
to have too much unsafe proof obligations floating around). The code that
runs after the operation expects a reference to the whole table, mostly
immutable but noalias is good for codegen and there may be modifications
during runtime, and we really did not want the details of the setup or the
RefCell wrapping to leak through to that module.

I have a very similar situation that warrants this functionality, including the need for undo_leaks.

I think that this comes up in any situation with the following constraints:

  1. The code is running in a loop.
  2. The code must use run-time borrow checking. Normally, this is due to something like a collection where different parts of it can be read / mutated independently, but it's determined at run-time and you can't use a "disjoint" entry API.
  3. The data is accessed in a way where the lifetime of the Ref is too short -- e.g. in a loop where each Ref must be dropped before the next loop iteration.

My simplified example looks something like the following:

struct State {
    wrapper: Wrapper,
    counter: u64,
}

// Wrapper to make sure this isn't Clone / Copy.
struct DataType(u64);

struct FrameData<'sim> {
    data: &'sim DataType
}

mod inner {
    use std::cell::RefCell;
    use super::*;

    // This wrapper is needed so that nothing else can call `borrow` or `borrow_mut`.
    pub struct Wrapper {
        data: RefCell<DataType>
    }

    impl Wrapper {
        pub fn new() -> Self {
            Self {
                data: RefCell::new(DataType(0))
            }
        }

        pub fn simulate(&mut self, new_value: DataType) {
            *self.data.borrow_mut() = new_value;
        }

        pub fn wrapper_frame_data<'frame, 'sim: 'frame>(&'sim self, frame_data: &mut Vec<FrameData<'frame>>) {
            let guard = self.data.borrow();
            let data: &DataType = &*guard;

            // Transmute the data to the `sim lifetime, because we know that the state can't be modified during that lifetime.
            let data: &'sim DataType = unsafe { std::mem::transmute(data) };
            frame_data.push(FrameData {
                data
            });
        }
    }
}

use inner::Wrapper;

fn simulate(state: &mut State) {
    state.counter += 1;

    // This is the only place where the `wrapper` could be modified -- we need a &mut reference.
    state.wrapper.simulate(DataType(state.counter));
}

fn state_frame_data<'frame, 'sim: 'frame>(state: &'sim State, frame_data: &mut Vec<FrameData<'frame>>) {
    // `wrapper` cannot be mutated or dropped while `&'sim State` lives.
    state.wrapper.wrapper_frame_data(frame_data);
}

fn render(state: &State) {
    let mut frame_data = Vec::new(); // <-- This definitely lives less time than `State`.
    state_frame_data(state, &mut frame_data);
    for data in frame_data {
        println!("{}", data.data.0);
    }
}

fn main() {
    let mut state = State {
        wrapper: Wrapper::new(),
        counter: 0,
    };

   loop {
        simulate(&mut state);
        render(&state);
    }
}

Using the API, all I'd need to do is make 2 changes:

pub fn simulate(&mut self, new_value: DataType) {
    RefCell::undo_leak(&mut self.data);
    *self.data.get_mut() = new_value;
}

pub fn wrapper_frame_data<'frame, 'sim: 'frame>(&'sim self, frame_data: &mut Vec<FrameData<'frame>>) {
    let guard = self.data.borrow();
    let data: &DataType = &*guard;
    
    // No more unsafe!                                                                                                   
    let data: &'sim DataType = Ref::leak(guard);
    frame_data.push(FrameData {
        data
    });
}

@c410-f3r
Copy link
Contributor

What is the current status of this feature?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants