Initial state once cell value #688

@IpFruion

Proposal

Problem statement

Provide a mechanism to lazily initialize a value from some initial state, storing the value only when initialization succeeds so that a failed attempt can be retried.

Motivating examples or use cases

Currently, OnceCell<T> and LazyCell<T> provide mechanisms to defer initialization until later. However, neither seems to fully solve the problem statement above.

Example 1 LazyCell<T>

Given a long-lived structure in an application, with its new and request functions:

pub struct App {
   // A capturing closure is not a `fn` pointer, so the initializer type is boxed here
   client: LazyCell<Result<Client, ClientError>, Box<dyn FnOnce() -> Result<Client, ClientError>>>,
}
impl App {
   pub fn new(settings: ClientSettings) -> Self {
      App { 
         client: LazyCell::new(Box::new(move || {
            Ok(Client::new(settings)?)
         }))
      }
   }
   pub fn request(&self, ...) -> Result<Response, RequestError> {
      // Initialization of the client is only attempted once, so an `Err` stays cached
      // This can lead to issues where something on the system might have changed and retrying to build the client might be necessary
      let response = LazyCell::force(&self.client).as_ref().map_err(|err| err.clone())?.send(...)?;
      Ok(response)
   }
}

Here we have a way to initialize the client, but initialization can fail and the initializer runs only once (on the first call to request(...)). A developer may need a mechanism that, instead of attempting initialization exactly once, tries again if the function fails (as with OnceCell::get_or_try_init, where the cell stays uninitialized if the initializer returns an error).
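The "only attempted once" behaviour can be checked with a minimal sketch. The `lazy_cell_attempts` helper and the error message are hypothetical and exist only for illustration; the cell itself is the standard `std::cell::LazyCell`.

```rust
use std::cell::{Cell, LazyCell};

/// Accesses a failing `LazyCell` twice and returns how often the
/// initializer actually ran.
fn lazy_cell_attempts() -> u32 {
    let attempts = Cell::new(0u32);
    let cell: LazyCell<Result<u32, String>, _> = LazyCell::new(|| {
        attempts.set(attempts.get() + 1);
        Err("backend unavailable".to_string())
    });

    // The first access runs the initializer and caches the `Err`.
    assert!(cell.is_err());
    // The second access sees the cached `Err`; the initializer is not retried.
    assert!(cell.is_err());

    attempts.get()
}
```

Calling `lazy_cell_attempts()` reports a single attempt, which is the limitation described above: once the `Err` is stored, the cell offers no way to try again.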

Example 2 OnceCell<T>

To make initialization retryable (and store the client only on success), we could instead use OnceCell<T>:

pub struct App {
   settings: ClientSettings,
   client: OnceCell<Client>
}
impl App {
   pub fn new(settings: ClientSettings) -> Self {
      App {
         settings,
         client: OnceCell::new(),
      }
   }
   pub fn request(&self, ...) -> Result<Response, RequestError> {
      // we have to clone these settings here every time, even if the client is already initialized
      let settings = self.settings.clone();
      let response = self.client.get_or_try_init(move || {
         let client = Client::new(settings)?;
         Ok(client)
      })?.send(...)?;
      Ok(response)
   }
}

This lets us retry initialization, but it keeps a long-lived ClientSettings around that is never used again once the Client is initialized. With a different structure, the ClientSettings could be dropped at that point.
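A minimal sketch of the same retry-on-failure pattern using only the stable `OnceCell::get` and `OnceCell::get_or_init` methods is shown below (as far as I am aware, `get_or_try_init` is still unstable in std at the time of writing). `Settings`, `Client`, `connect`, and the `reachable` switch are hypothetical stand-ins for illustration, not part of any real library.

```rust
use std::cell::{Cell, OnceCell};

// Hypothetical stand-ins for the `ClientSettings` and `Client` types above.
struct Settings {
    url: String,
}
struct Client {
    url: String,
}

struct App {
    settings: Settings,       // stays alive for as long as `App`, even after success
    client: OnceCell<Client>, // filled only when `connect` succeeds
    reachable: Cell<bool>,    // hypothetical switch simulating a flaky environment
}

impl App {
    /// Hypothetical fallible constructor for the client.
    fn connect(&self) -> Result<Client, String> {
        if self.reachable.get() {
            // The settings are only cloned when an attempt is actually made.
            Ok(Client { url: self.settings.url.clone() })
        } else {
            Err(format!("cannot reach {}", self.settings.url))
        }
    }

    fn client(&self) -> Result<&Client, String> {
        if let Some(client) = self.client.get() {
            return Ok(client);
        }
        // On failure the cell stays empty, so a later call can retry.
        let client = self.connect()?;
        Ok(self.client.get_or_init(|| client))
    }
}
```

In this sketch the clone can be deferred to the slow path, but the `settings` field still has to live inside `App` for its whole lifetime, which is the cost the paragraph above describes.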

Solution sketch

The idea behind this proposal is to combine LazyCell and OnceCell so that some initial state is stored in the cell and passed to each initialization attempt. Let's call this new structure StateCell (the name is not final):

pub struct StateCell<I, T> {
    state: UnsafeCell<State<I, T>>,
}

// Very similar to `LazyCell<T>`'s internal `State<T, F>` 
enum State<I, T> {
   Uninit(I),
   Init(T),
   Poisoned,
}

Internally this looks almost exactly like LazyCell<T>'s structure, except that it drops the default type parameter of LazyCell<T, F = fn() -> T>, which makes State::Uninit hold a function by default. Here State::Uninit can hold any initial state I.
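One consequence of keeping the initial state and the value in a single enum is that they share storage, rather than sitting side by side as in the OnceCell example. The sketch below compares the two layouts; the `Blob` alias and the two helper functions are hypothetical, and exact sizes depend on the compiler's layout choices.

```rust
use std::mem::size_of;

// Same shape as the `State` enum above.
#[allow(dead_code)]
enum State<I, T> {
    Uninit(I),
    Init(T),
    Poisoned,
}

// Hypothetical 64-byte payload standing in for both the state and the value.
type Blob = [u8; 64];

/// Size of the enum, where `I` and `T` occupy the same bytes.
fn enum_size() -> usize {
    size_of::<State<Blob, Blob>>()
}

/// Size of a layout that stores the state next to an optional value.
fn side_by_side_size() -> usize {
    size_of::<(Blob, Option<Blob>)>()
}
```

The enum needs room for only one payload at a time, so `enum_size()` comes out smaller than `side_by_side_size()`. Any heap memory owned by the initial state is also released once the state is replaced by the value.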

We can then build OnceCell<T>-style functions on top of this, such as get_or_try_init:

impl<I, T> StateCell<I, T> {
    pub fn get_or_try_init<F, E>(&self, f: F) -> Result<&T, E>
    where
        F: FnOnce(&I) -> Result<T, E>,
    {
        // SAFETY:
        // This invalidates any mutable references to the data. The resulting
        // reference lives either until the end of the borrow of `self` (in the
        // initialized case) or is invalidated in `really_try_init` (in the
        // uninitialized case; `really_try_init` will create and return a fresh reference).
        let state = unsafe { &*self.state.get() };
        match state {
            State::Init(i) => Ok(i),
            State::Poisoned => panic_poisoned(),
            // SAFETY: The state is uninitialized.
            State::Uninit(_) => unsafe { self.really_try_init(f) },
        }
    }
    unsafe fn really_try_init<F, E>(&self, f: F) -> Result<&T, E>
    where
        F: FnOnce(&I) -> Result<T, E>,
    {
        // Swap out the initial state with the poisoned state to make sure the value is only alive for this function call
        // This helps prevent a case where `f` panics and the state should be poisoned like in `LazyCell`
        let i = unsafe { self.swap_poisoned() };
        let data = match f(&i) {
            Ok(data) => data,
            Err(err) => {
                // An error occurred and thus we want to reset the internal state
                unsafe { self.state.get().write(State::Uninit(i)) };
                return Err(err);
            }
        };
        
        // Otherwise we have been successful and we want to initialize the data.
        Ok(unsafe { self.set_data(data) })
    }
}

The bodies of these functions are very similar to LazyCell's. The difference is that the initializer is supplied later, when the value is accessed, rather than when the cell is constructed.
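The intended semantics can also be sketched in safe, stable Rust. This is not the proposed implementation: it swaps the single `UnsafeCell<State<I, T>>` for a `Cell<Option<I>>` plus a `OnceCell<T>`, so it gives up the shared-storage layout, and an empty state slot plays the role of `Poisoned`. The `demo` function is a hypothetical walk-through.

```rust
use std::cell::{Cell, OnceCell};

pub struct StateCell<I, T> {
    state: Cell<Option<I>>, // `None` while `f` runs, after a panic, or after success
    value: OnceCell<T>,
}

impl<I, T> StateCell<I, T> {
    pub fn new(state: I) -> Self {
        StateCell { state: Cell::new(Some(state)), value: OnceCell::new() }
    }

    pub fn get_or_try_init<F, E>(&self, f: F) -> Result<&T, E>
    where
        F: FnOnce(&I) -> Result<T, E>,
    {
        if let Some(value) = self.value.get() {
            return Ok(value);
        }
        // Take the state out while `f` runs. If `f` panics or re-enters this
        // cell, the slot stays empty and later calls panic here.
        let state = self.state.take().expect("StateCell is poisoned");
        match f(&state) {
            // Success: store the value; `state` is dropped when this function returns.
            Ok(value) => Ok(self.value.get_or_init(|| value)),
            // Failure: put the state back so a later call can retry.
            Err(err) => {
                self.state.set(Some(state));
                Err(err)
            }
        }
    }
}

/// Hypothetical walk-through: fail once, succeed, then hit the cached value.
fn demo() -> (bool, usize, usize) {
    let cell: StateCell<String, usize> = StateCell::new("settings".to_string());
    let first = cell.get_or_try_init(|_| Err::<usize, &str>("not yet")).is_err();
    let second = *cell.get_or_try_init(|s| Ok::<usize, &str>(s.len())).unwrap();
    // The closure is not called here because the value is already stored.
    let third = *cell.get_or_try_init(|_| Err::<usize, &str>("unused")).unwrap();
    (first, second, third)
}
```

The failed first attempt leaves the cell retryable, the second attempt stores the value, and the third call returns the stored value without running its closure.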

Example Usage

Our App example can now be written as:

pub struct App {
   client: StateCell<ClientSettings, Client>
}
impl App {
   pub fn new(settings: ClientSettings) -> Self {
      App {
         client: StateCell::new(settings),
      }
   }
   pub fn request(&self, ...) -> Result<Response, RequestError> {
      let response = self.client.get_or_try_init(|settings: &ClientSettings| {
         let client = Client::new(settings)?;
         Ok(client)
      })?.send(...)?;
      Ok(response)
   }
}

Alternatives

One alternative would be to relax the bounds on F in LazyCell<T, F>, i.e. allow F: Fn() -> Result<T, E>, but this adds complexity to the trait bounds already in place for Deref and other implementations.

Another option would be to relax LazyCell<T, F> completely, so that F need not be a function but can be some initial value, and to provide some of OnceCell's functions on LazyCell only when F is not an FnOnce. This goes against much of LazyCell's original design and seems likely to break parts of it.

I didn't recommend these as solutions because they would significantly change the existing API, whereas the proposal above only adds a new type.

This could certainly be published as a crate on crates.io. (I am new to open source development and wanted to bring this problem, and a possible solution, to where I think it would serve the widest audience.) Another reason to propose it here is that the implementation involves some unsafe code (with safety guarantees very similar to those of LazyCell and OnceCell), so a standalone crate would get less public scrutiny. I would also be wary of adding a separate crate for a structure that seems to fit in between two standard library types.

Links and related work

I don't have any other links here yet but will add them when / if I find them. I have started drafting up what the solution might look like in code (plus some tests to see if I am on the right track) and can add the fork (branch? not sure what the best way to contribute would be) to this issue.

Lastly, I would just like to state that I am new to the contributing space and I am happy to help discuss, chat, and design out the solution (or other solutions that others might recommend). I may be missing something obvious that already solves this problem and would be glad to review it if it meets the needs I have in mind. I appreciate any feedback and suggestions.

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

Labels: T-libs-api, api-change-proposal (A proposal to add or alter unstable APIs in the standard libraries)
