Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

💡 discussion: lazy!/once! macros #189

Closed
CAD97 opened this issue Jul 26, 2022 · 5 comments
Closed

💡 discussion: lazy!/once! macros #189

CAD97 opened this issue Jul 26, 2022 · 5 comments

Comments

@CAD97
Copy link

CAD97 commented Jul 26, 2022

The main disadvantage of the Lazy type over lazy_static! is that Lazy<T> necessarily stores and calls fn() -> T rather than impl FnOnce() -> T. This means that any inlining of the initialization must happen via devirtualization, rather than coming for free with monomorphization.

On the other hand, perhaps this is an advantage; initialization is #[cold] and only called once, so inlining the initialization codepath may serve more to make inlining callers more costly than to make initialization less expensive.

What doesn't suffer from the (perceived?) virtualization cost is the fn get_global() -> &'static Global pattern using OnceCell directly rather than through Lazy. Using a macro interface, we can make this pattern more accessible.

Presented for consideration, a potential implementation:

#![feature(type_alias_impl_trait)]

pub use once_cell::sync::*;
#[doc(hidden)]
pub use std::{ops::Deref, marker::Sized};

pub trait LazyStatic: Deref {
    /// Force initialization of this lazy static.
    ///
    /// Calling this function is a hint that this is called once
    /// during startup, and could benefit from inlining.
    ///
    /// For normal usage, use [`Deref`][<Self as Deref>::deref] instead;
    /// `deref` hints that the initialization path is `#[cold]`.
    #[inline]
    #[allow(unused_attributes)] // documentation
    fn force(this: &Self) -> &'static Self::Target;
}

#[macro_export]
macro_rules! __once_cell__sync__lazy {
    // item position lazy! { vis static NAME: Ty = expr; }
    // expands to:
    // vis static NAME: impl LazyStatic<Target = Ty>;
    {
        $(#[$meta:meta])* 
        $vis:vis static $STATIC:ident: $Type:ty = $build:expr
        $(; $($continue:tt)*)?
    } => {
        // To avoid TAIT, expose the Impl type instead.
        type $STATIC = impl $crate::LazyStatic<Target=$Type>;
        $(#[$meta])*
        $vis static $STATIC: $STATIC = {
            use $crate::LazyStatic as _;
            static ONCE: $crate::OnceCell<$Type> = $crate::OnceCell::new();

            struct Impl;
            impl $crate::LazyStatic for Impl {
                #[inline]
                fn force(_this: &Self) -> &'static $Type {
                    ONCE.get_or_init(|| $build)
                }
            }
            impl $crate::Deref for Impl {
                type Target = $Type;
                #[inline]
                fn deref(&self) -> &'static $Type {
                    #[cold]
                    fn init(this: &Impl) -> &'static $Type {
                        Impl::force(this)
                    }
                    ONCE.get().unwrap_or_else(|| init(self))
                }
            }

            Impl
        };

        // continue processing multiple items
        $($crate::lazy! {
            $($continue)*
        })?
    };

    // item position lazy! { vis fn name() -> Ty { block_content }
    // expands to:
    // vis fn name() -> Ty { once! { block_content } }
    {
        $(#[$meta:meta])* 
        $vis:vis fn $name:ident() -> &'static $Ret:ty {
            $($content:tt)*
        }
        $($($continue:tt)+)?
    } => {
        $(#[$meta])*
        $vis fn $name() -> &'static $Ret {
            // inlining once! can avoid TAIT since we know $Ret
            $crate::once! { $($content)* }
        }
        
        // continue processing multiple items
        $($crate::lazy! {
            $($continue)+
        })?
    };

    // empty
    {} => {};
}
pub use __once_cell__sync__lazy as lazy;

#[macro_export]
macro_rules! __once_cell__sync__once {
    // expression position once!( block_content )
    // expands to:
    // static ONCE: OnceCell<_>; ONCE.get_or_init(|| block_content)
    ( $($content:tt)* ) => {{
        // TAIT is transparent to the calling scope, but
        // only if the it ascribes the type at the callsite.
        #[allow(nonstandard_style)]
        type NEEDS_TYPE_ANNOTATION = impl $crate::Sized;
        static ONCE: $crate::OnceCell<NEEDS_TYPE_ANNOTATION> = $crate::OnceCell::new();
        ONCE.get_or_init(|| { $($content)* })
    }};
}
pub use __once_cell__sync__once as once;

Actually... perhaps lazy! { fn } should actually be memoize! or similar, since that matches the semantics (take the output and cache it) moreso than lazy!. But this is how the proof of concept went.

Bonus far-future example: if we get function assignment, it would be cool to write

fn make() -> T;
fn get_global() -> &'static T = once!(make());

instead of lazy! { fn }.

I understand that part of the draw of once_cell is the macroless API, but providing optional macros with straightforward documented translation1 can help as a documentation point for common-to-the-point-of-generalization patterns for using the API.

It probably makes more sense to keep the simple API for the time being (e.g. because the nice impls I provide rely on TAIT in order to be properly nice), but if/when std's version of OnceCell is available, it might make sense to wrap some common patterns into simple macros in this (or another) crate.

Footnotes

  1. Straightforward in the expository documentation, even if the actual expansion is more complicated. macro_rules! always have to do interesting things to achieve proper hygiene when defining items.

@matklad
Copy link
Owner

matklad commented Jul 26, 2022

On the other hand, perhaps this is an advantage; initialization is #[cold] and only called once, so inlining the initialization codepath may serve more to make inlining callers more costly than to make initialization less expensive.

Initialization is also dyn, so some devirtualization has to happen:

fn initialize_or_wait(queue: &AtomicUsize, mut init: Option<&mut dyn FnMut() -> bool>) {

Another problem with the macro is that introduces a fresh type per lazy instance, (which also leaks into the API).

So, I don't think this shold be included in the library, or in std.

But it might be a good fit for the recipes section of the docs!

https://github.com/matklad/once_cell/blob/master/src/lib.rs#L27

@notgull
Copy link

notgull commented Jul 26, 2022

I agree with matklad here. A Lazy storing anything aside from a fn() -> T is a code smell that indicates it should be replaced with a OnceCell.

@CAD97
Copy link
Author

CAD97 commented Jul 27, 2022

I think I'm in agreement: the patterns implemented by these macros can be documented in the docs and that's sufficient to close this "issue." Providing them as macros is not really necessary (especially given the reliance on TAIT for niceness; having static GLOBAL: impl Lazy<Resource>; I think is fine, whereas a custom type is significantly heavier and less nice).


A Lazy storing anything aside from a fn() -> T is a code smell that indicates it should be replaced

Is this justification for removing the second generic on Lazy? Doing so for once_cell is probably just breaking for minimal gain, but doing so for std's unstable version might be beneficial.

Note that a simple let it = Lazy::new(make); will use the fn()->T {make} ZST rather than the erased fn() -> T function pointer.

Rustc is smart enough to say expected `fn() -> T` found `T` instead of expected `Lazy<T, fn() -> T>` found `Lazy<T, T>` when calling Lazy::new(T), so perhaps there's no reason to prevent the fully inferred case from using dataful closures and monomorphized ZST function types.

with a OnceCell.

This isn't always possible (without just rewriting Lazy); consider:

let big_data = BigData::acquire();
let lazy = LazyCell::new(|| process(big_data));

if maybe() {
    observe(&*lazy);
}

if maybe() {
    observe(&*lazy);
}

The only way to rewrite that with OnceCell or Lazy<_, fn() -> _> is to change semantics (only acquire BigData inside the lazy callback) or reinvent Lazy by mem::takeing big_data into the closure, since it can't be FnOnce in multiple arms.

@matklad
Copy link
Owner

matklad commented Sep 15, 2022

A Lazy storing anything aside from a fn() -> T is a code smell

Yeah, I think I wouldn't agree with that -- it is sometimes (but rarely) useful to use a Lazy local variable (which is exactly CAD97's example)

@matklad
Copy link
Owner

matklad commented Sep 15, 2022

Closing! This is a useful discussion, but I think conclusion is that we don't actually want to change anything!

@matklad matklad closed this as not planned Won't fix, can't repro, duplicate, stale Sep 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants