Moka get_with inflates future size by ~7x #212

Closed
Swatinem opened this issue Jan 16, 2023 · 14 comments
Labels: enhancement (New feature or request)


@Swatinem
Contributor

Consider the following example:

fn main() {
    let cache = moka::future::Cache::new(1);
    let get_fut = async {
        let buf = [0u8; 1024];
        async {}.await;
        drop(buf);
    };
    println!("get_fut size: {}", std::mem::size_of_val(&get_fut));
    let moka_fut = cache.get_with((), get_fut);
    println!("moka_fut size: {}", std::mem::size_of_val(&moka_fut));
}

This outputs the following (even in --release mode):

get_fut size: 1026
moka_fut size: 7688

The resulting future is roughly 7.5x the size of the future I provide (7688 vs. 1026 bytes).

The problem does not seem to be affected by rust-lang/rust#62321 though.
If I comment out the two println statements, I can use a nightly compiler to log the type sizes, using cargo +nightly rustc -- -Zprint-type-sizes:

-Zprint-type-sizes
print-type-size type: `[async fn body@moka::future::Cache<(), ()>::get_with<[async block@src\main.rs:3:19: 7:6]>::{closure#0}]`: 7688 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Suspend0`: 7684 bytes
print-type-size         field `.key`: 0 bytes, offset: 0 bytes, alignment: 1 bytes
print-type-size         field `.key`: 0 bytes
print-type-size         field `.__awaitee`: 6648 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size         field `..generator_field2`: 1 bytes
print-type-size         field `..generator_field3`: 1 bytes
print-type-size     variant `Unresumed`: 7682 bytes
print-type-size         field `.key`: 0 bytes, offset: 0 bytes, alignment: 1 bytes
print-type-size         padding: 6648 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     variant `Returned`: 7682 bytes
print-type-size         field `.key`: 0 bytes, offset: 0 bytes, alignment: 1 bytes
print-type-size         padding: 6648 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     variant `Panicked`: 7682 bytes
print-type-size         field `.key`: 0 bytes, offset: 0 bytes, alignment: 1 bytes
print-type-size         padding: 6648 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     end padding: 3 bytes
print-type-size type: `[async fn body@moka::future::Cache<(), ()>::get_or_insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}]`: 6648 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Suspend0`: 6647 bytes
print-type-size         field `.__awaitee`: 5568 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.hash`: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size         field `.maybe_v`: 1 bytes
print-type-size         field `..generator_field4`: 1 bytes
print-type-size         field `..generator_field5`: 1 bytes
print-type-size         field `..generator_field6`: 1 bytes
print-type-size         field `..generator_field7`: 1 bytes
print-type-size     variant `Unresumed`: 6642 bytes
print-type-size         padding: 5567 bytes
print-type-size         field `.hash`: 8 bytes, alignment: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         padding: 16 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     variant `Returned`: 6642 bytes
print-type-size         padding: 5567 bytes
print-type-size         field `.hash`: 8 bytes, alignment: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         padding: 16 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     variant `Panicked`: 6642 bytes
print-type-size         padding: 5567 bytes
print-type-size         field `.hash`: 8 bytes, alignment: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         padding: 16 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size type: `std::mem::ManuallyDrop<[async fn body@moka::future::Cache<(), ()>::get_or_insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}]>`: 6648 bytes, alignment: 8 bytes
print-type-size     field `.value`: 6648 bytes
print-type-size type: `std::mem::MaybeUninit<[async fn body@moka::future::Cache<(), ()>::get_or_insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}]>`: 6648 bytes, alignment: 8 bytes
print-type-size     variant `MaybeUninit`: 6648 bytes
print-type-size         field `.uninit`: 0 bytes
print-type-size         field `.value`: 6648 bytes
print-type-size type: `[async fn body@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}]`: 5568 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Suspend0`: 5563 bytes
print-type-size         field `.hash`: 8 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.hash`: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         field `.__awaitee`: 4472 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size         field `..generator_field5`: 1 bytes
print-type-size     variant `Unresumed`: 5562 bytes
print-type-size         field `.hash`: 8 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         padding: 4504 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     variant `Returned`: 5562 bytes
print-type-size         field `.hash`: 8 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         padding: 4504 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     variant `Panicked`: 5562 bytes
print-type-size         field `.hash`: 8 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.replace_if`: 8 bytes
print-type-size         padding: 4504 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     end padding: 4 bytes
print-type-size type: `std::mem::ManuallyDrop<[async fn body@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}]>`: 5568 bytes, alignment: 8 bytes
print-type-size     field `.value`: 5568 bytes
print-type-size type: `std::mem::MaybeUninit<[async fn body@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}]>`: 5568 bytes, alignment: 8 bytes
print-type-size     variant `MaybeUninit`: 5568 bytes
print-type-size         field `.uninit`: 0 bytes
print-type-size         field `.value`: 5568 bytes
print-type-size type: `[async fn body@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::init_or_read<'_, [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#1}]>::{closure#0}]`: 4472 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Suspend0`: 4468 bytes
print-type-size         field `.__awaitee`: 3360 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.get`: 32 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.insert`: 24 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size         field `..generator_field2`: 1 bytes
print-type-size         field `..generator_field3`: 1 bytes
print-type-size     variant `Unresumed`: 4466 bytes
print-type-size         padding: 3359 bytes
print-type-size         field `.get`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.insert`: 24 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     variant `Returned`: 4466 bytes
print-type-size         padding: 3359 bytes
print-type-size         field `.get`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.insert`: 24 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     variant `Panicked`: 4466 bytes
print-type-size         padding: 3359 bytes
print-type-size         field `.get`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.insert`: 24 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size     end padding: 3 bytes
print-type-size type: `std::mem::ManuallyDrop<[async fn body@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::init_or_read<'_, [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#1}]>::{closure#0}]>`: 4472 bytes, alignment: 8 bytes
print-type-size     field `.value`: 4472 bytes
print-type-size type: `std::mem::MaybeUninit<[async fn body@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::init_or_read<'_, [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#1}]>::{closure#0}]>`: 4472 bytes, alignment: 8 bytes
print-type-size     variant `MaybeUninit`: 4472 bytes
print-type-size         field `.uninit`: 0 bytes
print-type-size         field `.value`: 4472 bytes
print-type-size type: `[async fn body@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::do_try_init<'_, (), (), [closure@moka::future::value_initializer::make_pre_init<(), (), [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}]>::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::init_or_read<'_, [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#1}]>::{closure#0}::{closure#0}]>::{closure#0}]`: 3360 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Suspend1`: 3353 bytes
print-type-size         field `.pre_init`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.pre_init`: 32 bytes
print-type-size         field `.cht_key`: 16 bytes
print-type-size         padding: 16 bytes
print-type-size         field `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.hash`: 8 bytes, alignment: 8 bytes
print-type-size         field `.waiter`: 8 bytes
print-type-size         field `..generator_field9`: 8 bytes
print-type-size         field `.waiter_guard`: 56 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         padding: 1026 bytes
print-type-size         field `.init`: 1026 bytes, alignment: 1 bytes
print-type-size         field `..generator_field16`: 1 bytes
print-type-size         field `..generator_field17`: 1 bytes
print-type-size         field `..generator_field18`: 1 bytes
print-type-size         field `..generator_field19`: 1 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.__awaitee`: 1026 bytes, alignment: 1 bytes
print-type-size     variant `Suspend0`: 2439 bytes
print-type-size         field `.pre_init`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.pre_init`: 32 bytes
print-type-size         field `.cht_key`: 16 bytes
print-type-size         padding: 16 bytes
print-type-size         field `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         field `.retries`: 8 bytes
print-type-size         field `.hash`: 8 bytes
print-type-size         field `.waiter`: 8 bytes
print-type-size         padding: 64 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size         field `..generator_field16`: 1 bytes
print-type-size         field `..generator_field17`: 1 bytes
print-type-size         field `..generator_field18`: 1 bytes
print-type-size         padding: 9 bytes
print-type-size         field `.__awaitee`: 112 bytes, alignment: 8 bytes
print-type-size     variant `Suspend3`: 2375 bytes
print-type-size         field `.pre_init`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.pre_init`: 32 bytes
print-type-size         field `.cht_key`: 16 bytes
print-type-size         padding: 16 bytes
print-type-size         field `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         field `.retries`: 8 bytes
print-type-size         field `.hash`: 8 bytes
print-type-size         field `.waiter`: 8 bytes
print-type-size         field `..generator_field9`: 8 bytes
print-type-size         padding: 56 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size         field `.init`: 1026 bytes
print-type-size         field `..generator_field16`: 1 bytes
print-type-size         field `..generator_field17`: 1 bytes
print-type-size         field `..generator_field18`: 1 bytes
print-type-size         field `..generator_field19`: 1 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.res`: 8 bytes, alignment: 8 bytes
print-type-size         field `.__awaitee`: 40 bytes
print-type-size     variant `Suspend2`: 2343 bytes
print-type-size         field `.pre_init`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         field `.pre_init`: 32 bytes
print-type-size         field `.cht_key`: 16 bytes
print-type-size         field `..generator_field11`: 16 bytes
print-type-size         field `.type_id`: 8 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         padding: 32 bytes
print-type-size         field `.hash`: 8 bytes, alignment: 8 bytes
print-type-size         field `.waiter`: 8 bytes
print-type-size         field `..generator_field9`: 8 bytes
print-type-size         field `.waiter_guard`: 56 bytes
print-type-size         field `.self`: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         padding: 1026 bytes
print-type-size         field `.init`: 1026 bytes, alignment: 1 bytes
print-type-size         field `..generator_field16`: 1 bytes
print-type-size         field `..generator_field17`: 1 bytes
print-type-size         field `..generator_field18`: 1 bytes
print-type-size         field `..generator_field19`: 1 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.__awaitee`: 16 bytes, alignment: 8 bytes
print-type-size     variant `Unresumed`: 2316 bytes
print-type-size         padding: 31 bytes
print-type-size         field `.pre_init`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 32 bytes
print-type-size         field `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         padding: 120 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         padding: 1026 bytes
print-type-size         field `.init`: 1026 bytes, alignment: 1 bytes
print-type-size     variant `Returned`: 2316 bytes
print-type-size         padding: 31 bytes
print-type-size         field `.pre_init`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 32 bytes
print-type-size         field `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         padding: 120 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         padding: 1026 bytes
print-type-size         field `.init`: 1026 bytes, alignment: 1 bytes
print-type-size     variant `Panicked`: 2316 bytes
print-type-size         padding: 31 bytes
print-type-size         field `.pre_init`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 32 bytes
print-type-size         field `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         padding: 120 bytes
print-type-size         field `.self`: 8 bytes, alignment: 8 bytes
print-type-size         field `.key`: 8 bytes
print-type-size         field `.post_init`: 24 bytes
print-type-size         padding: 1026 bytes
print-type-size         field `.init`: 1026 bytes, alignment: 1 bytes
print-type-size     end padding: 6 bytes
print-type-size type: `std::mem::ManuallyDrop<[async fn body@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::do_try_init<'_, (), (), [closure@moka::future::value_initializer::make_pre_init<(), (), [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}]>::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::init_or_read<'_, [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#1}]>::{closure#0}::{closure#0}]>::{closure#0}]>`: 3360 bytes, alignment: 8 bytes
print-type-size     field `.value`: 3360 bytes
print-type-size type: `std::mem::MaybeUninit<[async fn body@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::do_try_init<'_, (), (), [closure@moka::future::value_initializer::make_pre_init<(), (), [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}]>::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::init_or_read<'_, [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}], [async block@src\main.rs:3:19: 7:6], [closure@moka::future::Cache<(), ()>::insert_with_hash_and_fun<[async block@src\main.rs:3:19: 7:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#1}]>::{closure#0}::{closure#0}]>::{closure#0}]>`: 3360 bytes, alignment: 8 bytes
print-type-size     variant `MaybeUninit`: 3360 bytes
print-type-size         field `.uninit`: 0 bytes
print-type-size         field `.value`: 3360 bytes
print-type-size type: `crossbeam_epoch::internal::Local`: 2104 bytes, alignment: 8 bytes
print-type-size     field `.entry`: 8 bytes
print-type-size     field `.epoch`: 8 bytes
print-type-size     field `.collector`: 8 bytes
print-type-size     field `.bag`: 2056 bytes
print-type-size     field `.guard_count`: 8 bytes
print-type-size     field `.handle_count`: 8 bytes
print-type-size     field `.pin_count`: 8 bytes
print-type-size type: `crossbeam_epoch::sync::queue::Node<crossbeam_epoch::internal::SealedBag>`: 2072 bytes, alignment: 8 bytes
print-type-size     field `.data`: 2064 bytes
print-type-size     field `.next`: 8 bytes
print-type-size type: `crossbeam_epoch::internal::SealedBag`: 2064 bytes, alignment: 8 bytes
print-type-size     field `.epoch`: 8 bytes
print-type-size     field `._bag`: 2056 bytes
print-type-size type: `std::mem::ManuallyDrop<crossbeam_epoch::internal::SealedBag>`: 2064 bytes, alignment: 8 bytes
print-type-size     field `.value`: 2064 bytes
print-type-size type: `std::mem::MaybeUninit<crossbeam_epoch::internal::SealedBag>`: 2064 bytes, alignment: 8 bytes
print-type-size     variant `MaybeUninit`: 2064 bytes
print-type-size         field `.uninit`: 0 bytes
print-type-size         field `.value`: 2064 bytes
print-type-size type: `crossbeam_epoch::internal::Bag`: 2056 bytes, alignment: 8 bytes
print-type-size     field `.deferreds`: 2048 bytes
print-type-size     field `.len`: 8 bytes
print-type-size type: `crossbeam_epoch::primitive::cell::UnsafeCell<crossbeam_epoch::internal::Bag>`: 2056 bytes, alignment: 8 bytes
print-type-size     field `.0`: 2056 bytes
print-type-size type: `std::cell::UnsafeCell<crossbeam_epoch::internal::Bag>`: 2056 bytes, alignment: 8 bytes
print-type-size     field `.value`: 2056 bytes
print-type-size type: `crossbeam_channel::flavors::list::Block<moka::common::concurrent::WriteOp<(), ()>>`: 1248 bytes, alignment: 8 bytes
print-type-size     field `.next`: 8 bytes
print-type-size     field `.slots`: 1240 bytes
print-type-size type: `crossbeam_channel::flavors::list::Block<moka::notification::notifier::RemovedEntries<(), ()>>`: 1248 bytes, alignment: 8 bytes
print-type-size     field `.next`: 8 bytes
print-type-size     field `.slots`: 1240 bytes
print-type-size type: `[async block@src\main.rs:3:19: 7:6]`: 1026 bytes, alignment: 1 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Suspend0`: 1025 bytes
print-type-size         field `.buf`: 1024 bytes, offset: 0 bytes, alignment: 1 bytes
print-type-size         field `.__awaitee`: 1 bytes
print-type-size     variant `Unresumed`: 0 bytes
print-type-size     variant `Returned`: 0 bytes
print-type-size     variant `Panicked`: 0 bytes

This problem manifests itself in getsentry/symbolicator#979, where the tests are failing due to stack overflows.

I believe there is a workaround though to intentionally use Box::pin to reduce the size of the future passed to get_with.

Also, get_with has an overhead of ~616 bytes even if I give it an empty future (though I believe an empty future still has at least a discriminant).

@Swatinem
Contributor Author

Another issue I kinda measured is stack depth:

Printing a std::backtrace::Backtrace reveals that get_with has a call stack ~20 frames deep. Some of those frames are related to catch_unwind; others are GenFuture overhead (rust-lang/rust#74779), which will hopefully go away with the next Rust version.

Still, there is quite a bit of overhead there, though luckily release builds should optimize most of it out.
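
For anyone who wants to take a similar measurement without moka, here is a rough std-only sketch. It polls three nested async fns with a tiny hand-written executor and captures a backtrace in the innermost one. The names NoopWake, block_on, outer, middle, innermost and approx_frame_count are made up for this example, and counting numbered lines in the printed backtrace is only a heuristic, since the stable Backtrace API does not expose individual frames.

```rust
use std::backtrace::Backtrace;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for futures that never return `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: polls `fut` on the current thread until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Captures the call stack at the deepest point of the nested futures.
async fn innermost() -> String {
    Backtrace::force_capture().to_string()
}

async fn middle() -> String {
    innermost().await
}

async fn outer() -> String {
    middle().await
}

/// Heuristic: frame lines in the printed backtrace start with "<index>:".
fn approx_frame_count(trace: &str) -> usize {
    trace
        .lines()
        .filter(|line| {
            line.trim_start()
                .split_once(':')
                .map_or(false, |(idx, _)| {
                    !idx.is_empty() && idx.chars().all(|c| c.is_ascii_digit())
                })
        })
        .count()
}

fn main() {
    let trace = block_on(outer());
    println!("{trace}");
    println!("approximate frame count: {}", approx_frame_count(&trace));
}
```

The frame count will differ between debug and release builds and between compiler versions, so it is only useful for comparing runs on the same setup.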

tatsuya6502 self-assigned this on Jan 24, 2023
@tatsuya6502
Member

Thank you for the detailed analysis. I was not aware that this kind of optimization issue exists in rustc. I think our issue is similar to rust-lang/rust#62958, which has not been resolved for a couple of years.

I believe there is a workaround though to intentionally use Box::pin to reduce the size of the future passed to get_with.

You are right. I added Box::pin to your code, and confirmed it helped to reduce the sizes of the futures.

fn main() {
    let cache = moka::future::Cache::new(1);

    let get_fut = async {
        let buf = [0u8; 1024];
        async {}.await;
        drop(buf);
    };

    let get_fut = Box::pin(get_fut); // <= Added this line.

    println!("get_fut size: {}", std::mem::size_of_val(&get_fut));
    let moka_fut = cache.get_with((), get_fut);
    println!("moka_fut size: {}", std::mem::size_of_val(&moka_fut));
}

Before adding the line:

get_fut size: 1026
moka_fut size: 7712

After adding the line:

get_fut size: 8
moka_fut size: 688

(rustc 1.66.1 (90743e729 2023-01-10), host: aarch64-apple-darwin)
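
The same effect can be seen without moka. The sketch below uses two hypothetical async fns, outer and inner, as a stand-in for a get_with-like method (they are not moka's code), and compares the wrapper future's size for an inline init future and a boxed one. The payload uses std::hint::black_box instead of drop only to keep the buffer alive across the await without a lint warning.

```rust
use std::future::Future;
use std::mem::size_of_val;

/// Same shape as `get_fut` above: a 1 KiB buffer held across an await point.
fn payload() -> impl Future<Output = ()> {
    async {
        let buf = [0u8; 1024];
        async {}.await;
        // Using `buf` after the await forces it into the future's state.
        std::hint::black_box(buf);
    }
}

/// Hypothetical stand-in for a `get_with`-like method: takes `init` by value
/// and hands it to another `async fn`.
async fn outer<F: Future<Output = ()>>(init: F) {
    inner(init).await
}

async fn inner<F: Future<Output = ()>>(init: F) {
    init.await
}

/// Returns (wrapper size with inline init, wrapper size with boxed init,
/// size of the boxed handle itself), all in bytes.
fn sizes() -> (usize, usize, usize) {
    let inline = outer(payload());

    // Boxing moves the 1 KiB state to the heap; only a pointer stays inline.
    let boxed_init = Box::pin(payload());
    let handle = size_of_val(&boxed_init);
    let boxed = outer(boxed_init);

    (size_of_val(&inline), size_of_val(&boxed), handle)
}

fn main() {
    let (inline, boxed, handle) = sizes();
    println!("boxed init handle:        {handle} bytes");
    println!("wrapper with inline init: {inline} bytes");
    println!("wrapper with boxed init:  {boxed} bytes");
}
```

The exact numbers depend on the compiler version and target. The boxed wrapper only has to store pointer-sized copies of init, so its size no longer scales with the payload.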

According to rust-lang/rust#62958, nesting async fn calls causes the future size to grow exponentially. So I will probably reduce the number of nested calls inside get_with 🤔

pub async fn get_with(&self, key: K, init: impl Future<Output = V>) -> V {

async fn get_with(..., init, ...)
    => async fn get_or_insert_with_hash_and_fun(..., init, ...)
        => async fn insert_with_hash_and_fun(..., init, ...)

I will also document the Box::pin workaround in the API reference, in case a stack overflow still happens with a large init future.
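
To see how nesting affects the size without moka, here is a minimal std-only sketch. The functions depth1, depth2 and depth3 are hypothetical and simply forward init one level further down, mirroring the call chain above in shape only.

```rust
use std::future::Future;
use std::mem::size_of_val;

/// A future that keeps a 1 KiB buffer alive across an await point.
fn payload() -> impl Future<Output = ()> {
    async {
        let buf = [0u8; 1024];
        async {}.await;
        std::hint::black_box(buf);
    }
}

// Each layer takes `init` by value and passes it to the next `async fn`.
async fn depth1<F: Future<Output = ()>>(init: F) {
    init.await
}

async fn depth2<F: Future<Output = ()>>(init: F) {
    depth1(init).await
}

async fn depth3<F: Future<Output = ()>>(init: F) {
    depth2(init).await
}

/// Sizes in bytes of the bare payload and of the 1-, 2- and 3-level wrappers.
fn sizes() -> [usize; 4] {
    [
        size_of_val(&payload()),
        size_of_val(&depth1(payload())),
        size_of_val(&depth2(payload())),
        size_of_val(&depth3(payload())),
    ]
}

fn main() {
    let [base, d1, d2, d3] = sizes();
    println!("payload:  {base} bytes");
    println!("1 level:  {d1} bytes ({:.2}x)", d1 as f64 / base as f64);
    println!("2 levels: {d2} bytes ({:.2}x)", d2 as f64 / base as f64);
    println!("3 levels: {d3} bytes ({:.2}x)", d3 as f64 / base as f64);
}
```

How fast the size grows per level depends on the compiler version, so the program prints the ratios rather than assuming them. Each wrapper can never be smaller than the future it awaits, which is why fewer levels should help.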

@Swatinem
Contributor Author

Thank you for taking a look ❤️ I believe just reducing the number of nested calls should improve the situation here.

All in all, this seems to be a well-known problem among Rust compiler developers. Apart from the issue you linked to, there is also a tracking issue for general memory usage issues, rust-lang/rust#69826, as well as rust-lang/rust#99504, which covers the codegen side of things, mostly the large number of memcpy calls being generated.

A 688-byte baseline size is quite hefty. I’m not sure what the threshold for memcpy is exactly, but reducing the future size should definitely help with the generated code as well.

Also, I have done my analysis on moka:0.9, so I don’t know how the entry API of moka:0.10 changes things.

BTW, I blogged about this whole topic here: https://swatinem.de/blog/future-size/ also with an analysis of assembly etc.

Swatinem added a commit to getsentry/symbolicator that referenced this issue Feb 2, 2023
We can extract this from #1010, and this should help with moka-rs/moka#212 even if we do not (yet) switch to more widespread moka usage.
@tatsuya6502
Member

tatsuya6502 commented Feb 4, 2023

I started to work on PR #220 to mitigate this issue for the upcoming v0.10.0 release. Once I am satisfied with the result, I will back-port the mitigation to v0.9.7 as well. In v0.10.0, there are 15 get_with-like methods that take an init future. I am trying to reduce the number of nested calls in these methods.

Current Result

I ran the following program (debug build) against the head of the master branch (v0.10.0) and also the topic branch for the PR:

fn main() {
    let cache = moka::future::Cache::new(1);

    async fn get() {
        let buf = [0u8; 1024];
        async {}.await;
        #[allow(clippy::drop_copy)]
        drop(buf);
    }

    let get_fut = get();
    let get_fut_size = std::mem::size_of_val(&get_fut);
    println!("get_fut size:                     {} bytes", get_fut_size);

    let moka_fut = cache.get_with((), get_fut);
    let moka_fut_size = std::mem::size_of_val(&moka_fut);
    println!(
        "moka_get_with_fut size:           {} bytes ({:.2}x)",
        moka_fut_size,
        moka_fut_size as f64 / get_fut_size as f64
    );

    let get_fut = get();
    let moka_fut = cache.entry(()).or_insert_with(get_fut);
    let moka_fut_size = std::mem::size_of_val(&moka_fut);

    println!(
        "moka_entry_or_insert_wt_fut size: {} bytes ({:.2}x)",
        moka_fut_size,
        moka_fut_size as f64 / get_fut_size as f64
    );
}

Before

get_fut size:                     1026 bytes
moka_get_with_fut size:           7712 bytes (7.52x)
moka_entry_or_insert_wt_fut size: 7720 bytes (7.52x)

After

get_fut size:                     1026 bytes
moka_get_with_fut size:           4464 bytes (4.35x)
moka_entry_or_insert_wt_fut size: 5536 bytes (5.40x)

$ rustc -V
rustc 1.67.0 (fc594f156 2023-01-24)

Current Change

get_with()

Reduced nested calls from 5 levels to 2 levels.

async fn Cache::get_with(..., init)
  async fn Cache::get_or_insert_with_hash_and_fun(..., init, ...) <= Replaced with a macro.
    async fn Cache::insert_with_hash_and_fun(..., init, ...)      <= Replaced with a macro.
      async fn ValueInitializer::init_or_read(..., init, ...)     <= Eliminated.
        async fn ValueInitializer::do_try_init(..., init, ...)

entry().or_insert_with()

Reduced nested calls from 5 levels to 3 levels.

async fn OwnedKeyEntrySelector::or_insert_with(..., init)
  async fn Cache::get_or_insert_with_hash_and_fun(..., init, ...) <= Could NOT be replaced because
                                                                     it is a method of different type.
    async fn Cache::insert_with_hash_and_fun(..., init, ...)      <= Replaced with a macro.
      async fn ValueInitializer::init_or_read(..., init, ...)     <= Eliminated.
        async fn ValueInitializer::do_try_init(..., init, ...)
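
As a rough illustration of why replacing an async fn layer with a macro helps, the std-only sketch below compares a nested variant with a flattened one. The names leaf, middle_fn, middle_inline, entry_nested and entry_flat are hypothetical and are not the functions or macros used in the PR.

```rust
use std::future::Future;
use std::mem::size_of_val;

/// A future that keeps a 1 KiB buffer alive across an await point.
fn payload() -> impl Future<Output = ()> {
    async {
        let buf = [0u8; 1024];
        async {}.await;
        std::hint::black_box(buf);
    }
}

/// Innermost layer, present in both variants.
async fn leaf<F: Future<Output = ()>>(init: F) {
    init.await
}

/// Nested variant: the middle layer is its own `async fn`, so the entry
/// point has to store a whole extra future that wraps `init` again.
async fn middle_fn<F: Future<Output = ()>>(init: F) {
    leaf(init).await
}

async fn entry_nested<F: Future<Output = ()>>(init: F) {
    middle_fn(init).await
}

/// Flattened variant: the middle layer is a macro, so its body is expanded
/// directly into the caller and no intermediate future type is created.
macro_rules! middle_inline {
    ($init:expr) => {
        leaf($init).await
    };
}

async fn entry_flat<F: Future<Output = ()>>(init: F) {
    middle_inline!(init)
}

/// Returns (nested size, flattened size) in bytes.
fn sizes() -> (usize, usize) {
    (
        size_of_val(&entry_nested(payload())),
        size_of_val(&entry_flat(payload())),
    )
}

fn main() {
    let (nested, flat) = sizes();
    println!("nested:    {nested} bytes");
    println!("flattened: {flat} bytes");
}
```

A macro only works when the expanded body can access everything it needs at the call site, which matches the note above that a method of a different type could not be replaced this way.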

@tatsuya6502
Member

tatsuya6502 commented Feb 4, 2023

After

get_fut size:                     1026 bytes
moka_get_with_fut size:           4464 bytes (4.35x)
moka_entry_or_insert_wt_fut size: 5536 bytes (5.40x)

I used a nightly compiler on the topic branch of the PR to log the type sizes. I found that the innermost future, which comes from async fn try_init_or_read, already reserves space for three copies of the init future in variant Suspend1 of the enum: 1026 bytes of padding, the upvar .init, and the local .__awaitee. (Note: I renamed the do_try_init method to try_init_or_read in the PR.)

print-type-size     variant `Suspend1`: 3361 bytes
...
print-type-size         padding: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes, alignment: 1 bytes
...
print-type-size         local `.__awaitee`: 1026 bytes, alignment: 1 bytes
$ rustc +nightly -V 
rustc 1.69.0-nightly (11d96b593 2023-02-01)

$ cargo +nightly rustc -F future --example check_future_size -- -Zprint-type-sizes

...
print-type-size type: `[async fn body@moka::future::value_initializer::ValueInitializer<...>::
    try_init_or_read<'_, (), (), ...>::{closure#0}]`: 3368 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Suspend1`: 3361 bytes
print-type-size         local `.get`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         upvar `.get`: 32 bytes
print-type-size         local `.cht_key`: 16 bytes
print-type-size         padding: 16 bytes
print-type-size         upvar `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         local `.self`: 8 bytes
print-type-size         local `.insert`: 24 bytes
print-type-size         local `.post_init`: 8 bytes
print-type-size         padding: 8 bytes
print-type-size         local `.hash`: 8 bytes, alignment: 8 bytes
print-type-size         local `.waiter`: 8 bytes
print-type-size         local `..generator_field10`: 8 bytes
print-type-size         local `.waiter_guard`: 56 bytes
print-type-size         upvar `.self`: 8 bytes
print-type-size         upvar `.key`: 8 bytes
print-type-size         upvar `.insert`: 24 bytes
print-type-size         upvar `.post_init`: 8 bytes
print-type-size         padding: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes, alignment: 1 bytes
print-type-size         local `..generator_field19`: 1 bytes
print-type-size         local `..generator_field20`: 1 bytes
print-type-size         local `..generator_field21`: 1 bytes
print-type-size         padding: 1 bytes
print-type-size         local `.__awaitee`: 1026 bytes, alignment: 1 bytes
...

Unfortunately, I do not think I can simplify the try_init_or_read method any further.

@Swatinem
Contributor Author

Swatinem commented Feb 4, 2023

@tatsuya6502 thanks for taking a look at this! ❤️
The results so far look very encouraging.

I believe the duplication between upvar .init and local .__awaitee might be exactly what rust-lang/rust#62958 is talking about (the issue mentions upvars vs locals).
Or it might have something to do with IntoFuture, which requires a move; I’m not sure.

The padding looks a bit out of place though. It might be reserved for some space that is needed in Suspend2 if there are any other suspend points?

@tatsuya6502 added the enhancement label Feb 5, 2023
@tatsuya6502 added this to the v0.9.7 milestone Feb 5, 2023
@tatsuya6502
Member

tatsuya6502 commented Feb 5, 2023

The padding looks a bit out of place though. It might be reserved for some space that is needed in Suspend2 if there are any other suspend points?

Yes, there are other suspend points: Suspend0, Suspend2 and Suspend3. And I see the padding space is used for local .init: 1026 bytes in Suspend0 and Suspend3:

-Zprint-type-sizes
print-type-size type: `[async fn body@moka::future::value_initializer::ValueInitializer<(), (), std::collections::hash_map::RandomState>::try_init_or_read<'_, (), (), [closure@moka::future::Cache<(), ()>::get_or_insert_with_hash_and_fun<[async fn body@examples/check_future_size.rs:6:20: 11:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#0}], [async fn body@examples/check_future_size.rs:6:20: 11:6], [closure@moka::future::Cache<(), ()>::get_or_insert_with_hash_and_fun<[async fn body@examples/check_future_size.rs:6:20: 11:6], for<'a> fn(&'a ()) -> bool>::{closure#0}::{closure#1}]>::{closure#0}]`: 3368 bytes, alignment: 8 bytes
print-type-size     discriminant: 1 bytes
print-type-size     variant `Suspend1`: 3361 bytes
print-type-size         local `.get`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         upvar `.get`: 32 bytes
print-type-size         local `.cht_key`: 16 bytes
print-type-size         padding: 16 bytes
print-type-size         upvar `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         local `.self`: 8 bytes
print-type-size         local `.insert`: 24 bytes
print-type-size         local `.post_init`: 8 bytes
print-type-size         padding: 8 bytes
print-type-size         local `.hash`: 8 bytes, alignment: 8 bytes
print-type-size         local `.waiter`: 8 bytes
print-type-size         local `..generator_field10`: 8 bytes
print-type-size         local `.waiter_guard`: 56 bytes
print-type-size         upvar `.self`: 8 bytes
print-type-size         upvar `.key`: 8 bytes
print-type-size         upvar `.insert`: 24 bytes
print-type-size         upvar `.post_init`: 8 bytes
print-type-size         padding: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes, alignment: 1 bytes
print-type-size         local `..generator_field19`: 1 bytes
print-type-size         local `..generator_field20`: 1 bytes
print-type-size         local `..generator_field21`: 1 bytes
print-type-size         padding: 1 bytes
print-type-size         local `.__awaitee`: 1026 bytes, alignment: 1 bytes
print-type-size     variant `Suspend0`: 2447 bytes
print-type-size         local `.get`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         upvar `.get`: 32 bytes
print-type-size         local `.cht_key`: 16 bytes
print-type-size         padding: 16 bytes
print-type-size         upvar `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         local `.self`: 8 bytes
print-type-size         local `.insert`: 24 bytes
print-type-size         local `.post_init`: 8 bytes
print-type-size         local `.retries`: 8 bytes
print-type-size         local `.hash`: 8 bytes
print-type-size         local `.waiter`: 8 bytes
print-type-size         padding: 64 bytes
print-type-size         upvar `.self`: 8 bytes, alignment: 8 bytes
print-type-size         upvar `.key`: 8 bytes
print-type-size         upvar `.insert`: 24 bytes
print-type-size         upvar `.post_init`: 8 bytes
print-type-size         local `.init`: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes
print-type-size         local `..generator_field19`: 1 bytes
print-type-size         local `..generator_field20`: 1 bytes
print-type-size         padding: 2 bytes
print-type-size         local `.__awaitee`: 112 bytes, alignment: 8 bytes
print-type-size     variant `Suspend3`: 2383 bytes
print-type-size         local `.get`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         upvar `.get`: 32 bytes
print-type-size         local `.cht_key`: 16 bytes
print-type-size         padding: 16 bytes
print-type-size         upvar `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         local `.self`: 8 bytes
print-type-size         local `.insert`: 24 bytes
print-type-size         local `.post_init`: 8 bytes
print-type-size         local `.retries`: 8 bytes
print-type-size         local `.hash`: 8 bytes
print-type-size         local `.waiter`: 8 bytes
print-type-size         local `..generator_field10`: 8 bytes
print-type-size         padding: 56 bytes
print-type-size         upvar `.self`: 8 bytes, alignment: 8 bytes
print-type-size         upvar `.key`: 8 bytes
print-type-size         upvar `.insert`: 24 bytes
print-type-size         upvar `.post_init`: 8 bytes
print-type-size         local `.init`: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes
print-type-size         local `..generator_field19`: 1 bytes
print-type-size         local `..generator_field20`: 1 bytes
print-type-size         local `..generator_field21`: 1 bytes
print-type-size         padding: 1 bytes
print-type-size         local `.res`: 8 bytes, alignment: 8 bytes
print-type-size         local `.__awaitee`: 40 bytes
print-type-size     variant `Suspend2`: 2359 bytes
print-type-size         local `.get`: 32 bytes, offset: 0 bytes, alignment: 8 bytes
print-type-size         upvar `.get`: 32 bytes
print-type-size         local `.cht_key`: 16 bytes
print-type-size         local `..generator_field12`: 16 bytes
print-type-size         upvar `.type_id`: 8 bytes
print-type-size         local `.self`: 8 bytes
print-type-size         local `.insert`: 24 bytes
print-type-size         padding: 16 bytes
print-type-size         local `.hash`: 8 bytes, alignment: 8 bytes
print-type-size         local `.waiter`: 8 bytes
print-type-size         local `..generator_field10`: 8 bytes
print-type-size         local `.waiter_guard`: 56 bytes
print-type-size         upvar `.self`: 8 bytes
print-type-size         upvar `.key`: 8 bytes
print-type-size         upvar `.insert`: 24 bytes
print-type-size         upvar `.post_init`: 8 bytes
print-type-size         padding: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes, alignment: 1 bytes
print-type-size         local `..generator_field19`: 1 bytes
print-type-size         local `..generator_field20`: 1 bytes
print-type-size         local `..generator_field21`: 1 bytes
print-type-size         padding: 1 bytes
print-type-size         local `.value`: 0 bytes, alignment: 1 bytes
print-type-size         local `..generator_field14`: 1 bytes
print-type-size         padding: 7 bytes
print-type-size         local `.__awaitee`: 16 bytes, alignment: 8 bytes
print-type-size     variant `Unresumed`: 2332 bytes
print-type-size         padding: 31 bytes
print-type-size         upvar `.get`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 32 bytes
print-type-size         upvar `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         padding: 128 bytes
print-type-size         upvar `.self`: 8 bytes, alignment: 8 bytes
print-type-size         upvar `.key`: 8 bytes
print-type-size         upvar `.insert`: 24 bytes
print-type-size         upvar `.post_init`: 8 bytes
print-type-size         padding: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes, alignment: 1 bytes
print-type-size     variant `Returned`: 2332 bytes
print-type-size         padding: 31 bytes
print-type-size         upvar `.get`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 32 bytes
print-type-size         upvar `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         padding: 128 bytes
print-type-size         upvar `.self`: 8 bytes, alignment: 8 bytes
print-type-size         upvar `.key`: 8 bytes
print-type-size         upvar `.insert`: 24 bytes
print-type-size         upvar `.post_init`: 8 bytes
print-type-size         padding: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes, alignment: 1 bytes
print-type-size     variant `Panicked`: 2332 bytes
print-type-size         padding: 31 bytes
print-type-size         upvar `.get`: 32 bytes, alignment: 8 bytes
print-type-size         padding: 32 bytes
print-type-size         upvar `.type_id`: 8 bytes, alignment: 8 bytes
print-type-size         padding: 128 bytes
print-type-size         upvar `.self`: 8 bytes, alignment: 8 bytes
print-type-size         upvar `.key`: 8 bytes
print-type-size         upvar `.insert`: 24 bytes
print-type-size         upvar `.post_init`: 8 bytes
print-type-size         padding: 1026 bytes
print-type-size         upvar `.init`: 1026 bytes, alignment: 1 bytes
print-type-size     end padding: 6 bytes

@tatsuya6502
Member

tatsuya6502 commented Feb 5, 2023

I tried one of the solutions in your blog article, and it worked very well for our case.

If we are dealing with futures, we can also pass a Pin<&mut impl Future> when we pin the relevant future in the outermost callee.

Commit: bc92f4e

Result:

get_fut size:                     1026 bytes
moka_get_with_fut size:           2544 bytes (2.48x)
moka_entry_or_insert_wt_fut size: 2616 bytes (2.55x)

Again, not very nice, and it also makes you vulnerable to polling that future again after it completed, which will panic.

As for safety, I changed only the internal methods to take init as Pin<&mut impl Future>, and kept the external API taking it as impl Future (so that it takes ownership). This way, I think it is safe, because our API prevents users from reusing a completed future.

I think ~2.5x is almost optimal with current rustc. I will back-port my stuff to moka v0.9.7 and will release it soon.
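
A minimal sketch of this pattern, assuming Rust 1.68 or later for std::pin::pin!. All names here (get, inner_pinned, outer_pinned, block_on) are hypothetical and are not moka's internals. The public-style function takes the future by value and pins it once. The internal function only receives a Pin<&mut F>, so its own future stores a pointer instead of another copy of init.

```rust
use std::future::Future;
use std::mem::size_of_val;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical future that keeps about 1 KiB alive across an await point.
async fn get() -> usize {
    let buf = [0u8; 1024];
    std::future::ready(()).await;
    buf.len()
}

// Internal-style helper: borrows the already pinned future.
async fn inner_pinned<F: Future>(init: Pin<&mut F>) -> F::Output {
    init.await
}

// Public-style entry point: takes ownership, so callers cannot poll `init`
// again after it has completed, then pins it once for the internal helper.
async fn outer_pinned<F: Future>(init: F) -> F::Output {
    let init = pin!(init);
    inner_pinned(init).await
}

/// Size of the inner helper's future, which holds only a pinned reference.
fn inner_pinned_size() -> usize {
    let init = pin!(get());
    let size = size_of_val(&inner_pinned(init));
    size
}

/// Size of the outer future, which still owns `init` itself.
fn outer_pinned_size() -> usize {
    size_of_val(&outer_pinned(get()))
}

// Tiny executor for the demo: every future here is ready on the first poll.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    println!("get() future:        {} bytes", size_of_val(&get()));
    println!("inner_pinned future: {} bytes", inner_pinned_size());
    println!("outer_pinned future: {} bytes", outer_pinned_size());
    println!("result:              {}", block_on(outer_pinned(get())));
}
```

The outer future still has to own init, so its size stays above 1 KiB; only the layers below it stop paying for extra copies.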

@Swatinem
Contributor Author

Swatinem commented Feb 5, 2023

Nice, that looks like a very nice improvement!

And I see the padding space is used for local .init: 1026 bytes in Suspend0 and Suspend3:

Your comment is not formatted properly. It might be worth creating a separate issue in rustc for that one, depending on what is seemingly going on.

@tatsuya6502
Member

Your comment is not formatted properly.

Oops. Fixed it.

FYI, async fn try_init_or_read has four .awaits. So having four suspend points seems fine to me.

  1. async::RwLock::write(...).await
  2. moka::future::Cache::insert(...).await
  3. init.await
  4. async::RwLock::read(...).await
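
As a rough illustration of why each .await matters for size: at every suspend point the future has to store the locals that are still needed afterwards. The sketch below uses two hypothetical functions (held_across_await and dropped_before_await) that differ only in whether a 1 KiB buffer is still alive at the suspend point.

```rust
use std::future::ready;
use std::mem::size_of_val;

// `buf` is used after the await, so it must be stored in the future.
async fn held_across_await() -> usize {
    let buf = [0u8; 1024];
    ready(()).await; // suspend point: `buf` is still needed afterwards
    buf.len()
}

// `buf` goes out of scope before the await, so only `len` has to be stored.
async fn dropped_before_await() -> usize {
    let len = {
        let buf = [0u8; 1024];
        buf.len()
    };
    ready(()).await; // suspend point: only `len` is live here
    len
}

fn held_size() -> usize {
    size_of_val(&held_across_await())
}

fn scoped_size() -> usize {
    size_of_val(&dropped_before_await())
}

fn main() {
    println!("held across await:   {} bytes", held_size());
    println!("dropped before await: {} bytes", scoped_size());
}
```

In try_init_or_read the init future is needed until it has been awaited, so it has to be stored at the suspend points that come before that await.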

@tatsuya6502
Member

It might be worth creating a separate issue in rustc for that one, depending on what is seemingly going on.

Thank you for creating the rust-lang/rust#107695 PR with a minimal reproducible example!

@tatsuya6502
Member

I think ~2.5x is almost optimal with current rustc. I will back-port my stuff to moka v0.9.7 and will release it soon.

Tested v0.9.7 and published it to crates.io:
https://crates.io/crates/moka/0.9.7

@tatsuya6502
Member

Closing this issue:

Please reopen if needed.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 7, 2023
Add test for Future inflating arg size to 3x

This adds one more test that should track improvements to generator
layout, like rust-lang#62958 and rust-lang#62575.

In particular, this test highlights suboptimal layout, as the storage
for the argument future is not being reused across its usage as `upvar`,
`local` and `awaitee` (being polled to completion).

This is on top of rust-lang#107692 (as those would conflict with each other)

It is a minimal repro for code mentioned in moka-rs/moka#212 (comment) (CC `@tatsuya6502`)
@Swatinem
Contributor Author

Swatinem commented Mar 5, 2023

FYI, there is work underway in the compiler to solve some of the problems with future size blowup:
rust-lang/rust#108590 🎉
