Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The __deregister_frame is slow on macos #6541

Open
TimonPost opened this issue Jun 8, 2023 · 6 comments
Open

The __deregister_frame is slow on macos #6541

TimonPost opened this issue Jun 8, 2023 · 6 comments

Comments

@TimonPost
Copy link
Contributor

TimonPost commented Jun 8, 2023

We have a precompiled wasm module of about 337,2MB that if dropped the UnwindRegistration::drop takes around ~22 secs (can quite vary) to finish. Sometimes it takes over 100 seconds. I did some benchmarking and we have 266664 unwind registers and the time per frame deregister is for me around 0.0843227 ms.

This dropping happens when a wasmtime::Module drops.

Three runs:

Deregister 266664 frames in 22485 ms (84.3227 microsec/frame)
Deregister 266664 frames in 22448 ms (84.1817 microsec/frame)
Deregister 266664 frames in 22232 ms (83.3742 microsec/frame)

A fast deregister/register time is vital to us. When hot-reloading after cargo watch detects a change when a single code line changes it used to take 2 secs, but somewhere in last few months, this has somehow gotten very slow. Is there an easy solution here or a solution?

This can be very easy reproduced:

#[test]
fn deserialize_test_memory() {
    let engine = Engine::new(&Config::new().epoch_interruption(true)).unwrap();

    let path = "path-to-precomipled-bytes";

    unsafe {
        let deserialized: wasmtime::Module = Module::deserialize_file(&engine, path).unwrap();

        let time = Instant::now();
        println!("Dropping module...");
        drop(deserialized);
        println!("Dropped module {:?}", time.elapsed());
    }
}
@TimonPost TimonPost changed the title __deregister_frame is slow on macos The __deregister_frame is slow on macos Jun 8, 2023
@fitzgen
Copy link
Member

fitzgen commented Jun 8, 2023

Unfortunately there is nothing we can do to speed up the implementation of system libraries.

You can avoid the need to interact with libunwind by disabling (optional but enabled-by-default) native unwind info: https://docs.rs/wasmtime/latest/wasmtime/struct.Config.html#method.native_unwind_info

This configuration option only exists to help third-party stack capturing mechanisms, such as the system’s unwinder or the backtrace crate, determine how to unwind through Wasm frames. It does not affect whether Wasmtime can capture Wasm backtraces or not.

@TimonPost
Copy link
Contributor Author

TimonPost commented Jun 8, 2023

Can the behavior of frame deregistration be decoupled from the Drop trait and be optionally triggered in a controlled manner? For example, what if wasmtime allows one to manually fetch 'unwind frames' and allow manually disposing of them. Or what would happen if the frames were pushed to a separate thread for deregistration? Are there any potential downsides or considerations to keep in mind when pursuing this approach?

While I understand the typical approach of performing cleanup on drop, having more control over the process would be really nice to have control over this. As at the moment it is very implicit behavior and it took me a while to pinpoint this exact function deep down the code stack. And also, disabling the behavior will work but has its own downsides that we loose some unwinding information, which ideally we would want to preserve.

@bjorn3
Copy link
Contributor

bjorn3 commented Jun 8, 2023

Deregistration of unwind info needs to happen before the memory of the module is deallocated to prevent dangling pointers. You could move the Module to a background thread before dropping it to avoid blocking on deregistration of unwind info. This has the risk of runaway memory usage if you load new modules faster than they can be dropped however.

@jameysharp
Copy link
Contributor

I may not have understood the details here, but if I did, I think @bjorn3's suggestion is the best you can do if you must use the platform-native unwinding support. To avoid unbounded memory usage I think you could use sync_channel to create a bounded queue that blocks the main threads if the background thread falls behind on dropping unused modules.

@TimonPost
Copy link
Contributor Author

When i search for this native_unwind_infoconfiguration option it seems to never be used, only validated. It looks like the PR that introduced this option removed this line https://github.com/bytecodealliance/wasmtime/pull/4643/files#diff-908a422cef87524478865a23918af79040c970e02a3cc6862f6eb508710606f9L1436, which registered the option in cranelift compiler. When I add this manually to my config I do get 0 unwind registers for macos, if i don't do that, I still get all my unwind frames being inserted.

@TimonPost
Copy link
Contributor Author

Opened A PR #6547, let me know if this assumption is correct, that at the moment it isn't possible to disable unwind frames. From my local debugging with this branch, it does look like this drastically improves unloading from +20 secs to a few micro secs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants