Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow creating and writing directly to buffers in wasm memory from JS #1079

Closed
mstange opened this issue Dec 2, 2018 · 4 comments
Closed

Comments

@mstange
Copy link
Contributor

mstange commented Dec 2, 2018

I have a use case where my JS code needs to load a 2GB file and then pass its contents to a wasm-compiled rust function, which expects the data as a &[u8] parameter. If I do this the straightforward way, I end up with 4GB of memory usage and an extra 2GB copy:

  1. JS creates a 2GB Uint8Array and reads the data into it.
  2. JS calls the wasm-bindgen-generated binding function and passes it that typed array.
  3. The binding function wrapper allocates 2GB in wasm memory and copies the array contents into it. Now I have 4GB of memory usage.
  4. The binding function calls the underlying wasm function, giving it a pointer to the wasm memory allocation.
  5. The underlying wasm function runs and then returns.
  6. The binding function deallocates the 2GB chunk of wasm memory. Now I'm back down to 2GB of memory usage.
  7. The binding function returns.

I would like to avoid the extra 2GB of memory usage and the copy. Ideally, wasm-bindgen would generate an API to allocate a buffer, expose that buffer to JS as a typed array that directly wraps the wasm memory buffer, then let me write my data into that typed array from JS (while I promise not to call any other wasm functions as long as I'm using that typed array, because doing so might recreate the wasm memory's underlying array buffer), and then also let me pass that buffer handle to any generated binding functions that accept &[u8] parameters.

At the moment, I'm working around this problem by bypassing the wasm-bindgen-generated binding function and calling the underlying wasm function directly, as well as __wbindgen_malloc and __wbindgen_free. It would be nice if this workaround wasn't necessary.

@alexcrichton
Copy link
Contributor

Thanks for the report! I'm not actually sure how we'd best support this! This seems like it may actually be best done in user code rather than in wasm-bindgen itself though? There's some degree of memory unsafety happening here due to the fact that it's uninitialized memory being exposed to the outside world.

Do you have an idea though about how we could expose something like this to JS?

FWIW I'd definitely avoid the __wbindgen_* functions if you can, they're only for internal use and can change over time. Instead I'd recommend something like a custom struct tagged with #[wasm_bindgen] for accessing methods and such.

@alexcrichton
Copy link
Contributor

I think this is now best done with functions like Uint8Array::{copy_to,view}, so I'm going to go ahead and close this.

@mstange
Copy link
Contributor Author

mstange commented Jul 4, 2019

@alexcrichton So I think Uint8Array::view gets me pretty close to what I need, but what I really want (I think?) is something like Uint8Array::view_mut. Here's what I can do now:

Rust:

#[wasm_bindgen]
pub struct WasmMemBuffer {
    buffer: Vec<u8>,
}

#[wasm_bindgen]
impl WasmMemBuffer {
    #[wasm_bindgen(constructor)]
    pub fn new(byte_length: u32, f: &js_sys::Function) -> Self {
        let mut buffer = vec![0; byte_length as usize];
        unsafe {
            let array = js_sys::Uint8Array::view(&mut buffer);
            f.call1(&JsValue::NULL, &JsValue::from(array))
                .expect("The callback function should not throw");
        }
        Self { buffer }
    }
}

fn compute_buffer_hash_impl(data: &[u8]) -> u32 { ... }

#[wasm_bindgen]
pub fn compute_buffer_hash(data: &WasmMemBuffer) -> u32 {
    compute_buffer_hash_impl(&data.buffer)
}

JavaScript:

let buffer = new WasmMemBuffer(100000, array => {
  // "array" wraps a piece of wasm memory. Fill it with some values.
  for (let i = 0; i < array.length; i++) {
    array[i] = Math.floor(Math.random() * 256);
  }
});
let hash = compute_buffer_hash(buffer); // No buffer copy when passing this argument. Yay!
buffer.free();
console.log(hash);

This is great, but I think it's lying to the rust compiler: Uint8Array::view is taking an immutable reference, but then I'm going ahead and mutating the contents. So in theory, the compiler might assume that the buffer contents are unchanged by my JS call and just store all zeros in WasmMemBuffer::buffer at the end of WasmMemBuffer::new. Right? Should I file a new bug about adding Uint8Array::view_mut which does the same as view but takes a mutable slice?

mstange added a commit to mstange/profiler-get-symbols that referenced this issue Jul 5, 2019
In the past, we were relying on wasm-bindgen implementation details.
See rustwasm/wasm-bindgen#1079 for more information.
@alexcrichton
Copy link
Contributor

Yes this is a case where we're sort of lying to the compiler a bit, but afaik there's no practical ramifications today and we can add bindings if necessary in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants