
WasmFS: external wasmMemory backend #20017

Draft · wants to merge 3 commits into base: main
Conversation


@ufolyah ufolyah commented Aug 11, 2023

We recently developed this new WasmFS backend to meet our own requirements, and we are happy to contribute it to the WebAssembly community. Any suggestions are welcome.

We have run this backend in our web app in production for 3 months, and it works well. This PR is a fully working version; tests will be completed in a few weeks. With this PR we would like to hear everyone's thoughts on whether this feature could be merged and how to improve it further.

The doc is attached below.

ExtWasmMemFS: A 'WebAssembly.Memory' Based, Fully-Multithreaded Synchronizing WasmFS Backend

This doc describes a file storage system implemented on top of WebAssembly.Memory. It provides synchronous reads and writes directly from any thread.
Based on Emscripten 3.1.28 and later.

Spotlights

  1. Based on MemoryBackend, but uses a separate WebAssembly.Memory instance as a memory pool to store data files.
  2. File storage does not occupy runtime memory and can grow on demand.
  3. Full 4GB maximum storage support (tested and used in our app).
  4. All threads can access file content without postMessage to other threads, so performance is not affected by main-thread load.
  5. Atomic locks provide thread safety and main-thread access.
  6. Provides a JS API for fast file reads/writes from JS code, copying data directly between the external wasm memory and a JS ArrayBuffer.
  7. Good browser compatibility: WebAssembly.Memory.grow() has been supported for as long as WebAssembly itself, unlike SharedArrayBuffer.grow(), which only landed recently. Good compatibility makes it easy to adopt in a production environment.

Implementation

The essence is to implement a memory pool on top of a WebAssembly.Memory instance.

Data Layout

All ExtWasmMemFS data is stored in the following singleton object, shared across all threads:

Module["extWasmMemFS"] = {
    "dataFiles": new WebAssembly.Memory({
        initial: 1 << 9, // 32 MB
        maximum: 1 << 16, // 4 GB
        shared: true,
    }),
    "control": new SharedArrayBuffer(20),
    "index": new SharedArrayBuffer(2 * 1024 * 1024 * 8), // 2M files max.
}

Our data is stored in three buffers: dataFiles, control, and index.

extWasmMemFS.dataFiles

The dataFiles buffer is the main buffer for storing file contents, so a growable WebAssembly.Memory is used.
It is laid out as follows:

dataFiles: |0|block 1|block 2| ... |block n|......unalloc area......|
            ^                               ^                       ^
         (8 bytes)                          unalloc_pointer         buffer.byteLength

A block is either a file_block or an empty_block.

empty_block: |block_size|next_empty_block_ptr|prev_empty_block_ptr|.......(trash data).......|
                4 bytes         4 bytes               4 bytes        (block_size - 12) bytes   

file_block:  |block_size|0xFFFFFFFF|.........(file content).......|
                4 bytes    4 bytes       (block_size - 8) bytes

block_size includes the header itself.
Every block is allocated on a 4-byte boundary and has a minimum size of 12 bytes.

To keep reads and writes contiguous (for copy performance), each file is stored in a single file_block.
All empty_blocks form a doubly linked list. The first 8 bytes of dataFiles are left unused so that ptr == 0 is never a valid block address.
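
As a minimal sketch of this layout (readBlockHeader is a hypothetical helper, not part of the PR; u32 is a Uint32Array view over dataFiles.buffer, and ptr is a 4-byte-aligned byte offset):

const FILE_FLAG = 0xFFFFFFFF; // second header word of a file_block

// Hypothetical helper: decode the block header at byte offset `ptr`.
function readBlockHeader(u32, ptr) {
  const blockSize = u32[ptr >> 2];     // includes the header itself
  const word1 = u32[(ptr >> 2) + 1];   // file flag, or next_empty_block_ptr
  if (word1 === FILE_FLAG) {
    return { kind: 'file', blockSize, contentBytes: blockSize - 8 };
  }
  return {
    kind: 'empty',
    blockSize,
    nextEmpty: word1,                  // 0 means end of list (offset 0 is reserved)
    prevEmpty: u32[(ptr >> 2) + 2],
  };
}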

extWasmMemFS.index

index stores the file_block header pointer for each file, as well as the file size.
The layout is as follows:

index: |file_0_block_ptr| file_0_size| file_1_block_ptr | file_1_size | ... |
             4 bytes        4 bytes         4 bytes          4 bytes
Each file occupies 8 bytes. Note that file_size is different from file_block_size.
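
For illustration, a hypothetical lookup helper over this layout (indexU32 is a Uint32Array view of the index buffer):

// Hypothetical helper: fetch a file's block pointer and logical size.
function readIndexEntry(indexU32, fileId) {
  const base = fileId * 2;        // two u32 words (8 bytes) per file
  return {
    blockPtr: indexU32[base],     // byte offset of the file_block in dataFiles
    fileSize: indexU32[base + 1], // logical size; at most block_size - 8
  };
}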

extWasmMemFS.control

Stores 5 int32 values:

| w_mutex | r_count | index_count | head_empty_ptr | unalloc_ptr |
  • w_mutex, r_count: Two atomic variables that make up a read-write lock. Both reads and writes to ExtWasmMemFS use this same read-write lock.
  • index_count: Atomic variable recording the number of files in the index buffer; it only increases.
  • head_empty_ptr: Head of the empty_block doubly linked list in the dataFiles buffer. (No tail pointer is kept, since it is never used.)
  • unalloc_ptr: Pointer to the start of the unallocated region of dataFiles, immediately past the end of the last block (it may be exactly buffer.byteLength).
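
For the sketches below, we assume the five fields occupy Int32Array slots 0-4 in the order listed (the PR does not pin down the exact offsets):

// Assumed slot indices into the 20-byte control buffer (field order as above).
const CTRL_W_MUTEX = 0;        // write mutex: 0 = free, 1 = held
const CTRL_R_COUNT = 1;        // number of active readers
const CTRL_INDEX_COUNT = 2;    // number of allocated index entries (monotonic)
const CTRL_HEAD_EMPTY_PTR = 3; // head of the empty_block free list
const CTRL_UNALLOC_PTR = 4;    // start of the unallocated tail region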

Function Details

Read-write lock

  • All operations on ExtWasmMemFS go through a single read-write lock implemented with atomic operations. Write operations are guaranteed to be performed by only one thread at a time, and all waits are synchronous spin waits.
  • If write performance later proves insufficient, it can be improved by separating the memory-pool lock from per-file locks, reducing lock granularity. (Not implemented; our app is not sensitive to write performance.)
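
For reference, a minimal sketch of such a spin read-write lock, using the control-slot indices assumed above (the PR does not show its exact protocol; spin waits are used because Atomics.wait() is not permitted on the main thread):

// Spin read-write lock over the Int32Array view of the control buffer.
function lockWrite(ctrl) {
  // Take the write mutex, then wait for in-flight readers to drain.
  while (Atomics.compareExchange(ctrl, CTRL_W_MUTEX, 0, 1) !== 0) {}
  while (Atomics.load(ctrl, CTRL_R_COUNT) !== 0) {}
}
function unlockWrite(ctrl) {
  Atomics.store(ctrl, CTRL_W_MUTEX, 0);
}
function lockRead(ctrl) {
  for (;;) {
    while (Atomics.load(ctrl, CTRL_W_MUTEX) !== 0) {}  // wait for the writer to leave
    Atomics.add(ctrl, CTRL_R_COUNT, 1);                // optimistic registration
    if (Atomics.load(ctrl, CTRL_W_MUTEX) === 0) return;
    Atomics.sub(ctrl, CTRL_R_COUNT, 1);                // a writer raced us; retry
  }
}
function unlockRead(ctrl) {
  Atomics.sub(ctrl, CTRL_R_COUNT, 1);
}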

Handle provided to the C++ section

  • Each C++ DataFile instance corresponds to a slot number in the index buffer, allocated from the index_count counter in the control buffer.

Method for allocating file blocks of a given size on dataFiles

  • Find a large enough block by traversing the doubly linked list from head_empty_ptr.
    • While traversing, try to expand each empty_block backward (see the Defragmentation section below).
  • If an available empty_block is found:
    • Remove the block from the linked list.
    • Split the block in two: the first part becomes the required file_block and the second becomes a new empty_block.
      • If not enough space is left over (an empirical threshold is used here), there is no second block, and the original block becomes the file_block as a whole. (It may be larger than requested, but this keeps adjacent blocks contiguous and avoids small, scattered empty_blocks.)
    • Initialize the file_block and empty_block headers (write block_size and the 0xFFFFFFFF file flag).
    • Insert the new empty_block into the doubly linked list (at the head, via head_empty_ptr).
  • If no empty_block is available:
    • Allocate a block from the unalloc area.
    • If the allocation would exceed the buffer length, call memory.grow() to request more memory.
    • Initialize the file_block.
    • Update unalloc_ptr.
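
Putting these steps together, a hedged sketch of the allocator (all helper names are ours, not the PR's; the write lock is held; expandBackwards is sketched in the defragmentation section below, refreshLocalViews in the cache-invalidation section, and SPLIT_THRESHOLD stands in for the unspecified empirical value):

const MIN_BLOCK = 12;       // minimal block size, from the layout above
const SPLIT_THRESHOLD = 64; // assumed stand-in for the empirical split value

// Free-list helpers (doubly linked list threaded through empty_block headers).
function freeListPushHead(u32, ctrl, p) {
  const head = ctrl[CTRL_HEAD_EMPTY_PTR] >>> 0;
  u32[(p >> 2) + 1] = head;                  // next_empty_block_ptr
  u32[(p >> 2) + 2] = 0;                     // prev: 0 == none (offset 0 is reserved)
  if (head !== 0) u32[(head >> 2) + 2] = p;
  ctrl[CTRL_HEAD_EMPTY_PTR] = p;
}
function freeListRemove(u32, ctrl, p) {
  const next = u32[(p >> 2) + 1];
  const prev = u32[(p >> 2) + 2];
  if (prev !== 0) u32[(prev >> 2) + 1] = next;
  else ctrl[CTRL_HEAD_EMPTY_PTR] = next;     // p was the head
  if (next !== 0) u32[(next >> 2) + 2] = prev;
}

function allocFileBlock(local, ctrl, contentBytes) {
  const need = Math.max(MIN_BLOCK, (contentBytes + 8 + 3) & ~3); // header + alignment
  // 1. First-fit walk of the empty_block list, defragmenting as we go.
  //    (Reading the next pointer only after merging keeps the walk consistent.)
  for (let q = ctrl[CTRL_HEAD_EMPTY_PTR] >>> 0; q !== 0; ) {
    expandBackwards(local, ctrl, q);
    const u32 = local.dataFilesU32;
    const blockSize = u32[q >> 2];
    if (blockSize >= need) {
      freeListRemove(u32, ctrl, q);
      if (blockSize - need >= SPLIT_THRESHOLD) {
        u32[q >> 2] = need;
        u32[(q + need) >> 2] = blockSize - need; // tail becomes a new empty_block
        freeListPushHead(u32, ctrl, q + need);
      }                                          // else: hand out the whole block
      u32[(q >> 2) + 1] = 0xFFFFFFFF;            // file_block flag
      return q;
    }
    q = u32[(q >> 2) + 1];
  }
  // 2. No fit: carve from the unalloc area, growing the Memory on demand.
  const p = ctrl[CTRL_UNALLOC_PTR] >>> 0;
  if (p + need > local.dataFilesU8.byteLength) {
    const mem = Module['extWasmMemFS']['dataFiles'];
    mem.grow(Math.ceil((p + need - mem.buffer.byteLength) / 65536));
    refreshLocalViews(local);                    // grown buffer => new views needed
  }
  local.dataFilesU32[p >> 2] = need;
  local.dataFilesU32[(p >> 2) + 1] = 0xFFFFFFFF;
  ctrl[CTRL_UNALLOC_PTR] = p + need;
  return p;
}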

How to delete file blocks on dataFiles

  • Defragmentation
    • Try to expand the block backwards.
    • If the block is now the last block, return it directly to the unalloc area.
  • If not, insert the block into the doubly linked list (at the head, via head_empty_ptr).

Defrag - Expand blocks backwards

  • Find the next adjacent block via block_size; if it is an empty_block:
    • Remove it from the list and add its size to the current block_size.
    • Repeat until the next block is not an empty_block or the unalloc area is reached.

Defrag - try to put into unalloc

  • If the block is the last block (block_ptr + block_size >= unalloc_ptr)
    • unalloc_ptr = block_ptr;
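
A sketch of both defragmentation steps, reusing the helpers and constants from the allocation sketch above (write lock held):

// Absorb physically-following empty_blocks into the block at blockPtr.
function expandBackwards(local, ctrl, blockPtr) {
  const u32 = local.dataFilesU32;
  const unalloc = ctrl[CTRL_UNALLOC_PTR] >>> 0;
  for (;;) {
    const next = blockPtr + u32[blockPtr >> 2];     // next adjacent block
    if (next >= unalloc) break;                     // reached the unalloc area
    if (u32[(next >> 2) + 1] === 0xFFFFFFFF) break; // next is a file_block
    freeListRemove(u32, ctrl, next);                // absorb the empty neighbor
    u32[blockPtr >> 2] += u32[next >> 2];
  }
}

// Delete a file_block: defragment, then recycle it.
function freeBlock(local, ctrl, blockPtr) {
  expandBackwards(local, ctrl, blockPtr);
  const u32 = local.dataFilesU32;
  if (blockPtr + u32[blockPtr >> 2] >= (ctrl[CTRL_UNALLOC_PTR] >>> 0)) {
    ctrl[CTRL_UNALLOC_PTR] = blockPtr;              // last block: back to unalloc
  } else {
    freeListPushHead(u32, ctrl, blockPtr);          // otherwise: free-list head
  }
}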

Try to expand a file_block in place:

  1. If the current block is the last block in the buffer, take the space from the unalloc area, calling dataFiles.grow() to request more memory when needed.
  2. If it is not the last block, look for an empty_block immediately after the current file_block.
    1. Because defragmentation expands empty_blocks backwards, one empty_block of lookahead is enough.
    2. If the empty_block is too large, split it into two blocks, maintain the doubly linked list, and merge the front part into the file_block.
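
A hedged sketch of this in-place growth, building on the helpers above (`need` is the desired total block size, header included):

function tryExpandInPlace(local, ctrl, p, need) {
  const size = local.dataFilesU32[p >> 2];
  if (size >= need) return true;
  const next = p + size;
  if (next >= (ctrl[CTRL_UNALLOC_PTR] >>> 0)) {
    // Case 1: last block -- extend straight into the unalloc area.
    const mem = Module['extWasmMemFS']['dataFiles'];
    if (p + need > mem.buffer.byteLength) {
      mem.grow(Math.ceil((p + need - mem.buffer.byteLength) / 65536));
      refreshLocalViews(local);               // views may now be stale
    }
    local.dataFilesU32[p >> 2] = need;
    ctrl[CTRL_UNALLOC_PTR] = p + need;
    return true;
  }
  // Case 2: a single lookahead suffices, since defragmentation keeps
  // adjacent empty_blocks merged.
  const u32 = local.dataFilesU32;
  if (u32[(next >> 2) + 1] === 0xFFFFFFFF) return false; // neighbor is a file_block
  const avail = size + u32[next >> 2];
  if (avail < need) return false;
  freeListRemove(u32, ctrl, next);
  if (avail - need >= SPLIT_THRESHOLD) {
    u32[p >> 2] = need;
    u32[(p + need) >> 2] = avail - need;      // re-split the surplus
    freeListPushHead(u32, ctrl, p + need);
  } else {
    u32[p >> 2] = avail;                      // absorb the whole neighbor
  }
  return true;
}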

File writing (write API)

  • Read file_block_ptr from the index buffer.
  • If the file_block is large enough, write directly.
  • Otherwise, try to expand the file_block in place.
  • If it cannot be expanded, allocate a new file_block:
    • Pre-allocation size strategy: file_size > 65536 ? file_size * 1.2 : file_size * 2;
    • Copy the existing content to the new file_block.
  • Write the content.
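
Tying these steps together, a sketch of a whole-file write (hypothetical names; write lock held; helpers from the sections above; a positional write would additionally copy the old content across, as described):

function writeFile(local, ctrl, fileId, srcU8) {
  const idx = local.index;
  let p = idx[fileId * 2] >>> 0;
  const need = (srcU8.length + 8 + 3) & ~3;          // header + 4-byte alignment
  if (p === 0 ||
      (local.dataFilesU32[p >> 2] < need && !tryExpandInPlace(local, ctrl, p, need))) {
    // Pre-allocate with headroom: >64 KiB files grow by 1.2x, smaller ones by 2x.
    const target = srcU8.length > 65536 ? srcU8.length * 1.2 : srcU8.length * 2;
    const np = allocFileBlock(local, ctrl, Math.ceil(target));
    if (p !== 0) freeBlock(local, ctrl, p);          // recycle the old block
    p = np;
    idx[fileId * 2] = p;
  }
  local.dataFilesU8.set(srcU8, p + 8);               // content starts after the header
  idx[fileId * 2 + 1] = srcU8.length;                // logical file size
}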

File reading (read API)

  • Read according to the data layout. 🤪

In-thread TypedArray object cache and cache invalidation

Reading and writing a SharedArrayBuffer requires TypedArray objects, and creating them frequently hurts performance, so we maintain a set of TypedArray objects per thread. In ExtWasmMemFS we construct the following extWasmMemFS_local object in each web worker:

$extWasmMemFS_local: {
        /**@type Uint32Array*/
        dataFilesU32: null, 

        /**@type Uint8Array */
        dataFilesU8: null,

        /**@type Int32Array */
        control: null,

        /** @type Uint32Array*/
        index: null,
}

//run in __postset
extWasmMemFS_local.control = new Int32Array(Module['extWasmMemFS']['control']);
extWasmMemFS_local.index = new Uint32Array(Module['extWasmMemFS']['index']);
extWasmMemFS_local.dataFilesU32 = new Uint32Array(Module['extWasmMemFS']['dataFiles'].buffer);
extWasmMemFS_local.dataFilesU8 = new Uint8Array(Module['extWasmMemFS']['dataFiles'].buffer);

However, the cached views can become stale when WebAssembly.Memory.prototype.grow() is called. When that happens, the corresponding TypedArrays must be recreated:

extWasmMemFS_local.dataFilesU32 = new Uint32Array(Module['extWasmMemFS']['dataFiles'].buffer);
extWasmMemFS_local.dataFilesU8 = new Uint8Array(Module['extWasmMemFS']['dataFiles'].buffer);

Because the architecture is multi-threaded, other threads receive no message when the buffer changes, so they need to check and refresh their views at some well-defined point.

Since a thread must hold the write lock of the global read-write lock to grow the buffer, it is sufficient for every other thread to recreate its TypedArrays each time it acquires the write lock.
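
A sketch of that refresh, using the names from the snippet above (for a shared Memory, grow() does not detach old views, but they keep their original length, so comparing byteLength detects staleness):

// Recreate the per-thread views if the underlying Memory has grown.
// Cheap when nothing changed; called right after acquiring the write lock.
function refreshLocalViews(local) {
  const buf = Module['extWasmMemFS']['dataFiles'].buffer;
  if (local.dataFilesU8 === null || local.dataFilesU8.byteLength !== buf.byteLength) {
    local.dataFilesU32 = new Uint32Array(buf);
    local.dataFilesU8 = new Uint8Array(buf);
  }
}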

@kripken kripken (Member) commented Aug 15, 2023

Very interesting, thank you for posting this!

The spotlights section has some really great features that I definitely agree we want to support.

After thinking on this, I think the key benefit is to use an entirely separate Memory for storage. Another way to achieve that could be by compiling a separate program which would have its own Memory. That is, instead of src/library_wasmfs_extwasmmem.js which implements managing memory in that space, I am imagining that one would compile a second program to wasm. In more detail:

  • The main program is compiled normally into wasm+JS.
  • A "file storage" program is compiled into wasm+JS as well.
  • The main program has a new backend that is "connected" to a file storage program using some standard API. Perhaps it receives the Module or the wasm exports from the file storage program, or such.
  • So the main program and file storage programs have their own Memories, but each file operation is handed off to the file storage program.

Then we could implement various file storage programs, including one with 4GB support, and also by just recompiling to wasm64 we could allow even more than 4GB. That is one benefit to compiling the file storage to wasm. Another is that I think writing such a backend in C++ (or another language) may be more robust than JS - JS is great for small backends that interface with Web APIs, but (as this PR shows) there are a lot of options in the space of file storage, and as code gets larger it is usually more efficient to do in wasm.

I don't have specific ideas for the API yet, but perhaps it could build on the existing JSImpl backend approach we have somehow.

Thoughts?

@kripken kripken added the wasmfs label Aug 15, 2023
@ufolyah ufolyah (Author) commented Aug 15, 2023

Implementing a better FS schema in a separate wasm program is definitely a good idea. I reused MemoryBackend for the directory part because writing a full FS requires a lot of effort, and my memory pool implementation is a simple one as well. If we could run a wasm FS backend that directly manipulates the memory pool, we could support far more advanced features.

For the implementation details, I have several concerns based on our requirements:

  1. Full multi-threading support: any thread (worker) of the file storage program and of the main program should be able to access both memories without postMessage, especially without posting to the main thread.
  2. Memory usage limits. We cannot always assume our users have enough memory for file storage; memory-based storage larger than 4GB may not be acceptable in production. And, as far as I know, there is currently no way to find out how much memory is available on the host machine.
  3. Persistent file storage. The ideal FS for us is persistent storage fronted by a memory-pool cache. On browsers we have IndexedDB and OPFS for that purpose today, but each has its disadvantages:
    • IndexedDB only supports whole-file reads/writes of binary data, so a memory cache or some special storage schema must be implemented to use it as an FS backend.
    • OPFS has different APIs on the main thread and worker threads (the sync API is only available in workers).
    • OPFS browser compatibility is not good.
  4. Good, fast access via a JS API on the main thread. Our JS team handles all resource fetching, and they have to write those resources into WasmFS to use them. They prefer a synchronous file r/w API like the old MEMFS. This is actually a difficult task because:
    • we cannot wait on the main thread;
    • writing a JS array to an FS implemented in another thread requires two copies (into wasm memory, then into file storage), unless we use postMessage;
    • some backends (like OPFS) have different APIs on the main thread and other threads;
    • if we use any async API or post a task to another thread, the file write API becomes non-synchronous, which would be a breaking change to the current JS API.

@tra4less

any update?
