Wasm Library Registry Design

Paul Cooper edited this page Oct 10, 2023 · 16 revisions

Overview

Many developers have tried to port native projects to Wasm to bring them to the web. When building C/C++ projects, we have tools like vcpkg for acquiring and managing the libraries a project needs. For Node.js projects, we also have mature registry services and powerful CLI tools, such as npm, to manage packages.

Libraries ported from native to Wasm have no such service providing a dedicated registry and package management. Some developers publish their ported libraries to npm (e.g., FFmpeg), and the Emscripten-ports group on GitHub holds several Emscripten-ported libraries that can be pulled into an Emscripten build through command line options. However, many other ported libraries remain scattered across different places.

Therefore, we’d like to have a registry for these ported Wasm libraries that gathers this work in one place and improves its reusability.

How to reuse ported libraries?

To reuse a ported native library, there are broadly three options:

a) Statically link to the main project and build together.

With static linking in Emscripten, dependent libraries are linked into the main project to produce a standalone application with no external dependencies. Most native build systems work with Emscripten in this way with little or no change. Static linking usually gives better performance and smaller code size than dynamic linking, but less flexibility.

b) Pre-built into a Wasm module and dynamically linked to the main project.

With dynamic linking in Emscripten, dependent libraries are pre-built into Wasm as side modules (-sSIDE_MODULE), whose exports are imported into the main project's Wasm module (-sMAIN_MODULE) by JavaScript glue code.

c) Pre-built into a Wasm module that follows the WebAssembly Component Model spec and is served like a microservice.

In this case, modules are dynamically composed at runtime, and they can communicate with other modules in a standard, portable way. The component model provides features such as Wasm Interface Type (WIT), the Canonical ABI, and module/component linking mechanisms that allow code to be reused without duplication. This goes beyond b) by making dynamic linking more standardized, but it requires runtime and toolchain support that follows the spec.

The current Webnizer and this design doc mainly cover a) and b). Option c) will be considered in the future, when runtime and toolchain support for the component model is more mature.
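To make options a) and b) concrete, the following is a minimal sketch of the emcc invocations each one might use. It only assembles argument lists and runs nothing. The helper names and the file names (main.c, libfoo.c, and so on) are hypothetical, and real projects will usually need additional flags.

```python
# Sketch only: these helpers assemble emcc argument lists and execute nothing.
# All file names are hypothetical placeholders.


def static_link_cmd(source: str, archives: list[str], output: str) -> list[str]:
    """Option a): link static archives directly into one standalone application."""
    return ["emcc", source, *archives, "-o", output]


def side_module_cmd(source: str, output: str) -> list[str]:
    """Option b), step 1: pre-build a dependency as a side module."""
    return ["emcc", source, "-sSIDE_MODULE", "-o", output]


def main_module_cmd(source: str, side_modules: list[str], output: str) -> list[str]:
    """Option b), step 2: build the main module and name the side modules it loads."""
    return ["emcc", source, "-sMAIN_MODULE", *side_modules, "-o", output]


if __name__ == "__main__":
    commands = [
        static_link_cmd("main.c", ["libfoo.a"], "app.js"),
        side_module_cmd("libfoo.c", "libfoo.wasm"),
        main_module_cmd("main.c", ["libfoo.wasm"], "app.js"),
    ]
    for cmd in commands:
        print(" ".join(cmd))
```

The difference a registry has to capture is visible here: a statically linked dependency is consumed as an archive at build time, while a side module is a separate artifact that must be shipped alongside the main module.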

Use cases

When users have a project or codebase in C/C++ with several dependent libraries and would like to bring it to the web, there are three typical use cases in the development cycle:

  • Query for dependencies of the project that have already been ported from native to Wasm.
  • Import a candidate ported library and integrate it with the current project.
  • Publish the ported project to public repositories (e.g., npm, GitHub, etc.) for distribution and sharing.

Query

When a user wants to query for a ported dependency of the project:

  1. To get started, the user searches related keywords (e.g., wasm <library_name>) on npm or GitHub to find out whether any existing project provides useful or similar functionality to reuse, before building everything from the ground up.
  2. The user goes through the documentation, and possibly the build instructions, of each candidate project to understand whether it is useful. The user usually cares about:
    1. What’s the runtime environment of the library? Web or Node?
    2. How is the library distributed? Source code, archive file, shared library, or JS module?
    3. How to build the library from source, what build configuration is used and how to integrate it with the current project?
  3. If the user finds a candidate project that meets the requirements, they start importing and integrating it.
  4. If not, the user retries with new keywords or other databases to find new candidates.

Since there is no central place to query, searching for and identifying a ported library to reuse takes a lot of effort. The process described above may have to be repeated for every dependency.

A registry service for Wasm libraries allows users to quickly query for already ported libraries and use them in their own projects, which improves library reusability. The use case would be optimized as:

  1. Search related keywords in the registry, which responds with a filtered list of candidate projects. Since the registry holds ported Wasm libraries only, filtering by keyword is quicker, more accurate, and more reliable.
  2. Each candidate provides a summary of the project's build capabilities (e.g., build target, configurations, instructions, etc.) so users can tell whether it can be used as a dependency. The summary is extracted automatically from project metadata, so users do not need to scrutinize these technical details manually.
  3. Many projects are organized well enough that their dependencies are described in CMake or .sln files. In that case, tools can extract this information automatically, use it to search the registry, and recommend the identified candidates to developers.
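The optimized query flow above could be sketched roughly as follows. This is an illustration under stated assumptions, not the registry's actual API: the in-memory REGISTRY list, its field names (name, env, targets), and the helper functions are all hypothetical, and the dependency extraction only recognizes plain find_package() calls in CMake text.

```python
import re
from typing import Optional

# Hypothetical registry entries; the field names are illustrative, not a defined schema.
REGISTRY = [
    {"name": "zlib", "env": ["web", "node"], "targets": ["static", "shared"]},
    {"name": "libpng", "env": ["web"], "targets": ["static"]},
]

# Matches the package name in find_package(<name> ...); CMake commands are case-insensitive.
FIND_PACKAGE = re.compile(r"find_package\s*\(\s*([A-Za-z0-9_+.-]+)", re.IGNORECASE)


def extract_dependencies(cmake_text: str) -> list[str]:
    """Collect lower-cased package names from find_package() calls, skipping comments."""
    names: list[str] = []
    for line in cmake_text.splitlines():
        code = line.split("#", 1)[0]  # drop trailing CMake comments
        for name in FIND_PACKAGE.findall(code):
            key = name.lower()
            if key not in names:
                names.append(key)
    return names


def search(keyword: str, env: Optional[str] = None) -> list[dict]:
    """Return entries whose name contains the keyword, optionally filtered by runtime."""
    keyword = keyword.lower()
    return [
        entry
        for entry in REGISTRY
        if keyword in entry["name"].lower() and (env is None or env in entry["env"])
    ]


def recommend(cmake_text: str, env: Optional[str] = None) -> dict[str, list[str]]:
    """Map each extracted dependency to the names of matching registry candidates."""
    return {
        dep: [entry["name"] for entry in search(dep, env)]
        for dep in extract_dependencies(cmake_text)
    }
```

A dependency that maps to an empty list has no ported candidate in the registry, which tells the user which libraries still need to be ported by hand.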

Import

When the user identifies a ported library and wants to integrate it into the project:

  1. Clones or downloads the library from GitHub or npm.
  2. If the ported library is distributed as an archive file or library binary, the user needs to modify the project build script (e.g., CMake or autotools) with the proper compiler and linker flags to integrate it.
  3. If the ported library is distributed as source code:
    1. The user has to look for any build instructions (e.g., a readme file or build script) available in the repo and tries to compile the ported library based on them. The build may run automatically if a build script is provided; otherwise the user needs to execute every command manually.
    2. Errors may occur if the build toolchain, source code, and related prerequisites (e.g., a dependency of the ported library) are not properly configured. The user bounces between the build instructions and the command terminal to resolve them.
    3. Integrates the ported library into the current project after a successful build.

Most libraries available on GitHub or npm have no unified distribution format and no standardized metadata defining information such as build toolchains, configurations, and commands. Setting up the project, managing dependencies, and controlling the overall build workflow therefore usually requires manual effort.

A registry service for Wasm libraries ensures that libraries imported from it already have a standardized metadata file with sufficient information. The use case would be optimized as:

  1. Import the dependency via a simple command or a click. That’s it! Webnizer will handle the import process automatically based on the metadata: fetch the source code, resolve dependencies, and set up build steps and package configurations.
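As a rough illustration of the "resolve dependencies" part of a metadata-driven import, the sketch below orders packages so that each dependency is fetched and built before anything that needs it. The METADATA dictionary and its dependencies field are hypothetical stand-ins for whatever the registry's metadata file ends up containing, and this is not Webnizer's actual implementation.

```python
from graphlib import TopologicalSorter

# Hypothetical metadata as it might be read from the registry for each package.
METADATA = {
    "myapp": {"dependencies": ["libpng"]},
    "libpng": {"dependencies": ["zlib"]},
    "zlib": {"dependencies": []},
}


def collect_graph(root: str, metadata: dict) -> dict[str, list[str]]:
    """Walk from root and record the direct dependencies of every reachable package."""
    graph: dict[str, list[str]] = {}
    pending = [root]
    while pending:
        name = pending.pop()
        if name in graph:
            continue
        if name not in metadata:
            raise KeyError(f"{name} is not available in the registry")
        deps = metadata[name]["dependencies"]
        graph[name] = deps
        pending.extend(deps)
    return graph


def import_order(root: str, metadata: dict) -> list[str]:
    """Return packages in an order where dependencies precede their dependents."""
    # TopologicalSorter takes a mapping of node -> predecessors and raises
    # graphlib.CycleError if the dependency graph contains a cycle.
    return list(TopologicalSorter(collect_graph(root, metadata)).static_order())
```

With the sample metadata, import_order("myapp", METADATA) places zlib first and myapp last. A real importer would replace the final list with fetch and build steps driven by each package's metadata.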

Publish

When the user successfully ports a library to Wasm and wants to publish it to a public database so that it could be reused by others:

  1. Wraps up the work by committing all the modifications needed for a successful build into the repo.
  2. Documents the build instructions and configurations properly, usually in a readme file or a build script.
  3. Uploads the project to GitHub or npm for distribution.

This is the basic routine developers follow to share their work. However, it in turn introduces the issues described in the Query and Import use cases: ported libraries are spread across different public databases, and they lack a unified distribution format and standardized metadata that describe a library for its consumers.

A registry service for Wasm libraries requires developers to publish their work with standardized metadata. The metadata should contain sufficient information for building (e.g., build instructions, dependencies, package configurations, etc.), which ensures other users can reuse the library and consistently reproduce the build results. The use case would be optimized as:

  1. Publish the project via a simple command or a click.

Similar to the Import use case, all the necessary steps will be handled automatically by Webnizer behind the scenes, including finalizing the metadata, packaging the project, and uploading it to the registry.
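One way a publish step might enforce the standardized metadata requirement is a check that runs before upload. The sketch below is an assumption-laden illustration: the field names (name, version, build, dependencies, package) and their types are hypothetical and only mirror the kinds of information listed above, not a schema defined by this design.

```python
# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {
    "name": str,
    "version": str,
    "build": dict,         # build instructions and configurations
    "dependencies": list,  # names of other registry packages
    "package": dict,       # package configurations
}


def validate_metadata(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the metadata can be published."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in meta:
            problems.append(f"missing field: {field}")
        elif not isinstance(meta[field], expected):
            problems.append(f"field {field} should be of type {expected.__name__}")
    return problems


if __name__ == "__main__":
    draft = {"name": "libfoo", "version": "1.0.0", "dependencies": "zlib"}
    for problem in validate_metadata(draft):
        print(problem)
```

Rejecting incomplete metadata at publish time is what lets the Query and Import steps rely on it later without manual inspection.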

The other area to consider is how to host a registry for ported WebAssembly libraries.