[doc] update docs for ORT web v1.19 (#20756)
fs-eire committed May 21, 2024
1 parent 10973e7 commit 67ba2e5
Showing 5 changed files with 32 additions and 70 deletions.
52 changes: 19 additions & 33 deletions docs/build/web.md
@@ -74,7 +74,7 @@ ONNX Runtime Web can be built with WebGPU and WebNN support via JavaScript Execu

ONNX Runtime Web can also be built to support the training APIs. To build with training APIs included, use the flag `--enable-training-apis`.

A complete build for ONNX runtime WebAssembly artifacts will contain 7 ".wasm" files (ON/OFF configurations of the flags in the table above) with a few ".js" files.
The WebAssembly artifacts of a complete build for ONNX Runtime Web will contain 3 ".wasm" files with 3 ".mjs" files.
The build command below should be run for each of the configurations.

in `<ORT_ROOT>/`, run one of the following commands to build WebAssembly:
@@ -84,33 +84,29 @@ in `<ORT_ROOT>/`, run one of the following commands to build WebAssembly:
# It's recommended to use `--skip_tests` in Release & Debug + 'debug info' configurations - please review FAQ for more details

# The following command builds debug.
./build.sh --build_wasm
./build.sh --build_wasm --enable_wasm_simd --enable_wasm_threads

# The following command builds debug with debug info.
./build.sh --build_wasm --skip_tests --enable_wasm_debug_info
./build.sh --build_wasm --enable_wasm_simd --enable_wasm_threads --skip_tests --enable_wasm_debug_info

# The following command builds release.
./build.sh --config Release --build_wasm --skip_tests --disable_wasm_exception_catching --disable_rtti
```

A full list of required build artifacts:

| file name | file name (renamed) | build flag used |
| --------------------------- | -------------------------------- | ----------------------------------------------------------------------- |
| ort-wasm.js | | |
| ort-wasm.wasm | | |
| ort-wasm-threaded.js | | `--enable_wasm_threads` |
| ort-wasm-threaded.wasm | | `--enable_wasm_threads` |
| ort-wasm-threaded.worker.js | | `--enable_wasm_threads` |
| ort-wasm-simd.wasm | | `--enable_wasm_simd` |
| ort-wasm-simd-threaded.wasm | | `--enable_wasm_simd` `--enable_wasm_threads` |
| ort-wasm-simd.js | ort-wasm-simd.jsep.js | `--use_jsep` `--use_webnn` `--enable_wasm_simd` |
| ort-wasm-simd.wasm | ort-wasm-simd.jsep.wasm | `--use_jsep` `--use_webnn` `--enable_wasm_simd` |
| ort-wasm-simd-threaded.js | ort-wasm-simd-threaded.jsep.js | `--use_jsep` `--use_webnn` `--enable_wasm_simd` `--enable_wasm_threads` |
| ort-wasm-simd-threaded.wasm | ort-wasm-simd-threaded.jsep.wasm | `--use_jsep` `--use_webnn` `--enable_wasm_simd` `--enable_wasm_threads` |
| ort-training-wasm-simd.wasm | | `--enable_wasm_simd` `--enable_training_apis` |
| file name | build flag used |
| ------------------------------------ | -------------------------------------------------------------------------- |
| ort-wasm-simd-threaded.wasm | `--enable_wasm_simd` `--enable_wasm_threads` |
| ort-wasm-simd-threaded.mjs | `--enable_wasm_simd` `--enable_wasm_threads` |
| ort-wasm-simd-threaded.jsep.wasm | `--use_jsep` `--use_webnn` `--enable_wasm_simd` `--enable_wasm_threads` |
| ort-wasm-simd-threaded.jsep.mjs | `--use_jsep` `--use_webnn` `--enable_wasm_simd` `--enable_wasm_threads` |
| ort-training-wasm-simd-threaded.wasm | `--enable_wasm_simd` `--enable_wasm_threads` `--enable_training_apis` |
| ort-training-wasm-simd-threaded.mjs | `--enable_wasm_simd` `--enable_wasm_threads` `--enable_training_apis` |

NOTE: WebGPU and WebNN is currently supported as experimental feature for ONNX Runtime Web. The build instructions may change. Please make sure to refer to latest documents from [webgpu gist](https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce) and [webnn gist](https://gist.github.com/Honry/88b87c43b3f51a6c38c10454f3599405) for a detailed build/consume instruction for ORT Web WebGPU and WebNN.
NOTE:
- ONNX Runtime Web is dropping support for non-SIMD and non-threaded builds starting with v1.19.0.
- WebGPU and WebNN are currently supported as experimental features for ONNX Runtime Web. The build instructions may change. Please refer to the latest documents from [webgpu gist](https://gist.github.com/fs-eire/a55b2c7e10a6864b9602c279b8b75dce) and [webnn gist](https://gist.github.com/Honry/88b87c43b3f51a6c38c10454f3599405) for detailed build/consume instructions for ORT Web WebGPU and WebNN.


### Minimal Build Support
@@ -182,22 +178,12 @@ This is the last stage in the build process, please follow the sections in a seq

2. Copy following files from build output folder to `<ORT_ROOT>/js/web/dist/` (create the folder if it does not exist):

* ort-wasm.wasm (build with default flag)
* ort-wasm-threaded.wasm (build with flag `--enable_wasm_threads`)
* ort-wasm-simd.wasm (build with flag `--enable_wasm_simd`)
* ort-wasm-simd-threaded.wasm (build with flags `--enable_wasm_threads --enable_wasm_simd`)
* ort-wasm-simd.jsep.wasm (renamed from file `ort-wasm-simd.wasm`, build with flags `--use_jsep --enable_wasm_simd`)
* ort-wasm-simd-threaded.jsep.wasm (renamed from file `ort-wasm-simd-threaded.wasm`, build with flags `--use_jsep --enable_wasm_simd --enable_wasm_threads`)
* ort-training-wasm-simd.wasm (build with flags `--enable_wasm_simd --enable_training_apis`)

3. Copy following files from build output folder to `<ORT_ROOT>/js/web/lib/wasm/binding/`:

* ort-wasm.js (build with default flag)
* ort-wasm-threaded.js (build with flag `--enable_wasm_threads`)
* ort-wasm-threaded.worker.js (build with flag `--enable_wasm_threads`)
* ort-wasm-simd.jsep.js (renamed from file `ort-wasm-simd.js`, build with flags `--use_jsep --enable_wasm_simd`)
* ort-wasm-simd-threaded.jsep.js (renamed from file `ort-wasm-simd-threaded.js`, build with flags `--use_jsep --enable_wasm_simd --enable_wasm_threads`)
* ort-training-wasm-simd.js (build with flags `--enable_wasm_simd --enable_training_apis`)
* ort-wasm-simd-threaded.mjs (build with flags `--enable_wasm_threads --enable_wasm_simd`)
* ort-wasm-simd-threaded.jsep.wasm (build with flags `--use_jsep --use_webnn --enable_wasm_simd --enable_wasm_threads`)
* ort-wasm-simd-threaded.jsep.mjs (build with flags `--use_jsep --use_webnn --enable_wasm_simd --enable_wasm_threads`)
* ort-training-wasm-simd-threaded.wasm (build with flags `--enable_wasm_simd --enable_wasm_threads --enable_training_apis`)
* ort-training-wasm-simd-threaded.mjs (build with flags `--enable_wasm_simd --enable_wasm_threads --enable_training_apis`)

### Finalizing onnxruntime build

4 changes: 2 additions & 2 deletions docs/tutorials/web/classify-images-nextjs-github-template.md
@@ -291,10 +291,10 @@ module.exports = {
new CopyPlugin({
patterns: [
{
from: './node_modules/onnxruntime-web/dist/ort-wasm.wasm',
from: './node_modules/onnxruntime-web/dist/ort-wasm-simd-threaded.wasm',
to: 'static/chunks/pages',
}, {
from: './node_modules/onnxruntime-web/dist/ort-wasm-simd.wasm',
from: './node_modules/onnxruntime-web/dist/ort-wasm-simd-threaded.mjs',
to: 'static/chunks/pages',
},
{
26 changes: 11 additions & 15 deletions docs/tutorials/web/deploy.md
@@ -38,14 +38,13 @@ The JavaScript code bundle is usually a minified JavaScript file that contains t

To reduce the size of the JavaScript code bundle, you can use [Conditional Importing](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js/importing_onnxruntime-web#conditional-importing) to import only the necessary parts of the ONNX Runtime Web library. For example, you can import `onnxruntime-web/wasm` if you only use the WebAssembly execution provider, which can reduce the size of the JavaScript code bundle.

#### Inlined worker
#### Worker loading

The ONNX Runtime Web JavaScript code include 3 inlined source code:
1. the web worker for proxy feature
2. the web worker for WebAssembly multi-threading feature
3. the WebAssembly entry generated by `function.toString()` required by (2) for multi-threading feature
There are 2 workers in ONNX Runtime Web that can be loaded at runtime:
1. the web worker for the proxy feature. The ONNX Runtime Web JavaScript code can be loaded as the entry of this web worker.
2. the web worker for the WebAssembly multi-threading feature. The Emscripten-generated JavaScript files can be loaded as the entry of this web worker.

The use of inlined worker helps to keep ONNX Runtime Web to a single JavaScript file, which is easier to deploy and use. However, it may not work in some environments, such as Content Security Policy (CSP) restricted environments. See [Security considerations](#security-considerations) for more details.
When deployed in a same-origin environment, the workers can be loaded directly from the script URL, which allows them to load in Content Security Policy (CSP) restricted environments. When deployed in a cross-origin environment (for example, loading the workers from a CDN), the workers cannot be loaded directly from the script URL because of the same-origin policy. In this case, a `fetch` is performed and the workers are loaded from an object URL created from the fetch response.
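
The same-origin versus cross-origin behavior described above can be sketched roughly as follows. This is an illustration only, not the library's actual loader: `workerLoadStrategy` and `createWorker` are hypothetical names, and the use of a module worker is an assumption based on the `.mjs` artifacts.

```js
// Hypothetical helper (not part of onnxruntime-web): decide how a worker
// script would be loaded, given the script URL and the page origin.
function workerLoadStrategy(scriptUrl, pageOrigin) {
  // Relative URLs resolve against the page origin and are therefore same-origin.
  const scriptOrigin = new URL(scriptUrl, pageOrigin).origin;
  return scriptOrigin === new URL(pageOrigin).origin
    ? 'direct'
    : 'fetch-then-object-url';
}

// Browser-only sketch of the two loading paths (assumes a module worker).
async function createWorker(scriptUrl) {
  if (workerLoadStrategy(scriptUrl, self.location.origin) === 'direct') {
    // Same origin: the worker is created straight from the script URL.
    return new Worker(scriptUrl, { type: 'module' });
  }
  // Cross origin: fetch the script, then create the worker from an object URL.
  const response = await fetch(scriptUrl);
  const objectUrl = URL.createObjectURL(await response.blob());
  return new Worker(objectUrl, { type: 'module' });
}
```

Only `workerLoadStrategy` runs outside a browser; `createWorker` relies on the browser's `Worker`, `fetch`, and `URL.createObjectURL`.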

### WebAssembly binaries

@@ -54,13 +53,9 @@ The standard ONNX Runtime Web library includes the following WebAssembly binary

| File | SIMD | Multi-threading | JSEP | Training |
|-----------|-------------|--|---|---|
| `ort-wasm.wasm` |||||
| `ort-wasm-simd.wasm` | ✔️ ||||
| `ort-wasm-threaded.wasm` || ✔️ |||
| `ort-wasm-simd-threaded.wasm` | ✔️ | ✔️ |||
| `ort-wasm-simd.jsep.wasm` | ✔️ || ✔️ ||
| `ort-wasm-simd-threaded.jsep.wasm` | ✔️ | ✔️ | ✔️ ||
| `ort-training-wasm-simd.wasm` | ✔️ | || ✔️ |
| `ort-training-wasm-simd-threaded.wasm` | ✔️ | ✔️ || ✔️ |


The columns indicate whether the feature is supported by the WebAssembly artifact.
@@ -70,9 +65,10 @@ The columns indicate whether the feature is supported by the WebAssembly artifac
- JSEP: whether the JavaScript Execution Provider (JSEP) feature is enabled. This feature powers the WebGPU and WebNN execution providers.
- Training: whether the training feature is enabled.

When deploying ONNX Runtime Web in a production environment, you should consider which WebAssembly binary file(s) to include in the application. By default, ONNX Runtime Web JavaScript code will check the environment and load the appropriate WebAssembly binary file(s) automatically. This means you should include all combinations of WebAssembly binary file(s) in the deployment for the best compatibility.

However, when your application code imports ONNX Runtime Web with WebGPU or WebNN support, you can just include the 2 WebAssembly binary file(s) for JSEP. Furthermore, if you set the `ort.env.wasm.numThreads` to 1, you can just include file `ort-wasm-simd.jsep.wasm` in your deploy.
When deploying ONNX Runtime Web in a production environment, you should consider which WebAssembly binary file(s) to include in the application. Here are some considerations:
- When using training feature, the `ort-training-wasm-simd-threaded.wasm` file is used.
- When using WebGPU or WebNN execution provider, the `ort-wasm-simd-threaded.jsep.wasm` file is used.
- Otherwise, the `ort-wasm-simd-threaded.wasm` file is used.
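
The considerations above amount to a simple selection rule, which could be written as below when deciding which file to ship. `selectWasmBinary` is a hypothetical helper for a deployment script, not an ONNX Runtime Web API; the file names are the ones listed above.

```js
// Hypothetical helper (not part of onnxruntime-web): pick the WebAssembly
// binary to deploy, following the considerations listed above.
function selectWasmBinary({ training = false, jsep = false } = {}) {
  if (training) {
    // Training feature in use.
    return 'ort-training-wasm-simd-threaded.wasm';
  }
  if (jsep) {
    // WebGPU or WebNN execution provider in use (both are powered by JSEP).
    return 'ort-wasm-simd-threaded.jsep.wasm';
  }
  // Plain WebAssembly execution provider.
  return 'ort-wasm-simd-threaded.wasm';
}

// Example: an application that uses the WebGPU execution provider.
const fileToDeploy = selectWasmBinary({ jsep: true });
```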

#### Ensure the WebAssembly binary file(s) are correctly served

@@ -126,4 +122,4 @@ See [Secure Context](https://developer.mozilla.org/en-US/docs/Web/Security/Secur

### Content Security Policy (CSP) restricted environments

Currently, ONNX Runtime Web uses inline web workers to enable the proxy feature and WebAssembly multi-threading feature. This means in a CSP restricted environment, the features mentioned above may not work. We are working on a solution to make it work in a CSP restricted environment.
Since ONNX Runtime Web v1.19, the WebAssembly binary file(s) and workers can be loaded in CSP restricted environments. The necessary artifacts must be served for this to work.
12 changes: 0 additions & 12 deletions docs/tutorials/web/env-flags-and-session-options.md
@@ -79,25 +79,13 @@ The default value is `0`, which means it will be determined by ONNX Runtime Web

Setting it to `1` will force disable multi-threading. Otherwise, ONNX Runtime Web will check whether the environment supports multi-threading. Multi-threading is enabled only when the browser supports WebAssembly multi-threading and `crossOriginIsolated` mode is enabled. See [Cross Origin Isolation Guide](https://web.dev/cross-origin-isolation-guide/) for more info.

When multi-threading is enabled, ONNX Runtime Web will load the multi-threaded WebAssembly binary file. The corresponding file name will include `-threaded`.

```js
// Disable multi-threading
ort.env.wasm.numThreads = 1;
```
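
The conditions for enabling multi-threading described above can be summarized in a small predicate. This is a sketch of the documented rule, not the library's internal check; `multiThreadingEnabled` and the shape of its `env` argument are hypothetical.

```js
// Hypothetical predicate (not part of onnxruntime-web): would multi-threading
// be enabled for a given numThreads setting and environment?
function multiThreadingEnabled(numThreads, env) {
  // Setting numThreads to 1 force-disables multi-threading.
  if (numThreads === 1) {
    return false;
  }
  // Otherwise both conditions must hold: the browser supports WebAssembly
  // multi-threading, and the page is cross-origin isolated.
  return env.wasmThreadsSupported === true && env.crossOriginIsolated === true;
}

// In a browser, the environment could be described like this (assumption:
// SharedArrayBuffer availability is used as a proxy for thread support):
//   multiThreadingEnabled(ort.env.wasm.numThreads, {
//     wasmThreadsSupported: typeof SharedArrayBuffer !== 'undefined',
//     crossOriginIsolated: self.crossOriginIsolated,
//   });
```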

For more information, see [API reference: env.wasm.numThreads](https://onnxruntime.ai/docs/api/js/interfaces/Env.WebAssemblyFlags.html#numThreads).

#### `env.wasm.simd`

The `env.wasm.simd` flag is used to enable/disable the SIMD (Single Instruction, Multiple Data) feature. It is enabled by default.

When SIMD is enabled, ONNX Runtime Web will perform a check for whether the environment supports SIMD. If the environment supports SIMD, ONNX Runtime Web will load the SIMD WebAssembly binary file. The corresponding file name will include `-simd`.

It is not recommended to set this flag to `false` unless you are sure that the environment does not support SIMD.

For more information, see [API reference: env.wasm.simd](https://onnxruntime.ai/docs/api/js/interfaces/Env.WebAssemblyFlags.html#simd).

#### `env.wasm.proxy`

The `env.wasm.proxy` flag is used to enable/disable the proxy worker feature. It is disabled by default.
8 changes: 0 additions & 8 deletions docs/tutorials/web/performance-diagnosis.md
@@ -72,14 +72,6 @@ ort.env.wasm.numThreads = 0;

See [API reference: env.wasm.numThreads](https://onnxruntime.ai/docs/api/js/interfaces/Env.WebAssemblyFlags.html#numThreads) for more details.

### Enable SIMD

Always enable SIMD if it's supported. SIMD (Single Instruction, Multiple Data) is a set of instructions that perform the same operation on multiple data points simultaneously. This can significantly improve the performance of your application.

This feature is enabled by default in ONNX Runtime Web, unless you explicitly disable it by setting `ort.env.wasm.simd = false`.

See [API reference: env.wasm.simd](https://onnxruntime.ai/docs/api/js/interfaces/Env.WebAssemblyFlags.html#simd) for more details.

### Prefer uint8 quantized models

If you are using a quantized model, prefer uint8 quantized models. Avoid float16 models if possible, as float16 is not natively supported by CPU and it is going to be slow.
