Description
Describe the issue
I'm working on a C++ project that uses the ONNX Runtime CXX API. I've successfully integrated it with the Wasm CPU backend, and I'm now trying to add support for WebGPU. I've built libonnxruntime_webassembly.a (using the build instructions provided below). However, I'm encountering the following error when trying to use the static library in a sample application:
error: handleI64Signatures: signature too long for emwgpuWaitAny
Error: Aborting compilation due to previous errors
emcc: error: '/usr/workspace/onnxruntime/cmake/external/emsdk/node/20.18.0_64bit/bin/node /usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/tools/compiler.mjs -' failed (returned 1)
Build commands:
ONNXRuntime WebGPU
./build.sh --config Debug --build_wasm_static_lib --use_webgpu --parallel --skip_tests --skip_onnx_tests --allow_running_as_root
Sample application
I've included the js-library files and the Dawn include directory that were required for this build. webgpu.cpp from Dawn was not added here because it resulted in duplicate symbol errors. I've also maintained the correct order for the js-library files, i.e. library_webgpu.js comes last.
emcc mnist_inference_wasm.cpp \
-o $output_dir/mnist_inference_wasm.js \
-s ALLOW_MEMORY_GROWTH=1 \
-O3 \
-s "EXPORTED_FUNCTIONS=['_loadModel', '_runInference', '_malloc', '_free']" \
-s "EXPORTED_RUNTIME_METHODS=['ccall','cwrap', 'stringToUTF8', 'HEAPF32']" \
-s MAXIMUM_MEMORY=4GB \
-s STACK_SIZE=5MB \
--preload-file model/onnx_model.onnx \
-I onnxruntime/include/onnxruntime/core/session \
-L $onnxruntime_dir \
-lonnxruntime_webassembly \
-std=c++17 \
-I $onnxruntime_dir/_deps/dawn-build/gen/src/emdawnwebgpu/include/ \
--js-library $onnxruntime_dir/_deps/dawn-build/gen/src/emdawnwebgpu/library_webgpu_enum_tables.js \
--js-library $onnxruntime_dir/_deps/dawn-build/gen/src/emdawnwebgpu/library_webgpu_generated_struct_info.js \
--js-library $onnxruntime_dir/_deps/dawn-build/gen/src/emdawnwebgpu/library_webgpu_generated_sig_info.js \
--js-library $onnxruntime_dir/_deps/dawn-src/third_party/emdawnwebgpu/pkg/webgpu/src/library_webgpu.js
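The ordering requirement for the JS libraries can be made explicit by collecting them in an array and generating the flags from it. This is only a sketch of how the command above could be organized: the variable names dawn_gen, dawn_src, js_libs and js_lib_flags are hypothetical, the paths are copied from the command above, and the default for onnxruntime_dir is taken from the -L path shown in the debug log.

```shell
#!/usr/bin/env bash
# Sketch: assemble the --js-library flags from an ordered array.
# dawn_gen / dawn_src are hypothetical names; the paths come from the emcc command above.
onnxruntime_dir="${onnxruntime_dir:-onnxruntime/build/Linux/Debug}"
dawn_gen="$onnxruntime_dir/_deps/dawn-build/gen/src/emdawnwebgpu"
dawn_src="$onnxruntime_dir/_deps/dawn-src/third_party/emdawnwebgpu/pkg/webgpu/src"

# Order matters: library_webgpu.js is kept as the last entry.
js_libs=(
  "$dawn_gen/library_webgpu_enum_tables.js"
  "$dawn_gen/library_webgpu_generated_struct_info.js"
  "$dawn_gen/library_webgpu_generated_sig_info.js"
  "$dawn_src/library_webgpu.js"
)

# Expand each entry into a "--js-library <path>" pair.
js_lib_flags=()
for lib in "${js_libs[@]}"; do
  js_lib_flags+=(--js-library "$lib")
done

# Print the flags one per line for inspection.
printf '%s\n' "${js_lib_flags[@]}"
```

The array could then be passed to emcc as "${js_lib_flags[@]}" in place of the four hand-written --js-library lines, which also avoids relying on line-continuation backslashes.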
The following is the output with EMCC_DEBUG=1 for the sample application build.
emcc:DEBUG: compiling source file: mnist_inference_wasm.cpp
shared:DEBUG: successfully executed /usr/workspace/onnxruntime/cmake/external/emsdk/upstream/bin/clang -target wasm32-unknown-emscripten -fignore-exceptions -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --sysroot=/usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/cache/sysroot -DEMSCRIPTEN -Xclang -iwithsysroot/include/fakesdl -Xclang -iwithsysroot/include/compat -O3 -Ionnxruntime/include/onnxruntime/core/session -std=c++17 -Ionnxruntime/build/Linux/Debug/_deps/dawn-build/gen/src/emdawnwebgpu/include/ -c mnist_inference_wasm.cpp -o /tmp/emscripten_temp/mnist_inference_wasm_0.o
profiler:DEBUG: block "compile inputs" took 1.007 seconds
profiler:DEBUG: block "linker_setup" took 0.000 seconds
link:DEBUG: looking for library "onnxruntime_webassembly"
link:DEBUG: found library "libonnxruntime_webassembly.a" at onnxruntime/build/Linux/Debug/libonnxruntime_webassembly.a
profiler:DEBUG: block "calculate linker inputs" took 0.000 seconds
link:DEBUG: setting up files
config:DEBUG: using config file: /usr/workspace/onnxruntime/cmake/external/emsdk/.emscripten
onnx_model.onnx /usr/workspace/onnx_model.onnx /usr/workspace
Packaging file "onnx_model.onnx" to VFS in path "/onnx_model.onnx".
shared:DEBUG: successfully executed /usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/tools/file_packager mnist/build/Release/mnist_inference_wasm.data --from-emcc --preload onnx_model.onnx
profiler:DEBUG: block "package_files" took 0.050 seconds
system_libs:DEBUG: including libGL (libGL-getprocaddr.a)
system_libs:DEBUG: including libal (libal.a)
system_libs:DEBUG: including libhtml5 (libhtml5.a)
system_libs:DEBUG: including libstubs (libstubs.a)
system_libs:DEBUG: including libnoexit (libnoexit.a)
system_libs:DEBUG: including libc (libc.a)
system_libs:DEBUG: including libmalloc (libdlmalloc.a)
system_libs:DEBUG: including libcompiler_rt (libcompiler_rt.a)
system_libs:DEBUG: including libc++ (libc++-noexcept.a)
system_libs:DEBUG: including libc++abi (libc++abi-noexcept.a)
system_libs:DEBUG: including libsockets (libsockets.a)
profiler:DEBUG: block "calculate system libraries" took 0.002 seconds
building:DEBUG: saving intermediate file /tmp/emscripten_temp/emcc-00-settings.json
shared:DEBUG: successfully executed /usr/workspace/onnxruntime/cmake/external/emsdk/node/20.18.0_64bit/bin/node /usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/tools/compiler.mjs - --symbols-only
profiler:DEBUG: block "compile_javascript" took 0.168 seconds
profiler:DEBUG: block "JS symbol generation" took 0.170 seconds
link:DEBUG: linking: ['/tmp/emscripten_temp/mnist_inference_wasm_0.o', '-Lonnxruntime/build/Linux/Debug', '-lonnxruntime_webassembly', '-L/usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten', '-L/usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/src/lib', '-lGL-getprocaddr', '-lal', '-lhtml5', '-lstubs', '-lnoexit', '-lc', '-ldlmalloc', '-lcompiler_rt', '-lc++-noexcept', '-lc++abi-noexcept', '-lsockets']
shared:DEBUG: successfully executed /usr/workspace/onnxruntime/cmake/external/emsdk/upstream/bin/wasm-ld -o mnist/build/Release/mnist_inference_wasm.wasm /tmp/emscripten_temp/mnist_inference_wasm_0.o -Lonnxruntime/build/Linux/Debug -lonnxruntime_webassembly -L/usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten -L/usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/src/lib -lGL-getprocaddr -lal -lhtml5 -lstubs -lnoexit -lc -ldlmalloc -lcompiler_rt -lc++-noexcept -lc++abi-noexcept -lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr /tmp/emscripten_temp/tmp488xnt40libemscripten_js_symbols.so --strip-debug --export=loadModel --export=runInference --export=malloc --export=free --export=_emscripten_stack_alloc --export=__wasm_call_ctors --export=emscripten_stack_get_current --export=_emscripten_stack_restore --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_lib_deps --export-if-defined=__stop_em_lib_deps --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-table -z stack-size=5242880 --max-memory=4294967296 --initial-memory=16777216 --no-entry --table-base=1 --global-base=1024
profiler:DEBUG: block "link" took 69.968 seconds
link:DEBUG: emscript
building:DEBUG: saving intermediate file /tmp/emscripten_temp/emcc-01-base.wasm
building:DEBUG: saving intermediate file /tmp/emscripten_temp/emcc-02-strip.wasm
shared:DEBUG: successfully executed /usr/workspace/onnxruntime/cmake/external/emsdk/upstream/bin/llvm-objcopy mnist/build/Release/mnist_inference_wasm.wasm mnist/build/Release/mnist_inference_wasm.wasm '--remove-section=.debug*' --remove-section=producers --remove-section=name
extract_metadata:DEBUG: no start/stop symbols found for section: em_lib_deps
emscripten:DEBUG: Metadata: {'all_exports': ['memory',
'__wasm_call_ctors',
'loadModel',
'__indirect_function_table',
'runInference',
'HaveOffsetConverter',
'emwgpuCreateBindGroup',
'emwgpuCreateBindGroupLayout',
'emwgpuCreateCommandBuffer',
'emwgpuCreateCommandEncoder',
'emwgpuCreateComputePassEncoder',
'emwgpuCreateComputePipeline',
'emwgpuCreatePipelineLayout',
'emwgpuCreateQuerySet',
'emwgpuCreateRenderBundle',
'emwgpuCreateRenderBundleEncoder',
'emwgpuCreateRenderPassEncoder',
'emwgpuCreateRenderPipeline',
'emwgpuCreateSampler',
'emwgpuCreateSurface',
'emwgpuCreateTexture',
'emwgpuCreateTextureView',
'emwgpuCreateAdapter',
'emwgpuCreateBuffer',
'emwgpuCreateDevice',
'emwgpuCreateQueue',
'emwgpuCreateShaderModule',
'emwgpuOnCompilationInfoCompleted',
'emwgpuOnCreateComputePipelineCompleted',
'emwgpuOnCreateRenderPipelineCompleted',
'emwgpuOnDeviceLostCompleted',
'emwgpuOnMapAsyncCompleted',
'emwgpuOnPopErrorScopeCompleted',
'emwgpuOnRequestAdapterCompleted',
'emwgpuOnRequestDeviceCompleted',
'emwgpuOnWorkDoneCompleted',
'emwgpuOnUncapturedError',
'free',
'malloc',
'emscripten_builtin_memalign',
'memalign',
'_emscripten_stack_restore',
'_emscripten_stack_alloc',
'emscripten_stack_get_current',
'__start_em_asm',
'__stop_em_asm',
'__start_em_js',
'__stop_em_js'],
'em_asm_consts': {683748: "({ if (typeof Module == 'undefined' || "
'!Module.MountedFiles) { return 1; } let fileName = '
'UTF8ToString(Number($0 >>> 0)); if '
"(fileName.startsWith('./')) { fileName = "
'fileName.substring(2); } const fileData = '
'Module.MountedFiles.get(fileName); if (!fileData) '
'{ return 2; } const offset = Number($1 >>> 0); '
'const length = Number($2 >>> 0); const '
'dataIdOrBuffer = Number($3 >>> 0); const loadType '
'= $4; if (offset + length > fileData.byteLength) { '
'return 3; } try { const data = '
'fileData.subarray(offset, offset + length); switch '
'(loadType) { case 0: HEAPU8.set(data, '
'dataIdOrBuffer); break; case 1: if '
'(Module.webgpuUploadExternalBuffer) { '
'Module.webgpuUploadExternalBuffer(dataIdOrBuffer, '
'data); } else { '
'Module.jsepUploadExternalBuffer(dataIdOrBuffer, '
'data); } break; default: return 4; } return 0; } '
'catch { return 4; } })',
684572: '{ return (typeof wasmOffsetConverter !== '
"'undefined'); }"},
'em_js_func_types': {'HaveOffsetConverter': FuncType(params=[], returns=[<Type.I32: 127>])},
'em_js_funcs': {'HaveOffsetConverter': '()<::>{ return typeof '
"wasmOffsetConverter !== 'undefined'; "
'}'},
'features': ['--enable-bulk-memory',
'--enable-bulk-memory-opt',
'--enable-call-indirect-overlong',
'--enable-multivalue',
'--enable-mutable-globals',
'--enable-nontrapping-float-to-int',
'--enable-reference-types',
'--enable-sign-ext',
'--enable-simd'],
'function_exports': {'__wasm_call_ctors': FuncType(params=[], returns=[]),
'_emscripten_stack_alloc': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'_emscripten_stack_restore': FuncType(params=[<Type.I32: 127>], returns=[]),
'emscripten_builtin_memalign': FuncType(params=[<Type.I32: 127>, <Type.I32: 127>], returns=[<Type.I32: 127>]),
'emscripten_stack_get_current': FuncType(params=[], returns=[<Type.I32: 127>]),
'emwgpuCreateAdapter': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateBindGroup': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateBindGroupLayout': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateBuffer': FuncType(params=[<Type.I32: 127>, <Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateCommandBuffer': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateCommandEncoder': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateComputePassEncoder': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateComputePipeline': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateDevice': FuncType(params=[<Type.I32: 127>, <Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreatePipelineLayout': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateQuerySet': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateQueue': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateRenderBundle': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateRenderBundleEncoder': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateRenderPassEncoder': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateRenderPipeline': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateSampler': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateShaderModule': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateSurface': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateTexture': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuCreateTextureView': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'emwgpuOnCompilationInfoCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnCreateComputePipelineCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnCreateRenderPipelineCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnDeviceLostCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnMapAsyncCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnPopErrorScopeCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnRequestAdapterCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnRequestDeviceCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnUncapturedError': FuncType(params=[<Type.I32: 127>, <Type.I32: 127>, <Type.I32: 127>], returns=[]),
'emwgpuOnWorkDoneCompleted': FuncType(params=[<Type.F64: 124>, <Type.I32: 127>], returns=[]),
'free': FuncType(params=[<Type.I32: 127>], returns=[]),
'loadModel': FuncType(params=[<Type.I32: 127>], returns=[]),
'malloc': FuncType(params=[<Type.I32: 127>], returns=[<Type.I32: 127>]),
'memalign': FuncType(params=[<Type.I32: 127>, <Type.I32: 127>], returns=[<Type.I32: 127>]),
'runInference': FuncType(params=[<Type.I32: 127>, <Type.I32: 127>], returns=[<Type.I32: 127>])},
'global_exports': {'__em_js__HaveOffsetConverter': '684629',
'__start_em_asm': '683748',
'__start_em_js': '684629',
'__stop_em_asm': '684629',
'__stop_em_js': '684690'},
'imports': ['__cxa_throw',
'emscripten_asm_const_int',
'emscripten_errn',
'emscripten_stack_snapshot',
'emscripten_stack_unwind_buffer',
'wgpuDeviceHasFeature',
'emwgpuBufferDestroy',
'emwgpuBufferGetConstMappedRange',
'emwgpuBufferGetMappedRange',
'emwgpuBufferMapAsync',
'emwgpuBufferUnmap',
'emwgpuDeviceDestroy',
'emwgpuDelete',
'emscripten_has_asyncify',
'emwgpuAdapterRequestDevice',
'emwgpuDeviceCreateBuffer',
'emwgpuDeviceCreateShaderModule',
'emwgpuDevicePopErrorScope',
'emwgpuInstanceRequestAdapter',
'emwgpuWaitAny',
'wgpuDeviceCreateComputePipeline',
'wgpuBufferGetSize',
'wgpuBufferGetUsage',
'wgpuDeviceCreateCommandEncoder',
'wgpuComputePassEncoderEnd',
'wgpuCommandEncoderCopyBufferToBuffer',
'wgpuDeviceCreateBindGroup',
'wgpuComputePassEncoderSetBindGroup',
'wgpuQueueWriteBuffer',
'wgpuCommandEncoderBeginComputePass',
'wgpuAdapterHasFeature',
'wgpuAdapterGetLimits',
'wgpuComputePassEncoderWriteTimestamp',
'wgpuDeviceCreateQuerySet',
'wgpuDevicePushErrorScope',
'wgpuCommandEncoderResolveQuerySet',
'wgpuCommandEncoderFinish',
'wgpuQueueSubmit',
'wgpuComputePipelineGetBindGroupLayout',
'wgpuComputePassEncoderSetPipeline',
'wgpuComputePassEncoderDispatchWorkgroups',
'wgpuDeviceGetLimits',
'wgpuDeviceGetFeatures',
'wgpuDeviceGetAdapterInfo',
'HaveOffsetConverter',
'emscripten_pc_get_function',
'_tzset_js',
'proc_exit',
'_abort_js',
'clock_time_get',
'fd_close',
'emscripten_date_now',
'__syscall_fcntl64',
'__syscall_openat',
'__syscall_ioctl',
'fd_write',
'fd_read',
'__syscall_fstat64',
'__syscall_stat64',
'__syscall_newfstatat',
'__syscall_lstat64',
'environ_sizes_get',
'environ_get',
'emscripten_get_now',
'fd_seek',
'__syscall_mkdirat',
'_mktime_js',
'_localtime_js',
'_gmtime_js',
'_munmap_js',
'_mmap_js',
'__syscall_getdents64',
'__syscall_getcwd',
'__syscall_readlinkat',
'__syscall_unlinkat',
'__syscall_rmdir',
'emscripten_get_heap_max',
'emscripten_resize_heap'],
'invoke_funcs': [],
'js_deps': [],
'main_reads_params': False}
profiler:DEBUG: block "get_metadata" took 0.031 seconds
building:DEBUG: saving intermediate file /tmp/emscripten_temp/emcc-03-settings.json
error: handleI64Signatures: signature too long for emwgpuWaitAny
Error: Aborting compilation due to previous errors
emcc: error: '/usr/workspace/onnxruntime/cmake/external/emsdk/node/20.18.0_64bit/bin/node /usr/workspace/onnxruntime/cmake/external/emsdk/upstream/emscripten/tools/compiler.mjs -' failed (returned 1)
profiler:DEBUG: block "compile_javascript" raised an exception after 0.159 seconds
profiler:DEBUG: block "emscript" raised an exception after 0.463 seconds
profiler:DEBUG: block "post link" raised an exception after 0.463 seconds
profiler:DEBUG: block "main" raised an exception after 71.715 seconds
not cleaning up temp files since in debug-save mode, see them in /tmp/emscripten_temp
tools.filelock:DEBUG: Attempting to release lock 124609610204640 on /tmp/emscripten_temp/emscripten.lock
tools.filelock:DEBUG: Lock 124609610204640 released on /tmp/emscripten_temp/emscripten.lock
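Since the error names emwgpuWaitAny, one way to narrow it down might be to look at how that symbol is declared in each JS library file passed to emcc, for example in the generated signature info and in library_webgpu.js. The following is a minimal sketch: find_symbol is a hypothetical helper, and whether a disagreement between those files is the actual cause is an assumption, not something confirmed here.

```shell
#!/usr/bin/env bash
# find_symbol (hypothetical helper): print every line mentioning a symbol
# in the given files, prefixed with the file name and line number.
find_symbol() {
  local symbol="$1"
  shift
  # -H: always print the file name, -n: print line numbers,
  # -F: treat the symbol as a literal string rather than a regex.
  grep -HnF -- "$symbol" "$@"
}

# Example invocation (paths taken from the emcc command above):
#   find_symbol emwgpuWaitAny \
#     "$onnxruntime_dir/_deps/dawn-build/gen/src/emdawnwebgpu/library_webgpu_generated_sig_info.js" \
#     "$onnxruntime_dir/_deps/dawn-src/third_party/emdawnwebgpu/pkg/webgpu/src/library_webgpu.js"
```

Comparing the matches side by side shows whether the signature entry and the function definition for the symbol come from the same Dawn revision.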
I would greatly appreciate any advice on resolving this. I hope it is fine to tag @fs-eire, the author of the WebGPU support; I apologize if that is inappropriate.
Other things I've tried:
- Different tags and commits:
- Setting the emsdk version to each of the following:
  - latest
  - tot
  - The emsdk version with which the corresponding Dawn version was compiled for each tag/commit
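For reference, the emsdk variations above can be written as one build command per version. This sketch only prints the commands rather than running them. It assumes the version is selected through the --emsdk_version option of build.sh, which is an assumption about how the version was set; the array names are hypothetical, and the remaining flags are copied from the build command given earlier.

```shell
#!/usr/bin/env bash
# Sketch (dry run): generate one ONNX Runtime build command per emsdk version.
# Assumption: build.sh accepts --emsdk_version to pick the emsdk release.
emsdk_versions=(latest tot)

build_cmds=()
for version in "${emsdk_versions[@]}"; do
  build_cmds+=("./build.sh --config Debug --build_wasm_static_lib --use_webgpu --parallel --skip_tests --skip_onnx_tests --allow_running_as_root --emsdk_version $version")
done

# Print the commands; each line can be copied into a shell to run it.
printf '%s\n' "${build_cmds[@]}"
```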
Other information:
Similar to what @sevagh faced in #23072, I also saw the following error log at the end of the ONNXRuntime WebGPU build.
wasm-ld: error: lto.tmp: undefined symbol: emwgpuDelete
I'm happy to provide further information if required.
To reproduce
- Build ONNXRuntime Wasm static library with WebGPU support
- Build a sample application using this static library
The commands for these steps are attached above.
Urgency
This blocks any/all WebGPU builds.
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
Tags: v1.22.1, v1.22.0, Commits: deee480, d293285
Execution Provider
'webgpu' (WebGPU)