A lightweight OpenMP implementation for Emscripten, enabling basic parallel programming capabilities in WebAssembly applications.
SimpleOMP provides a minimal OpenMP runtime for Emscripten-compiled projects. This implementation is based on the solution discussed in emscripten-core/emscripten#13892.

| Directive/Clause | Status | Description |
|---|---|---|
| `#pragma omp parallel for` | ✅ | Parallel for loop |
| `num_threads(N)` | ✅ | Specify thread count |
| `if(condition)` | ✅ | Conditional parallelization |
| `schedule(static[, chunk])` | ✅ | Static loop scheduling (block or round-robin) |
| `schedule(dynamic[, chunk])` | ✅ | Dynamic work distribution |
| `schedule(guided[, chunk])` | ✅ | Guided scheduling with decreasing chunk sizes |
| `schedule(runtime)` | ✅ | Runtime-determined scheduling (via `OMP_SCHEDULE`) |

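
As a rough illustration of the clauses listed above, the following sketch combines `num_threads`, `if`, and two `schedule` kinds. The function names `scale_array` and `mark_primes`, the thread count of 4, the 1024-element threshold, and the chunk size of 64 are all illustrative assumptions, not values taken from SimpleOMP.

```c
#include <stddef.h>

/* Hypothetical helper: multiplies every element of `data` by `factor`.
   The loop is parallelized only when n exceeds an arbitrary threshold,
   so small arrays avoid the threading overhead. */
void scale_array(float *data, size_t n, float factor)
{
    #pragma omp parallel for num_threads(4) schedule(static) if(n > 1024)
    for (long i = 0; i < (long)n; ++i) {
        data[i] *= factor;
    }
}

/* Hypothetical helper: sets is_prime[n] to 1 if n is prime, else 0.
   Trial division costs more for larger n, so iterations are uneven;
   dynamic scheduling hands out chunks of 64 iterations on demand. */
void mark_primes(unsigned char *is_prime, int limit)
{
    #pragma omp parallel for schedule(dynamic, 64)
    for (int n = 0; n < limit; ++n) {
        int prime = (n >= 2);
        for (int d = 2; prime && (long)d * d <= n; ++d) {
            if (n % d == 0) {
                prime = 0;
            }
        }
        is_prime[n] = (unsigned char)prime;
    }
}
```

Each iteration writes only to its own array element, so neither loop needs additional synchronization.
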

| Directive/Clause | Status | Description |
|---|---|---|
| `#pragma omp barrier` | ✅ | Thread barrier synchronization |
| `#pragma omp critical [(name)]` | ✅ | Critical section (mutual exclusion) |
| `#pragma omp master` | ✅ | Master thread-only execution |
| `#pragma omp single` | ✅ | Single thread execution |
| `#pragma omp atomic` | Partial | Atomic operations (supported: add/sub/mul/div/and/or/xor/min/max/read/write; missing: capture) |
| `#pragma omp cancel` | ✅ | Request cancellation of parallel regions or loops (controlled by `OMP_CANCELLATION`) |
| `#pragma omp cancellation point` | ✅ | Check for cancellation requests |
| `#pragma omp ordered` | ❌ | Ordered execution within parallel loops |

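
To make the difference between `atomic` and `critical` concrete, here is a minimal sketch. The function names `count_even` and `find_max` and the critical-section name `update_best` are hypothetical; the atomic update uses a plain addition, which is among the operations listed as supported.

```c
/* Hypothetical helper: counts the even values in v[0..n-1].
   A single scalar update fits `atomic`, which is cheaper than a lock. */
int count_even(const int *v, int n)
{
    int count = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        if (v[i] % 2 == 0) {
            #pragma omp atomic
            count += 1;
        }
    }
    return count;
}

/* Hypothetical helper: returns the largest value in v[0..n-1]; assumes n >= 1.
   The compare-then-assign sequence spans more than one operation, so it is
   wrapped in a named critical section instead of an atomic. */
int find_max(const int *v, int n)
{
    int best = v[0];
    #pragma omp parallel for
    for (int i = 1; i < n; ++i) {
        #pragma omp critical(update_best)
        {
            if (v[i] > best) {
                best = v[i];
            }
        }
    }
    return best;
}
```

In real code a `reduction` clause would usually be the better fit for both loops; the versions above only demonstrate the synchronization constructs.
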

| Directive/Clause | Status | Description |
|---|---|---|
| `#pragma omp sections` | ❌ | Separate code sections |
| `#pragma omp task` | ❌ | Task-based parallelism |

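
Because `sections` is not available, a small fixed set of independent work items can sometimes be expressed with the supported `parallel for` instead. The sketch below is one possible pattern rather than anything the project prescribes; `run_independent_steps` and its three placeholder steps are hypothetical.

```c
/* Hypothetical helper: runs three independent steps, each writing to its
   own slot of `out` (which must hold at least 3 ints). The loop index
   selects the step, so each iteration plays the role of one section. */
void run_independent_steps(int *out)
{
    /* A chunk size of 1 lets each step go to a different thread
       when enough threads are available. */
    #pragma omp parallel for schedule(static, 1)
    for (int step = 0; step < 3; ++step) {
        switch (step) {
        case 0:
            out[0] = 10;  /* placeholder for the first piece of work */
            break;
        case 1:
            out[1] = 20;  /* placeholder for the second piece of work */
            break;
        default:
            out[2] = 30;  /* placeholder for the third piece of work */
            break;
        }
    }
}
```

This only works when the steps do not depend on each other's results.
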

| Directive/Clause | Status | Description |
|---|---|---|
| `nowait` | ✅ | Skip implicit barrier at end of worksharing constructs (compiler-handled, no runtime support needed) |
| `copyprivate(var)` | ❌ | Broadcast private variable from single to all threads |

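
A minimal sketch of `nowait`, assuming two loops that touch unrelated arrays; the function name `init_two` is hypothetical. Removing the barrier is safe here only because the second loop never reads what the first loop writes.

```c
/* Hypothetical helper: fills a[i] = i and b[i] = 2 * i for i in [0, n). */
void init_two(int *a, int *b, int n)
{
    #pragma omp parallel
    {
        /* Threads that finish their share of this loop move straight on
           to the next loop instead of waiting at the implicit barrier. */
        #pragma omp for nowait
        for (int i = 0; i < n; ++i) {
            a[i] = i;
        }

        /* This loop keeps its implicit barrier, so both arrays are
           complete once the parallel region ends. */
        #pragma omp for
        for (int i = 0; i < n; ++i) {
            b[i] = 2 * i;
        }
    }
}
```
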

| Directive/Clause | Status | Description |
|---|---|---|
| `reduction(op:var)` | ✅ | Reduction operations (supports `+`, `*`, `-`, `&`, `\|`, `^`, `&&`, `\|\|`, `min`, `max`) |
| `private(var)` | ✅ | Thread-private variables (compiler-handled, no runtime support needed) |
| `shared(var)` | ✅ | Shared variables (compiler-handled, no runtime support needed) |
| `firstprivate(var)` | ✅ | Initialize private from shared (compiler-handled, no runtime support needed) |
| `lastprivate(var)` | ✅ | Update shared from last iteration (compiler-handled, no runtime support needed) |
| `default(shared\|none)` | ✅ | Default data-sharing attribute (compiler-handled, no runtime support needed) |
| `threadprivate` | ❌ | Thread-private global variables (requires deep compiler integration) |
| `copyin(var)` | ❌ | Initialize threadprivate variables (depends on `threadprivate`) |

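
The `reduction` clause can be sketched as follows, using the `+` and `min` operators from the list above. The function names `dot` and `min_value` are hypothetical.

```c
/* Hypothetical helper: dot product of a and b, both of length n.
   Each thread accumulates into a private copy of `sum`; the copies are
   added together when the loop finishes. */
double dot(const double *a, const double *b, int n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Hypothetical helper: smallest value in v[0..n-1]; assumes n >= 1.
   The initial value of `m` takes part in the final combination. */
int min_value(const int *v, int n)
{
    int m = v[0];
    #pragma omp parallel for reduction(min:m)
    for (int i = 1; i < n; ++i) {
        if (v[i] < m) {
            m = v[i];
        }
    }
    return m;
}
```

Floating-point reductions may combine partial sums in a different order from a serial loop, so results can differ in the last bits for general inputs.
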

| Directive/Clause | Status | Description |
|---|---|---|
| `#pragma omp flush` | 🚫 | Memory fence (WebAssembly atomics already provide memory ordering) |
| `#pragma omp target` | 🚫 | Offload to accelerator devices (not applicable to Wasm environment) |

Nested parallel regions are NOT supported. SimpleOMP uses a single global thread pool architecture and does not support hierarchical team structures. Attempting to use nested #pragma omp parallel will result in undefined behavior (potential deadlock or serialization of inner regions).

```c
// ❌ NOT SUPPORTED
#pragma omp parallel num_threads(4)
{
    // Outer parallel region: 4 threads
    #pragma omp parallel num_threads(2)
    {
        // Inner parallel region: This will cause issues!
    }
}
```

Workaround: Restructure your algorithm to avoid nested parallelism, or flatten the parallel structure into a single level.
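
One way to flatten a two-level loop nest is to iterate over a single combined index and recover the row and column from it. This is a sketch under stated assumptions: the function name `fill_matrix`, the row-major layout, and the cell formula are all illustrative.

```c
/* Hypothetical helper: fills a rows x cols matrix stored in row-major
   order so that cell (r, c) holds r * 10 + c. A single parallel loop over
   all cells replaces an outer parallel loop over rows plus an inner
   parallel loop over columns. */
void fill_matrix(int *m, int rows, int cols)
{
    long total = (long)rows * cols;

    #pragma omp parallel for
    for (long idx = 0; idx < total; ++idx) {
        int r = (int)(idx / cols);  /* row recovered from the flat index */
        int c = (int)(idx % cols);  /* column recovered from the flat index */
        m[idx] = r * 10 + c;
    }
}
```

Flattening also tends to balance work better when the outer loop has fewer iterations than there are threads.
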
The following OpenMP Runtime API functions related to nested parallelism are provided as stubs that always return fixed values:
- `omp_set_nested()` / `omp_get_nested()` - Always reports nested parallelism as disabled
- `omp_set_max_active_levels()` / `omp_get_max_active_levels()` - Always reports 1 active level
- `omp_get_level()` - Always returns 0 (not inside a parallel region) or 1 (inside a parallel region)
- `omp_get_active_level()` - Same as `omp_get_level()`
- `omp_get_ancestor_thread_num(level)` - Only works for level 0 or 1
- `omp_get_team_size(level)` - Only works for level 0 or 1

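
Since `omp_get_level()` reports whether the caller is already inside a parallel region, library code can use it to avoid opening a nested region by accident. The sketch below is one possible guard, not part of SimpleOMP; `inside_parallel_region` and `add_one` are hypothetical names, and the `_OPENMP` check lets the file compile without OpenMP as well.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns 1 when called from inside a parallel
   region, 0 otherwise (and always 0 when built without OpenMP). */
static int inside_parallel_region(void)
{
#ifdef _OPENMP
    return omp_get_level() > 0;
#else
    return 0;
#endif
}

/* Hypothetical helper: adds 1 to every element of v[0..n-1].
   If the caller is already running in parallel, take a plain serial
   path so that no nested parallel region is ever started. */
void add_one(int *v, int n)
{
    if (inside_parallel_region()) {
        for (int i = 0; i < n; ++i) {
            v[i] += 1;
        }
        return;
    }

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        v[i] += 1;
    }
}
```
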
- Emscripten toolchain
- Make

1. Download `libsimpleomp.a` from the releases page
2. (Optional) Download `omp.h` if you need to use OpenMP Runtime API functions (e.g., `omp_get_thread_num()`, `omp_set_num_threads()`, locks, timing functions)
3. Link the library when building your project
4. Add the following compilation flags: `-fopenmp -pthread`

```sh
# Basic usage (pragma directives only)
emcc your_code.c -fopenmp -pthread \
  -sPTHREAD_POOL_SIZE=navigator.hardwareConcurrency \
  libsimpleomp.a -o output.js

# With OpenMP Runtime API (using omp.h)
emcc your_code.c -I/path/to/include -fopenmp -pthread \
  -sPTHREAD_POOL_SIZE=navigator.hardwareConcurrency \
  libsimpleomp.a -o output.js
```

Always pass the `-sPTHREAD_POOL_SIZE=navigator.hardwareConcurrency` flag when linking. SimpleOMP creates a worker thread pool sized to match the number of logical CPU cores. Without this flag, Emscripten's default pthread pool size will be insufficient, causing runtime errors.

For a complete working example, see the example directory.
To build the library from source:

```sh
# Build the library
make
# Output will be generated at: dist/libsimpleomp.a
```

The example directory contains sample projects demonstrating SimpleOMP usage:

- for.cpp - Basic parallel for loop with performance comparison
- if.cpp - Conditional parallelization using the `if` clause
- schedule.cpp - Loop scheduling strategies (static/dynamic/guided) with validation
- master.cpp - Master thread construct demonstration
- critical.cpp - Critical section for mutual exclusion
- barrier.cpp - Barrier synchronization example
- single.cpp - Single thread execution construct
- atomic.cpp - Atomic operations (add/sub/mul/div/and/or/xor/min/max/read/write)
- nowait.cpp - Nowait clause demonstration (skipping implicit barriers)
- locks.cpp - OpenMP lock API demonstration (simple and nestable locks)
- data_sharing.cpp - Data-sharing clauses (private/shared/firstprivate/lastprivate)
- cancel.cpp - Cancellation constructs for early termination of parallel regions
- reduction.cpp - Reduction operations (sum, product, min/max, logical, bitwise)

```sh
# Build all examples
make with-examples
# Start a local server to test (requires PNPM)
make serve
# Open the URL that appears to see all examples
```

This project is licensed under the MIT License - Copyright (c) 2025 Mu-Tsun Tsai.

The following source files are derived from Tencent NCNN and are licensed under the BSD 3-Clause License:
See the respective files for their copyright notices and license terms.
Contributions are welcome! Please feel free to submit issues or pull requests.
This project is based on the implementation discussed in the Emscripten issue tracker. Special thanks to:
- The contributors of emscripten-core/emscripten#13892
- The Tencent NCNN project for the threading implementation code