Commit

Merge pull request #1425 from AdaptiveCpp/docs/typos-in-env-variables
Fix a few typos in env_variables.md
illuhad committed Apr 1, 2024
2 parents e05b20a + 03b8e4b commit 0359cac
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions doc/env_variables.md
@@ -3,7 +3,7 @@
* `ACPP_DEBUG_LEVEL`: if set, overrides the output verbosity. `0`: none, `1`: error, `2`: warning, `3`: info, `4`: verbose, default is the value of `HIPSYCL_DEBUG_LEVEL` [macro](macros.md).
* `ACPP_VISIBILITY_MASK`: can be used to activate only a subset of backends. Syntax: `backend;backend2;..`. Possible values are `omp` (OpenMP), `cuda`, `hip`, `ocl` (OpenCL) and `ze` (Level Zero). `omp` will always be active as a CPU backend is required. For most backends, device level visibility has to be set via vendor specific variables for now, including `{CUDA,HIP}_VISIBLE_DEVICES` and `ZE_AFFINITY_MASK`. Certain backends, particularly `ocl`, support device level visibility specifications: For example, `omp;ocl:0,4` exposes OpenCL device 0 and 4, `omp;ocl:0.0,3.0` exposes device 0 from platform 0 and device 0 from platform 3. Instead of numbers, strings can also be passed, in which case a device will match if the platform/device name contains the given string. `*` acts as wildcard. Examples: `omp;ocl:Intel.0` (first device from platforms containing "Intel" in the name), `omp;ocl:Graphics.*` (All devices from platforms containing "Graphics" in their name), `omp;ocl:CPU` (All devices containing CPU in their name)
* `ACPP_RT_DAG_REQ_OPTIMIZATION_DEPTH`: maximum depth when descending the DAG requirement tree to look for DAG optimization opportunities, such as eliding unnecessary dependencies.
-* `ACPPL_RT_MQE_LANE_STATISTICS_MAX_SIZE`: For the `multi_queue_executor`, the maximum size of entries in the lane statistics, i.e. the maximum number of submissions to retain statistical information about. This information is used to estimate execution lane utilization.
+* `ACPP_RT_MQE_LANE_STATISTICS_MAX_SIZE`: For the `multi_queue_executor`, the maximum size of entries in the lane statistics, i.e. the maximum number of submissions to retain statistical information about. This information is used to estimate execution lane utilization.
* `ACPP_RT_MQE_LANE_STATISTICS_DECAY_TIME_SEC`: The time in seconds (floating point value) after which to forget information about old submissions.
* `ACPP_RT_SCHEDULER`: Set scheduler type. Allowed values:
    * `direct` is a low-latency direct-submission scheduler.
@@ -14,7 +14,7 @@
* `system`: Makes default selector behave like a system selector from the `HIPSYCL_EXT_MULTI_DEVICE_QUEUE` extension
* `ACPP_HCF_DUMP_DIRECTORY`: If set, hipSYCL will dump all embedded HCF data files in this directory. HCF is hipSYCL's container format that is used by all compilation flows that are fully controlled by hipSYCL to store kernel code.
* `ACPP_PERSISTENT_RUNTIME`: If set to 1, hipSYCL will use a persistent runtime that will continue to live even if no SYCL objects are currently in use in the application. This can be helpful if the application consists of multiple distinct phases in which SYCL is used, and multiple launches of the runtime occur.
-* `ACPPL_RT_MAX_CACHED_NODES`: Maximum number of nodes that the runtime buffers before flushing work.
+* `ACPP_RT_MAX_CACHED_NODES`: Maximum number of nodes that the runtime buffers before flushing work.
* `ACPP_SSCP_FAILED_IR_DUMP_DIRECTORY`: If non-empty, hipSYCL will dump the IR of code that fails SSCP JIT into this directory.
* `ACPP_RT_GC_TRIGGER_BATCH_SIZE`: Number of nodes in flight that trigger a garbage collection job to be spawned
* `ACPP_RT_OCL_NO_SHARED_CONTEXT`: If set to `1`, instructs the OpenCL backend to not attempt to construct a shared context across devices within a platform. This can be necessary on OpenCL implementations that do not support this. Note that if shared contexts are unavailable, support for data transfers between devices might be limited as the devices can no longer directly talk to each other.
@@ -24,7 +24,7 @@
* `ACPP_STDPAR_OFFLOAD_SAMPLING`: If set to `1` and the application was not compiled with `--acpp-stdpar-unconditional-offload`, will cause this application to be carried out through the offloading mechanism. The stdpar runtime will measure the performance of offloaded STL algorithms, and make this information available for future application runs which can then benefit from potentially better information to decide whether offloading is viable.
* `ACPP_STDPAR_DATASET_NAME`: If set, is used as an identifier in the filename of the application profile constructed by the stdpar offloading heuristic engine. This can be used to distinguish different application profiles (e.g., if different compiler flags were used, or different hardware was targeted).
* `ACPP_STDPAR_PREFETCH_MODE`: Can be used to specify the desired prefetch mode (see `acpp --help` for details) if the compiler flag `--acpp-stdpar-prefetch-mode` was not set. If `--acpp-stdpar-prefetch-mode` was set, has no effect.
-* `ACPP_STDPAR_OHC_MIN_OPS`: stdpar offload heuristic configration (ohc): If set, offloading decisions will only be reevaluated after at least this many stdpar algorithms have been dispatched. This also configures, how many operations the offload heuristic will attempt to predict when estimating performance.
-* `ACPP_STDPAR_OHC_MIN_TIME`: stdpar offload heuristic configration (ohc): If set, offloading decisions will only be reevaluated after at least this much time in seconds has passed.
+* `ACPP_STDPAR_OHC_MIN_OPS`: stdpar offload heuristic configuration (ohc): If set, offloading decisions will only be reevaluated after at least this many stdpar algorithms have been dispatched. This also configures, how many operations the offload heuristic will attempt to predict when estimating performance.
+* `ACPP_STDPAR_OHC_MIN_TIME`: stdpar offload heuristic configuration (ohc): If set, offloading decisions will only be reevaluated after at least this much time in seconds has passed.
* `ACPP_RT_NO_JIT_CACHE_POPULATION`: If set to `1`, prevents the kernel cache from storing SSCP JIT-compiled binaries in the persistent on-disk cache. This can be useful e.g. in an MPI context, where it is sufficient that only one process among many populates the cache.
-* `ACPP_ADAPTIVITY_LEVEL`: Controls the optimization level of the adaptivity engine. This is currently only relevant for the generic SSCP target. A higher value implies JIT-compiling more specialized kernels at the expense of more frequent JIT compilations. A value of 0 disables all adaptivity (not recommended).
+* `ACPP_ADAPTIVITY_LEVEL`: Controls the optimization level of the adaptivity engine. This is currently only relevant for the generic SSCP target. A higher value implies JIT-compiling more specialized kernels at the expense of more frequent JIT compilations. A value of 0 disables all adaptivity (not recommended).
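
The `ACPP_VISIBILITY_MASK` syntax described in the documentation above (`backend;backend2;..`, with optional `:device,device` selectors) can be illustrated with a small parser. This is only a sketch of how the string format reads, not the runtime's actual parsing code; the function name `parse_visibility_mask` is hypothetical, and wildcard or name matching against real platforms is deliberately left out.

```python
from __future__ import annotations


def parse_visibility_mask(mask: str) -> dict[str, list[str]]:
    """Split a mask such as 'omp;ocl:0,4' into {backend: [device selectors]}.

    Hypothetical helper for illustration; selectors are kept as raw strings
    (e.g. '0', '0.0', 'Intel.0', 'Graphics.*') and are not interpreted.
    """
    result: dict[str, list[str]] = {}
    for entry in mask.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty segments such as a trailing ';'
        backend, _, devices = entry.partition(":")
        selectors = [d for d in devices.split(",") if d]
        result.setdefault(backend, []).extend(selectors)
    # The documentation states that omp is always active, so mirror that here.
    result.setdefault("omp", [])
    return result


if __name__ == "__main__":
    print(parse_visibility_mask("omp;ocl:0.0,3.0"))
```

An empty selector list in the result means the backend is listed without device-level restrictions.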

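The note on `ACPP_RT_NO_JIT_CACHE_POPULATION` mentions an MPI context in which only one process populates the persistent cache. A minimal sketch of preparing a per-process environment along those lines follows. The helper name `acpp_env_for_rank`, the choice of rank 0 as the populating process, and the default debug level are assumptions made for illustration; only the variable names and the `0` to `4` verbosity range come from the documentation above.

```python
def acpp_env_for_rank(rank, base_env, debug_level=2):
    """Return a copy of base_env with AdaptiveCpp variables set for one rank.

    Hypothetical helper: rank 0 is assumed to be the only process allowed
    to populate the on-disk JIT cache.
    """
    if not 0 <= debug_level <= 4:
        raise ValueError("ACPP_DEBUG_LEVEL is documented for values 0 to 4")
    env = dict(base_env)  # never mutate the caller's mapping
    env["ACPP_DEBUG_LEVEL"] = str(debug_level)
    if rank == 0:
        # Leave the variable unset so this process may store JIT binaries.
        env.pop("ACPP_RT_NO_JIT_CACHE_POPULATION", None)
    else:
        env["ACPP_RT_NO_JIT_CACHE_POPULATION"] = "1"
    return env


if __name__ == "__main__":
    import os

    print(acpp_env_for_rank(1, os.environ)["ACPP_RT_NO_JIT_CACHE_POPULATION"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching the application. In a real MPI job the rank would usually come from the launcher, which is outside the scope of this sketch.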