- Levels
  - noop: disable optimizations
  - advanced: all optimizations
  - advanced-fsg: alternative optimization pipeline
- Options (type, default)
  - Parallelism:
    - openmp (boolean, False): enable/disable OpenMP parallelism
    - par-collapse-ncores (int, 4): control loop collapsing
    - par-collapse-work (int, 100): control loop collapsing
    - par-chunk-nonaffine (int, 3): control chunk size in nonaffine loops
    - par-dynamic-work (int, 10): switch between dynamic and static scheduling
    - par-nested (int, 2): control nested parallelism
  - Blocking:
    - blockinner (boolean, False): enable/disable loop blocking along innermost loop
    - blocklevels (int, 1): 1 => classic loop blocking; 2 for two-level hierarchical blocking; etc.
  - CIRE:
    - min-storage (boolean, False): smaller working set size, less loop fusion
    - cire-rotate (boolean, False): smaller working set size, fewer parallel dimensions
    - cire-maxpar (boolean, False): bigger working set size, more parallelism
    - cire-ftemps (boolean, False): give user control over the allocated temporaries
    - cire-mingain (int, 10): minimum gain to optimize away a redundant expression
    - cire-schedule ((str, int), 'automatic'): scheduling strategy for derivatives
  - Device-specific:
    - gpu-fit (boolean, False): list of saved TimeFunctions that fit in the device memory
    - par-disabled (boolean, True): enable/disable parallelism on the host
  - Misc:
    - linearize (boolean, False): linearize array accesses
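
As a rough illustration of how a level and a set of options fit together, the sketch below assembles a `(level, options)` pair and rejects names that are not in the list above. The helper `make_opt` and the `LEVELS` and `DEFAULTS` tables are hypothetical and exist only for this example; the assumption, not stated in this section, is that such a pair is what gets passed as the `opt` argument when building an `Operator`.

```python
# Hypothetical helper, not part of the library: it only mirrors the
# levels and the (type, default) pairs from the list above.
LEVELS = ("noop", "advanced", "advanced-fsg")

DEFAULTS = {
    "openmp": False,
    "par-collapse-ncores": 4,
    "par-collapse-work": 100,
    "par-chunk-nonaffine": 3,
    "par-dynamic-work": 10,
    "par-nested": 2,
    "blockinner": False,
    "blocklevels": 1,
    "min-storage": False,
    "cire-rotate": False,
    "cire-maxpar": False,
    "cire-ftemps": False,
    "cire-mingain": 10,
    "cire-schedule": "automatic",
    "gpu-fit": False,
    "par-disabled": True,
    "linearize": False,
}


def make_opt(level, overrides=None):
    """Return a (level, options) pair after validating the names used."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    overrides = dict(overrides or {})
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown options: {unknown}")
    return level, overrides


opt = make_opt("advanced", {"openmp": True, "blocklevels": 2})
print(opt)
# Assumed usage (needs the library installed, so left as a comment):
#   op = Operator(eqns, opt=opt)
```

Only the overridden options are carried in the pair; anything left out is assumed to fall back to the default shown in the list.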
- Parallelism
Option | CPU | GPU |
---|---|---|
openmp | ✔️ | ✔️ |
par-collapse-ncores | ✔️ | ❌ |
par-collapse-work | ✔️ | ❌ |
par-chunk-nonaffine | ✔️ | ✔️ |
par-dynamic-work | ✔️ | ❌ |
par-nested | ✔️ | ❌ |
- Blocking
Option | CPU | GPU |
---|---|---|
blockinner | ✔️ | ❌ |
blocklevels | ✔️ | ❌ |
- CIRE
Option | CPU | GPU |
---|---|---|
min-storage | ✔️ | ❌ |
cire-rotate | ✔️ | ❌ |
cire-maxpar | ✔️ | ✔️ |
cire-ftemps | ✔️ | ✔️ |
cire-mingain | ✔️ | ✔️ |
cire-schedule | ✔️ | ✔️ |
- Device-specific
Option | CPU | GPU |
---|---|---|
gpu-fit | ❌ | ✔️ |
par-disabled | ❌ | ✔️ |
- Misc
Option | CPU | GPU |
---|---|---|
linearize | ✔️ | ✔️ |
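
The tables above can be read as a lookup from option name to platform support. The sketch below transcribes them into a dictionary and reports which requested options are marked unavailable on a given platform. The `SUPPORT` table and the `unsupported` function are hypothetical names introduced for this example only; how the compiler itself reacts to an unsupported option is not covered here.

```python
# Support matrix transcribed from the tables above: option -> (CPU, GPU).
SUPPORT = {
    "openmp": (True, True),
    "par-collapse-ncores": (True, False),
    "par-collapse-work": (True, False),
    "par-chunk-nonaffine": (True, True),
    "par-dynamic-work": (True, False),
    "par-nested": (True, False),
    "blockinner": (True, False),
    "blocklevels": (True, False),
    "min-storage": (True, False),
    "cire-rotate": (True, False),
    "cire-maxpar": (True, True),
    "cire-ftemps": (True, True),
    "cire-mingain": (True, True),
    "cire-schedule": (True, True),
    "gpu-fit": (False, True),
    "par-disabled": (False, True),
    "linearize": (True, True),
}


def unsupported(options, platform):
    """Hypothetical check: list the options that the tables mark as
    unavailable on `platform`, which must be "cpu" or "gpu"."""
    column = {"cpu": 0, "gpu": 1}[platform]
    return sorted(name for name in options if not SUPPORT[name][column])


requested = {"openmp": True, "blocklevels": 2, "gpu-fit": True}
print(unsupported(requested, "cpu"))  # ['gpu-fit']
print(unsupported(requested, "gpu"))  # ['blocklevels']
```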