Skip to content

Latest commit

 

History

History
71 lines (59 loc) · 3.45 KB

README.md

File metadata and controls

71 lines (59 loc) · 3.45 KB

Summary of optimization parameters

  • Levels

    • noop: disable optimizations
    • advanced: all optimizations
    • advanced-fsg: alternative optimization pipeline
  • Options (type, default)

    • Parallelism:
      • openmp (boolean, False): enable/disable OpenMP parallelism
      • par-collapse-ncores (int, 4): control loop collapsing
      • par-collapse-work (int, 100): control loop collapsing
      • par-chunk-nonaffine (int, 3): control chunk size in nonaffine loops
      • par-dynamic-work (int, 10): switch between dynamic and static scheduling
      • par-nested (int, 2): control nested parallelism
    • Blocking:
      • blockinner (boolean, False): enable/disable loop blocking along innermost loop
      • blocklevels (int, 1): 1 => classic loop blocking; 2 for two-level hierarchical blocking; etc.
    • CIRE:
      • min-storage (boolean, False): smaller working set size, less loop fusion
      • cire-rotate (boolean, False): smaller working set size, fewer parallel dimensions
      • cire-maxpar (boolean, False): bigger working set size, more parallelism
      • cire-maxalias (boolean, False): bigger working set size, better flop count
      • cire-ftemps (boolean, False): give user control over the allocated temporaries
      • cire-mincost-sops (int, 10): minimum cost of a sum-of-product candidate
      • cire-mincost-inv (int, 50): minimum cost of a dimension-invariant candidate
    • Device-specific:
      • gpu-fit (boolean, False): list of saved TimeFunctions that fit in the device memory
      • gpu-direct (boolean, False): generate code for optimized GPU-aware MPI
      • par-disabled (boolean, True): enable/disable parallelism on the host

Optimization parameters by platform

  • Parallelism
CPU GPU
openmp ✔️ ✔️
par-collapse-ncores ✔️
par-collapse-work ✔️
par-chunk-nonaffine ✔️ ✔️
par-dynamic-work ✔️
par-nested ✔️
  • Blocking
CPU GPU
blockinner ✔️
blocklevels ✔️
  • CIRE
CPU GPU
min-storage ✔️
cire-rotate ✔️
cire-maxpar ✔️ ✔️
cire-maxalias ✔️ ✔️
cire-ftemps ✔️ ✔️
cire-mincost-sops ✔️ ✔️
cire-mincost-inv ✔️ ✔️
  • Device-specific
CPU GPU
gpu-fit ✔️
gpu-direct ✔️
par-disabled ✔️