• [RFC][PIR] Parallel IR, Stage 0: IR extensions
  • Preview
  • 1 Introduction
  • 1.1 Motivation
  • 1.2 Advantages of fork-join parallelism
  • 1.2.1 Minimal IR extension
  • 1.2.2 Simple semantics
  • 1.2.3 Simple and quick implementation
  • 1.2.4 Incremental integration into LLVM
  • 1.2.5 Low-level representation of parallelism
  • Implementing divide-and-conquer algorithms
  • Implementing parallel for loops
  • Implementing map
  • Implementing reduction
  • Implementing scan
  • 1.2.6 Representation of static parallelism
  • 1.2.7 Representative of existing languages
  • 1.2.8 Existence of prototypes
  • 1.2.9 Better optimization opportunities
  • 1.2.10 Good tooling support
  • Performance models
  • Scalability profilers
  • Race detectors
  • 1.3 Scope
  • 2 Design Goals
  • 3 Instruction Specification
  • 3.1 fork Terminator
  • 3.2 halt Terminator
  • 3.3 join Terminator
  • 4 Examples
  • 4.1 Simplest case (2 parallel threads)
  • Cilk
  • OpenMP
  • PIR
  • 4.2 More than 2 threads
  • Cilk
  • OpenMP
  • PIR
  • 4.3 A parallel loop
  • Cilk
  • OpenMP
  • PIR
  • 4.4 Nested parallel loops
  • Cilk
  • OpenMP
  • PIR
  • 5 Outlook
  • 6 References