## Outline

#### [Introduction](00_intro.ipynb) (Aug. 26)
* Course Goals and Culture
* Parallelism
* Modern processors from a programmer's perspective

#### [Parallel Python](01_joblib.ipynb) (Aug. 28)
* A First Program
* Python parallel processes
* Concurrency versus parallelism
* The Global Interpreter Lock and Python threads

#### [Strong Scaling](02_strong_scaling.ipynb) (Sep. 4)
* Amdahl's Law
* Speedup
* Parallel Efficiency
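
As a rough illustration of how these three topics relate, here is a minimal sketch of Amdahl's Law, speedup, and parallel efficiency. The function names and the 95% parallel fraction are assumptions chosen for illustration; they are not taken from the notebook.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's Law: upper bound on speedup when only
    `parallel_fraction` of the runtime benefits from more workers."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_efficiency(speedup, workers):
    """Speedup per worker; 1.0 would be perfect strong scaling."""
    return speedup / workers


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, for illustration only
    for n in (1, 2, 4, 8, 16, 64):
        s = amdahl_speedup(p, n)
        e = parallel_efficiency(s, n)
        print(f"{n:3d} workers: speedup {s:5.2f}, efficiency {e:.2f}")
```

With a 5% serial fraction, the predicted speedup stays below 20 no matter how many workers are added, and efficiency falls as the worker count grows.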
    
#### [OpenMP](04_open_mp.ipynb) (Sep. 9)
* What is OpenMP (parallel C/Fortran on multicore, shared-memory architectures)
* Serial to Parallel Refactoring
* System overview
    * Preprocessor
    * Library
    * Runtime
* Example: the #pragma omp parallel directive
    * Blocks and scoping
    * Interacting with the environment
* Reference Materials: https://hpc-tutorials.llnl.gov/openmp/

#### [CPU Parallelism: Multicore](03_moores_multicore.ipynb) (Sep. 11)
* What is a CPU? (evolution of CPUs, motivation for parallelism, Moore's Law)
* Multicore
    * Shared-memory

#### [Memory Hierarchy](05_cache_hierarchy.ipynb) (Sep. 16)
* Latencies
* Cache coherency
* Row versus column order
* LMbench
* **Read**: https://siboehm.com/articles/22/Fast-MMM-on-CPU

#### [Cilk](06_fork_join.ipynb) (Sep. 18)
* Work and span
* Fork/join parallelism
* Loop parallelism
* Reducers
* **Read**: Work and span: https://en.wikipedia.org/wiki/Analysis_of_parallel_algorithms

#### [Loop Parallelism](07_openmp_loops.ipynb) (Sep. 23)

* Scoping and thread-local variables
* Loop dependencies
* Loop fusion
* Loop fission
* Reductions in OpenMP

#### [CPU Parallelism: ILP](08_ILP.ipynb) (Sep. 25)
* Instruction-level parallelism (ILP)
    * Pipelines
    * Vectorization
    * Other sources (out-of-order execution, branch prediction, speculative execution)
 
#### [Example: Numba JIT Compilation](examples/ex_jit.ipynb) (Sep. 25)
* **Read**:
    * https://numba.pydata.org/
    * https://numba.readthedocs.io/en/stable/user/5minguide.html
* [Solutions](solutions/ex_jit_soln.ipynb)

#### [Factors Against Parallelism](09_factors.ipynb) (Sep. 30)
* Interference
* Skew
* Startup costs
* Overlap

#### [Processes and Threads](10_processthread.ipynb) (Sep. 30)
* OS processes
* OS threads
* Virtual memory

#### [Vector Programming](11_vectorization.ipynb) (Oct. 2)
* Vector processing
* Assembly code (overview)
* Vector registers
* Programming with intrinsics
* See the examples on Godbolt linked in the notebooks

#### [Java Thread Programming](12_javathreads.ipynb) (Oct. 7)
* Fork/join in Java
* Thread classes and Runnable interfaces

#### [Java Synchronization and Thread Safety](13_synchronization.ipynb) (Oct. 9)
* Fork/join in Java
* Thread classes and Runnable interfaces
* [Example: Synchronization in Java](examples/13_ex_javasycnh.ipynb)
    * [Solution: Synchronization in Java](examples/solutions/13_ex_javasynch.ipynb)
    
#### [Mutual Exclusion](14_mutex.ipynb) (Oct. 14)
* Peterson's algorithm
* Bakery algorithm
* Fast Mutual Exclusion
* **Read**: Herlihy and Shavit. _The Art of Multiprocessor Programming_.
  * Chapter 1: all
  * Chapter 2: 2.1-2.6
  * Chapter 7: 7.1-7.3
* [Example: Fast Mutex in Java](examples/14_ex_fastmutex.ipynb)
  * [Solution: Fast Mutex in Java](examples/solutions/14_ex_fastmutex_soln.ipynb)
 

#### [Roofline Performance Model](15_roofline.ipynb) (Oct. 16)
* I/O intensity
* Off-chip bandwidth
* I/O-limited and compute-limited kernels
* **Read**: Williams et al. [Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures](https://escholarship.org/uc/item/5tz795vq), CACM, 52(4), 2009.
 
#### Midterm (Oct. 21)


#### Things to Understand (Study Guide)

These examples and readings cover concepts that I consider important to know. They go beyond the treatment in the lecture notes and homework.

* Row/Column example: [row_column.c](./examples/openmp/row_column.c)
* False sharing example: [sharing.c](./examples/openmp/sharing.c)
* Fast matrix multiplication example: https://siboehm.com/articles/22/Fast-MMM-on-CPU
* Cilk: the Integral and Matrix Multiplication examples on [http://preview.speedcode.org/](http://preview.speedcode.org/)