Parallel Programming

* Objectives:
    * Describe basic components of a computer
    * Describe basic components of an operating system (OS)
    * List components of a process
    * State difference between a task and a process
    * List issues involved in parallelizing computation
    * Decide when to use processes vs. threads

1) Operating System Basics and Processes
* Operating System Basics:
    * An operating system is a program which manages a computer's resources:
        * Synchronization to coordinate work
        * Mutual exclusion to protect shared resources
        * Scheduling of work
            * Usually "fair"
            * Can change priority with `nice`
    * Work runs inside **processes**, each of which maps these memory segments:
        * `text`: the executable program code
            * Only one copy is in memory on the entire computer
            * Read-only
            * From OS, programs, and libraries
        * Data segments:
            * `data`: initialized global and static variables
            * `bss`: uninitialized global and static variables
            * One copy per process
* **States of a process** - processes can be in one of the following states:
    * Ready/executing
    * Blocked
    * Delayed
    * Suspended
* **Lifecycle of a process** - the process life cycle is:
    * Fork
    * Exec
    * Exit
    * Reaped by parent
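
The four lifecycle steps above can be sketched in Python. This is a minimal illustration, assuming a Unix-like system where `os.fork` is available; the helper name `run_child` is hypothetical and not part of any library.

```python
import os
import sys


def run_child(argv):
    """Fork a child, exec `argv` in it, and return the child's exit code."""
    pid = os.fork()                     # Fork: duplicate the calling process
    if pid == 0:
        # Child: replace this process image with the new program (Exec).
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)               # exec failed; exit immediately
    # Parent: block until the child exits; waitpid() also reaps it.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)   # Exit: code passed to exit()
    return -1                           # child was terminated by a signal


if __name__ == "__main__":
    code = run_child([sys.executable, "-c", "import sys; sys.exit(7)"])
    print("child exit code:", code)
```

Until the parent calls `waitpid`, an exited child remains in the process table as a zombie, which is why the last lifecycle step matters.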
* **Components of a process** - a process has the following main components:
    * OS control information:
        * PID and PPID
        * File descriptor table
        * Mapping of standard input, output, and error
        * Error status
        * Signal handlers
    * Thread of execution
    * **Stack** - local storage for thread
    * **Heap** - general memory for process to allocate dynamically
    * A process should be relatively insulated from other processes
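
The PID/PPID relationship and the insulation between processes can be observed directly. The sketch below assumes a Unix-like system with `os.fork`; `counter` and `fork_and_report` are hypothetical names chosen for illustration. The child changes a global variable and reports back over a pipe, while the parent's copy stays untouched.

```python
import os

counter = 0  # global variable; each process gets its own copy after fork


def fork_and_report():
    """Fork a child that modifies `counter` and reports its view via a pipe.

    Returns (pid_from_fork, child_pid, child_ppid, child_counter).
    """
    global counter
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: change its private copy of the global and report back.
        os.close(read_fd)
        counter += 100
        message = f"{os.getpid()} {os.getppid()} {counter}"
        os.write(write_fd, message.encode())
        os._exit(0)
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        child_pid, child_ppid, child_counter = map(int, reader.read().split())
    os.waitpid(pid, 0)
    return pid, child_pid, child_ppid, child_counter


if __name__ == "__main__":
    print(fork_and_report())
    print("parent still sees counter =", counter)
```

The pipe is needed precisely because the two processes do not share memory: the child's update to `counter` is invisible to the parent.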

2) Threads and Parallelization
* Process vs. Thread
    * A **thread** is a lighter-weight concept:
        * One or more run inside of a **process**
        * Each thread has its own **stack**
        * All threads use the same **heap/global** memory
            * Easy to communicate
            * Easy to cause **race conditions** if threads do **not** coordinate access to shared memory
                * A **race condition** (or **race hazard**) is the behavior of an electronic, software, or other system in which the output depends on the sequence or timing of other uncontrollable events. It becomes a bug when events do not happen in the order the programmer intended
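
One common way to coordinate access to shared memory is a lock. The following is a minimal sketch using the standard `threading` module; the function name `count_with_lock` and its parameters are hypothetical. Several threads increment one shared counter, and the lock makes each read-modify-write step exclusive.

```python
import threading


def count_with_lock(n_threads=4, n_increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(n_increments):
            # Only one thread at a time may run this read-modify-write step.
            with lock:
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(count_with_lock())
```

Without the `with lock:` line, two threads could read the same old value of `total` and one update could be lost; whether that actually happens on a given run depends on thread timing, which is what makes such bugs hard to reproduce.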
* **Parallelization** - use parallelization to speed up big jobs when:
    * The problem is **embarrassingly parallel**: little or no effort is needed to separate it into a number of parallel tasks
        * Work can be broken up into independent chunks
    * Operations can block or fail
    * Application decomposes into different types of work or stages
    * Have more data than fits in a single computer
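
As an illustration of breaking work into independent chunks, the sketch below sums squares over a range by giving each worker process its own sub-range. It assumes a Unix-like system (it requests the `fork` start method explicitly); `parallel_sum_of_squares` and `_partial_sum` are hypothetical names, and the chunking scheme is just one simple choice.

```python
import multiprocessing


def _partial_sum(out, start, stop):
    """Compute one independent chunk and send the result to the parent."""
    out.put(sum(i * i for i in range(start, stop)))


def parallel_sum_of_squares(n, n_procs=4):
    """Sum i*i for i in range(n), one worker process per chunk."""
    ctx = multiprocessing.get_context("fork")   # assumption: Unix-like OS
    out = ctx.Queue()
    step = max(1, -(-n // n_procs))             # ceiling division
    procs = []
    for start in range(0, n, step):
        stop = min(start + step, n)
        p = ctx.Process(target=_partial_sum, args=(out, start, stop))
        p.start()
        procs.append(p)
    # Collect one result per worker before joining.
    total = sum(out.get() for _ in procs)
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

No chunk needs data from any other chunk, so the only coordination is combining the partial results at the end.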
* Tools for parallelization:
    * OS and computer language provide support for **parallel programming**:
        * **Open Multi-Processing (OpenMP)** - works within one node (multi-core) via shared memory
        * **Message Passing Interface (MPI)** - works between nodes, e.g., over a network
    * Python provides:
        * **Processes** - use `multiprocessing` module
        * **Threads** - use `threading` module
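
The two Python modules expose a deliberately similar interface: both `threading.Thread` and `multiprocessing.Process` take a `target` and `args`, and both have `start()` and `join()`. The sketch below runs the same function both ways. It assumes a Unix-like system for the `fork` start method; `square_into`, `run_in_thread`, and `run_in_process` are hypothetical names.

```python
import multiprocessing
import queue
import threading


def square_into(out, x):
    """Worker body: put x squared on the given queue."""
    out.put(x * x)


def run_in_thread(x):
    out = queue.Queue()                 # in-process queue shared by threads
    t = threading.Thread(target=square_into, args=(out, x))
    t.start()
    t.join()
    return out.get()


def run_in_process(x):
    ctx = multiprocessing.get_context("fork")   # assumption: Unix-like OS
    out = ctx.Queue()                   # queue that crosses process boundaries
    p = ctx.Process(target=square_into, args=(out, x))
    p.start()
    result = out.get()                  # read the result before joining
    p.join()
    return result


if __name__ == "__main__":
    print(run_in_thread(7), run_in_process(7))
```

The main visible difference is the queue type: threads can share an ordinary `queue.Queue` because they share memory, while processes need a `multiprocessing` queue that serializes data between address spaces.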
* Difficulties with Python Parallelization:
    * CPython has a **Global Interpreter Lock (GIL)**
    * The GIL lets only one thread per process execute Python bytecode at a time
    * This keeps shared/global data structures from being compromised
    * Makes parallelizing CPU-bound work with threads difficult
    * To get parallelization, **must run multiple Python jobs** as separate processes

3) When To Use Multi-Processing vs. Threading
* Use a **process** for **longer** running jobs:
    * Length of the job must offset the cost of launching process
    * (+) Circumvent **GIL**
    * (-) Need extra fault tolerance
    * Common on clusters using Condor, PBS, etc.
    * (+) Robust to errors
* Use a **thread** for **parallelization** when you need:
    * Quicker creation/destruction than processes offer
    * Easy communication via shared memory

4) Parallel Programming Intuition
* Beware of trade-offs:
    * Processes vs. threads: robustness vs. speed
    * Cost of launching a process/thread vs. length of work
* Other issues:
    * Fork, then load data (not vice-versa)
    * Parallel programming is hard to debug