= OSCON 2008, Tutorial 3: Ubiquitous Multi-threading in a Multi-core World

== Shift from serial to parallel
=== Process
- Find things that can be done almost independently
- Analyze communication (dependences)
- Organize dependences for parallelism
- <b>Do this early!</b>

=== Generic programming
- Make assumptions
- e.g. Quicksort: walk bidirectionally and swap items.

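The quicksort bullet can be made concrete with a minimal sketch of the partition step. The name partition_bidirectional is hypothetical. The only assumption the code makes about its input is that the iterators can move in both directions, which is the kind of assumption generic code has to state up front.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: moves every element for which pred is true in front of
// the rest and returns the split point. It needs only ++ from the left and --
// from the right, i.e. bidirectional iterators.
template <typename BidirIt, typename Pred>
BidirIt partition_bidirectional(BidirIt first, BidirIt last, Pred pred) {
    while (true) {
        // Walk forward past elements that are already on the correct side.
        while (first != last && pred(*first)) ++first;
        if (first == last) return first;
        // Walk backward past elements that are already on the correct side.
        --last;
        while (first != last && !pred(*last)) --last;
        if (first == last) return first;
        std::swap(*first, *last);  // both elements are misplaced: swap them
        ++first;
    }
}
```

Partitioning {5, 2, 8, 1, 9, 3} with the predicate "less than 5" leaves three small values in front of the returned split point. A singly linked list would not satisfy the assumption, since it cannot walk backward.
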
=== Generic iteration
- Dependences hinder parallel execution
- STL for_each is a good example: it has to check the iterator before each call to the function

=== Dealing with Dependences
- Remove dependences
- Rearrange dependences to shorten the critical path
- Domain experts are better at this than programmers, since they know where the rules can be broken.

=== Parallel iteration
- Know the number of iterations ahead of time to control dependences
- Linked lists and other variable-length structures are a poor fit for parallel iteration


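As a rough illustration of why a known iteration count helps, the sketch below cuts an index range into chunks before any thread starts. The function name parallel_sum and the chunking scheme are assumptions made for this example, and it uses std::thread from the later C++11 standard rather than anything named in the tutorial.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums a vector using num_threads worker threads.
// The size is known up front, so each thread gets its own index range and
// its own result slot, and no iteration depends on another.
long long parallel_sum(const std::vector<int>& data, std::size_t num_threads) {
    assert(num_threads > 0);
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t lo = std::min(t * chunk, data.size());
        const std::size_t hi = std::min(lo + chunk, data.size());
        workers.emplace_back([&data, &partial, t, lo, hi] {
            const auto first = data.begin() + static_cast<std::ptrdiff_t>(lo);
            const auto last = data.begin() + static_cast<std::ptrdiff_t>(hi);
            partial[t] = std::accumulate(first, last, 0LL);
        });
    }
    for (auto& w : workers) w.join();  // wait for every chunk to finish

    // Combining the partial sums is the only serial step left.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

A linked list offers no cheap way to compute lo and hi: finding the start of the second chunk already means walking the first one.
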
== Correctness
- Make sure you have the sequential version right first
- <i>"Embarrassing parallelism is good"</i> (big arrays, no dependences)

=== First, define what is correct
- Matching a serial program bit-for-bit might be unrealistic

==== Examples
- Floating-point round-off in fluid solvers (an iterative process that solves for a parameter; floating-point inaccuracies introduce error, so a different evaluation order gives a slightly different result)
- MPEG compression: trading compression for parallelism
- Search returns one of several acceptable answers

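The floating-point example can be reproduced in a few lines. This is a minimal sketch, assuming IEEE 754 doubles and no fast-math style compiler flags; the function names are hypothetical.

```cpp
#include <cassert>

// The same three values added in two groupings. A serial loop would compute
// the first; a parallel reduction that pairs up different elements could
// compute the second.
double sum_left_first(double a, double b, double c) { return (a + b) + c; }
double sum_right_first(double a, double b, double c) { return a + (b + c); }
```

With a = 1e20, b = -1e20 and c = 1.0, the first grouping returns 1.0 and the second returns 0.0, because 1.0 is smaller than the spacing between doubles near 1e20 and is lost when added to -1e20. Neither answer is a bug, which is why "correct" has to be defined before comparing a parallel run against a serial one.
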
=== Race conditions
- Shared data, winners and losers

=== Synchronization
==== Low-level
- Mutexes, condition variables (wait on a condition without holding the lock), tricky events
- Atomic operations: guaranteed to happen without interruption
- Emphasis on a pair of threads

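A small sketch of the atomic-operations bullet, using std::atomic from C++11 (an assumption; the notes do not name a library). The helper name count_with_atomics is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: num_threads threads each add 1 to a shared counter
// `increments` times. fetch_add is one indivisible read-modify-write, so no
// update is lost even though there is no mutex.
int count_with_atomics(int num_threads, int increments) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments] {
            for (int i = 0; i < increments; ++i) {
                counter.fetch_add(1);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

With a plain int and ++counter the threads would race, and some increments could be overwritten by a stale value.
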
==== Higher-level
- Parallel loops
- Pipelines
- Barriers: serialization after a parallel phase, waiting for all parallel work to finish
- Work queues: dynamic scheduling

==== Mutex
- A lock on a (critical) section of code.
- Useful when we have two (or more) things we want to change at the same time

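One way to picture "two things we want to change at the same time" is a pair of balances whose sum must stay constant. The class AccountPair below is a hypothetical example, sketched with std::mutex from C++11.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical example: two balances that must always add up to the same total.
class AccountPair {
public:
    AccountPair(long checking, long savings)
        : checking_(checking), savings_(savings) {}

    // Both fields change inside one critical section, so no other thread can
    // observe the state after the debit but before the credit.
    void move_to_savings(long amount) {
        std::lock_guard<std::mutex> guard(mutex_);
        checking_ -= amount;
        savings_ += amount;
    }

    long total() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return checking_ + savings_;
    }

    long savings() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return savings_;
    }

private:
    mutable std::mutex mutex_;  // mutable so const readers can lock it
    long checking_;
    long savings_;
};
```

Readers take the same mutex as writers; otherwise total() could run between the two updates and report a wrong sum.
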
==== Semaphore
- Let up to N threads in at the same time

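A counting semaphore can be sketched from a mutex and a condition variable. Both CountingSemaphore and the driver max_observed_occupancy are hypothetical names for this illustration, not an API from the tutorial.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore sketch: at most `permits` holders at a time.
class CountingSemaphore {
public:
    explicit CountingSemaphore(int permits) : permits_(permits) {
        assert(permits > 0);
    }

    void acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        available_.wait(lock, [this] { return permits_ > 0; });
        --permits_;
    }

    void release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++permits_;
        }
        available_.notify_one();  // wake one waiter, if any
    }

private:
    std::mutex mutex_;
    std::condition_variable available_;
    int permits_;
};

// Hypothetical driver: sends num_threads threads through a section limited to
// `limit` entrants and returns the highest occupancy it observed.
int max_observed_occupancy(int num_threads, int limit) {
    CountingSemaphore sem(limit);
    std::mutex stats_mutex;
    int inside = 0;
    int max_inside = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            sem.acquire();
            {
                std::lock_guard<std::mutex> g(stats_mutex);
                ++inside;
                max_inside = std::max(max_inside, inside);
            }
            std::this_thread::yield();  // give other threads a chance to enter
            {
                std::lock_guard<std::mutex> g(stats_mutex);
                --inside;
            }
            sem.release();
        });
    }
    for (auto& w : workers) w.join();
    return max_inside;
}
```

The observed maximum depends on scheduling, but it can never exceed the limit passed to the semaphore.
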
==== Reader-writer lock
- Multiple readers or one writer at a time
- Useful when there's lots of reading, little writing

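A minimal sketch of the reader-writer idea, assuming a C++14 compiler for std::shared_timed_mutex and std::shared_lock. The class name SharedCounter is hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical counter that is read often and written rarely.
class SharedCounter {
public:
    int read() const {
        // Shared mode: any number of readers may hold the lock together.
        std::shared_lock<std::shared_timed_mutex> lock(mutex_);
        return value_;
    }

    void increment() {
        // Exclusive mode: waits until no reader or writer holds the lock.
        std::unique_lock<std::shared_timed_mutex> lock(mutex_);
        ++value_;
    }

private:
    mutable std::shared_timed_mutex mutex_;
    int value_ = 0;
};
```

Whether this beats a plain mutex depends on the read-to-write ratio and on how long the lock is held, so it is worth measuring before committing to it.
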
==== Condition variables
- Allow threads to wait for state protected by a mutex to change, without holding the mutex while waiting and without timing holes (uses signaling)

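The waiting pattern can be sketched as a one-slot mailbox. Mailbox is a hypothetical class, written against std::condition_variable from C++11.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-slot mailbox: take() sleeps until put() has stored a value.
class Mailbox {
public:
    void put(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            value_ = value;
            full_ = true;
        }
        ready_.notify_one();  // signal after the protected state has changed
    }

    int take() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait() atomically releases the mutex and sleeps, then reacquires it
        // before returning, so put() cannot slip in between the check and the
        // sleep. The predicate also guards against spurious wakeups.
        ready_.wait(lock, [this] { return full_; });
        full_ = false;
        return value_;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    bool full_ = false;
    int value_ = 0;
};
```

The result is the same whether put() runs before or after take() starts waiting, because the predicate is checked under the mutex before sleeping.
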
=== Problems with locks
==== Composition
- Locking lower-level operations does not guarantee that the higher level is race free

==== Deadlock
- Everyone's waiting for a lock that no one can give

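A classic way to reach the deadlock described above is two threads taking the same two locks in opposite order. The sketch below shows one common remedy, std::lock from C++11, which acquires several mutexes with a deadlock-avoidance algorithm. The Account type and transfer function are hypothetical, and this remedy is an assumption of the example rather than something the notes prescribe.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical account with its own lock.
struct Account {
    std::mutex mutex;
    long balance = 0;
};

// Needs both locks. Locking from.mutex and then to.mutex by hand could
// deadlock against a transfer running in the opposite direction.
void transfer(Account& from, Account& to, long amount) {
    assert(&from != &to);
    std::lock(from.mutex, to.mutex);  // acquires both without deadlocking
    // adopt_lock: the guards take over the already-held locks and release them.
    std::lock_guard<std::mutex> hold_from(from.mutex, std::adopt_lock);
    std::lock_guard<std::mutex> hold_to(to.mutex, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}
```

Another option with the same effect is to agree on a global lock order, for example always locking the account with the lower address first.
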
==== Convoying
- Similar to deadlock: the owner of a lock is preempted and other threads queue up behind it
- If the lock owner crashes, other threads wait forever
- Minimize convoying by using atomics and by holding locks for as short a time as possible

==== Priority Inversion
- Can occur with prioritized preemptive scheduling
- Low-priority thread is preempted while holding a lock
- Medium-priority thread runs in preference to the low-priority thread
- High-priority thread waits on the lock until a timeout fires and restarts the system
- Mars Pathfinder example: http://research.microsoft.com/~mbj/Mars_Pathfinder/Mars_Pathfinder.html

=== Composition problem
- Multiple threads might append the same thing to a list, for example
- Move your locks out to the outermost invariant

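The list example can be sketched as follows. UniqueList is a hypothetical class whose invariant is "no duplicates"; the point is where the lock sits relative to that invariant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical list that must never contain the same value twice.
class UniqueList {
public:
    // Each of these two operations is race free on its own...
    bool contains(int value) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return contains_unlocked(value);
    }
    void append(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(value);
    }

    // ...but a caller writing "if (!contains(v)) append(v)" is not: two
    // threads can both see "absent" and both append. Holding the lock across
    // the whole check-then-append protects the invariant itself.
    bool append_if_absent(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (contains_unlocked(value)) return false;
        items_.push_back(value);
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    bool contains_unlocked(int value) const {
        return std::find(items_.begin(), items_.end(), value) != items_.end();
    }

    mutable std::mutex mutex_;
    std::vector<int> items_;
};
```

If several threads call append_if_absent with the same value, exactly one call returns true and the list ends up with a single copy.
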
=== Notes on Mutexes
- Avoid exposing mutexes to other packages
- Look into invariant-based programming
- <b>Remember exception handling</b>

=== Exception-safe mutexing using RAII
- RAII = Resource Acquisition Is Initialization
- Constructor acquires resource
- Destructor releases resource

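A hand-written guard makes the constructor/destructor pairing visible. ScopedLock and update_or_throw are hypothetical names for this sketch; in C++11 and later, std::lock_guard does the same job.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Minimal RAII guard: the lock is held exactly as long as the object lives.
class ScopedLock {
public:
    explicit ScopedLock(std::mutex& m) : mutex_(m) { mutex_.lock(); }  // acquire
    ~ScopedLock() { mutex_.unlock(); }                                 // release
    ScopedLock(const ScopedLock&) = delete;             // copying a guard would
    ScopedLock& operator=(const ScopedLock&) = delete;  // unlock twice
private:
    std::mutex& mutex_;
};

// Hypothetical operation that may fail while holding the lock.
void update_or_throw(std::mutex& m, bool fail) {
    ScopedLock guard(m);
    if (fail) throw std::runtime_error("update failed");
    // ... normal work under the lock would go here ...
}
```

When the exception propagates out of update_or_throw, stack unwinding runs the destructor of guard, so the mutex is released on the error path without any explicit cleanup code.
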
=== Lockless problems
- Livelock: threads keep retrying and reacting to each other, but none of them makes progress
- ABA problem: you read a variable as A, another thread changes it to B and then back to A, and your compare-and-swap succeeds even though the state changed underneath you (linked list example).
- Memory reclamation: compare-and-swap is required, so one thread has to succeed and the rest have to fail. Freeing memory is hard because another thread may still hold a pointer to it. See Hazard Pointers: http://www.research.ibm.com/people/m/michael/ieeetpds-2004.pdf
- Memory consistency model
- Lock-free data structures are difficult to understand
- Often publishable, even for simple structures
- Verification is tricky: consider using Spin to verify (http://www.spinroot.com)

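The ABA bullet can be illustrated without real threads by replaying the interleaving in one thread. This is a simplified sketch: the interfering writes are simulated sequentially, the function names are hypothetical, and the version-counter mitigation is a commonly used technique assumed here, not one the notes prescribe.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Plain compare-and-swap: succeeds whenever the current value equals the one
// read earlier, even if it changed to something else and back in between.
bool plain_cas_after_aba() {
    std::atomic<int> slot{1};  // A
    int seen = slot.load();    // "our" thread reads A
    slot.store(2);             // simulated interference: A -> B
    slot.store(1);             // simulated interference: B -> A
    return slot.compare_exchange_strong(seen, 3);  // succeeds, change unnoticed
}

// Packs a version counter next to the value so every write changes the word.
std::uint64_t pack(std::uint32_t version, std::uint32_t value) {
    return (static_cast<std::uint64_t>(version) << 32) | value;
}

bool versioned_cas_after_aba() {
    std::atomic<std::uint64_t> slot{pack(0, 1)};
    std::uint64_t seen = slot.load();
    slot.store(pack(1, 2));
    slot.store(pack(2, 1));  // value is back to 1, version is not back to 0
    return slot.compare_exchange_strong(seen, pack(3, 3));  // fails: version differs
}
```

With integers the unnoticed change is harmless. With a linked list the "value" is a node pointer, and a node that was removed, freed and reallocated at the same address can make a stale compare-and-swap corrupt the list.
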
=== Tools for correctness
- KISS: Keep it simple, stupid
- Use automatic race detectors: they detect races the way memory checkers detect leaks
- Helgrind (part of Valgrind)
- Intel Thread Checker: more general race detection based on inter-thread communication

== Scalability tidbits
- Creating and destroying a thread can take on the order of <b>25,000 clock cycles!</b>