# Make Python go Brrrrr - Joseph Long
I want to parallelize some optical physics:
- Gaussian Beamlet Decomposition
- Polarization Ray Tracing

### Vocab


Numba allows you to write your own kernels that operate on CuPy arrays!

CuPy uses pre-compiled functions

## "Embarrassingly Parallel" Problems
Problems you can parallelize without having to restructure them
- Divide the work into units that don't need to communicate with each other!
- Each unit has all the input information it needs
- Units don't need to ask their neighbors for anything
- Each unit can write its result all at once
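
A minimal sketch of this pattern, using the dark/flat correction case as a toy: every frame is one unit, it carries everything it needs, and no unit talks to another. The names `calibrate_frame` and `calibrate_night` are hypothetical, frames are plain lists of pixel values rather than real images, and the correction formula `(pixel - dark) / flat` is an assumed simplification.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def calibrate_frame(frame, dark, flat):
    """One independent unit of work: dark-subtract and flat-divide a single frame."""
    return [(pixel - d) / f for pixel, d, f in zip(frame, dark, flat)]


def calibrate_night(frames, dark, flat, executor):
    """Hand every frame to the executor; no frame depends on any other."""
    unit = partial(calibrate_frame, dark=dark, flat=flat)
    # map() returns the results in the same order as the inputs
    return list(executor.map(unit, frames))


# Toy data standing in for a night of observations (two frames, two pixels each)
frames = [[10.0, 20.0], [30.0, 40.0]]
dark = [2.0, 4.0]
flat = [2.0, 4.0]

with ThreadPoolExecutor() as pool:
    calibrated = calibrate_night(frames, dark, flat, pool)

print(calibrated)
```

Because `calibrate_night` only relies on the `Executor.map` interface, a `concurrent.futures.ProcessPoolExecutor` could be passed in instead for CPU-bound work; threads are used here only to keep the sketch short.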

Astro Examples
- Testing orbit fits with different parameters
- calibrating the contrast at many points w/ fake planet injections
- dark/bias/flat correcting a whole night of observations

## Non-"Embarrassingly Parallel" Problems
Problems whose units need to communicate with each other
- Common when writing out results
- Need synchronization points
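
A rough sketch of what a synchronization point looks like, using a toy 1D diffusion step (a stand-in for the simulation examples, not taken from the talk). Each cell needs its neighbors' values from the previous step, so the cells within one step can be updated in parallel, but every cell must finish before the next step may start. The function names and the averaging rule are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def update_cell(i, values):
    """New value of cell i: the mean of itself and its two neighbors (edges clamp)."""
    left = values[max(i - 1, 0)]
    right = values[min(i + 1, len(values) - 1)]
    return (left + values[i] + right) / 3


def diffuse(values, n_steps, max_workers=4):
    """Run n_steps of the toy diffusion, parallel within a step, serial across steps."""
    current = list(values)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for _ in range(n_steps):
            # Every worker reads the same snapshot from the previous step
            step = partial(update_cell, values=current)
            # list(...) waits for all cells to finish: this is the synchronization point
            current = list(pool.map(step, range(len(current))))
    return current


print(diffuse([0.0, 3.0, 0.0], n_steps=1))
```

The loop over steps cannot be parallelized the way the loop over cells can, because step `n + 1` needs the complete output of step `n`.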

Astro Examples
- N-body simulations
- fluid dynamics simulations
- file compression

## Functional Purity
_pure functions:_ behave like mathematical functions: the output depends only on the input, e.g. f(x) = 2x

_impure functions:_ have side effects, e.g. printing to the screen:

`def f(x):`

`    print('called f(x)')`

`    return 2 * x`

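Pure functions are better for parallelization: with no side effects or shared state, calls can run in any order, on any worker, and still give the same results!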
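
A small sketch of why this matters, assuming a thread pool as the parallel backend. `double` mirrors f(x) = 2x; `double_and_log` and the shared list `call_log` are hypothetical names added for illustration. The return values agree, but the side effect (the order of entries in the log) depends on how the workers happen to be scheduled.

```python
from concurrent.futures import ThreadPoolExecutor


def double(x):
    """Pure: the result depends only on x."""
    return 2 * x


call_log = []  # shared state, here only to demonstrate a side effect


def double_and_log(x):
    """Impure: returns the same value, but also mutates shared state."""
    call_log.append(x)
    return 2 * x


inputs = list(range(8))
with ThreadPoolExecutor(max_workers=4) as pool:
    pure_results = list(pool.map(double, inputs))
    impure_results = list(pool.map(double_and_log, inputs))

# map() returns results in input order, so the return values match.
print(pure_results == impure_results)
# Every input was logged, but the order of call_log is not guaranteed.
print(sorted(call_log) == inputs)
```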

## The first problem
- your code is slow!
- your program maxes out 1 core!
- your Python code uses `multiprocessing` to use all the cores on your computer, but it can NOT spread the work across multiple computers!

## The `Ray` python package
Built for parallelism. You can ignore most of what goes on under the hood.

In [7]:
# The slow version!
import time
import random

def time_waster(min_delay: float = 1, max_delay: float=1) -> float:
    
    delay = (max_delay-min_delay)*random.random() + min_delay
    time.sleep(delay)
    return delay

def main():
    for _ in range(10):
        result = time_waster(min_delay=2,max_delay=3)
        print(f"Delay was {result} sec")
        
if __name__ == "__main__":
    # the calls run one after another, so the total is the sum of all the delays
    start = time.time()
    main()
    print(f"Took {time.time() - start} sec")

Delay was 6.559074909443284 sec


KeyboardInterrupt: 

In [8]:
# The ray version!
# It takes > 1s to start up
import time
import random
import ray

@ray.remote # added decorator
def time_waster(min_delay: float = 1, max_delay: float=1) -> float:
    
    delay = (max_delay-min_delay)*random.random() + min_delay
    time.sleep(delay)
    return delay

def main():
    pending = []
    for _ in range(10):
        # .remote() returns immediately with an object reference; the task isn't necessarily finished computing
        ref = time_waster.remote(min_delay=2,max_delay=30) # change how it's called
        pending.append(ref) # hold on to the references until the results are ready
        
        # tasks are submitted one at a time, but they run in parallel
        
    for ref in pending: # loop over the references
        result = ray.get(ref) # block until this task is done, then fetch its result
        print(f"Delay was {result} sec")
        
if __name__ == "__main__":
    start = time.time()
    main()
    print(f"Took {time.time() - start} sec")

Delay was 2.6861617063942558 sec
Delay was 9.523093161880043 sec
Delay was 20.669533161736144 sec
Delay was 12.559721903798707 sec
Delay was 18.940530024096965 sec
Delay was 24.278566129368357 sec
Delay was 3.372184541771611 sec
Delay was 21.66214743412878 sec
Delay was 17.29045392722999 sec
Delay was 8.488699986999823 sec


## The Drawback
The process is 