<h1 align="center">Computational Methods in Environmental Engineering</h1>
<h2 align="center">Lecture #17</h2>
<h3 align="center">27 Apr 2023</h3>



## Parallelization in Python



-   A few different ways to parallelize our code.
-   Let's start with the simplest: `multiprocessing`!



In [1]:
import pandas as pd
import multiprocessing as mp
print("Number of processors: ", mp.cpu_count())

Number of processors:  8


## Types of parallel execution



-   Synchronous
    -   `Pool.map`
    -   `Pool.apply`
-   Asynchronous
    -   `Pool.map_async`
    -   `Pool.apply_async`
-   `Process`
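
The difference between these call styles can be sketched with a toy example. The names `square` and `main` are made up for illustration, and the pool size of 2 is arbitrary; this is a minimal sketch, not the code used later in the lecture.

```python
import multiprocessing as mp


def square(x):
    """Toy worker; defined at module level so worker processes can locate it."""
    return x * x


def main():
    with mp.Pool(processes=2) as pool:
        # Synchronous: each call blocks until its result is available.
        mapped = pool.map(square, range(5))   # one call per item, results in input order
        applied = pool.apply(square, (3,))    # a single call with the given arguments

        # Asynchronous: each call returns a result handle immediately;
        # .get() then waits for the value.
        mapped_async = pool.map_async(square, range(5)).get()
        applied_async = pool.apply_async(square, (3,)).get()

    # Process: start and join one worker by hand (its return value is discarded).
    proc = mp.Process(target=print, args=("hello from a separate process",))
    proc.start()
    proc.join()

    print(mapped, applied, mapped_async, applied_async)
    return mapped, applied, mapped_async, applied_async


if __name__ == "__main__":
    main()
```

In a notebook on macOS or Windows the worker function usually has to live in an importable module rather than in a cell, so this sketch is best run as a script.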



## Let's get some data



on fuel economy from the [DOE](https://catalog.data.gov/dataset/fuel-economy-data)



In [2]:
df = pd.read_csv("../data/vehicles.csv", low_memory=False)
df.head()

| | barrels08 | barrelsA08 | charge120 | charge240 | city08 | city08U | cityA08 | cityA08U | cityCD | cityE | ... | mfrCode | c240Dscr | charge240b | c240bDscr | createdOn | modifiedOn | startStop | phevCity | phevHwy | phevComb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 15.695714 | 0.0 | 0.0 | 0.0 | 19 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 1 | 29.964545 | 0.0 | 0.0 | 0.0 | 9 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 2 | 12.207778 | 0.0 | 0.0 | 0.0 | 23 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 3 | 29.964545 | 0.0 | 0.0 | 0.0 | 10 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |
| 4 | 17.347895 | 0.0 | 0.0 | 0.0 | 17 | 0.0 | 0 | 0.0 | 0.0 | 0.0 | ... | | | 0.0 | | Tue Jan 01 00:00:00 EST 2013 | Tue Jan 01 00:00:00 EST 2013 | | 0 | 0 | 0 |


## ☛ Hands-on exercise



### Create a list that has the combined MPG of each car multiplied by 1.2 if it's FWD



- Write a function (`mpg`) that acts on each row
- Use `time.sleep` if you want to make it a bit slower (so we can better show the impact of parallelization later)
- The columns needed are `comb08` and `drive`
- Use a list comprehension and `iterrows`. Time it!

In [8]:
df.head(1)[['comb08', 'drive']]
len(df)

41014

In [9]:
import time

def mpg(row):
    time.sleep(4.0 / 41014)
    if row['drive'] == 'Front-Wheel Drive':
        return row['comb08'] * 1.2
    else:
        return row['comb08']


In [10]:
%time results = [mpg(row) for _, row in df.iterrows()]

CPU times: user 1.05 s, sys: 97.6 ms, total: 1.15 s
Wall time: 6.27 s


In [11]:
!cat fuel.py

import time


def mpg(row):
    time.sleep(4.0 / 41014)
    if row['drive'] == 'Front-Wheel Drive':
        return row['comb08'] * 1.2
    else:
        return row['comb08']


### Now let's try to parallelize this



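Use `Pool.map`. The worker function is imported from a separate module (`fuel.py`) so that the worker processes can import it too.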



In [23]:
import fuel
p = mp.Pool(mp.cpu_count())
rows = [row for _, row in df.iterrows()]
%timeit results = p.map(fuel.mpg, rows)

1.73 s ± 38.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
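
As a rough check on the gain, assuming the single `%time` run of the serial loop (6.27 s) and the `%timeit` mean for `Pool.map` (1.73 s) are comparable, the speedup $S$ and the parallel efficiency $E$ on $p = 8$ processes work out to:

$$
S = \frac{T_\text{serial}}{T_\text{parallel}} = \frac{6.27\ \text{s}}{1.73\ \text{s}} \approx 3.6,
\qquad
E = \frac{S}{p} \approx \frac{3.6}{8} \approx 0.45
$$

The efficiency is well below 1. A likely contributor is the cost of pickling every row to send it to a worker, though that was not measured here.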


`map` blocks until *all* worker processes have finished, then returns the results in input order



### What about asynchronous?



Use `Pool.map_async`



In [24]:
%timeit results = p.map_async(fuel.mpg, rows).get()

1.74 s ± 27.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [20]:
results

<multiprocessing.pool.MapResult at 0x169f05f00>
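
Without `.get()`, `map_async` hands back a `MapResult` handle straight away, which is presumably what the `results` object above came from. The sketch below shows how that handle can be polled while the main process does something else. The worker `slow_double` is a hypothetical stand-in for `fuel.mpg`, and the pool size and sleep times are arbitrary.

```python
import multiprocessing as mp
import time


def slow_double(x):
    """Hypothetical stand-in for fuel.mpg: sleep briefly, then return 2 * x."""
    time.sleep(0.01)
    return 2 * x


def main():
    with mp.Pool(processes=2) as pool:
        handle = pool.map_async(slow_double, range(20))  # returns immediately
        print(type(handle).__name__)  # MapResult

        # The main process is free to do other work while the pool is busy.
        while not handle.ready():
            time.sleep(0.01)  # placeholder for useful work

        results = handle.get()  # all results, in input order
    print(results[:5])  # [0, 2, 4, 6, 8]
    return results


if __name__ == "__main__":
    main()
```

Calling `.get()` right after `map_async`, as in the timed cell, waits for everything at once, which is why it takes about as long as `Pool.map`.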

## Can Numba parallelize?



[Assume](https://github.com/ernestk-git/data-scientist-ish/blob/master/find_nearby_coords.ipynb) that we have two large arrays of latitude/longitude coordinates and,
for each point in the first array, we want to count how many points in the
second array lie within 1 km of it.



Let's initialize some random data



In [25]:
import numpy as np
n = 10000
k = 10000
coord1 = np.zeros((n, 2), dtype=np.float32)
coord2 = np.zeros((k, 2), dtype=np.float32)
coord1[:,0] = np.random.uniform(-90, 90, n).astype(np.float32)
coord1[:,1] = np.random.uniform(-180, 180, n).astype(np.float32)
coord2[:,0] = np.random.uniform(-90, 90, k).astype(np.float32)
coord2[:,1] = np.random.uniform(-180, 180, k).astype(np.float32)

Let's define two functions to calculate the distance and find the nearby points



In [26]:
from numba import jit

@jit(nopython=True)
def distance(s_lat, s_lng, e_lat, e_lng):
    """Great-circle (haversine) distance in km between start and end points."""
    R = 6373.0  # approximate Earth radius in km
    # convert degrees to radians
    s_lat = np.deg2rad(s_lat)
    s_lng = np.deg2rad(s_lng)
    e_lat = np.deg2rad(e_lat)
    e_lng = np.deg2rad(e_lng)
    # haversine term
    d = (np.sin((e_lat - s_lat) / 2) ** 2
         + np.cos(s_lat) * np.cos(e_lat) * np.sin((e_lng - s_lng) / 2) ** 2)
    return 2 * R * np.arcsin(np.sqrt(d))
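
For reference, `distance` follows the haversine formula. Writing $\varphi$ for latitude and $\lambda$ for longitude (both in radians), with subscripts $s$ and $e$ for the start and end points, and using $R = 6373$ km as in the code:

$$
d = \sin^2\!\left(\frac{\varphi_e - \varphi_s}{2}\right)
  + \cos\varphi_s \, \cos\varphi_e \, \sin^2\!\left(\frac{\lambda_e - \lambda_s}{2}\right),
\qquad
\text{distance} = 2R \arcsin\sqrt{d}
$$

Here $d$ is the intermediate variable of the same name in the function, not the final distance.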

In [31]:
def get_nearby(coord1, coord2, max_dist):
    """For each point in coord1, count the points in coord2 within max_dist km."""
    n = coord1.shape[0]
    output = np.zeros(n, dtype=np.int32)
    for i in range(n):
        point = coord1[i]
        # distances (km) from this point to every point in coord2
        dist = distance(point[0], point[1], coord2[:, 0], coord2[:, 1])
        output[i] = np.sum(dist < max_dist)
    return output

In [32]:
get_nearby1 = jit(nopython=True)(get_nearby)
get_nearby2 = jit(nopython=True, parallel=True)(get_nearby)

Let's time the two versions



In [33]:
%timeit out = get_nearby1(coord1, coord2, 1.0)

3.57 s ± 21.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [34]:
%timeit out = get_nearby2(coord1, coord2, 1.0)

OMP: Info #271: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.


4.5 s ± 23.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
