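This Python script compares running a function in batch (sequentially, one call after another) with running the same calls as parallel jobs. Pi is estimated from the infinite series shown here: https://www.britannica.com/topic/Pi-Recipes-1084437, that is 4 * (1 - 1/3 + 1/5 - 1/7 + ...), and the estimate is repeated many times, first in batch and then in parallel with joblib.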

In [1]:
from joblib import Parallel, delayed
import tqdm  # imported but not used in the cells below

In [2]:
def batch_function():  # estimate pi from the series 4 * (1 - 1/3 + 1/5 - 1/7 + ...)
    array_of_terms = []
    for i in range(1, 10000):  # build the terms 1, 1/3, 1/5, 1/7, ... from the website above
        if i % 2 != 0:  # keep only odd denominators
            array_of_terms.append(1 / i)  # append only the terms we need
    # Now multiply every other element by -1.
    # Note: temp is a second name for the same list, not a copy.
    temp = array_of_terms
    # Iterate through the indexes instead of the elements
    for i in range(len(temp)):
        if i % 2 != 0:
            temp[i] = temp[i] * -1  # flip the sign of the terms at odd indexes
    # Sum the list and multiply by 4 to get pi
    pi = 4 * sum(temp)
    return pi


In [3]:
batch_function()

3.141392653591791
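
The estimate above matches pi to three decimal places. As a rough cross-check, the same partial sum can be written in one expression and compared with math.pi. This is only a sketch: the helper name leibniz_pi and its n_terms parameter are hypothetical and do not appear in the notebook, and 5000 is the number of odd values in range(1, 10000).

```python
import math


def leibniz_pi(n_terms):
    """Partial sum of 4 * (1 - 1/3 + 1/5 - ...) using n_terms terms (hypothetical helper)."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(n_terms))


n_terms = 5000  # batch_function() above also sums 5000 terms
estimate = leibniz_pi(n_terms)
error = abs(math.pi - estimate)

# For an alternating series with shrinking terms, the error is bounded
# by the first omitted term, here 4 / (2 * n_terms + 1).
bound = 4 / (2 * n_terms + 1)
print(estimate, error, bound)
```

The series converges slowly, so each call does a fixed and fairly small amount of work. That makes it a convenient unit of work for the timing comparison that follows.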

In [4]:
N=5000
items=range(N)

In [5]:
%%time
result=[batch_function() for row in items]



Wall time: 5.85 s


The batch run takes 5.85 s of wall time to estimate pi 5000 times. Now try parallel processing with joblib, using 8 jobs.

In [6]:
%%time
r=Parallel(n_jobs=8)(
    delayed(batch_function)()
    for row in items
)


Wall time: 3.45 s


Parallel processing takes 3.45 s of wall time for the same 5000 calls. Now try with a greater N.

In [7]:
N=12000
items=range(N)

In [8]:
%%time
result=[batch_function() for row in items]

Wall time: 14 s


In [9]:
%%time
r=Parallel(n_jobs=8)(
    delayed(batch_function)()
    for row in items
)

Wall time: 3.91 s
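
To put the four wall times side by side, the short sketch below computes the speedup (batch time divided by parallel time) and the parallel efficiency (speedup divided by the number of jobs). The timings are copied from the cell outputs above; the variable names are illustrative only.

```python
# Wall times in seconds, copied from the cell outputs above
timings = {
    5000: {"batch": 5.85, "parallel": 3.45},
    12000: {"batch": 14.0, "parallel": 3.91},
}
n_jobs = 8  # value passed to Parallel(n_jobs=8)

speedups = {}
for n, t in timings.items():
    speedup = t["batch"] / t["parallel"]  # how many times faster the parallel run was
    efficiency = speedup / n_jobs         # fraction of the ideal 8x speedup
    speedups[n] = speedup
    print(f"N={n}: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

With these numbers the speedup is about 1.7x for N=5000 and about 3.6x for N=12000. One plausible reading is that the fixed cost of starting worker processes and dispatching tasks is spread over more calls as N grows, although the notebook does not measure that overhead directly.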
