# Parallel, super-fast paired background MC iterations for multiple methods and activities

Sometimes you need a lot of Monte Carlo (MC) iterations in the background (for a background GSA, for instance). A powerful computer with multiple cores is not enough on its own, because the built-in Brightway MC iterator is sequential and does not run in parallel.
So here I force it to run in parallel using "ray", which distributes the work across workers.



*Basically, it's just a parallelized version of the Paired Background MC script.*

THIS IS NOT NECESSARILY FASTER ON A LOCAL LAPTOP, BUT IT REACHES ITS FULL POTENTIAL ON SERVERS WITH 16, 32 OR 64 CPUS.

**On UCloud with 64 cores (very easy to use): 0.5 s per MC for 95 activities, 20 impact categories, all paired**

**i.e. 30 min for 10 000 iterations for 95 FUs and 20 impact categories :) Adding impact categories does not add computing time at all.**

UCloud is very easy to use via its graphical interface. You can run Jupyter notebooks from there.

**THIS NEEDS THE PACKAGE "ray".** "itertools" is part of the Python standard library and does not need to be installed.

pip install ray

In [14]:

import datetime
from time import time  # time() is used to time the parallel run

import os
import sys
import csv
import copy
import math
import random
import functools
import operator  # needed for operator.iconcat when concatenating the results
import multiprocessing
from itertools import compress

import numpy as np
import pandas as pd

import bw2data
import bw2io
from bw2data.parameters import *
import brightway2 as bw

import ray



## Loading brightway

In [15]:

    
bw.projects.set_current('Eco38_4') 



# Loading Ecoinvent
Ecoinvent = bw.Database('ecoinvent 3.8 conseq')




biosphere=bw.Database('biosphere3')




In [16]:
# 4 methods
list_meth =[('ReCiPe Midpoint (I)', 'climate change', 'GWP20'), 
            ('ReCiPe Midpoint (H)', 'human toxicity', 'HTPinf'),
            ('ReCiPe Midpoint (H)', 'freshwater ecotoxicity', 'FETPinf'),
            ('ReCiPe Midpoint (H)', 'terrestrial ecotoxicity', 'TETPinf')]
 


In [2]:
list_act_fu =[{Ecoinvent.random():1} for i in range(5)] # The activities you want to make MC iterations for


In [18]:
list_act_fu

[{'market for electricity, medium voltage' (kilowatt hour, GT, None): 1},
 {'oil power plant construction, 500MW' (unit, RER, None): 1},
 {'treatment of high level radioactive waste for final repository' (cubic meter, CH, None): 1},
 {'sanitary landfill facility construction' (unit, CH, None): 1},
 {'market for waste paperboard' (kilogram, CY, None): 1}]

Collect the necessary characterization matrices in a dictionary.

In [19]:


# Create an empty dictionary to collect the characterization matrices
C_matrixes ={}

# Run one LCA only to get an initialized LCA object. We do not care about the result.
Lca=bw.LCA({Ecoinvent.random():1},('ReCiPe Midpoint (I)', 'climate change', 'GWP20'))
Lca.lci()
Lca.lcia()

# Use switch_method and collect the characterization matrix for every change
for meth in list_meth:
    Lca.switch_method(meth)
    #print(meth,"##############")
    #print(Lca.characterization_matrix)
    C_matrixes[meth]=Lca.characterization_matrix
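
To make clear what these stored matrices are used for later: in Brightway, a characterization matrix is a diagonal matrix of characterization factors (one per biosphere flow), and the score for a method is the sum of that matrix multiplied by the inventory. Below is a minimal sketch with small dense NumPy arrays and made-up numbers; the names `inventory`, `c_matrices` and `scores` are hypothetical. The real matrices are SciPy sparse matrices, for which `*` means matrix multiplication, so `@` is used here for the dense equivalent.

```python
import numpy as np

# Hypothetical toy data: 3 biosphere flows (rows), 2 activities (columns).
# inventory[i, j] = amount of flow i caused by activity j.
inventory = np.array([[1.0, 0.5],
                      [0.0, 2.0],
                      [4.0, 0.0]])

# One diagonal characterization matrix per method (made-up factors).
c_matrices = {
    "method_a": np.diag([2.0, 0.0, 1.0]),
    "method_b": np.diag([0.0, 3.0, 0.5]),
}

# The inventory is computed once; each extra method only costs
# one matrix product and a sum.
scores = {name: float((c @ inventory).sum()) for name, c in c_matrices.items()}
print(scores)
```

This is why the matrices are collected once up front: inside the MC loop, the expensive part is the inventory, and scoring it against several methods is cheap.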


Here we want to make 9 MC iterations in total, and we divide the work into 3 chunks (number_chunks, one per core) of 3 iterations each (size_divis).

FYI, the best combination I found to make 235 000 iterations on 64 cores was:
size_divis = 1000
number_chunks = 235

Try to stay close to this ratio for good performance.
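
As a small sketch of how the two settings relate to the total number of iterations: the helper `plan_chunks` below is hypothetical (it is not part of the original script) and assumes you fix the chunk size and derive the number of chunks from it, rounding up so that at least the requested number of iterations is produced.

```python
import math


def plan_chunks(total_iterations, size_divis):
    """Return (number_chunks, size) for a given chunk size.

    Hypothetical helper: rounds the number of chunks up, so the
    returned size can be slightly larger than total_iterations.
    """
    number_chunks = math.ceil(total_iterations / size_divis)
    size = size_divis * number_chunks
    return number_chunks, size


print(plan_chunks(9, 3))            # (3, 9), the small example used here
print(plan_chunks(235_000, 1000))   # (235, 235000), the 64-core setting above
print(plan_chunks(10, 3))           # (4, 12), rounded up to whole chunks
```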

In [20]:
size_divis= 3
number_chunks = 3

size= size_divis * number_chunks



This is the function that will be run in parallel on several cores. 

It needs the decorator "@ray.remote" right above the function definition.



In [7]:
@ray.remote   # MAGIC decoration
def parallel_MC_backgound(constant_inputs):
    
    [list_act_fu,
     list_meth,
     C_matrixes,
     size_divis] = constant_inputs
    
        
    
    
    bw.projects.set_current('Eco38_4')  # This is ugly but we need to re-load brightway on each core. 

    
    
    # Loading Ecoinvent
    Ecoinvent = bw.Database('ecoinvent 3.8 conseq') # Same




    biosphere=bw.Database('biosphere3')


    list_array_mc_sample=[np.array([[0]*len(list_act_fu) ]*size_divis,dtype="float32") for meth in range(len(list_meth))]  # One zero-filled array (iterations x FUs) per method, filled in below

    
    mc=bw.MonteCarloLCA(list_act_fu[0],('ReCiPe Midpoint (I)', 'climate change', 'GWP20'))
    
    #print("done_initializing")

    for it in range(size_divis):  # This is exactly the same code as in the non-parallel Paired Background MC script
        
        #print("iteration",it)
        next(mc)
        #print("ok2",it)
    
        for i in range(0,len(list_act_fu)):
            
            mc.redo_lcia(list_act_fu[i])  # redo with new FU

            index_array_method=-1

            #print(i)
            for m in list_meth:

                #print("ok3",m)

                index_array_method+=1



                list_array_mc_sample[index_array_method][it,i]=(C_matrixes[m]*mc.inventory).sum()



    # Just reordering the results
    
    list_array_total_mc_sorted =[[[0 for a in range(list_array_mc_sample[0].shape[1])] for meth in list_meth] for it in range(size_divis)]


    for it in range(size_divis):
        row_to_add = [meth[it] for meth in list_array_mc_sample]
        #print(row_to_add)
        #print(row_to_add[0])
        #print(len(row_to_add[]))
        #print(len(row_to_add))

        list_array_total_mc_sorted[it] = row_to_add

    return list_array_total_mc_sorted   
    




The next cell runs the MC iterations in parallel.

In [50]:
ray.shutdown() 

ray.init()



constant_inputs = ray.put([list_act_fu,  # We need to feed ray with the inputs that are common to all iterations. ray.put stores them in a space shared across all the cores working together
                           list_meth,
                           C_matrixes,
                           size_divis])
time3=time()

ray_results = ray.get([parallel_MC_backgound.remote(constant_inputs) for i in range(number_chunks)]) # This distributes and launches the work.
list_array_total_mc_sorted = functools.reduce(operator.iconcat, ray_results, []) # concatenate the chunks into one list of iterations

time4=time()

print("timetot_parallel",time4-time3)


timetot_parallel 46.11112689971924


Note that this works because every time a core starts doing MC iterations within Brightway, it starts with a different seed. This is annoying in most cases, but here it is useful: we know that the different cores are sampling different values, so we can then recombine them into one total sample.
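
The reasoning can be illustrated outside Brightway. The sketch below uses NumPy's `SeedSequence.spawn` to give each chunk its own random stream. This is an assumption made for illustration only, not how Brightway seeds its workers, and `sample_chunk` is a hypothetical stand-in for one worker. With independent streams the chunks contain different draws and can be concatenated into one sample; with a shared seed every chunk would repeat the same values.

```python
import numpy as np

size_divis = 3
number_chunks = 3


def sample_chunk(seed, n):
    """Hypothetical stand-in for one worker: draw n random values."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=n)


# One independent child seed per chunk.
child_seeds = np.random.SeedSequence(12345).spawn(number_chunks)
independent = [sample_chunk(s, size_divis) for s in child_seeds]

# Counter-example: every chunk uses the same seed.
same_seed = [sample_chunk(12345, size_divis) for _ in range(number_chunks)]

# Independent chunks can be recombined into one total sample.
total_sample = np.concatenate(independent)
print(total_sample.shape)

# Same-seed chunks are identical copies; independent chunks are not.
print(np.array_equal(same_seed[0], same_seed[1]))
print(np.array_equal(independent[0], independent[1]))
```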

In [58]:
len(list_array_total_mc_sorted) # 9 iterations 

9

In [59]:
len(list_array_total_mc_sorted[0]) #4 Impact categories

4

In [60]:
len(list_array_total_mc_sorted[0][0]) #5 FU

5

The result is presented here as a list of MC iterations, each containing a list of impact categories, each containing a list of FUs. This may not be the most convenient layout for your use, but you can play around to reorganise it as you want.
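
One possible way to reorganise it, sketched with dummy numbers: the list `nested` below is a hypothetical stand-in with the same nesting as `list_array_total_mc_sorted` (iteration, then impact category, then FU). Converting it to a single NumPy array makes it easy to reorder the axes or to slice out one impact category.

```python
import numpy as np

n_iterations, n_methods, n_fus = 9, 4, 5

# Dummy stand-in with the same nesting: [iteration][method][FU].
# The value encodes its own position so the reordering can be checked.
nested = [[[float(100 * it + 10 * m + fu) for fu in range(n_fus)]
           for m in range(n_methods)]
          for it in range(n_iterations)]

# One array with axes (iteration, method, FU).
results = np.asarray(nested)
print(results.shape)

# Reorder to (method, iteration, FU): one iterations-by-FUs table per method.
per_method = results.transpose(1, 0, 2)
first_method_table = per_method[0]
print(first_method_table.shape)

# Example summary: mean score over all iterations, per method and FU.
mean_per_method_fu = results.mean(axis=0)
print(mean_per_method_fu.shape)
```

Each `per_method[k]` table can then be wrapped in a pandas DataFrame or written to CSV, depending on what you need downstream.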