# Example 3 Batch equilibrium calculation with TC-Python and Open MPI

## Define problem
We want to run a batch equilibrium calculation, this time spread across multiple cores or nodes.
This example reuses the batch equilibrium example and parallelizes it with Open MPI.
The Open MPI library can be installed from conda-forge:
conda install -c conda-forge openmpi=4.1.2

We also use the mpi4py library to run the Python code with Open MPI:
conda install -c conda-forge mpi4py

As before, we use TC-Python to find the equilibrium phases and the volume fraction of each phase.
The system of interest is Fe-Cr-C under various NTP conditions (composition, temperature, and pressure).
For flexibility, the conditions are read from a csv file.
We use the TCFE11 database for these calculations.
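
As a rough illustration of the csv layout, the sketch below builds a small table in memory. The column names (Fe, Cr, C, T, P) and all numbers are placeholders invented for this example; the only assumption taken from the main script further down is that the element columns come first and the last two columns hold temperature and pressure. The units must match whatever Batch_equilib passes on to TC-Python.

```python
import csv
import io

# Hypothetical column layout: element columns first, then temperature and pressure.
header = ["Fe", "Cr", "C", "T", "P"]

# Placeholder rows, invented for illustration only (not real input data).
rows = [
    [0.87, 0.12, 0.01, 1000.0, 100000.0],
    [0.84, 0.15, 0.01, 1100.0, 100000.0],
]

# Write the table as csv text; replacing the buffer with open(..., "w", newline="")
# would produce a file on disk instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()

# Same slicing idea as the main script: everything except the last two columns
# is treated as an element name.
elements = header[:-2]
print(csv_text)
print(elements)
```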

## 1. Import MPI and prepare the Batch_equilib function
Here, we import MPIPoolExecutor from mpi4py.futures, which allows us to distribute tasks over multiple processes.
Other libraries, i.e., numpy, pandas, and tc_python, need to be imported as well.

Note that we are using the Batch_equilib function that we developed in the second example.

In [None]:
import numpy as np, pandas as pd
from tc_python import *
from mpi4py.futures import MPIPoolExecutor

def Batch_equilib(DB,Elements,NTP):
    '''
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Description
    ===========
    This function calculates a batch equilibrium based on given conditions. The function is written based on the single point calculation.


    Revisions
    =========

     Date            Programmer      Description of change
     ----            ----------      ---------------------
     05/16/2023      S. KWON         Original code


    Variables
    =========

    Arguments
    DB:       (string)            A database name of ThermoCalc e.g., TCFE11
    Elements: (list of strings)   A list of system elements e.g., ['Fe','C']
    NTP:      (array of floats)   An array of condition with the sequence of compositions (wt.), Temperature (Celsius), Pressure (bar), e.g., [0.8,0.2,1500,1]

    Returns
    output_all: (list)            A list of dictionary that contains equilibrium information i.e., stable phases and volume fractions
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    '''
    # Allocate variables
    comp = NTP[:,:-2]
    T = NTP[:,-2]
    P = NTP[:,-1]
    l = len(NTP[:,0])
    
    # TC-Python calculation
    with TCPython() as sess:
        # Init TC python: load database, set system elements
        TCcalc = (
            sess
            .select_database_and_elements(DB, Elements)     #Define databases and the system of interest
            .select_phase('FCC_L12')                        #Specify phases that must be included
            .select_phase('BCC_B2')
            .get_system()
            .with_single_equilibrium_calculation()
            .enable_global_minimization()                   #We enable global minimization
        )

        # Loop through conditions
        output_all=[]   #Initialize the output variable
        for j in range(l):
            output={}
            try:            
                TCcalc.set_condition(ThermodynamicQuantity.temperature(), T[j])
                TCcalc.set_condition(ThermodynamicQuantity.pressure(), P[j])
                
                #When the composition is given in wt fraction
                wf = ThermodynamicQuantity.mass_fraction_of_a_component
                
                # Loop through compositions (the first element is the balance, so it is skipped)
                for i in range(1,len(Elements)):
                    TCcalc.set_condition(wf(Elements[i]), comp[j,i])
                
                result = TCcalc.calculate()
                Phases=result.get_stable_phases()
                

                for phase in Phases:
                    output[phase]=result.get_value_of("VPV({})".format(phase))
            
            #Error handling
            except CalculationException:
                # Sometimes TC-Python fails to reach global equilibrium, which would abort the whole calculation.
                # We catch the exception and flag the failed condition instead
                output['Error']=0
            output_all.append(output)
    return output_all



## 2. Divide into subranges

Before we run the multiprocess calculation, we first need to decide how many calculations are allocated to each processor.
For example, to run 100 calculations on 5 processors, we can allocate 20 calculations to each processor.
Once the subranges are determined, we prepare the arguments (variables) for each batch of calculations.

In [None]:
def determine_subranges(Database,elements,NTP,nPerBatch):
    """ 
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------

    Description
    ===========
    Break a full range into smaller sets of ranges.
    
    Revisions
    =========

     Date            Programmer      Description of change
     ----            ----------      ---------------------
     05/18/2023      S. KWON         Original code


    Variables
    =========

    Arguments
    Database: (string)            A database name of ThermoCalc e.g., TCFE11
    elements: (list of strings)   A list of system elements e.g., ['Fe','C']
    NTP:      (array of floats)   An array of condition with the sequence of compositions (wt.), Temperature (Celsius), Pressure (bar), e.g., [0.8,0.2,1500,1]
    nPerBatch: (integer)          The size of a subrange

    Returns
    res: (list)                   A list of arguments for the Batch_equilib function
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    """
    res = [] #Allocate output
    n,m=NTP.shape
    fullrange=[0,n]

    for i in range(fullrange[0], fullrange[1], nPerBatch):
        res.append( [Database,elements,NTP[i:min(i+nPerBatch, fullrange[1]),:]] ) 
    return res
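
To check the arithmetic of the split, here is a minimal sketch that works on index ranges only. The helper split_indices is hypothetical and not part of the notebook; it mirrors the range/min logic of determine_subranges without needing a database or a condition array. The number of batches is the total count divided by the batch size, rounded up, and the last batch is shorter when the division is not exact.

```python
import math

def split_indices(n_total, n_per_batch):
    """Hypothetical helper: return (start, stop) pairs covering range(n_total) in chunks."""
    return [
        (start, min(start + n_per_batch, n_total))
        for start in range(0, n_total, n_per_batch)
    ]

# 100 calculations in batches of 20 give 5 equal batches.
even = split_indices(100, 20)
print(len(even))           # 5
print(even[0], even[-1])   # (0, 20) (80, 100)

# 105 calculations in batches of 20 give 6 batches; the last one holds 5 rows.
uneven = split_indices(105, 20)
print(len(uneven) == math.ceil(105 / 20))  # True
print(uneven[-1])                          # (100, 105)
```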

## 3. Main script
Now, we prepare a main script that will be executed through mpirun.
Note that mpirun starts several Python processes from this script, so we guard the main block with if __name__ == '__main__' to make sure it is executed only once.

In [None]:
if __name__ == '__main__': #Make sure the script is executed only once
    # First define the number of cores and the subrange for each batch
    nCores=5
    nPerBatch=20
    
    # Get whole dataset for calculations
    database='TCFE11'                   #Define database
    df=pd.read_csv('BatchEquilib.csv')  # Read csv file into a dataframe
    elements=df.keys().to_list()[:-2]   #Define system   
    NTP=df.to_numpy()                   #Read NTP data from the dataframe
    
    # Prepare arguments for each subrange
    subsets=determine_subranges(database,elements,NTP,nPerBatch)
    
    # Conduct multiprocess calculation
    with MPIPoolExecutor() as executor:
        res = list(executor.starmap(Batch_equilib, subsets))  # Collect the output of every batch
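
Iterating over res yields one list of dictionaries per batch, so the results usually need to be flattened before they can be matched with the rows of the csv file. The sketch below shows one way to do that; batch_results is a stand-in with placeholder phase names and numbers, not real TC-Python output, and it assumes the batches come back in the order they were submitted.

```python
from itertools import chain

# Stand-in for the executor output: one list of dicts per batch.
# Phase names and values are placeholders, not TC-Python results.
batch_results = [
    [{"BCC_A2": 1.0}, {"BCC_A2": 0.9, "M23C6": 0.1}],
    [{"Error": 0}],
]

# Flatten the per-batch lists into one list with one dict per condition row.
flat = list(chain.from_iterable(batch_results))

# Collect every key that appears, then build a rectangular table,
# filling 0.0 for phases that are absent in a given row.
columns = sorted({key for row in flat for key in row})
table = [[row.get(col, 0.0) for col in columns] for row in flat]

print(columns)
for line in table:
    print(line)
```

Rows flagged with the Error key can then be filtered out or recalculated separately.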

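## 4. Command-line execution
We execute the main script through Open MPI to enable multiprocessing.
The command below launches 5 MPI processes. With python -m mpi4py.futures, one of them runs the main script and the remaining ones act as workers: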

mpirun -np 5 python -m mpi4py.futures mainscript.py
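
Before launching under mpirun, the batching logic can be dry-run on a single machine. The sketch below is not the MPI workflow itself: it swaps in ThreadPoolExecutor from the standard library, whose executor interface mpi4py.futures follows. The standard library has no starmap, so the argument lists are unpacked with zip instead. fake_batch is a hypothetical stand-in for Batch_equilib, and the condition values are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_batch(database, elements, conditions):
    """Hypothetical stand-in for Batch_equilib: returns one dict per condition row."""
    return [{"n_elements": len(elements), "row": row} for row in conditions]

# Same shape as the output of determine_subranges: [database, elements, conditions].
subsets = [
    ["TCFE11", ["Fe", "Cr", "C"], [[0.87, 0.12, 0.01, 1000.0, 100000.0]]],
    ["TCFE11", ["Fe", "Cr", "C"], [[0.84, 0.15, 0.01, 1100.0, 100000.0]]],
]

with ThreadPoolExecutor(max_workers=2) as executor:
    # Executor.map expects one iterable per argument, so zip(*subsets)
    # regroups the argument lists column-wise before unpacking them.
    results = list(executor.map(fake_batch, *zip(*subsets)))

print(len(results))
print(results[0][0]["n_elements"])
```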