# Global Uncertainty Analysis: Polynomial Chaos Expansion (PCE) for Chemical Reaction Systems


This IPython notebook uses MUQ as the basis for adaptive Polynomial Chaos Expansions (PCEs) to perform global uncertainty analysis of chemical reaction systems. It details a workflow that combines the RMG, Cantera, and MUQ codes.

In [None]:
import random

from rmgpy.tools.canteramodel import Cantera, get_rmg_species_from_user_species
from rmgpy.species import Species
from rmgpy.chemkin import load_chemkin_file
from rmgpy.tools.globaluncertainty import ReactorPCEFactory
from rmgpy.tools.uncertainty import Uncertainty

## Initial setup

This section sets up everything needed to perform the global uncertainty analysis. This includes creating an instance of the Uncertainty class, loading the model to be analyzed, and setting up the Cantera reactor simulator.

In [None]:
# Must use annotated chemkin file
chemkin_file = './data/parse_source/chem_annotated.inp'
dict_file = './data/parse_source/species_dictionary.txt'

In [None]:
# Set output directory (Note: Global uncertainty analysis doesn't actually write any output files currently)
output_directory = './temp/uncertainty'

In [None]:
# Initialize the Uncertainty class instance and load the model
uncertainty = Uncertainty(output_directory=output_directory)
uncertainty.load_model(chemkin_file, dict_file)

In [None]:
# Map the species to the objects within the Uncertainty class
ethane = Species().from_smiles('CC')
C2H4 = Species().from_smiles('C=C')
mapping = get_rmg_species_from_user_species([ethane, C2H4], uncertainty.species_list)

# Define the reaction conditions
reactor_type_list = ['IdealGasConstPressureTemperatureReactor']
mol_frac_list = [{mapping[ethane]: 1.0}]
Tlist = ([1300], 'K')
Plist = ([1], 'bar')
reaction_time_list = ([0.5], 'ms')

Global uncertainty analysis works by simulating the full model at random points within the uncertainty distributions of the input parameters. In the current implementation, the simulation is performed by Cantera, which we set up here using the RMG wrapper class.

In [None]:
# Create the cantera model
job = Cantera(species_list=uncertainty.species_list, reaction_list=uncertainty.reaction_list, output_directory=output_directory)
# Load the cantera model based on the RMG reactions and species
job.load_model()
# Generate the conditions based on the settings we declared earlier
job.generate_conditions(reactor_type_list, reaction_time_list, mol_frac_list, Tlist, Plist)

Next, load the RMG-database into the `Uncertainty` instance created earlier so that the original source of every estimated parameter in the model can be extracted.

In [None]:
uncertainty.load_database(
    thermo_libraries=['DFT_QCI_thermo', 'primaryThermoLibrary'],
    kinetics_families='default',
    reaction_libraries=[],
)
uncertainty.extract_sources_from_model()

## Part 1: Global uncertainty analysis for uncorrelated parameters

In [None]:
# Assign uncorrelated parameter uncertainties 
uncertainty.assign_parameter_uncertainties(correlated=False)

Pass the set of kinetic $(k)$ and thermo $(G)$ parameters to be propagated, together with their uncertainties $(\Delta\ln k, \Delta G)$, to the `ReactorPCEFactory` class. These parameters should typically be pre-screened with local uncertainty analysis to narrow them down to the most influential ones.

Parameter uncertainties are assigned the same way as for local uncertainty analysis and are provided directly from the `Uncertainty` instance.

Random sampling from the uncertainty distributions of the input parameters is aided by a set of uncertainty factors, $f$, calculated from the input uncertainties $(\Delta\ln k, \Delta G)$, and a set of unit random variables, $\xi$, sampled from a uniform distribution on $[-1, 1]$.

For thermochemistry,

$$f^G = G_{max} - G_0 = G_{0} - G_{min} = \sqrt{3} \Delta G$$

$$G = \xi f^G + G_{0}$$

For kinetics,

$$f^k = \log_{10} \left(\frac{k_{max}}{k_0}\right) = \log_{10} \left(\frac{k_0}{k_{min}}\right) = \frac{\sqrt{3}}{\ln 10} \Delta \ln k$$

$$k = 10^{\xi f^k} k_{0}$$

This allows calculation of a new parameter value given the nominal value, standard deviation, and the random variable.
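
As a minimal sketch of the two mappings above, the following plain-Python helpers compute the uncertainty factors and a perturbed parameter value from a nominal value, a standard deviation, and a unit random variable $\xi$. The function names and the numerical values are hypothetical and chosen only for illustration; they are not part of RMG or MUQ. The $\sqrt{3}$ appears because a uniform distribution with half-width $a$ has standard deviation $a/\sqrt{3}$.

```python
import math
import random


def thermo_uncertainty_factor(delta_g):
    """f^G = sqrt(3) * Delta G, the half-width of the uniform range for G."""
    return math.sqrt(3.0) * delta_g


def kinetics_uncertainty_factor(delta_ln_k):
    """f^k = sqrt(3) / ln(10) * Delta ln k, the half-width in log10(k)."""
    return math.sqrt(3.0) / math.log(10.0) * delta_ln_k


def perturb_g(g0, delta_g, xi):
    """G = xi * f^G + G_0, with xi in [-1, 1]."""
    return xi * thermo_uncertainty_factor(delta_g) + g0


def perturb_k(k0, delta_ln_k, xi):
    """k = 10**(xi * f^k) * k_0, with xi in [-1, 1]."""
    return 10.0 ** (xi * kinetics_uncertainty_factor(delta_ln_k)) * k0


# Illustrative (made-up) nominal values and standard deviations
g0, delta_g = -20.0, 1.5
k0, delta_ln_k = 1.0e6, 0.5

xi = random.uniform(-1.0, 1.0)  # one unit random variable per parameter
g_sample = perturb_g(g0, delta_g, xi)
k_sample = perturb_k(k0, delta_ln_k, xi)
```

Setting $\xi = 0$ recovers the nominal values, and $\xi = \pm 1$ gives the bounds $G_{max}$, $G_{min}$, $k_{max}$, and $k_{min}$.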

The MIT Uncertainty Quantification Library (MUQ) is used to perform the random sampling and construct a Polynomial Chaos Expansion (PCE) to fit the output variable of interest, mole fractions.
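
For intuition about what a PCE is, here is a one-dimensional sketch in plain Python. It is not MUQ's adaptive algorithm. It assumes a single input $\xi$ uniform on $[-1, 1]$, uses Legendre polynomials as the basis, and projects a toy model (a rate multiplier of the form $10^{\xi f}$ with a made-up $f = 0.3$) onto that basis with a simple midpoint quadrature. All function names are hypothetical.

```python
import math


def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) with the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for m in range(1, n):
        p_prev, p = p, ((2 * m + 1) * x * p - m * p_prev) / (m + 1)
    return p


def pce_coefficients(model, order, n_quad=2000):
    """Project model(xi) onto P_0..P_order for xi uniform on [-1, 1]."""
    h = 2.0 / n_quad
    nodes = [-1.0 + (i + 0.5) * h for i in range(n_quad)]  # midpoint rule
    coeffs = []
    for n in range(order + 1):
        integral = sum(model(x) * legendre(n, x) for x in nodes) * h
        coeffs.append((2 * n + 1) / 2.0 * integral)
    return coeffs


def pce_mean_variance(coeffs):
    """Mean is c_0; variance is the sum of c_n**2 / (2n + 1) for n >= 1."""
    mean = coeffs[0]
    variance = sum(c * c / (2 * n + 1) for n, c in enumerate(coeffs) if n > 0)
    return mean, variance


def toy_model(xi):
    """Toy output: a rate multiplier 10**(xi * f) with an illustrative f = 0.3."""
    return 10.0 ** (0.3 * xi)


toy_coeffs = pce_coefficients(toy_model, order=6)
toy_mean, toy_variance = pce_mean_variance(toy_coeffs)


def pce_evaluate(coeffs, xi):
    """Evaluate the surrogate at a new point without calling the model."""
    return sum(c * legendre(n, xi) for n, c in enumerate(coeffs))
```

Once the coefficients are known, the surrogate can be evaluated cheaply at any $\xi$, and its mean and variance follow from the coefficients alone. This is the property that makes a PCE attractive when each full model evaluation is a reactor simulation.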

In [None]:
# Choose input parameters to vary within their uncertainty bounds
k_params = [3, 5]  # RMG indices of reactions to vary
g_params = [1, 4]  # RMG indices of species to vary

In [None]:
# Create ReactorPCEFactory global uncertainty analysis object for the uncorrelated case
reactor_pce_factory = ReactorPCEFactory(
    cantera=job,
    output_species_list=[mapping[ethane], mapping[C2H4]],
    k_params=k_params,
    k_uncertainty=uncertainty.kinetic_input_uncertainties,
    g_params=g_params,
    g_uncertainty=uncertainty.thermo_input_uncertainties,
    correlated=False,
    logx=False,
)

Begin generating the PCEs adaptively for a specified run time.

There are three methods for generating PCEs. See the `ReactorPCEFactory.generate_pce` method for more details.

- Option 1: Adaptive for a pre-specified amount of time
- Option 2: Adaptively construct PCE to error tolerance
- Option 3: Use a fixed order, and (optionally) adapt later.

In [None]:
reactor_pce_factory.generate_pce(run_time=60)  # runtime of 60 seconds.

Let's compare the outputs for a test point using the real model versus using the PCE approximation.
Evaluate the desired output mole fractions for a set of inputs `inputs = [[ln(k)_rv], [g_rv]]`, which contains the unit uniform random variables for the uncertain kinetics and free energy parameters, respectively.

In [None]:
# Create a random test point of length = number of k_params + number of g_params
random_test_point = [random.uniform(-1.0, 1.0) for _ in range(len(k_params) + len(g_params))]
true_test_point_output, pce_test_point_output = reactor_pce_factory.compare_output(random_test_point, log=False)

Obtain the results: the species mole fraction mean and variance computed from the PCE, as well as the global sensitivity indices.

In [None]:
mean, variance, covariance, main_sens, total_sens = reactor_pce_factory.analyze_results(log=False)
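
To illustrate what main and total sensitivity indices measure, the sketch below computes them from the coefficients of a small, made-up two-input Legendre PCE with independent inputs uniform on $[-1, 1]$. It follows the standard variance-decomposition (Sobol) definitions and is not taken from the RMG or MUQ source; how `analyze_results` scales or reports its indices is not assumed here. The function name and coefficient values are hypothetical.

```python
def sobol_indices(coeffs):
    """Main and total sensitivity indices from a Legendre PCE.

    coeffs maps a multi-index (tuple of polynomial degrees, one per input)
    to the corresponding PCE coefficient.
    """
    def term_variance(multi_index, c):
        # E[P_n(xi)**2] = 1 / (2n + 1) for xi uniform on [-1, 1]
        norm = 1.0
        for degree in multi_index:
            norm /= (2 * degree + 1)
        return c * c * norm

    n_dim = len(next(iter(coeffs)))
    # Every term except the constant one contributes to the output variance
    total_variance = sum(term_variance(a, c) for a, c in coeffs.items() if any(a))

    main, total = [], []
    for d in range(n_dim):
        # Main effect: terms that depend on input d only
        main_part = sum(
            term_variance(a, c) for a, c in coeffs.items()
            if a[d] > 0 and all(a[j] == 0 for j in range(n_dim) if j != d)
        )
        # Total effect: every term that involves input d, including interactions
        total_part = sum(term_variance(a, c) for a, c in coeffs.items() if a[d] > 0)
        main.append(main_part / total_variance)
        total.append(total_part / total_variance)
    return total_variance, main, total


# Made-up coefficients: constant, two linear terms, and one interaction term
example_coeffs = {(0, 0): 1.0, (1, 0): 0.6, (0, 1): 0.3, (1, 1): 0.3}
example_variance, example_main, example_total = sobol_indices(example_coeffs)
```

In this toy case the first input dominates. The difference between a total index and the corresponding main index is the share of variance that comes from interactions with other inputs.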

## Part 2: Global uncertainty analysis for correlated parameters

In [None]:
uncertainty.assign_parameter_uncertainties(correlated=True)

In [None]:
k_params = [
    'Training H_Abstraction H(6)+ethane(1)=H2(11)+C2H5(5)',
    'Training R_Recombination CH3(4)+CH3(4)=ethane(1)',
]
g_params = [
    'Library CH4(3) ',
    'Estimation CH3(4)',
]

In [None]:
reactor_pce_factory_correlated = ReactorPCEFactory(
    cantera=job,
    output_species_list=[mapping[ethane], mapping[C2H4]],
    k_params=k_params,
    k_uncertainty=uncertainty.kinetic_input_uncertainties,
    g_params=g_params,
    g_uncertainty=uncertainty.thermo_input_uncertainties,
    correlated=True,
    logx=False,
)

Do the same analysis for the correlated `ReactorPCEFactory`.

In [None]:
reactor_pce_factory_correlated.generate_pce(run_time=60)  # runtime of 60 seconds.

In [None]:
random_test_point = [random.uniform(-1.0, 1.0) for _ in range(len(k_params) + len(g_params))]
true_test_point_output, pce_test_point_output = reactor_pce_factory_correlated.compare_output(random_test_point, log=False)

In [None]:
mean, variance, covariance, main_sens, total_sens = reactor_pce_factory_correlated.analyze_results(log=False)