In [None]:
import os

from IPython.display import display, Image

from rmgpy.tools.uncertainty import Uncertainty, process_local_results
from rmgpy.tools.canteraModel import getRMGSpeciesFromUserSpecies
from rmgpy.species import Species

# First Order Local Uncertainty Analysis for Chemical Reaction Systems

This IPython notebook performs first order local uncertainty analysis for a chemical reaction system
using an RMG-generated model.

## Step 1: Define mechanism files and simulation settings

Two examples are provided below. You should only run one of the two blocks.

In [None]:
# This is a small phenyldodecane pyrolysis model

# Must use annotated chemkin file
chemkinFile = './data/pdd_model/chem_annotated.inp'
dictFile = './data/pdd_model/species_dictionary.txt'

# Initialize the Uncertainty class instance and load the model
uncertainty = Uncertainty(outputDirectory='./temp/uncertainty')
uncertainty.loadModel(chemkinFile, dictFile)

# Map the species to the objects within the Uncertainty class
PDD = Species().fromSMILES("CCCCCCCCCCCCc1ccccc1")
C11ene=Species().fromSMILES("CCCCCCCCCC=C")
ETHBENZ=Species().fromSMILES("CCc1ccccc1")
mapping = getRMGSpeciesFromUserSpecies([PDD,C11ene,ETHBENZ], uncertainty.speciesList)

# Define the reaction conditions
initialMoleFractions = {mapping[PDD]: 1.0}
T = (623, 'K')
P = (350, 'bar')
terminationTime = (72, 'h')
sensitiveSpecies=[mapping[PDD], mapping[C11ene]]

In [None]:
# This is an even smaller ethane pyrolysis model

# Must use annotated chemkin file
chemkinFile = 'data/ethane_model/chem_annotated.inp'
dictFile = 'data/ethane_model/species_dictionary.txt'

# Initialize the Uncertainty class instance and load the model
uncertainty = Uncertainty(outputDirectory='./temp/uncertainty')
uncertainty.loadModel(chemkinFile, dictFile)

# Map the species to the objects within the Uncertainty class
ethane = Species().fromSMILES('CC')
C2H4 = Species().fromSMILES('C=C')
mapping = getRMGSpeciesFromUserSpecies([ethane, C2H4], uncertainty.speciesList)

# Define the reaction conditions
initialMoleFractions = {mapping[ethane]: 1.0}
T = (1300, 'K')
P = (1, 'bar')
terminationTime = (0.5, 'ms')
sensitiveSpecies=[mapping[ethane], mapping[C2H4]]

## Step 2: Run sensitivity analysis

Local uncertainty analysis uses the results from a first-order sensitivity analysis. This analysis is done using RMG's native solver.

In [None]:
# Perform the sensitivity analysis
uncertainty.sensitivityAnalysis(initialMoleFractions, sensitiveSpecies, T, P, terminationTime, number=5, fileformat='.png')

In [None]:
# Show the sensitivity plots
for species in sensitiveSpecies:
    print('{}: Reaction Sensitivities'.format(species))
    index = species.index
    display(Image(filename=os.path.join(uncertainty.outputDirectory,'solver','sensitivity_1_SPC_{}_reactions.png'.format(index))))

    print('{}: Thermo Sensitivities'.format(species))
    display(Image(filename=os.path.join(uncertainty.outputDirectory,'solver','sensitivity_1_SPC_{}_thermo.png'.format(index))))

## Step 3: Uncertainty assignment and propagation of uncorrelated parameters

To run local uncertainty analysis, we must first assign an uncertainty to every parameter using the `Uncertainty` class's `assignParameterUncertainties` method. Customized `ThermoParameterUncertainty` and `KineticParameterUncertainty` instances may be passed into this method if non-default constants for constructing the uncertainties are desired. The parameter sources must be extracted from the model before the uncertainties are assigned.

### Thermo Uncertainty

Each species is assigned a uniform uncertainty distribution in free energy:

$$G \in [G_{min},G_{max}]$$

We will propagate the standard deviation in free energy, which for a uniform distribution is defined as follows:

$$\Delta G = \frac{1}{\sqrt{12}}(G_{max} - G_{min})$$
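
As a quick numerical check of the expression above, the following sketch computes the standard deviation of a uniform distribution from its bounds. The function name `uniform_std` and the example bounds are hypothetical and are not part of RMG.

```python
import math


def uniform_std(lower, upper):
    """Standard deviation of a uniform distribution on [lower, upper]."""
    return (upper - lower) / math.sqrt(12.0)


# Hypothetical bounds: G lies within +/- 1.5 (in any energy unit) of its nominal value
delta_G = uniform_std(-1.5, 1.5)
print(delta_G)
```

The same function applies unchanged to bounds on $\ln k$, since only the width of the interval matters.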

Several parameters are used to formulate $\Delta G$.  These are $\Delta G_\mathrm{library}$, $\Delta G_\mathrm{QM}$, $\Delta G_\mathrm{GAV}$, and $\Delta G_\mathrm{group}$.
        
$$\Delta G = \delta_\mathrm{library} \Delta G_\mathrm{library} + \delta_\mathrm{QM} \Delta G_\mathrm{QM} + \delta_\mathrm{GAV} \left( \Delta G_\mathrm{GAV} + \sum_{\mathrm{group}\; j} d_{j} \Delta G_{\mathrm{group},j} \right)$$

where $\delta$ is the Kronecker delta function, which equals one if the species thermochemistry parameter contains the particular source type and zero otherwise, and $d_{j}$ is the degeneracy (number of appearances) of the thermo group used to construct the species thermochemistry in the group additivity method.
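
The sum above can be sketched in a few lines of Python. This is only an illustration of the formula as written here, not RMG's implementation; the function name, the argument layout, the group labels, and all numerical values are assumptions made for the example.

```python
def thermo_uncertainty(has_library, has_qm, gav_groups,
                       dG_library, dG_qm, dG_gav, dG_group):
    """Evaluate the Delta G expression for one species.

    has_library, has_qm: booleans playing the role of the Kronecker deltas.
    gav_groups: dict mapping group label -> degeneracy d_j (empty if GAV is unused).
    dG_group: dict mapping group label -> Delta G_group,j.
    """
    total = 0.0
    if has_library:
        total += dG_library
    if has_qm:
        total += dG_qm
    if gav_groups:
        # Residual GAV error plus each group's error weighted by its degeneracy
        total += dG_gav + sum(d * dG_group[g] for g, d in gav_groups.items())
    return total


# Hypothetical species estimated purely by group additivity
groups = {'group_A': 2, 'group_B': 1}
group_errors = {'group_A': 0.1, 'group_B': 0.1}
dG = thermo_uncertainty(False, False, groups,
                        dG_library=1.5, dG_qm=3.0, dG_gav=1.5,
                        dG_group=group_errors)
print(dG)
```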

### Kinetics Uncertainty

Each reaction is assigned a uniform uncertainty distribution in the overall $\ln k$, or $\ln A$:

$$\ln k \in [\ln(k_{min}),\ln(k_{max})]$$

Again, we use the standard deviation of this distribution:

$$\Delta \ln(k) = \frac{1}{\sqrt{12}}(\ln k_{max} - \ln k_{min})$$

The parameters used to formulate $\Delta  \ln k$ are $\Delta \ln k_\mathrm{library}$, $\Delta \ln k_\mathrm{training}$, $\Delta \ln k_\mathrm{pdep}$, $\Delta \ln k_\mathrm{family}$, $\Delta \ln k_\mathrm{non-exact}$, and $\Delta \ln k_\mathrm{rule}$.

For library, training, and pdep reactions, the kinetic uncertainty is assigned according to their uncertainty type.  For kinetics estimated using RMG's rate rules, the following formula is used to calculate the uncertainty:

$$\Delta \ln k_\mathrm{rate\; rules} = \Delta\ln k_\mathrm{family} + \log_{10}(N+1) \left(\Delta\ln k_\mathrm{non-exact}\right)  + \sum_{\mathrm{rule}\; i} w_i \Delta \ln k_{\mathrm{rule},i}$$

where $N$ is the total number of rate rules used and $w_{i}$ is the weight of the rate rule in the averaging scheme for that kinetics estimate.
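
A minimal sketch of the rate rule formula above follows. It evaluates the expression exactly as written in this notebook and is not taken from RMG's source; the function name, the rule labels, and the numerical values are hypothetical.

```python
import math


def rate_rule_uncertainty(rule_weights, dlnk_rule, dlnk_family, dlnk_nonexact):
    """Evaluate Delta ln k for a rate-rule estimate.

    rule_weights: dict mapping rule label -> weight w_i in the averaging scheme.
    dlnk_rule: dict mapping rule label -> Delta ln k_rule,i.
    """
    n_rules = len(rule_weights)  # N in the formula
    weighted = sum(w * dlnk_rule[r] for r, w in rule_weights.items())
    return dlnk_family + math.log10(n_rules + 1) * dlnk_nonexact + weighted


# Hypothetical estimate averaged over nine equally weighted rules,
# so that log10(N + 1) = 1
labels = ['rule_{}'.format(i) for i in range(9)]
weights = dict((label, 1.0 / 9) for label in labels)
rule_errors = dict((label, 0.5) for label in labels)
dlnk = rate_rule_uncertainty(weights, rule_errors,
                             dlnk_family=1.0, dlnk_nonexact=3.5)
print(dlnk)
```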

In [None]:
# NOTE: You must load the database with the same settings which were used to generate the model.
#       This includes any thermo or kinetics libraries which were used.
uncertainty.loadDatabase(
    thermoLibraries=['DFT_QCI_thermo', 'primaryThermoLibrary'],
    kineticsFamilies='default',
    reactionLibraries=[],
)

In [None]:
uncertainty.extractSourcesFromModel()
uncertainty.assignParameterUncertainties()

The first order local uncertainty, or variance $(\Delta \ln c_i)^2$, for the concentration of species $i$ is defined as:

$$(\Delta \ln c_i)^2 = \sum_{\mathrm{reactions}\; m} \left(\frac{\partial\ln c_i}{\partial\ln k_m}\right)^2 (\Delta \ln k_m)^2  + \sum_{\mathrm{species}\; n} \left(\frac{\partial\ln c_i}{\partial G_n}\right)^2(\Delta G_n)^2$$

We have previously performed the sensitivity analysis.  Now we perform the local uncertainty analysis, applying the formula above with the assigned parameter uncertainties, and plot the results.  This first analysis treats every parameter as independent.  In other words, even when multiple species thermochemistries depend on a single thermo group, or multiple reaction rate coefficients depend on a particular rate rule, each value is considered independent of the others.  This typically yields a much larger uncertainty than the true value, because errors from shared sources that would otherwise cancel are counted separately.
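
To make the uncorrelated formula concrete, here is a small standalone sketch that sums the squared products of sensitivities and parameter uncertainties. It does not call RMG; the function name and all numbers are hypothetical.

```python
def local_variance(sens_k, dlnk, sens_G, dG):
    """First order variance of ln c_i, treating all parameters as independent.

    sens_k[m] = d ln c_i / d ln k_m,  dlnk[m] = Delta ln k_m
    sens_G[n] = d ln c_i / d G_n,     dG[n]   = Delta G_n
    """
    var_kinetics = sum((s * d) ** 2 for s, d in zip(sens_k, dlnk))
    var_thermo = sum((s * d) ** 2 for s, d in zip(sens_G, dG))
    return var_kinetics + var_thermo


# Hypothetical system with two reactions and one species
variance = local_variance(sens_k=[0.5, -0.2], dlnk=[1.0, 2.0],
                          sens_G=[0.1], dG=[1.5])
print(variance)
```

Because every term is squared, the sign of a sensitivity has no effect here, which is why shared error sources cannot cancel in this treatment.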

In [None]:
result = uncertainty.localAnalysis(sensitiveSpecies, correlated=False, number=5, fileformat='.png')
print(process_local_results(result, sensitiveSpecies, number=5)[1])

In [None]:
# Show the uncertainty plots
for species in sensitiveSpecies:
    print('{}: Thermo Uncertainty Contributions'.format(species))
    display(Image(filename=os.path.join(uncertainty.outputDirectory, 'uncorrelated', 'thermoLocalUncertainty_{}.png'.format(species.toChemkin()))))

    print('{}: Reaction Uncertainty Contributions'.format(species))
    display(Image(filename=os.path.join(uncertainty.outputDirectory, 'uncorrelated', 'kineticsLocalUncertainty_{}.png'.format(species.toChemkin()))))

## Step 4: Uncertainty assignment and propagation of correlated parameters

A more accurate picture of the uncertainty in a mechanism estimated using groups and rate rules requires accounting for the correlated errors that result from using the same groups in multiple parameters.  This requires us to track the original sources, the groups and the rate rules, that constitute each parameter.  These errors may cancel in the final uncertainty calculation.  Note, however, that the errors stemming from the estimation method itself do not cancel.

For thermochemistry, the error terms described previously are $\Delta G_\mathrm{library}$, $\Delta G_\mathrm{QM}$, $\Delta G_\mathrm{GAV}$, and $\Delta G_\mathrm{group}$.  Of these, $\Delta G_\mathrm{GAV}$ is an uncorrelated residual error, whereas the other terms are correlated. The set of correlated and uncorrelated parameters can be thought of instead as a set of independent parameters, $\Delta G_{ind,w}$.

For kinetics, the error terms described previously are $\Delta \ln k_\mathrm{library}$, $\Delta \ln k_\mathrm{training}$, $\Delta \ln k_\mathrm{pdep}$, $\Delta \ln k_\mathrm{family}$, $\Delta \ln k_\mathrm{non-exact}$, and $\Delta \ln k_\mathrm{rule}$.  Of these, $\Delta \ln k_\mathrm{family}$ and $\Delta \ln k_\mathrm{non-exact}$ are uncorrelated error terms resulting from the method of estimation.  Again, we consider the set of correlated and uncorrelated parameters as the set of independent parameters, $\Delta\ln k_{ind,v}$.

The first order local uncertainty, or variance $(\Delta \ln c_i)^2$, for the concentration of species $i$ becomes:

$$(\Delta \ln c_i)^2 = \sum_v \left(\frac{\partial\ln c_i}{\partial\ln k_{ind,v}}\right)^2 \left(\Delta\ln k_{ind,v}\right)^2 + \sum_w \left(\frac{\partial\ln c_i}{\partial G_{ind,w}}\right)^2 \left(\Delta G_{ind,w}\right)^2$$

where the differential terms can be computed as:

$$\frac{\partial\ln c_i}{\partial\ln k_{ind,v}} = \sum_m \frac{\partial\ln c_i}{\partial\ln k_m} \frac{\partial\ln k_m}{\partial\ln k_{ind,v}}$$

$$\frac{\partial\ln c_i}{\partial G_{ind,w}} = \sum_n \frac{\partial\ln c_i}{\partial G_n} \frac{\partial G_n}{\partial G_{ind,w}}$$
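
The chain rule above can be illustrated with a small standalone sketch. It first maps the sensitivities onto the independent parameters and then squares them, so contributions from a shared source are added with their signs before squaring. The function name, the nested-list Jacobian layout, and the numbers are assumptions for illustration and do not reflect RMG's internal data structures.

```python
def correlated_variance(sens, jacobian, d_ind):
    """First order variance of ln c_i in terms of independent parameters.

    sens[m]        = d ln c_i / d p_m          (sensitivity to model parameter m)
    jacobian[m][v] = d p_m / d p_ind,v         (dependence on independent parameter v)
    d_ind[v]       = uncertainty of independent parameter v
    """
    variance = 0.0
    for v, delta in enumerate(d_ind):
        # Chain rule: sum over all model parameters that depend on source v
        s_ind = sum(sens[m] * jacobian[m][v] for m in range(len(sens)))
        variance += (s_ind * delta) ** 2
    return variance


# Hypothetical case: two species thermo values built from one shared group,
# with equal and opposite sensitivities
sens = [1.0, -1.0]
shared = correlated_variance(sens, [[1.0], [1.0]], [0.5])
# Same sensitivities, but each value treated as its own independent source
separate = correlated_variance(sens, [[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
print(shared)    # 0.0: the shared error cancels
print(separate)  # 0.5: no cancellation when treated as independent
```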


In [None]:
uncertainty.assignParameterUncertainties(correlated=True)
result = uncertainty.localAnalysis(sensitiveSpecies, correlated=True, number=10, fileformat='.png')
print(process_local_results(result, sensitiveSpecies, number=5)[1])

In [None]:
# Show the uncertainty plots
for species in sensitiveSpecies:
    print('{}: Thermo Uncertainty Contributions'.format(species))
    display(Image(filename=os.path.join(uncertainty.outputDirectory, 'correlated', 'thermoLocalUncertainty_{}.png'.format(species.toChemkin()))))

    print('{}: Reaction Uncertainty Contributions'.format(species))
    display(Image(filename=os.path.join(uncertainty.outputDirectory, 'correlated', 'kineticsLocalUncertainty_{}.png'.format(species.toChemkin()))))