[Kinetics] Change error handling about invalid reactions #903

12Chao · 2020-07-12T17:21:30Z

Changes proposed in this pull request
Current version of Cantera only throw CanteraError for one invalid Plog reaction and its invalid reaction condition(temperature, pressure) when loading input file by ct.Solution(); however, multiple Plog reactions with invalid Arrhenius parameters may exist in the input file. This pull request helps raise CanteraError for all invalid Plog reactions in the input file and their invalid conditions.
For example, there are 2 Plog reactions shown below with invalid Arrhenius parameters:

- equation: CH2CHOO <=> CO + CH3O  
  type: pressure-dependent-Arrhenius
  rate-constants:
  - {P: 0.316 atm, A: -1.8e+33, b: -7.27, Ea: 3.376e+04}
  - {P: 0.316 atm, A: 8.69e-50, b: 16.63, Ea: -3900.0}
  - {P: 1.0 atm, A: -3.83e+33, b: -7.2, Ea: 3.51e+04}
  - {P: 1.0 atm, A: 1.19e-39, b: 13.61, Ea: -1317.0}

- equation: CH2CHOO <=> CO2 + CH3 
  type: pressure-dependent-Arrhenius
  rate-constants:
  - {P: 100.0 atm, A: -2510.0, b: 1.41, Ea: 1.442e+04} 
  - {P: 100.0 atm, A: 4.05e-09, b: 5.14, Ea: 1.048e+04}

Current version of Cantera only throw the error message for the first reactions at first invalid presssure point(0.316 atm) and first invalid temperature point(200K, defined by Cantera) as shown below.

***********************************************************************
CanteraError thrown by Plog::validate:
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32018.699999999983, T = 500.0
***********************************************************************

This pull request will change Cantera to raise error showing, in this case, first reaction is invalid at (0.316 atm, 500K; 0.316 atm, 1000K; 1 atm 500K, 1atm, 1000K); second reaction is invalid at 100 atm, 500K.

***********************************************************************
CanteraError thrown by Plog::validate:




Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32018.699999999983 , T = 500.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32018.699999999983 , T = 1000.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 101324.99999999999 , T = 500.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 101324.99999999999 , T = 1000.0




Invalid rate coefficient for reaction 'CH2CHOO <=> CH3 + CO2'
at P = 10132499.999999985 , T = 500.0

***********************************************************************

If applicable, fill in the issue number this pull request is fixing

Related to enhancement proposal

Checklist

There is a clear use-case for this code change
The commit message has a short title & references relevant issues
Build passes (scons build & scons test) and unit tests address code coverage
The pull request is ready for review

bryanwweber

Thanks @12Chao I think this is a really good start. As I mentioned in one of the comments, I wonder if it makes sense to process all the reactions and collect all the errors to report them all at once (duplicates, errors with Chebychev coefficients, etc.). That would be a pretty significant expansion of the scope of this PR though, so I don't want to put that on you until we decide whether it makes sense to go for that. Regardless, there are a few comments that you can fix up here that will help the final outcome anyways, plus a couple of questions.

bryanwweber · 2020-07-15T15:42:06Z

include/cantera/kinetics/RxnRates.h

-
-        return std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_);
+        if (logP_ == logP1_) {
+            return std::exp(log_k1);
+        } else {
+            return std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_);
+        }


Is this change necessary? Since logP_ and logP1_ are both double numbers, the chance that they are identical is very small. The previous line accounted for this by taking the difference, so that if it was almost zero, the offset from log_k1 would be also nearly zero.

Yes, I believe it is necessary(or at least there should be applied some change). I think in the validation case, since the input value of update_C is taken from pressures_ in Plog::validate, so the lower reference pressure log_P1 should be the exact same value with logP_ because the logarithm of lower reference pressure(log_P1) is assigned from pressures_ as well. I add if here because I want to differentiate the validation case from the normal case(case that input pressure does not equal to any refference pressure). std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_) involves log_k2 at higher reference pressure and log_k1 at lower reference pressure, in validation case, every reference pressure point will be called twice as higher pressure point first and lower pressure point when the input changes to next reference pressure point. In this case, if the rate constant at a reference pressure point is invalid, the error will be raised twice as it involves in formulastd::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_) twice. For example, in reaction

- equation: CH2CHOO <=> CO + CH3O type: pressure-dependent-Arrhenius rate-constants: - {P: 1.0 atm, A: 1.8e+33, b: -7.27, Ea: 3.376e+04} - {P: 1.0 atm, A: 8.69e-50, b: 16.63, Ea: -3900.0} - {P: 10.0 atm, A: -3.83e+33, b: -7.2, Ea: 3.51e+04} - {P: 10.0 atm, A: 1.19e-39, b: 13.61, Ea: -1317.0}

reaction rate constant at P = 1.0 atm is valid (positive), when validate the pressure point P = 1.0 atm, log_k1 at lower reference pressure 1.0 atm is valid and it should pass the validation, but the log_k2 at higher reference pressure 10.0 atm is invalid, so the formula std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_) will return NaN , which should actually be log_k1 because logP_ - logP1_ equal to zero in the validation case. In the original code, it is not a problem because is throws the next pressure point 10 atm in the error message and stops. However, to check all the reference pressure point in validation case, It is not necessary to involve log_k2 when calculating log_k1 .

Rather than changing this, which introduces an extra branch point every time these rates are evaluated, I would suggest replacing the call to updateRC in the validate method and just calculating the value of k directly at that point, which is just

double k = 0; for (size_t i = ilow1_; i < ilow2_; i++) { k += rates_[i].updateRC(logT, recipT); }

And then all you have to do is check that k is positive.

bryanwweber · 2020-07-15T15:44:26Z

interfaces/cython/cantera/test/test_kinetics.py

+        try:
+            ct.Solution('pdep_err_test.yaml')
+        except ct.CanteraError as e:
+            self.assertEqual(str(e).find(err_msg), 116)


This check finds the index that starts the error message, right? That seems a little bit fragile to me, in the sense that totally unrelated changes might change the index where this error message starts and would cause this test to fail. I don't think you want to use self.assertRaisesRegex() here, so I agree with the general structure. Can we try to find another way to check the error message? For instance, can we split the error message by newlines and assert that all of the lines in err_msg are present in the raised error?

bryanwweber · 2020-07-15T16:10:49Z

src/kinetics/KineticsFactory.cpp

@@ -94,6 +96,7 @@ unique_ptr<Kinetics> newKinetics(std::vector<ThermoPhase*>& phases,

 void addReactions(Kinetics& kin, const AnyMap& phaseNode, const AnyMap& rootNode)
 {
+    fmt::memory_buffer err_Plog_reactions;


I think it would be better if this were defined closer to the first place its used.

bryanwweber · 2020-07-15T16:14:15Z

src/kinetics/KineticsFactory.cpp

+                    kin.addReaction(newReaction(R, kin));
+                }
+                catch(CanteraError& err){
+                    format_to(err_Plog_reactions, "{}", err.what());


It seems like this will catch any CanteraErrors that are generated by trying to add the reaction, not just errors related to Plog reactions. In that case, the name of the variable seems to be a bit of a misnomer. If you only want to capture the Plog reactions here, then I think you need some of the same string processing as you've got in the next section.

To be clear though, I think it might be a good thing to catch all the errors and report them all at once back to the user, not just the errors related to adding PLOG reactions. I'd like to try it out and see how that works from the user side though, first. I also realize that might be expanding the scope of this change quite a bit, so this is more of a "thinking-out-loud" paragraph 😄

bryanwweber · 2020-07-15T16:17:11Z

src/kinetics/KineticsFactory.cpp

+                    } else {
+                        std::regex re("(\\w+)::(\\w+)");
+                        std::smatch match;
+                        regex_search(captured_err, match, re);
+                        std::string procedure = match[0];
+                        std::string msg = captured_err.substr(25+procedure.length(), -1);
+                        throw CanteraError(procedure,
+                            "{}", msg);


Why is this processing necessary? What are you erasing here that wasn't here before? Processing based on character lengths is quite fragile, as I mentioned earlier, since changes in other places can totally break this without any warning.

Here I am trying to raise Plog error while not affecting raising other CanteraError, so I distinguish the Plog::validation error and other CanteraError, the first erase line(line189) is for erasing all the '*' sign automatically added in each CanteraError. captured_err.erase(0, 40) erases string CanteraError thrown by Plog::validation. If the captured CanteraError is not a Plog::validation error, it uses regex to extract the procedure(the function that raise the error) and message in CanteraError and use throw CanteraError to raise it again. I forgot to do the same in line 178. In this case, as long as we do not change the CanteraError format like CanteraError thrown by (\\w+)::(\\w+) and Plog::validation in Plog error message, it should be ok. However, I agree it is fragile to process string based on character length or regex, and I would like to take any suggestion about how to improve this part. I think maybe a function to separate CanteraError into predure and message would be good?

bryanwweber · 2020-07-15T16:30:44Z

test/data/pdep_err_test.yaml

@@ -0,0 +1,263 @@
+description: |-


Can you use the absolute minimum file that produces the condition you want to test, including removing all the comments that aren't related to the test?

bryanwweber · 2020-07-15T16:31:52Z

src/kinetics/RxnRates.cpp

-                    equation, std::exp((++iter)->first), T[i]);
-            }
-        }
+                 // expression is at the higher of the adjacent pressures.


This line is duplicated

bryanwweber · 2020-07-15T16:32:57Z

src/kinetics/KineticsFactory.cpp

+        std::string Plog_err = to_string(err_Plog_reactions);
+        throw CanteraError("Plog::validate",
+                        "\n{}", Plog_err);


Suggested change

std::string Plog_err = to_string(err_Plog_reactions);

throw CanteraError("Plog::validate",

"\n{}", Plog_err);

throw CanteraError("Plog::validate",

"\n{}", to_string(err_Plog_reactions));

bryanwweber · 2020-07-15T16:33:29Z

src/kinetics/RxnRates.cpp

+     }
+    if (!(to_string(err_reactions).empty())){
+        throw CanteraError("Plog::validate",
+                        "\n{}", to_string(err_reactions));


Suggested change

"\n{}", to_string(err_reactions));

"\n{}", to_string(err_reactions));

bryanwweber · 2020-07-15T16:34:02Z

src/kinetics/RxnRates.cpp

-        }
+                 // expression is at the higher of the adjacent pressures.
+                format_to(err_reactions, "\nInvalid rate coefficient for reaction '{}'\nat P = {} , T = {}",
+                    equation, std::exp((iter)->first), T[i]);


Do you still need to increment the iterator here?

Because of the change in RxnRate.h line 293, the iterator does not have to be incremented here

12Chao · 2020-07-15T21:18:49Z

Hi @bryanwweber, thanks for the review, I have added some changes according to your comments. As I replied in the comment, I am trying to catch all the errors and only add Plog validation error to the buffer while raise other CanteraError again as the way it should be raised. I am not sure if capture all the errors and raise them all at once would be more helpful as it depends on users preference and the hardness to trace all the problems, but I personally like to visualize all the errors at once to avoid the fraustration of seeing them one by one.

codecov · 2020-07-15T21:44:43Z

Codecov Report

Merging #903 into main will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #903      +/-   ##
==========================================
+ Coverage   71.19%   71.22%   +0.02%     
==========================================
  Files         376      377       +1     
  Lines       46218    46272      +54     
==========================================
+ Hits        32903    32955      +52     
- Misses      13315    13317       +2

Impacted Files	Coverage Δ
src/kinetics/KineticsFactory.cpp	`96.66% <100.00%> (+0.28%)`	⬆️
src/kinetics/RxnRates.cpp	`97.14% <100.00%> (+0.26%)`	⬆️
samples/cxx/demo/demo.cpp	`95.12% <0.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9c249de...7478bae. Read the comment docs.

interfaces/cython/cantera/test/test_kinetics.py

speth

Thanks for working on this, @12Chao. Besides the specific comments I've made here, I think this is actually really close to a more general improvement to the way we report errors in reactions. Instead of applying this approach of collecting the error messages from just Plog reactions, wouldn't it be simpler and more helpful to the user to do this for all of the errors that can occur when calling addReaction?

speth · 2020-08-22T00:38:19Z

include/cantera/kinetics/RxnRates.h

-
-        return std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_);
+        if (logP_ == logP1_) {
+            return std::exp(log_k1);
+        } else {
+            return std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_);
+        }


Rather than changing this, which introduces an extra branch point every time these rates are evaluated, I would suggest replacing the call to updateRC in the validate method and just calculating the value of k directly at that point, which is just

double k = 0; for (size_t i = ilow1_; i < ilow2_; i++) { k += rates_[i].updateRC(logT, recipT); }

And then all you have to do is check that k is positive.

speth · 2020-08-22T00:51:10Z

src/kinetics/RxnRates.cpp

    for (auto iter = pressures_.begin(); iter->first < 1000; iter++) {
        update_C(&iter->first);
        for (size_t i=0; i < 6; i++) {
            double k = updateRC(log(T[i]), 1.0/T[i]);
-            if (!(k >= 0)) {
+            // avoid to add invalid k at cantera inserted pressure(P=0) to error message
+            if (!(k >= 0) && std::exp((iter)->first) != 0) {


A cleaner approach would be to just skip over the first value in pressures_ by incrementing the loop variable as it's initialized, e.g.

for (auto iter = ++pressures_.begin(); iter->first < 1000; iter++) {

speth · 2020-08-22T00:53:01Z

src/kinetics/RxnRates.cpp

@@ -6,6 +6,7 @@
 #include "cantera/kinetics/RxnRates.h"
 #include "cantera/base/Array.h"

+using std::string;


It doesn't look like you use an unqualified string declaration anywhere in this file, so I think this can be removed.

speth · 2020-08-22T00:53:37Z

src/kinetics/RxnRates.cpp

+             }
+         }
+     }
+    if (err_reactions.size()){


Suggested change

if (err_reactions.size()){

if (err_reactions.size()) {

speth · 2020-08-22T00:58:13Z

src/kinetics/RxnRates.cpp

-                    equation, std::exp((++iter)->first), T[i]);
-            }
-        }
+                format_to(err_reactions, "\nInvalid rate coefficient for reaction '{}'\nat P = {} , T = {}",


The error message output would be a little cleaner if you added a format specifier for the pressure, e.g. P = {:.5g} so that some of rounding errors in unit conversion won't be as visible, e.g. you'll see P = 1.0132e+05 instead of P = 101324.99999999999.

speth · 2020-08-22T01:18:46Z

src/kinetics/KineticsFactory.cpp

+                    std::string captured_err = err.what();
+                    captured_err.erase(std::remove(captured_err.begin(), captured_err.end(), '*'), captured_err.end());
+                    std::string str_to_erase = "CanteraError thrown by Plog::validate:";


Rather than trying to reformat the output of err.what(), you can use the err.getClass() and the err.getMessage() methods, which should simplify things a bit.

speth · 2020-08-22T01:23:25Z

src/kinetics/KineticsFactory.cpp

+                    } else {
+                        std::regex re("(\\w+)::(\\w+)");
+                        std::smatch match;
+                        regex_search(captured_err, match, re);
+                        std::string procedure = match[0];
+                        std::string msg = captured_err.substr(25+procedure.length(), -1);
+                        throw CanteraError(procedure,
+                                           "{}", msg);


I think this block can be replaced by just re-throwing the existing exception, i.e. throw err. But even beyond that, I'm not sure why you wouldn't want to collect all reaction errors in the same way that you aggregate the Plog errors.

speth · 2020-08-22T01:27:54Z

src/kinetics/KineticsFactory.cpp

+                try{
+                    kin.addReaction(newReaction(R, kin));
+                }
+                catch (CanteraError& err){


These formatting suggestions apply here and elsewhere:

Suggested change

try{

kin.addReaction(newReaction(R, kin));

}

catch (CanteraError& err){

try {

kin.addReaction(newReaction(R, kin));

} catch (CanteraError& err) {

12Chao · 2020-09-03T01:20:22Z

Hi @speth, thanks for the suggested changes. I was not sure about if it was helpful to capture all the errors and throw them all at once, so I only kept PLOG errors. I have changed the code to capture the error when add reactions and throw them at last. In the test case the error message is:

**********************************************************************
CanteraError thrown by KineticsFactory::addReactions:


***********************************************************************
CanteraError thrown by Plog::validate:


Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32019 , T = 500.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32019 , T = 1000.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 1.0132e+05 , T = 500.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 1.0132e+05 , T = 1000.0
***********************************************************************

***********************************************************************
CanteraError thrown by Plog::validate:


Invalid rate coefficient for reaction 'CH2CHOO <=> CH3 + CO2'
at P = 1.0132e+07 , T = 500.0
***********************************************************************
***********************************************************************

speth

Thanks for the updates, @12Chao, I think this has become both useful with a simpler implementation. I just had a few additional suggestions.

interfaces/cython/cantera/test/test_kinetics.py

src/kinetics/KineticsFactory.cpp

src/kinetics/RxnRates.cpp

speth

Thanks, @12Chao. I squashed the commits to reduce some of the churn and make it easier to see what changes were being made here, and fixed a couple of small formatting issues that had snuck in.

bryanwweber

A few small formatting changes. I would advocate for this change to be merged after the release of 2.5.0, since it's a somewhat large new feature, and the beta for 2.5.0 has already been published. Either that, or creating a release branch for 2.5 at this point before merging this PR.

bryanwweber · 2020-09-04T14:54:09Z

interfaces/cython/cantera/test/test_kinetics.py

+                    "at P = 32019, T = 500.0",
+                    "at P = 32019, T = 1000.0",
+                    "at P = 1.0132e+05, T = 500.0",
+                    "at P = 1.0132e+05, T = 1000.0",
+                    "Invalid rate coefficient for reaction 'CH2CHOO <=> CH3 + CO2'",
+                    "at P = 1.0132e+05, T = 500.0",
+                    "InputFileError thrown by Kinetics::checkReactionBalance:",
+                    "The following reaction is unbalanced: H2O2 + OH <=> 2 H2O + HO2")


These lines are indented by one too many spaces

bryanwweber · 2020-09-04T14:57:03Z

test/data/addReactions_err_test.yaml

+description: |-
+  This is the file for testing Plog errors
+
+generator: ck2yaml
+input-files: [mech.inp, therm.DAT, trans.dat]
+cantera-version: 2.5.0a2
+date: Mon, 29 Jun 2020 14:13:58 +0000
+


I would suggest deleting these lines to keep the test file as small as possible.

bryanwweber · 2020-09-04T14:57:51Z

test/data/addReactions_err_test.yaml

+  transport:
+    model: gas
+    geometry: linear
+    well-depth: 244.0
+    diameter: 3.763
+    polarizability: 2.65
+    rotational-relaxation: 2.1


Also this, and the other transport properties as well. I'm actually surprised you can specify a transport model in the phase, since a few species don't have transport parameters.

The reason this doesn't cause a problem is because loading the file results in an error in creating the Kinetics object before it gets to trying to create a Transport object.

12Chao · 2020-09-04T16:25:21Z

Thanks for the review, @bryanwweber, I have made the changes according to the suggested changes.

speth · 2020-09-04T21:24:52Z

@12Chao I'm not sure exactly what led to this, but your latest changes have reverted the cleanup work that I did and introduced a new, undesirable merge commit that prevents this PR from being integrated. I would suggest that you reset your branch to match the version I last updated, which I have pushed to my fork (https://github.com/speth/cantera/tree/pr/12Chao/903) and make the updates to address @bryanwweber's review starting from that point.

12Chao · 2020-09-04T22:33:39Z

@speth sorry about that, I pulled the changes from my repo trying to sync the changes you made, and it messed up the commit history. I think I have fixed it.

Instead of reporting only the first error and stopping, continue adding reactions and collect all of the errors that occur, then report them at the end. For Plog reactions, report each temperature and pressure combination that results in an invalid reaction rate.

bryanwweber requested changes Jul 15, 2020

View reviewed changes

bryanwweber reviewed Jul 16, 2020

View reviewed changes

interfaces/cython/cantera/test/test_kinetics.py Outdated Show resolved Hide resolved

speth requested changes Aug 22, 2020

View reviewed changes

12Chao force-pushed the pdep_err branch from e5f4b7b to b3f2d32 Compare September 2, 2020 01:56

speth requested changes Sep 3, 2020

View reviewed changes

speth force-pushed the pdep_err branch from f6bae6f to e56915e Compare September 3, 2020 22:43

speth approved these changes Sep 3, 2020

View reviewed changes

speth requested a review from bryanwweber September 3, 2020 22:44

bryanwweber requested changes Sep 4, 2020

View reviewed changes

bryanwweber changed the title ~~[Kinetics] Change error handling about invalid Plog reactions~~ [Kinetics] Change error handling about invalid reactions Sep 4, 2020

12Chao force-pushed the pdep_err branch from dea89e8 to e56915e Compare September 4, 2020 22:19

speth force-pushed the pdep_err branch from adbbc31 to 7478bae Compare September 6, 2020 16:05

speth requested a review from bryanwweber September 6, 2020 16:09

bryanwweber approved these changes Sep 7, 2020

View reviewed changes

speth merged commit 2cd0619 into Cantera:main Sep 7, 2020

	"\n{}", to_string(err_reactions));
	"\n{}", to_string(err_reactions));

[Kinetics] Change error handling about invalid reactions #903

[Kinetics] Change error handling about invalid reactions #903

Conversation

12Chao commented Jul 12, 2020 • edited Loading

bryanwweber left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

12Chao commented Jul 15, 2020

codecov bot commented Jul 15, 2020 • edited Loading

Codecov Report

speth left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

12Chao commented Sep 3, 2020

speth left a comment

Choose a reason for hiding this comment

speth left a comment

Choose a reason for hiding this comment

bryanwweber left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

12Chao commented Sep 4, 2020

speth commented Sep 4, 2020

12Chao commented Sep 4, 2020

12Chao commented Jul 12, 2020 •

edited

Loading

codecov bot commented Jul 15, 2020 •

edited

Loading