Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kinetics] Change error handling about invalid reactions #903

Merged
merged 1 commit into from
Sep 7, 2020

Conversation

12Chao
Copy link
Contributor

@12Chao 12Chao commented Jul 12, 2020

Changes proposed in this pull request
Current version of Cantera only throw CanteraError for one invalid Plog reaction and its invalid reaction condition(temperature, pressure) when loading input file by ct.Solution(); however, multiple Plog reactions with invalid Arrhenius parameters may exist in the input file. This pull request helps raise CanteraError for all invalid Plog reactions in the input file and their invalid conditions.
For example, there are 2 Plog reactions shown below with invalid Arrhenius parameters:

- equation: CH2CHOO <=> CO + CH3O  
  type: pressure-dependent-Arrhenius
  rate-constants:
  - {P: 0.316 atm, A: -1.8e+33, b: -7.27, Ea: 3.376e+04}
  - {P: 0.316 atm, A: 8.69e-50, b: 16.63, Ea: -3900.0}
  - {P: 1.0 atm, A: -3.83e+33, b: -7.2, Ea: 3.51e+04}
  - {P: 1.0 atm, A: 1.19e-39, b: 13.61, Ea: -1317.0}

- equation: CH2CHOO <=> CO2 + CH3 
  type: pressure-dependent-Arrhenius
  rate-constants:
  - {P: 100.0 atm, A: -2510.0, b: 1.41, Ea: 1.442e+04} 
  - {P: 100.0 atm, A: 4.05e-09, b: 5.14, Ea: 1.048e+04}

Current version of Cantera only throw the error message for the first reactions at first invalid presssure point(0.316 atm) and first invalid temperature point(200K, defined by Cantera) as shown below.

***********************************************************************
CanteraError thrown by Plog::validate:
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32018.699999999983, T = 500.0
***********************************************************************

This pull request will change Cantera to raise error showing, in this case, first reaction is invalid at (0.316 atm, 500K; 0.316 atm, 1000K; 1 atm 500K, 1atm, 1000K); second reaction is invalid at 100 atm, 500K.

***********************************************************************
CanteraError thrown by Plog::validate:




Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32018.699999999983 , T = 500.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32018.699999999983 , T = 1000.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 101324.99999999999 , T = 500.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 101324.99999999999 , T = 1000.0




Invalid rate coefficient for reaction 'CH2CHOO <=> CH3 + CO2'
at P = 10132499.999999985 , T = 500.0

***********************************************************************

If applicable, fill in the issue number this pull request is fixing

Related to enhancement proposal

Checklist

  • There is a clear use-case for this code change
  • The commit message has a short title & references relevant issues
  • Build passes (scons build & scons test) and unit tests address code coverage
  • The pull request is ready for review

Copy link
Member

@bryanwweber bryanwweber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @12Chao I think this is a really good start. As I mentioned in one of the comments, I wonder if it makes sense to process all the reactions and collect all the errors to report them all at once (duplicates, errors with Chebychev coefficients, etc.). That would be a pretty significant expansion of the scope of this PR though, so I don't want to put that on you until we decide whether it makes sense to go for that. Regardless, there are a few comments that you can fix up here that will help the final outcome anyways, plus a couple of questions.

Comment on lines 293 to 297

return std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_);
if (logP_ == logP1_) {
return std::exp(log_k1);
} else {
return std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change necessary? Since logP_ and logP1_ are both double numbers, the chance that they are identical is very small. The previous line accounted for this by taking the difference, so that if it was almost zero, the offset from log_k1 would be also nearly zero.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe it is necessary(or at least there should be applied some change). I think in the validation case, since the input value of update_C is taken from pressures_ in Plog::validate, so the lower reference pressure log_P1 should be the exact same value with logP_ because the logarithm of lower reference pressure(log_P1) is assigned from pressures_ as well. I add if here because I want to differentiate the validation case from the normal case(case that input pressure does not equal to any refference pressure). std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_) involves log_k2 at higher reference pressure and log_k1 at lower reference pressure, in validation case, every reference pressure point will be called twice as higher pressure point first and lower pressure point when the input changes to next reference pressure point. In this case, if the rate constant at a reference pressure point is invalid, the error will be raised twice as it involves in formulastd::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_) twice. For example, in reaction

- equation: CH2CHOO <=> CO + CH3O  
  type: pressure-dependent-Arrhenius
  rate-constants:
  - {P: 1.0 atm, A: 1.8e+33, b: -7.27, Ea: 3.376e+04}
  - {P: 1.0 atm, A: 8.69e-50, b: 16.63, Ea: -3900.0}  
  - {P: 10.0 atm, A: -3.83e+33, b: -7.2, Ea: 3.51e+04}
  - {P: 10.0 atm, A: 1.19e-39, b: 13.61, Ea: -1317.0}

reaction rate constant at P = 1.0 atm is valid (positive), when validate the pressure point P = 1.0 atm, log_k1 at lower reference pressure 1.0 atm is valid and it should pass the validation, but the log_k2 at higher reference pressure 10.0 atm is invalid, so the formula std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_) will return NaN , which should actually be log_k1 because logP_ - logP1_ equal to zero in the validation case. In the original code, it is not a problem because is throws the next pressure point 10 atm in the error message and stops. However, to check all the reference pressure point in validation case, It is not necessary to involve log_k2 when calculating log_k1 .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than changing this, which introduces an extra branch point every time these rates are evaluated, I would suggest replacing the call to updateRC in the validate method and just calculating the value of k directly at that point, which is just

double k = 0;
for (size_t i = ilow1_; i < ilow2_; i++) {
    k += rates_[i].updateRC(logT, recipT);
}

And then all you have to do is check that k is positive.

try:
ct.Solution('pdep_err_test.yaml')
except ct.CanteraError as e:
self.assertEqual(str(e).find(err_msg), 116)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check finds the index that starts the error message, right? That seems a little bit fragile to me, in the sense that totally unrelated changes might change the index where this error message starts and would cause this test to fail. I don't think you want to use self.assertRaisesRegex() here, so I agree with the general structure. Can we try to find another way to check the error message? For instance, can we split the error message by newlines and assert that all of the lines in err_msg are present in the raised error?

@@ -94,6 +96,7 @@ unique_ptr<Kinetics> newKinetics(std::vector<ThermoPhase*>& phases,

void addReactions(Kinetics& kin, const AnyMap& phaseNode, const AnyMap& rootNode)
{
fmt::memory_buffer err_Plog_reactions;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if this were defined closer to the first place its used.

kin.addReaction(newReaction(R, kin));
}
catch(CanteraError& err){
format_to(err_Plog_reactions, "{}", err.what());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this will catch any CanteraErrors that are generated by trying to add the reaction, not just errors related to Plog reactions. In that case, the name of the variable seems to be a bit of a misnomer. If you only want to capture the Plog reactions here, then I think you need some of the same string processing as you've got in the next section.

To be clear though, I think it might be a good thing to catch all the errors and report them all at once back to the user, not just the errors related to adding PLOG reactions. I'd like to try it out and see how that works from the user side though, first. I also realize that might be expanding the scope of this change quite a bit, so this is more of a "thinking-out-loud" paragraph 😄

Comment on lines 193 to 200
} else {
std::regex re("(\\w+)::(\\w+)");
std::smatch match;
regex_search(captured_err, match, re);
std::string procedure = match[0];
std::string msg = captured_err.substr(25+procedure.length(), -1);
throw CanteraError(procedure,
"{}", msg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this processing necessary? What are you erasing here that wasn't here before? Processing based on character lengths is quite fragile, as I mentioned earlier, since changes in other places can totally break this without any warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I am trying to raise Plog error while not affecting raising other CanteraError, so I distinguish the Plog::validation error and other CanteraError, the first erase line(line189) is for erasing all the '*' sign automatically added in each CanteraError. captured_err.erase(0, 40) erases string CanteraError thrown by Plog::validation. If the captured CanteraError is not a Plog::validation error, it uses regex to extract the procedure(the function that raise the error) and message in CanteraError and use throw CanteraError to raise it again. I forgot to do the same in line 178. In this case, as long as we do not change the CanteraError format like CanteraError thrown by (\\w+)::(\\w+) and Plog::validation in Plog error message, it should be ok. However, I agree it is fragile to process string based on character length or regex, and I would like to take any suggestion about how to improve this part. I think maybe a function to separate CanteraError into predure and message would be good?

@@ -0,0 +1,263 @@
description: |-
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the absolute minimum file that produces the condition you want to test, including removing all the comments that aren't related to the test?

equation, std::exp((++iter)->first), T[i]);
}
}
// expression is at the higher of the adjacent pressures.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is duplicated

Comment on lines 208 to 210
std::string Plog_err = to_string(err_Plog_reactions);
throw CanteraError("Plog::validate",
"\n{}", Plog_err);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::string Plog_err = to_string(err_Plog_reactions);
throw CanteraError("Plog::validate",
"\n{}", Plog_err);
throw CanteraError("Plog::validate",
"\n{}", to_string(err_Plog_reactions));

}
if (!(to_string(err_reactions).empty())){
throw CanteraError("Plog::validate",
"\n{}", to_string(err_reactions));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"\n{}", to_string(err_reactions));
"\n{}", to_string(err_reactions));

}
// expression is at the higher of the adjacent pressures.
format_to(err_reactions, "\nInvalid rate coefficient for reaction '{}'\nat P = {} , T = {}",
equation, std::exp((iter)->first), T[i]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you still need to increment the iterator here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the change in RxnRate.h line 293, the iterator does not have to be incremented here

@12Chao
Copy link
Contributor Author

12Chao commented Jul 15, 2020

Hi @bryanwweber, thanks for the review, I have added some changes according to your comments. As I replied in the comment, I am trying to catch all the errors and only add Plog validation error to the buffer while raise other CanteraError again as the way it should be raised. I am not sure if capture all the errors and raise them all at once would be more helpful as it depends on users preference and the hardness to trace all the problems, but I personally like to visualize all the errors at once to avoid the fraustration of seeing them one by one.

@codecov
Copy link

codecov bot commented Jul 15, 2020

Codecov Report

Merging #903 into main will increase coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #903      +/-   ##
==========================================
+ Coverage   71.19%   71.22%   +0.02%     
==========================================
  Files         376      377       +1     
  Lines       46218    46272      +54     
==========================================
+ Hits        32903    32955      +52     
- Misses      13315    13317       +2     
Impacted Files Coverage Δ
src/kinetics/KineticsFactory.cpp 96.66% <100.00%> (+0.28%) ⬆️
src/kinetics/RxnRates.cpp 97.14% <100.00%> (+0.26%) ⬆️
samples/cxx/demo/demo.cpp 95.12% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9c249de...7478bae. Read the comment docs.

Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, @12Chao. Besides the specific comments I've made here, I think this is actually really close to a more general improvement to the way we report errors in reactions. Instead of applying this approach of collecting the error messages from just Plog reactions, wouldn't it be simpler and more helpful to the user to do this for all of the errors that can occur when calling addReaction?

Comment on lines 293 to 297

return std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_);
if (logP_ == logP1_) {
return std::exp(log_k1);
} else {
return std::exp(log_k1 + (log_k2-log_k1) * (logP_-logP1_) * rDeltaP_);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than changing this, which introduces an extra branch point every time these rates are evaluated, I would suggest replacing the call to updateRC in the validate method and just calculating the value of k directly at that point, which is just

double k = 0;
for (size_t i = ilow1_; i < ilow2_; i++) {
    k += rates_[i].updateRC(logT, recipT);
}

And then all you have to do is check that k is positive.

for (auto iter = pressures_.begin(); iter->first < 1000; iter++) {
update_C(&iter->first);
for (size_t i=0; i < 6; i++) {
double k = updateRC(log(T[i]), 1.0/T[i]);
if (!(k >= 0)) {
// avoid to add invalid k at cantera inserted pressure(P=0) to error message
if (!(k >= 0) && std::exp((iter)->first) != 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A cleaner approach would be to just skip over the first value in pressures_ by incrementing the loop variable as it's initialized, e.g.

for (auto iter = ++pressures_.begin(); iter->first < 1000; iter++) {

@@ -6,6 +6,7 @@
#include "cantera/kinetics/RxnRates.h"
#include "cantera/base/Array.h"

using std::string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like you use an unqualified string declaration anywhere in this file, so I think this can be removed.

}
}
}
if (err_reactions.size()){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (err_reactions.size()){
if (err_reactions.size()) {

equation, std::exp((++iter)->first), T[i]);
}
}
format_to(err_reactions, "\nInvalid rate coefficient for reaction '{}'\nat P = {} , T = {}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message output would be a little cleaner if you added a format specifier for the pressure, e.g. P = {:.5g} so that some of rounding errors in unit conversion won't be as visible, e.g. you'll see P = 1.0132e+05 instead of P = 101324.99999999999.

Comment on lines 178 to 180
std::string captured_err = err.what();
captured_err.erase(std::remove(captured_err.begin(), captured_err.end(), '*'), captured_err.end());
std::string str_to_erase = "CanteraError thrown by Plog::validate:";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than trying to reformat the output of err.what(), you can use the err.getClass() and the err.getMessage() methods, which should simplify things a bit.

Comment on lines 185 to 192
} else {
std::regex re("(\\w+)::(\\w+)");
std::smatch match;
regex_search(captured_err, match, re);
std::string procedure = match[0];
std::string msg = captured_err.substr(25+procedure.length(), -1);
throw CanteraError(procedure,
"{}", msg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this block can be replaced by just re-throwing the existing exception, i.e. throw err. But even beyond that, I'm not sure why you wouldn't want to collect all reaction errors in the same way that you aggregate the Plog errors.

Comment on lines 174 to 177
try{
kin.addReaction(newReaction(R, kin));
}
catch (CanteraError& err){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These formatting suggestions apply here and elsewhere:

Suggested change
try{
kin.addReaction(newReaction(R, kin));
}
catch (CanteraError& err){
try {
kin.addReaction(newReaction(R, kin));
} catch (CanteraError& err) {

@12Chao
Copy link
Contributor Author

12Chao commented Sep 3, 2020

Hi @speth, thanks for the suggested changes. I was not sure about if it was helpful to capture all the errors and throw them all at once, so I only kept PLOG errors. I have changed the code to capture the error when add reactions and throw them at last. In the test case the error message is:

**********************************************************************
CanteraError thrown by KineticsFactory::addReactions:


***********************************************************************
CanteraError thrown by Plog::validate:


Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32019 , T = 500.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 32019 , T = 1000.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 1.0132e+05 , T = 500.0
Invalid rate coefficient for reaction 'CH2CHOO <=> CH3O + CO'
at P = 1.0132e+05 , T = 1000.0
***********************************************************************

***********************************************************************
CanteraError thrown by Plog::validate:


Invalid rate coefficient for reaction 'CH2CHOO <=> CH3 + CO2'
at P = 1.0132e+07 , T = 500.0
***********************************************************************
***********************************************************************

Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, @12Chao, I think this has become both useful with a simpler implementation. I just had a few additional suggestions.

interfaces/cython/cantera/test/test_kinetics.py Outdated Show resolved Hide resolved
src/kinetics/KineticsFactory.cpp Outdated Show resolved Hide resolved
src/kinetics/KineticsFactory.cpp Outdated Show resolved Hide resolved
src/kinetics/KineticsFactory.cpp Outdated Show resolved Hide resolved
src/kinetics/RxnRates.cpp Outdated Show resolved Hide resolved
Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @12Chao. I squashed the commits to reduce some of the churn and make it easier to see what changes were being made here, and fixed a couple of small formatting issues that had snuck in.

Copy link
Member

@bryanwweber bryanwweber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small formatting changes. I would advocate for this change to be merged after the release of 2.5.0, since it's a somewhat large new feature, and the beta for 2.5.0 has already been published. Either that, or creating a release branch for 2.5 at this point before merging this PR.

Comment on lines 410 to 417
"at P = 32019, T = 500.0",
"at P = 32019, T = 1000.0",
"at P = 1.0132e+05, T = 500.0",
"at P = 1.0132e+05, T = 1000.0",
"Invalid rate coefficient for reaction 'CH2CHOO <=> CH3 + CO2'",
"at P = 1.0132e+05, T = 500.0",
"InputFileError thrown by Kinetics::checkReactionBalance:",
"The following reaction is unbalanced: H2O2 + OH <=> 2 H2O + HO2")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are indented by one too many spaces

Comment on lines 1 to 8
description: |-
This is the file for testing Plog errors

generator: ck2yaml
input-files: [mech.inp, therm.DAT, trans.dat]
cantera-version: 2.5.0a2
date: Mon, 29 Jun 2020 14:13:58 +0000

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest deleting these lines to keep the test file as small as possible.

Comment on lines 32 to 38
transport:
model: gas
geometry: linear
well-depth: 244.0
diameter: 3.763
polarizability: 2.65
rotational-relaxation: 2.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this, and the other transport properties as well. I'm actually surprised you can specify a transport model in the phase, since a few species don't have transport parameters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason this doesn't cause a problem is because loading the file results in an error in creating the Kinetics object before it gets to trying to create a Transport object.

@12Chao
Copy link
Contributor Author

12Chao commented Sep 4, 2020

Thanks for the review, @bryanwweber, I have made the changes according to the suggested changes.

@bryanwweber bryanwweber changed the title [Kinetics] Change error handling about invalid Plog reactions [Kinetics] Change error handling about invalid reactions Sep 4, 2020
@speth
Copy link
Member

speth commented Sep 4, 2020

@12Chao I'm not sure exactly what led to this, but your latest changes have reverted the cleanup work that I did and introduced a new, undesirable merge commit that prevents this PR from being integrated. I would suggest that you reset your branch to match the version I last updated, which I have pushed to my fork (https://github.com/speth/cantera/tree/pr/12Chao/903) and make the updates to address @bryanwweber's review starting from that point.

@12Chao
Copy link
Contributor Author

12Chao commented Sep 4, 2020

@speth sorry about that, I pulled the changes from my repo trying to sync the changes you made, and it messed up the commit history. I think I have fixed it.

Instead of reporting only the first error and stopping, continue adding
reactions and collect all of the errors that occur, then report them
at the end. For Plog reactions, report each temperature and pressure
combination that results in an invalid reaction rate.
@speth speth merged commit 2cd0619 into Cantera:main Sep 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants