Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pressure dependent network generation in RMG jobs #1658

Merged
merged 7 commits into from Jul 21, 2019
Merged

Conversation

mliu49
Copy link
Contributor

@mliu49 mliu49 commented Jul 17, 2019

Note

I consider this to be a major bug and therefore plan on making a patch release (2.4.1) once this is merged. My initial thought is to only include this fix, but I'm also considering whether to simply include everything that's been merged since the 2.4 release. Thoughts welcome.

Motivation or Problem

An apparent slowdown in RMG model generation was observed recently. The following plots were from a dodecane pyrolysis model. The exact commits are not too important here. This was just the initial observation that something major was different.

Screenshot from 2019-07-16 16-40-31

Further investigation showed that the problem was related to pdep networks. In the "after" model, RMG would continuously explore pdep networks on every iteration instead of adding new core species. Network leak rate ratios were all absurd values around 1e100~1e300.

After a ton of debugging, the core problem was identified as incorrect identification of newSpecies when calling processNewReactions. The newSpecies argument was used by addReactionToUnimolecularNetworks to identify the direction in which to process a new reaction, and was intended to correspond to the core species which was used to generate the reaction.

In #1459, the selection of newSpecies was changed following changes to no longer deflate and inflate reactions. The new method was to simply find the first core species in the reaction. However, this did not always correspond to the species used to generate the reaction. The result was improper creation of pdep networks.

Description of Changes

The goal was to fix how we identified newSpecies. The chosen implementation was to return the species tuples used to generate the reactions from react_all since they directly map the generated reactions to the core species used to generate them. Note that we make copies of every species before reaction generation, so the actual species contained in the reactions do not point to the actual core species.

Changes:

  • react generates a list of lists due to map, return that as is instead of chaining the lists together
  • in react_all, return the list of lists from react as is, along with the list of species tuples provided to react
  • in enlarge, match reactions with their species tuples, and take the species in the tuple with the lowest index as newSpecies (chosen for consistency, I don't think it really matters, since it seems that addReactionToUnimolecularNetworks only uses it decide on the direction to process the reaction)
  • add a flatten option to react primarily to simplify cases where you just want a flat list of reactions
  • update related unit tests to handle the changes in return values
  • organize imports in rmgpy.rmg.model

Testing

I used the dodecane model mentioned above to debug and test. I'm currently running it for longer to confirm.

Reviewer Tips

Read my explanation above, read the code, ask questions, maybe run a test case?

@mliu49 mliu49 added Topic: PDep Status: Ready for Review PR is complete and ready to be reviewed Type: Bug Priority: Critical Should be addressed ASAP labels Jul 17, 2019
@mliu49 mliu49 self-assigned this Jul 17, 2019
@mliu49
Copy link
Contributor Author

mliu49 commented Jul 18, 2019

Made the following changes following offline discussion:

  • Removed the flatten argument since it's only used for unit tests. Instead, flatten in the unit tests.
  • Do not sort the species tuples since newSpecies is only used to determine reaction direction.
  • Check that the reaction list is non-empty before processing
  • More thoroughly update docstrings

I went ahead and rebased on master.

@rwest
Copy link
Member

rwest commented Jul 18, 2019

Well done debugging that.
Should we add a test that would have caught this regression?

@codecov
Copy link

codecov bot commented Jul 18, 2019

Codecov Report

Merging #1658 into master will decrease coverage by <.01%.
The diff coverage is 78.26%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1658      +/-   ##
==========================================
- Coverage   41.69%   41.68%   -0.01%     
==========================================
  Files         176      176              
  Lines       29359    29342      -17     
  Branches     6053     6050       -3     
==========================================
- Hits        12241    12232       -9     
+ Misses      16248    16240       -8     
  Partials      870      870
Impacted Files Coverage Δ
rmgpy/rmg/react.py 86.79% <100%> (-0.25%) ⬇️
rmgpy/rmg/model.py 38.28% <72.22%> (-0.19%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 95e8fcf...f968be8. Read the comment docs.

@mliu49
Copy link
Contributor Author

mliu49 commented Jul 18, 2019

I created a new functional test for enlarge with pressure dependence. I tried to make it as minimal as possible, but it was quite difficult.

I did confirm that it fails on master.

@mliu49 mliu49 force-pushed the pdep_fix branch 2 times, most recently from 07c8423 to e8108cb Compare July 19, 2019 21:51
@alongd
Copy link
Member

alongd commented Jul 20, 2019

The changes look OK to me

Motivation:
- processNewReactions requires a newSpecies argument which was not
being properly identified after multiprocessing changes in #1459

Background:
- The newSpecies argument is retained from early RMG when reactions
were generated right after adding a new species to the core.
- Hence, newSpecies referred to the newly added core species which
was used to generate the new reactions.
- The overall model generation algorithm has since changed, so
identifying newSpecies is not as straightforward.
- The multiprocessing PR changed newSpecies to be any core species,
missing the original purpose of identifying the species from which
the reaction was generated.

Changes:
- Use the species tuples created during reaction generation to
keep track of which species were used to generate the reaction
- Update unit tests which are affected by the changes in return values
from react and react_all
Also rename the spc_tuples arg in the react method to spc_fam_tuples
to reflect updated usage.
Sort alphabetically and also remove relative imports
This reverts commit d59da68.

Lesson to be learned: If a unit test fails, make sure you
understand why before changing the correct answer...
With tests for pdep network generation
@mliu49
Copy link
Contributor Author

mliu49 commented Jul 21, 2019

@amarkpayne, @ajocher, do you have any final comments?

@ajocher
Copy link
Contributor

ajocher commented Jul 21, 2019

Looks good to me.

print cls.dirname
os.makedirs(os.path.join(cls.dirname, 'pdep'))

TESTFAMILY = 'R_Recombination'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For PEP-8, shouldn't variables be all lowercase with underscores? I thought all caps was reserved for module level or global variables. (I am just curious what the answer is, this is not worth holding this emergency PR back for)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I copied the method from another test class and didn't change the variable name. It should be all lowercase.

I also just noticed that I left a print statement in...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put a commit fixing these into #1659.

@amarkpayne amarkpayne merged commit 68367e8 into master Jul 21, 2019
@amarkpayne amarkpayne deleted the pdep_fix branch July 21, 2019 21:05
@mliu49 mliu49 mentioned this pull request Jul 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: Critical Should be addressed ASAP Status: Ready for Review PR is complete and ready to be reviewed Topic: PDep Type: Bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants