Conversation
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #852 +/- ##
==========================================
- Coverage 59.21% 59.21% -0.01%
==========================================
Files 98 98
Lines 30023 30086 +63
Branches 7929 7951 +22
==========================================
+ Hits 17779 17814 +35
- Misses 9991 10031 +40
+ Partials 2253 2241 -12
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
6d5f426 to
64f013f
Compare
| p_label_map[atom.label] = base + j | ||
| label = atom.label | ||
| suffix = 2 | ||
| while label in p_label_map: |
There was a problem hiding this comment.
Also here, I don't see us defining p_label_map before it is used like we did with r_label_map
There was a problem hiding this comment.
line 526: p_label_map: Dict[str, int] = dict()
| label = val | ||
| suffix = 2 | ||
| while label in r_label_map: | ||
| label = f'{val}_{suffix}' |
There was a problem hiding this comment.
If the label here is a string and is set as a key in r_label_map would not the docstring need to be Dict[str, int]?
There was a problem hiding this comment.
yes, thanks! my typo from a prev PR. will fix
calvinp0
left a comment
There was a problem hiding this comment.
Thanks, much needed for ARC. Please check my comments
Three root causes prevented ARC from identifying several RMG reaction families: 1. get_reactant_num / get_product_num inferred counts by splitting the template group, ignoring explicit reactantNum / productNum in the RMG groups.py files. This broke Birad_recombination (reported 2 reactants instead of 1) and Br_Abstraction (reported 1 product instead of 2). 2. check_product_isomorphism was hardcoded for 1-2 products and returned False for 3-product families (Bimolec_Hydroperoxide_Decomposition). Generalized to any count, with an InChI fallback for species whose XYZ perception yields a different Lewis structure than SMILES. 3. get_reaction_family_products crashed on families with unsupported atom types (e.g., Na) or invalid adjacency lists, aborting the entire scan. Now catches KeyError, ValueError, and InvalidAdjacencyListError to skip problematic families gracefully.
Combining a bunch of reaction families which were reported as problematic in the past
FORM_BOND with the same label on both atoms (e.g., ['FORM_BOND', '*', 1, '*']) resolved both to the same atom, raising ValueError. Now uses the first and second labeled atom when labels match. LOSE_RADICAL/GAIN_RADICAL now applies to all atoms with the given label, not just the first. Fixes recognition of R_Recombination (CH3 + CH3 <=> C2H6) and similar families whose template groups span multiple reactant molecules with a shared label.
for InChI multiplicity check, duplicate labels, and family disambiguation
Added support for recognizing reaction families which were shown to be problematic in the past.
Specifically, addressing:
#813
#787
#738
#606
As well as:
that were mentioned offline by @kfir4444