Fam recognition #852

Merged
alongd merged 8 commits into main from fam_recognition on Apr 7, 2026

@alongd alongd commented Mar 31, 2026

Added support for recognizing reaction families which were shown to be problematic in the past.

Specifically, addressing:

#813
#787
#738
#606

As well as the following families, which were mentioned offline by @kfir4444:

Bimolec_Hydroperoxide_Decomposition
Birad_R_Recombination
Birad_recombination
Br_Abstraction

Comment thread arc/family/family_test.py Fixed

This comment was marked as resolved.

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.


Comment thread arc/mapping/engine.py Outdated
Comment thread arc/mapping/engine_test.py Outdated
Comment thread arc/family/family.py
Comment thread arc/family/family_test.py Outdated
Comment thread arc/family/family_test.py Outdated

codecov Bot commented Apr 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.21%. Comparing base (e8ce91a) to head (fa8eba2).
⚠️ Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #852      +/-   ##
==========================================
- Coverage   59.21%   59.21%   -0.01%     
==========================================
  Files          98       98              
  Lines       30023    30086      +63     
  Branches     7929     7951      +22     
==========================================
+ Hits        17779    17814      +35     
- Misses       9991    10031      +40     
+ Partials     2253     2241      -12     
Flag Coverage Δ
functionaltests 59.21% <ø> (-0.01%) ⬇️
unittests 59.21% <ø> (-0.01%) ⬇️

@alongd alongd force-pushed the fam_recognition branch 2 times, most recently from 6d5f426 to 64f013f, on April 5, 2026 10:25
@alongd alongd requested a review from calvinp0 April 7, 2026 08:37
Comment thread arc/family/family.py
Comment thread arc/family/family.py
Comment thread arc/family/family.py
p_label_map[atom.label] = base + j
label = atom.label
suffix = 2
while label in p_label_map:
Member

Also here, I don't see us defining p_label_map before it is used like we did with r_label_map

Member Author

line 526: p_label_map: Dict[str, int] = dict()

Comment thread arc/family/family.py
label = val
suffix = 2
while label in r_label_map:
    label = f'{val}_{suffix}'
Member

If the label here is a string and is set as a key in r_label_map would not the docstring need to be Dict[str, int]?

Member Author

yes, thanks! my typo from a prev PR. will fix
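
For readers following this thread, the duplicate-label handling under discussion can be sketched roughly as below. This is a minimal illustration, not the code in arc/family/family.py: the helper name `add_unique_label` is hypothetical, and the `suffix += 1` step is an assumption, since the diff excerpt above ends before the loop body is complete.

```python
from typing import Dict


def add_unique_label(label_map: Dict[str, int], label: str, index: int) -> str:
    """Store ``index`` under ``label``; if the label is taken, try label_2, label_3, ...

    Hypothetical helper for illustration only; keys are strings, values are atom indices.
    """
    unique = label
    suffix = 2
    while unique in label_map:
        unique = f'{label}_{suffix}'
        suffix += 1  # assumed increment; not visible in the quoted excerpt
    label_map[unique] = index
    return unique


# Keys are label strings, so the mapping is typed Dict[str, int].
r_label_map: Dict[str, int] = dict()
for label, index in [('*', 0), ('*', 4), ('*1', 2)]:
    add_unique_label(r_label_map, label, index)

print(r_label_map)  # {'*': 0, '*_2': 4, '*1': 2}
```

Because a repeated label gets a suffixed key instead of overwriting the earlier entry, both atoms that share `*` stay addressable.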

Member

@calvinp0 calvinp0 left a comment

Thanks, much needed for ARC. Please check my comments

alongd added 8 commits April 7, 2026 14:59
Three root causes prevented ARC from identifying several RMG reaction families:

  1. get_reactant_num / get_product_num inferred counts by splitting the template group, ignoring explicit reactantNum / productNum in the RMG groups.py files. This broke Birad_recombination (reported 2 reactants instead of 1) and Br_Abstraction (reported 1 product instead of 2).
  2. check_product_isomorphism was hardcoded for 1-2 products and returned False for 3-product families (Bimolec_Hydroperoxide_Decomposition). Generalized to any count, with an InChI fallback for species whose XYZ perception yields a different Lewis structure than SMILES.
  3. get_reaction_family_products crashed on families with unsupported atom types (e.g., Na) or invalid adjacency lists, aborting the entire scan. Now catches KeyError, ValueError, and InvalidAdjacencyListError to skip problematic families gracefully.
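
The three fixes above can be illustrated with a simplified sketch. It is not ARC's implementation: the function signatures, `products_match`, `scan_families`, and `fake_generate` are hypothetical, species are plain strings compared by a caller-supplied `same` function (standing in for the isomorphism check and its InChI fallback), and `InvalidAdjacencyListError` is redefined locally as a stand-in for the RMG exception.

```python
from itertools import permutations
from typing import Callable, Dict, List, Optional, Sequence


class InvalidAdjacencyListError(Exception):
    """Local stand-in for the RMG exception of the same name (assumption)."""


def get_reactant_num(explicit_num: Optional[int], template_groups: Sequence[str]) -> int:
    """Fix 1: prefer an explicit reactantNum; otherwise count the template groups."""
    if explicit_num is not None:
        return explicit_num
    return len(template_groups)


def products_match(predicted: Sequence[str],
                   actual: Sequence[str],
                   same: Callable[[str, str], bool]) -> bool:
    """Fix 2: match any number of products one-to-one, in any order."""
    if len(predicted) != len(actual):
        return False
    return any(all(same(p, a) for p, a in zip(predicted, perm))
               for perm in permutations(actual))


def scan_families(families: Sequence[str],
                  generate: Callable[[str], List[str]]) -> Dict[str, List[str]]:
    """Fix 3: skip families that fail instead of aborting the whole scan."""
    results: Dict[str, List[str]] = {}
    for family in families:
        try:
            results[family] = generate(family)
        except (KeyError, ValueError, InvalidAdjacencyListError):
            continue  # problematic family: skip it and keep scanning
    return results


def fake_generate(family: str) -> List[str]:
    """Toy generator that fails for one family, mimicking an unsupported atom type."""
    if family == 'family_with_Na':
        raise KeyError('Na')
    return [f'{family}_product']


print(get_reactant_num(1, ['group_a', 'group_b']))  # 1
print(products_match(['P1', 'P2', 'P3'], ['P3', 'P1', 'P2'], lambda p, a: p == a))  # True
print(scan_families(['family_ok', 'family_with_Na'], fake_generate))  # {'family_ok': ['family_ok_product']}
```

Trying every permutation is factorial in the product count, which is acceptable here because reaction families have only a handful of products.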
Combining a bunch of reaction families which were reported as problematic in the past
FORM_BOND with the same label on both atoms (e.g., ['FORM_BOND', '*', 1, '*']) resolved both labels to the same atom, raising ValueError. It now uses the first and second atoms carrying that label when the labels match. LOSE_RADICAL/GAIN_RADICAL now apply to all atoms with the given label, not just the first.

Fixes recognition of R_Recombination (CH3 + CH3 <=> C2H6) and similar families whose template groups span multiple reactant molecules with a shared label.
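
A rough sketch of the shared-label behavior this commit message describes, under stated assumptions: atoms are represented only by their indices, `labels` and `radicals` are plain dictionaries, and the helper names (`atoms_with_label`, `resolve_bond_atoms`, `change_radicals`) are hypothetical and not taken from ARC.

```python
from typing import Dict, List, Tuple


def atoms_with_label(labels: Dict[int, str], label: str) -> List[int]:
    """Return the indices of all atoms carrying ``label``, in ascending order."""
    return sorted(index for index, atom_label in labels.items() if atom_label == label)


def resolve_bond_atoms(labels: Dict[int, str], label_1: str, label_2: str) -> Tuple[int, int]:
    """Pick the two atoms for a FORM_BOND-style action.

    When both labels are identical, use the first and second atoms with that
    label instead of resolving both labels to the same atom.
    """
    first = atoms_with_label(labels, label_1)
    second = atoms_with_label(labels, label_2)
    if label_1 == label_2:
        if len(first) < 2:
            raise ValueError(f'Need two atoms labeled {label_1!r}, found {len(first)}')
        return first[0], first[1]
    if not first or not second:
        raise ValueError(f'Missing atom labeled {label_1!r} or {label_2!r}')
    return first[0], second[0]


def change_radicals(radicals: Dict[int, int], labels: Dict[int, str], label: str, delta: int) -> None:
    """Apply a radical change to every atom with ``label``, not only the first."""
    for index in atoms_with_label(labels, label):
        radicals[index] += delta


# Illustrative CH3 + CH3 case: the two carbons (indices 0 and 4) share the label '*'.
labels = {0: '*', 4: '*'}
radicals = {0: 1, 4: 1}

print(resolve_bond_atoms(labels, '*', '*'))  # (0, 4)
change_radicals(radicals, labels, '*', -1)   # LOSE_RADICAL on both labeled atoms
print(radicals)  # {0: 0, 4: 0}
```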
for InChI multiplicity check, duplicate labels, and family disambiguation
@alongd alongd force-pushed the fam_recognition branch from 64f013f to fa8eba2 on April 7, 2026 11:59
@alongd alongd merged commit 828ebef into main Apr 7, 2026
7 of 8 checks passed
@alongd alongd deleted the fam_recognition branch April 7, 2026 14:10