-
Notifications
You must be signed in to change notification settings - Fork 845
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #4721 #4722
Fixes #4721 #4722
Conversation
Hi @ptosco : the logic here fails for molecules where the dummy atom is in an aromatic ring but is connected to an atom which is not in one:
|
@greglandrum Right, that was way too straightforward to be true :-) I'll rework the PR it as soon as I have a bit of time. |
yeah, messing with the Kekulization code can't possibly be that easy. :-) |
…gs to rather tham just their size - expand the RingInfo API with a few useful methods - identify rings that are certainly aliphatic upfront - avoid unnecessary copying atomRings when RingInfo is already initialized
@greglandrum Don't be scared off by the number of changes. |
@greglandrum I'd like to improve further on this.
resolves to
This is still broken, too:
@greglandrum Do you agree with me? If so, I'll go ahead. |
@ptosco : yes, I completely agree. There should be one atom which is clearly aromatic |
- better handling of dummies in aromatic rings - exposed atomMembers() and bondMembers() - added several tests
@greglandrum this is now ready for review. In addition to the changes already discussed above I have done a pass of code modernization and cleanup. Apart from stylistic changes (replacement of |
- added test for the above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some changes suggested.
unsigned int ri = 0; | ||
for (const auto å : mol.getRingInfo()->atomRings()) { | ||
isRingNotCand.set(ri); | ||
for (auto ai : aring) { | ||
const auto at = mol.getAtomWithIdx(ai); | ||
if (isAromaticAtom(*at) && mol.getRingInfo()->numAtomRings(ai) == 1) { | ||
isRingNotCand.reset(ri); | ||
break; | ||
} | ||
} | ||
++ri; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment (not a request for a code change):
I'm starting to think that loops where you have to track an index (like this one) are better written as a loop over the index instead of using a range-based for. It's just too easy to mess up the indexing in the range-based for version.
Aside: it would be nice if C++ had the equivalent of Python's enumerate()
so that we could loop over index and element. I think this would be writeable with C++17.
@ptosco : sorry it took me so long to get to this one. |
@greglandrum I should have addressed all your remarks; please shou if that’s not the case. Thanks for reviewing! |
@ptosco since making behavioral changes to core pieces like kekulization makes me nervous, I tried a few things. This behavior doesn't feel right to me and I'm wondering if you can explain why it is right: |
@greglandrum It makes me nervous too, be assured, and I am totally for being conservative for low-level algorithms like kekulization/aromaticity as it is easy to break things. However, we should consider that the way the current aromaticity code deals with dummy atoms is not satisfactory (see examples above where structures are arbitrarily aromatized) and we should improve it if possible, rather than simply sticking to an unsatisfactory status quo. In the first example:
All 5 double bonds were already positioned in the SMILES, the system is fully conjugated and satisfies Hückel, so it is labelled as aromatic. In the second example
only 4 double bonds were positioned. With the available information, it is not possible to guess if the user wished the dummy atoms to be sp3 or sp2-hybridized. As there is no certainty on the intention of the user to represent an aromatic system or rather a bicyclotetraene, the system is left as is, as we discussed above.
The idea is that kekulization is inhibited unless there is certainty on the aromatic nature of the dummies. That seems a reasonable assumption to me, but I may certainly be wrong, and I am looking forward to hearing your opinion. |
@ptosco : that explanation makes perfect sense to me and I'm 👍 on the logic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Sorry for the lateness, but we did have some code to do this in the reaction sanitizer, in aromatize if possible for smarts patterns.
I'm just curious about the differences here. |
@bp-kelley Hi Brian, if
After this PR has been merged, sanitizing with
However, if you add an explicit double bond between the dummies (or use aromatic atoms/bonds in the SMARTS), the ambiguity is solved and therefore the structure will be aromatized upon sanitization:
So the behavior is consistent with what described in my comment above. |
@bp-kelley I think it’s important to remember (and I had to keep reminding myself of this) that the PR only changes the behavior for rings which contain dummy atoms. And then only for some of those, as @ptosco describes above. |
This PR fixes #4721. The fix is straightforward.