Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing a Mol block/file does not clear the "molTotValence" property from atoms #5423

Closed
ricrogz opened this issue Jul 11, 2022 · 0 comments
Closed
Labels
Milestone

Comments

@ricrogz
Copy link
Contributor

ricrogz commented Jul 11, 2022

When we parse a Mol block in which some atom has it's total valence specified, RDKit parses this valence into the molTotValence. After the parsing finishes, it uses the property to adjust Hydrogens on the atom(s) to match the specified valence, but the property is never cleared, and may "leak" if we export the molecule. It usually doesn't because most of the output formats in RDKit don't support atom properties, but CXSMILES does:

In [1]: mb = """ 
   ...:      RDKit          2D 
   ...:  
   ...:   0  0  0  0  0  0  0  0  0  0999 V3000 
   ...: M  V30 BEGIN CTAB 
   ...: M  V30 COUNTS 1 0 0 0 0 
   ...: M  V30 BEGIN ATOM 
   ...: M  V30 1 N -3.657143 -0.742857 0.000000 0 CHG=1 VAL=4 
   ...: M  V30 END ATOM 
   ...: M  V30 END CTAB 
   ...: M  END"""                                                                                                                                                                                                                                                                                        

In [2]: m = Chem.MolFromMolBlock(mb)                                                                                                                                                                                                                                                                     

In [3]: Chem.MolToCXSmiles(m)                                                                                                                                                                                                                                                                            
Out[3]: '[NH4+] |(-3.65714,-0.742857,),atomProp:0.molTotValence.4|'

The problem with this, besides propagating a property that the molecule didn't have in the first place, is that running some operations (e.g. a reaction) on the structure might change the total valence on the atom, but the property would still specify the initial value, as there isn't any code to update the property.

I think we should clear the molTotValence property once we have adjusted Hydrogens to match the valence in the postprocessing.

@ricrogz ricrogz added the bug label Jul 11, 2022
@ricrogz ricrogz mentioned this issue Jul 11, 2022
@greglandrum greglandrum added this to the 2022_03_5 milestone Jul 12, 2022
greglandrum pushed a commit that referenced this issue Aug 4, 2022
* fix

* always remove molTotValence
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants