"not" queries in molfiles get inverted #5930

Closed
d-b-w opened this issue Jan 4, 2023 · 1 comment
d-b-w (Contributor) commented Jan 4, 2023

Describe the bug
"Not" queries get turned into the atom that I'm looking to exclude.

To Reproduce

In [1]: from rdkit import Chem
In [2]: text = """ 
  ...:      RDKit          2D 
  ...:   
  ...:   0  0  0  0  0  0  0  0  0  0999 V3000 
  ...: M  V30 BEGIN CTAB 
  ...: M  V30 COUNTS 1 0 0 0 0 
  ...: M  V30 BEGIN ATOM 
  ...: M  V30 1 "NOT [N]" -2.742857 0.057143 0.000000 0 
  ...: M  V30 END ATOM 
  ...: M  V30 END CTAB 
  ...: M  END 
  ...: $$$$ 
  ...: """

In [3]: mol = Chem.MolFromMolBlock(text)
In [4]: print(Chem.MolToMolBlock(mol, forceV3000=True))


     RDKit          2D

  0  0  0  0  0  0  0  0  0  0999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 1 0 0 0 0
M  V30 BEGIN ATOM
M  V30 1 N -2.742857 0.057143 0.000000 0
M  V30 END ATOM
M  V30 END CTAB
M  END

Expected behavior
I'd expect to get back the "NOT [N]" query atom. This also happens when I use the forceV3000 flag, so it must be happening on read. If I change the input atom name from "NOT [N]" to "NOT [N,C]", everything works as expected and the output matches the input.
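A round-trip check for this behaviour can be sketched without RDKit by comparing the atom-type field of the input block with that of the written block. This is only a minimal illustration: `v3000_atom_types` is a hypothetical helper (not an RDKit function), it assumes each atom fits on one `M  V30` line with no continuation lines, and the two blocks are the input and output quoted above.

```python
import shlex


def v3000_atom_types(molblock: str) -> list[str]:
    """Return the atom-type field of every atom line in a V3000 CTAB.

    Hypothetical helper for illustration; assumes no continuation lines.
    """
    types = []
    in_atoms = False
    for line in molblock.splitlines():
        if not line.startswith("M  V30 "):
            continue
        body = line[len("M  V30 "):].strip()
        if body == "BEGIN ATOM":
            in_atoms = True
        elif body == "END ATOM":
            in_atoms = False
        elif in_atoms:
            # shlex keeps a quoted field such as "NOT [N]" together as one token.
            fields = shlex.split(body)
            types.append(fields[1])
    return types


# Input block from the report.
original = """
     RDKit          2D

  0  0  0  0  0  0  0  0  0  0999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 1 0 0 0 0
M  V30 BEGIN ATOM
M  V30 1 "NOT [N]" -2.742857 0.057143 0.000000 0
M  V30 END ATOM
M  V30 END CTAB
M  END
"""

# Block written back by RDKit 2022.09.3, as shown in the report.
written = original.replace('"NOT [N]"', "N")

print(v3000_atom_types(original))  # ['NOT [N]']
print(v3000_atom_types(written))   # ['N']
```

With a real RDKit installation, `written` would instead come from `Chem.MolToMolBlock(mol, forceV3000=True)`, and the two lists should be equal once the bug is fixed.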

Configuration (please complete the following information):

  • RDKit version: 2022.09.3
  • Are you using conda? no, a local Schrodinger build.
d-b-w added the bug label Jan 4, 2023
greglandrum (Member) commented

It does look like things get parsed properly (look at the output SMARTS):

In [9]: m = Chem.MolFromMolBlock('''
   ...:   Mrv2211 01052304532D          
   ...: 
   ...:   0  0  0     0  0            999 V3000
   ...: M  V30 BEGIN CTAB
   ...: M  V30 COUNTS 2 1 0 0 0
   ...: M  V30 BEGIN ATOM
   ...: M  V30 1 C -4.375 5.4583 0 0
   ...: M  V30 2 "NOT [N]" -3.0413 6.2283 0 0
   ...: M  V30 END ATOM
   ...: M  V30 BEGIN BOND
   ...: M  V30 1 1 1 2
   ...: M  V30 END BOND
   ...: M  V30 END CTAB
   ...: M  END
   ...: ''');print(Chem.MolToSmarts(m));print(Chem.MolToV3KMolBlock(m));
[#6]-[!#7]

     RDKit          2D

  0  0  0  0  0  0  0  0  0  0999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 2 1 0 0 0
M  V30 BEGIN ATOM
M  V30 1 C -4.375000 5.458300 0.000000 0
M  V30 2 N -3.041300 6.228300 0.000000 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 1 1 2
M  V30 END BOND
M  V30 END CTAB
M  END

And the output problem is limited to single-element "NOT" lists:

  ...:   Mrv2211 01052304532D          
   ...: 
   ...:   0  0  0     0  0            999 V3000
   ...: M  V30 BEGIN CTAB
   ...: M  V30 COUNTS 2 1 0 0 0
   ...: M  V30 BEGIN ATOM
   ...: M  V30 1 C -4.375 5.4583 0 0
   ...: M  V30 2 "NOT [N,O]" -3.0413 6.2283 0 0
   ...: M  V30 END ATOM
   ...: M  V30 BEGIN BOND
   ...: M  V30 1 1 1 2
   ...: M  V30 END BOND
   ...: M  V30 END CTAB
   ...: M  END
   ...: ''');print(Chem.MolToSmarts(m));print(Chem.MolToV3KMolBlock(m));
[#6]-[!#7&!#8]

     RDKit          2D

  0  0  0  0  0  0  0  0  0  0999 V3000
M  V30 BEGIN CTAB
M  V30 COUNTS 2 1 0 0 0
M  V30 BEGIN ATOM
M  V30 1 C -4.375000 5.458300 0.000000 0
M  V30 2 "NOT [N,O]" -3.041300 6.228300 0.000000 0
M  V30 END ATOM
M  V30 BEGIN BOND
M  V30 1 1 1 2
M  V30 END BOND
M  V30 END CTAB
M  END
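To make the intended semantics explicit, here is a small sketch of how a V3000 atom-list field maps onto the SMARTS shown in the two outputs above (`[!#7]` for "NOT [N]" and `[!#7&!#8]` for "NOT [N,O]"). It is an illustration only, not RDKit's implementation: `atom_list_to_smarts` and the `ATOMIC_NUMBERS` table are hypothetical names, and the table covers just a few elements.

```python
import re

# Illustrative subset of the periodic table (assumption: enough for the examples).
ATOMIC_NUMBERS = {"C": 6, "N": 7, "O": 8, "F": 9, "P": 15, "S": 16, "Cl": 17, "Br": 35}


def atom_list_to_smarts(field: str) -> str:
    """Translate a V3000 atom-type field into a SMARTS atom expression."""
    match = re.fullmatch(r"\s*(NOT\s*)?\[([^\]]+)\]\s*", field)
    if match is None:
        # Plain element symbol such as "C".
        return f"[#{ATOMIC_NUMBERS[field.strip()]}]"
    negated = match.group(1) is not None
    numbers = [ATOMIC_NUMBERS[s.strip()] for s in match.group(2).split(",")]
    if negated:
        # "NOT [A,B]" means neither A nor B: !A AND !B.
        return "[" + "&".join(f"!#{n}" for n in numbers) + "]"
    # "[A,B]" means A OR B.
    return "[" + ",".join(f"#{n}" for n in numbers) + "]"


print(atom_list_to_smarts("NOT [N]"))    # [!#7]
print(atom_list_to_smarts("NOT [N,O]"))  # [!#7&!#8]
print(atom_list_to_smarts("C"))          # [#6]
```

The single-element case is the one that matters here: "NOT [N]" must stay a negated query (`[!#7]`) and must not be written out as a plain `N` atom.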

greglandrum self-assigned this Jan 5, 2023
greglandrum added this to the 2022_09_4 milestone Jan 5, 2023
greglandrum added a commit to greglandrum/rdkit that referenced this issue Jan 5, 2023
greglandrum added a commit that referenced this issue Jan 16, 2023