Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty query produces empty match, but at the same time is considered non-matching #4138

Closed
i-tub opened this issue May 14, 2021 · 1 comment
Labels
Milestone

Comments

@i-tub
Copy link
Contributor

i-tub commented May 14, 2021

I found this a bit surprising: you can have an empty query that "matches" anything, producing an empty match, but at the same time it's considered non-matching:

In [31]: mol = Chem.MolFromSmiles('CCO')                                                                          

In [32]: q = Chem.MolFromSmarts('')                                                                               

In [33]: mol.HasSubstructMatch(q)                                                                                 
Out[33]: False

In [34]: mol.GetSubstructMatches(q)                                                                               
Out[34]: ((),)

In [35]: bool(mol.GetSubstructMatches(q))                                                                         
Out[35]: True

What the right behavior should be is a bit of a philosophical question, but I would argue that if GetSubstructMatches returns a truthy value, HasSubstructMatch should do so as well.

My preference would be for GetSubstructMatches should return () instead of ((),), however, so they would both be false in a Boolean context.

Configuration (please complete the following information):

  • RDKit version: 2021.03.1
  • OS: CentOS 7
  • If you are not using conda: how did you install the RDKit? Schrödinger
@i-tub i-tub added the bug label May 14, 2021
@greglandrum greglandrum added this to the 2021_03_3 milestone May 18, 2021
@greglandrum
Copy link
Member

Agreed. GetSubstructMatches() should return an empty tuple here.

greglandrum added a commit to greglandrum/rdkit that referenced this issue May 18, 2021
adds a short-circuit test to the substructure matching so that it immediately returns an empty result set if either the molecule or the query is empty
greglandrum added a commit that referenced this issue Jun 9, 2021
adds a short-circuit test to the substructure matching so that it immediately returns an empty result set if either the molecule or the query is empty
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants