Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MolFromSmiles and MolFromSmarts incorrectly accepting input with spaces #4503

Closed
greglandrum opened this issue Sep 13, 2021 · 3 comments
Closed

Comments

@greglandrum
Copy link
Member

To Reproduce
Here's the demo using 2021.03.5:

In [14]: Chem.MolFromSmiles('NON sense') is None
Out[14]: False

In [15]: Chem.MolFromSmiles('NON |sense|') is None
Out[15]: True

the same thing happens on master with both MolFromSmiles and MolFromSmarts.

Expected behavior
The default setting for SmilesParserParams::strictCXSMILES is True, so both of these should fail.

Configuration (please complete the following information):

  • RDKit version: master
  • OS: all
@greglandrum greglandrum added this to the 2021_09_1 milestone Sep 13, 2021
@greglandrum greglandrum changed the title MolFromSmiles and MolFromSmarts incorrectly accepting SMILES/SMARTS with spaces MolFromSmiles and MolFromSmarts incorrectly accepting input with spaces Sep 13, 2021
@greglandrum
Copy link
Member Author

greglandrum commented Sep 13, 2021

Thanks to @ricrogz for the examples in #4479

@greglandrum
Copy link
Member Author

After spending some time working on and thinking about this, I'm reluctant to make this change in the default behavior.
It's not hard to imagine that there's a lot of code out there which is perfectly happily working with having SMILES like NON sense parse without errors.
What I think I'm going to do is change the default value of {Smiles,Smarts}ParseParams::parseName to true, then these will both parse without problems by default, but if you want them to generate errors you can set parseName to false

greglandrum added a commit to greglandrum/rdkit that referenced this issue Sep 13, 2021
@bp-kelley
Copy link
Contributor

Historically, The RDKit used to strip any trailing whitespace, after a bit of debate, this behavior was changed here as the parser would just stop at the whitespace anyway.

The commit doesn't reflect that change in the merge

efcf242

note the commented out trim_if. We should probably just remove this line of code...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants