Issue with scaffold_to_smiles function #158

Closed
Arkkienkeli opened this issue Apr 1, 2021 · 3 comments · Fixed by #159

Comments

@Arkkienkeli

Hello,
I have an issue with the scaffold_to_smiles function. My ChemProp version is from 31 March 2021.

I use it much as in https://github.com/chemprop/chemprop/blob/master/scripts/create_crossval_splits.py
and call it this way (data is a MoleculeDataset here):

scaffold_to_indices = scaffold_to_smiles(data.mols(), use_indices=True)

and I get an error:

Traceback (most recent call last):
  File "./chemprop/scaffold_split.py", line 70, in <module>
    create_crossval_splits()
  File "./chemprop/scaffold_split.py", line 64, in create_crossval_splits
    fold_indices = split_indices(all_indices, num_folds=5, data=data)
  File "./chemprop/scaffold_split.py", line 24, in split_indices
    scaffold_to_indices = scaffold_to_smiles(data.mols(), use_indices=True)
  File "./chemprop/chemprop/data/scaffold.py", line 41, in scaffold_to_smiles
    scaffold = generate_scaffold(mol)
  File "./chemprop/chemprop/data/scaffold.py", line 24, in generate_scaffold
    scaffold = MurckoScaffold.MurckoScaffoldSmiles(mol=mol, includeChirality=include_chirality)
  File "/home/.conda/envs/chemprop/lib/python3.9/site-packages/rdkit/Chem/Scaffolds/MurckoScaffold.py", line 109, in MurckoScaffoldSmiles
    scaffold = GetScaffoldForMol(mol)
  File "/home/.conda/envs/chemprop/lib/python3.9/site-packages/rdkit/Chem/Scaffolds/MurckoScaffold.py", line 73, in GetScaffoldForMol
    res = Chem.MurckoDecompose(mol)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdmolops.MurckoDecompose(list)
did not match C++ signature:
    MurckoDecompose(RDKit::ROMol mol)

There was a similar issue: #98
Am I using it correctly, or is this a bug?
Thank you!

@hesther
Contributor

hesther commented Apr 1, 2021

Thanks for reporting. This is another bug that must have been missed when we introduced multi-molecule input. I will look into it and fix it shortly, but for your script, the following should be sufficient:

scaffold_to_indices = scaffold_to_smiles(data.mols(flatten=True), use_indices=True)

(This applies if you have single-molecule input. If you have multi-molecule input, I believe scaffold splitting is not supported.)

Let me know if this has worked.
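
For readers hitting the same error, here is a minimal sketch of what the workaround does conceptually. It assumes that, since multi-molecule support was added, data.mols() returns one list of molecules per datapoint, which is what the `MurckoDecompose(list)` line in the traceback suggests. The helper name `flatten_single_molecule_input` is hypothetical and not part of chemprop, and plain strings stand in for RDKit Mol objects so the sketch runs without RDKit installed.

```python
from typing import List, Sequence, TypeVar

Mol = TypeVar("Mol")


def flatten_single_molecule_input(nested: Sequence[Sequence[Mol]]) -> List[Mol]:
    """Turn [[mol_a], [mol_b], ...] into [mol_a, mol_b, ...].

    Hypothetical helper, not part of chemprop. It raises if any datapoint
    holds more than one molecule, because a datapoint would then have no
    single scaffold to split on.
    """
    flat: List[Mol] = []
    for i, mols in enumerate(nested):
        if len(mols) != 1:
            raise ValueError(
                f"Datapoint {i} has {len(mols)} molecules; expected exactly 1."
            )
        flat.append(mols[0])
    return flat


# Strings stand in for RDKit Mol objects (assumption for illustration only).
nested_mols = [["c1ccccc1O"], ["CCO"], ["c1ccncc1C"]]
flat_mols = flatten_single_molecule_input(nested_mols)
```

Unlike this sketch, data.mols(flatten=True) is the call recommended above and should be preferred in real scripts. The explicit length check here only illustrates why the workaround is limited to single-molecule input.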

@Arkkienkeli
Author

@hesther
Yes, this workaround helped. Thank you!

@hesther hesther linked a pull request Apr 2, 2021 that will close this issue
@hesther
Contributor

hesther commented Apr 2, 2021

@Arkkienkeli You are welcome, thanks again for reporting!
