Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grow_mol error when used on fragments with dummy atoms #13

Closed
xescape opened this issue Apr 12, 2022 · 2 comments
Closed

grow_mol error when used on fragments with dummy atoms #13

xescape opened this issue Apr 12, 2022 · 2 comments

Comments

@xescape
Copy link

xescape commented Apr 12, 2022

Hello,

Thank you for maintain CReM! I ran into an error while using the program, and would like to see if you have any insights into this problem.

I'm trying to grow a fragment generated using RGroupDecomposition in rdkit, like so:

from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd
from crem.crem import grow_mol

mol = Chem.MolFromSmiles("Cc4cccc(Cc3cnc(NC(=O)c1cnn2ccc(C)nc12)s3)c4")
core = Chem.MolFromSmiles("Cc1cnc(NC(=O))s1")
params = rgd.RGroupDecompositionParameters()
params.RGroupLabels = rgd.RGroupLabels.AtomIndexLabels
fragment = rgd.RGroupDecompose([core], [mol], options=params)[0][0]['R1']
display(fragment)

image
The R1:1 is a dummy atom generated by RGroupDecomposition. If I try to run grow_mol at this point, I receive this error:

derivatives = list(grow_mol(r, db_name='/home/javi/data/rgroup/replacements02_sa2.db', max_replacements=9, return_mol=True))
>>>
ValueError                                Traceback (most recent call last)
crem.ipynb Cell 3' in <cell line: 62>()
---> 66 derivatives = list(grow_mol(r, db_name='replacements02_sa2.db', max_replacements=9, return_mol=True))

File ~/workspace/sci-dev/sci-dev-venv/lib/python3.8/site-packages/crem/crem.py:487, in mutate_mol(mol, db_name, radius, min_size, max_size, min_rel_size, max_rel_size, min_inc, max_inc, max_replacements, replace_cycles, replace_ids, protected_ids, symmetry_fixes, min_freq, return_rxn, return_rxn_freq, return_mol, ncores, **kwargs)
    483 protected_ids = sorted(protected_ids)
    485 if ncores == 1:
--> 487     for frag_sma, core_sma, freq, ids in __gen_replacements(mol1=mol, mol2=None, db_name=db_name, radius=radius,
    488                                                             min_size=min_size, max_size=max_size,
    489                                                             min_rel_size=min_rel_size, max_rel_size=max_rel_size,
    490                                                             min_inc=min_inc, max_inc=max_inc,
    491                                                             max_replacements=max_replacements,
    492                                                             replace_cycles=replace_cycles,
    493                                                             protected_ids_1=protected_ids, protected_ids_2=None,
    494                                                             min_freq=min_freq, symmetry_fixes=symmetry_fixes,
    495                                                             **kwargs):
    496         for smi, m, rxn in __frag_replace(mol, None, frag_sma, core_sma, radius, ids, None):
    497             if max_replacements is None or (max_replacements is not None and len(products) < max_replacements):

File ~/workspace/sci-dev/sci-dev-venv/lib/python3.8/site-packages/crem/crem.py:339, in __gen_replacements(mol1, mol2, db_name, radius, dist, min_size, max_size, min_rel_size, max_rel_size, min_inc, max_inc, max_replacements, replace_cycles, protected_ids_1, protected_ids_2, min_freq, symmetry_fixes, **kwargs)
    334 hac_ratio = num_heavy_atoms / mol_hac
    336 if (min_size <= num_heavy_atoms <= max_size and min_rel_size <= hac_ratio <= max_rel_size) \
    337         or (replace_cycles and cycle_pattern.search(core)):
--> 339     frag_sma = combine_core_env_to_rxn_smarts(core, env)
    341     min_atoms = num_heavy_atoms + min_inc
    342     max_atoms = num_heavy_atoms + max_inc

File ~/workspace/sci-dev/sci-dev-venv/lib/python3.8/site-packages/crem/mol_context.py:349, in combine_core_env_to_rxn_smarts(core, env, keep_h)
    346         links[i].append(a.GetNeighbors()[0].GetIdx())
    347         att_to_remove.append(a.GetIdx())
--> 349 for i, j in links.values():
    350     m.AddBond(i, j, Chem.BondType.SINGLE)
    352 for i in sorted(att_to_remove, reverse=True):

ValueError: too many values to unpack (expected 2)

However, when I try to replicate the problem later on in a notebook, I get a different error:

list(grow_mol(fragment, db_name='replacements02_sa2.db', max_replacements=9, return_mol=True))
>>>
RuntimeError                              Traceback (most recent call last)
crem.ipynb Cell 4' in <cell line: 10>()
---> 10 list(grow_mol(fragment, db_name='replacements02_sa2.db', max_replacements=9, protected_ids=[9,10], return_mol=True))

File ~/workspace/sci-dev/sci-dev-venv/lib/python3.8/site-packages/crem/crem.py:487, in mutate_mol(mol, db_name, radius, min_size, max_size, min_rel_size, max_rel_size, min_inc, max_inc, max_replacements, replace_cycles, replace_ids, protected_ids, symmetry_fixes, min_freq, return_rxn, return_rxn_freq, return_mol, ncores, **kwargs)
    483 protected_ids = sorted(protected_ids)
    485 if ncores == 1:
--> 487     for frag_sma, core_sma, freq, ids in __gen_replacements(mol1=mol, mol2=None, db_name=db_name, radius=radius,
    488                                                             min_size=min_size, max_size=max_size,
    489                                                             min_rel_size=min_rel_size, max_rel_size=max_rel_size,
    490                                                             min_inc=min_inc, max_inc=max_inc,
    491                                                             max_replacements=max_replacements,
    492                                                             replace_cycles=replace_cycles,
    493                                                             protected_ids_1=protected_ids, protected_ids_2=None,
    494                                                             min_freq=min_freq, symmetry_fixes=symmetry_fixes,
    495                                                             **kwargs):
    496         for smi, m, rxn in __frag_replace(mol, None, frag_sma, core_sma, radius, ids, None):
    497             if max_replacements is None or (max_replacements is not None and len(products) < max_replacements):

File ~/workspace/sci-dev/sci-dev-venv/lib/python3.8/site-packages/crem/crem.py:314, in __gen_replacements(mol1, mol2, db_name, radius, dist, min_size, max_size, min_rel_size, max_rel_size, min_inc, max_inc, max_replacements, replace_cycles, protected_ids_1, protected_ids_2, min_freq, symmetry_fixes, **kwargs)
    312 else:
    313     mol = mol1
--> 314     f = __fragment_mol(mol, radius, protected_ids=protected_ids_1, symmetry_fixes=symmetry_fixes)
    316 if f:
    317     mol_hac = mol.GetNumHeavyAtoms()

File ~/workspace/sci-dev/sci-dev-venv/lib/python3.8/site-packages/crem/crem.py:95, in __fragment_mol(mol, radius, return_ids, keep_stereo, protected_ids, symmetry_fixes)
     92         atom.SetIntProp("Index", atom.GetIdx())
     94 # heavy atoms
---> 95 frags = rdMMPA.FragmentMol(mol, pattern="[!#1]!@!=!#[!#1]", maxCuts=4, resultsAsMols=True, maxCutBonds=30)
     96 frags += rdMMPA.FragmentMol(mol, pattern="[!#1]!@!=!#[!#1]", maxCuts=3, resultsAsMols=True, maxCutBonds=30)
     97 # hydrogen atoms

RuntimeError: Pre-condition Violation
	RingInfo not initialized
	Violation occurred on line 83 in file Code/GraphMol/RingInfo.cpp
	Failed Expression: df_init
	RDKIT: 2021.09.5
	BOOST: 1_67

This expression works fine when I use a molecule without a dummy atom, for example the core molecule defined above. Any help with this would be greatly appreciated.

Thank you!

@DrrDom
Copy link
Owner

DrrDom commented Apr 12, 2022

You are welcome!

If you would like to make replacements at a particular position there are two options.

Option 1. Remove a dummy atom, store the id of the attached atom (idx) and pass it as an argument replace_ids=[idx] to grow_mol. Before calling grow_mol you need to sanitize a molecule Chem.SanitizeMol(fragment) (that is why you have an error in the second case, after manipulation with a molecule you need to update internal states).

Option 2 (tricky). Replace a dummy atom with a hydrogen, sanitize a molecule and then call mutate_mol with min_size=0 and max_size=0. This will replace all explicit hydrogens in a molecule. While you maintain explicit hydrogens at attachment points only you will get what you want. This is less explicit and a more dangerous way.
In the example I decreased the radius to 2, because it cannot find such a context at radius 3 in a database.

idx = None
for atom in fragment.GetAtoms():
    if atom.GetAtomicNum() == 0:
        idx = atom.GetIdx()
        break

frag = Chem.EditableMol(fragment)
frag.ReplaceAtom(idx, Chem.Atom(1))
frag = frag.GetMol()
Chem.SanitizeMol(frag)

res = list(mutate_mol(frag, db_name='replacements.db', max_replacements=9, return_mol=True, min_size=0, max_size=0, radius=2))

@xescape
Copy link
Author

xescape commented Apr 12, 2022

It seems like removing the dummy atom and running SanitizeMol solved my problems. Thank you for the timely help!

@xescape xescape closed this as completed Apr 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants