Adding Stereoisomer Enumeration #21

Feriolet · 2024-02-27T05:02:35Z

No description provided.


DrrDom · 2024-02-27T21:05:29Z

easydock/database.py

+def get_isomers(mol):
+    opts = StereoEnumerationOptions(tryEmbedding=True,maxIsomers=32,rand=0xf00d)


max_isomers should be added to the function arguments and exposed as a script-level (argparse) argument with a default value of 1.

Please add spaces after the commas.

DrrDom · 2024-02-27T21:06:35Z

easydock/database.py

+            if bond.GetStereo() == Chem.BondStereo.STEREOANY:
+                bond.SetStereo(Chem.rdChem.BondStereo.STEREONONE)
+        isomers = tuple(EnumerateStereoisomers(mol,options=opts))
+    return isomers

 def init_db(db_fname, input_fname, prefix=None):


max_isomers should be added to the arguments with a default value of 1.

DrrDom · 2024-02-27T21:07:52Z

easydock/database.py

+        isomers = get_isomers(mol)
+        for stereo_index, stereo_mol in enumerate(isomers):
+            smi = Chem.MolToSmiles(stereo_mol, isomericSmiles=True)
+            if prefix:
+                mol_name = f'{prefix}-{mol_name}'
+            if mol_is_3d(mol):
+                data_mol.append((mol_name, stereo_index, smi, Chem.MolToMolBlock(stereo_mol)))
+            else:
+                data_smi.append((mol_name, stereo_index, smi))


        if prefix:
            mol_name = f'{prefix}-{mol_name}'
        if mol_is_3d(mol):
            data_mol.append((mol_name, 0, smi, Chem.MolToMolBlock(stereo_mol)))
        else:
            isomers = get_isomers(mol, max_isomers=max_isomers)
            for stereo_index, stereo_mol in enumerate(isomers):
                smi = Chem.MolToSmiles(stereo_mol, isomericSmiles=True)
                data_smi.append((mol_name, stereo_index, smi))

Enumeration is only needed if the input is not a 3D molecule.

I added max_isomers to the input arguments.

I added 0 as the default stereo_id value.

DrrDom · 2024-02-27T21:37:26Z

I looked through the changes and left some specific notes. Below are more general ones:

We could remove the complex_id argument and use mol_id and stereo_id by default to name individual records. This will simplify the code. Breaking backward compatibility should be a minor issue. The default value of stereo_id should be 0.
Let's implement the alternative protonation scheme in a separate pull request. It is a little more complex: we have to add a command-line argument so that a user can choose which protonation utility to use from the list of supported ones. Dimorphite is not the best one, and in the future I'm thinking of supporting different tools. Therefore, the implementation should not be hard-coded.
Please insert spaces where they are usually needed and replace stereo_index with stereo_id to be consistent with the database field names.


Feriolet · 2024-03-04T13:40:14Z

I have edited some of the commits based on your suggestions. Are there any more edits that I have missed?

DrrDom · 2024-03-08T17:55:18Z

easydock/database.py

@@ -136,23 +139,38 @@ def restore_setup_from_db(db_fname):

    return d, tmpfiles

+def get_isomers(mol, max_isomers):


Please add the default value max_isomers=1.

DrrDom · 2024-03-08T17:56:19Z

easydock/database.py


-def init_db(db_fname, input_fname, prefix=None):
+def init_db(db_fname, input_fname, max_isomers, prefix=None):


Please add the default value max_isomers=1.
This will make the code more compatible with previous versions, because previously we generated one random stereoisomer by default.
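
The backward-compatibility argument can be illustrated without RDKit. The following is only a sketch under stated assumptions: `init_db_sketch` is a hypothetical stand-in that records its arguments, and placing the new parameter after `prefix` is a choice made for this illustration, not the signature used in the PR.

```python
def init_db_sketch(db_fname, input_fname, prefix=None, max_isomers=1):
    """Hypothetical stand-in for init_db: it only reports the arguments it received."""
    return {
        'db_fname': db_fname,
        'input_fname': input_fname,
        'prefix': prefix,
        'max_isomers': max_isomers,
    }


# A call written before the new parameter existed still works
# and keeps the earlier behaviour of one stereoisomer per molecule.
old = init_db_sketch('out.db', 'in.smi', prefix='lib')

# A new call opts in to enumeration explicitly.
new = init_db_sketch('out.db', 'in.smi', prefix='lib', max_isomers=8)

print(old['max_isomers'], new['max_isomers'])
```

Without a default, every existing caller would have to be updated to pass the new argument.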

DrrDom · 2024-03-08T17:57:05Z

easydock/database.py

        if prefix:
            mol_name = f'{prefix}-{mol_name}'
        if mol_is_3d(mol):
-            data_mol.append((mol_name, smi, Chem.MolToMolBlock(mol)))
+            smi = Chem.MolToSmiles(mol, isomericSmiles=True)
+            data_mol.append((mol_name, 0, smi, Chem.MolToMolBlock(stereo_mol)))


I think this should be mol, not stereo_mol, because stereo_mol is not defined in the 3D branch.

DrrDom · 2024-03-08T18:07:15Z

easydock/database.py

@@ -305,15 +324,16 @@ def add_protonation(db_fname, tautomerize=True, table_name='mols', add_sql=''):
            fd, output = tempfile.mkstemp()  # use output file to avoid overflow of stdout in extreme cases
            try:
                for smi, _, mol_id in data_list:
-                    tmp.write(f'{smi}\t{mol_id}\n')
+                    tmp.write(f'{smi}\t{mol_id}\t{stereo_id}\n')


I think we have to join mol_id and stereo_id; otherwise stereo_id may not become part of the title of the output molecule.

tmp.write(f'{smi}\t{mol_id}_{stereo_id}\n')

DrrDom · 2024-03-08T18:09:28Z

easydock/database.py

                for mol in Chem.SDMolSupplier(output, sanitize=False):
                    if mol:
-                        mol_name = mol.GetProp('_Name')
+                        mol_name, stereo_id = mol.GetProp('_Name').rsplit(maxsplit=1)


This should be adjusted to the previous comment

mol_name, stereo_id = mol.GetProp('_Name').rsplit('_', 1)
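
The round trip proposed in the two comments above can be checked in isolation. This is a minimal sketch, not the code from the PR: the helper names `make_title` and `split_title` and the example molecule name are hypothetical. It shows why the split is done from the right with a limit of one: a molecule name may itself contain underscores.

```python
from typing import Tuple


def make_title(mol_id: str, stereo_id: int) -> str:
    """Join the molecule id and stereoisomer id into a single title field."""
    return f'{mol_id}_{stereo_id}'


def split_title(title: str) -> Tuple[str, int]:
    """Recover (mol_id, stereo_id); only the last underscore is a separator."""
    mol_id, stereo_id = title.rsplit('_', 1)
    return mol_id, int(stereo_id)


title = make_title('lib_A-mol_12', 3)
print(title)
print(split_title(title))
```

With an underscore-joined title, a whitespace-based `rsplit(maxsplit=1)` returns a single element, so unpacking it into two names would raise a `ValueError`.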

DrrDom · 2024-03-08T18:13:51Z

easydock/run_dock.py

@@ -173,6 +173,8 @@ def main():
                        help='number of cpus. This affects only docking on a single server.')
    parser.add_argument('-v', '--verbose', action='store_true', default=False,
                        help='print progress to STDERR.')
+    parser.add_argument('--max_isomers', metavar='N_STEREO', type=int, required=False, default=1,
+                        help='maximum number of isomers to enumerate. The default is set to 1.')


Please replace isomer with stereoisomer to be as explicit as possible.
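
A sketch of what the renamed option might look like. The parameter name max_stereoisomers appears in a later commit message of this PR, but the exact flag spelling, metavar and help text here are assumptions, and `build_parser` is a hypothetical helper rather than the parser in run_dock.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser holding only the stereoisomer option (illustrative)."""
    parser = argparse.ArgumentParser(description='sketch of the stereoisomer option')
    parser.add_argument('--max_stereoisomers', metavar='INTEGER', type=int,
                        required=False, default=1,
                        help='maximum number of stereoisomers to enumerate for '
                             'each input molecule. Default: 1.')
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(['--max_stereoisomers', '4'])
print(args.max_stereoisomers)
```

Keeping the default at 1 means that running the script without the option enumerates no more than one stereoisomer per molecule.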

DrrDom · 2024-03-08T18:23:09Z

Once those comments are addressed, I think we can test and accept the PR.

Regarding the replacement of ChemAxon: I'll implement a generic interface that should make it easy to integrate third-party protonation tools, and you will then be able to commit Dimorphite-DL. This will be really useful - #17.

Feriolet · 2024-03-12T01:46:09Z

Alright, I have edited the naming. Can you check if that is alright before testing it?

DrrDom · 2024-03-12T07:28:23Z

It looks OK. I do not see any issues with the main functionality. It can be tested.

There will be one more change required: get_sdf_from_dock_db.py. It should also be adapted to stereoisomers, but I cannot yet decide how to do this while keeping backward compatibility and ease of use. I suggest doing that after we merge the current changes. I'll probably fix this script myself. Its logic is a little complex because we use it to extract data from similar but not identical database structures, and keeping this compatibility is very convenient for other projects.


DrrDom · 2024-03-14T09:54:28Z

It seems that everything works fine. Hope so)
Thanks a lot for the contribution!
I hope to implement a new interface soon that makes it easy to integrate custom protonation tools.

Adding Stereoisomer Enumeration and Modifying ChemaXon protonation to Dimorphite_dl

d797435

DrrDom reviewed Feb 27, 2024

Feriolet added 3 commits February 28, 2024 10:52

Updating pull request and revert protonation method back to ChemAxon

1fe8b43

Updating run_dock.py to have maxstereoisomer in init_db function

99319e3

Removing complex_id argument

ff2a105

Feriolet changed the title ~~Adding Stereoisomer Enumeration and Modifying ChemaXon protonation to Dimorphite_dl~~ Adding Stereoisomer Enumeration Feb 28, 2024

Feriolet added 4 commits February 28, 2024 14:00

Update init_db

b450d77

Fixing error of init_db function

7b87a18

Fixing error of max_isomer argument parse

115531f

Save only one stereoisomer for one mol_name (id) based on the best docking score

ca5c668

DrrDom reviewed Mar 8, 2024

Feriolet added 2 commits March 11, 2024 20:35

fix stereoisomer name and parameter

31ac14e

fix stereoisomer name and parameter

fc41f5a

DrrDom added 4 commits March 13, 2024 16:04

Change position of max_stereoisomers in arg list

7573791

Fix error in add_protonation and change treatment of stereo_id in save_sdf

a1cb826

Fix potential issues in get_isomers function

7f60c8f

Adapt get_sdf_from_dock_db.py to stereo_id and fix retrieval of arbitrary poses

17e01e4

DrrDom merged commit c2af6c3 into ci-lab-cz:master Mar 14, 2024

DrrDom mentioned this pull request Mar 14, 2024

adding stereoisomer enumeration #20

Closed

Feriolet deleted the new_branch branch March 14, 2024 09:58
