Currently, the REINVENT 4 Mol2MolSimilarity model generates new molecules with a similar 2D structure but not necessarily a similar 3D structure. Our goal is to train the REINVENT 4 Mol2MolSimilarity model to produce molecules with a 3D structure similar to the input molecule.
Approach 1
To achieve this, we will train the Mol2MolSimilarity model to generate new molecules with a similar 3D shape. The training process involves the following steps:
Input the molecule into the Mol2MolSimilarity model.
Pass the generated molecule to smiles-to-3d, which will generate 3D conformers from the SMILES notation.
smiles-to-3d will produce an SDF file, which we will then pass to vsflow to obtain the similarity score.
Use this similarity score as feedback for the Mol2MolSimilarity model to improve its performance.
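The steps above can be sketched as a scoring function that returns one 3D-shape score per generated SMILES. This is only a minimal sketch under assumptions: `embed_conformers` and `shape_similarity` are hypothetical placeholders standing in for smiles-to-3d and vsflow (their real interfaces are not shown here), and how the scores are fed back into REINVENT 4 is left out.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical signatures (assumptions, not the real tool interfaces):
#   embed_conformers(smiles) -> path to an SDF file, or None on failure
#   shape_similarity(reference_sdf, candidate_sdf) -> similarity, ideally in [0, 1]
Embedder = Callable[[str], Optional[str]]
ShapeScorer = Callable[[str, str], float]


def score_generated(
    reference_sdf: str,
    generated_smiles: Sequence[str],
    embed_conformers: Embedder,
    shape_similarity: ShapeScorer,
) -> List[float]:
    """Return one 3D-shape score per generated SMILES (0.0 on any failure)."""
    scores: List[float] = []
    for smiles in generated_smiles:
        try:
            candidate_sdf = embed_conformers(smiles)
            if candidate_sdf is None:
                # No conformer could be generated for this molecule.
                scores.append(0.0)
                continue
            raw = float(shape_similarity(reference_sdf, candidate_sdf))
        except Exception:
            # Any failure in the external tools counts as the worst score,
            # so a single bad molecule does not abort the whole batch.
            scores.append(0.0)
            continue
        # Clamp to [0, 1] so the feedback signal stays bounded.
        scores.append(min(1.0, max(0.0, raw)))
    return scores
```

Passing the two tools in as callables keeps the loop independent of the specific conformer generator and similarity tool, which is also what Approach 2 needs.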
Approach 2
While most steps are the same as in Approach 1, this approach explores alternative tools for generating 3D conformers and calculating 3D shape similarity scores. One such tool we can investigate is CHEESE.
Scope
Initiative 🐋
Objective(s)
To develop a model that can efficiently produce new molecules with a 3D structure similar to that of the input molecule.
We run a 3D shape similarity search via the CHEESE API, with the input molecule (molecule A) as the query and the Enamine REAL database as the reference library. From that search, we take the top N compounds (L), for example N = 100.
Of these N compounds, we use 80% as a training set for the transfer learning, 10% as a validation set, and 10% as a held-out test set.
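A minimal sketch of the 80/10/10 split, assuming the search hits are available as a plain list of SMILES strings. The function name `split_compounds`, the fixed seed, and the de-duplication step are illustrative choices, not part of the proposal.

```python
import random
from typing import List, Sequence, Tuple


def split_compounds(
    smiles: Sequence[str],
    train_frac: float = 0.8,
    valid_frac: float = 0.1,
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the hits and split them into train / validation / test lists."""
    # Drop exact duplicate strings (order-preserving) so the same SMILES
    # cannot end up in two different sets.
    unique = list(dict.fromkeys(smiles))
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    rng.shuffle(unique)
    n_train = int(round(train_frac * len(unique)))
    n_valid = int(round(valid_frac * len(unique)))
    train = unique[:n_train]
    valid = unique[n_train:n_train + n_valid]
    test = unique[n_train + n_valid:]  # whatever remains is held out
    return train, valid, test
```

With 100 hits this gives 80 training, 10 validation, and 10 test compounds. A purely random split can place very close analogues in different sets; if that turns out to matter, a scaffold-based split would be one alternative.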
We fine-tune REINVENT with transfer learning on the training set, monitoring its performance on the validation set.
With the test set, we make sure that, indeed, the test compounds are similar in 3D shape to molecule A. This 3D shape comparison can be done with VSFlow, if that is easier.
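One way to make this check concrete is to summarize the shape-similarity scores of the test compounds against molecule A. The sketch below assumes the scores have already been computed by an external tool (for example VSFlow) and are passed in as a list of floats; the function name and the 0.7 threshold are placeholders, not recommended values.

```python
from statistics import mean
from typing import Dict, Sequence


def summarize_shape_scores(
    scores: Sequence[float],
    threshold: float = 0.7,  # assumed cutoff for "similar in 3D shape"
) -> Dict[str, float]:
    """Summarize 3D-shape similarity scores of test compounds vs. molecule A."""
    if not scores:
        raise ValueError("no scores to summarize")
    n_passing = sum(1 for s in scores if s >= threshold)
    return {
        "n": float(len(scores)),
        "mean": mean(scores),
        "min": min(scores),
        "fraction_passing": n_passing / len(scores),
    }
```

Reporting the minimum alongside the mean helps spot individual test compounds that are not actually shape-similar to molecule A.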
I hope this makes sense?
Then, I have more ideas to complicate things further (for example, to search against other databases such as ZINC in CHEESE, to do multiple similarity searches, to penalize molecules that are similar in 2D (favouring scaffold hopping), etc.). But let's go step by step.
Please let me know if something is not clear, @ankitskvmdam !
Team
Timeline
TBD
Documentation