Your benchmark is very interesting and I would like to do some experiments with it, but I haven't found instructions on how to use your pre-trained models.
Would you mind telling me whether the following code correctly loads and uses your models?

    import numpy as np
    import pandas as pd
    import h5py
    from tensorflow import keras
    from rdkit.Chem import AllChem
    from rdchiral.main import rdchiralRunText


    def get_fingerprint(smiles: str) -> np.ndarray:
        mol = AllChem.MolFromSmiles(smiles)
        assert mol is not None
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)  # QUESTION: is this the right fingerprint?
        return np.array(fp, dtype=float)


    # Load templates
    df_templates = pd.read_hdf("./data/uspto_rxn_n5_unique_templates.hdf5", key="table")

    # Load model, defining custom metrics because without these it gave an error...
    model = keras.models.load_model(
        "./data/uspto_rxn_n5_keras_model.hdf5",
        custom_objects={
            "top10_acc": keras.metrics.TopKCategoricalAccuracy(k=10, name="top10_acc"),
            "top50_acc": keras.metrics.TopKCategoricalAccuracy(k=50, name="top50_acc"),
        },
    )

    # Example use case: run the best reaction for the first 2 targets
    test_smiles = [
        "O=C(O)COCCOCCOCCOCCOCCOCCOCC(F)(F)F",
        "COc1cc(N)c(Cl)cc1C(=O)NCCCC1CN(Cc2ccccc2)CCO1",
    ]
    x = np.stack([get_fingerprint(s) for s in test_smiles])
    template_probs = model(x).numpy()
    most_likely_reactions = template_probs.argmax(axis=1)
    for i, sm in enumerate(test_smiles):
        reactants = rdchiralRunText(df_templates["retro_template"].values[most_likely_reactions[i]], sm)
        print(f"{i}: {reactants} >> {sm}")

This code runs and produces the following output (in particular, the second reaction fails). Is this the output that you would expect?
Thanks for trying out PaRoutes and for the feedback.
We chose not to provide extensive documentation at this time because we cannot foresee all the possible use cases that might come up. This is an excellent question, and I will write up some notes on it.
I believe your code reproduces my procedure, which is to use the models together with the aizynthfinder package. I will explain this procedure, but first I would like to emphasize that, yes, the predicted reactants are what you would obtain from the first predicted template. However, we typically look at the top 20 or even the top 50 templates, precisely to avoid situations like your second example, where the first predicted template is not applicable. The second-ranked template would produce these reactants: COc1cc(N)c(Cl)cc1C(=O)Cl.NCCCC1CN(Cc2ccccc2)CCO1.
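The idea of falling back to lower-ranked templates can be sketched as follows. This is a minimal illustration, not the procedure used in PaRoutes or aizynthfinder: the function name `first_applicable_template` and its parameters are hypothetical, and `apply_template` stands for any callable that takes a template and a target SMILES and returns a (possibly empty) list of outcomes, such as `rdchiralRunText` from the code in the question.

```python
import numpy as np


def first_applicable_template(probs, templates, apply_template, target, top_k=50):
    """Return (rank, template, outcomes) for the best-ranked template that applies.

    Hypothetical helper: `probs` is one row of template probabilities,
    `templates` is an indexable sequence aligned with `probs`, and
    `apply_template(template, target)` returns a list of precursor SMILES.
    Returns None if none of the top_k templates yields any outcome.
    """
    # Indices of the top_k most probable templates, most probable first.
    ranked = np.argsort(probs)[::-1][:top_k]
    for rank, idx in enumerate(ranked):
        outcomes = apply_template(templates[idx], target)
        if outcomes:  # an empty list means the template did not match
            return rank, templates[idx], outcomes
    return None
```

With the variables from the question, a call might look like `first_applicable_template(template_probs[i], df_templates["retro_template"].values, rdchiralRunText, sm, top_k=50)`.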
The output of the do_expansion method is a list of tuples of reactions. Each tuple represents a unique set of precursors that could have arisen from different templates. So what I am printing in the example is the first predicted set of precursors from the first template.
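To make the grouping concrete, here is a small sketch of how outcomes from several templates could be collapsed into unique precursor sets. It only illustrates the idea and is not aizynthfinder's implementation; the function `group_by_precursors` and the `(template_id, precursor_smiles)` input format are assumptions made for the example.

```python
from collections import defaultdict


def group_by_precursors(outcomes):
    """Group (template_id, precursor_smiles) pairs by their set of precursors.

    Hypothetical helper: returns a list of (precursors, template_ids) pairs,
    one per unique precursor set, in order of first appearance.
    """
    groups = defaultdict(list)
    for template_id, precursors in outcomes:
        # The order of dot-separated components is irrelevant, so sort them
        # to get a canonical key for the precursor set.
        key = tuple(sorted(precursors.split(".")))
        groups[key].append(template_id)
    return [(key, tuple(ids)) for key, ids in groups.items()]
```

For example, two templates that produce `A.B` and `B.A` end up in the same group, while a template that produces `C` forms its own group. The sketch compares SMILES as plain strings, so in practice the strings would need to be canonicalized first.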
Hopefully this helps and you will be able to use the trained model from PaRoutes.
Thank you in advance for answering my question. Great manuscript and keep up the good open source work! 💯