Some molecules fail to generate BDE values with the pre-trained model, but work in the web API #12
Comments
Do you have an example of a molecule this happens for? |
@pstjohn is it possible to add more logging to Alfabet to find out why an empty result is returned? |
Possibly. Are you able to share an example molecule that gives an empty output? The best way to see the bonds the model is seeing is to call the fragmentation function directly:
from alfabet.fragment import get_fragments
get_fragments(smiles_string) |
I found 3 example molecules. CC(=O)OCC1=C\CC/C(C)=C/CC[C@@]2(C)CCC@@(/C=C/1)O2
|
Where did you get those SMILES strings? I'm getting rdkit parse errors:
>>> rdkit.Chem.MolFromSmiles('CC(=O)OCC1=C\CC/C(C)=C/CC[C@@]2(C)CCC@@(/C=C/1)O2')
[13:07:52] SMILES Parse Error: syntax error while parsing: CC(=O)OCC1=C\CC/C(C)=C/CC[C@@]2(C)CCC@@(/C=C/1)O2
[13:07:52] SMILES Parse Error: Failed parsing SMILES 'CC(=O)OCC1=C\CC/C(C)=C/CC[C@@]2(C)CCC@@(/C=C/1)O2' for input: 'CC(=O)OCC1=C\CC/C(C)=C/CC[C@@]2(C)CCC@@(/C=C/1)O2'
I'm not sure why alfabet isn't raising those errors directly, that's probably something I can fix |
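In SMILES, chirality markers (`@`, `@@`) are only valid inside a bracket atom such as `[C@@]`, so a bare `@@` like the one in the string above is one thing a parser can reject. A minimal pre-check, assuming nothing beyond the standard library, might look like the sketch below. The helper name `bare_chirality_marks` is hypothetical and is not part of alfabet or rdkit; it only flags this one kind of damage and does not replace a real parser.

```python
def bare_chirality_marks(smiles: str) -> list[int]:
    """Return the indices of '@' characters that sit outside a [...] bracket atom."""
    positions = []
    depth = 0  # > 0 while we are inside a bracket atom
    for i, ch in enumerate(smiles):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth = max(depth - 1, 0)
        elif ch == "@" and depth == 0:
            positions.append(i)
    return positions


# Raw string, because SMILES uses backslashes for double-bond geometry.
pasted = r"CC(=O)OCC1=C\CC/C(C)=C/CC[C@@]2(C)CCC@@(/C=C/1)O2"
bad = bare_chirality_marks(pasted)
if bad:
    print(f"'@' outside a bracket atom at positions {bad}; the string may have been mangled")
```

A non-empty result suggests the string lost its brackets somewhere between the source database and the place it was pasted.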
The SMILES strings are from the ChEMBL database. It seems some errors were introduced when I copied the SMILES strings into the GitHub webpage, because there are "@" characters in the strings. The web page (e.g. https://bde.ml.nrel.gov/result?name=CN1C%5BC%40H%5D2CN%28C%29C%5BC%40H%5D%28C1%29%5BC%40%40%5D2%28O%29c1ccccc1) gives the result, but the Python package returns an empty DataFrame. |
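Because the result URL above carries the SMILES string percent-encoded in its `name` query parameter, one way to recover the exact string without going through a copy-paste step is to decode it with the standard library. This is only a sketch: the helper name `smiles_from_result_url` is hypothetical, and it assumes the parameter is always called `name`, as in the URL quoted in this thread.

```python
from urllib.parse import parse_qs, urlsplit


def smiles_from_result_url(url: str) -> str:
    """Extract and percent-decode the 'name' query parameter of a result URL."""
    query = parse_qs(urlsplit(url).query)
    return query["name"][0]


url = (
    "https://bde.ml.nrel.gov/result?name="
    "CN1C%5BC%40H%5D2CN%28C%29C%5BC%40H%5D%28C1%29%5BC%40%40%5D2%28O%29c1ccccc1"
)
smiles = smiles_from_result_url(url)
print(smiles)
```

The decoded string keeps its `[` `]` and `@` characters intact, so it can be passed to the Python package as is when reproducing the problem.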
Fixes #12, likely introduced in 0.4.0
Thanks for this, I see the issue now. I likely introduced this in 0.4.0; I have a fix and test (I think) in #13 |
Releasing a new patch version 0.4.1 to handle this -- let me know if you still have issues |
Empty DataFrame
Columns: [bond_type, fragment1, fragment2, is_valid_stereo, bde_pred, bdfe_pred, is_valid, molecule, bond_index, bde, bdfe, set]
Index: []