-
Notifications
You must be signed in to change notification settings - Fork 548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Function - Save the latent representation for a molecules #119
Conversation
This pull request introduces 5 alerts when merging a2927bf into 1e7d122 - view on LGTM.com new alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works well, have tested it also on single molecule predictions and manually checked that the outputted fingerprints are actually the MPNN output.
There are some minor errors in the description of the model_fingerprint
function, seems like a copy-paste error, or possible a relic from a previous function.
Seems good to me otherwise!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thank you!
Kevin, I briefly fell for the same, but Simon actually implemented the last hidden layer of the FNN to be saved, not the MPNN output (which are two different use cases, one outputting the last latent representation before outputting the target values, and the other outputting the convoluted, aggregated atomic fingerprints after message passing). So I think we are good (and should have both options) |
Thanks for pointing that out @hesther! I agree, both of these should be available options. |
4850364
to
5c06e1b
Compare
@hesther @kevingreenman @mliu49 following pr #135 I was able to remove a little bit of weird trickiness from this one about how it handles smiles_columns. I think it's all good now and it runs successfully for one mol, two mols, and by command line. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, and tested a bit - works fine from my point of view! Only issue that might arise: Once PR 137 is merged, we will need to update the arguments checking/reading/scaler section in chemprop/train/molecule_fingerprint.py
according to chemprop/train/make_predictions.py
@mliu49 can we merge this? |
I just merged #137. Could you update the arguments to molecule_fingerprint.py as suggested by Esther, and then rebase on master? (no need to squash the two commits) |
Make compatible with multiple molecules and newer chemprop Enable fingerprint command line function and adds description to readme Clean up import line Removing scratch work comments and fixing function description Change molecule number assertion to a raised error Change the assert for a fingerprint from multiple checkpoints to error
5c06e1b
to
214af42
Compare
I think this looks good. Somehow I missed the fact that there's significant code overlap with make_predictions. It would be really nice if some overlap could be split off into individual functions to be imported by molecule_fingerprints in the future. |
I added a new function that saves the fingerprint vector for a set of provided molecules. The function mirrors what is going on in predictions, except that the output is what is provided by the MPN before it runs through the FFN (excludes features concatenated onto the vector).
The function works through the command line (when
pip install -e .
is run again) and when using two molecules.Brief explanation of the file changes: