Bugfix for atom messages concerning bond feature vector #138
Just confirmed a bug when using atom messages:
We set up the bond features in the order atom(length 133)-bond(length 14):
chemprop/chemprop/features/featurization.py
Lines 171 to 172 in 8258fcb
but then cut out the first (!) 14 values, instead of the last:
chemprop/chemprop/features/featurization.py
Lines 267 to 270 in 8258fcb
I added a few print statements and confirmed that we did use the wrong features for atom messages: the first 14 values are one-hot encoded atom feature elements, and thus rather meaningless as bond features. I have corrected this bug by simply cutting out the last 14 values instead of the first. Alternatively, we could change the order in which the bond vectors are set up, but this would not be backwards compatible, even when atom messages are not used.
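To illustrate the slicing issue, here is a minimal standalone sketch. It is not the chemprop code itself: the helper names (`make_combined`, `bond_part_buggy`, `bond_part_fixed`) and the dummy values are made up for illustration, and only the lengths 133 and 14 and the atom-then-bond order are taken from the description above.

```python
ATOM_FDIM = 133  # atom feature length, as quoted in this PR
BOND_FDIM = 14   # bond feature length, as quoted in this PR


def make_combined(atom_features, bond_features):
    """Hypothetical helper: concatenate in atom-then-bond order."""
    return list(atom_features) + list(bond_features)


def bond_part_buggy(combined):
    """Old behaviour: takes the FIRST 14 values, i.e. atom features."""
    return combined[:BOND_FDIM]


def bond_part_fixed(combined):
    """Fixed behaviour: takes the LAST 14 values, i.e. the bond features."""
    return combined[-BOND_FDIM:]


# Dummy data: a one-hot style atom vector and an easily recognisable bond vector.
atom = [0.0] * ATOM_FDIM
atom[5] = 1.0
bond = [float(i + 1) for i in range(BOND_FDIM)]

combined = make_combined(atom, bond)

print(bond_part_buggy(combined) == atom[:BOND_FDIM])  # buggy slice returns atom features
print(bond_part_fixed(combined) == bond)              # fixed slice returns bond features
```

Slicing from the end leaves the layout of the combined vector untouched, which is why this variant of the fix stays backwards compatible.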
A quick check of how this affects performance: