Merged
Some complementary groups were not actually complementary; this ensures a complementary group is not used unless it is truly complementary.
Fixed small bug in atom set handling in extension generation
Fixed bug in complementary group handling. Complementary groups generated from bond-creation extensions were still treated as complementary by node generation even when they were not complementary for the training set. The change checks whether a group is complementary for the training data at that node, and only adds it as a complementary node if it is complementary with respect to that node's associated training data.
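The check described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual codebase API: the names (`complementary_for`, `add_complementary_nodes`, `node.training_data`, groups as plain sets) are assumptions for demonstration.

```python
def complementary_for(group, complement, data):
    """A pair of groups is complementary for a dataset if every
    datapoint matches exactly one of the two groups (XOR)."""
    return all((x in group) != (x in complement) for x in data)


def add_complementary_nodes(node, candidate_pairs):
    """Keep only pairs that are truly complementary for the training
    data associated with THIS node, rather than pairs that were
    complementary for the extensions they were generated from."""
    kept = []
    for group, complement in candidate_pairs:
        if complementary_for(group, complement, node.training_data):
            kept.append((group, complement))
    return kept
```

The key point of the fix is that complementarity is re-verified against the node's own training data at node-generation time, instead of being inherited from the bond-creation step.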
Removed a duplicate line of code
Small improvement to dictionary generation
Add weighting of multi-evaluation regressor node selection by occurrence (along with uncertainty) and make it the default. The application I originally developed the algorithm for involved training data distributed differently from the prediction cases, but in most applications one should assume the training and prediction distributions are the same. This change matters little for larger training sets and does not always improve model performance, but it makes a very significant difference in the consistency of model performance in the <1000-datapoint regime.
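The occurrence-plus-uncertainty weighting could look something like the sketch below. This is a hypothetical illustration, not the PR's actual implementation: the scoring formula and the `(occurrences, uncertainty)` node representation are assumptions.

```python
def select_node(nodes):
    """Among candidate regressor nodes for a prediction case, favor
    nodes that occurred often in training (high occurrence count)
    and have low predictive uncertainty. Each node is represented
    here as an (occurrences, uncertainty) pair.

    Selecting purely by lowest uncertainty can prefer rarely-seen
    nodes whose uncertainty estimate is itself unreliable; dividing
    by uncertainty while multiplying by occurrence assumes the
    training distribution matches the prediction distribution.
    """
    def score(node):
        occurrences, uncertainty = node
        return occurrences / (uncertainty + 1e-12)  # avoid div-by-zero

    return max(nodes, key=score)
```

Under this scoring, a node seen 100 times with moderate uncertainty beats a node seen twice with very low uncertainty, which is the behavior the change aims for on small training sets.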