Farthest point sampling method #13
This method is a filter for the training set and can be used to reduce its size easily. It chooses the N structures that are farthest from each other, measured by distances between the sorted symmetry function values of each structure.
This reverts commit 5aa03fe. revert
added nnp-fpssampling support
- Merge branch 'master' of github.com:CompPhysVienna/n2p2
- Rename tool to nnp-fps
- Adapt coding style, minor code changes
- Internal log file usage
- Added CI tests for tool
- Updated test_nnp.h to asynchronous IO and std::future buffer (long output buffers would not fit in ipstream?)
- Added (unfinished) documentation
Codecov Report
@@ Coverage Diff @@
## master #13 +/- ##
==========================================
+ Coverage 58.33% 59.42% +1.08%
==========================================
Files 78 79 +1
Lines 11380 11555 +175
==========================================
+ Hits 6639 6866 +227
+ Misses 4741 4689 -52
I have a little difficulty understanding the reasoning behind the ordering of
because it completely shuffles the order of symmetry functions and atoms... I have to think about it...
I think it would be reasonable to move the farthest point sampling from the level of structures to the atomic level, i.e. to compare not the combined symmetry function vectors of entire structures but rather the atomic environment fingerprints of individual atoms.
The method selects those N structures from the training set that are farthest from each other in terms of a distance norm in symmetry function values. To this end, it collects all symmetry function values for a single structure into a vector "allG", sorts allG, and calculates the distance 1/n*|allG[i]-allG[j]| between structures i and j, which is only defined for structures with the same number of atoms and elements.
This application was inspired by
https://doi.org/10.1063/1.5024611
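To make the selection procedure described above concrete, here is a minimal Python sketch of greedy farthest point sampling on sorted symmetry function vectors. It is not the nnp-fps implementation. The function names, the choice of the first structure as the starting point, the reading of n as the vector length, and the reading of |...| as the Euclidean norm are all assumptions made for illustration. The sketch also assumes that all structures have the same composition, so that every allG vector has the same length.

```python
import math


def sorted_fingerprint(sym_funcs_per_atom):
    """Collect all symmetry function values of one structure and sort them.

    sym_funcs_per_atom: one list of symmetry function values per atom.
    Sorting makes the vector independent of the atom order (hypothetical helper).
    """
    return sorted(g for atom in sym_funcs_per_atom for g in atom)


def distance(a, b):
    """Assumed metric: Euclidean norm of (a - b), divided by the vector length n."""
    if len(a) != len(b):
        raise ValueError("distance is only defined for vectors of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))) / len(a)


def farthest_point_sampling(fingerprints, n_select, start=0):
    """Greedily pick n_select indices that are far apart from each other.

    Starting from index `start` (an arbitrary choice in this sketch), each step
    adds the candidate whose minimum distance to the selected set is largest.
    """
    selected = [start]
    # min_dist[k]: distance from candidate k to its nearest selected structure.
    min_dist = [distance(fingerprints[start], f) for f in fingerprints]
    while len(selected) < min(n_select, len(fingerprints)):
        nxt = max(range(len(fingerprints)), key=min_dist.__getitem__)
        selected.append(nxt)
        for k, f in enumerate(fingerprints):
            min_dist[k] = min(min_dist[k], distance(fingerprints[nxt], f))
    return selected


if __name__ == "__main__":
    # Four toy "structures" with two atoms and one symmetry function each.
    structures = [
        [[0.0], [0.1]],
        [[0.1], [0.0]],  # same as the first structure, atoms permuted
        [[5.0], [5.0]],
        [[2.0], [3.0]],
    ]
    all_g = [sorted_fingerprint(s) for s in structures]
    print(farthest_point_sampling(all_g, 3))
```

Because the values are sorted, the first two toy structures produce identical vectors and have distance zero, so the permuted copy is only picked once every distinct structure has been selected. This also illustrates the concern raised in the review: sorting discards which atom and which symmetry function a value belongs to.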