Farthest point sampling method #13

Open
wants to merge 6 commits into
base: master

AKaiser85

The method selects the N structures from the training-set that are farthest from each other in terms of a distance norm in symmetry function values. To this end, it collects all symmetry function values of a single structure into a vector "allG", sorts allG, and calculates the distance 1/n*|allG[i]-allG[j]| between structures with the same number of atoms and elements.
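
The selection described above can be sketched as a greedy farthest point sampling loop. This is only a minimal illustration, not the code of this pull request: the function names `fpsDistance` and `farthestPointSampling` are hypothetical, the norm |.| is assumed to be Euclidean, the first structure is assumed as the seed, and all vectors are assumed to have the same length (i.e. the structures share the same number of atoms and elements).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper: distance 1/n * |a - b| between two sorted vectors.
// Assumption: |.| is the Euclidean norm and both vectors have length n > 0.
double fpsDistance(std::vector<double> const& a, std::vector<double> const& b)
{
    assert(!a.empty() && a.size() == b.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k)
    {
        double const d = a[k] - b[k];
        sum += d * d;
    }
    return std::sqrt(sum) / static_cast<double>(a.size());
}

// Greedy farthest point sampling over per-structure vectors "allG".
// Returns the indices of the selected structures in selection order.
std::vector<std::size_t> farthestPointSampling(
    std::vector<std::vector<double>> allG, // by value: sorted locally
    std::size_t                      numSelect)
{
    std::vector<std::size_t> selected;
    if (allG.empty() || numSelect == 0) return selected;
    numSelect = std::min(numSelect, allG.size());

    // Sort the values of each structure, as described in the PR text.
    for (auto& g : allG) std::sort(g.begin(), g.end());

    // minDist[i]: distance of structure i to its nearest selected structure.
    std::vector<double> minDist(allG.size(),
                                std::numeric_limits<double>::infinity());
    std::vector<bool> chosen(allG.size(), false);

    std::size_t current = 0; // Assumption: seed with the first structure.
    while (true)
    {
        selected.push_back(current);
        chosen[current] = true;
        if (selected.size() == numSelect) break;

        // Update distances with the newest pick and find the farthest
        // structure that has not been chosen yet.
        std::size_t next = allG.size();
        for (std::size_t i = 0; i < allG.size(); ++i)
        {
            if (chosen[i]) continue;
            minDist[i] = std::min(minDist[i],
                                  fpsDistance(allG[i], allG[current]));
            if (next == allG.size() || minDist[i] > minDist[next]) next = i;
        }
        current = next;
    }
    return selected;
}
```

Keeping a running minimum distance per structure means each new pick costs one pass over the data set instead of a full pairwise comparison.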

This application was inspired by
https://doi.org/10.1063/1.5024611

This method is a filter for the training-set and can be used to reduce the training-set size easily. It chooses those N structures that are farthest from each other in terms of distances in sorted symmetry functions per structure.
added nnp-fpssampling support
- Merge branch 'master' of github.com:CompPhysVienna/n2p2
- Rename tool to nnp-fps
- Adapt coding style, minor code changes
- Internal log file usage
- Added CI tests for tool
- Updated test_nnp.h to asynchronous IO and std::future buffer
  (long output buffers would not fit in ipstream?)
- Added (unfinished) documentation
@codecov-commenter

codecov-commenter commented Sep 4, 2020

Codecov Report

Merging #13 into master will increase coverage by 1.08%.
The diff coverage is 92.00%.


@@            Coverage Diff             @@
##           master      #13      +/-   ##
==========================================
+ Coverage   58.33%   59.42%   +1.08%     
==========================================
  Files          78       79       +1     
  Lines       11380    11555     +175     
==========================================
+ Hits         6639     6866     +227     
+ Misses       4741     4689      -52     
| Flag | Coverage Δ |
|---|---|
| #cpp | 60.56% <92.00%> (+1.22%) ⬆️ |
| #python | 59.42% <92.00%> (+1.08%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| src/libnnp/Structure.cpp | 66.60% <75.00%> (+0.47%) ⬆️ |
| src/application/nnp-fps.cpp | 93.71% <93.71%> (ø) |
| src/libnnp/SymmetryFunction.cpp | 94.77% <0.00%> (+0.74%) ⬆️ |
| src/libnnp/SymmetryFunctionAngularNarrow.cpp | 84.54% <0.00%> (+0.90%) ⬆️ |
| src/libnnp/SymmetryFunctionRadial.cpp | 78.51% <0.00%> (+1.65%) ⬆️ |
| src/libnnp/SymmetryFunctionAngularWide.cpp | 82.29% <0.00%> (+2.39%) ⬆️ |
| src/libnnp/Mode.cpp | 72.56% <0.00%> (+4.86%) ⬆️ |
| src/libnnp/Element.cpp | 84.97% <0.00%> (+9.84%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update af08e78...11f1240.

@singraber
Member

I have a little difficulty understanding the reasoning behind the ordering of allG in Structure.cpp:
https://github.com/AKaiser85/n2p2/blob/d21ab22a9cb75765e3952dd14162f517414a5287/src/libnnp/Structure.cpp#L888-L893

    // sort them elementwise
    for (vector<vector<float>>::iterator it1 = allG.begin();
         it1 != allG.end(); ++it1)
    {
        sort(it1->begin(), it1->end());
    }

because it completely shuffles the order of symmetry functions and atoms... I have to think about it...
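
To illustrate the concern, here is a minimal sketch. It assumes that each inner vector of allG holds all symmetry function values of one structure in a fixed (atom, symmetry function) order; the exact layout in the PR may differ, and the helper names `sortedCopy` and `sameAfterSort` are hypothetical. Two structures whose values are assigned to different atoms or symmetry functions become identical once sorted, so their distance would be zero.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: sorted copy of one structure's symmetry function
// values, mirroring the sort(it1->begin(), it1->end()) call quoted above.
std::vector<float> sortedCopy(std::vector<float> g)
{
    std::sort(g.begin(), g.end());
    return g;
}

// True if two structures can no longer be told apart after sorting,
// i.e. they contain the same multiset of values.
bool sameAfterSort(std::vector<float> const& a, std::vector<float> const& b)
{
    return sortedCopy(a) == sortedCopy(b);
}
```

For example, with the assumed layout [G1(atom 1), G2(atom 1), G1(atom 2), G2(atom 2)], the vectors {0.1, 0.9, 0.5, 0.3} and {0.9, 0.1, 0.3, 0.5} describe different environments, yet `sameAfterSort` returns true for them.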

@singraber
Member

I think it would be reasonable to move the farthest point sampling from the level of structures to the atomic level, i.e., to compare the atomic environment fingerprints of individual atoms rather than the combined symmetry function vectors of entire structures.

@singraber added the labels "enhancement" (New feature or request) and "tool" (Adds a new application to the tools collection) Sep 5, 2020