GAP with SOAP for a molecule #213
Comments
The positions MUST be in Angstroms, otherwise the force (in eV/A) is not the precise derivative of the energy. Make sure the forces are really forces and not gradients (as given by some quantum chemistry packages).
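As a minimal sketch of the two checks above, the snippet below converts positions from atomic units (Bohr) to Angstroms and turns energy gradients into forces. It assumes the gradients come in Hartree/Bohr, which is common for quantum chemistry packages but is not stated in this thread; the function names are illustrative only.

```python
# Conversion factors (CODATA 2018 values).
BOHR_TO_ANGSTROM = 0.529177210903
HARTREE_TO_EV = 27.211386245988


def positions_bohr_to_angstrom(positions):
    """Convert a list of (x, y, z) positions from Bohr to Angstrom."""
    return [tuple(c * BOHR_TO_ANGSTROM for c in xyz) for xyz in positions]


def gradients_to_forces(gradients):
    """Turn energy gradients (assumed Hartree/Bohr) into forces (eV/Angstrom).

    A force is the negative gradient, so the sign flips as well as the units.
    """
    factor = HARTREE_TO_EV / BOHR_TO_ANGSTROM
    return [tuple(-g * factor for g in grad) for grad in gradients]


# Position of the first atom (N) in the example frame, which is given in a.u.
n_position_bohr = [(-5.400440, 5.468773, 2.837348)]
print(positions_bohr_to_angstrom(n_position_bohr))
```

If the forces in the file are already true forces in eV/A, as the poster believes, only the position conversion is needed.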
On your soap command:
- (n_max, l_max) = (14, 14) is complete overkill and slows everything down. Set it to (12, 6) for high-accuracy results, or (8, 4) for something quicker.
- You need to add the "add_species=T" option, otherwise all atomic species are ignored.
- This is for further down the line, once you get the accuracy that you want: if you want a potential that can be used for high-temperature MD, you will need to do something to enforce atomic repulsion at close approach, either an explicitly fitted 2b potential (it can be added to the gap string; look at published papers) or some other baseline. In this case I recommend not using the "e0_method=average" option, but instead explicitly computing the isolated-atom energies and adding them to the training set (they will be picked up and used as e0 for each species).
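The last point can be sketched as follows: a small helper that writes one single-atom frame per species in the same extended-XYZ layout as the training file. The helper name and the zero energies are placeholders, not values from this thread; the real isolated-atom energies must be computed at the same level of theory as the training data, and the gap_fit documentation should be checked for how such frames are recognised.

```python
def isolated_atom_frame(species, energy_ev, box=200.0):
    """Return an extended-XYZ frame holding a single atom at the origin.

    The force on an isolated atom is zero by symmetry.
    """
    lattice = f"{box} 0.0 0.0 0.0 {box} 0.0 0.0 0.0 {box}"
    header = (
        f'Lattice="{lattice}" '
        "Properties=species:S:1:pos:R:3:forces:R:3 "
        f'energy={energy_ev:.6f} pbc="T T T"'
    )
    atom = f"{species} 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000"
    return f"1\n{header}\n{atom}\n"


# PLACEHOLDER energies: replace each 0.0 with the computed isolated-atom energy.
placeholder_energies = {"H": 0.0, "C": 0.0, "N": 0.0, "O": 0.0}

frames = "".join(
    isolated_atom_frame(element, energy)
    for element, energy in placeholder_energies.items()
)
print(frames)
```

The resulting text would be appended to train.xyz alongside the molecular configurations.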
Let me know how you get on!
-- Gábor
Gábor Csányi
Professor of Molecular Modelling
Engineering Laboratory, University of Cambridge
Pembroke College Cambridge
Pembroke College supports CARA. A Lifeline to Academics at Risk. http://www.cara.ngo/
On 26 Jun 2020, at 13:00, vvassilevg wrote:
I would like to fit with GAP the PES of molecules using SOAP. As a test, I am using a glycine molecule (500 training points).
So far, I get high mean absolute errors on the training set (the errors are of course even higher for the test set) for both energies and forces (above 0.07 eV and 0.7 eV/A, respectively).
I have tested different parameters for the gap fit command. One example is:
gap_fit at_file=train.xyz \
gap={soap cutoff=5.0 \
covariance_type=dot_product \
zeta=2 \
delta=0.016 \
atom_sigma=0.3 \
l_max=14 \
n_max=14 \
n_sparse=2000 \
sparse_method=cur_points} \
force_parameter_name=forces \
e0_method=average \
default_sigma={0.001 0.2 0.0 0.0} \
do_copy_at_file=F sparse_separate_file=F \
gp_file=gap_soap.xml
I have tried different values for l_max, n_max and cutoff.
An example of a molecule in my train.xyz files is:
10
Lattice="200.0 0.0 0.0 0.0 200.0 0.0 0.0 0.0 200.0" Properties=species:S:1:pos:R:3:forces:R:3 energy=-7735.046780 pbc="T T T"
N -5.400440 5.468773 2.837348 0.197830 0.001017 0.374624
H -6.365580 4.176578 3.913868 0.136175 0.068701 -0.293331
H -3.910505 6.045725 3.925944 -0.120680 0.003343 -0.049358
C -4.498210 4.417443 0.471706 -0.433094 1.062071 -0.682114
C -2.707543 2.205386 0.405318 0.102343 -0.550705 -0.104142
O -1.920056 1.231402 -1.521152 -0.009198 0.153232 0.169461
O -2.067417 1.333725 2.758112 -0.292697 0.513378 -0.084358
H -0.893916 -0.033471 2.454792 0.283477 -0.392345 -0.057309
H -6.148205 4.006446 -0.739463 0.074871 -0.442791 0.154181
H -3.631109 5.954848 -0.713001 0.060973 -0.415900 0.572343
Energy is in eV, forces in eV/A, and in this case, positions are in a.u., but I have also trained using A.
Is there anything missing (or wrong) when setting the gap_fit command and the soap descriptor?
I will be very grateful for any help you can provide.
Best regards,
Dear Prof. Csányi,
Thank you for your message.
I have already set up my gap_fit input with your suggested values of (n_max, l_max) and with "add_species=T", and I have also converted all positions to A. However, the results I am obtaining are practically the same (only the forces improve a little).
I think that my forces are not gradients. So, do you think that there is something else missing in my input?
Best regards.
Show me your new command line, the scatter plot of target vs. predicted energies, and the scatter plot of target vs. predicted force components, the latter coloured according to element.
-- Gábor
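One way to prepare the data for such a per-element force plot is sketched below. It groups (target, predicted) force components by element and reports a per-element mean absolute error; the function names and the toy numbers are illustrative assumptions, not output from this fit.

```python
from collections import defaultdict


def mean_absolute_error(targets, predictions):
    """Mean absolute error between two equally long sequences of numbers."""
    pairs = list(zip(targets, predictions))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


def force_components_by_element(species, target_forces, predicted_forces):
    """Group (target, predicted) force components by chemical element."""
    grouped = defaultdict(list)
    for element, f_ref, f_gap in zip(species, target_forces, predicted_forces):
        for ref_component, gap_component in zip(f_ref, f_gap):
            grouped[element].append((ref_component, gap_component))
    return dict(grouped)


# Toy values for two atoms, made up purely to show the data layout.
species = ["N", "H"]
target = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
predicted = [(1.5, 0.0, 0.0), (0.0, 2.0, 0.0)]

grouped = force_components_by_element(species, target, predicted)
for element, pairs in grouped.items():
    refs, gaps = zip(*pairs)
    print(element, mean_absolute_error(refs, gaps))
```

Each element's pairs can then be passed to a plotting library as one scatter series with its own colour; for a good fit the points should lie along the x=y line.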
The command is:
gap_fit at_file=train.xyz \
gap={soap cutoff=5.0 \
covariance_type=dot_product \
zeta=2 \
delta=0.016 \
atom_sigma=0.3 \
add_species=T \
n_max=8 \
l_max=4 \
n_sparse=4000 \
sparse_method=cur_points} \
force_parameter_name=forces \
e0_method=average \
default_sigma={0.001 0.2 0.0 0.0} \
do_copy_at_file=F sparse_separate_file=F \
gp_file=gap_soap.xml
The plots are at the end of the message.
Do you think that (n_max, l_max) = (12,6) could make a huge improvement, or is it possible that I would need to add the 2b or 3b potentials?
Why are you using such a small delta? It is supposed to be the typical energy per atom (really: the target function value for the descriptor, but here that is energy per atom), which I would expect to be about 0.1 eV, and you have 1/10th of that. I wouldn't expect n_max, l_max to be your limiting factor.
-- Gábor
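One rough way to put a number on "typical energy per atom" is the spread of the per-atom energies across the training set. This is an assumption made here for illustration, not a rule stated in the thread, and the energies below are made-up placeholders.

```python
import statistics


def energy_per_atom_spread(energies_ev, atom_counts):
    """Population standard deviation of energy per atom across configurations."""
    per_atom = [energy / count for energy, count in zip(energies_ev, atom_counts)]
    return statistics.pstdev(per_atom)


# Made-up total energies (eV) for two 10-atom configurations.
toy_energies = [-10.0, -12.0]
toy_counts = [10, 10]
print(energy_per_atom_spread(toy_energies, toy_counts))
```

A result far from the delta used in the fit would be a hint to revisit that setting, although the best value is still worth scanning by hand.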
I actually think the delta should not matter since there's only one GAP, except for the ratio of regularization to delta, which might be smudging the predictions (otherwise the alphas just get scaled). Based on experience, I would guess that the ratio of atom_sigma to the cutoff might be too small for only 8 radial basis functions to resolve accurately. I would increase n_max or, even better, increase atom_sigma. And increase the delta as Gábor suggested.
I think the delta matters because there are both energies and forces, so it's best to stick to the heuristic. I agree that there is a rescaling of both delta and sigma that probably leaves things invariant. (n_max, l_max) = (8, 4) gives crude accuracy, but not this crude.
And the fact that the forces do not lie around the x=y line is the really troublesome thing.
Yes, you are right about the deltas. I still think the atom_sigma is too small for that cutoff. Compare 0.5/3.7 for the a-C GAP to 0.3/5 for this one: roughly twice as small, and with the same number of radial functions. I think that could make for a noisy kernel.
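The comparison above is simple arithmetic; a short sketch using only the two pairs of values quoted in the message (the helper name is illustrative):

```python
def sigma_to_cutoff(atom_sigma, cutoff):
    """Ratio of the SOAP Gaussian width to the cutoff radius."""
    return atom_sigma / cutoff


a_c_gap = sigma_to_cutoff(0.5, 3.7)   # values quoted for the a-C GAP
this_fit = sigma_to_cutoff(0.3, 5.0)  # values used in this thread

print(f"a-C GAP:  {a_c_gap:.3f}")                  # 0.135
print(f"this fit: {this_fit:.3f}")                 # 0.060
print(f"ratio:    {a_c_gap / this_fit:.2f}")       # 2.25
```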
In general you are right, Miguel, but the problems here seem bigger to me.
Thank you both for your comments.
I have tested different values of delta and the best result I have got so far is with 0.25:
Energy MAE -- 0.005587 eV
Force MAE -- 0.132692 eV/A
Then I kept delta=0.25 and increased atom_sigma to 0.5; the errors are practically the same, but slightly worse:
Energy MAE -- 0.006723 eV
Force MAE -- 0.143231 eV/A
Do you think I can improve the accuracy of the forces with even higher values of delta and/or using (n_max,l_max) = (12,6)?
Best regards,
Your force errors look much, much better. You should think carefully about whether these are good enough for your purposes. You are likely hitting the locality limit. There are no easy ways to make the descriptors longer range (you can use multiple SOAPs, but in any case you might need a lot more data, etc.). Maybe it is time to compute some observable that is more directly related to what you want and see if it is good enough?
-- Gábor
Indeed, it looks a lot better. It seems that I was mistaken about both the deltas and the atom_sigma...
--
*Dr. Miguel Caro*
*Academy of Finland Postdoctoral Researcher*
Department of Electrical Engineering and Automation
and Department of Applied Physics
Aalto University <http://www.aalto.fi>, Finland
*Email*: mcaroba@gmail.com
*Work*: miguel.caro@aalto.fi
*Website*: miguelcaro.org
I guess that, now that I know how to tune the parameters, I can test the models to see how they perform and also work with other systems. Thank you for your help with this issue.
Best regards,