Support for dipole moments? (and transition dipoles) #12

Closed
nickhine opened this issue Aug 31, 2022 · 23 comments

@nickhine

Hi,

Inspired by excellent talks on MACE at PsiK, I gave it a go on the train home - it's a testament to how easy it is to use that I had a model trained on existing data for chromophores in solvent by the time I got back - excellent job!

To make it useful for the applications I have in mind in theoretical spectroscopy, it would need to be able to predict dipole moments, and ideally transition dipoles associated with excited states. My understanding from the theory behind MACE is that this should not be particularly hard to do, but I don't see anything to suggest it exists as a capability right now - is it in your plans?

Best wishes

Nick Hine (University of Warwick)

@davkovacs davkovacs self-assigned this Aug 31, 2022
@davkovacs
Collaborator

Hi Nick,
I am working on a MACE dipole model right now. It is on my fork of MACE at the moment; I will message you when it is working, probably next week, and you can try it.

@nickhine
Author

nickhine commented Sep 1, 2022

Fantastic - really glad to hear that and looking forward to testing it on big clusters of ethanol and acetonitrile.

Does it predict dipoles directly, or via partial charges on each atom? There are benefits to both approaches, I guess - TDMs can't necessarily be expressed via atom-centred partial charges, but partial charges can be useful in long-range electrostatics.

Combined with the other proposed developments (stresses and exporting to LAMMPS) this will all combine into a pretty awesome package!

@davkovacs
Collaborator

The model I am working on predicts atomic dipole moment vectors. The total dipole is computed as the sum of the atomic dipole contributions, plus an optional fixed atomic charge contribution.
We could look into predicting dipoles from environment-dependent partial charges too, though I am not a fan of that approach, especially because of issues with global charge conservation.
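
Schematically the prediction looks like this (just a sketch of the decomposition, not the actual MACE code):

```python
import numpy as np

# atomic_dipoles: (N, 3) per-atom dipole vectors predicted by the network
# positions:      (N, 3) atomic positions
# charges:        (N,)   optional fixed atomic charges
def total_dipole(atomic_dipoles, positions, charges=None):
    mu = atomic_dipoles.sum(axis=0)  # sum of atom-centred dipole vectors
    if charges is not None:
        mu = mu + (charges[:, None] * positions).sum(axis=0)  # point-charge term
    return mu
```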

@nickhine
Author

nickhine commented Sep 1, 2022

That sounds ideal - the atomic dipole contributions would be able to handle any contributions in a TDM not expressible as $\mathbf{d} = \sum_I q_I \mathbf{R}_I$.

Worth noting that one might often want to learn transition dipole moments for an excited state as well as standard dipole moments.

PS: Oliver Unke's PhysNet code had a neat way of handling global charge conservation.
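
From memory, PhysNet handles it by redistributing the excess charge uniformly over the atoms - roughly this (an illustrative sketch, not PhysNet's actual code):

```python
import numpy as np

def conserve_charge(raw_charges, total_charge=0.0):
    # Shift every atomic charge by the same amount so that the sum is exactly
    # the requested total charge (the correction used in Unke & Meuwly's PhysNet).
    excess = raw_charges.sum() - total_charge
    return raw_charges - excess / len(raw_charges)
```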

@davkovacs
Collaborator

davkovacs commented Sep 5, 2022

Hi Nick,

if you want to try I have implemented the dipole prediction in this pull request:

#14

Any feedback would be very welcome. In my experience it is actually sufficient to use --num_interactions=1, i.e. a single-layer MACE, for learning dipoles. That might of course be because the dataset I was playing with is very easy. I was able to achieve test errors on the order of 5-10 microdebye, or even less if I went for a larger model.

As a word of caution, if you have a system with an overall charge, the predicted dipole moment will have a component that depends on the choice of origin. This is dealt with by requiring the specification of atomic charges that sum to the correct total charge. If you do not specify atomic charges, they are set to 0 by default, so the predicted dipole will have no contribution from them (it will be purely a sum of atom-centred dipole vectors).
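
The origin dependence is easy to check numerically (nothing MACE-specific here):

```python
import numpy as np

charges = np.array([1.0, -0.5, -0.5, 0.3, -0.3])  # net charge 0 in this example
positions = np.random.rand(5, 3)
shift = np.array([10.0, 0.0, 0.0])                # move the origin

mu = (charges[:, None] * positions).sum(axis=0)
mu_shifted = (charges[:, None] * (positions + shift)).sum(axis=0)

# mu_shifted - mu equals charges.sum() * shift, so the point-charge dipole is
# origin-independent only when the total charge is zero.
print(mu_shifted - mu)
```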

@nickhine
Author

nickhine commented Sep 5, 2022

Just trying to give that a go now: you have a line that my Python setup doesn't like, "from types import NoneType" - searching suggests that import only works on Python 3.10+, and "type(None)" should be used instead of NoneType.
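
A version-independent spelling would be something like:

```python
import sys

# types.NoneType only exists on Python >= 3.10; type(None) works everywhere.
if sys.version_info >= (3, 10):
    from types import NoneType
else:
    NoneType = type(None)
```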

I have a model training like that on a dataset consisting of ethanol, various forms of a dye, and clusters of both.

Can I check how it is meant to be used: if I understand correctly, setting model="AtomicDipolesMACE" just supplies dipoles as training data and is therefore going to create a separate model that only predicts dipoles. Would it be possible to load that alongside the energy and force model, in the same MACECalculator, so that it could handle both atoms.get_dipole_moment() and atoms.get_potential_energy() calls? Going to be tricky for me to test the result without using the ASE Calculator.

@davkovacs
Collaborator

For now this model can only predict dipoles, and you need to train a separate model to predict the energies and forces. It would of course be possible to have a single model predicting dipoles as well as energies and forces; for that we need to create a new model and a new loss function.

I can create an ASE calculator wrapping this dipole model that can only predict dipoles. Is that something that would be useful?

If you think the combined model would be useful, I can also have a go at that later this week.

@nickhine
Author

nickhine commented Sep 5, 2022

I'll give it a go as it is - got it training now by removing the NoneType bit. Should be fine for testing.

For real use, in scenarios such as IR spectroscopy, PIMD etc., a single model able to do energies, forces and dipoles with the same calculator would be the best option - otherwise you have to post-process with a different model to add dipoles.
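
For context, the IR use case boils down to Fourier-transforming the dipole autocorrelation along the trajectory - roughly this, with prefactors omitted and `dipoles` a hypothetical (T, 3) array sampled every `dt`:

```python
import numpy as np

def ir_spectrum(dipoles, dt):
    """Rough IR line shape from a dipole time series (prefactors omitted)."""
    mu = dipoles - dipoles.mean(axis=0)  # remove the static dipole
    # Autocorrelation summed over the three Cartesian components
    acf = sum(np.correlate(mu[:, k], mu[:, k], mode="full") for k in range(3))
    acf = acf[len(acf) // 2:]            # keep non-negative lags only
    freqs = np.fft.rfftfreq(len(acf), d=dt)
    return freqs, np.abs(np.fft.rfft(acf))
```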

@davkovacs
Collaborator

Could you let me know where exactly that NoneType problem is? Maybe you are using a different version of Python - I have not experienced it, but I would like to fix it quickly.

Here is the ASE calculator for the dipole only model. I'll let you know when the combined model is ready.
#14 (comment)

@nickhine
Author

nickhine commented Sep 5, 2022

It's in mace/mace/modules/models.py - it wasn't there previously. I just switched it to using type(None) for the later comparison (line 532) and that was fine.

@davkovacs
Collaborator

Should be fixed now.

@nickhine
Author

nickhine commented Sep 6, 2022

Thanks. I have got it to train but not to generate a final model, and I have run into a few bits of unexpected behaviour:

  1. The loss function does not seem to be normalised appropriately for printing - on the first iteration it prints 0.0001, then by iteration 5 it shows 0.0000 and stays there.

  2. RMSE_MU_per_atom gets stuck at about 6 mDebye during training - that sounds worse than what you were seeing? My clusters are probably larger, though. I haven't been able to systematically test with the calculator against anything other than the training data because:

  3. CUDA runs out of memory every time when evaluating the metrics for the error table. This didn't happen with the same dataset when training energies and forces, and it persists regardless of how small/simple I make the model (e.g. I tried max_ell=2, num_interactions=1, batch_size 4 and still went OOM).

slurm-1844592.out.txt

@davkovacs
Collaborator

  1. I will look into changing the normalisation of the loss function for printing (but maybe also in general).

  2. 5-6 mDebye is what I see when I am looking at more complicated systems - is that too high an error? I do not have a very good feel for it, but often that is well below 1%.

  3. Have you tried setting --valid_batch_size? Maybe 12 is too large if you have very big molecules. One thing I noticed is that using L=2 gives a sizeable improvement (32x0e + 32x1o + 32x2e) - see the snippet at the end of this comment.

When I saw problems with running out of memory, I could always solve it by decreasing the batch size; I had this issue when I had much larger systems in the test set than in the training set. Can you set the maximum number of epochs to 1 and submit the same script with the valid batch size decreased? But I agree, something very odd is happening, and just from the output file it is not obvious to me what the problem is. @ilyes319 perhaps you have seen this happen before?
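
For reference, the hidden_irreps string is an e3nn irreps spec, and you can inspect what it implies directly:

```python
from e3nn import o3

# 32 scalar (0e), 32 vector (1o) and 32 rank-2 (2e) channels per atom
irreps = o3.Irreps("32x0e + 32x1o + 32x2e")
print(irreps.dim)   # 32*1 + 32*3 + 32*5 = 288 features per atom
print(irreps.lmax)  # 2
```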

@nickhine
Author

nickhine commented Sep 6, 2022

5-6 mD is probably fine for most things like IR spectra. I just checked what I was able to get from PhysNet for the same data, since that generates good IR predictions: I thought it was lower, but in fact it is about 4 mD MAE over the whole dataset, so that's probably very similar, given RMSE ≳ MAE. It's just that you mentioned 5-10 microdebye above, so I thought I might have got something wrong.

I had previously run with L=3 and valid_batch_size=4 and that still OOM'd. In fact it still OOMs with batch_size=valid_batch_size=1 even for a small model. I think something must be wrong either with that call or something earlier using a lot more memory than the original energy and force model.

@davkovacs
Collaborator

I am really sorry about this, I cannot reproduce it on my computer.

I have implemented the combined model --model="EnergyDipolesMACE" in #14.
It is basically a MACE, but in the readout phase it predicts both an invariant scalar (the energy) and a vector (the dipole). I have also created the corresponding loss function, ASE calculator etc.
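
Schematically, the readout just maps the node features onto one scalar and one vector irrep - in e3nn terms something like this (an illustrative sketch, not the actual MACE readout):

```python
from e3nn import o3

# Node features carry mixed irreps; an equivariant linear readout projects
# them onto one invariant scalar (energy) and one vector (dipole) per atom.
node_irreps = o3.Irreps("64x0e + 64x1o + 64x2e")
readout = o3.Linear(node_irreps, o3.Irreps("1x0e + 1x1o"))

features = node_irreps.randn(10, -1)  # 10 atoms of random node features
out = readout(features)               # shape (10, 4): [energy, dipole_x/y/z]
```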

I have tried it on some toy data, and the following input seemed to give quite good results:

python ./mace/scripts/run_train.py \
    --name="EFMu_MACE" \
    --train_file="train.xyz" \
    --valid_fraction=0.05 \
    --test_file="test.xyz" \
    --charges_key="Qs" \
    --E0s="{1: 0.0, 8: -424.482621532878,}" \
    --model="EnergyDipolesMACE" \
    --num_interactions=2 \
    --max_ell=3 \
    --hidden_irreps='64x0e + 64x1o + 64x2e' \
    --num_cutoff_basis=5 \
    --correlation=3 \
    --r_max=4.5 \
    --batch_size=16 \
    --valid_batch_size=64 \
    --eval_interval=5 \
    --max_num_epochs=500 \
    --ema \
    --ema_decay=0.99 \
    --amsgrad \
    --loss="energy_forces_dipole" \
    --energy_weight=10.0 \
    --forces_weight=200.0 \
    --dipole_weight=1.0 \
    --error_table="EnergyDipoleRMSE" \
    --default_dtype="float32" \
    --device=cuda \
    --seed=1234 \

The default weights in the loss function might not be optimal for your dataset. I have also implemented the SWA energy optimisation: if you find that the energy errors are too big, you could try setting --swa, and it will optimise the energies in the last phase of the training by changing the weights.

Let me know if this is what you wanted, or if anything else still doesn't work.

Also, if you can give me any hint as to how I could reproduce your memory issues, that would be helpful. I really don't see any reason in the code why the dipole model would use more memory than the energy model.

@nickhine
Author

nickhine commented Sep 7, 2022

It is strange - I will check that I have not broken anything in the wider setup by running the energy & force model again for zero iterations, to confirm the evaluate step is still OK there.

"EnergyDipolesMACE" sounds like exactly what I had in mind for IR on small molecules. I will give that a go soon too - unfortunately I'll be a bit busy for a few days now.

@davkovacs
Collaborator

A small comment/recommendation based on a few experiments I did: I definitely recommend using the flag --swa. It changes the loss function for the last 25% of the epochs to put a higher weight on the energies; this way the energy errors can improve in some cases by up to a factor of 5, without hurting the force or dipole errors.

@gabor1
Collaborator

gabor1 commented Sep 8, 2022 via email

@nickhine
Author

nickhine commented Sep 9, 2022

I have done a short 1-epoch training run of an EnergyDipolesMACE model, and it seems to work well. I used the settings above for specifying model size, i.e. --num_interactions=2 --max_ell=3 --hidden_irreps='64x0e + 64x1o + 64x2e'. This time I got no issues with OOM while evaluating the final metrics, so I did get the .model file. I don't really know what's different compared to when it was failing for AtomicDipolesMACE, but since EnergyDipolesMACE is what I want to be able to use, I don't propose to devote time to chasing that down.

Trying to use the calculator, I found I couldn't import EnergyDipoleMACECalculator in the version as it stands: I had to edit mace/calculators/__init__.py as follows:

from .mace import DipoleMACECalculator, MACECalculator, EnergyDipoleMACECalculator

__all__ = ["MACECalculator", "DipoleMACECalculator", "EnergyDipoleMACECalculator"]

Did that bit maybe get missed from the commit?

I am training a full model now - quick tests of the 1-epoch version suggest it is not doing anything crazy, and I can get sensible-looking dipoles from .get_dipole_moment().
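
For anyone following along, usage via ASE then looks roughly like this (the model path and constructor arguments here are guesses - check mace/calculators for the exact signature):

```python
from ase.io import read
from mace.calculators import EnergyDipoleMACECalculator

# Hypothetical file names and constructor arguments
calc = EnergyDipoleMACECalculator(model_path="EFMu_MACE.model", device="cuda")

atoms = read("test.xyz")
atoms.calc = calc
energy = atoms.get_potential_energy()  # eV
dipole = atoms.get_dipole_moment()     # Cartesian 3-vector
```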

@davkovacs
Collaborator

That sounds great. I hope you will find that it works well. I am very interested to see if you have any interesting results.

As for the calculator, it was indeed accidentally left out of the commit, I have fixed it. Thank you for spotting it and reporting back.

@ilyes319 ilyes319 added the enhancement label Sep 20, 2022
@ilyes319
Contributor

I think the feature is added and working, so I will close this for now.

@nickhine
Author

Would it be possible to make a version of EnergyDipoleMACECalculator with the functionality of the MACECommitteeCalculator? It looks like it would just be a few extra lines.

@davkovacs
Collaborator

Sorry for the late reply. I have implemented it in this PR and have refactored things a bit, so hopefully it is a bit nicer (but not backwards compatible).
#95
