LAMMPS implementation terminates unexpectedly #87
Comments
Hey, I am tagging @wcwitt, who will be able to help in more detail.
Hey, looking at the trace, it seems that you are trying to run a model compiled on GPU on the CPU on Archer2. Could you check that the model you are loading was saved on a CPU? To do so, you could do:
and then load the model_cpu through your setup.
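For reference, a minimal sketch of what such a conversion could look like, assuming the model was saved as a full pickled PyTorch module with `torch.save`. The helper names `cpu_copy_path` and `save_cpu_copy` and the `_cpu` file suffix are hypothetical, and `weights_only=False` assumes a PyTorch version that accepts that argument:

```python
from pathlib import Path


def cpu_copy_path(src: str) -> Path:
    """Return a sibling path with '_cpu' appended to the file stem (hypothetical naming)."""
    p = Path(src)
    return p.with_name(f"{p.stem}_cpu{p.suffix}")


def save_cpu_copy(src: str) -> Path:
    """Load a pickled PyTorch model onto the CPU and save it under a new name."""
    import torch  # imported lazily so the path helper works without PyTorch

    # map_location="cpu" remaps CUDA tensors stored in the file to CPU memory.
    # weights_only=False is an assumption: it is needed on recent PyTorch
    # versions to unpickle a whole module rather than a plain state dict.
    model = torch.load(src, map_location="cpu", weights_only=False)
    model = model.to("cpu")  # no-op if everything is already on the CPU

    dst = cpu_copy_path(src)
    torch.save(model, dst)
    return dst
```

Calling `save_cpu_copy("my_mace.model")` would write `my_mace_cpu.model` next to the original file. If the file is a TorchScript archive instead, `torch.jit.load(path, map_location="cpu")` followed by `torch.jit.save` would likely be the analogous route.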
That was my first guess too, based on the CUDA in the trace. Try @ilyes319's suggestion and we can go from there. @zakmachachi, just a warning: if you have access to a decent GPU, I predict you will prefer using that for MD. The CPU LAMMPS can't really compete (yet) in performance for most use cases. Feel free to email at wcw28@cam.ac.uk if you want to discuss any details you'd rather not post here.
Hey, thanks for the swift reply both! So I used this script to switch the model from GPU to CPU compilation:
And it worked! But now I get the following error:
Some more info:
And got the following error from LAMMPS:
I guess the obvious thing here is to train and compile on Archer2, but I chose the other cluster because it has some fancy RTX cards that did not run out of memory during training. Archer2 sadly does not have any GPUs, so training there is an issue. @wcwitt Have you set up GPU MACE runs for LAMMPS? I think this could be an interesting approach, as this swapping between CPU and GPU compilations is a bit messy on my side!
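When swapping a model between GPU and CPU like this, one way to confirm where it actually lives is to list any non-CPU devices that still hold its tensors. A minimal sketch, assuming the loaded object exposes `parameters()` and `buffers()` as a `torch.nn.Module` does; `non_cpu_devices` is a hypothetical helper name:

```python
def non_cpu_devices(model) -> set:
    """Collect the names of devices other than the CPU that hold model tensors."""
    found = set()
    # Parameters are the trainable weights; buffers are the non-trainable
    # tensors registered on the module. Both must be on the CPU.
    for tensor in list(model.parameters()) + list(model.buffers()):
        if tensor.device.type != "cpu":
            found.add(str(tensor.device))
    return found
```

An empty set means every parameter and buffer is on the CPU. Tensors stored as plain attributes or baked in as constants during compilation would not show up in this check.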
@wcwitt @zakmachachi Can we close this issue? Is there a fix somewhere?
We've been emailing about it in combination with some other things. Let's leave it open for a bit longer and I'll post once it's ready to close.
Sure, thank you!
Describe the bug
I have installed the LAMMPS implementation of MACE on Archer2 and compiled the potential as per the instructions, but I receive the following error when I run a small LAMMPS script:
To Reproduce
Steps to reproduce the behavior:
Cheers,
Zak