ASE calculator always fails on third calculation using same calculator instance #6

Closed
mkphuthi opened this issue Jun 27, 2022 · 2 comments

@mkphuthi

Describe the bug
After loading a deployed model as an ASE calculator instance, the calculator consistently gives an error on the third different structure it calculates.

To Reproduce

from nequip.ase.nequip_calculator import NequIPCalculator
from ase.build import bulk

calc = NequIPCalculator.from_deployed_model('deployed_Li_model.pth', device='cpu')  # Same error for device='cuda'
a1 = bulk('Li', 'bcc', a=3.4)
a2 = bulk('Li', 'bcc', a=3.4).repeat([2, 2, 2])
a3 = bulk('Li', 'bcc', a=3.5)  # The 3 structures must be different
calc.get_potential_energy(a1)  # Works fine
calc.get_potential_energy(a2)  # Works fine
calc.get_potential_energy(a3)  # Gives the TorchScript error below, even with forces

The traceback is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mphuthi/.conda/envs/allegro/lib/python3.9/site-packages/ase/calculators/calculator.py", line 737, in get_property
    self.calculate(atoms, [name], system_changes)
  File "/home/mphuthi/.conda/envs/allegro/lib/python3.9/site-packages/nequip/ase/nequip_calculator.py", line 118, in calculate
    out = self.model(data)
  File "/home/mphuthi/.conda/envs/allegro/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: Unsupported value kind: Tensor

This happens with any three different structures. I have tried several from different datasets, and the error is reproduced each time.

Expected behavior
I expect to be able to reuse a calculator instance as many times as I want, without having to create a new instance each time.

Environment (please complete the following information):

  • OS: CentOS 7
  • python version: 3.9.12
  • python environment (commands are given for python interpreter):
    • nequip version: 0.5.5
    • e3nn version: 0.4.4
    • pytorch version: 1.10.2+cu102
  • (if relevant) GPU support with CUDA
    • cuda Version according to nvcc: 10.2
    • cuda version according to PyTorch : 10.2
@Linux-cpp-lisp
Collaborator

Hi @mkphuthi ,

This is a known PyTorch bug in the JIT (which is why it always triggers on the 3rd call) that should be resolved by updating to PyTorch 1.11.
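
To check whether a given environment is likely affected, something like the following may help. It is only a minimal sketch: the helper names `parse_version` and `has_jit_fix` are made up for illustration and are not part of nequip or PyTorch, and the (1, 11) threshold is simply the version mentioned above.

```python
import re


def parse_version(version: str) -> tuple:
    """Return (major, minor) from a version string such as '1.10.2+cu102'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def has_jit_fix(version: str, minimum: tuple = (1, 11)) -> bool:
    """True if `version` is at least `minimum` (assumed here to be PyTorch 1.11)."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        if has_jit_fix(torch.__version__):
            print(f"PyTorch {torch.__version__}: at or above 1.11")
        else:
            print(f"PyTorch {torch.__version__}: older than 1.11, consider upgrading")
```

With the version from the report, `has_jit_fix("1.10.2+cu102")` evaluates to `False`, which matches the environment where the error appears.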

@mkphuthi
Author

Resolved. Thanks
