ASE calculator always fails on third calculation using same calculator instance #6

mkphuthi · 2022-06-27T13:59:57Z

Describe the bug
After loading a deployed model as an ASE calculator instance, the calculator consistently gives an error on the third different structure it calculates.

To Reproduce

from nequip.ase.nequip_calculator import NequIPCalculator
from ase.build import bulk
calc = NequIPCalculator.from_deployed_model('deployed_Li_model.pth', device='cpu')  # Same error with device='cuda'
a1 = bulk('Li', 'bcc', a=3.4)
a2 = bulk('Li', 'bcc', a=3.4).repeat([2,2,2])
a3 = bulk('Li', 'bcc', a=3.5)      # The 3 structures must be different
calc.get_potential_energy(a1)      # Works fine
calc.get_potential_energy(a2)      # Works fine
calc.get_potential_energy(a3)      # Raises the TorchScript error below (same when requesting forces)

The traceback is:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mphuthi/.conda/envs/allegro/lib/python3.9/site-packages/ase/calculators/calculator.py", line 737, in get_property
    self.calculate(atoms, [name], system_changes)
  File "/home/mphuthi/.conda/envs/allegro/lib/python3.9/site-packages/nequip/ase/nequip_calculator.py", line 118, in calculate
    out = self.model(data)
  File "/home/mphuthi/.conda/envs/allegro/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
RuntimeError: Unsupported value kind: Tensor

This happens for any set of 3 different structures. I've tried several structures from different datasets, and the error is reproduced each time.

Expected behavior
I expect to be able to reuse a calculator instance as many times as I want, without having to create a new instance each time.

Environment (please complete the following information):

OS: CentOS 7
Python version: 3.9.12
python environment (commands are given for python interpreter):
- nequip version: 0.5.5
- e3nn version: 0.4.4
- pytorch version: 1.10.2+cu102
(if relevant) GPU support with CUDA
- cuda Version according to nvcc: 10.2
- cuda version according to PyTorch : 10.2


Linux-cpp-lisp · 2022-06-27T17:58:40Z

Hi @mkphuthi,

This is a known bug in the PyTorch JIT, which is why it always triggers on the 3rd call. Updating to PyTorch 1.11 should resolve it.
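
For anyone wanting to check quickly whether their environment predates that release, here is a minimal sketch. The helper names `parse_major_minor` and `predates_fix` are hypothetical (they are not part of nequip or PyTorch), and the 1.11 threshold is taken only from the suggestion above rather than from a confirmed list of affected versions.

```python
import re

# Assumption: threshold taken from the reply above ("updating to PyTorch 1.11").
FIXED_IN = (1, 11)


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '1.10.2+cu102'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def predates_fix(version: str) -> bool:
    """Return True if the given PyTorch version is older than FIXED_IN."""
    return parse_major_minor(version) < FIXED_IN


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment.")
    else:
        if predates_fix(torch.__version__):
            print(f"PyTorch {torch.__version__} predates 1.11; consider upgrading.")
        else:
            print(f"PyTorch {torch.__version__} is 1.11 or newer.")
```

With the version reported in this issue, `predates_fix("1.10.2+cu102")` evaluates to `True`. The local build suffix (`+cu102`) is ignored because only the major and minor numbers are compared.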

mkphuthi · 2022-06-27T18:25:15Z

Resolved. Thanks

mkphuthi closed this as completed Jun 27, 2022

Linux-cpp-lisp mentioned this issue Mar 15, 2023

RuntimeError when using nequip-evaluate mir-group/nequip#318

Closed
