TorchScript and freezing of module attributes broken, _C._jit_pass_lower_graph failing #43948

@j-paulus

Description

🐛 Bug

The problem appears when scripting a model containing BatchNorm and then trying to use the scripted version for ONNX or CoreML export.

If forward() modifies an attribute of the model, the combination of scripting the model and then trying to convert it to ONNX / CoreML triggers an error in torch._C._jit_pass_lower_graph(model.forward.graph, model._c), resulting in the message the only valid use of a module is looking up an attribute but found = prim::SetAttr.

This seems to be closely related to #41674, #37720, and #34002.

To Reproduce

The triggering behaviour in my model comes from a BatchNorm layer, but the issue can be reproduced even with a simple counter attribute.

import torch

N, C, L = 2, 4, 5
if False:
    # example from: https://github.com/pytorch/pytorch/issues/34002#issuecomment-656769904
    class MyClass(torch.nn.Module):
        def __init__(self):
            super(MyClass, self).__init__()
            self.num_batches_tracked = 0

        def forward(self, x):
            self.num_batches_tracked += 1
            return x

else:
    # example using BN
    class MyClass(torch.nn.Module):
        def __init__(self):
            super(MyClass, self).__init__()
            self.bn = torch.nn.BatchNorm1d(C)

        def forward(self, x):
            return self.bn(x)

model = MyClass()
model.eval()
x_in = torch.zeros((N, C, L))
traced_model = torch.jit.trace(model, x_in)
scripted_model = torch.jit.script(model)

# ONNX export
print('ONNX export of plain model...')
torch.onnx.export(model, x_in, 'f.onnx', example_outputs=x_in)  # => OK

print('ONNX export of scripted model...')
torch.onnx.export(scripted_model, x_in, 'f.onnx', example_outputs=x_in)  # => FAIL

Output:

ONNX export of plain model...
ONNX export of scripted model...
Traceback (most recent call last):
  File "torchscript_bug.py", line 46, in <module>
    torch.onnx.export(scripted_model, x_in, 'f.onnx', example_outputs=x_in)  # => FAIL
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/torch/onnx/__init__.py", line 163, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/torch/onnx/utils.py", line 63, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/torch/onnx/utils.py", line 483, in _export
    graph, params_dict, torch_out = _model_to_graph(model, args, verbose,
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/torch/onnx/utils.py", line 320, in _model_to_graph
    method_graph, params = torch._C._jit_pass_lower_graph(model.forward.graph, model._c)
RuntimeError:
temporary: the only valid use of a module is looking up an attribute but found = prim::SetAttr[name="num_batches_tracked"](%2, %23)
:

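For what it's worth, the offending node can be spotted in the scripted graph before any export is attempted. A diagnostic sketch using the counter variant from above (I'm assuming findAllNodes on the graph here; the class name Counter is just for illustration):

```python
import torch

class Counter(torch.nn.Module):
    def __init__(self):
        super(Counter, self).__init__()
        self.num_batches_tracked = 0

    def forward(self, x):
        self.num_batches_tracked += 1
        return x

scripted = torch.jit.script(Counter())
# Scripting itself succeeds; the attribute mutation shows up as a
# prim::SetAttr node, which _jit_pass_lower_graph later rejects.
setattr_nodes = scripted.graph.findAllNodes("prim::SetAttr")
print(len(setattr_nodes) >= 1)
```

So the failure is not in scripting itself but in the lowering pass that both export paths call.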
Using CoreML export instead of ONNX:

# CoreML export
import coremltools as ct

print('CoreML export of plain model...')
coreml_model = ct.convert(
    traced_model,
    inputs=[ct.TensorType(shape=x_in.shape)]
)  # => OK

print('CoreML export of scripted model...')
coreml_model = ct.convert(
    scripted_model,
    inputs=[ct.TensorType(shape=x_in.shape)]
)  #  => FAIL

Output:

CoreML export of plain model...
Converting Frontend ==> MIL Ops: 80%|████████████████████████ | 4/5 [00:00<00:00, 4141.50 ops/s]
Running MIL optimization passes: 100%|████████████████████████| 16/16 [00:00<00:00, 12241.68 passes/s]
Translating MIL ==> MLModel Ops: 100%|█████████████████████████████| 6/6 [00:00<00:00, 16131.94 ops/s]
CoreML export of scripted model...
Traceback (most recent call last):
  File "torchscript_bug.py", line 59, in <module>
    coreml_model = ct.convert(
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/coremltools/converters/_converters_entry.py", line 303, in convert
    proto_spec = _convert(
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 134, in _convert
    prog = frontend_converter(model, **kwargs)
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/coremltools/converters/mil/converter.py", line 84, in __call__
    return load(*args, **kwargs)
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 73, in load
    converter = TorchConverter(torchscript, inputs, outputs, cut_at_symbols)
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 142, in __init__
    raw_graph, params_dict = self._expand_and_optimize_ir(self.torchscript)
  File "/Users/name/opt/anaconda3/envs/coreml_test/lib/python3.8/site-packages/coremltools/converters/mil/frontend/torch/converter.py", line 250, in _expand_and_optimize_ir
    graph, params = _torch._C._jit_pass_lower_graph(
RuntimeError:
temporary: the only valid use of a module is looking up an attribute but found = prim::SetAttr[name="num_batches_tracked"](%2, %23)
:

Expected behavior

I would expect converting the scripted model to work the same as exporting the non-scripted or traced model.
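As a possible workaround for the BatchNorm case (an assumption on my side, not a confirmed fix): constructing the layer with track_running_stats=False removes the num_batches_tracked counter entirely, so forward() no longer mutates any module attribute. A minimal sketch (MyClassNoStats is a hypothetical name; whether this actually makes the scripted ONNX/CoreML export pass is untested):

```python
import torch

N, C, L = 2, 4, 5

class MyClassNoStats(torch.nn.Module):
    def __init__(self):
        super(MyClassNoStats, self).__init__()
        # track_running_stats=False: no running_mean/running_var buffers
        # and no num_batches_tracked counter to update in forward()
        self.bn = torch.nn.BatchNorm1d(C, track_running_stats=False)

    def forward(self, x):
        return self.bn(x)

model = MyClassNoStats().eval()
scripted = torch.jit.script(model)
out = scripted(torch.zeros((N, C, L)))
print(tuple(out.shape))  # (2, 4, 5)
```

Note that this changes eval-time behaviour (the layer then normalizes with batch statistics instead of running statistics), so it sidesteps the bug rather than fixing it.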

Environment

PyTorch version: 1.7.0.dev20200823
Is debug build: False
CUDA used to build PyTorch: None

OS: Mac OSX 10.15.6 (x86_64)
GCC version: Could not collect
Clang version: 11.0.3 (clang-1103.0.32.62)
CMake version: version 3.18.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.7.0.dev20200823
[conda] blas 1.0 mkl
[conda] mkl 2019.4 233
[conda] mkl-service 2.3.0 py38hfbe908c_0
[conda] mkl_fft 1.1.0 py38hc64f4ea_0
[conda] mkl_random 1.1.1 py38h959d312_0
[conda] numpy 1.19.1 py38h3b9f5b6_0
[conda] numpy-base 1.19.1 py38hcfb5961_0
[conda] pytorch 1.7.0.dev20200823 py3.8_0 pytorch-nightly

coremltools 4.0b3

Exactly the same behaviour occurs with PyTorch 1.5.1 (the only version CoreML Tools officially supports).

Additional context

cc @gmagogsfm @BowenBao @neginraoof

Labels

module: onnx, triaged