TorchModuleGraph traced_model parameter need to be traced in a specific way #2574

zheng-ningxin · 2020-06-17T23:13:16Z

Environment:

NNI version: master(latest)
NNI mode (local|remote|pai): local
Server OS (for remote mode only): Linux
Python version: 3.7
PyTorch/TensorFlow version: 1.4.0
Is conda/virtualenv/venv used?: yes
Is running in Docker?: No

I ran into an interesting problem with TorchModuleGraph. In the latest version of NNI, TorchModuleGraph offers two ways to build the graph: users can provide either the model and a dummy input, in which case TorchModuleGraph traces the model itself, or an already traced model, so the model does not need to be traced again. Passing the model and dummy input works fine. However, when a traced model is passed in, the model has to be traced in a specific way; otherwise graph construction can fail.
To reproduce:

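import torch
import torch.nn as nn
# Module path as it appears in the traceback below
from nni._graph_utils import TorchModuleGraph

class Mymodule(nn.Module):
    def __init__(self):
        super(Mymodule, self).__init__()
        # Three chained grouped convolutions
        self.c1 = nn.Conv2d(4, 20, 3, groups=2)
        self.c2 = nn.Conv2d(20, 20, 2, groups=2)
        self.c3 = nn.Conv2d(20, 20, 3, groups=10)

    def forward(self, data):
        out = self.c1(data)
        out = self.c2(out)
        out = self.c3(out)
        return out

net = Mymodule().cuda()
data = torch.rand(1, 4, 224, 244).cuda()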

When we trace the model in the following way (the same way TorchModuleGraph traces it when given the model and dummy input) and pass the traced model to TorchModuleGraph, it works fine.

with torch.onnx.set_training(net, False):
    trace = torch.jit.trace(net, data)
    torch._C._jit_pass_inline(trace.graph)
    _graph = TorchModuleGraph(net, data, trace)

In contrast, when we trace the model a second way (shown in the following code), an error is raised inside PyTorch's TensorBoard code (torch/utils/tensorboard/_pytorch_graph.py).

net.eval()
trace = torch.jit.trace(net, data)
torch._C._jit_pass_inline(trace.graph)
_graph = TorchModuleGraph(net, data, trace)

Output: 
Traceback (most recent call last):
  File "test_torchmodule_graph.py", line 32, in <module>
    _graph = TorchModuleGraph(net, data, trace)
  File "/home/core/znx/nni/build/nni/_graph_utils.py", line 238, in __init__
    self.name_to_node, self.input_to_node, self.output_to_node = self._build_graph()
  File "/home/core/znx/nni/build/nni/_graph_utils.py", line 525, in _build_graph
    node_cpps, input_to_node, output_to_node, 'module')
  File "/home/core/znx/nni/build/nni/_graph_utils.py", line 364, in _expand_module_node
    node_group, inputs=inputs, outputs=outputs)
  File "/home/core/znx/nni/build/nni/_graph_utils.py", line 210, in __init__
    self.add_nodes(node_cpps)
  File "/home/core/znx/nni/build/nni/_graph_utils.py", line 216, in add_nodes
    nodepy = NodePyOP(node_cpp)
  File "/home/core/anaconda3/envs/nnidoc/lib/python3.7/site-packages/torch/utils/tensorboard/_pytorch_graph.py", line 92, in __init__
    self.attributes = str({k: node_cpp[k] for k in node_cpp.attributeNames()}).replace("'", ' ')
  File "/home/core/anaconda3/envs/nnidoc/lib/python3.7/site-packages/torch/utils/tensorboard/_pytorch_graph.py", line 92, in <dictcomp>
    self.attributes = str({k: node_cpp[k] for k in node_cpp.attributeNames()}).replace("'", ' ')
TypeError: 'torch._C.Node' object is not subscriptable
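
The failing line in the traceback reads each attribute with `node_cpp[k]`, which only works if the node type supports subscripting. As a rough sketch of an alternative, the helper below reads attributes through typed accessors instead. It assumes the node exposes `attributeNames()`, `kindOf(name)`, and accessor methods named after the kind (such as `i` or `s`), as `torch._C.Node` does. `read_node_attributes` and `FakeNode` are hypothetical names introduced here for illustration; they are not part of NNI or PyTorch.

```python
def read_node_attributes(node):
    """Collect a node's attributes without relying on node[key].

    Assumption: `node` provides attributeNames(), kindOf(name), and typed
    accessors named after the kind string (for example 'i' or 's').
    """
    attrs = {}
    for name in node.attributeNames():
        kind = node.kindOf(name)  # e.g. 'i' for an int, 's' for a string
        attrs[name] = getattr(node, kind)(name)
    return attrs


class FakeNode:
    """Hypothetical stand-in for torch._C.Node, for illustration only."""

    def __init__(self, ints, strings):
        self._ints = ints
        self._strings = strings

    def attributeNames(self):
        return list(self._ints) + list(self._strings)

    def kindOf(self, name):
        return "i" if name in self._ints else "s"

    def i(self, name):
        return self._ints[name]

    def s(self, name):
        return self._strings[name]


# The stand-in defines no __getitem__, so node[key] would fail on it too,
# but the accessor-based helper still reads every attribute.
demo_attrs = read_node_attributes(FakeNode({"groups": 2}, {"padding_mode": "zeros"}))
```

With a real traced graph, the same helper could be applied to each element of `trace.graph.nodes()`; whether NNI or TensorBoard should adopt this approach is outside what this report establishes.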

More interestingly, when I trace the model the second way (net.eval()) but also import torchvision this time, everything works fine again.

import torchvision
net.eval()
trace = torch.jit.trace(net, data)
torch._C._jit_pass_inline(trace.graph)
_graph = TorchModuleGraph(net, data, trace)
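
One possible explanation, which this report does not confirm, is that certain imports or calls attach a `__getitem__` method to `torch._C.Node` as a side effect, so the `node_cpp[k]` lookup only works after they have run. A minimal diagnostic sketch for testing that hypothesis follows; `is_subscriptable` and `report` are hypothetical helper names, not existing APIs.

```python
def is_subscriptable(cls):
    """Return True if instances of `cls` support obj[key] lookups."""
    return hasattr(cls, "__getitem__")


def report(cls):
    """Build a one-line summary that can be printed before building the graph."""
    state = "supports" if is_subscriptable(cls) else "does not support"
    return f"{cls.__name__} {state} subscripting"
```

In the reproduction script, printing `report(torch._C.Node)` just before the `TorchModuleGraph(...)` call, once with and once without `import torchvision`, would show whether the two runs differ in this respect.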

I'll keep updating if I find something new.


scarlett2018 added the nnidev label Jun 19, 2020

zheng-ningxin closed this as completed Jun 17, 2021

maksimovkonstantin mentioned this issue Oct 18, 2021

TypeError: 'torch._C.Node' object is not subscriptable openai/CLIP#79

Closed
