
[RELAY] [PyTorch] Quantized nn.Sigmoid causes an exception #6929

Closed
naokishibuya opened this issue Nov 17, 2020 · 3 comments
naokishibuya commented Nov 17, 2020

Hello,

I encountered an exception from Relay with a quantized PyTorch model that contains nn.Sigmoid.

The script below reproduces the error:

  • The model has only one operator nn.Sigmoid.
  • The model is quantized.
  • The model is jit-traced.
  • Then, relay.build throws an exception TVMError: Unresolved call Op(tir.exp).
  • It fails for both target='cuda' and target='llvm'.
  • If I replace nn.Sigmoid with nn.Linear, it works.
import torch
from torch import nn
from torch.quantization import QuantStub, DeQuantStub
import tvm
from tvm import relay


class Dummy(nn.Module):
    def __init__(self):
        super().__init__()
        #self.activ = nn.Linear(1, 1) # If we use this instead of Sigmoid, it works.
        self.activ = nn.Sigmoid()
        self.quant = QuantStub()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.activ(x)
        x = self.dequant(x)
        return x


print('0. Preparation')
model = Dummy().eval()
input_shape = (1, 1)
inp = torch.zeros(input_shape)
target = 'cuda'
#target = 'llvm'  # llvm also fails

print('1. Quantization')
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)
model(inp)  # dummy calibration
torch.quantization.convert(model, inplace=True)

print('2. Torch jit tracing')
script_module = torch.jit.trace(model, inp).eval()

print('3. Relay frontend')
input_name = "input"
input_shapes = [(input_name, input_shape)]
mod, params = relay.frontend.from_pytorch(script_module, input_shapes)

print('4. TVM runtime creation for', target)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
runtime = tvm.contrib.graph_runtime.GraphModule(lib["default"](tvm.context(target, 0)))

print('5. TVM runtime execution')
runtime.set_input(input_name, inp)
runtime.run()

print('Output:', runtime.get_output(0).asnumpy())

The output from running the script is as follows:

$ python repro.py 
0. Preparation
1. Quantization
2. Torch jit tracing
3. Relay frontend
4. TVM runtime creation for cuda
Traceback (most recent call last):
  File "repro.py", line 46, in <module>
    lib = relay.build(mod, target=target, params=params)
  File "/workspace/incubator-tvm/python/tvm/relay/build_module.py", line 260, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)
  File "/workspace/incubator-tvm/python/tvm/relay/build_module.py", line 127, in build
    self._build(mod, target, target_host)
  File "/workspace/incubator-tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::VisitExpr_(tvm::tir::CastNode const*, std::ostream&)+0x253) [0x7fbab531a3a3]
  [bt] (7) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::PrintExpr(tvm::PrimExpr const&, std::ostream&)+0x95) [0x7fbab53151c5]
  [bt] (6) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::VisitExpr_(tvm::tir::DivNode const*, std::ostream&)+0x191) [0x7fbab5315e91]
  [bt] (5) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::PrintExpr(tvm::PrimExpr const&, std::ostream&)+0x95) [0x7fbab53151c5]
  [bt] (4) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::VisitExpr_(tvm::tir::AddNode const*, std::ostream&)+0x191) [0x7fbab53157a1]
  [bt] (3) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::PrintExpr(tvm::PrimExpr const&, std::ostream&)+0x95) [0x7fbab53151c5]
  [bt] (2) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenCUDA::VisitExpr_(tvm::tir::CallNode const*, std::ostream&)+0x154) [0x7fbab5332ad4]
  [bt] (1) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::VisitExpr_(tvm::tir::CallNode const*, std::ostream&)+0x6bf) [0x7fbab531bcef]
  [bt] (0) /workspace/incubator-tvm/build/libtvm.so(+0xcf38e8) [0x7fbab53128e8]
  File "/workspace/incubator-tvm/src/target/source/codegen_c.cc", line 652
TVMError: Unresolved call Op(tir.exp)
@masahi masahi self-assigned this Nov 17, 2020

masahi commented Nov 17, 2020

I can reproduce the error with both the cuda and llvm backends. For llvm, I get

  File "/home/masa/projects/dev/tvm/src/target/llvm/codegen_llvm.cc", line 776
TVMError: 
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: f == false: Cannot find intrinsic declaration, possible type mismatch: llvm.exp

Since the exp intrinsic is declared at https://github.com/apache/incubator-tvm/blob/main/src/target/llvm/intrin_rule_llvm.cc#L36-L37, I wonder what the problem is.


masahi commented Nov 17, 2020

@naokishibuya If I remove quant/dequant, it works. So I think the problem is sigmoid applied to the uint8 type. Are you sure you want to run sigmoid on quantized input? I'm not sure how sigmoid is supposed to work on uint8.
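For context on why sigmoid on uint8 is awkward: conceptually, a quantized sigmoid has to dequantize the integer value, apply the float sigmoid, and requantize the result. The sketch below illustrates that round trip in plain Python; the scale and zero_point values are hypothetical, chosen only for illustration and not taken from the repro model.

```python
import math

# Hypothetical input quantization parameters (illustration only).
scale, zero_point = 0.05, 128

def quantized_sigmoid(q: int, out_scale: float = 1 / 256, out_zp: int = 0) -> int:
    """Conceptual reference for sigmoid on a quantized uint8 value:
    dequantize to float, apply sigmoid, requantize back to uint8."""
    x = (q - zero_point) * scale            # dequantize to float
    y = 1.0 / (1.0 + math.exp(-x))          # float sigmoid in [0, 1]
    q_out = round(y / out_scale) + out_zp   # requantize
    return max(0, min(255, q_out))          # clamp to uint8 range

print(quantized_sigmoid(128))  # input dequantizes to 0.0 -> sigmoid 0.5 -> 128
```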

naokishibuya (Author) commented

@masahi that's a great finding! The actual model I use has sigmoid at the end of the network to keep the outputs in [0, 1]. So, I could call dequant before the sigmoid to avoid this exception with little increase in latency. Thanks again.
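The workaround described above can be sketched as a small variant of the repro model: moving the dequant call ahead of the activation so the sigmoid runs on float32 instead of quantized uint8. This is a minimal sketch based on the original repro script, not the author's actual model.

```python
import torch
from torch import nn
from torch.quantization import QuantStub, DeQuantStub


class DummyFixed(nn.Module):
    """Variant of the repro model that dequantizes *before* the sigmoid,
    so the activation runs on float32 instead of quantized uint8."""

    def __init__(self):
        super().__init__()
        self.activ = nn.Sigmoid()
        self.quant = QuantStub()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.dequant(x)   # dequantize first ...
        x = self.activ(x)     # ... so sigmoid sees float32
        return x


model = DummyFixed().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)
model(torch.zeros(1, 1))  # dummy calibration
torch.quantization.convert(model, inplace=True)

out = model(torch.zeros(1, 1))
print(out)  # sigmoid(~0) is ~0.5, up to quantization error in the input
```

Since the sigmoid sits at the very end of the network, only one small float op is left outside the quantized region, which is why the latency cost is minor.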
