
[RELAY] [PyTorch] Quantized nn.Sigmoid causes an exception #6929

Closed
naokishibuya opened this issue Nov 17, 2020 · 3 comments
naokishibuya commented Nov 17, 2020

Hello,

I encountered an exception from Relay with a quantized PyTorch model that contains nn.Sigmoid.

The script below reproduces the error:

  • The model has only one operator nn.Sigmoid.
  • The model is quantized.
  • The model is jit-traced.
  • Then, relay.build throws an exception TVMError: Unresolved call Op(tir.exp).
  • It fails for both target='cuda' and target='llvm'.
  • If I replace nn.Sigmoid with nn.Linear, it works.
import torch
from torch import nn
from torch.quantization import QuantStub, DeQuantStub
import tvm
from tvm import relay


class Dummy(nn.Module):
    def __init__(self):
        super().__init__()
        #self.activ = nn.Linear(1, 1) # If we use this instead of Sigmoid, it works.
        self.activ = nn.Sigmoid()
        self.quant = QuantStub()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.activ(x)
        x = self.dequant(x)
        return x


print('0. Preparation')
model = Dummy().eval()
input_shape = (1, 1)
inp = torch.zeros(input_shape)
target = 'cuda'
#target = 'llvm'  # llvm also fails

print('1. Quantization')
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)
model(inp)  # dummy calibration
torch.quantization.convert(model, inplace=True)

print('2. Torch jit tracing')
script_module = torch.jit.trace(model, inp).eval()

print('3. Relay frontend')
input_name = "input"
input_shapes = [(input_name, input_shape)]
mod, params = relay.frontend.from_pytorch(script_module, input_shapes)

print('4. TVM runtime creation for', target)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
runtime = tvm.contrib.graph_runtime.GraphModule(lib["default"](tvm.context(target, 0)))

print('5. TVM runtime execution')
runtime.set_input(input_name, inp)
runtime.run()

print('Output:', runtime.get_output(0).asnumpy())

The output from running the script is as follows:

$ python repro.py 
0. Preparation
1. Quantization
2. Torch jit tracing
3. Relay frontend
4. TVM runtime creation for cuda
Traceback (most recent call last):
  File "repro.py", line 46, in <module>
    lib = relay.build(mod, target=target, params=params)
  File "/workspace/incubator-tvm/python/tvm/relay/build_module.py", line 260, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)
  File "/workspace/incubator-tvm/python/tvm/relay/build_module.py", line 127, in build
    self._build(mod, target, target_host)
  File "/workspace/incubator-tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::VisitExpr_(tvm::tir::CastNode const*, std::ostream&)+0x253) [0x7fbab531a3a3]
  [bt] (7) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::PrintExpr(tvm::PrimExpr const&, std::ostream&)+0x95) [0x7fbab53151c5]
  [bt] (6) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::VisitExpr_(tvm::tir::DivNode const*, std::ostream&)+0x191) [0x7fbab5315e91]
  [bt] (5) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::PrintExpr(tvm::PrimExpr const&, std::ostream&)+0x95) [0x7fbab53151c5]
  [bt] (4) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::VisitExpr_(tvm::tir::AddNode const*, std::ostream&)+0x191) [0x7fbab53157a1]
  [bt] (3) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::PrintExpr(tvm::PrimExpr const&, std::ostream&)+0x95) [0x7fbab53151c5]
  [bt] (2) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenCUDA::VisitExpr_(tvm::tir::CallNode const*, std::ostream&)+0x154) [0x7fbab5332ad4]
  [bt] (1) /workspace/incubator-tvm/build/libtvm.so(tvm::codegen::CodeGenC::VisitExpr_(tvm::tir::CallNode const*, std::ostream&)+0x6bf) [0x7fbab531bcef]
  [bt] (0) /workspace/incubator-tvm/build/libtvm.so(+0xcf38e8) [0x7fbab53128e8]
  File "/workspace/incubator-tvm/src/target/source/codegen_c.cc", line 652
TVMError: Unresolved call Op(tir.exp)
@masahi masahi self-assigned this Nov 17, 2020

masahi commented Nov 17, 2020

I can reproduce the error with both the cuda and llvm backends. For llvm, I get

  File "/home/masa/projects/dev/tvm/src/target/llvm/codegen_llvm.cc", line 776
TVMError: 
---------------------------------------------------------------
An internal invariant was violated during the execution of TVM.
Please read TVM's error reporting guidelines.
More details can be found here: https://discuss.tvm.ai/t/error-reporting/7793.
---------------------------------------------------------------
  Check failed: f == false: Cannot find intrinsic declaration, possible type mismatch: llvm.exp

Since the exp intrinsic is declared at https://github.com/apache/incubator-tvm/blob/main/src/target/llvm/intrin_rule_llvm.cc#L36-L37, I wonder what the problem is.


masahi commented Nov 17, 2020

@naokishibuya If I remove quant/dequant, it works. So I think the problem is sigmoid applied to the uint8 type. Are you sure you want to run sigmoid on quantized input? I'm not sure how sigmoid is supposed to work on uint8.
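For context on why sigmoid on uint8 is awkward: conceptually, a quantized sigmoid has to dequantize the integer value, apply the float sigmoid, and requantize the result. The sketch below illustrates that round trip in plain Python; the scale and zero_point values are hypothetical, chosen only for illustration and not taken from the repro model.

```python
import math

# Hypothetical input quantization parameters (illustration only).
scale, zero_point = 0.05, 128

def quantized_sigmoid(q: int, out_scale: float = 1 / 256, out_zp: int = 0) -> int:
    """Conceptual reference for sigmoid on a quantized uint8 value:
    dequantize to float, apply sigmoid, requantize back to uint8."""
    x = (q - zero_point) * scale            # dequantize to float
    y = 1.0 / (1.0 + math.exp(-x))          # float sigmoid in [0, 1]
    q_out = round(y / out_scale) + out_zp   # requantize
    return max(0, min(255, q_out))          # clamp to uint8 range

print(quantized_sigmoid(128))  # input dequantizes to 0.0 -> sigmoid 0.5 -> 128
```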

naokishibuya (Author) commented

@masahi that's a great finding! The actual model I use has sigmoid at the end of the network to keep the outputs in [0, 1]. So, I could call dequant before the sigmoid to avoid this exception with little increase in latency. Thanks again.
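The workaround described above can be sketched as a small variant of the repro model: moving the dequant call ahead of the activation so the sigmoid runs on float32 instead of quantized uint8. This is a minimal sketch based on the original repro script, not the author's actual model.

```python
import torch
from torch import nn
from torch.quantization import QuantStub, DeQuantStub


class DummyFixed(nn.Module):
    """Variant of the repro model that dequantizes *before* the sigmoid,
    so the activation runs on float32 instead of quantized uint8."""

    def __init__(self):
        super().__init__()
        self.activ = nn.Sigmoid()
        self.quant = QuantStub()
        self.dequant = DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.dequant(x)   # dequantize first ...
        x = self.activ(x)     # ... so sigmoid sees float32
        return x


model = DummyFixed().eval()
model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)
model(torch.zeros(1, 1))  # dummy calibration
torch.quantization.convert(model, inplace=True)

out = model(torch.zeros(1, 1))
print(out)  # sigmoid(~0) is ~0.5, up to quantization error in the input
```

Since the sigmoid sits at the very end of the network, only one small float op is left outside the quantized region, which is why the latency cost is minor.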
