Description
This is exactly the same problem as #1025. Adding an ArgMax layer to the ONNX model causes prediction to fail, but without it, argmax has to run on the CPU, which is very slow.
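For reference, the CPU-side fallback described above looks roughly like the sketch below. It assumes the network output is a float tensor laid out as (N, C, H, W) with the class scores on axis 1. That layout, the helper `cpu_argmax`, and the sample values are illustrative assumptions, not taken from the actual model.

```python
import numpy as np


def cpu_argmax(logits: np.ndarray, class_axis: int = 1) -> np.ndarray:
    """Return the index of the highest-scoring class along class_axis."""
    # np.argmax returns platform-sized integer indices; cast to int32
    # so the resulting class map has a fixed, compact dtype.
    return np.argmax(logits, axis=class_axis).astype(np.int32)


# Hypothetical network output: batch of 1, 3 classes, 2x2 spatial grid.
logits = np.array(
    [[[[0.1, 0.9], [0.2, 0.3]],
      [[0.8, 0.05], [0.1, 0.6]],
      [[0.1, 0.05], [0.7, 0.1]]]],
    dtype=np.float32,
)

class_map = cpu_argmax(logits)
print(class_map.shape)  # (1, 2, 2)
```

On a real output this step runs on the host after the full score tensor has been copied back from the GPU, which is the slow path this issue is trying to avoid.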
Environment
TensorRT Version: 8.0.1.6
NVIDIA GPU: RTX2080
NVIDIA Driver Version: 11.6
CUDA Version: 10.2
CUDNN Version: 7.6
Operating System: Windows 10
Thanks.
#2416 (comment)