I benchmarked my model with an FP16 + INT8 engine and an FP16-only engine, and somehow the FP16 engine is faster, even though it is a timm ConvNeXt model with many convolution layers.
Here is a quantized model demonstrating my issue: https://drive.google.com/file/d/1kFJnHLcFAVFWyrIEvJz-l3TKW0WgsasZ/view?usp=sharing
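For reference, here is a minimal sketch of how the two engines could be built and timed with trtexec. The ONNX file name, engine names, and iteration counts are placeholders rather than details from my actual setup:

```python
# Sketch: build an FP16-only engine and an FP16+INT8 engine from the same
# quantized ONNX export with trtexec, and let it report latency statistics.
# "model_qdq.onnx" is a placeholder for the quantized ConvNeXt export.
import subprocess

ONNX = "model_qdq.onnx"  # placeholder path, not the actual file name

def run_trtexec(extra_flags, engine_path):
    cmd = [
        "trtexec",
        f"--onnx={ONNX}",
        f"--saveEngine={engine_path}",
        "--fp16",            # FP16 is enabled in both builds
        "--warmUp=500",      # warm-up time in ms before timing starts
        "--iterations=200",  # number of timed inference iterations
    ] + extra_flags
    print(">>>", " ".join(cmd))
    # trtexec prints latency statistics (e.g. mean GPU compute time) to stdout
    subprocess.run(cmd, check=True)

# FP16-only engine
run_trtexec([], "convnext_fp16.engine")

# FP16 + INT8 engine: TensorRT can still choose FP16 tactics per layer when
# they benchmark faster than the INT8 ones, so INT8 is not guaranteed to win.
run_trtexec(["--int8"], "convnext_fp16_int8.engine")
```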