Description
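For the timm EfficientNet model, Torch-TRT is 13% slower than ONNX-TRT. One noticeable difference is that Torch-TRT does not fuse the Conv layer with the activation that follows it (SiLU), whereas ONNX-TRT runs both as a single layer. The layer profiles below show the same block (`blocks.0.0.se`) in each case.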
Torch-TRT, optimization level 3:
{ "name" : "[CONVOLUTION]-[aten_ops.convolution.default]-[blocks.0.0.se.conv_reduce/convolution_2]", "timeMs" : 14.7711, "averageMs" : 0.00563782, "medianMs" : 0.005152, "percentage" : 0.544981 }
{ "name" : "PWN(PWN([SIGMOID]-[aten_ops.sigmoid.default]-[blocks.0.0.se.act1/sigmoid_2]), PWN([ELEMENTWISE]-[aten_ops.mul.Tensor]-[blocks.0.0.se.act1/mul_2]))", "timeMs" : 13.3982, "averageMs" : 0.00511381, "medianMs" : 0.00512, "percentage" : 0.494328 }
Torch-TRT, optimization level 5:
{ "name" : "[CONVOLUTION]_[aten_ops_convolution_default]_[blocks_0_0_se_conv_reduce/convolution_2]_myl0_5", "timeMs" : 12.3751, "averageMs" : 0.00512853, "medianMs" : 0.00512, "percentage" : 0.529315 }
{ "name" : "__myl_Silu_myl0_6", "timeMs" : 9.89395, "averageMs" : 0.00410027, "medianMs" : 0.004096, "percentage" : 0.423188 }
ONNX-TRT:
{ "name" : "/blocks/blocks.0/blocks.0.0/se/conv_reduce/Conv + PWN(PWN(/blocks/blocks.0/blocks.0.0/se/act1/Sigmoid), PWN(/blocks/blocks.0/blocks.0.0/se/act1/Mul))", "timeMs" : 20.5231, "averageMs" : 0.00743321, "medianMs" : 0.007168, "percentage" : 0.815659 }
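To make the comparison concrete, here is a minimal sketch that sums the `averageMs` values of the unfused Torch-TRT layers and compares them with the single fused ONNX-TRT layer. The numbers are copied from the profiles above; the dictionary keys and the helper name `total_avg_ms` are illustrative only and not part of any Torch-TRT or TensorRT API.

```python
# Per-layer averageMs values copied from the profiles above.
# The labels are hypothetical names chosen for this sketch.
profiles = {
    "Torch-TRT, opt level 3": [0.00563782, 0.00511381],  # conv, then sigmoid * mul
    "Torch-TRT, opt level 5": [0.00512853, 0.00410027],  # conv, then SiLU
    "ONNX-TRT": [0.00743321],                            # fused conv + sigmoid * mul
}


def total_avg_ms(layer_averages):
    """Sum the average per-layer latencies (in ms) for one conv + activation block."""
    return sum(layer_averages)


totals = {name: total_avg_ms(avgs) for name, avgs in profiles.items()}
baseline = totals["ONNX-TRT"]

# Print each total and its ratio to the fused ONNX-TRT layer.
for name, total in totals.items():
    print(f"{name}: {total:.6f} ms ({total / baseline:.2f}x ONNX-TRT)")
```

For this one block, the two unfused Torch-TRT layers together take roughly 1.45x (level 3) and 1.24x (level 5) the time of the fused ONNX-TRT layer. This covers a single block only, so it illustrates the per-layer gap rather than accounting for the full 13% end-to-end difference.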