# Supported Operators & Runtimes

## Operator Mapping (TIDL-RT)

| No. | TIDL Layer Type | ONNX Ops | TFLite Ops | Notes |
|-----|-----------------|----------|------------|-------|
| 1 | TIDL_ConvolutionLayer | Conv | CONV_2D<br>DEPTHWISE_CONV_2D | • Regular & depthwise convolutions are imported as convolution<br>• For TFLite DepthwiseConv2dNative, depth_multiplier must be 1 if the number of input channels is greater than 1<br>• ReLU & Batchnorm layers are merged into the convolution for better performance<br>• Validated kernel sizes: 1x1, 3x3, 5x5, 7x7, 1x3, 3x1, 1x5, 5x1, 1x7, 7x1<br>• With stride 4, only an 11x11 kernel is supported<br>• With stride 2, the kernel must be smaller than 7x7; even kernel dimensions such as 2x2, 4x4 and 6x6 are not supported<br>• Asymmetric stride (e.g. stride 2 along only the vertical or horizontal direction) is realized as a non-strided convolution followed by a 1x1 max pooling kernel with the asymmetric stride (see the first sketch after this table)<br>• Depthwise separable convolution supports only 3x3, 5x5 and 7x7 kernels with stride 1, and 3x3 with stride 2<br>• Dilated convolution is supported only for non-strided convolution<br>• An NxN convolution with stride N is transformed into a transpose layer followed by an inner-product layer<br>Note: refer to MMALIB's release notes in your SDK (/mmalib_{version}/docs/user_guide/index.html) for all supported configurations<br>Note: some kernel combinations are not optimized in the current release; refer to MMALIB's release notes for these as well |
| 2 | TIDL_BatchNormLayer | BatchNormalization<br>Relu<br>PRelu<br>Sigmoid<br>LeakyRelu<br>HardSigmoid<br>Tanh<br>Elu | RELU<br>LEAKY_RELU<br>TANH<br>HARDSIGMOID<br>ELU | • ReLU, Scale, Bias, PReLU, Leaky ReLU, Hard Sigmoid, TanH, ELU & GELU are imported as batchnorm<br>• All channel-wise broadcast operations are mapped to batchnorm |
| 3 | TIDL_PoolingLayer | MaxPool<br>AveragePool<br>GlobalAveragePool | MAX_POOL_2D<br>AVERAGE_POOL_2D<br>MEAN | • Pooling has been validated for kernel sizes 3x3, 2x2 and 1x1 with stride 1 and stride 2 (in both horizontal and vertical dimensions)<br>• Max pooling supports 1x1 filters with asymmetric stride<br>• Max pooling additionally supports 1x2 and 1x3 filters with a stride of 2 along the horizontal direction, and 2x1 and 3x1 filters with a stride of 2 along the vertical direction |
| 4 | TIDL_EltWiseLayer | Add<br>Mul | ADD<br>MUL | • Support for 2 input tensors is validated extensively; more than 2 input tensors have limited validation<br>• Supports broadcasting of dimensions above width |
| 5 | TIDL_InnerProductLayer | Gemm<br>MatMul | FULLY_CONNECTED | • Broadcast is supported only in the channel dimension<br>• For the TDA4VM variable-input case, unsigned input is not supported<br>• Higher-dimensional MatMuls can be realized by reshaping the dimensions higher than the 3rd dimension into the 3rd dimension (see the MatMul sketch after this table) |
| 6 | TIDL_SoftMaxLayer | Softmax | SOFTMAX | • Supports 8-bit or 16-bit inputs with outputs of the same bit-depth, with axis support for width (axis = -1) for any NxCxHxW tensor<br>• Supports integer (8/16-bit) to float softmax only for flattened inputs |
| 7 | TIDL_Deconv2DLayer | ConvTranspose | TRANSPOSE_CONV | • Only 8x8, 4x4 and 2x2 kernels with 2x2 stride are supported. Resize/Upsample is recommended instead for better performance. This layer is not supported in 16-bit on AM62A/AM67A |
| 8 | TIDL_ConcatLayer | Concat | CONCATENATION | • Concat is supported on the channel, height or width axis |
| 9 | TIDL_SliceLayer | Split<br>Slice | NA | • Slice is supported on all axes except the batch axis, and only one axis can be sliced per operator<br>• Patch merging expressed as a strided slice is transformed into a transpose layer |
| 10 | TIDL_CropLayer | NA | NA | |
| 11 | TIDL_FlattenLayer | Flatten | NA | • 16-bit is not optimal in the current version |
| 12 | TIDL_ArgMaxLayer | ArgMax | ARG_MAX | • Only axis == 1 is supported (for semantic segmentation) |
| 13 | TIDL_DetectionOutputLayer | NA | NA | |
| 14 | TIDL_ShuffleChannelLayer | Reshape + Transpose + Reshape | NA | • The channel-shuffle pattern is fused into a single layer on import (see the channel-shuffle sketch after this table) |
| 15 | TIDL_ResizeLayer | Upsample | RESIZE_NEAREST_NEIGHBOR<br>RESIZE_BILINEAR | • Only power-of-2, symmetric resize is supported<br>• Any resize ratio that is a power of 2 and greater than 4 is replaced by a combination of 4x4 and 2x2 resize layers; for example, an 8x8 resize is replaced by a 4x4 resize followed by a 2x2 resize |
| 16 | TIDL_DepthToSpaceLayer | DepthToSpace | DEPTH_TO_SPACE | • Supports non-strided convolution with upscale factors of 2, 4 and 8. This layer is currently not supported on AM62A/AM67A |
| 17 | TIDL_SigmoidLayer | Sigmoid/Logistic | SIGMOID/LOGISTIC | |
| 18 | TIDL_PadLayer | Pad | PAD | |
| 19 | TIDL_ColorConversionLayer | NA | NA | • Only YUV420 (NV12) to RGB/BGR color format conversion is supported |
| 20 | TIDL_BatchReshapeLayer | NA | NA | |
| 21 | TIDL_DataConvertLayer | NA | NA | |
| 22 | TIDL_ReshapeLayer | Reshape | RESHAPE | |
| 23 | TIDL_ScatterElementsLayer | ScatterND | NA | • Supported with the following constraints:<br>• The 'data' input is ignored and assumed to be an all-zero buffer<br>• 'indices' supports only the int32 data type, even though the operator interface is int64<br>• The 'updates' data type can be int8, int16, uint8 or uint16<br>• Only 'element' and 'line/vector' updates are supported for now<br>• The data type of 'updates' is the same as that of the input 'data'<br>• PyTorch's 'index_put' operator can be used to generate this operator in the ONNX model (see the ScatterND sketch after this table) |
| 24 | TIDL_GatherLayer | Gather | NA | • Supported with the following constraints:<br>• Only 'line/vector' gathers are supported for now<br>• PyTorch's 'index_select' operator can be used to generate this operator in the ONNX model, with the restriction that the indices tensor be one-dimensional (see the Gather sketch after this table) |
| 25 | TIDL_TransposeLayer | Transpose | TRANSPOSE | • 4D transpose is supported, i.e. every possible permutation of (Batch, Channel, Height, Width) |
| 26 | TIDL_LayernormLayer | ReduceMean-Sub-Pow(2)-ReduceMean-Add-Sqrt-Div | NA | • Only the width axis (axis = -1) is supported (see the layer-norm sketch after this table) |
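A few of the transformations above can be illustrated concretely. The asymmetric-stride mapping in row 1 (a non-strided convolution followed by a 1x1 max pool carrying the asymmetric stride) can be checked in PyTorch; this is a minimal sketch with illustrative shapes, not TIDL code:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 32, 32)

# Direct convolution with an asymmetric stride: 2 vertically, 1 horizontally.
strided = nn.Conv2d(8, 16, kernel_size=3, stride=(2, 1), padding=1, bias=False)

# Equivalent realization per row 1: the same convolution without a stride,
# followed by a 1x1 max pool that carries the asymmetric stride.
conv = nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1, bias=False)
conv.weight = strided.weight  # share weights so the outputs are comparable
pool = nn.MaxPool2d(kernel_size=1, stride=(2, 1))

assert torch.allclose(strided(x), pool(conv(x)))
```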
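Row 5's note on higher-dimensional MatMuls (folding everything above the 3rd dimension into the 3rd) amounts to the following reshape, sketched here with illustrative shapes:

```python
import torch

a = torch.randn(2, 3, 4, 8, 16)   # 5D activation
w = torch.randn(16, 10)           # weight

# Fold all dimensions above the 3rd into the 3rd so the MatMul
# becomes a plain 3D batched MatMul.
out = torch.matmul(a.reshape(-1, 8, 16), w)   # (2*3*4, 8, 10)
out = out.reshape(2, 3, 4, 8, 10)             # restore the leading dims

assert torch.allclose(out, torch.matmul(a, w))
```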
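The Reshape + Transpose + Reshape sequence in row 14 is the standard channel shuffle; a minimal sketch (the group count is illustrative):

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    # Reshape -> Transpose -> Reshape, the operator sequence that
    # TIDL imports as a single TIDL_ShuffleChannelLayer.
    n, c, h, w = x.shape
    x = x.reshape(n, groups, c // groups, h, w)  # Reshape
    x = x.transpose(1, 2)                        # Transpose
    return x.reshape(n, c, h, w)                 # Reshape

y = channel_shuffle(torch.randn(1, 8, 4, 4), groups=2)
```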
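Row 23 points to PyTorch's index_put as the way to produce ScatterND in an exported ONNX model. The sketch below shows a 'line/vector' (row-wise) update under the listed constraints; shapes, names and the output path are illustrative:

```python
import torch

class RowUpdate(torch.nn.Module):
    def forward(self, updates: torch.Tensor, rows: torch.Tensor) -> torch.Tensor:
        # Per row 23, the 'data' input is ignored by TIDL and assumed
        # to be an all-zero buffer, so build it as zeros here.
        data = torch.zeros(16, 32, dtype=updates.dtype)
        # Row-wise ("line/vector") update; exports as ONNX ScatterND.
        return data.index_put((rows,), updates)

updates = torch.randn(4, 32)
rows = torch.tensor([0, 3, 7, 9])  # index values must fit in int32 for TIDL
torch.onnx.export(RowUpdate(), (updates, rows), "scatter_nd.onnx",
                  opset_version=18)
```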
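Similarly for row 24, index_select with a one-dimensional indices tensor exports as ONNX Gather; a minimal sketch with illustrative shapes:

```python
import torch

class RowGather(torch.nn.Module):
    def forward(self, x: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
        # 'line/vector' gather; idx must be one-dimensional per row 24.
        return torch.index_select(x, dim=0, index=idx)

x = torch.randn(16, 32)
idx = torch.tensor([1, 5, 2])
torch.onnx.export(RowGather(), (x, idx), "gather.onnx", opset_version=18)
```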
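Row 26's operator chain is the textbook layer-norm decomposition over the width axis; writing it out makes the pattern TIDL matches explicit (the epsilon value is illustrative):

```python
import torch

def layernorm_pattern(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # ReduceMean -> Sub -> Pow(2) -> ReduceMean -> Add -> Sqrt -> Div,
    # the chain from row 26, applied over the width axis (axis = -1).
    mean = x.mean(dim=-1, keepdim=True)            # ReduceMean
    centered = x - mean                            # Sub
    var = centered.pow(2).mean(-1, keepdim=True)   # Pow(2), ReduceMean
    return centered / (var + eps).sqrt()           # Add, Sqrt, Div

x = torch.randn(1, 3, 8, 8)
ref = torch.nn.functional.layer_norm(x, normalized_shape=(8,))
assert torch.allclose(layernorm_pattern(x), ref, atol=1e-6)
```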

## Other compatible layers

| No. | ONNX Ops | TFLite Ops | Notes |
|-----|----------|------------|-------|
| 1 | Split | | The Split layer is removed after import |
| 2 | | MINIMUM | For ReLU6 / ReLU8 |
| 3 | Clip | | Parametric activation threshold (PACT) |

## Supported model formats & operator versions

Proto files from the versions below are used for validating pre-trained models. In most cases, models from newer versions should also work, since the core operators tend to remain the same.

- ONNX - 1.13.0
- ONNX Runtime - 1.14.0 (OPSET-18)
- TFLite - TensorFlow 2.12.0
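
To confirm which opset a given model was exported with, the onnx package can read it directly; a minimal sketch (the model path is illustrative):

```python
import onnx

model = onnx.load("model.onnx")  # illustrative path
for opset in model.opset_import:
    # An empty domain string denotes the core ONNX operator set;
    # the versions above were validated against opset 18.
    print(opset.domain or "ai.onnx", opset.version)
```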

## Feature set comparison across devices

| Feature | AM62A | AM67A | AM68A | AM68PA | AM69A |
|---------|-------|-------|-------|--------|-------|
| Support for asymmetric, per-channel quantization (asymmetric, per-axis quantization) | ✔️ | ✔️ | ✔️ | | ✔️ |
| Support for LUT-accelerated non-linear activations<sup>1</sup> | ✔️ | ✔️ | | | ✔️ |

<sup>1</sup> LUT-accelerated non-linear activations include Sigmoid, Hard Sigmoid, GELU, TanH, Softmax & ELU