New in this Release
Description | Notes |
---|---|
Support for new device J722S/AM67A | |
Support for vision transformer models (DeiT, Swin, DETR). Added/optimized new operators: MatMul, broadcast (MatMul, Eltwise), 2D Softmax, LayerNorm, patch embedding, patch merging, GELU, SiLU | TDA4VM has limited validation; refer to TIDL-3867 |
Support for the ConvNeXt and YOLOv8 model architectures. Added/optimized new operator: object detection layer for YOLOv8 | TDA4VM has limited validation for MatMul with variable inputs; refer to TIDL-3867 |
Improved robustness for low-latency inference mode (advanced_options:inference_mode = TIDL_inferenceModeLowLatency) | Applicable only to AM69A/J784S4 |
Support for non-linear activation functions (Tanh, Sigmoid, Softmax, GELU, ELU, SiLU) on AM62A and J722S | Already supported on other devices in previous release(s) |
Optimized the ScatterND sum operator | |
Migration to TFLite-RT version 2.12 | |
Fixed in this Release
ID | Description | Affected Platforms |
---|---|---|
TIDL-2950 | 7x7 depthwise separable convolution with number of groups greater than panelWidth / kernel rows results in wrong output on target/EVM | All except AM62, TDA4VM |
TIDL-3873 | Transpose behavior is inconsistent across different input combinations | All except AM62 |
TIDL-3874 | MatMul operator has issues (A) with variables/activations as inputs and (B) with differing input dimensions | All except AM62 |
TIDL-3833 | Model inference hangs in a convolution layer on target/EVM but works in host emulation, with the following warning during the init stage of inference: "WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!" | All except TDA4VM, AM62 |
TIDL-3831 | Max pool with asymmetric stride has a functional mismatch between target/EVM and host emulation; the target behavior is incorrect | All except AM62 |
TIDL-3812 | TVM-DLR: Models with two-dimensional Softmax have a functional issue | All except AM62 |
TIDL-3773 | Layers with multiple consumers running in asymmetric quantization give wrong output if any of the consumers does not support asymmetric quantization | All except AM62, TDA4VM |
TIDL-3747 | Resize layer with "coordinate_transformation_mode": "align_corners" is not supported in TIDL | All except AM62 |
TIDL-3714 | Protobuf version is not in sync with the version required for model compilation | All except AM62 |
TIDL-3679 | Model compilation fails with quantization_scale_type = 4 and tensor_bits = 16 | All except AM62 |
TIDL-3659 | Concat layer along height/width gives wrong output on target when the number of input channels is one | All except AM62 |
TIDL-3648 | Concat layer gives wrong output on target/EVM with the following message: "WorkloadUnit_XfrLinkInit: Error: Out of channel:" | All except AM62 |
TIDL-3641 | Low-latency inference mode (inference_mode = 2) had undergone only limited functional validation | AM69A (J784S4) |
TIDL-3010 | Data convert layer that performs a layout change (NHWC to NCHW) hangs on target/EVM when the input tensor shape is of the form 1x1x1xN | All except AM62 |
TIDL-2878 | Object detection post-processing crashes if all convolution heads are not part of the same subgraph | All except AM62 |
TIDL-2821 | Non-depthwise-separable convolution layers with input pad = 0 running in 16-bit hang on EVM | All except AM62 |
TIDL-1878 | Custom layer with float output results in an error during compilation | All except AM62 |
Known Issues
ID | Description | Affected Platforms | Occurrence | Workaround in this release |
---|---|---|---|---|
TIDL-3863 | Networks with 7x7 depthwise separable layers and a very large number of layers fail compilation with the error "Memory limit exceeded for Workload Creation. Max number of Workload Limit per core is" during the compilation stage | All except AM62 | Very Rare | Modify the network to avoid 7x7 depthwise separable layers |
TIDL-3866 | Vision transformers with the LayerNorm operator in 16-bit can have a bit mismatch between host emulation and target/EVM. This 1-bit delta is functionally harmless and can be ignored | All except AM62 | Frequent | None |
TIDL-3867 | Vision transformer support and MatMul with variable inputs have undergone limited validation on TDA4VM/J721E | TDA4VM/J721E | Frequent | None |
TIDL-3864 | MatMul with broadcast is supported with the following constraint: if the tensor dimensions of the inputs to MatMul are B1 x N1 x C1 x H1 x W1 and B2 x N2 x C2 x H2 x W2, then either B1 = N1 = C1 = 1 or B2 = N2 = C2 = 1 must hold | All except AM62 | Frequent | None |
TIDL-3865 | Eltwise with broadcast is supported with the following constraints: if the tensor dimensions of the inputs to Eltwise are B1 x N1 x C1 x H1 x W1 and B2 x N2 x C2 x H2 x W2, then either B1 = N1 = C1 = 1 or B2 = N2 = C2 = 1 must hold, and W1 must equal W2. Examples: (1) 1x1x1x1x5 and 1x1x1x10x5 – supported; (2) 1x1x1x10x5 and 1x1x20x10x5 – supported; (3) 1x1x1x10x1 and 1x1x1x10x5 – not supported | All except AM62 | Frequent | None |
TIDL-3868 | Vision transformer support has the following constraints: (1) advanced_options:inference_mode = TIDL_inferenceModeLowLatency is not supported; (2) advanced_options:high_resolution_optimization = 1 is not supported; (3) mixed precision is not supported | All except AM62 | Frequent | None |
TIDL-3870 | Partial networks with a batch dimension are not supported | All except AM62 | Frequent | Use the scripts provided as part of model optimization |
TIDL-2991 | Non-strided row flow convolution with top pad > 1 and procSize < inWidth can lead to incorrect outputs | All except AM62 | Rare | Modify the network to avoid this situation |
TIDL-2947 | Convolution with pad greater than the input width results in incorrect outputs | All except AM62 | Rare | Modify the network to avoid this situation |
TIDL-3704 | CPP inference mode of ONNX RT throws the error below and produces functionally incorrect behavior: "onnxruntime::common::Status onnxruntime::IExecutionFrame::GetOrCreateNodeOutputMLValue(int, int, const onnxruntime::TensorShape*, OrtValue*&, const onnxruntime::Node&) shape && tensor.Shape() == *shape was false. OrtValue shape verification failed." | All except AM62 | Rare | Set export TIDL_RT_ONNX_VARDIM=1 to prevent this |
TIDL-2592 | TFLite-RT with TIDL-RT delegation supports only models with 4-dimensional tensors | All except AM62 | Rare | None |
TIDL-3872 | Preemption of a network by another network is not supported | J722S | Frequent | None |
TIDL-3871 | Low-latency inference mode (a single network instance split across multiple C7x cores), enabled by advanced_options:inference_mode = TIDL_inferenceModeLowLatency, is not supported | J722S | Frequent | None |
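
The broadcast constraints in TIDL-3864/TIDL-3865 can be sketched as a simple shape check before deploying a model. This is a minimal illustration only, not TIDL API code; the function names are hypothetical:

```python
# Illustrative check of the broadcast constraints from TIDL-3864/TIDL-3865.
# Shapes are 5-tuples (B, N, C, H, W). Hypothetical helper names, not TIDL APIs.

def matmul_broadcast_ok(a, b):
    """TIDL-3864: either B1 = N1 = C1 = 1 or B2 = N2 = C2 = 1 must hold."""
    return a[:3] == (1, 1, 1) or b[:3] == (1, 1, 1)

def eltwise_broadcast_ok(a, b):
    """TIDL-3865: same leading-dimension rule as MatMul, plus W1 must equal W2."""
    return matmul_broadcast_ok(a, b) and a[4] == b[4]

# Examples listed under TIDL-3865:
print(eltwise_broadcast_ok((1, 1, 1, 1, 5), (1, 1, 1, 10, 5)))    # True  (supported)
print(eltwise_broadcast_ok((1, 1, 1, 10, 5), (1, 1, 20, 10, 5)))  # True  (supported)
print(eltwise_broadcast_ok((1, 1, 1, 10, 1), (1, 1, 1, 10, 5)))   # False (W1 != W2)
```

Running such a check over a model's MatMul/Eltwise input shapes ahead of compilation can flag unsupported broadcasts early, instead of discovering them at the compilation or inference stage.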