
09_02_06_00

@vtrip97 released this 08 Apr 14:42

New in this Release

| Description | Notes |
| --- | --- |
| Support for new device J722S/AM67A | |
| Support for vision transformer models (DeiT, Swin, DETR) | Added/optimized new operators: Matmul, broadcasted Matmul/Eltwise, 2D Softmax, LayerNorm, patch embedding, patch merging, GeLU, SiLU. TDA4VM has limited validation; refer to TIDL-3867 |
| Support for ConvNext and YOLOv8 model architectures | Added/optimized new operators: object detection layer for YOLOv8. TDA4VM has limited validation for Matmul with variable input; refer to TIDL-3867 |
| Improved robustness for low-latency inference mode (advanced_options:inference_mode = TIDL_infereneModeLowLatency) | Only applicable to AM69A/J784S4 |
| Support for non-linear activation functions (Tanh, Sigmoid, Softmax, GELU, ELU, SiLU) on AM62A and J722S | Other devices already support these from previous release(s) |
| Optimization of the ScatterND (sum) operator | |
| Migration to TFLite-RT version 2.12 | |
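For context, the inference mode called out above is selected through the runtime's compilation options. The snippet below is a hedged sketch only: the option names are taken from this page, the value 2 comes from TIDL-3641's "inference mode = 2", and the default-mode value and surrounding compile/runtime API are assumptions that depend on the edgeai-tidl-tools release.

```python
# Hedged sketch: selecting TIDL's low-latency inference mode via the
# advanced_options keys mentioned in these notes. The value 2 corresponds
# to "inference mode = 2" (low latency) per TIDL-3641; the default value 0
# is an assumption. The compile/runtime API that consumes this dict is
# release-specific and not shown here.
TIDL_INFERENCE_MODE_DEFAULT = 0      # assumed default-mode value
TIDL_INFERENCE_MODE_LOW_LATENCY = 2  # "inference mode = 2" per TIDL-3641

compile_options = {
    "advanced_options:inference_mode": TIDL_INFERENCE_MODE_LOW_LATENCY,
    # Per TIDL-3868, high_resolution_optimization must stay disabled
    # when compiling vision transformer models.
    "advanced_options:high_resolution_optimization": 0,
}
```

Note that per the tables below, low-latency mode is only applicable to AM69A/J784S4 and is not supported on J722S (TIDL-3871).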

Fixed in this Release

| ID | Description | Affected Platforms |
| --- | --- | --- |
| TIDL-2950 | 7x7 depthwise separable convolution with number of groups greater than panelWidth / kernel rows results in wrong output on target/EVM | All except AM62, TDA4VM |
| TIDL-3873 | Transpose behavior is not stable across different input combinations | All except AM62 |
| TIDL-3874 | Matmul operator has issues (A) with variables/activations as inputs and (B) with different dimensions | All except AM62 |
| TIDL-3833 | Model inference gets stuck in a conv layer on target/EVM but works in host emulation, with the following warning during the init stage of inference: "WARNING: srcJoint freq greater than mapping buffer space. Might cause overflow!" | All except TDA4VM, AM62 |
| TIDL-3831 | Max Pool with asymmetric stride on target/EVM has a functional mismatch with host emulation; the target behavior is incorrect | All except AM62 |
| TIDL-3812 | TVM_DLR: models with two-dimensional Softmax have a functional issue | All except AM62 |
| TIDL-3773 | Layers with multiple consumers running in asymmetric quantization give wrong output if any of the consumers does not support asymmetric quantization | All except AM62, TDA4VM |
| TIDL-3747 | Resize layer with "coordinate_transformation_mode": "align_corners" not supported in TIDL | All except AM62 |
| TIDL-3714 | Protobuf version is not in sync with what is required for model compilation | All except AM62 |
| TIDL-3679 | Model compilation fails with quantization_scale_type:4 and tensor_bits:16 | All except AM62 |
| TIDL-3659 | Concat layer along height/width gives wrong output on target when the number of input channels is one | All except AM62 |
| TIDL-3648 | Concat layer gives wrong output on target/EVM with the following message: "WorkloadUnit_XfrLinkInit: Error: Out of channel:" | All except AM62 |
| TIDL-3641 | Low-latency inference mode (inference mode = 2) had undergone limited functional validation | AM69A (J784S4) |
| TIDL-3010 | Data convert layer that performs a layout change (from NHWC to NCHW) hangs on target/EVM when the input tensor shape is of the form 1x1x1xN | All except AM62 |
| TIDL-2878 | Object detection post-processing crashes if not all convolution heads are part of the same subgraph | All except AM62 |
| TIDL-2821 | Non-depthwise-separable convolution layers with input pad = 0 running in 16-bit hang on EVM | All except AM62 |
| TIDL-1878 | Custom layer with float output results in an error during compilation | All except AM62 |

Known Issues

| ID | Description | Affected Platforms | Occurrence | Workaround in this Release |
| --- | --- | --- | --- | --- |
| TIDL-3863 | Networks with 7x7 depthwise separable layers and a very large number of layers fail to compile. Refer to the error message "Memory limit exceeded for Workload Creation. Max number of Workload Limit per core is" during the compilation stage | All except AM62 | Very rare | Modify the network to avoid 7x7 DWS layers |
| TIDL-3866 | Vision transformers with the LayerNorm operator in 16-bit data type can have a bit mismatch between host emulation and target/EVM. This mismatch (1-bit delta) is harmless given correct functional behavior and can be ignored | All except AM62 | Frequent | None |
| TIDL-3867 | Vision transformers and Matmul with variable input have undergone limited validation on TDA4VM/J721E | TDA4VM/J721E | Frequent | None |
| TIDL-3864 | Matmul with broadcast is supported only with the following constraint: if the input tensor dimensions to Matmul are B1 x N1 x C1 x H1 x W1 and B2 x N2 x C2 x H2 x W2, then either B1 = N1 = C1 = 1 or B2 = N2 = C2 = 1 must hold | All except AM62 | Frequent | None |
| TIDL-3865 | Eltwise with broadcast is supported only with the following constraints: if the input tensor dimensions to Eltwise are B1 x N1 x C1 x H1 x W1 and B2 x N2 x C2 x H2 x W2, then either B1 = N1 = C1 = 1 or B2 = N2 = C2 = 1 must hold, and W1 must equal W2. Examples: (1) 1 x 1 x 1 x 1 x 5 and 1 x 1 x 1 x 10 x 5 – supported; (2) 1 x 1 x 1 x 10 x 5 and 1 x 1 x 20 x 10 x 5 – supported; (3) 1 x 1 x 1 x 10 x 1 and 1 x 1 x 1 x 10 x 5 – not supported | All except AM62 | Frequent | None |
| TIDL-3868 | Vision transformer support has the following constraints: (1) advanced_options:inference_mode = TIDL_infereneModeLowLatency is not supported; (2) advanced_options:high_resolution_optimization = 1 is not supported; (3) mixed precision is not supported | All except AM62 | Frequent | None |
| TIDL-3870 | Partial network with batch dimension is not supported | All except AM62 | Frequent | Use the scripts provided as part of model optimization |
| TIDL-2991 | Non-strided row-flow convolution with top pad > 1 and procSize < inWidth can produce incorrect outputs | All except AM62 | Rare | Modify the network to avoid this situation |
| TIDL-2947 | Convolution with pad greater than the input width produces incorrect outputs | All except AM62 | Rare | Modify the network to avoid this situation |
| TIDL-3704 | CPP inference mode of ONNX RT throws the error below and behaves functionally incorrectly: "onnxruntime::common::Status onnxruntime::IExecutionFrame::GetOrCreateNodeOutputMLValue(int, int, const onnxruntime::TensorShape*, OrtValue*&, const onnxruntime::Node&) shape && tensor.Shape() == *shape was false. OrtValue shape verification failed." | All except AM62 | Rare | Set export TIDL_RT_ONNX_VARDIM=1 to prevent this |
| TIDL-2592 | TFLite-RT with TIDL-RT delegation supports models with only 4-dimensional tensors | All except AM62 | Rare | None |
| TIDL-3872 | Preemption of a network by another network is not supported | J722S | Frequent | None |
| TIDL-3871 | Low-latency inference mode (a single network instance split across multiple C7x cores), expressed by the option advanced_options:inference_mode = TIDL_infereneModeLowLatency, is not supported | J722S | Frequent | None |
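The broadcast constraints in TIDL-3864/TIDL-3865 can be sketched as a quick shape check. This is an illustrative Python helper, not part of TIDL: shapes are assumed to be 5-tuples (B, N, C, H, W) as written in the issue descriptions, and the three example shape pairs come from TIDL-3865.

```python
# Illustrative sketch of the broadcast constraints in TIDL-3864/3865:
# both ops require one operand's B, N and C dims to all be 1;
# Eltwise additionally requires the W dims to match.
# Shapes are 5-tuples (B, N, C, H, W).

def matmul_broadcast_ok(a, b):
    """TIDL-3864: either B1 = N1 = C1 = 1 or B2 = N2 = C2 = 1 must hold."""
    return a[:3] == (1, 1, 1) or b[:3] == (1, 1, 1)

def eltwise_broadcast_ok(a, b):
    """TIDL-3865: the Matmul constraint plus W1 == W2."""
    return matmul_broadcast_ok(a, b) and a[4] == b[4]

# The three examples listed under TIDL-3865:
print(eltwise_broadcast_ok((1, 1, 1, 1, 5), (1, 1, 1, 10, 5)))    # supported -> True
print(eltwise_broadcast_ok((1, 1, 1, 10, 5), (1, 1, 20, 10, 5)))  # supported -> True
print(eltwise_broadcast_ok((1, 1, 1, 10, 1), (1, 1, 1, 10, 5)))   # not supported -> False
```

Running such a check on a model's Matmul/Eltwise input shapes before compilation can flag graphs that would hit these two issues.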