29 Jun 14:15

7a0d5a6

Latest

Release Notes 3.5

Version Compatibility

Vitis™ AI v3.5 and the DPU IP released with the v3.5 branch of this repository are verified as compatible with Vitis, Vivado™, and PetaLinux version 2023.1. If you are using a previous release of Vitis AI, you should review the version compatibility matrix for that release.

Documentation and Github Repository

Merged UG1313 into UG1414
Streamlined UG1414 to remove redundant content
Streamlined UG1414 to focus exclusively on core tool usage. Core tools such as the Optimizer, Quantizer, and Compiler are now used across multiple targets (e.g., Ryzen™ AI, EPYC™), and this change makes UG1414 more portable to these targets
Migrated Adaptable SoC and Alveo specific content from UG1414 to Github.IO
New Github.IO Toctree structure
Integrated VART Runtime APIs in Doxygen format

Docker Containers and GPU Support

Removed Anaconda dependency from TensorFlow 2 and PyTorch containers in order to address Anaconda commercial license requirements
Updated Docker container to disable Ubuntu 18.04 support (which was available in Vitis AI but not officially supported). This was done to address CVE-2021-3493.

Model Zoo

Add more classic models without modification such as YOLO series and 2D Unet
Provide model info card for each model and Jupyter Notebook tutorials for new models
New copyleft repo for GPL license models

ONNX CNN Quantizer

Initial release
This is a new quantizer that supports the direct PTQ quantization of ONNX models for DPU. It is a plugin built for the ONNXRuntime native quantizer.
Supports power-of-two quantization in both QDQ and QOP formats.
Supports Non-overflow and Min-MSE quantization methods.
Supports various quantization configurations for power-of-two quantization in both QDQ and QOP formats.
Supports signed and unsigned configurations.
Supports symmetric and asymmetric configurations.
Supports per-tensor and per-channel configurations.
Supports ONNX models in excess of 2GB.
Supports the use of the CUDAExecutionProvider for calibration in quantization.
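
As a rough illustration of the power-of-two quantization and the two calibration methods listed above, the following is a minimal pure-Python sketch. It does not use the ONNX quantizer's API. The function names, the 8-bit signed symmetric setting, and the candidate range are all assumptions made for this example, and the tool's actual definitions of "Non-overflow" and "Min-MSE" may differ in detail.

```python
def quantize_pof2(values, frac_bits, bits=8):
    """Quantize then dequantize with a signed, symmetric power-of-two scale."""
    scale = 2.0 ** -frac_bits  # the scale is restricted to a power of two
    qmin, qmax = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return [max(qmin, min(qmax, round(v / scale))) * scale for v in values]


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def non_overflow_frac_bits(values, bits=8, candidates=range(-8, 17)):
    """Finest scale (largest frac_bits) at which no value is clipped."""
    qmin, qmax = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    fitting = [f for f in candidates
               if all(qmin <= round(v * 2.0 ** f) <= qmax for v in values)]
    return max(fitting)


def min_mse_frac_bits(values, bits=8, candidates=range(-8, 17)):
    """Scale that minimizes the mean squared error, with clipping allowed."""
    return min(candidates,
               key=lambda f: mse(values, quantize_pof2(values, f, bits)))


# 201 evenly spaced values in [-0.5, 0.5] plus a single outlier at 1.0.
data = [i / 200 for i in range(-100, 101)] + [1.0]
print(non_overflow_frac_bits(data), min_mse_frac_bits(data))
```

With this data the non-overflow choice is constrained by the outlier, while the minimum-MSE choice accepts a small clipping error on the outlier in exchange for a finer step on the remaining values.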

PyTorch CNN Quantizer

PyTorch 1.13 and 2.0 support
Mixed precision quantization support, supporting float32/float16/bfloat16/intx mixed quantization
Support of bit-wise accuracy cross check between quantizer and ONNX Runtime
Split and chunk operators are automatically converted to slicing
Dict input/output support for model forward function
Keywords argument support for model forward function
Matmul subroutine support
Add support for BFP data type quantization
QAT supports training on multiple GPUs
QAT supports operations with multiple inputs or outputs
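
The BFP (block floating point) data type mentioned above can be pictured with a short sketch: every value in a block shares one exponent and keeps only a small signed mantissa. This is a generic illustration, not the quantizer's implementation. The block size, the 8-bit mantissa width, and the function name are assumptions, since the release notes do not specify the format.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize then dequantize one block that shares a single exponent."""
    max_abs = max(abs(v) for v in block)
    if max_abs == 0.0:
        return list(block)
    # frexp returns (m, e) with 0.5 <= m < 1, so max_abs < 2**e.
    shared_exp = math.frexp(max_abs)[1]
    # Step size of the signed mantissa under the shared exponent.
    step = 2.0 ** (shared_exp - (mantissa_bits - 1))
    qmin = -(1 << (mantissa_bits - 1))
    qmax = (1 << (mantissa_bits - 1)) - 1
    return [max(qmin, min(qmax, round(v / step))) * step for v in block]


block = [1.0, 0.5, 0.25, 0.001]
print(bfp_quantize(block))
```

Because the exponent is set by the largest magnitude in the block, small values next to a large one lose precision, which is the main trade-off of the format.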

TensorFlow 2 CNN Quantizer

Updated to support TensorFlow 2.12 and Python 3.8.
Adds support for quantizing subclass models.
Adds support for mixed precision with layer-wise data type configuration, covering float32, float16, bfloat16, and int quantization.
Adds support for BFP datatypes, and adds a new quantize strategy called 'bfp'.
Adds support to quantize Keras nested models.
Adds experimental support for quantizing the frozen pb format model in TensorFlow 2.x.
Adds a new 'gpu' quantize strategy which uses float scale quantization and is used in GPU deployment scenarios.
Adds support for exporting the quantized model to frozen pb format or ONNX format.
Adds support for exporting the quantized model with power-of-two scales to frozen pb format with "FixNeuron" inside, for compatibility with compilers that take pb format input.
Adds support for splitting Avgpool and Maxpool with large kernel sizes into smaller kernel sizes.
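
The pooling split mentioned above relies on the fact that a large non-overlapping pooling window can be computed as a cascade of smaller ones. The sketch below shows this in one dimension with plain Python lists. It only demonstrates the equivalence for the simple case where the stride equals the kernel size and the kernel factorizes evenly. How the quantizer chooses and applies the split is not described in these notes, and the helper names are made up for the example.

```python
def max_pool_1d(xs, kernel, stride):
    """Max pooling over a list with the given window size and stride."""
    return [max(xs[i:i + kernel])
            for i in range(0, len(xs) - kernel + 1, stride)]


def avg_pool_1d(xs, kernel, stride):
    """Average pooling over a list with the given window size and stride."""
    return [sum(xs[i:i + kernel]) / kernel
            for i in range(0, len(xs) - kernel + 1, stride)]


signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]

# One pooling step with kernel 4 versus two cascaded steps with kernel 2.
large_max = max_pool_1d(signal, kernel=4, stride=4)
split_max = max_pool_1d(max_pool_1d(signal, 2, 2), 2, 2)

large_avg = avg_pool_1d(signal, kernel=4, stride=4)
split_avg = avg_pool_1d(avg_pool_1d(signal, 2, 2), 2, 2)

print(large_max == split_max, large_avg == split_avg)
```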

Bug Fixes:

Fixes a gradient bug in the 'pof2s_tqt' quantize strategy.
Fixes a bug of quantization position change introduced by the fast fine-tuning process after the PTQ.
Fixes a graph transformation bug when a TFOpLambda op has multiple inputs.

TensorFlow 1 CNN Quantizer

Adds support for fast fine-tuning that improves PTQ accuracy.
Adds support for folding Reshape and ResizeNearestNeighbor operators.
Adds support for splitting Avgpool and Maxpool with large kernel sizes into smaller kernel sizes.
Adds support for quantizing Sum, StridedSlice, and Maximum operators.
Adds support for setting the input shape of the model, which is useful in the deployment of models with undefined input shapes.
Adds support for setting the opset version when exporting to ONNX format.

Bug Fixes:

Fixes a bug where the AddV2 operation is misunderstood as a BiasAdd.

Compiler

New operators supported: Broadcast add/mul, Bilinear downsample, Trilinear downsample, Group conv2d, Strided-slice
Performance improved on XV2DPU
Error message improved
Compilation time speed up

PyTorch Optimizer

Removed requirement for license purchase
Migrated to Github open-source
Supports PyTorch 1.11, 1.12 and 1.13
Supports pruning of grouped convolution
Supports setting the number of channels to be a multiple of the specified number after pruning
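
The channel-multiple option above can be sketched as a small rounding helper. This is an assumption about the arithmetic only: the function name, the choice to round up rather than down, and the cap at the original channel count are all hypothetical and are not taken from the Optimizer's API.

```python
def align_channels(pruned, multiple, original):
    """Round a pruned channel count up to a multiple, capped at the original."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    aligned = -(-pruned // multiple) * multiple  # ceiling to the next multiple
    # If the original count is not itself a multiple, the cap can win.
    return min(aligned, original)


print(align_channels(37, 8, 64))
```

Keeping channel counts on a fixed multiple is typically useful because hardware processes channels in fixed-width groups, so an unaligned count can leave part of a group idle.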

TensorFlow 2 Optimizer

Removed requirement for license purchase
Migrated to Github open-source
Supports TensorFlow 2.11 and 2.12
Supports pruning of tf.keras.layers.SeparableConv2D
Fixed tf.keras.layers.Conv2DTranspose pruning bug
Supports setting the number of channels to be a multiple of the specified number after pruning

Runtime

Supports Versal AI Edge VEK280 evaluation kit
Buffer optimization for multi-batches to improve performance
Add new tensor buffer interface to enhance zero copy

Vitis ONNX Runtime Execution Provider (VOE)

Supports ONNX Opset version 18, ONNX Runtime 1.16.0 and ONNX version 1.13
Supports both C++ and Python APIs (Python version 3)
Supports VitisAI EP and other EPs working together to deploy the model
Provides ONNX examples based on C++ and Python APIs
VitisAI EP is open source and upstreamed to ONNX public repo on Github

Library

Added three new model libraries and support for five additional models

Model Inspector:

Support inspection for new DPU IPs

Profiler

Added Profiler support for DPUCV2DX8G

DPU IP - Versal AIE-ML Targets DPUCV2DX8G (Versal AI Edge / Core)

First general access release
Configurable from C20B1 to C20B14
Support most 2D operators required to deploy models found in the Model Zoo
General support for the VE2802/VC2802 and V70
Early access support for the VE2302 via this lounge

DPU IP - Zynq Ultrascale+ DPUCZDX8G

IP has reached maturity
No updates for this release
No updated reference design (DPU TRD) will be published for minor (i.e., x.5) releases
No updated pre-built board image will be published for minor (i.e., x.5) releases

DPU IP - Versal AIE Targets DPUCVDX8G

IP has reached maturity
No updates for this release
No updated reference design (DPU TRD) will be published for minor (i.e., x.5) releases
No updated pre-built board image will be published for minor (i.e., x.5) releases

DPU IP - CNN - Alveo Data Center DPUCVDX8H

IP has reached maturity
No updates for this release
No updated reference design (DPU TRD) will be published for minor (i.e., x.5) releases
No updated pre-built board image will be published for minor (i.e., x.5) releases

WeGO

Enhanced WeGO to support V70 DPU GA release.
Upgraded WeGO to provide support for PyTorch 1.13.1 and TensorFlow r2.12.
Enhanced WeGO-Torch to support PyTorch 2.0 as a preview feature.
Introduced new C++ API support for WeGO-Torch in addition to Python APIs.
Implemented WeGO-TF1 and WeGO-TF2 as out-of-tree plugins.

Known Issues

Engineering to add comments

AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc.

13 Jan 10:04

janifer112x

v3.0

2057c58

Vitis AI 3.0 Release

Release Notes 3.0

Documentation and Github Repository

Migrated core documentation to Github.IO.
Incorporated offline HTML documentation for air-gapped users.
Restructured user documentation.
Restructured repository directory structure for clarity and ease-of-use.

Docker Containers and GPU Support

Migrated from multi-framework to per framework Docker containers.
Enabled Docker ROCm GPU support for quantization and pruning.

Model Zoo

Updated Model Zoo with commentary regarding dataset licensing restrictions
Added 14 new models and deprecated 28 models for a total of 130 models
Added super resolution 4x, as well as 2D and 3D semantic segmentation for Medical applications
Optimized models for benchmarks:
- MLPerf: 3D-Unet
- FAMBench: MaskRCNN
Provides optimized backbones supporting YoloX, v4, v5, v6 and EfficientNet-Lite
Ease-of-use enhancements, including replacing markdown-format performance tables with a downloadable Model Zoo spreadsheet
Added 72 PyTorch and TensorFlow models for AMD EPYC™ CPUs, targeting deployment with ZenDNN
Added models to support AMD GPU architectures based on ROCm and MLGraphX

TensorFlow 2 CNN Quantizer

Based on TensorFlow 2.10
Updated the Model Inspector to improve the accuracy of the partitioning results expected from the DPU compiler.
Added support for datatype conversions for float models, including FP16, BFloat16, FP32, and double.
Added support for exporting quantized ONNX format models (to support the ONNX Runtime).
Added support for new layer types including SeparableConv2D and PReLU.
Added support for unsigned integer quantization.
Added support for automatic modification of input shapes for models with variable input shapes.
Added support to align the input and output quantize positions for Concat and Pooling layers.
Added error codes and improved the readability of the error and warning messages.
Various bug fixes.

TensorFlow 1 CNN Quantizer

Separated the quantizer code from the TensorFlow code, making it a plug-in module to the official TensorFlow code base.
Added support for exporting quantized ONNX format models (to support the ONNX Runtime).
Added support for datatype conversions for float models, including FP16, BFloat16, FP32 and double.
Added support for additional operations, including Max, Transpose, and DepthToSpace.
Added support for aligning input and output quantize positions of Concat and Pooling operations.
Added support for automatic replacement of Softmax with DPU-accelerated Softmax.
Added error codes and improved the readability of the error and warning messages.
Various bug fixes.

PyTorch CNN Quantizer

Support PyTorch 1.11 and 1.12.
Support exporting torch script format quantized model.
QAT supports exporting trained model to ONNX and torch script.
Support FP16 model quantization.
Optimized Inspector to support more pattern types, with backward-compatible device assignment.
Cover more PyTorch operators: more than 560 types of PyTorch operators are supported.
Enhanced parsing to support control flow parsing.
Enhanced message system with more useful message text.
Support fusing and quantization of BatchNorm without affine calculation.

Compiler

Added support for new operators, including: strided_slice, cost volume, correlation 1D & 2D, argmax, group conv2d, reduction_max, reduction_mean
Added support for Versal™ AIE-ML architectures DPUCV2DX8G (V70 and Versal AI Edge)
Focused effort to improve the intelligibility of error and partitioning messages

PyTorch Optimizer

Added support for fine-grained model pruning (sparsity)
OFA support for convolution layers with kernel sizes = (1,3) and dilation
OFA support for ConvTranspose2D
Added pruning configuration that allows users to specify pruning hyper-parameters
Specific exception types are defined for each type of error
Enhanced parallel model analysis with increased robustness
Support for PyTorch 1.11 and 1.12
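
Fine-grained pruning (sparsity), listed above, removes individual weights rather than whole channels. A common way to picture it is magnitude pruning, sketched below in plain Python. This is a generic illustration under that assumption. The Optimizer's actual selection criterion and API are not described in these notes, and the function name is hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude weights until the target sparsity is met."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


print(magnitude_prune([0.9, -0.1, 0.05, -0.7], sparsity=0.5))
```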

TensorFlow 2 Optimizer

Added support for Keras ConvTranspose2D, Conv3D, ConvTranspose3D
Added support for the TFOpLambda operator
Added pruning configuration that allows users to specify pruning hyper-parameters
Specific exception types are defined for each type of error
Added support for TensorFlow 2.10

Runtime and Library

Added support for Versal AI Edge VEK280 evaluation kit and Alveo™ V70 accelerator cards (Early Access)
Added support for ONNX runtime, with eleven ONNX-specific examples
Added four new model libraries to the Vitis™ AI Library and support for fifteen additional models
Focused effort to improve the intelligibility of error messages

Profiler

Added Profiler support for DPUCV2DX8G (VEK280 Early Access)
Added Profiler support for Versal DDR bandwidth profiling

DPU IP - Zynq Ultrascale+ DPUCZDX8G

Upgraded to enable Vivado™ and Vitis 2022.2 release
Added support for 1D and 2D Correlation, Argmax and Max
Reduced resource utilization
Timing closure improvements

DPU IP - Versal AIE Targets DPUCVDX8G

Upgraded to enable Vivado and Vitis 2022.2 release
Added support for 1D and 2D Correlation
Added support for Argmax and Max along the channel dimension
Added support for Cost-Volume
Reduced resource utilization
Timing closure improvements

DPU IP - Versal AIE-ML Targets DPUCV2DX8G (Versal AI Edge)

Early access release supporting early adopters with an early, unoptimized AIE-ML DPU
Supports most 2D operators (currently does not support 3D operators)
Batch size support from 1 to 13
Supports more than 90 Model Zoo models

DPU IP - CNN - Alveo Data Center DPUCVDX8H

Upgraded to enable Vitis 2022.2 release
Timing closure improvements via scripts supplied for .xo workflows

DPU IP - CNN - V70 Data Center DPUCV2DX8G

Early access release supporting early adopters with an unoptimized DPU
Supports most 2D operators (currently does not support 3D operators)
Batch size 13 support
Supports more than 70 Model Zoo models

WeGO

Integrated WeGO with the Vitis-AI Quantizer to enable on-the-fly quantization and improve ease-of-use
Introduced serialization and deserialization with the WeGO flow to offer the capability of building once and running anytime
Incorporated AMD ZenDNN into WeGO, enabling additional optimization for AMD EPYC CPU targets
Improved WeGO robustness to offer a better developer experience and support a wider range of models

Known Issues

Bitstream loading error occurs when the AIE-ML DPU application running on the VEK280 kit is interrupted manually
HDMI not functional for the early access VEK280 image. The issue will be fixed in the next release

AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc.

16 Jun 01:34

hanxue

v2.5

c26eae3

Vitis AI 2.5 Release

New Features/Highlights

AI Model Zoo added 14 new models, including BERT-based NLP, Vision Transformer (ViT), Optical Character Recognition (OCR), Simultaneous Localization and Mapping (SLAM), and more Once-for-All (OFA) models
Added 38 base & optimized models for AMD EPYC server processors
AI Quantizer added model inspector, now supports TensorFlow 2.8 and PyTorch 1.10
Whole Graph Optimizer (WeGO) supports PyTorch 1.x and TensorFlow 2.x
Deep Learning Processing Unit (DPU) for Versal® ACAP supports multiple compute units (CU), new Arithmetic Logic Unit (ALU) engine, Depthwise convolution and more operators supported by the DPUs on VCK5000 and Alveo™ data center accelerator cards
Inference server supports ZenDNN as backend on AMD EPYC™ server processors
New examples added to Whole Application Acceleration (WAA) for VCK5000 Versal development card and Zynq® UltraScale+™ evaluation kits

Release Notes

AI Model Zoo

Added 14 new models, and 134 models in total
Expanded model categories for diverse AI workloads:
    Added models for data center application requirements including text detection and end-to-end OCR
    Added BERT-based NLP and Vision Transformer (ViT) models on VCK5000
    More OFA-optimized models, including OFA-RCAN for Super-Resolution and OFA-YOLO for Object Detection
    Added models for industrial vision and SLAM, including Interest Point Detection & Description model and Hierarchical Localization model.
Added 38 base & optimized models for AMD EPYC CPU
EoU enhancement:
    Improved model index by application categories

AI Quantizer-CNN

Added Model Inspector that inspects a float model and shows partition results
Support TensorFlow 2.8 and PyTorch 1.10
Support float-scale and per-channel quantization
Support configuration for different quantize strategies

AI Optimizer

OFA enhancement:
    Support even kernel size of convolution
    Support ConvTranspose2d
    Updated examples
One-step and iterative pruning enhancement:
    Resumed model analysis or search after exception

AI Compiler

Support ALU for DPUCZDX8G
Support new models

AI Library / VART

Added 6 new model libraries and support 17 new models
Custom Op Enhancement
Added new CPU operators
Xdputil Tool Enhancement
Two new demos on VCK190 Versal development board

AI Profiler

Full support on custom OP and Graph Runner
Stability optimization

Edge DPU-DPUCZDX8G

New ALU engine to replace pool engine and DepthWiseConv engine in MISC:
    ALU: support new features, e.g. large-kernel-size MaxPool, AveragePool, rectangle-kernel-size AveragePool, 16bit const weights
    ALU: support HardSigmoid and HardSwish
    ALU: support DepthwiseConv + LeakyReLU
    ALU: support the parallelism configuration
DPU IP and TRD on ZCU102 with encrypted RTL IP based on 2022.1 Vitis platform

Edge DPU-DPUCVDX8G

Optimized ALU that better supports features like channel-attention
Support multiple compute units
Support DepthwiseConv + LeakyReLU
Support Versal DPU IP and TRD on VCK190 with encrypted RTL and AIE code, still in C32B1-6/C64B1-5 configurations and based on the 2022.1 Vitis platform

Cloud DPU-DPUCVDX8H

Enlarged DepthWise convolution kernel size that ranges from 1x1 to 8x8
Support AIE based pooling and ElementWise add & multiply, and big kernel size pooling
Support more DepthWise convolution kernel sizes

Cloud DPU-DPUCADF8H

Support ReLU6/LeakyReLU and MobileNet series models
Fixed the issue of missing directories in some cases in the .XO flow

Whole Graph Optimizer (WeGO)

Support PyTorch 1.x and TensorFlow 2.x in-framework inference
Added 19 PyTorch 1.x/TensorFlow 2.x/TensorFlow 1.x examples, including classification, object detection, segmentation and more

Inference Server

Added gRPC API to inference server flow
Support TensorFlow/PyTorch
Support AMD ZenDNN as backend

WAA

New examples for VCK5000 & ZCU104 - ResNet & adas_detection
New ResNet example containing AIE based pre-processing kernel
Xclbin generation using Pre-built DPU flow for ZCU102/U50 ResNet and adas_detection applications
Xclbin generation using build flow for ZCU104/VCK190 ResNet and adas_detection applications
Porting of all VCK190 examples from ES1 to production version and use base platform instead of custom platform

20 Jan 07:21

hanxue

v2.0

d02dcb6

Vitis AI 2.0 Release

Release 2.0

New Features/Highlights

General Availability (GA) for VCK190 (Production Silicon), VCK5000 (Production Silicon) and U55C
Add support for newer PyTorch and TensorFlow versions: PyTorch 1.8-1.9, TensorFlow 2.4-2.6
Add 22 new models, including Solo, Yolo-X, UltraFast, CLOCs, PSMNet, FairMOT, SESR, DRUNet, SSR as well as 3 NLP models and 2 OFA (Once-for-all) models
Add the new custom OP flow to run models with DPU un-supported OPs with enhancement across quantizer, compiler and runtime
Add more layers and configurations of DPU for VCK190 and DPU for VCK5000
Add OFA pruning and TF2 keras support for AI optimizer
Run inference directly from TensorFlow (Demo) for cloud DPU

Release Notes

Model Zoo

22 new models added, 130 total
- 19 new PyTorch models including 3 NLP and 2 OFA models
- 3 new TensorFlow models
Added new application models
- AD/ADAS: Solo for instance segmentation, Yolo-X for traffic sign detection, UltraFast for lane detection, CLOCs for sensor fusion
- Medical: SESR for super resolution, DRUNet for image denoise, SSR for spectral remove
- Smart city and industrial vision: PSMNet for binocular depth estimation, FairMOT for joint detection and Re-ID
EoU Enhancements
- Updated automatic script to search and download required models

Quantizer

TF2 quantizer
- Add support TF 2.4-2.6
- Add support for custom OP flow, including shape inference, quantization and dumping
- Add support for CUDA 11
- Add support for input_shape assignment when deploying QAT models
- Improve support for TFOpLambda layers
- Update support for hardware simulation, including sigmoid layer, leaky_relu layer, global and non-global average pooling layer
- Bug fixes for sequential models and quantize position adjustment
TF1 quantizer
- Add quantization support for new ops, including hard-sigmoid, hard-swish, element-wise multiply ops
- Add support for replacing normal sigmoid with hard sigmoid
- Update support for float weights dumping when dumping golden results
- Bug fixes for inconsistencies between the Python APIs and CLI APIs
PyTorch quantizer
- Add support for PyTorch 1.8 and 1.9
- Support CUDA 11
- Support custom OP flow
- Improve fast finetune performance on memory consumption and accuracy
- Reduce feature map memory consumption during quantization
- Improve QAT functions including better initialization of quantization scale and new API for getting quantizer’s parameters
- Support quantization of more operations: some 1D and 3D ops, DepthwiseConvTranspose2D, pixel-shuffle, pixel-unshuffle, const
- Support CONV/BN merging in pattern of CONV+CONCAT+BN
- Message enhancements to help users locate problems
- Bug fixes for consistency with hardware
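
The TF1 quantizer item above about replacing the normal sigmoid with a hard sigmoid can be illustrated with a short comparison. The sketch assumes the commonly used piecewise-linear form relu6(x + 3) / 6. The exact variant implemented for the DPU is not stated in these notes and may differ.

```python
import math


def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_sigmoid(x):
    """Piecewise-linear approximation, assumed here to be relu6(x + 3) / 6."""
    return min(6.0, max(0.0, x + 3.0)) / 6.0


# Largest gap between the two functions on a grid over [-8, 8].
grid = [i / 10 for i in range(-80, 81)]
max_gap = max(abs(sigmoid(x) - hard_sigmoid(x)) for x in grid)
print(max_gap)
```

The hard variant needs only an add, a clamp, and a constant multiply, which is why it is attractive for fixed-point hardware.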

Optimizer

TensorFlow 1.15
- Support tf.keras.Optimizer for model training
TensorFlow 2.x
- Support TensorFlow 2.3-2.6
- Add iterative pruning
PyTorch
- Support PyTorch 1.4-1.9.1
- Support shared parameters in pruning
- Add one-step pruning
- Add once-for-all (OFA)
- Unified APIs for iterative and one-step pruning
- Enable pruned model to be used by quantizer
- Support nn.Conv3d and nn.ConvTranspose3d

Compiler

DPU on embedded platforms
- Support and optimize conv3d, transposedconv3d, upsample3d and upsample2d for DPUCVDX8G(xvDPU)
- Improve the efficiency of high resolution input for DPUCVDX8G(xvDPU)
- Support ALUv2 new features
DPU on Alveo/Cloud
- Support depthwise-conv2d, h-sigmoid and h-swish for DPUCVDX8H(DPUv4E)
- Support depthwise-conv2d for DPUCAHX8H(DPUv3E)
- Support high resolution model inference
Support custom OP flow

AI Library and VART

Support all the new models in Model Zoo: end-to-end deployment in Vitis AI Library
Improved GraphRunner to better support custom OP flow
Add examples on how to integrate custom OPs
Add more pre-implemented CPU OPs
DPU driver/runtime update to support Xilinx Device Tree Generator (DTG) for Vivado flow

AI Profiler

Support CPU tasks tracking in graph runner
Better memory bandwidth analysis in text summary
Better performance to enable the analysis of large models

Custom OP Flow

Provides new capability of deploying models with DPU unsupported OPs
- Define custom OPs in quantization
- Register and implement custom OPs before the deployment by graph runner
Add two examples
- Pointpillars PyTorch model
- MNIST TensorFlow 2 model

DPU

CNN DPU for Zynq SoC / MPSoC, DPUCZDX8G (DPUv2)
- Upgraded to 2021.2
- Update interrupt connection in Vivado flow
CNN DPU for Alveo-HBM, DPUCAHX8H (DPUv3E)
- Support depth-wise convolution
- Support U55C
CNN DPU for Alveo-DDR, DPUCADF8H (DPUv3Int8)
- Updated U200/U250 xclbins with XRT 2021.2
- Released XO Flow
- Released IP Product Guide (PG400)
CNN DPU for Versal, DPUCVDX8G (xvDPU)
- C32 (32-aie cores for a single batch) and C64 (64-aie cores for a single batch) configurable
- Support configurable batch size 1~5 for C64
- Support and optimize new OPs: conv3d, transposedconv3d, upsample3d and upsample2d
- Reduce Conv bubbles and compute redundancy
- Support 16-bit const weights in ALUv2
CNN DPU for Versal, DPUCVDX8H (DPUv4E)
- Support depth-wise convolution with 6 PE configuration
- Support h-sigmoid and h-swish

Whole App Acceleration

Upgrade to Vitis and Vivado 2021.2
Custom plugin example: PSMNet using Cost Volume (RTL Based) accelerator on VCK190
New accelerator for Optical Flow (TV-L1) on U50
High resolution segmentation application on VCK190
Options to compare throughput & accuracy between FPGA and CPU Versions
- Throughput improvements ranging from 25% to 368%
Reorganized for better usability and visibility

TVM

Add support of DPUs for U50 and U55C

WeGO (Whole Graph Optimizer)

Run inference directly from the TensorFlow framework for cloud DPU
- Automatically perform subgraph partitioning and apply optimization/acceleration for DPU subgraphs
- Dispatch non-DPU subgraphs to TensorFlow running on CPU
Resnet50 and Yolov3 demos on VCK5000
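
The subgraph partitioning described above can be pictured with a toy example that splits a linear sequence of operators into maximal runs that go to the DPU and runs that fall back to the CPU. This is only a conceptual sketch. The supported-operator set, the operator names, and the function are hypothetical, and real graphs are not linear chains.

```python
from itertools import groupby

# Hypothetical set of operators that the accelerator can run.
DPU_SUPPORTED = {"conv2d", "relu", "maxpool"}


def partition(ops):
    """Group consecutive ops into (device, ops) segments."""
    return [("DPU" if on_dpu else "CPU", list(group))
            for on_dpu, group in groupby(ops, key=lambda op: op in DPU_SUPPORTED)]


model = ["conv2d", "relu", "custom_nms", "conv2d"]
for device, segment in partition(model):
    print(device, segment)
```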

Inference Server

Support xmodel serving in cloud / on-premise (EA)

Known Issues

vai_q_caffe hangs when TRAIN and TEST phases point to the same LMDB file
TVM compiled Inception_v3 model gives low accuracy with DPUCADF8H (DPUv3Int8)
TensorFlow 1.15 quantizer error in QAT caused by an incorrect pattern match

27 Oct 14:06

hanxue

v1.4.1

5e500db

Vitis AI 1.4.1 Release

Release 1.4.1

New Features/Highlights

Vitis AI RNN docker public release, including RNN quantizer and compiler
New unified xRNN runtime for U25 & U50LV based on VART Runner interface and XIR xmodels
Release Versal DPU TRD based on 2021.1
Versal WAA app updated to provide better throughput using the new XRT C++ APIs and zero copy
TVM ease-of-use improvement
Support VCK190 and VCK5000 production boards
Bug fixes, including an xCompiler data alignment issue affecting WAA and a quantizer bug

27 Jul 06:05

hanxue

v1.4

84798c7

Vitis AI 1.4 Release

New Features/Highlights

Support new platforms, including Versal ACAP platforms VCK190, VCK5000 and Kria SoM
Better PyTorch and TensorFlow model support: PyTorch 1.5-1.7.1, improved quantization for TensorFlow 2.x models
New models, including 4D Radar detection, Image-Lidar sensor fusion, 3D detection & segmentation, multi-task, depth estimation, super resolution for automotive, smart medical and industrial vision applications
New Graph Runner API to deploy models with multiple subgraphs
DPUCADX8G (DPUv1) deprecated and replaced by DPUCADF8H (DPUv3Int8)
DPUCAHX8H (DPUv3E) and DPUCAHX8L (DPUv3ME) release with xo
Classification & Detection WAA examples for Versal (VCK190)

21 Apr 09:39

hanxue

v1.3.2

46762d9

v1.3.2 Release

Enable Ubuntu 20.04 on MPSoC (Vitis AI Runtime and Vitis AI Library)
Added environment variable for Vitis AI Library’s model search path
Bug fixes for PyTorch / LSTM and logging improvements

23 Mar 07:48

hanxue

v1.3.1

5f40342

Vitis-AI 1.3.1 Release

Updated compiler to improve performance by 5% on average for most models
Added zero copy support (new APIs in VART / Vitis AI Library)
Added cross-layer equalization support in TensorFlow v1.15
Added WAA U50 TRD
Updated U280 Pre-processing using Multi-preprocessing JPEG decode kernels
Bug fixes and improvements for v1.3
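
Cross-layer equalization, listed above for TensorFlow v1.15, rescales matching channels of two consecutive layers so that their weight ranges are balanced before quantization. The sketch below shows the general idea on two small dense layers with a ReLU in between, using plain Python lists. It is not the quantizer's implementation. The helper names are made up, and the scale choice (square root of the ratio of ranges) is one common formulation that is assumed here.

```python
import math


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def equalize(w1, w2):
    """Rescale channel i so row i of w1 and column i of w2 share one range."""
    new_w1 = []
    new_w2 = [list(row) for row in w2]
    for i, row in enumerate(w1):
        r1 = max(abs(x) for x in row)          # range of output channel i
        r2 = max(abs(r[i]) for r in w2)        # range of input channel i
        s = math.sqrt(r1 / r2)                 # assumes r1 > 0 and r2 > 0
        new_w1.append([x / s for x in row])
        for r in new_w2:
            r[i] *= s
    return new_w1, new_w2


w1 = [[8.0, -4.0], [0.1, 0.2]]
w2 = [[0.5, 10.0], [-0.25, 5.0]]
x = [1.0, 2.0]

e1, e2 = equalize(w1, w2)
before = matvec(w2, relu(matvec(w1, x)))
after = matvec(e2, relu(matvec(e1, x)))
print(before, after)
```

The network output is unchanged because ReLU commutes with positive scaling, while the per-channel ranges of the two layers become equal, which tends to help per-tensor quantization.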

17 Dec 22:08

andyluo7

v1.3

e86b6ef

Vitis-AI 1.3 Release

Added support for PyTorch and TensorFlow 2.3 frameworks
Added more ready-to-use AI models for a wider range of applications, including 3D point cloud detection and segmentation, COVID-19 chest image segmentation and other reference models
Unified XIR-based compilation flow from edge to cloud
Vitis AI Runtime (VART) fully open source
New RNN overlay for NLP applications
New CNN DPUs for the low-latency and higher throughput applications on Alveo cards
EoU enhancement with Beta version model partitioning and custom layer/operators plug-in

30 Jul 03:29

kamranjk

v1.2.1

0353caa

Vitis-AI Release 1.2.1

v1.2.1

Added Zynq Ultrascale Plus Whole App examples
Updated U50 XRT and shell to Xilinx-u50-gen3x4-xdma-2-202010.1-2902115
Updated docker launch instructions
Updated TRD makefile instructions

Releases: Xilinx/Vitis-AI
