Commit
Erin Truax committed Dec 2, 2019
1 parent 9c317be, commit 36b31c5
Showing 80 changed files with 773 additions and 4,944 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,66 @@ | ||
DPU V3.1 | ||
# Zynq UltraScale+ MPSoC DPU TRD | ||
|
||
The Xilinx Deep Learning Processor Unit (DPU) is a configurable computation engine dedicated to convolutional neural networks. The degree of parallelism utilized in the engine is a design parameter that can be selected according to the target device and application. The DPU includes a set of highly optimized instructions and supports most convolutional neural networks, such as VGG, ResNet, GoogLeNet, YOLO, SSD, MobileNet, FPN, and others. | ||
|
||
### Features | ||
|
||
- One AXI slave interface for accessing configuration and status registers. | ||
|
||
- One AXI master interface for accessing instructions. | ||
|
||
- Supports configurable AXI master interface with 64 or 128 bits for accessing data depending on the target device. | ||
|
||
- Supports individual configuration of each channel. | ||
|
||
- Supports optional interrupt request generation. | ||
|
||
- Some highlights of DPU functionality include: | ||
    - Configurable hardware architectures: B512, B800, B1024, B1152, B1600, B2304, B3136, and B4096 | ||
    - Maximum of three cores | ||
    - Convolution and deconvolution | ||
    - Depthwise convolution | ||
    - Max pooling | ||
    - Average pooling | ||
    - ReLU, ReLU6, and Leaky ReLU | ||
    - Concat | ||
    - Elementwise-sum | ||
    - Dilation | ||
    - Reorg | ||
    - Fully connected layer | ||
    - Softmax | ||
    - Batch Normalization | ||
    - Split | ||
|
||
### Hardware Architecture | ||
|
||
The detailed hardware architecture of the DPU is shown in the following figure. After start-up, the DPU fetches instructions from off-chip memory to control the operation of the computing engine. The instructions are generated by the DNNC, which performs substantial optimizations. On-chip memory is used to buffer input, intermediate, and output data to achieve high throughput and efficiency. The data is reused as much as possible to reduce the memory bandwidth. A deep pipelined design is used for the computing engine. The processing elements (PE) take full advantage of the fine-grained building blocks such as multipliers, adders, and accumulators in Xilinx devices. | ||
|
||
![DPU Hardware Architecture](./prj/Vitis/doc/dpu_hardware_arch.png) | ||
|
||
|
||
There are three dimensions of parallelism in the DPU convolution architecture: pixel parallelism, input channel parallelism, and output channel parallelism. The input channel parallelism is always equal to the output channel parallelism. The different architectures require different programmable logic resources. The larger architectures can achieve higher performance with more resources. The parallelism for the different architectures is listed in the following table. | ||
|
||
|Convolution Architecture|Pixel Parallelism (PP)|Input Channel Parallelism (ICP)|Output Channel Parallelism (OCP)|Peak (operations per clock)| | ||
|:---|:---|:---|:---|:---| | ||
|B512|4|8|8|512| | ||
|B800|4|10|10|800| | ||
|B1024|8|8|8|1024| | ||
|B1152|4|12|12|1152| | ||
|B1600|8|10|10|1600| | ||
|B2304|8|12|12|2304| | ||
|B3136|8|14|14|3136| | ||
|B4096|8|16|16|4096| | ||
|
||
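The peak figures in the table above are consistent with the product of the three parallelism dimensions, doubled. The source does not state this formula; the sketch below assumes each multiply-accumulate counts as two operations, which reproduces every row of the table. The function name `peak_ops` is hypothetical and used only for illustration.

```shell
# Peak operations per clock for a DPU convolution architecture.
# Assumption (not stated in the README): one multiply-accumulate counts
# as 2 operations, so peak = PP * ICP * OCP * 2.
peak_ops() {
    local pp=$1 icp=$2 ocp=$3
    echo $(( pp * icp * ocp * 2 ))
}

# Columns: architecture name, PP, ICP, OCP (values copied from the table).
while read -r name pp icp ocp; do
    printf '%s %s\n' "$name" "$(peak_ops "$pp" "$icp" "$ocp")"
done <<'EOF'
B512 4 8 8
B800 4 10 10
B1024 8 8 8
B1152 4 12 12
B1600 8 10 10
B2304 8 12 12
B3136 8 14 14
B4096 8 16 16
EOF
```

Under this assumption, the number in each architecture name is its peak operations per clock, which is why B1024 (PP = 8) and B512 (PP = 4) share the same channel parallelism yet differ by a factor of two.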
|
||
**** | ||
|
||
[DPU TRD Vitis Flow](./prj/Vitis/README.md) | ||
|
||
**** | ||
|
||
The Vivado flow will come soon. | ||
|
||
**** | ||
|
||
|
||
For the Vitis flow, go to <a href="prj/Vitis">prj/Vitis</a>. | ||
|
||
The Vivado flow is coming soon. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,50 +1,4 @@ | ||
# | ||
# (c) Copyright 2018 Xilinx, Inc. All rights reserved. | ||
# | ||
# This file contains confidential and proprietary information | ||
# of Xilinx, Inc. and is protected under U.S. and | ||
# international copyright and other intellectual property | ||
# laws. | ||
# | ||
# DISCLAIMER | ||
# This disclaimer is not a license and does not grant any | ||
# rights to the materials distributed herewith. Except as | ||
# otherwise provided in a valid license issued to you by | ||
# Xilinx, and to the maximum extent permitted by applicable | ||
# law: (1) THESE MATERIALS ARE MADE AVAILABLE "AS IS" AND | ||
# WITH ALL FAULTS, AND XILINX HEREBY DISCLAIMS ALL WARRANTIES | ||
# AND CONDITIONS, EXPRESS, IMPLIED, OR STATUTORY, INCLUDING | ||
# BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, NON- | ||
# INFRINGEMENT, OR FITNESS FOR ANY PARTICULAR PURPOSE; and | ||
# (2) Xilinx shall not be liable (whether in contract or tort, | ||
# including negligence, or under any other theory of | ||
# liability) for any loss or damage of any kind or nature | ||
# related to, arising under or in connection with these | ||
# materials, including for any direct, or any indirect, | ||
# special, incidental, or consequential loss or damage | ||
# (including loss of data, profits, goodwill, or any type of | ||
# loss or damage suffered as a result of any action brought | ||
# by a third party) even if such damage or loss was | ||
# reasonably foreseeable or Xilinx had been advised of the | ||
# possibility of the same. | ||
# | ||
# CRITICAL APPLICATIONS | ||
# Xilinx products are not designed or intended to be fail- | ||
# safe, or for use in any application requiring fail-safe | ||
# performance, such as life-support or safety devices or | ||
# systems, Class III medical devices, nuclear facilities, | ||
# applications related to the deployment of airbags, or any | ||
# other applications that could lead to death, personal | ||
# injury, or severe property or environmental damage | ||
# (individually and collectively, "Critical | ||
# Applications"). Customer assumes the sole risk and | ||
# liability of any use of Xilinx products in Critical | ||
# Applications, subject only to applicable laws and | ||
# regulations governing limitations on product liability. | ||
# | ||
# THIS COPYRIGHT NOTICE AND DISCLAIMER MUST BE RETAINED AS | ||
# PART OF THIS FILE AT ALL TIMES. | ||
# | ||
#/bin/sh | ||
#!/bin/bash | ||
|
||
CXX=${CXX:-g++} | ||
$CXX -std=c++11 -O3 -I. -o demo_classification demo_classification.cpp -lopencv_core -lopencv_video -lopencv_videoio -lopencv_imgproc -lopencv_imgcodecs -lopencv_highgui -lglog -ldpbase -ldpproto -lvitis_dpu |