ztachip is a RISCV accelerator for vision and AI edge applications running on low-end FPGA devices or a custom ASIC.
ztachip can provide 20-50x acceleration over a non-accelerated RISCV implementation on many vision/AI tasks. It also outperforms a RISCV core equipped with the vector extension.
An innovative tensor processor accelerates a wide range of tasks, from common vision functions such as edge detection, optical flow, motion detection, and color conversion to executing TensorFlow AI models. This is a key difference between ztachip and other accelerators, which tend to target only a narrow range of applications (for example, convolutional neural networks only).
A new tensor programming paradigm is introduced to allow programmers to leverage the massive processing/data parallelism enabled by the ztachip tensor processor.
This is the first release, so documentation is still sparse. Please check back later for more information, including a hardware architecture description and a programmer's guide.
- SW/compiler: compiler that generates instructions for the tensor processor.
- SW/apps: vision and AI stack implementation. Many prebuilt acceleration functions give programmers a fast path to leverage ztachip acceleration. This folder is also a good place to learn how to program your own custom acceleration functions.
- SW/base: SW framework library and some utilities.
- SW/fs: read-only file system to be downloaded together with the build image.
- SW/src: code for the reference design example. This is a good place to learn how to use the ztachip prebuilt vision and AI stack.
- HW/examples: HDL code for the reference design.
- HW/examples/GHRD/MyVexRiscv.scala: the RISCV core used in this example is based on the VexRiscv implementation. This file is used by the VexRiscv project to generate the RISCV core.
- HW/platform: a thin wrapper layer that helps ztachip synthesize efficiently on different FPGAs or ASICs. Choose the sub-folder that corresponds to your FPGA target. A generic implementation is also provided for the simulation environment. Any FPGA/ASIC can be supported with an appropriate implementation of this wrapper layer.
- HW/src: main ztachip HDL source code.
sudo apt-get install autoconf automake autotools-dev curl python3 libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev python3-pip
pip3 install numpy
The build below takes a while.
export PATH=/opt/riscv/bin:$PATH
git clone https://github.com/riscv/riscv-gnu-toolchain
cd riscv-gnu-toolchain
./configure --prefix=/opt/riscv --with-arch=rv32im --with-abi=ilp32
sudo make
git clone https://github.com/ztachip/ztachip.git
export PATH=/opt/riscv/bin:$PATH
cd ztachip
cd SW/compiler
make clean all
cd ../fs
python3 bin2c.py
cd ..
make clean all -f makefile.kernels
make clean all
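The bin2c.py step above packs the read-only file system from SW/fs into C source that gets linked into the build image. As a rough illustration of the general bin-to-C technique (the function and symbol names below are made up for this sketch, not ztachip's actual ones), embedding raw bytes as a C array looks like:

```python
# Illustrative sketch of the bin2c idea: render a binary blob as a C array
# definition so it can be compiled and linked into a firmware image.
# Names here (bin_to_c_array, fs_image) are hypothetical, not from ztachip.
def bin_to_c_array(data: bytes, symbol: str = "fs_image") -> str:
    """Render raw bytes as a C unsigned-char array plus a length constant."""
    body = ",".join(f"0x{b:02x}" for b in data)
    return (
        f"const unsigned char {symbol}[] = {{{body}}};\n"
        f"const unsigned int {symbol}_len = {len(data)};\n"
    )

if __name__ == "__main__":
    # Example: embed the 4-byte ELF magic as a C array.
    print(bin_to_c_array(b"\x7fELF", "fs_image"))
```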
- Download the Xilinx Vivado WebPACK free edition.
- With Vivado, open the project ztachip/HW/examples/GHRD/GHRD.xpr
- The target board used in this example is the Arty A7-100T.
- Build the FPGA image and program it to flash as described in the Vivado User Guide.
The following demos run on the Arty A7-100T FPGA development board.
- Image classification with TensorFlow's MobileNet
- Object detection with TensorFlow's SSD-MobileNet
- Edge detection using the Canny algorithm
- Point-of-interest detection using the Harris corner algorithm
- Motion detection
- Multi-tasking with object detection, edge detection, Harris corner, and motion detection running at the same time
To run the demo, press button0 to switch between the different AI/vision applications.
The reference design example requires the hardware components below...
Attach the VGA and camera modules to the Arty A7 board according to the picture below.
Connect the camera_module to the Arty board according to the picture below.
In this example, we will load the program using the GDB debugger and JTAG.
sudo apt-get install libtool automake libusb-1.0.0-dev texinfo libusb-dev libyaml-dev pkg-config
git clone https://github.com/SpinalHDL/openocd_riscv
cd openocd_riscv
./bootstrap
./configure --enable-ftdi --enable-dummy
make
cp <ztachip installation folder>/tools/openocd/soc_init.cfg .
cp <ztachip installation folder>/tools/openocd/usb_connect.cfg .
cp <ztachip installation folder>/tools/openocd/xilinx-xc7.cfg .
cp <ztachip installation folder>/tools/openocd/jtagspi.cfg .
cp <ztachip installation folder>/tools/openocd/cpu0.yaml .
Launch OpenOCD to provide JTAG connectivity for the GDB debugger
cd <openocd_riscv installation folder>
sudo src/openocd -f usb_connect.cfg -c 'set MURAX_CPU0_YAML cpu0.yaml' -f soc_init.cfg
Open another terminal, then launch GDB debugger
export PATH=/opt/riscv/bin:$PATH
cd <ztachip installation folder>/SW/src
riscv32-unknown-elf-gdb ../build/ztachip.elf
From GDB debugger prompt, issue the commands below
target remote localhost:3333
set remotetimeout 60
set arch riscv:rv32
monitor reset halt
load
After successfully loading the program, issue the command below at the GDB prompt
continue
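As an optional convenience (this script is not part of the ztachip tree, and the filename is arbitrary), the GDB command sequence above can be saved to a command file and passed to GDB with the standard -x option, so reloading after a rebuild does not require retyping:

```gdb
# run.gdb — hypothetical helper script; invoke as:
#   riscv32-unknown-elf-gdb -x run.gdb ../build/ztachip.elf
target remote localhost:3333
set remotetimeout 60
set arch riscv:rv32
monitor reset halt
load
continue
```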
Press button0 to switch between different AI/vision applications.
Go to the Programmer's Guide to Writing Tensor Applications for information on how to write your own custom acceleration code.
Go to the Vision/AI Stack User's Guide for information on how to use the ztachip prebuilt vision/AI stack.
This project is free to use. For business consulting and support, please contact vuongdnguyen@hotmail.com
Follow ztachip on Twitter: https://twitter.com/ztachip