muRISCV-NN

muRISCV-NN is a collection of efficient deep learning kernels for embedded platforms and microcontrollers. It is based on ARM's CMSIS-NN library but targets the RISC-V ISA instead.

It offers accelerated kernels using the RISC-V "V" vector extension v1.0, and the RISC-V packed "P" extension v0.9.6.

Integration

muRISCV-NN aims to stay functionally equivalent to CMSIS-NN, so users of either library should notice no functional difference. This way, muRISCV-NN acts as a drop-in replacement for CMSIS-NN and can be used with embedded deep learning frameworks such as TensorFlow Lite for Microcontrollers (TFLM) or microTVM.

We provide integration for both TFLM and microTVM in the Integration/ directory. Using these deep learning frameworks, we are able to run the complete suite of MLPerf Tiny Deep Learning Benchmarks consisting of MobileNet, ResNet, and AutoEncoder models.

Simulation

You can simulate muRISCV-NN using a number of different simulators. We provide support for instruction-level simulators (such as Spike or riscvOVPsim), as well as register transfer level (RTL) implementations (Vicuna running on Verilator).

Please refer to the Sim/ directory for more information on each simulator and its corresponding files.

Tests

To ensure functional correctness at the level of individual kernels, we provide a suite of unit tests in Tests/. The unit tests use the same data as upstream CMSIS-NN, which lets us check functional equivalence against it.

Toolchain

muRISCV-NN supports both the RISC-V GNU Compiler vector toolchain and LLVM (which has built-in RISC-V vector support). We provide pre-compiled toolchains in the Toolchain/ directory, along with instructions on how to compile and install your own toolchain.

Upstream CMSIS-NN

muRISCV-NN is not a Git fork "in the traditional sense". Instead, we aim to pull in changes from "upstream" CMSIS-NN manually on a regular basis in order to stay consistent and up-to-date. A direct fork would not make much sense, as our code differs too much in functionality and naming compared to CMSIS-NN.

The latest upstream CMSIS-NN commit that muRISCV-NN is based on is 8ec46de (counting only commits that affect the CMSIS/NN/ directory).

Performance

When running ResNet on TensorFlow Lite for Microcontrollers (TFLM), muRISCV-NN reduces the dynamic instruction count by up to roughly 95x compared to the baseline (688 vs. 7.21 million instructions with the V extension at VLEN = 1024):

Kernels	Extension	VLEN	Dynamic Instr. [x10^6]
Baseline	-	-	688
muRISCV-NN	-	-	62.5
muRISCV-NN	P-Ext.	-	49.5
muRISCV-NN	V-Ext.	64	12.3
muRISCV-NN	V-Ext.	128	9.67
muRISCV-NN	V-Ext.	256	8.41
muRISCV-NN	V-Ext.	512	7.47
muRISCV-NN	V-Ext.	1024	7.21
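
The reduction factors implied by the table can be recomputed with a short script. This is a minimal sketch: the instruction counts are copied from the table above, while the names `BASELINE`, `RESULTS`, and `reduction_factors` are hypothetical and not part of muRISCV-NN.

```python
# Dynamic instruction counts in millions, copied from the table above.
# All identifiers here are illustrative and not part of muRISCV-NN.
BASELINE = 688.0

# (kernels, extension, VLEN or None, dynamic instructions in millions)
RESULTS = [
    ("muRISCV-NN", "-", None, 62.5),
    ("muRISCV-NN", "P-Ext.", None, 49.5),
    ("muRISCV-NN", "V-Ext.", 64, 12.3),
    ("muRISCV-NN", "V-Ext.", 128, 9.67),
    ("muRISCV-NN", "V-Ext.", 256, 8.41),
    ("muRISCV-NN", "V-Ext.", 512, 7.47),
    ("muRISCV-NN", "V-Ext.", 1024, 7.21),
]


def reduction_factors(baseline, results):
    """Return (kernels, extension, vlen, baseline / count) for each row."""
    return [
        (kernels, ext, vlen, baseline / count)
        for kernels, ext, vlen, count in results
    ]


# Print one line per configuration with its reduction factor.
for kernels, ext, vlen, factor in reduction_factors(BASELINE, RESULTS):
    vlen_text = "-" if vlen is None else str(vlen)
    print(f"{kernels:<12} {ext:<7} {vlen_text:>5} {factor:6.1f}x")
```

Dividing the baseline by each row shows that most of the gain comes from enabling the V extension, and that the factor grows more slowly as VLEN increases.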

Stay tuned for more performance numbers in the near future!

Publications

muRISCV-NN: Challenging Zve32x Autovectorization with TinyML Inference Library for RISC-V Vector Extension (https://dl.acm.org/doi/10.1145/3637543.3652878)

CF '24 Companion: Proceedings of the 21st ACM International Conference on Computing Frontiers Workshops and Special Sessions

BibTeX

@inproceedings{10.1145/3637543.3652878,
  author = {van Kempen, Philipp and Jones, Jefferson Parker and Mueller-Gritschneder, Daniel and Schlichtmann, Ulf},
  title = {muRISCV-NN: Challenging Zve32x Autovectorization with TinyML Inference Library for RISC-V Vector Extension},
  year = {2024},
  isbn = {9798400704925},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3637543.3652878},
  doi = {10.1145/3637543.3652878},
  abstract = {With the rapid adoption of deep learning workloads to resource-constrained edge devices, efficient and data-parallel computing paradigms are becoming increasingly important. The RISC-V ISA provides a set of vector extensions featuring powerful data computation capabilities to accelerate deep learning workloads at the edge. However, the RISC-V ecosystem lacks a lightweight, open-source, and vendor-agnostic compute library to support these extensions on embedded platforms. After porting the existing ARM Cortex-M specific kernel implementation to the RISC-V vector ISA, we optimized the operator implementations to make the most out of the data-level parallelism provided by supported targets. In comparison to programs vectorized by LLVM's built-in auto-vectorizer, we see an up to 60\% advantage in runtime for convolutional models and large vectors while introducing less ROM overheads. Furthermore, muRISCV-NN integrates well with existing ML deployment frameworks, is bit-accurate to CMSIS-NN, and can, thus, be used as a drop-in replacement with minimal changes to the compilation flow.},
  booktitle = {Proceedings of the 21st ACM International Conference on Computing Frontiers Workshops and Special Sessions},
  pages = {75–78},
  numpages = {4},
  keywords = {Compilers, Neural Network Inference, RISC-V, Vectorization},
  location = {Ischia, Italy},
  series = {CF '24 Companion}
}

Acknowledgment

This research is partially funded by the German Federal Ministry of Education and Research (BMBF) within the project Scale4Edge (grant number 16ME0127).
