A collection of works for software/hardware co-design in deep learning.
- QAPPA: A Framework for Navigating Quality-Energy Tradeoffs with Arbitrary Quantization
- Understanding the Impact of Precision Quantization on the Accuracy and Energy of Neural Networks (DATE, 2017, Brown)
- DianNao: A Small-Footprint High-Throughput Accelerator for Ubiquitous Machine-Learning
- Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators
- DVAFS: Trading Computational Accuracy for Energy Through Dynamic-Voltage-Accuracy-Frequency-Scaling (DATE, 2017, KU Leuven)
- ENVISION: A 0.26-to-10TOPS/W Subword-Parallel Dynamic-Voltage-Accuracy-Frequency-Scalable Convolutional Neural Network Processor in 28nm FDSOI (ISSCC, 2017, KU Leuven)
- Minimum Energy Quantized Neural Networks (arXiv, 2017, KU Leuven)
- Proteus: Exploiting Numerical Precision Variability in Deep Neural Networks (ICS, 2016)
- Stripes: Bit-Serial Deep Neural Network Computing (IEEE Computer Architecture Letters, 2016)
- Bit-pragmatic Deep Neural Network Computing (MICRO, 2017)
- Loom: Exploiting Weight and Activation Precisions to Accelerate Convolutional Neural Networks (arXiv, 2017)
- Dynamic Stripes: Exploiting the Dynamic Precision Requirements of Activation Values in Neural Networks (arXiv, 2017)
- Tartan: Accelerating Fully-Connected and Convolutional Layers in Deep Learning Networks by Exploiting Numerical Precision Variability (arXiv, 2017)
- Hardware-Software Codesign of Accurate, Multiplier-free Deep Neural Networks (DAC, 2017)
- YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights (ISVLSI, 2016)
- FINN: A Framework for Fast, Scalable Binarized Neural Network Inference (FPGA, 2017)
- Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC (2017, Intel)
- Aladdin: A Pre-RTL, Power-Performance Accelerator Simulator Enabling Large Design Space Exploration of Customized Architectures
- PARADE: A Cycle-Accurate Full-System Simulation Platform for Accelerator-Rich Architectural Design and Exploration
- Co-Designing Accelerators and SoC Interfaces using gem5-Aladdin
- Automatic Generation of Efficient Accelerators for Reconfigurable Hardware (ISCA, 2016, Stanford)