Recipe for FPGA cooking
Branch: master
Clone or download

 _____ ____   ____    _    
|  ___|  _ \ / ___|  / \   
| |_  | |_) | |  _  / _ \  
|  _| |  __/| |_| |/ ___ \ 
|_|   |_|    \____/_/   \_\


This repository is intended for folks who are new and want to learn something about FPGA. This repository is a collection of useful resources and links rather than a thorough FPGA tutorial. Traditional HDL (Hard and Difficult Language) is not the main focus, instead, we focus on using high-level languages (e.g., C++) to cook FPGA.

Originally, this repository was started by a newbie to record his learning of FPGA, and late made public in the hope that it could help researchers to start their journey along with FPGA, with less pain and whiskey.

Resources collected here, or the way contents are organized, are not in their perfect shape. This repository is still raw and need major improvements. Any form of contribution is welcomed and appreciated.

Main contents:

    • Basics about Digital Design
    • Basics about FPGA
    • Relevant Courses and Books
    • Papers about FPGA internal
  • Xilinx
  • submodules/: Github repositories about FPGA
  • hls/: Sample Xilinx HLS C++ code
    • AXI Stream
    • Network protocol processing
  • xilinx_arty_a7: Sample Xilinx projects for Arty A7 100 board
    • Tri-mode MAC reference design
    • Simple LED
    • Clocked LED
    • Some implementation questions about FPGA

Get Started

FPGA Intro

Digital Basics


High-Level Synthesis (HLS)





How to apply Operating System concept to FPGA? How to virtualize on-board memory and on-chip logic? And, how is FPGA ultimately different from CPU in items of resource sharing? Papers in this section could give you some hint.


Memory Hierarchy

Dynamic Memory Allocation

Integrate with Virtual Memory

Integrate OS/CPU/FPGA


What are the typical applications that can be offloaded into FPGA? What has already been done before? This section lists many interesting applications and systems deployed on FPGA.

Integrate with Frameworks

  • Map-reduce as a Programming Model for Custom Computing Machines, FCCM'08
    • This paper proposes a model to translate MapReduce code written in C to code that could run on FPGA and GPU. Many details are omitted, and they don't really have the compiler.
    • Single-host framework, everything is in FPGA and GPU.
  • Axel: A Heterogeneous Cluster with FPGAs and GPUs, FPGA'10
    • A distributed MapReduce Framework, targets clusters with CPU, GPU, and FPGA. Mainly the idea of scheduling FPGA/GPU jobs.
    • Distributed Framework.
  • FPMR: MapReduce Framework on FPGA, FPGA'10
    • A MapReduce framework on a single host's FPGA. You need to write Verilog/HLS for processing logic to hook with their framework. The framework mainly includes a data transfer controller, a simple schedule that enable certain blocks at certain time.
    • Single-host framework, everything is in FPGA.
  • Melia: A MapReduce Framework on OpenCL-Based FPGAs, IEEE'16
    • Another framework, written in OpenCL, and users can use OpenCL to program as well. Similar to previous work, it's more about the framework design, not specific algorithms on FPGA.
    • Single-host framework, everything is in FPGA. But they have a discussion on running on multiple FPGAs.
    • Four MapReduce FPGA papers here, I believe there are more. The marriage between MapReduce and FPGA is not something hard to understand. FPGA can be viewed as another core with different capabilities. The thing is, given FPGA's reprogram-time and limited on-board memory, how to design a good scheduling algorithm and data moving/caching mechanisms. Those papers give some hints on this.
  • UCLA: When Apache Spark Meets FPGAs: A Case Study for Next-Generation DNA Sequencing Acceleration, HotCloud'16
  • UCLA: Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale, SoCC'16
    • A system that hooks FPGA with Spark.
    • There is a line of work that hook FPGA with big data processing framework (Spark), so the implementation of FPGA and the scale-out software can be separated. The Spark can schedule FPGA jobs to different machines, and take care of scale-out, failure handling etc. But, I personally think this line of work is really just an extension to ReconOS/FUSE/BORPH line of work. The main reason is: both these two lines of work try to integrate jobs run on CPU and jobs run on FPGA, so CPU and FPGA have an easier way to talk, or put in another way, CPU and FPGA have a better division of labor. Whether it's single-machine (like ReconOS, Melia), or distributed (like Blaze, Axel), they are essentially the same.
  • UCLA: Heterogeneous Datacenters: Options and Opportunities, DAC'16
    • Follow up work of Blaze. Nice comparison of big and wimpy cores.

Cloud Infrastructure

Programmable Network


Machine Learning

  • TODO


  • TODO



Video Processing

  • TODO


  • TODO


  • TODO


  • UCB GC
  • UCLA Java

FPGA Internal


Partial Reconfiguration

Logical Optimization and Technology Mapping

Place and Route