General Purpose Graphics Processing Unit (GPGPU) IP Core

Fix (software) race condition in thread test

The other threads can restart thread 0 after it has stopped itself.
Only thread 0 should set the thread enable mask.
latest commit 8a504f5ea6
Jeff Bush authored April 24, 2014

This project is a multi-core GPGPU (general purpose graphics processing unit) IP core, implemented in SystemVerilog. Documentation is available here:
Pull requests/contributions are welcome.

Required Tools/Libraries

On Ubuntu, most of these (with the exception of the cross compiler) can be installed using the package manager: sudo apt-get install verilator gcc g++ python libreadline-dev. However, if you are not on a recent distribution, the packaged versions may be too old, in which case you'll need to build them manually.
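
Before building, it can be useful to confirm that the tools from the apt-get line are actually on your PATH. The following is a minimal sketch, not part of this repository: the helper name missing_tools is hypothetical, and it only checks for executables (libreadline-dev is a library, so it is not covered).

```shell
#!/bin/sh
# missing_tools is a hypothetical helper, not part of this project.
# It prints the name of each argument that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}
```

For example, running missing_tools verilator gcc g++ python prints one line per tool that still needs to be installed, and prints nothing if all of them are found. It does not check versions, so a tool that is present but too old will not be reported.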

I've run this on Linux and Mac OS X (Lion). I have not tested it on Windows, although I would expect it to work in Cygwin, potentially with some modifications.

To run on FPGA


  • Emacs + verilog mode tools, for AUTOWIRE/AUTOINST (Note that this doesn't require using Emacs as an editor. Using 'make autos' in the rtl/v1/ directory will run this operation in batch mode if the tools are installed).
  • Java (J2SE 6+) for visualizer app
  • GTKWave (or similar) for analyzing waveform files

Running in Verilog simulation

To build tools and Verilog models:

First, you must download and build the LLVM toolchain from here: The README file in the root directory provides instructions.

Once this is done, from the top directory of this project:


Running verification tests (in Verilog simulation)

From the top directory:

make test

Running 3D Engine (in Verilog simulation)

cd firmware/3D-renderer
make verirun

(output image stored in fb.bmp)
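
As a quick sanity check after the run, you can confirm that the output file really is a bitmap. This is a sketch under one assumption: that fb.bmp is a standard BMP file, which begins with the two bytes "BM". The helper name is_bmp is hypothetical and not part of this repository.

```shell
#!/bin/sh
# is_bmp is a hypothetical helper, not part of this project.
# It succeeds if the given file starts with the BMP magic bytes "BM".
is_bmp() {
    [ "$(dd if="$1" bs=2 count=1 2>/dev/null)" = "BM" ]
}
```

Calling is_bmp fb.bmp from the firmware/3D-renderer directory returns success if the header looks right and failure if the file is missing or is not a BMP. It only inspects the first two bytes, so it does not validate the image contents.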

Running on FPGA

This runs on Terasic's DE2-115 evaluation board. These instructions are for Linux only.

  • Build USB blaster command line tools (
    • Update your PATH environment variable to point to the directory where you built the tools.
    • Create a file /etc/udev/rules.d/99-custom.rules and add the line: ATTRS{idVendor}=="09fb" , MODE="0660" , GROUP="plugdev"
  • Build the bitstream (ensure the Quartus binary directory is in your PATH; by default it is installed in ~/altera/13.1/quartus/bin/)
    cd rtl/v1/fpga/de2-115
  • Load the bitstream onto the board (this only needs to be done once each time the board is power cycled)
    make program 
  • Load program into memory and execute it using the runit script as below. The script assembles the source and uses the jload command to transfer the program over the USB blaster cable that was used to load the bitstream. jload will automatically reset the processor as a side effect, so the bitstream does not need to be reloaded each time.
    cd ../../../tests/fpga/blinky
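
The PATH and udev steps from the list above can be sketched as follows. The directory in JTAG_TOOLS_DIR is an assumption (a hypothetical location for the built USB blaster tools), and udev_rule is a hypothetical helper; only the rule text itself comes from the instructions above.

```shell
#!/bin/sh
# Hypothetical directory where the USB blaster tools were built; adjust as needed.
JTAG_TOOLS_DIR="${JTAG_TOOLS_DIR:-$HOME/usb-blaster-tools}"

# Prepend the tools directory to PATH for this shell session.
PATH="$JTAG_TOOLS_DIR:$PATH"
export PATH

# udev_rule is a hypothetical helper that prints the rule line from the
# instructions above; the caller decides where to write it.
udev_rule() {
    printf '%s\n' 'ATTRS{idVendor}=="09fb" , MODE="0660" , GROUP="plugdev"'
}
```

To install the rule, pipe the output to the file named in the instructions, for example udev_rule | sudo tee /etc/udev/rules.d/99-custom.rules. Printing the rule rather than writing it directly keeps the sketch free of root privileges until you choose to apply it.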