yewhenp/PureFPGA


MCCP - MultiCore CoProcessor. GPGPU for Z80 based computer

first presentation

here

second presentation

here

third presentation

here

fourth presentation

here

fifth presentation

here

sixth presentation

here

Idea of project

The main purpose of this project is to create a general-purpose video card using FPGA technology. It is part of a bigger project: a computer based on the Z80. The project uses a DE10-Nano Cyclone V FPGA board.

Product requirements

  • general purpose (it can be programmed)
  • CUDA-like architecture
  • video output to VGA and/or HDMI
  • simple 32-bit cores
  • simple RISC ISA
  • each core is mostly independent and supports branches
  • the HPS component is used to prepare data for the videocard

Prerequisites

For usage:

  • Quartus Prime, version 18+
  • minicom, putty, or other serial port communication software
  • Python 3
  • DE10-Nano Cyclone V FPGA board
  • Linux on the HPS component (download here)

Additional (for development):

Usage

  • Open the project in Quartus Prime (project file: ./MCCP/ucu_gpu.qpf).
  • Connect the DE10-Nano board to power and to the USB Blaster.
  • Open the Programmer. Choose the board in Hardware Setup, add the .sof file, press Auto-Detect, tick Program/Configure, and press Start.
  • Connect the board's UART-to-USB port to your computer. Open minicom/putty, set the baud rate to 115200, and log in to Linux.
  • The videocard memory is mapped to [0xC0000000 - 0xC003fffc].
# write to memory
memtool address=value
# read 
memtool address number_of_words #(word has 32 bits)
  • Fill RAM/ROM from a text file: either a .txt file with numbers separated by whitespace, or, for programs, a .out file with binary numbers separated by whitespace.
RAM=0xC0000000
ROM=0xC0040000
./linux_code/mem_write ram.txt $RAM
./linux_code/mem_write my_program.out $ROM
  • Before starting, activate the required number of cores:
./scripts/activate_cores.sh 4
  • To start the videocard:
  1. Clear the finish interrupt from the videocard:
memtool -8 0xFF200001=0x0
  2. Send the start interrupt to the videocard:
memtool -8 0xFF200000=0x1
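
The steps above can be collected into a small Python sketch that only builds the command strings, in order, without running anything. The addresses and command forms are taken from this list; the function names, the file names passed in, and the choice of decimal text for the .txt file are assumptions made for illustration.

```python
RAM_BASE = 0xC0000000    # videocard RAM, from the usage list
ROM_BASE = 0xC0040000    # videocard ROM, from the usage list
START_IRQ = 0xFF200000   # start interrupt address
FINISH_IRQ = 0xFF200001  # finish interrupt address


def format_words(words):
    """Render 32-bit words as whitespace-separated text for a .txt file.

    Decimal notation is an assumption; the README only says the numbers
    are separated by whitespace.
    """
    for w in words:
        if not 0 <= w <= 0xFFFFFFFF:
            raise ValueError(f"{w} does not fit in 32 bits")
    return " ".join(str(w) for w in words) + "\n"


def run_commands(ram_file, program_file, cores):
    """Return the shell commands of one run, in the order listed above."""
    if cores < 1:
        raise ValueError("at least one core must be activated")
    return [
        f"./linux_code/mem_write {ram_file} 0x{RAM_BASE:08X}",
        f"./linux_code/mem_write {program_file} 0x{ROM_BASE:08X}",
        f"./scripts/activate_cores.sh {cores}",
        f"memtool -8 0x{FINISH_IRQ:08X}=0x0",  # clear finish interrupt
        f"memtool -8 0x{START_IRQ:08X}=0x1",   # send start interrupt
    ]


if __name__ == "__main__":
    for line in run_commands("ram.txt", "my_program.out", 4):
        print(line)
```

Printing the commands instead of executing them keeps the sketch usable on a development machine; on the board, the printed lines can be pasted into the serial console.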

Details on realization

memory

  • The videocard can address only 64Kb of FPGA memory.
  • The videocard's memory is mapped into the Linux address space through the HPS-to-FPGA interface.
  • A dedicated module reads start interrupts at 0xFF200000 through the lightweight interface. When the work is done, the videocard raises an interrupt at 0xFF200001. Be careful: this module clears the interrupt after 60 ticks.
  • Each core can access any address in this 64Kb memory.
  • An arbiter, clocked at a higher frequency than the cores, manages all memory access requests from the cores.
  • There is an FPGA ROM where the program lives. The program is hardcoded and cannot be changed at runtime. ./assembler contains an extremely simple bootloader that lets you choose the starting program address for each core, so each core can run a different program.
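
As a quick check of the address map, the range [0xC0000000 - 0xC003fffc] given in the usage section can be turned into word indices. The sketch below is plain arithmetic on those two addresses; the helper names are hypothetical. The range works out to 65536 words of 32 bits, which suggests that the 64K figure counts words rather than bytes, though that reading is an inference and is not stated in this README.

```python
RAM_FIRST = 0xC0000000  # first mapped word, from the usage section
RAM_LAST = 0xC003FFFC   # last mapped word, from the usage section
WORD_BYTES = 4          # a word has 32 bits


def word_count():
    """Number of 32-bit words in the mapped range."""
    return (RAM_LAST - RAM_FIRST) // WORD_BYTES + 1


def word_address(index):
    """Linux-side address of the word with the given index."""
    if not 0 <= index < word_count():
        raise IndexError(f"word index {index} is outside the mapped range")
    return RAM_FIRST + index * WORD_BYTES


if __name__ == "__main__":
    print(word_count())
    print(hex(word_address(0)), hex(word_address(word_count() - 1)))
```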

instruction execution cycle

ISA documentation here

  • Each core can be treated as an independent processor.
  • After finishing its work, a core sends interrupt number 1 to the interrupt controller, which counts these interrupts. Once all cores have sent the finish signal, the interrupt controller writes to 0xFF200001 (see Usage), which signals to Linux that the work is done.
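
The counting behaviour described above can be modelled in a few lines of Python. This is a behavioural sketch only, not the hardware description used in the project; the class and method names are hypothetical, and it assumes each active core reports completion once per run.

```python
class FinishCounter:
    """Behavioural model of the finish-interrupt counting described above."""

    def __init__(self, active_cores):
        if active_cores < 1:
            raise ValueError("at least one core must be active")
        self.active_cores = active_cores
        self.finished = 0
        self.done_flag = 0  # models the value Linux sees at 0xFF200001

    def core_finished(self):
        """Record one finish interrupt and return the current done flag."""
        if self.finished < self.active_cores:
            self.finished += 1
        if self.finished == self.active_cores:
            self.done_flag = 1
        return self.done_flag

    def clear(self):
        """Reset the model before the next run."""
        self.finished = 0
        self.done_flag = 0


if __name__ == "__main__":
    counter = FinishCounter(active_cores=4)
    print([counter.core_finished() for _ in range(4)])
```

The done flag stays at 0 until the last active core reports, which is why Linux only needs to watch a single address to learn that the whole job has completed.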

Plots

How matrix multiplication scales

How summing up vector scales with number of cores

About

Multicore coprocessor, written on FPGA
