A fully custom 16-bit CPU with PS/2 keyboard input, VGA display output, and 7-segment debugging — synthesizable on Terasic DE0 (Cyclone III) and DE0-CV (Cyclone V) FPGA boards.
- Overview
- Architecture
- Module Reference
- Hardware Requirements
- Software Prerequisites
- Project Structure
- Getting Started
- Build Targets
- Writing Programs
- VGA Color Encoding
- Debugging
- Contributing
- License
FPGA-16 is a ground-up hardware implementation of a 16-bit processor system designed for FPGA deployment. The CPU features a 3-operand instruction format with direct and indirect addressing, a full ALU with arithmetic and logic operations, and a microcoded FSM controller with ~52 states.
The system integrates:
| Subsystem | Description |
|---|---|
| CPU | 16-bit microcoded processor with 3-operand ISA |
| Memory | 64×16-bit synchronous RAM with MIF initialization |
| PS/2 Controller | Hardware keyboard interface with scancode decoding |
| VGA Controller | 800×600 @ 72 Hz display with color-coded output |
| 7-Segment Display | Real-time PC and SP visualization |
| Clock Divider | 50 MHz → 1 Hz for visible CPU stepping |
┌─────────────────────────────────────────────────────────────────────────┐
│ FPGA (50 MHz) │
│ │
│ ┌───────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ PS/2 Port │────────►│ ps2.v │────────►│ scan_codes.v │ │
│ │ (Keyboard)│ serial │ Deserializer │ 16-bit │ Digit Decoder │ │
│ └───────────┘ └──────────────┘ code │ (0-9 keys) │ │
│ └────────┬─────────┘ │
│ control + num[3:0]│ │
│ ┌───────────┐ ┌──────────────┐ ┌────────▼─────────┐ │
│ │ 50 MHz │────────►│ clk_div.v │────────►│ cpu.v │ │
│ │ Crystal │ │ ÷50000000 │ 1 Hz │ 16-bit FSM CPU │ │
│ └───────────┘ └──────────────┘ │ (microcoded) │ │
│ └──┬──────┬───────┘ │
│ addr/we │ │ out[15:0]│
│ data │ │ │
│ ┌──────────▼──┐ │ │
│ │ memory.v │ │ │
│ │ 64×16 RAM │ │ │
│ │ (MIF init) │ │ │
│ └─────────────┘ │ │
│ │ │
│ ┌───────────────┐ ┌────────────────▼───────┐ │
│ ┌──────────┐◄─────────│ vga.v │◄──│ color_codes.v │ │
│ │ VGA Port │ RGB+Sync│ 800×600@72Hz │ │ Number → Color Map │ │
│ │ (Monitor)│ └───────────────┘ └────────────────────────┘ │
│ │
│ ┌──────────┐◄─────────────────────────────────────────────────────── │
│ │ 7-Seg │ bcd.v → ssd.v (PC on HEX0-1, SP on HEX2-3) │
│ │ Displays │ │
│ └──────────┘ │
│ │
│ ┌──────────┐◄────── LED[4:0] = out[4:0], LED[9] = status │
│ │ LEDs │ │
│ └──────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
The CPU uses a 3-operand, 16-bit fixed-width instruction encoding:
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
│ Opcode │X_d│ X_addr │Y_d│ Y_addr │Z_d│ Z_addr │
│ (4 bits) │ i │ (3 bits) │ i │ (3 bits) │ i │ (3 bits) │
└───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘
| Field | Bits | Description |
|---|---|---|
| `Opcode` | [15:12] | Operation selector (4 bits → 16 opcodes) |
| `X_di` | [11] | X indirect flag (1 = dereference) |
| `X_addr` | [10:8] | Destination operand address |
| `Y_di` | [7] | Y indirect flag |
| `Y_addr` | [6:4] | First source operand address |
| `Z_di` | [3] | Z indirect flag |
| `Z_addr` | [2:0] | Second source operand address |
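
The field layout above can be checked in software. The following is a minimal Python sketch, not part of the project sources, that splits a 16-bit word into the fields listed in the table; the `Instruction` type and `decode` function are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Instruction(NamedTuple):
    """Fields of one 16-bit instruction word (hypothetical helper type)."""
    opcode: int
    x_di: int
    x_addr: int
    y_di: int
    y_addr: int
    z_di: int
    z_addr: int


def decode(word: int) -> Instruction:
    """Split a 16-bit word into the bit fields from the table above."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction word must fit in 16 bits")
    return Instruction(
        opcode=(word >> 12) & 0xF,   # bits [15:12]
        x_di=(word >> 11) & 0x1,     # bit  [11]
        x_addr=(word >> 8) & 0x7,    # bits [10:8]
        y_di=(word >> 7) & 0x1,      # bit  [7]
        y_addr=(word >> 4) & 0x7,    # bits [6:4]
        z_di=(word >> 3) & 0x1,      # bit  [3]
        z_addr=word & 0x7,           # bits [2:0]
    )


print(decode(0x1312))
```

Decoding `0x1312` yields opcode 1 with X = 3, Y = 1, Z = 2 and all indirect flags clear.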
| Opcode | Hex | Mnemonic | Operation | Description |
|---|---|---|---|---|
| `0000` | `0` | `MOV X, Y` | `mem[X] ← mem[Y]` | Move / copy data |
| `0001` | `1` | `ADD X, Y, Z` | `mem[X] ← mem[Y] + mem[Z]` | Addition |
| `0010` | `2` | `SUB X, Y, Z` | `mem[X] ← mem[Y] − mem[Z]` | Subtraction |
| `0011` | `3` | `MUL X, Y, Z` | `mem[X] ← mem[Y] × mem[Z]` | Multiplication |
| `0100` | `4` | `DIV X, Y, Z` | `mem[X] ← mem[Y] / mem[Z]` | Division (reserved) |
| `0111` | `7` | `IN X` | `mem[X] ← keyboard` | Read digit from PS/2 keyboard |
| `1000` | `8` | `OUT X` | `display ← mem[X]` | Output value to VGA & LEDs |
| `1111` | `F` | `STOP` | n/a | Halt execution |
| Code | Operation | Code | Operation |
|---|---|---|---|
| `000` | ADD | `100` | NOT |
| `001` | SUB | `101` | XOR |
| `010` | MUL | `110` | OR |
| `011` | DIV | `111` | AND |
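
As a rough behavioural reference for the ALU codes above, here is a Python sketch. It is not derived from `alu.v`. It assumes unsigned 16-bit operands with wrap-around on overflow, NOT applied to the first operand only, and a result of 0 for division by zero; none of these details are stated in this README.

```python
MASK16 = 0xFFFF  # results are truncated to 16 bits (assumption)


def alu(code: int, a: int, b: int) -> int:
    """Behavioural model of the 3-bit ALU operation codes listed above."""
    a &= MASK16
    b &= MASK16
    if code == 0b000:
        result = a + b
    elif code == 0b001:
        result = a - b
    elif code == 0b010:
        result = a * b
    elif code == 0b011:
        # Assumption: divide-by-zero yields 0; the real hardware may differ.
        result = a // b if b else 0
    elif code == 0b100:
        result = ~a  # Assumption: NOT is unary and ignores the second operand.
    elif code == 0b101:
        result = a ^ b
    elif code == 0b110:
        result = a | b
    elif code == 0b111:
        result = a & b
    else:
        raise ValueError("ALU code must be in the range 0..7")
    return result & MASK16


print(hex(alu(0b001, 3, 5)))  # subtraction wraps around in 16 bits
```

A model like this can be handy for cross-checking simulation waveforms, provided the assumptions are first confirmed against the RTL.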
The CPU operates as a multi-cycle microcoded FSM (~52 states):
┌─────────┐ ┌──────────┐ ┌──────────┐ ┌───────────┐
│ FETCH │────►│ DECODE │────►│ EXECUTE │────►│ WRITEBACK │──┐
│ Read PC │ │ X, Y, Z │ │ ALU │ │ Memory │ │
│ Inc PC │ │ Resolve │ │ I/O │ │ Write │ │
└─────────┘ │ Indirect │ └──────────┘ └───────────┘ │
▲ └──────────┘ │
└─────────────────────────────────────────────────────────────┘
Key internal registers:
- `PC`: Program Counter (6-bit, resets to address 8)
- `SP`: Stack Pointer (6-bit, resets to address 63)
- `MAR`: Memory Address Register
- `MDR`: Memory Data Register
- `A`: Accumulator
- `IRH`: Instruction Register (16-bit)
Address Content
─────── ──────────────────────────
0 – 7 General-purpose registers (R0–R7)
8 – 63 Program code + data space
─────── ──────────────────────────
Total: 64 words × 16 bits = 128 bytes
Programs start execution at address 8. Addresses 0–7 serve as working registers accessed via the 3-bit operand fields.
| Module | File | Purpose | Data Width |
|---|---|---|---|
| `cpu` | `src/synthesis/modules/cpu.v` | Microcoded FSM processor | 16-bit |
| `alu` | `src/synthesis/modules/alu.v` | Arithmetic & Logic Unit (8 ops) | 16-bit |
| `memory` | `src/synthesis/modules/memory.v` | Synchronous RAM (64×16) | 16-bit |
| `register` | `src/synthesis/modules/register.v` | Parameterized register with shift/inc/dec | configurable |
| `vga` | `src/synthesis/modules/vga.v` | VGA timing generator (800×600 @ 72 Hz) | 12-bit RGB |
| `ps2` | `src/synthesis/modules/ps2.v` | PS/2 keyboard deserializer | 16-bit |
| `scan_codes` | `src/synthesis/modules/scan_codes.v` | Scancode → digit decoder (0–9) | 4-bit |
| `color_codes` | `src/synthesis/modules/color_codes.v` | Number → dual-color VGA mapper | 24-bit |
| `ssd` | `src/synthesis/modules/ssd.v` | 7-segment display decoder (hex) | 7-bit |
| `bcd` | `src/synthesis/modules/bcd.v` | Binary → BCD converter (6-bit) | 4+4-bit |
| `deb` | `src/synthesis/modules/debouncer.v` | Button debouncer (256-cycle filter) | 1-bit |
| `red` | `src/synthesis/modules/red.v` | Rising edge detector | 1-bit |
| `clk_div` | `src/synthesis/modules/clk_div.v` | Clock divider (default 50 M → 1 Hz) | 1-bit |
| `top` | `src/synthesis/modules/top.v` | System integration (all modules) | n/a |
| `DE0_TOP` | `src/synthesis/DE0_TOP.v` | Board wrapper for DE0 (Cyclone III) | n/a |
| `DE0_CV_TOP` | `src/synthesis/DE0_CV_TOP.v` | Board wrapper for DE0-CV (Cyclone V) | n/a |
| Component | Specification |
|---|---|
| FPGA Board | Terasic DE0 (Cyclone III EP3C16F484C6) or DE0-CV (Cyclone V 5CEBA4F23C7) |
| PS/2 Keyboard | Any standard PS/2 keyboard (directly connected or via USB-PS/2 adapter) |
| VGA Monitor | Supports 800×600 @ 72 Hz |
| USB Blaster | For JTAG programming |
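
The 800×600 @ 72 Hz mode is a convenient match for these boards because its standard pixel clock is 50 MHz, the same as the on-board oscillator. The sketch below recomputes the refresh rate from the commonly published VESA timing figures for this mode. It is an assumption that `vga.v` uses exactly these porch and sync widths; the README does not list them.

```python
# Commonly published VESA timing for 800x600 @ 72 Hz.
# Assumption: vga.v uses these values; check the module before relying on them.
PIXEL_CLOCK_HZ = 50_000_000

H = {"active": 800, "front_porch": 56, "sync": 120, "back_porch": 64}
V = {"active": 600, "front_porch": 37, "sync": 6, "back_porch": 23}

h_total = sum(H.values())  # pixel clocks per scan line
v_total = sum(V.values())  # scan lines per frame

line_rate_hz = PIXEL_CLOCK_HZ / h_total
refresh_hz = line_rate_hz / v_total

print(f"{h_total} x {v_total} total, {refresh_hz:.2f} Hz refresh")
```

With these figures the frame is 1040 × 666 clocks, which gives a refresh rate of roughly 72.19 Hz without needing a PLL.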
| Tool | Version | Purpose |
|---|---|---|
| Intel Quartus Prime | 13.1+ | Synthesis, place & route, programming |
| ModelSim / QuestaSim | — | RTL simulation & waveform analysis |
| Icarus Verilog | 10+ | Lint checking (optional) |
| GNU Make | 3.81+ | Build orchestration (bundled in tooling/xpack/) |
.
├── Makefile # Root build file
├── README.md # Project documentation
├── src/
│ ├── simulation/ # Simulation testbenches
│ │ ├── top.sv # SystemVerilog testbench
│ │ └── top.v # Verilog testbench
│ └── synthesis/ # Synthesizable RTL
│ ├── DE0_TOP.v # Board wrapper (Cyclone III)
│ ├── DE0_CV_TOP.v # Board wrapper (Cyclone V)
│ └── modules/
│ ├── alu.v # Arithmetic Logic Unit
│ ├── bcd.v # Binary-to-BCD converter
│ ├── clk_div.v # Clock divider
│ ├── color_codes.v # Number-to-color mapper
│ ├── cpu.v # 16-bit CPU core
│ ├── debouncer.v # Button debouncer
│ ├── memory.v # 64×16 synchronous RAM
│ ├── ps2.v # PS/2 keyboard receiver
│ ├── red.v # Rising edge detector
│ ├── register.v # Parameterized register
│ ├── scan_codes.v # Scancode decoder
│ ├── ssd.v # 7-segment decoder
│ ├── top.v # System top-level integration
│ └── vga.v # VGA timing controller
└── tooling/
├── makefile # Legacy makefile (see root Makefile)
├── mem_init.mif # Initial program (Altera MIF format)
└── config/
├── list-icarus-verilog.lst # Icarus Verilog library paths
├── list-src-files-simul.lst # Simulation source list
├── list-src-files-synth.lst # Synthesis source list
├── run.tcl # Simulator run script
├── waveform-define.do # Waveform display config
└── boards/
├── cyclone3/ # Cyclone III pin assignments
└── cyclone5/ # Cyclone V pin assignments + SDC
git clone https://github.com/<your-username>/fpga-16-cpu.git
cd fpga-16-cpu

Edit the tool paths in the root Makefile to match your installation:
# Simulator executable path
SIMUL_TOOL_EXE_DIR_PATH := C:\\"Program Files"\\altera\\13.1\\modelsim_ase\\win32aloem\\
# Quartus executable path
SYNTH_TOOL_EXE_DIR_PATH := C:\\"Program Files"\\altera\\13.1\\quartus\\bin\\

Board selection: set these variables for your target board.

| Board | `SYNTH_TOP_LEVEL_MODULE` | `SYNTH_DEVICE_FAMILY` | `SYNTH_DEVICE_PART` |
|---|---|---|---|
| DE0 (Cyclone III) | `DE0_TOP` | `CycloneIII` | `EP3C16F484C6` |
| DE0-CV (Cyclone V) | `DE0_CV_TOP` | `CycloneV` | `5CEBA4F23C7` |
# Navigate to the tooling directory
cd tooling
# View all available targets
../xpack/bin/make -f ../Makefile help
# Full synthesis flow
../xpack/bin/make -f ../Makefile synth
# Program the FPGA
../xpack/bin/make -f ../Makefile program
# Run simulation
../xpack/bin/make -f ../Makefile simul

| Target | Description |
|---|---|
| `simul` | Compile all sources and run shell-mode simulation |
| `simul_lib` | Create the ModelSim/QuestaSim work library |
| `simul_cmp` | Compile all Verilog/SystemVerilog sources |
| `simul_run_gui` | Launch simulation with waveform viewer |
| `simul_run_sh` | Run simulation in headless shell mode |
| `simul_wave_old` | Open a previously saved waveform file |
| `simul_wave_new` | Run simulation and open the resulting waveform |
| `simul_clean` | Remove all simulation artifacts |

| Target | Description |
|---|---|
| `synth` | Full flow: Analysis → Synthesis → Place & Route → Assembly → STA |
| `synth_map` | Analysis & Synthesis (HDL → logic netlist) |
| `synth_fit` | Place & Route (netlist → device fitting) |
| `synth_asm` | Assembly (generate .sof programming file) |
| `synth_sta` | Static Timing Analysis |
| `synth_net` | Generate post-fit EQN netlist |
| `program` | Synthesize and program FPGA via JTAG |
| `synth_clean` | Remove all synthesis artifacts |

| Target | Description |
|---|---|
| `lint` | Run Icarus Verilog lint checks on synthesis sources |
| `list_all` | Regenerate all source file lists |
| `info` | Display current build configuration |
| `clean` | Remove all generated artifacts (simulation + synthesis) |
| `help` | Show all targets with descriptions |
Programs are defined in tooling/mem_init.mif using Altera's Memory Initialization File format. The CPU begins execution at address 8.
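DEPTH = 64;           -- 64 words
WIDTH = 16;           -- 16 bits per word
ADDRESS_RADIX = DEC;  -- addresses below are decimal
DATA_RADIX = HEX;     -- data words below are hexadecimal
CONTENT BEGIN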
[0..7]: 0000; -- Registers R0–R7 (initialized to 0)
8: 7101; -- IN R1 ; Read a number from keyboard
9: 8101; -- OUT R1 ; Display it
10: 0210; -- MOV R2, R1 ; Copy to R2
11: 1312; -- ADD R3, R1, R2 ; R3 = R1 + R2
12: 8301; -- OUT R3 ; Display sum
13: 7401; -- IN R4 ; Read another number
14: 2334; -- SUB R3, R3, R4 ; R3 = R3 - R4
15: 0530; -- MOV R5, R3 ; R5 = result
16: 8501; -- OUT R5 ; Display it
17: 7301; -- IN R3 ; Read another number
18: 3553; -- MUL R5, R5, R3 ; R5 = R5 * R3
19: 8501; -- OUT R5 ; Display product
20: F000; -- STOP ; Halt
[21..63]: 0000;
END;
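
To see what the sample program computes before loading it onto the board, it can be traced with a small software model. The Python sketch below is only an approximation of the CPU. It assumes direct addressing only (it rejects words with any indirect flag set), 16-bit wrap-around arithmetic, and keyboard digits supplied as a list. The `run` function and `PROGRAM` dictionary are hypothetical names, not part of the repository.

```python
from typing import Dict, Iterable, List


def run(program: Dict[int, int], keys: Iterable[int], max_steps: int = 1000) -> List[int]:
    """Execute a program image and return the values passed to OUT."""
    mem = [0] * 64                      # 64 x 16-bit memory, R0-R7 at 0..7
    for addr, word in program.items():
        mem[addr] = word
    key_iter = iter(keys)
    outputs: List[int] = []
    pc = 8                              # execution starts at address 8
    for _ in range(max_steps):
        word = mem[pc]
        pc = (pc + 1) % 64
        op = word >> 12
        if op == 0xF:                   # STOP
            break
        if word & 0x0888:
            raise NotImplementedError("indirect addressing is not modelled")
        x, y, z = (word >> 8) & 7, (word >> 4) & 7, word & 7
        if op == 0x0:                   # MOV
            mem[x] = mem[y]
        elif op == 0x1:                 # ADD
            mem[x] = (mem[y] + mem[z]) & 0xFFFF
        elif op == 0x2:                 # SUB
            mem[x] = (mem[y] - mem[z]) & 0xFFFF
        elif op == 0x3:                 # MUL
            mem[x] = (mem[y] * mem[z]) & 0xFFFF
        elif op == 0x7:                 # IN: take the next simulated key press
            mem[x] = next(key_iter)
        elif op == 0x8:                 # OUT
            outputs.append(mem[x])
        else:
            raise ValueError(f"unsupported opcode {op:X}")
    return outputs


PROGRAM = {
    8: 0x7101, 9: 0x8101, 10: 0x0210, 11: 0x1312, 12: 0x8301,
    13: 0x7401, 14: 0x2334, 15: 0x0530, 16: 0x8501, 17: 0x7301,
    18: 0x3553, 19: 0x8501, 20: 0xF000,
}

print(run(PROGRAM, keys=[3, 2, 4]))  # [3, 6, 4, 16]
```

With the key presses 3, 2 and 4, the model outputs 3, then 3 + 3 = 6, then 6 − 2 = 4, and finally 4 × 4 = 16.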
To encode an instruction manually:
Opcode (4b) | X_di (1b) | X_addr (3b) | Y_di (1b) | Y_addr (3b) | Z_di (1b) | Z_addr (3b)
Example: ADD R3, R1, R2 → 0001 | 0 | 011 | 0 | 001 | 0 | 010 → 0x1312
IN R1 → 0111 | 0 | 001 | 0 | 000 | 0 | 001 → 0x7101
STOP → 1111 | 0 | 000 | 0 | 000 | 0 | 000 → 0xF000
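
The manual encoding steps above can be automated with a few lines of Python. This is a sketch under the field layout given in this README; `encode` and `OPCODES` are hypothetical helpers, not an assembler shipped with the project. Note that the README's own `IN`/`OUT` examples set the Z field to 1, so the sketch passes `z=1` explicitly to reproduce them; the purpose of that value is not documented here.

```python
OPCODES = {
    "MOV": 0x0, "ADD": 0x1, "SUB": 0x2, "MUL": 0x3,
    "DIV": 0x4, "IN": 0x7, "OUT": 0x8, "STOP": 0xF,
}


def encode(mnemonic: str, x: int = 0, y: int = 0, z: int = 0,
           x_di: int = 0, y_di: int = 0, z_di: int = 0) -> int:
    """Pack opcode, indirect flags and 3-bit addresses into a 16-bit word."""
    for addr in (x, y, z):
        if not 0 <= addr <= 7:
            raise ValueError("operand addresses are 3 bits (0..7)")
    for flag in (x_di, y_di, z_di):
        if flag not in (0, 1):
            raise ValueError("indirect flags are single bits")
    return (
        (OPCODES[mnemonic] << 12)
        | (x_di << 11) | (x << 8)
        | (y_di << 7) | (y << 4)
        | (z_di << 3) | z
    )


print(f"{encode('ADD', 3, 1, 2):04X}")   # 1312
print(f"{encode('IN', x=1, z=1):04X}")   # 7101
print(f"{encode('STOP'):04X}")           # F000
```

The printed words match the three worked examples, which makes it a quick check for hand-assembled MIF entries.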
The CPU output value (0–63) is displayed on the VGA monitor as two colored halves — the tens digit controls the left half, the ones digit controls the right half:
| Digit | Color | RGB (12-bit) |
|---|---|---|
| 0 | ⬛ Black | 000 |
| 1 | 🟥 Red | F00 |
| 2 | 🟧 Orange | F80 |
| 3 | 🟨 Yellow | FF0 |
| 4 | 🟩 Green | 0F0 |
| 5 | 🩵 Cyan | 0FF |
| 6 | 🟦 Teal | 088 |
| 7 | 🔵 Blue | 00F |
| 8 | 🟪 Magenta | F0F |
| 9 | ⬜ White | FFF |
Example: Output value 42 → Left half = Green (4), Right half = Orange (2)
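
The digit-to-color mapping can be expressed as a lookup on the tens and ones digits. The Python sketch below mirrors the table above; the function name `halves` is hypothetical, and the real `color_codes.v` may be structured differently.

```python
from typing import Tuple

# Digit -> (color name, 12-bit RGB), copied from the table above.
DIGIT_COLORS = {
    0: ("Black", 0x000), 1: ("Red", 0xF00), 2: ("Orange", 0xF80),
    3: ("Yellow", 0xFF0), 4: ("Green", 0x0F0), 5: ("Cyan", 0x0FF),
    6: ("Teal", 0x088), 7: ("Blue", 0x00F), 8: ("Magenta", 0xF0F),
    9: ("White", 0xFFF),
}

Color = Tuple[str, int]


def halves(value: int) -> Tuple[Color, Color]:
    """Return the (left, right) screen colors for an output value 0..63."""
    if not 0 <= value <= 63:
        raise ValueError("output value must be in the range 0..63")
    tens, ones = divmod(value, 10)
    return DIGIT_COLORS[tens], DIGIT_COLORS[ones]


left, right = halves(42)
print(left[0], right[0])  # Green Orange
```

Single-digit values have a tens digit of 0, so the left half of the screen stays black.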
| Display | Content |
|---|---|
| `HEX0`–`HEX1` | Program Counter (PC) in decimal |
| `HEX2`–`HEX3` | Stack Pointer (SP) in decimal |
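
Showing a 6-bit register in decimal means converting it to two BCD digits first. One common hardware-friendly way to do that is the shift-and-add-3 ("double dabble") method, sketched below in Python. Whether `bcd.v` uses this method is an assumption; the README only states that it converts a 6-bit binary value to two 4-bit digits.

```python
from typing import Tuple


def to_bcd(value: int) -> Tuple[int, int]:
    """Convert a 6-bit value to (tens, ones) using shift-and-add-3."""
    if not 0 <= value <= 63:
        raise ValueError("value must fit in 6 bits")
    tens = ones = 0
    for bit in range(5, -1, -1):
        # Any digit of 5 or more gets +3 so the next shift carries correctly.
        if ones >= 5:
            ones += 3
        if tens >= 5:
            tens += 3
        # Shift the tens:ones pair left by one and bring in the next input bit.
        tens = ((tens << 1) | (ones >> 3)) & 0xF
        ones = ((ones << 1) | ((value >> bit) & 1)) & 0xF
    return tens, ones


print(to_bcd(63))  # (6, 3)
```

The method needs only shifts, comparisons and small adders, which is why it is popular in FPGA designs.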
| LED | Signal |
|---|---|
| `LED[4:0]` | Lower 5 bits of CPU output |
| `LED[9]` | CPU status (1 = waiting for keyboard input) |
Run the GUI simulation to inspect internal signals:
make simul_run_gui

The waveform configuration in tooling/config/waveform-define.do pre-loads key signals into the viewer.
The CPU runs at 1 Hz (50 MHz ÷ 50,000,000) for visible step-by-step execution. To change the speed, modify the DIVISOR parameter in the top-level instantiation.
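
When picking a new speed, the divisor follows directly from the ratio stated above. The helper below is a small Python sketch that assumes the output frequency is simply the input clock divided by `DIVISOR`, which matches the 50 MHz ÷ 50,000,000 = 1 Hz figure; `divisor_for` is a hypothetical name, and the exact counting scheme inside `clk_div.v` should be checked before use.

```python
BOARD_CLOCK_HZ = 50_000_000  # on-board oscillator frequency


def divisor_for(target_hz: float, clock_hz: int = BOARD_CLOCK_HZ) -> int:
    """Return the DIVISOR value for a desired CPU step rate.

    Assumes f_out = clock_hz / DIVISOR, as implied by the 1 Hz default.
    """
    if target_hz <= 0 or target_hz > clock_hz:
        raise ValueError("target frequency must be in (0, clock_hz]")
    return round(clock_hz / target_hz)


print(divisor_for(1))    # 50000000 (the default)
print(divisor_for(10))   # 5000000
```

For example, a step rate of 10 Hz would call for a divisor of 5,000,000 under this assumption.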
- Fork the repository
- Create a feature branch (`git checkout -b feature/branch-prediction`)
- Commit changes with clear messages (`git commit -m "feat(cpu): add conditional branching"`)
- Push to the branch (`git push origin feature/branch-prediction`)
- Open a Pull Request
- One module per `.v` file, filename matches module name
- Use `parameter` for configurable widths; avoid hardcoded magic numbers
- Active-low resets named `rst_n`
- Consistent indentation (tabs in Verilog; spaces in Makefiles, except recipe lines, which Make requires to begin with a tab)
This project is released under the MIT License.