Project 2: GPU Compute Core & Ray-Tracing Accelerator

Overview

This project implements a custom SIMT GPU Streaming Multiprocessor (SM) integrated with a dedicated hardware Ray-Tracing Compute Unit (RTCU). It demonstrates hardware/software co-design for graphics acceleration and is motivated by the lack of dedicated ray-tracing hardware blocks in earlier generations of Apple Silicon.

graph LR
    SM[GPU Streaming Multiprocessor] -->|1024-bit Packed Ray| RTCU[Ray-Tracing Compute Unit]
    RTCU -->|Intersection Test| BVH[Unified Memory / BVH Nodes]
    RTCU -->|Hit/Miss Results| SM

Architecture Components

1. `gpu_sm_core.sv` (SIMT Multiprocessor)

Implements a 32-lane vector streaming multiprocessor.

SIMT Execution Mask: Manages thread active masks across the 32-lane warp.
Instruction Decoder: Decodes custom graphics instructions. Employs a custom opcode (0x7B) to dispatch ray-tracing operations directly to the co-processor.
Packed Array Interface: Flattens the 32-thread vector datapath (e.g., 32 threads × 32-bit float coordinates) into 1024-bit packed vectors for synthesizable, high-bandwidth communication with the RTCU.
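
The packing described above can be modelled in a few lines of Python. This is a minimal sketch, not code from `gpu_sm_core.sv`: the lane order (lane 0 in the least significant bits) and the helper names `pack_lanes`, `unpack_lanes`, and `apply_mask` are assumptions made for illustration.

```python
import struct

LANES = 32        # threads per warp, per the SM description
LANE_BITS = 32    # one IEEE-754 single-precision value per lane


def pack_lanes(values):
    """Pack 32 floats into one 1024-bit integer (lane 0 in the low bits, assumed)."""
    if len(values) != LANES:
        raise ValueError(f"expected {LANES} lanes, got {len(values)}")
    packed = 0
    for lane, value in enumerate(values):
        # Reinterpret the float as its raw 32-bit pattern.
        bits = struct.unpack("<I", struct.pack("<f", value))[0]
        packed |= bits << (lane * LANE_BITS)
    return packed


def unpack_lanes(packed):
    """Split a 1024-bit integer back into 32 floats."""
    lane_mask = (1 << LANE_BITS) - 1
    out = []
    for lane in range(LANES):
        bits = (packed >> (lane * LANE_BITS)) & lane_mask
        out.append(struct.unpack("<f", struct.pack("<I", bits))[0])
    return out


def apply_mask(packed, active_mask):
    """Zero the lanes whose bit in the 32-bit active mask is clear."""
    lane_mask = (1 << LANE_BITS) - 1
    result = 0
    for lane in range(LANES):
        if (active_mask >> lane) & 1:
            result |= packed & (lane_mask << (lane * LANE_BITS))
    return result
```

A round trip such as `unpack_lanes(pack_lanes(xs))` returns the original values whenever they are exactly representable as 32-bit floats, which makes the model usable as a reference when comparing against the bus contents in simulation.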

2. `rtcu_core.sv` (Ray-Tracing Compute Unit)

A dedicated hardware accelerator designed to perform parallel Bounding Volume Hierarchy (BVH) node traversal and ray-triangle intersection tests.

Memory Interface: Fetches scene geometry and BVH tree structures directly from Unified Memory using a 256-bit wide bus.
Pipelined Traversal: Employs an internal state machine (IDLE, FETCH_BVH, INT_BOX, FETCH_TRI, INT_TRI) to walk the spatial index and check for ray intersections.
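
The five-state walk can be sketched as a software model. The state names come from the description above; everything else is an assumption for illustration: the node layout (dictionaries with `lo`, `hi`, and either `children` or `tris`), the slab test for boxes, the Möller–Trumbore test for triangles, and the explicit traversal stack. The RTL may differ on all of these.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    FETCH_BVH = auto()
    INT_BOX = auto()
    FETCH_TRI = auto()
    INT_TRI = auto()


def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])


def hit_box(origin, direction, lo, hi):
    """Slab test: True if the ray (t >= 0) overlaps the axis-aligned box."""
    t_near, t_far = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        if d == 0.0:                      # ray parallel to this slab
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
    return t_near <= t_far


def hit_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: returns the hit distance t, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:                    # ray parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None


def traverse(origin, direction, nodes, triangles):
    """Walk the BVH one state per step; returns (closest t or None, state trace)."""
    state, stack, pending = State.FETCH_BVH, [0], []   # dispatch leaves IDLE
    node = tri = closest = None
    trace = [State.IDLE]
    while state is not State.IDLE:
        trace.append(state)
        if state is State.FETCH_BVH:
            if stack:
                node, state = nodes[stack.pop()], State.INT_BOX
            else:
                state = State.IDLE        # nothing left to visit
        elif state is State.INT_BOX:
            if not hit_box(origin, direction, node["lo"], node["hi"]):
                state = State.FETCH_BVH
            elif "tris" in node:          # leaf node
                pending, state = list(node["tris"]), State.FETCH_TRI
            else:                         # interior node
                stack.extend(node["children"])
                state = State.FETCH_BVH
        elif state is State.FETCH_TRI:
            if pending:
                tri, state = triangles[pending.pop()], State.INT_TRI
            else:
                state = State.FETCH_BVH
        elif state is State.INT_TRI:
            t = hit_triangle(origin, direction, *tri)
            if t is not None and (closest is None or t < closest):
                closest = t
            state = State.FETCH_TRI
    return closest, trace
```

The returned trace lists the states in visiting order, which gives a simple reference to compare against a waveform of the RTL state register.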

3. `sim/gpu_sim.py` (Architectural Simulator)

A cycle-accurate architectural simulator written in Python.

Functional Math Model: Implements vector math, camera ray generation, and bounding box/triangle intersection tests.
Output: Path-traces a scene with spherical geometry and shadows, generating a native render.bmp file to verify the visual correctness of the rendering pipeline.
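
For the spherical geometry and shadows mentioned above, the core tests are a ray-sphere intersection and a shadow ray toward the light. The following is an illustrative sketch, not an excerpt from `sim/gpu_sim.py`; the function names, the unit-length direction requirement, and the `1e-4` self-intersection offset are assumptions.

```python
import math


def hit_sphere(origin, direction, center, radius, t_min=1e-4):
    """Nearest hit distance along a unit-length direction, or None on a miss."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * x for d, x in zip(direction, oc))      # half the linear term
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):                   # near root first
        if t > t_min:
            return t
    return None


def in_shadow(point, light_pos, spheres):
    """True if any sphere blocks the segment from point to the light."""
    to_light = tuple(l - p for l, p in zip(light_pos, point))
    dist = math.sqrt(sum(x * x for x in to_light))
    direction = tuple(x / dist for x in to_light)
    for center, radius in spheres:
        t = hit_sphere(point, direction, center, radius)
        if t is not None and t < dist:
            return True
    return False
```

A shaded pixel would typically call `hit_sphere` for the camera ray, then `in_shadow` from the hit point to decide whether the light contribution is applied.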

Verification & Simulation

Testbench: `tb/tb_gpu_core.sv`

Verifies the SM-to-RTCU interface:

Simulates an instruction fetch containing the custom RT opcode (0x7B).
Checks that the SM decodes the opcode and asserts the rtcu_dispatch_valid signal.
Verifies that the 1024-bit packed ray coordinates (origin and direction vectors) are properly driven onto the bus.
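
The decode check the testbench performs can be expressed as a small reference model. This is a sketch under one stated assumption: the opcode is taken to occupy the low 7 bits of the fetched instruction word. The field position is not given here, so `OPCODE_MASK` is hypothetical; only the opcode value 0x7B and the signal names come from the project description.

```python
RT_OPCODE = 0x7B      # custom ray-tracing opcode
OPCODE_MASK = 0x7F    # assumption: opcode field is the low 7 bits of the word


def rtcu_dispatch_valid(fetch_instr, fetch_valid):
    """Reference model: dispatch only when a valid fetch carries the RT opcode."""
    return bool(fetch_valid) and (fetch_instr & OPCODE_MASK) == RT_OPCODE


# Example stimulus: (instruction word, fetch_valid) pairs with expected results.
vectors = [
    (0x0000007B, 1, True),    # RT opcode, valid fetch
    (0x00000033, 1, False),   # different opcode
    (0x0000007B, 0, False),   # RT opcode, but fetch not valid
]
for instr, valid, expected in vectors:
    assert rtcu_dispatch_valid(instr, valid) == expected
```

The same vectors can be driven in the SystemVerilog testbench, with the model's result compared against the observed `rtcu_dispatch_valid` value.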

How to Run:

RTL Simulation: Run the design directly on EDA Playground using this pre-configured link: 👉 Live EDA Playground Simulator

Alternatively, copy tb_gpu_core.sv and the source design files into the playground manually, select Aldec Riviera Pro, and click Run.
Visual Simulator: Execute `python sim/gpu_sim.py` in your local terminal to run the architectural path-tracer and generate the visual `render.bmp` output.

Understanding the Synthesized Schematic (Block Diagram)

The included docs/Schematic_gpu_top.pdf, produced by the Yosys synthesis suite, shows the top-level hardware routing between the Streaming Multiprocessor (SM) and the Ray-Tracing Compute Unit (RTCU). When reviewing the schematic, note the following symbols:

Octagonal Nodes: Represent the physical input and output ports of the gpu_top module.
Comparator Box ($eq): You can trace the fetch_instr bus directly into a comparator checking against 7'b1111011 (the binary representation of the 0x7B custom Ray-Tracing opcode).
AND Gate Box ($logic_and): The output of the comparator is logically ANDed with fetch_valid to generate the rtcu_dispatch_valid handshake signal.
1024-bit Bus Routing: The 1024-bit buses carrying the 32-lane packed ray origins and directions (e.g., rtcu_ray_dir_x) route directly from the SM to the RTCU with no intervening combinational logic.
Hardware Handshake: The 1'1 (constant logic high) node drives the rtcu_dispatch_ready port continuously, so ready is tied high and the coprocessor accepts every dispatch in the cycle it is presented.
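
The effect of a ready signal tied high can be illustrated with a short valid/ready model. This sketch assumes the usual convention that a transfer happens in any cycle where valid and ready are both high; `count_transfers` is a hypothetical helper, not part of the project.

```python
def count_transfers(valid_trace, ready_trace=None):
    """Count cycles in which valid and ready are both high."""
    if ready_trace is None:
        # Ready tied high, as in the schematic: every valid cycle transfers.
        ready_trace = [1] * len(valid_trace)
    return sum(1 for v, r in zip(valid_trace, ready_trace) if v and r)
```

With ready constant, the transfer count equals the number of valid cycles, so the SM never stalls on dispatch. A consumer that can deassert ready would reduce the count and require the SM to hold its request.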
