IMP - Inference Machine Pipeline

⚠️ DEVELOPMENT STATUS - UNTESTED ON PHYSICAL HARDWARE ⚠️

This project is in active development. The system has been:

✅ Designed with correct KV260 memory constraints (single 512KB BRAM buffer)
✅ Compiled through brief-compiler to valid SystemVerilog
✅ Verified with Verilator lint and simulation
❌ NOT YET tested on actual KV260 hardware

Do not deploy to production hardware without physical validation.

The IMP Project

The IMP project is a response to the current status quo of AI into subscription-based cloud services. I created a custom language called Brief which allowed me to write the same syntax for both hardware and software, and I realised I could map neural network operations directly to FPGA gate logic. By combining 1.58-bit ternary quantization with Gated Delta Networks, the system enables a 9B parameter model to run on a standard $250 Kria KV260 board. Executing bare-metal on the ARM processor removes the overhead of a traditional operating system, maximizing memory availability and reducing the 19.2 GB/s bandwidth bottleneck.

I am open-sourcing the IMP engine and the Brief compiler under the GPLv2 license to ensure the logic remains a public, reciprocal resource. This architecture significantly lowers the environmental footprint of AI by replacing data-center power requirements with efficient, edge-native silicon logic. The goal is to provide individuals with the hardware design tools and model access usually reserved for large corporations. By moving AI from a rented service to locally-owned hardware, we ensure that the ability to process and generate information remains a permanent utility under the user's direct control.

I am currently actively developing this project, and current known limitations will likely be solved in the near future.

Architecture

                    ┌─────────────────┐
                    │  ARM Cortex-A53 │
                    │  (Brief Kernel) │
                    └────────┬────────┘
                             │ AXI4-Lite MMIO
                             ▼
                    ┌─────────────────┐
                    │  FPGA Neural    │
                    │  Core Engine    │
                    │  (SystemVerilog)│
                    └────────┬────────┘
                             │
                    ┌────────▼────────┐
                    │     DDR4        │
                    │  (Model Weights) │
                    └─────────────────┘

Key Features:

Ternary computation (-1, 0, +1) for 1.58-bit quantization
Single 512KB BRAM buffer streamed from DDR4
Bare-metal ARM execution (no OS overhead)
Brief language compiles to both ARM and FPGA

Quick Start

Prerequisites

KV260 board
Rust toolchain for ARM (thumbv7neon-none-eabihf)
Xilinx Vivado (for FPGA synthesis)
Verilator (for simulation)

Build & Simulate

# Generate hardware from Brief specification
./brief-compiler verilog neuralcore.ebv --hw hardware.toml -o generated/

# Run Verilator simulation
./run_sim.sh

# View waveforms
gtkwave sim_build/waveform.vcd

Deploy to SD Card

./build_sdcard.sh /path/to/sdcard

Research Foundations

This project builds on established research in binary neural networks and edge AI acceleration:

Ternary Quantization: {-1, 0, +1} weights enabling 1.58-bit per parameter
Binary Neural Networks (XNOR-Net): Courbariaux et al., 2016
Gated Delta Networks: Low-bandwidth recurrent architectures
Edge AI Compilation: FPGA-accelerated inference on resource-constrained devices

See RESEARCH_BIBLIOGRAPHY.md for full citation list.

Project Structure

imp/
├── neuralcore.ebv       # FPGA neural engine (Brief)
├── kernel.ebv           # ARM kernel specification (Brief)
├── hardware.toml        # KV260 memory map & constraints
├── arm/
│   ├── kernel.rs        # ARM bare-metal implementation
│   └── memory.ld        # Linker script (DDR4 at 0x0)
├── generated/           # Compiled SystemVerilog
├── run_sim.sh           # Verilator simulation
├── build_sdcard.sh      # SD card builder
├── SPEC_v0.1.md         # Full architecture specification
└── Imp.svg              # Project logo

Documentation

Document	Purpose
SPEC_v0.1.md	Full architecture specification
INFERENCE_GUIDE.md	Usage and TCP protocol
Build_Workflow.md	Build pipeline
FLASH_GUIDE.md	SD card deployment
Hardware_Config_Guide.md	hardware.toml reference

Known Limitations

DMA Transfer: Weights currently stream via AXI4-Lite MMIO (CPU bound). High-speed AXI-DMA (AXI4-Full) is required for full performance.
Tokenizer: Simple fallback (no BPE vocabulary loaded)
TCP Stack: Stub (needs LwIP integration)
FPGA Synthesis: Not yet run through Vivado
Physical Testing: No hardware validation performed

License

See LICENSE file for details.

Acknowledgments

This project was made possible by the foundational research of many intelligent individuals. See RESEARCH_BIBLIOGRAPHY.md for the papers and resources that informed this work.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IMP - Inference Machine Pipeline

⚠️ DEVELOPMENT STATUS - UNTESTED ON PHYSICAL HARDWARE ⚠️

The IMP Project

Architecture

Quick Start

Prerequisites

Build & Simulate

Deploy to SD Card

Research Foundations

Project Structure

Documentation

Known Limitations

License

Acknowledgments

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
arm		arm
boot		boot
brief-compiler		brief-compiler
generated		generated
.gitignore		.gitignore
Build_Workflow.md		Build_Workflow.md
FLASH_GUIDE.md		FLASH_GUIDE.md
Hardware_Config_Guide.md		Hardware_Config_Guide.md
INFERENCE_GUIDE.md		INFERENCE_GUIDE.md
Imp.svg		Imp.svg
LICENSE		LICENSE
README.md		README.md
RESEARCH.md		RESEARCH.md
RESEARCH_BIBLIOGRAPHY.md		RESEARCH_BIBLIOGRAPHY.md
SPEC_v0.1.md		SPEC_v0.1.md
build_imp.tcl		build_imp.tcl
build_sdcard.sh		build_sdcard.sh
hardware.toml		hardware.toml
kernel.ebv		kernel.ebv
kernel.toml		kernel.toml
neuralcore.ebv		neuralcore.ebv
neuralcore_axi.sv		neuralcore_axi.sv
run_sim.sh		run_sim.sh
verify_sv.sh		verify_sv.sh

Folders and files

Latest commit

History

Repository files navigation

IMP - Inference Machine Pipeline

⚠️ DEVELOPMENT STATUS - UNTESTED ON PHYSICAL HARDWARE ⚠️

The IMP Project

Architecture

Quick Start

Prerequisites

Build & Simulate

Deploy to SD Card

Research Foundations

Project Structure

Documentation

Known Limitations

License

Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages