Ln11211/sys

Adaptive Edge AI Accelerator

Real-time anomaly detection engine for streaming sensor data, implemented in SystemVerilog and targeting the Digilent Nexys A7-100T FPGA board.

The accelerator processes a stream of 16-bit signed samples through a parallel array of Processing Elements (PEs) and produces a scalar anomaly score together with a binary anomaly flag. Both are exposed through a compact memory-mapped register interface, which is driven directly from the Vivado Hardware Manager over USB-JTAG; no external controller is required.


Feature Overview

| Feature | Detail |
| --- | --- |
| Target device | Xilinx Artix-7 xc7a100tcsg324-1 (Nexys A7-100T) |
| Clock | 100 MHz on-board oscillator |
| Data format | Q8.8 signed fixed-point (16-bit samples, 16-bit coefficients) |
| Accumulator | 32-bit signed |
| Processing Elements | 4 × parallel PEs |
| Stream buffer depth | 256 samples |
| Temporal window | 16 samples (shift-register) |
| Host interface | 16-bit address, 32-bit data, synchronous read/write |
| Debug interface | Xilinx VIO (Virtual I/O) over USB-JTAG |
| Utilisation (baseline) | ~1.8 % LUTs, ~1.2 % FFs, 4 × DSP48 |
| Power (total on-chip) | ~101 mW (97 mW device static + ~4 mW dynamic) |
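
As a host-side aid for preparing register values, the Q8.8 format in the table can be sketched in Python. The helper names `to_q88` and `from_q88` are hypothetical and not part of this repository, and saturating out-of-range inputs is an assumption of the sketch, not documented hardware behaviour.

```python
Q_FRAC_BITS = 8  # Q8.8: 8 integer bits (including sign) and 8 fractional bits


def to_q88(x: float) -> int:
    """Encode a real number as a 16-bit two's-complement Q8.8 word."""
    raw = round(x * (1 << Q_FRAC_BITS))
    # Assumption of this sketch: clamp to the representable range.
    raw = max(-32768, min(32767, raw))
    return raw & 0xFFFF


def from_q88(word: int) -> float:
    """Decode a 16-bit two's-complement Q8.8 word back to a real number."""
    word &= 0xFFFF
    if word & 0x8000:  # sign bit set: convert to a negative integer
        word -= 0x10000
    return word / (1 << Q_FRAC_BITS)


if __name__ == "__main__":
    print(hex(to_q88(1.5)))
    print(from_q88(0xFF80))
```

The manual test later in this README writes small raw integers (0x0001, 0x0002, ...) straight into the registers, so these helpers are only needed when fractional values are wanted.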

Operating Modes

| Mode | Encoding (CTRL[3:2]) | Operation |
| --- | --- | --- |
| 0 (Dense MAC) | 2'b00 | score += Σ(sample × coeff) across all 4 PEs per sample |
| 1 (Threshold) | 2'b01 | score += 1 for each sample where sample > threshold |
| 2 (Temporal) | 2'b10 | score += Σ(current − previous); flags rapid change |
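
A minimal software reference for Mode 0 can help when predicting scores before a hardware run. The sketch below follows the table's description only (every sample multiplied by each of the four PE coefficients, products summed into the score). It works on raw integer register values, does not model 32-bit accumulator overflow, and the name `mode0_score` is hypothetical. Modes 1 and 2 are left out because their exact per-PE reduction is not spelled out here.

```python
def mode0_score(samples, coeffs):
    """Dense-MAC reference: accumulate sample * coeff over all PEs per sample.

    `samples` and `coeffs` are raw signed integers as written to the
    SAMPLE[n] and COEFF_0-3 registers. No fixed-point rescaling is applied,
    which is an assumption of this sketch.
    """
    score = 0
    for sample in samples:
        # One product per PE, reduced to a single contribution per sample.
        score += sum(sample * coeff for coeff in coeffs)
    return score


if __name__ == "__main__":
    # Same stimulus as the manual Mode 0 test in this README.
    print(mode0_score([1, 2, 3, 4], [1, 1, 1, 1]))
```

With samples 1, 2, 3, 4 and all coefficients set to 1, the model gives (1 + 2 + 3 + 4) × 4 = 40, which matches the expected RESULT_SCORE quoted for the Mode 0 test.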

Repository Layout

.
├── rtl/
│   ├── adaptive_edge_ai_params_pkg.sv  # Shared parameter package
│   ├── processing_element.sv           # Core compute (MAC / thresh / temporal)
│   ├── pe_array.sv                     # 4× PE instantiation
│   ├── decision_engine.sv              # PE output reduction (combinational)
│   ├── stream_input_buffer.sv          # 256×16-bit sample store (LUT-RAM)
│   ├── window_buffer.sv                # 16-sample shift-register + sum
│   ├── mode_controller.sv              # 5-state FSM + score accumulator
│   ├── reg_if.sv                       # Memory-mapped register interface
│   ├── adaptive_edge_ai_top.sv         # Design top (SystemVerilog)
│   ├── nexys_a7_top.v                  # Board wrapper + VIO (plain Verilog)
│   ├── nexys_a7_vio.xdc                # Pin / IOSTANDARD constraints
│   ├── tb_adaptive_edge_ai_top.sv      # Self-checking functional testbench
│   └── tb_edge_cases_500.sv            # Extended edge-case testbench
├── vio_test_a_mode0.tcl                # VIO TCL script — Mode 0 only
├── vio_test_all_modes.tcl              # VIO TCL script — all three modes
├── on_board_testing_guide.md           # Step-by-step board testing guide
├── adaptive_edge_ai_report.md          # Full design specification
├── power_report.txt                    # Vivado power analysis baseline
└── DEPS.yml                            # Simulation dependency file

Architecture

         ┌─────────────────────────────────────────────────────┐
         │                 adaptive_edge_ai_top                │
         │                                                     │
Host ───►│  reg_if ──► stream_input_buffer                     │
  bus    │     │              │ stream_rd_data                 │
         │     │       mode_controller ──────────► pe_array    │
         │     │         │        │  pe_enable     (4× PE)     │
         │     │    window_buffer │                    │       │
         │     │   current/prev   │              decision_eng. │
         │     │                  │                    │       │
         │     └──── status ◄─────┘         score/flag│        │
         │              ▲                              │       │
         │              └──────────────────────────────┘       │
         └─────────────────────────────────────────────────────┘

On Board Device (Nexys A7 FPGA) Summary

Version 1 Baseline (Unoptimized)

Summary: image

Power consumption: image

Schematic: image

Version 2 Optimized for On-Board Power Consumption

Summary: image

Power consumption: image

Schematic: image

Module Responsibilities

| Module | Role |
| --- | --- |
| reg_if | Decodes host reads/writes; holds coefficients, thresholds, samples, and control bits; exposes status |
| stream_input_buffer | Single-port LUT-RAM; host writes samples, FSM reads them sequentially |
| window_buffer | Shift-register storing the 16 most-recent samples; provides current_sample, previous_sample, and window_sum |
| mode_controller | 5-state FSM (IDLE → VALIDATE → LOAD_SAMPLE → PROCESS → UPDATE); drives pe_enable and accumulates score |
| pe_array | Instantiates 4 identical processing_element blocks; fans out sample/coeff/threshold inputs |
| processing_element | Fully combinational; computes mode-specific value and flag; DSP inputs are zero-masked when disabled |
| decision_engine | Combinational reduction of 4 PE values/flags to a single score and anomaly flag |
| adaptive_edge_ai_top | Wires all sub-modules; exposes busy output for clock-gate enable |
| nexys_a7_top | Plain-Verilog board wrapper; instantiates VIO IP and BUFGCE clock gate |

Register Map

List of configurable registers

| Register | Offset | Width | Access | Reset | Description |
| --- | --- | --- | --- | --- | --- |
| CONTROL | 0x0000 | 32 | WO | 0 | [0] start, [1] clear, [3:2] mode |
| STATUS | 0x0004 | 32 | RO | 0 | [0] done, [1] busy, [2] valid, [3] anomaly_flag, [4] error |
| STREAM_LENGTH | 0x0008 | 8 | RW | 0 | Number of samples to process (1–256) |
| THRESHOLD_0–3 | 0x000C–0x0018 | 16 | RW | 0 | Signed threshold per PE |
| COEFF_0–3 | 0x001C–0x0028 | 16 | RW | 0 | Signed coefficient per PE (Mode 0) |
| RESULT_SCORE | 0x002C | 32 | RO | 0 | Final accumulated anomaly score |
| SAMPLE[n] | 0x1000 + n×4 | 16 | WO | 0 | Stream sample at index n (0–255) |
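
The per-PE and per-sample offsets in the register map follow a 4-byte stride, which can be captured in a few host-side helpers. This is a sketch only: the function and constant names are hypothetical, and the base addresses are copied from the table above.

```python
THRESHOLD_BASE = 0x000C  # THRESHOLD_0
COEFF_BASE = 0x001C      # COEFF_0
SAMPLE_BASE = 0x1000     # SAMPLE[0]
NUM_PES = 4
STREAM_DEPTH = 256
STRIDE = 4               # registers sit on 32-bit word boundaries


def threshold_addr(pe: int) -> int:
    """Address of THRESHOLD_<pe> for pe in 0..3."""
    if not 0 <= pe < NUM_PES:
        raise ValueError("PE index must be 0..3")
    return THRESHOLD_BASE + STRIDE * pe


def coeff_addr(pe: int) -> int:
    """Address of COEFF_<pe> for pe in 0..3."""
    if not 0 <= pe < NUM_PES:
        raise ValueError("PE index must be 0..3")
    return COEFF_BASE + STRIDE * pe


def sample_addr(n: int) -> int:
    """Address of SAMPLE[n] for n in 0..255."""
    if not 0 <= n < STREAM_DEPTH:
        raise ValueError("sample index must be 0..255")
    return SAMPLE_BASE + STRIDE * n


if __name__ == "__main__":
    print(hex(coeff_addr(3)), hex(sample_addr(3)))
```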

CONTROL Register Encoding

| Operation | Write value |
| --- | --- |
| Clear (resets FSM + window) | 0x00000002 |
| Set mode 0 (MAC) | 0x00000000 |
| Set mode 1 (Threshold) | 0x00000004 |
| Set mode 2 (Temporal) | 0x00000008 |
| Start mode 0 | 0x00000001 |
| Start mode 1 | 0x00000005 |
| Start mode 2 | 0x00000009 |

Important: always write the mode and the start bit as two separate transactions: first the mode-only value, then the same value with bit 0 (start) set. The mode field must be stable before the start_pulse is captured by the FSM on the following clock edge.
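
The CONTROL values can be derived from the bit fields instead of being memorised. The sketch below is a hypothetical host-side helper (`control_word` and `start_sequence` are not part of the repository) that builds the word from the documented layout and returns the two-write sequence described in the note above.

```python
CTRL_START = 1 << 0   # CONTROL[0]
CTRL_CLEAR = 1 << 1   # CONTROL[1]
CTRL_MODE_SHIFT = 2   # CONTROL[3:2]


def control_word(mode: int = 0, start: bool = False, clear: bool = False) -> int:
    """Pack mode/start/clear into a 32-bit CONTROL value.

    Mode 3 is accepted because the field is two bits wide, but the README
    lists it as reserved and the design reports it as an error.
    """
    if not 0 <= mode <= 3:
        raise ValueError("mode must fit in CONTROL[3:2]")
    word = mode << CTRL_MODE_SHIFT
    if start:
        word |= CTRL_START
    if clear:
        word |= CTRL_CLEAR
    return word


def start_sequence(mode: int) -> list:
    """Two CONTROL writes: mode only, then mode with the start bit set."""
    return [control_word(mode), control_word(mode, start=True)]


if __name__ == "__main__":
    print([hex(w) for w in start_sequence(2)])
```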


Power Optimisations

Two RTL-level changes were applied to reduce dynamic power from the baseline report:

1. DSP Operand Isolation (processing_element.sv)

The mac_a and mac_b inputs to each DSP48 are driven to zero whenever enable = 0. The PE is only enabled for one cycle per sample (the PROCESS state), so this reduces the effective DSP toggle rate by ~90 %, targeting the 4 mW DSP row in the power report.

always_comb begin
    mac_a = enable ? current_sample : '0;  // freeze DSP inputs when idle
    mac_b = enable ? coeff          : '0;
end
always_comb mac_product = mac_a * mac_b;

2. BUFGCE Clock Gate (nexys_a7_top.v)

The 100 MHz clock is gated to the DUT when the design is completely idle. The gate enable is: host_wr_en | host_rd_en | dut_busy | ~rst_n.

wire clk_dut_en = host_wr_en | host_rd_en | dut_busy | (~rst_n);
BUFGCE u_clk_gate (.I(clk), .CE(clk_dut_en), .O(clk_dut));

The VIO core retains the free-running clk; only u_dut uses the gated clock. Expected combined saving: ~7–9 mW off the 14 mW dynamic total.


Simulation

The project uses a DEPS.yml dependency file compatible with the OpenCOS EDA simulation flow.

# Run all tests with Verilator
eda sim --tool verilator --waves bench_adaptive_edge_ai

The self-checking testbench (tb_adaptive_edge_ai_top.sv) covers:

  • Mode 0 MAC with known coefficients and samples (expected score = 40)
  • Mode 1 threshold comparator (expected score = 2)
  • Mode 2 temporal difference detector (expected score = 60)
  • Error conditions: reserved mode (mode 3), zero stream length

All 9 test cases must pass with RESULT: PASS before generating a bitstream.


Building and Programming

Prerequisites

  • Vivado 2020.x or later (free WebPACK edition covers Artix-7)
  • Digilent Nexys A7-100T board
  • USB-A to Micro-USB cable connected to the PROG/UART port on the board

Step 1 — Create the Vivado Project

  1. File → Project → New
  2. Part: xc7a100tcsg324-1
  3. Add Sources — add all rtl/*.sv and rtl/nexys_a7_top.v
  4. Add Constraints — add rtl/nexys_a7_vio.xdc
  5. Set nexys_a7_top as the top module

Step 2 — Generate the VIO IP

  1. IP Catalog → VIO (Debug) — name it vio_0
  2. Input probes: probe_in0 width=32, probe_in1 width=1
  3. Output probes: probe_out0 width=1, probe_out1 width=1, probe_out2 width=16, probe_out3 width=16
  4. Generate and let Vivado synthesise the stub

Step 3 — Run Implementation

Flow Navigator → Run Synthesis → Run Implementation → Generate Bitstream

Check that WNS ≥ 0 in the timing summary before generating the bitstream. Typical resource usage on xc7a100t:

| Resource | Used | Available | Utilisation |
| --- | --- | --- | --- |
| LUT | ~530 | 63 400 | ~0.8 % |
| FF | ~310 | 126 800 | ~0.2 % |
| DSP48 | 4 | 240 | 1.7 % |
| BRAM | 0 | 135 | 0 % |
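
The utilisation column is just used / available. The short sketch below repeats that arithmetic so figures from a new utilisation report can be compared against the table. The totals are copied from the table above, and the helper name `utilisation_pct` is hypothetical.

```python
# Resource totals for the xc7a100t, as listed in the table above.
AVAILABLE = {"LUT": 63_400, "FF": 126_800, "DSP48": 240, "BRAM": 135}


def utilisation_pct(resource: str, used: int) -> float:
    """Percentage of a device resource consumed by the design."""
    return 100.0 * used / AVAILABLE[resource]


if __name__ == "__main__":
    for name, used in (("LUT", 530), ("FF", 310), ("DSP48", 4), ("BRAM", 0)):
        print(f"{name}: {utilisation_pct(name, used):.1f} %")
```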

Step 4 — Program the Board

  1. Power on the Nexys A7 (SW16 → ON)
  2. Open Hardware Manager → Auto Connect
  3. You should see xc7a100t_0
  4. Program Device → select adaptive_edge_ai.runs/impl_1/nexys_a7_top.bit
  5. The green DONE LED illuminates when programming succeeds

On-Board Testing

Interactive testing is done entirely through the Vivado Hardware Manager using the VIO dashboard or the provided TCL scripts — no Pmod wiring or external microcontroller required.

Quick Test via TCL Script

With the board programmed and Hardware Manager connected:

# In the Vivado TCL console (adjust the path to your local checkout):
source {D:/sys/vio_test_all_modes.tcl}

The script runs Tests A, B, and C automatically and prints pass/fail for each expected result.

Manual Test — Mode 0 (expected RESULT_SCORE = 40)

Write 0x0002 → 0x0000   clear
Write 0x0001 → 0x001C   COEFF_0 = 1
Write 0x0001 → 0x0020   COEFF_1 = 1
Write 0x0001 → 0x0024   COEFF_2 = 1
Write 0x0001 → 0x0028   COEFF_3 = 1
Write 0x0001 → 0x1000   sample[0] = 1
Write 0x0002 → 0x1004   sample[1] = 2
Write 0x0003 → 0x1008   sample[2] = 3
Write 0x0004 → 0x100C   sample[3] = 4
Write 0x0004 → 0x0008   stream_length = 4
Write 0x0000 → 0x0000   mode = 0
Write 0x0001 → 0x0000   START
Poll  0x0004             wait for bit[0] = 1 (done)
Read  0x002C             RESULT_SCORE → expect 0x00000028 (= 40)
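
The write sequence above can also be generated programmatically, for example to feed a scripted test. The following is a sketch under stated assumptions: `mode0_transactions` is a hypothetical helper, it emits (value, address) pairs in the same order as the listing, and actually issuing them over VIO is left to the provided TCL scripts.

```python
def mode0_transactions(coeffs, samples):
    """Return the (value, address) writes for one Mode 0 run.

    Addresses and values follow the register map in this README; 16-bit
    fields are masked so negative inputs become two's-complement words.
    """
    writes = [(0x0002, 0x0000)]                          # clear
    writes += [(c & 0xFFFF, 0x001C + 4 * i)              # COEFF_0..3
               for i, c in enumerate(coeffs)]
    writes += [(s & 0xFFFF, 0x1000 + 4 * n)              # SAMPLE[n]
               for n, s in enumerate(samples)]
    writes.append((len(samples), 0x0008))                # STREAM_LENGTH
    writes.append((0x0000, 0x0000))                      # mode = 0
    writes.append((0x0001, 0x0000))                      # START
    return writes


if __name__ == "__main__":
    for value, addr in mode0_transactions([1, 1, 1, 1], [1, 2, 3, 4]):
        print(f"Write 0x{value:04X} -> 0x{addr:04X}")
```

After these writes the host still has to poll STATUS (0x0004) for done and read RESULT_SCORE (0x002C), as in the listing.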

LED pattern after reading STATUS:

LED[4..0] = 0_1_1_0_1   (error=0, anomaly=1, valid=1, busy=0, done=1)

See on_board_testing_guide.md for the complete test procedures for all three modes and error conditions.


LED Status Reference

The lower 8 bits of host_rdata are always wired to LED[7:0]:

┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┐
│LED[7]│LED[6]│LED[5]│LED[4]│LED[3]│LED[2]│LED[1]│LED[0]│
│  —   │  —   │  —   │error │anom. │valid │busy  │done  │
└──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┘

Read the STATUS register (0x0004) to update the LEDs:

| State | LED pattern | Meaning |
| --- | --- | --- |
| Processing complete, anomaly detected | 0x0D | done=1, valid=1, anomaly=1 |
| Processing complete, no anomaly | 0x05 | done=1, valid=1, anomaly=0 |
| Actively processing | 0x02 | busy=1 |
| Error (bad mode / zero length) | 0x11 | done=1, error=1 |
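
When reading STATUS from a script and not from the LEDs, the same bit layout can be decoded in software. This is a minimal sketch with a hypothetical helper name, `decode_status`. The bit positions are taken from the STATUS register description and the LED diagram above.

```python
# STATUS bit positions, least-significant bit first.
STATUS_BITS = ("done", "busy", "valid", "anomaly", "error")


def decode_status(value: int) -> dict:
    """Split the low five bits of a STATUS read into named flags."""
    return {name: bool((value >> bit) & 1)
            for bit, name in enumerate(STATUS_BITS)}


if __name__ == "__main__":
    for pattern in (0x0D, 0x05, 0x02, 0x11):
        flags = [name for name, on in decode_status(pattern).items() if on]
        print(f"0x{pattern:02X}: {', '.join(flags)}")
```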

Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| DRC NSTD-1 / UCIO-1 at bitstream | LED constraints not applied | Ensure nexys_a7_vio.xdc is in Constraint Sources and targets Implementation |
| RESULT_SCORE = 0 after done | Mode not written before start | Write mode in a separate CONTROL write, then write start |
| done never asserts | stream_length = 0 or not written | Write STREAM_LENGTH ≥ 1 before START |
| Wrong Mode 2 score | Window buffer not cleared | Issue Write 0x0002 → 0x0000 (clear) before every Mode 2 run |
| VIO probe names not found by TCL | Probe names differ by Vivado version | Run get_hw_probes -of_objects $vio to list actual names |
| Board not detected | Wrong USB port or power off | Use the PROG USB port (left side); verify SW16 is ON |

About

This project is entirely vibe coded
