Real-time anomaly detection engine for streaming sensor data, implemented in SystemVerilog and targeting the Digilent Nexys A7-100T FPGA board.
The accelerator processes a stream of 16-bit signed samples through a parallel array of Processing Elements (PEs) and produces a scalar anomaly score together with a binary anomaly flag. It is controlled through a compact memory-mapped register interface that is driven directly from the Vivado Hardware Manager over USB-JTAG, so no external controller is required.
| Feature | Detail |
|---|---|
| Target device | Xilinx Artix-7 xc7a100tcsg324-1 (Nexys A7-100T) |
| Clock | 100 MHz on-board oscillator |
| Data format | Q8.8 signed fixed-point (16-bit samples, 16-bit coefficients) |
| Accumulator | 32-bit signed |
| Processing Elements | 4 × parallel PEs |
| Stream buffer depth | 256 samples |
| Temporal window | 16 samples (shift-register) |
| Host interface | 16-bit address, 32-bit data, synchronous read/write |
| Debug interface | Xilinx VIO (Virtual I/O) over USB-JTAG |
| Utilisation (baseline) | ~1.8 % LUTs, ~1.2 % FFs, 4 × DSP48 |
| Power (total on-chip) | ~101 mW (97 mW device static + ~4 mW dynamic) |
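
As an illustration of the Q8.8 data format listed above, the following Python sketch converts between real values and 16-bit two's-complement words. The helper names `to_q8_8` and `from_q8_8` are hypothetical and are not part of the repository; the sketch assumes round-to-nearest on encode and rejects values outside the representable range.

```python
def to_q8_8(value: float) -> int:
    """Encode a real number as a 16-bit two's-complement Q8.8 word."""
    raw = round(value * 256)              # 8 fractional bits -> scale by 2**8
    if not -32768 <= raw <= 32767:
        raise ValueError(f"{value} is outside the Q8.8 range [-128, 128)")
    return raw & 0xFFFF                   # wrap negatives into 16 bits


def from_q8_8(word: int) -> float:
    """Decode a 16-bit two's-complement Q8.8 word back to a real number."""
    raw = word - 0x10000 if word & 0x8000 else word
    return raw / 256


print(hex(to_q8_8(1.5)))     # 0x180
print(from_q8_8(0xFF80))     # -0.5
```

The product of two Q8.8 words carries 16 fractional bits, which is why the accumulator is wider than the samples.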
| Mode | Encoding (CTRL[3:2]) | Operation |
|---|---|---|
| 0 (Dense MAC) | 2'b00 | score += Σ(sample × coeff) across all 4 PEs per sample |
| 1 (Threshold) | 2'b01 | score += 1 for each sample where sample > threshold |
| 2 (Temporal) | 2'b10 | score += Σ(current − previous); flags rapid change |
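
The three scoring rules in the table above can be sketched as a small Python reference model, which may help when predicting `RESULT_SCORE` before a run. This is a sketch under stated assumptions, not the RTL. The function names are hypothetical, arithmetic is on raw integer sample words with no Q8.8 rescaling, Mode 1 uses a single threshold rather than one per PE, and Mode 2 assumes the previous sample is 0 after a clear.

```python
def score_dense_mac(samples, coeffs):
    """Mode 0: every sample is multiplied by each PE coefficient and summed."""
    return sum(s * c for s in samples for c in coeffs)


def score_threshold(samples, threshold):
    """Mode 1 (assumed single threshold): count samples above the threshold."""
    return sum(1 for s in samples if s > threshold)


def score_temporal(samples, initial_previous=0):
    """Mode 2 (assumed zero history after clear): sum of signed differences."""
    score, previous = 0, initial_previous
    for current in samples:
        score += current - previous
        previous = current
    return score


print(score_dense_mac([1, 2, 3, 4], [1, 1, 1, 1]))   # 40
```

With four unit coefficients, Mode 0 yields four times the sum of the samples, which agrees with the Mode 0 expectation of 40 used by the testbench.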
```
.
├── rtl/
│   ├── adaptive_edge_ai_params_pkg.sv  # Shared parameter package
│   ├── processing_element.sv           # Core compute (MAC / thresh / temporal)
│   ├── pe_array.sv                     # 4× PE instantiation
│   ├── decision_engine.sv              # PE output reduction (combinational)
│   ├── stream_input_buffer.sv          # 256×16-bit sample store (LUT-RAM)
│   ├── window_buffer.sv                # 16-sample shift-register + sum
│   ├── mode_controller.sv              # 5-state FSM + score accumulator
│   ├── reg_if.sv                       # Memory-mapped register interface
│   ├── adaptive_edge_ai_top.sv         # Design top (SystemVerilog)
│   ├── nexys_a7_top.v                  # Board wrapper + VIO (plain Verilog)
│   ├── nexys_a7_vio.xdc                # Pin / IOSTANDARD constraints
│   ├── tb_adaptive_edge_ai_top.sv      # Self-checking functional testbench
│   └── tb_edge_cases_500.sv            # Extended edge-case testbench
├── vio_test_a_mode0.tcl                # VIO TCL script (Mode 0 only)
├── vio_test_all_modes.tcl              # VIO TCL script (all three modes)
├── on_board_testing_guide.md           # Step-by-step board testing guide
├── adaptive_edge_ai_report.md          # Full design specification
├── power_report.txt                    # Vivado power analysis baseline
└── DEPS.yml                            # Simulation dependency file
```
```
         ┌─────────────────────────────────────────────────────┐
         │                adaptive_edge_ai_top                 │
         │                                                     │
Host ───►│  reg_if ──► stream_input_buffer                     │
bus      │    │               │ stream_rd_data                 │
         │    │        mode_controller ──────────► pe_array    │
         │    │               │      │ pe_enable    (4× PE)    │
         │    │        window_buffer │                 │       │
         │    │        current/prev  │           decision_eng. │
         │    │                      │                 │       │
         │    └──── status ◄─────────┘       score/flag│       │
         │            ▲                                │       │
         │            └────────────────────────────────┘       │
         └─────────────────────────────────────────────────────┘
```
| Module | Role |
|---|---|
| `reg_if` | Decodes host reads/writes; holds coefficients, thresholds, samples, and control bits; exposes status |
| `stream_input_buffer` | Single-port LUT-RAM; host writes samples, FSM reads them sequentially |
| `window_buffer` | Shift-register storing the 16 most-recent samples; provides `current_sample`, `previous_sample`, and `window_sum` |
| `mode_controller` | 5-state FSM (IDLE → VALIDATE → LOAD_SAMPLE → PROCESS → UPDATE); drives `pe_enable` and accumulates score |
| `pe_array` | Instantiates 4 identical `processing_element` blocks; fans out sample/coeff/threshold inputs |
| `processing_element` | Fully combinational; computes mode-specific value and flag; DSP inputs are zero-masked when disabled |
| `decision_engine` | Combinational reduction of 4 PE values/flags to a single score and anomaly flag |
| `adaptive_edge_ai_top` | Wires all sub-modules; exposes `busy` output for clock-gate enable |
| `nexys_a7_top` | Plain-Verilog board wrapper; instantiates VIO IP and BUFGCE clock gate |
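
To make the `window_buffer` row above concrete, here is a behavioural Python sketch of a 16-deep shift register exposing `current_sample`, `previous_sample`, and `window_sum`. The class itself is hypothetical. It assumes the register is cleared to zeros and that the newest sample sits at tap 0, neither of which is confirmed here against the RTL.

```python
from collections import deque


class WindowBuffer:
    """Behavioural model of a 16-sample shift register (assumed zero on clear)."""

    DEPTH = 16

    def __init__(self):
        self.clear()

    def clear(self):
        # Every tap returns to zero, as after a CONTROL clear (assumption).
        self._taps = deque([0] * self.DEPTH, maxlen=self.DEPTH)

    def push(self, sample):
        # Newest sample enters at tap 0; maxlen drops the oldest tap.
        self._taps.appendleft(sample)

    @property
    def current_sample(self):
        return self._taps[0]

    @property
    def previous_sample(self):
        return self._taps[1]

    @property
    def window_sum(self):
        return sum(self._taps)


buf = WindowBuffer()
for value in range(1, 21):        # push 1..20; only 5..20 remain in the window
    buf.push(value)
print(buf.current_sample, buf.previous_sample, buf.window_sum)   # 20 19 200
```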
List of configurable registers
| Register | Offset | Width | Access | Reset | Description |
|---|---|---|---|---|---|
| CONTROL | 0x0000 | 32 | WO | 0 | [0] start, [1] clear, [3:2] mode |
| STATUS | 0x0004 | 32 | RO | 0 | [0] done, [1] busy, [2] valid, [3] anomaly_flag, [4] error |
| STREAM_LENGTH | 0x0008 | 8 | RW | 0 | Number of samples to process (1–256) |
| THRESHOLD_0–3 | 0x000C–0x0018 | 16 | RW | 0 | Signed threshold per PE |
| COEFF_0–3 | 0x001C–0x0028 | 16 | RW | 0 | Signed coefficient per PE (Mode 0) |
| RESULT_SCORE | 0x002C | 32 | RO | 0 | Final accumulated anomaly score |
| SAMPLE[n] | 0x1000 + n×4 | 16 | WO | 0 | Stream sample at index n (0–255) |
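
The offsets in the register table follow a regular 4-byte stride, so host-side scripts can compute them rather than hard-code each one. The Python sketch below shows one way to do that; the constant and function names are hypothetical, and only the numeric offsets come from the table.

```python
CONTROL, STATUS, STREAM_LENGTH, RESULT_SCORE = 0x0000, 0x0004, 0x0008, 0x002C
THRESHOLD_BASE, COEFF_BASE, SAMPLE_BASE = 0x000C, 0x001C, 0x1000
NUM_PES, MAX_SAMPLES, STRIDE = 4, 256, 4


def threshold_addr(pe: int) -> int:
    """Address of THRESHOLD_<pe>, for pe in 0..3."""
    if not 0 <= pe < NUM_PES:
        raise ValueError("PE index must be 0..3")
    return THRESHOLD_BASE + pe * STRIDE


def coeff_addr(pe: int) -> int:
    """Address of COEFF_<pe>, for pe in 0..3."""
    if not 0 <= pe < NUM_PES:
        raise ValueError("PE index must be 0..3")
    return COEFF_BASE + pe * STRIDE


def sample_addr(n: int) -> int:
    """Address of SAMPLE[n], for n in 0..255."""
    if not 0 <= n < MAX_SAMPLES:
        raise ValueError("sample index must be 0..255")
    return SAMPLE_BASE + n * STRIDE


print(hex(coeff_addr(3)), hex(sample_addr(255)))   # 0x28 0x13fc
```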
| Operation | Write value |
|---|---|
| Clear (resets FSM + window) | 0x00000002 |
| Set mode 0 (MAC) | 0x00000000 |
| Set mode 1 (Threshold) | 0x00000004 |
| Set mode 2 (Temporal) | 0x00000008 |
| Start mode 0 | 0x00000001 |
| Start mode 1 | 0x00000005 |
| Start mode 2 | 0x00000009 |
Important: always write mode and start as two separate transactions. The mode field must be stable before `start_pulse` is captured by the FSM on the following clock edge.
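
The CONTROL values in the table follow directly from the bit layout ([0] start, [1] clear, [3:2] mode). A short Python sketch of that encoding, including the two-step mode-then-start sequence, is given below. The helper names `control_word` and `start_sequence` are hypothetical.

```python
def control_word(mode: int = 0, start: bool = False, clear: bool = False) -> int:
    """Pack the CONTROL fields: [0] start, [1] clear, [3:2] mode."""
    if not 0 <= mode <= 3:
        raise ValueError("mode is a 2-bit field")
    return (mode << 2) | (int(clear) << 1) | int(start)


def start_sequence(mode: int) -> list:
    """Two separate CONTROL writes: set the mode first, then mode plus start."""
    return [control_word(mode), control_word(mode, start=True)]


print([hex(w) for w in start_sequence(2)])   # ['0x8', '0x9']
print(hex(control_word(clear=True)))         # 0x2
```

Keeping the mode bits set in the second write means the mode field does not change at the moment the start bit is written.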
Two RTL-level changes were applied to reduce dynamic power from the baseline report:
The mac_a and mac_b inputs to each DSP48 are driven to zero whenever
enable = 0. The PE is only enabled for one cycle per sample (the
PROCESS state), so this reduces the effective DSP toggle rate by ~90 %,
targeting the 4 mW DSP row in the power report.
```systemverilog
always_comb begin
    mac_a = enable ? current_sample : '0;  // zero the DSP inputs when the PE is disabled
    mac_b = enable ? coeff          : '0;
end
always_comb mac_product = mac_a * mac_b;
```

The 100 MHz clock is gated to the DUT when the design is completely idle.
The gate enable is: `host_wr_en | host_rd_en | dut_busy | ~rst_n`.

```verilog
wire clk_dut_en = host_wr_en | host_rd_en | dut_busy | (~rst_n);
BUFGCE u_clk_gate (.I(clk), .CE(clk_dut_en), .O(clk_dut));
```

The VIO core retains the free-running `clk`; only `u_dut` uses the gated
clock. Expected combined saving: ~7–9 mW off the 14 mW dynamic total.
The project uses a DEPS.yml dependency file compatible with the
OpenCOS EDA simulation flow.
```bash
# Run all tests with Verilator
eda sim --tool verilator --waves bench_adaptive_edge_ai
```
The self-checking testbench (tb_adaptive_edge_ai_top.sv) covers:
- Mode 0 MAC with known coefficients and samples (expected score = 40)
- Mode 1 threshold comparator (expected score = 2)
- Mode 2 temporal difference detector (expected score = 60)
- Error conditions: reserved mode (mode 3), zero stream length
All 9 test cases must pass with RESULT: PASS before generating a
bitstream.
- Vivado 2020.x or later (free WebPACK edition covers Artix-7)
- Digilent Nexys A7-100T board
- USB-A cable to the PROG/UART port on the board
- File → Project → New
- Part: `xc7a100tcsg324-1`
- Add Sources: add all `rtl/*.sv` and `rtl/nexys_a7_top.v`
- Add Constraints: add `rtl/nexys_a7_vio.xdc`
- Set `nexys_a7_top` as the top module
- IP Catalog → VIO (Debug): name it `vio_0`
- Input probes: `probe_in0` width=32, `probe_in1` width=1
- Output probes: `probe_out0` width=1, `probe_out1` width=1, `probe_out2` width=16, `probe_out3` width=16
- Generate and let Vivado synthesise the stub
Flow Navigator → Run Synthesis → Run Implementation → Generate Bitstream
Check that WNS ≥ 0 in the timing summary before generating the bitstream. Typical resource usage on xc7a100t:
| Resource | Used | Available | Utilisation |
|---|---|---|---|
| LUT | ~530 | 63 400 | ~0.8 % |
| FF | ~310 | 126 800 | ~0.2 % |
| DSP48 | 4 | 240 | 1.7 % |
| BRAM | 0 | 135 | 0 % |
- Power on the Nexys A7 (SW16 → ON)
- Open Hardware Manager → Auto Connect
- You should see `xc7a100t_0`
- Program Device → select `adaptive_edge_ai.runs/impl_1/nexys_a7_top.bit`
- The green DONE LED illuminates when programming succeeds
Interactive testing is done entirely through the Vivado Hardware Manager using the VIO dashboard or the provided TCL scripts — no Pmod wiring or external microcontroller required.
With the board programmed and Hardware Manager connected:
```tcl
# In the Vivado TCL console:
source {D:/sys/vio_test_all_modes.tcl}
```

The script runs Tests A, B, and C automatically and prints pass/fail for each expected result.
```
Write 0x0002 → 0x0000    clear
Write 0x0001 → 0x001C    COEFF_0 = 1
Write 0x0001 → 0x0020    COEFF_1 = 1
Write 0x0001 → 0x0024    COEFF_2 = 1
Write 0x0001 → 0x0028    COEFF_3 = 1
Write 0x0001 → 0x1000    sample[0] = 1
Write 0x0002 → 0x1004    sample[1] = 2
Write 0x0003 → 0x1008    sample[2] = 3
Write 0x0004 → 0x100C    sample[3] = 4
Write 0x0004 → 0x0008    stream_length = 4
Write 0x0000 → 0x0000    mode = 0
Write 0x0001 → 0x0000    START
Poll  0x0004             wait for bit[0] = 1 (done)
Read  0x002C             RESULT_SCORE → expect 0x00000028 (= 40)
```
LED pattern after reading STATUS:
LED[4..0] = 0_1_1_0_1 (error=0, anomaly=1, valid=1, busy=0, done=1)
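
The Mode 0 write sequence above can also be generated programmatically, which is convenient when trying other sample sets. The Python sketch below builds the same ordered list of (address, data) writes and computes the score to expect. The function name `mode0_transactions` is hypothetical, and the sketch only produces the list; it does not talk to the VIO core.

```python
def mode0_transactions(samples, coeffs):
    """Return the ordered (address, data) writes for one Mode 0 run."""
    writes = [(0x0000, 0x0002)]                                      # clear
    writes += [(0x001C + 4 * i, c & 0xFFFF) for i, c in enumerate(coeffs)]
    writes += [(0x1000 + 4 * n, s & 0xFFFF) for n, s in enumerate(samples)]
    writes += [(0x0008, len(samples)),                               # stream_length
               (0x0000, 0x0000),                                     # mode = 0
               (0x0000, 0x0001)]                                     # start
    return writes


samples, coeffs = [1, 2, 3, 4], [1, 1, 1, 1]
for addr, data in mode0_transactions(samples, coeffs):
    print(f"Write 0x{data:04X} -> 0x{addr:04X}")

# Each sample is multiplied by all four coefficients: (1+2+3+4) * 4 = 40.
expected = sum(s * c for s in samples for c in coeffs)
print(f"expect RESULT_SCORE = 0x{expected:08X}")   # 0x00000028
```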
See on_board_testing_guide.md for the
complete test procedures for all three modes and error conditions.
The lower 8 bits of host_rdata are always wired to LED[7:0]:
```
┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┐
│LED[7]│LED[6]│LED[5]│LED[4]│LED[3]│LED[2]│LED[1]│LED[0]│
│  -   │  -   │  -   │error │anom. │valid │busy  │done  │
└──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┘
```
Read the STATUS register (0x0004) to update the LEDs:
| State | LED pattern | Meaning |
|---|---|---|
| Processing complete, anomaly detected | 0x0D | done=1, valid=1, anomaly=1 |
| Processing complete, no anomaly | 0x05 | done=1, valid=1, anomaly=0 |
| Actively processing | 0x02 | busy=1 |
| Error (bad mode / zero length) | 0x11 | done=1, error=1 |
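
The LED patterns in the table are the low five STATUS bits, so they can be decoded mechanically. The following Python sketch does this; the `decode_status` helper is hypothetical, and the bit order is taken from the STATUS register description ([0] done, [1] busy, [2] valid, [3] anomaly_flag, [4] error).

```python
STATUS_BITS = ("done", "busy", "valid", "anomaly_flag", "error")


def decode_status(word: int) -> dict:
    """Map the low five STATUS bits to named boolean flags."""
    return {name: bool((word >> bit) & 1) for bit, name in enumerate(STATUS_BITS)}


for pattern in (0x0D, 0x05, 0x02, 0x11):
    flags = [name for name, is_set in decode_status(pattern).items() if is_set]
    print(f"0x{pattern:02X}: {', '.join(flags)}")
```

Running it prints one line per pattern, for example `0x0D: done, valid, anomaly_flag`.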
| Symptom | Likely cause | Fix |
|---|---|---|
| DRC NSTD-1 / UCIO-1 at bitstream | LED constraints not applied | Ensure nexys_a7_vio.xdc is in Constraint Sources and targets Implementation |
| RESULT_SCORE = 0 after done | Mode not written before start | Write mode in a separate CONTROL write, then write start |
| done never asserts | stream_length = 0 or not written | Write STREAM_LENGTH ≥ 1 before START |
| Wrong Mode 2 score | Window buffer not cleared | Issue Write 0x0002 → 0x0000 (clear) before every Mode 2 run |
| VIO probe names not found by TCL | Probe names differ by Vivado version | Run get_hw_probes -of_objects $vio to list actual names |
| Board not detected | Wrong USB port or power off | Use the PROG USB port (left side); verify SW16 is ON |





