rombarak/FIR-Filter

16-Tap FIR Filter (Verilog HDL)

This repository contains a fully pipelined, synthesizable 16-tap FIR (Finite Impulse Response) filter implemented in Verilog HDL.
The design demonstrates fixed-point DSP techniques, deterministic pipelining, a balanced adder tree, and full cycle-accurate verification using a self-checking testbench.


Author

Rom Barak
B.Sc. Electrical Engineering, Bar-Ilan University
Focus: Nanoelectronics and Communication Systems


Introduction

This project implements a complete fixed-point low-pass FIR filter using:

  • 16-tap input shift register
  • Constant Q1.15 coefficients
  • Two-stage pipelined MAC core
  • Balanced 36-bit adder tree
  • Fully self-checking testbench

The design outputs one filtered sample per clock with a deterministic two-cycle latency
and is fully synthesizable for FPGA/ASIC integration.


What Is Fixed-Point Arithmetic?

Fixed-point (Q-format) is widely used in hardware DSP due to:

  • Low area and power
  • Deterministic rounding/overflow
  • Reproducible bit-accurate behavior
  • Efficient synthesis and timing closure

Q1.15 Format (Used Here)

  • 1 sign bit
  • 15 fractional bits
  • Range: −1.0 ≤ x < +1.0
  • Resolution ≈ 3.05×10⁻⁵
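
As a rough illustration of the format above, the following Python sketch converts between real values and Q1.15 integers. The helper names `to_q15` and `from_q15` are hypothetical and not part of this repository; round-to-nearest and saturation are assumptions, since the quantization policy of the design is not stated here.

```python
Q15_FRAC_BITS = 15
Q15_MIN = -(1 << 15)      # -32768 represents -1.0
Q15_MAX = (1 << 15) - 1   # 32767 represents just under +1.0


def to_q15(x: float) -> int:
    """Quantize a float to a Q1.15 integer (assumed: round, then saturate)."""
    scaled = round(x * (1 << Q15_FRAC_BITS))
    return max(Q15_MIN, min(Q15_MAX, scaled))


def from_q15(q: int) -> float:
    """Convert a Q1.15 integer back to the real value it represents."""
    return q / (1 << Q15_FRAC_BITS)


print(to_q15(0.5))       # 16384
print(to_q15(1.0))       # 32767 (saturated, because +1.0 is not representable)
print(from_q15(-32768))  # -1.0
print(from_q15(1))       # 3.0517578125e-05, the resolution of the format
```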

Bit growth:

  • 16-bit × 16-bit → 32-bit product (Q1.15 × Q1.15 = Q2.30)
  • Summing 16 products adds log₂(16) = 4 guard bits → 36-bit accumulator, so the sum cannot overflow
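
The widths above can be checked with a short worked calculation. This is a sketch in Python rather than Verilog, and `fits_signed` is a hypothetical helper introduced only for this check.

```python
def fits_signed(value: int, bits: int) -> bool:
    """Return True if value is representable as a two's-complement number."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


TAPS = 16
S16_MIN, S16_MAX = -(1 << 15), (1 << 15) - 1

# Extreme products of two signed 16-bit operands.
products = [a * b for a in (S16_MIN, S16_MAX) for b in (S16_MIN, S16_MAX)]
largest, smallest = max(products), min(products)

guard_bits = (TAPS - 1).bit_length()  # ceil(log2(16)) = 4
acc_bits = 32 + guard_bits

print(fits_signed(largest, 32), fits_signed(smallest, 32))  # True True
print(acc_bits)                                              # 36
print(fits_signed(TAPS * largest, acc_bits))                 # True
print(fits_signed(TAPS * largest, 32))                       # False
```

The last line shows why a 32-bit accumulator would not be enough in the worst case: sixteen maximal products sum to 2^34.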

FIR Filter Theory

A 16-tap FIR computes:

$$ y[n] = \sum_{k=0}^{15} h[k] \cdot x[n-k] $$
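
The sum can be evaluated directly in integer arithmetic. The sketch below is a plain Python reference, not the repository's Verilog; `fir_direct` is a hypothetical name, and samples before the start of the input are assumed to be zero. A 3-tap example is used to keep the output readable; the same function works for 16 taps.

```python
from typing import List, Sequence


def fir_direct(x: Sequence[int], h: Sequence[int]) -> List[int]:
    """Evaluate y[n] = sum_k h[k] * x[n-k], treating x[n] as 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y


h = [1, 2, 3]
x = [1, 0, 0, 0, 5]
print(fir_direct(x, h))  # [1, 2, 3, 0, 5]
```

The leading impulse reproduces the coefficients in order, which is a quick sanity check for any FIR implementation.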

FIR properties:

  • Always stable (no feedback path)
  • Linear phase when the coefficients are symmetric
  • Deterministic timing
  • Well suited to audio, comms, and sensor filtering

Coefficient Design

The filter uses the following symmetric low-pass FIR coefficients:

[-84, -53, 120, 240, 350, 420, 450, 460,
 460, 450, 420, 350, 240, 120, -53, -84]

Why they were chosen:

  • Symmetry → linear phase
  • Smooth shape → good attenuation
  • Small magnitudes (Σ|h[k]| = 4354, about 0.133 in Q1.15) → the output cannot overflow for any 16-bit input
  • 16 taps → reasonable area/latency tradeoff
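
The properties listed for the coefficient set can be checked numerically. This sketch assumes the listed integers are raw Q1.15 values (that is, each one is divided by 2^15 to get the real coefficient); the variable names are illustrative only.

```python
COEFFS = [-84, -53, 120, 240, 350, 420, 450, 460,
          460, 450, 420, 350, 240, 120, -53, -84]

# Symmetry about the centre implies linear phase.
is_symmetric = COEFFS == COEFFS[::-1]

# DC gain is the sum of the coefficients; the worst-case gain is the
# sum of their absolute values (both scaled from Q1.15 to real numbers).
dc_sum = sum(COEFFS)
abs_sum = sum(abs(c) for c in COEFFS)

print(is_symmetric)                # True
print(dc_sum, abs_sum)             # 3806 4354
print(round(dc_sum / 2**15, 3))    # 0.116
print(round(abs_sum / 2**15, 3))   # 0.133
```

Under that assumption the passband gain at DC is about 0.116 rather than 1.0, so a full-scale constant input produces an output of roughly 11.6% of full scale.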

Architecture Overview

| Block | Description |
|---|---|
| Shift Register | Stores the last 16 samples |
| Coefficient ROM | Provides Q1.15 taps |
| MAC Core | 16 multipliers + 36-bit adder tree |
| Top-Level | Wiring + control |
| Testbench | Full cycle-accurate verification |

Latency: 2 cycles
Throughput: 1 output/clock


Module Descriptions


Shift Register

Stores 16 sequential samples (tap0 = newest).
Provides a stable 256-bit packed bus to the MAC core.

Waveform: Shift Register

Shows correct shifting of samples each cycle and proper reset initialization.
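
A behavioural sketch of the shift register in Python may help when reading the waveform. The class name `ShiftRegister`, the all-zero reset state, and the packing order (tap0 in the lowest 16 bits of the bus) are assumptions made for illustration; the actual Verilog may pack the bus differently.

```python
class ShiftRegister:
    """Behavioural model: taps[0] is the newest sample."""

    def __init__(self, depth: int = 16, width: int = 16) -> None:
        self.depth = depth
        self.width = width
        self.taps = [0] * depth  # assumed reset state: all zeros

    def clock(self, sample: int) -> None:
        """One clock edge: insert the new sample, drop the oldest."""
        self.taps = [sample] + self.taps[:-1]

    def flat(self) -> int:
        """Pack all taps into one bus (assumed: tap0 in the low bits)."""
        mask = (1 << self.width) - 1
        bus = 0
        for i, tap in enumerate(self.taps):
            bus |= (tap & mask) << (i * self.width)
        return bus


sr = ShiftRegister()
for s in (1, 2, 3):
    sr.clock(s)

print(sr.taps[:4])     # [3, 2, 1, 0]
print(hex(sr.flat()))  # 0x100020003
```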


Coefficient ROM

Provides 16 constant signed Q1.15 coefficients.
Zero latency, fully synthesizable.


MAC Core

Stage 1 — Parallel Multipliers

Each tap × coefficient pair is multiplied in parallel
and stored in 32-bit registers.

Waveforms: MAC Products 1, MAC Products 2, MAC Products 3

Shows correct multiplier timing and stable registered outputs.


Stage 2 — 36-bit Adder Tree

A balanced four-level adder tree reduces the 16 registered products to a single sum:

16 → 8 → 4 → 2 → 1

Waveforms: MAC Part 1, MAC Part 2, MAC Part 3

Shows progressive accumulation of partial sums without overflow.
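
The pairwise reduction can be sketched in a few lines of Python. `adder_tree` is a hypothetical helper, not a module in this repository; it assumes the number of inputs is a power of two and uses Python's unbounded integers, so it models the arithmetic but not the 36-bit width.

```python
from typing import List, Sequence, Tuple


def adder_tree(values: Sequence[int]) -> Tuple[int, List[int]]:
    """Sum values pairwise, level by level; also return each level's size."""
    level = list(values)
    assert level and len(level) & (len(level) - 1) == 0, "need a power of two"
    sizes = [len(level)]
    while len(level) > 1:
        # Each adder combines two neighbours from the previous level.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        sizes.append(len(level))
    return level[0], sizes


total, sizes = adder_tree(range(1, 17))
print(sizes)  # [16, 8, 4, 2, 1]
print(total)  # 136
```

A balanced tree has a depth of log2(16) = 4 adders, compared with 15 adders in series for a linear chain, which shortens the combinational path.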


Final Pipeline Stage

Registers the final 36-bit result → ensures deterministic 2-cycle latency.

Waveform: Output Latency

Shows exact 2-cycle delay between input activity and final output.
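
The two-cycle behaviour can be reproduced with a small cycle-based model. This Python sketch assumes the two cycles correspond to the product registers and the output register, measured from the tap bus at the MAC input; the class name `MacPipeline` is hypothetical, and how the actual design accounts for the shift register's own register is not stated here.

```python
from typing import List, Sequence


class MacPipeline:
    """Two register stages: products, then the summed output."""

    def __init__(self, coeffs: Sequence[int]) -> None:
        self.coeffs = list(coeffs)
        self.products = [0] * len(self.coeffs)  # stage-1 registers
        self.y = 0                              # stage-2 (output) register

    def clock(self, taps: Sequence[int]) -> int:
        """One clock edge; both stages update from pre-edge values."""
        next_y = sum(self.products)
        next_products = [c * t for c, t in zip(self.coeffs, taps)]
        self.products, self.y = next_products, next_y
        return self.y


mac = MacPipeline([1, 1, 1, 1])
tap_vectors: List[List[int]] = [[1, 0, 0, 0], [0] * 4, [0] * 4, [0] * 4]
outputs = [mac.clock(t) for t in tap_vectors]
print(outputs)  # [0, 1, 0, 0]
```

The taps applied before the first edge show up at the output after the second edge, and a new result still emerges on every edge, which matches the stated throughput of one output per clock.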


Top-Level

Connects the entire FIR structure:

sample_in ──► shift_reg ──► samples_flat
(samples_flat, coeffs) ──► MAC ──► y_out

The same clock and reset drive every module, so all registers update on the same clock edge.


Self-Checking Testbench

Features:

  • Software reference FIR
  • 2-cycle aligned comparison
  • Automatic mismatch detection
  • Stimuli: ramp, step, alternating, random
  • Generates dump.vcd for waveform inspection
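
The checking strategy listed above can be sketched in Python. This is not the repository's Verilog testbench: `reference_fir`, `pipelined_model`, and `find_mismatches` are hypothetical names, the pipelined model is only a stand-in for the DUT, and the stimulus lengths and amplitudes are arbitrary choices. Outputs are assumed to be sampled after each clock edge.

```python
import random

COEFFS = [-84, -53, 120, 240, 350, 420, 450, 460,
          460, 450, 420, 350, 240, 120, -53, -84]
LATENCY = 2


def reference_fir(x, h):
    """Golden model: direct evaluation of y[n] = sum_k h[k] * x[n-k]."""
    return [sum(hk * x[n - k] for k, hk in enumerate(h) if n - k >= 0)
            for n in range(len(x))]


def pipelined_model(x, h):
    """Stand-in DUT: shift register feeding two register stages."""
    taps, products, y, out = [0] * len(h), [0] * len(h), 0, []
    for sample in list(x) + [0] * LATENCY:  # extra cycles flush the pipeline
        y = sum(products)                            # uses pre-edge products
        products = [c * t for c, t in zip(h, taps)]  # uses pre-edge taps
        taps = [sample] + taps[:-1]                  # shift in the new sample
        out.append(y)
    return out


def find_mismatches(dut_out, x, h):
    """Compare DUT output n + LATENCY against reference output n."""
    ref = reference_fir(x, h)
    return [n for n, exp in enumerate(ref) if dut_out[n + LATENCY] != exp]


rng = random.Random(0)  # fixed seed keeps the random stimulus repeatable
stimuli = {
    "ramp": list(range(0, 3200, 100)),
    "step": [0] * 8 + [16384] * 24,
    "alternating": [32767 if i % 2 == 0 else -32768 for i in range(32)],
    "random": [rng.randint(-32768, 32767) for _ in range(32)],
}

for name, x in stimuli.items():
    bad = find_mismatches(pipelined_model(x, COEFFS), x, COEFFS)
    print(f"{name}: {'PASS' if not bad else 'FAIL at ' + str(bad)}")

# Inject a single-bit error to confirm the checker reports it.
faulty = pipelined_model(stimuli["ramp"], COEFFS)
faulty[5 + LATENCY] += 1
print(find_mismatches(faulty, stimuli["ramp"], COEFFS))  # [5]
```

Injecting a known error, as in the last three lines, is a cheap way to confirm that a self-checking bench can actually fail.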

Possible Improvements

  • Programmable coefficients
  • 32/64/128-tap variants
  • Deeper pipelining (e.g. registering each adder-tree level) for clock targets above 500 MHz
  • SIMD multi-channel FIR
  • AXI-Stream interface
  • Half-band / polyphase design

Conclusion

This project demonstrates a modular, synthesizable, pipelined FIR filter with:

  • Q-format arithmetic
  • Parallel multipliers
  • Balanced adder tree
  • Deterministic timing
  • Self-checking verification

Ideal for FPGA, ASIC, and DSP learning environments.


License

Open for academic and educational use.
Modifications are welcome with credit.

About

16-tap fixed-point FIR low-pass filter in Verilog, with pipelined MAC datapath, coefficient ROM and shift-register taps. Includes a self-checking testbench with built-in reference model and waveform generation for robust functional verification.
