An Educational Reference & Complete Manual for FPGA-Based Image Processing. Inspired by https://github.com/Gowtham1729/Image-Processing
This project demonstrates real-time image processing operations implemented on the Zedboard FPGA platform with live VGA display output. It's designed as an educational resource, providing insights into hardware-software co-design, digital signal processing, and FPGA development using Vivado.
- What is an FPGA? (Educational Overview)
- Project Overview
- Features & Image Processing Operations
- Hardware Architecture
- Project Architecture
- Prerequisites & Setup
- Getting Started
- Detailed Usage Instructions
- Image Processing Operations Reference
- COE File Generation: In Depth
- Technical Deep Dive
- Python Utility Scripts
- Troubleshooting & Common Issues
- Performance Metrics
- Contributing
- License
- References & Further Reading
FPGA = Field Programmable Gate Array
An FPGA is a semiconductor device that contains an array of programmable logic blocks and interconnects. Unlike traditional microprocessors that execute instructions sequentially, FPGAs allow you to create custom hardware configurations tailored to your specific application.
A typical FPGA consists of several key components:
- Configurable Logic Blocks (CLBs)
  - Contain lookup tables (LUTs), multiplexers, and flip-flops
  - LUTs implement combinational logic functions
  - Flip-flops store state for sequential logic
  - Configurable to implement any Boolean logic function of their inputs
- Block RAM (BRAM)
  - Embedded memory blocks within the FPGA fabric
  - Typically 36 Kbits (or 18 Kbits) per block
  - Can be configured as various widths and depths
  - Enables efficient data storage for algorithms like ours (image pixel storage)
  - In this project: We use BRAM to store image data efficiently
- Programmable Interconnect
  - Programmable routing channels connecting logic blocks
  - Metal tracks at various hierarchical levels
  - Programmable switches at intersection points
  - Enables flexible data flow between components
- I/O Blocks (IOBs)
  - Bidirectional buffers connecting FPGA to external world
  - Programmable voltage standards (LVCMOS33, LVCMOS18, etc.)
  - In this project: VGA output pins, clock input, switch inputs
- Specialized Resources
  - Dedicated DSP slices (for multiplications)
  - Phase-locked loops (PLLs) for clock generation
  - Block memories (BRAM)
  - Zedboard includes ARM Cortex-A9 processors (not used in this project)
Programming Flow:
├─ Design → Specify logic in HDL (Verilog/VHDL)
├─ Synthesis → Convert HDL to logic gates
├─ Place & Route → Map logic to physical FPGA resources
├─ Bitstream Generation → Create configuration file
└─ Programming → Load bitstream into FPGA
Key Concept: The FPGA is "programmable in the field" - you can reprogram it with different logic designs without physically replacing hardware.
- Massive Parallelism
  - Traditional CPUs process pixels sequentially
  - FPGAs process multiple pixels in parallel
  - Can achieve real-time processing of high-resolution video
- Hardware Customization
  - Tailor bit-widths, precision, and operations to your needs
  - Avoid generic processor overhead
  - Optimize memory access patterns
- Low Latency
  - No operating system overhead
  - Hardware processes data combinatorially
  - Deterministic behavior
- Power Efficiency
  - No instruction fetch/decode cycles
  - Only active logic consumes power
  - Ideal for embedded systems
| Aspect | CPU | FPGA |
|---|---|---|
| Processing | Sequential (pipeline) | Parallel (hardware) |
| Latency | High (cycles for each operation) | Very Low (combinatorial) |
| Throughput | Good for general workloads | Excellent for data-parallel tasks |
| Power | Higher per operation | Lower for specialized tasks |
| Development | Easier (C/C++) | Harder (Verilog/VHDL) |
| Flexibility | High (any algorithm) | Lower (must fit hardware) |
This project implements real-time image processing on a Zedboard FPGA with live output to a VGA monitor. The key innovation is that all processing happens in hardware—no processor involvement—enabling real-time performance even with complex convolution operations.
- 16 Selectable Operations - Pixel-level and convolution-based filters
- Live VGA Output - 640×480 @ 60Hz real-time display
- Hardware-Accelerated Processing - All operations in FPGA fabric
- Single Python Script - coe_generator.py handles all image conversion
- Educational Design - Well-commented Verilog code with clear architecture
The system supports 16 different image processing operations, selectable via 4 DIP switches (SW0-SW3) on the Zedboard:
| Operation | sel_module (SW3:SW0) | Description |
|---|---|---|
| RGB to Grayscale | 0000 | Convert color image to grayscale using luminance formula |
| Increase Brightness | 0001 | Amplify pixel values (clip at 255) |
| Decrease Brightness | 0010 | Reduce pixel values (clip at 0) |
| Color Inversion | 0011 | Invert all color channels (255 - value) |
| Red Filter | 0100 | Isolate red channel, suppress green/blue |
| Blue Filter | 0101 | Isolate blue channel, suppress red/green |
| Green Filter | 0110 | Isolate green channel, suppress red/blue |
| Original Image | 0111 | Display original image unchanged |
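The pixel-level operations in the table can be modeled in software to check expected results. The following is a minimal Python sketch, not the project's Verilog: the function name `apply_pixel_op` and the brightness step of 50 are assumptions made for illustration.

```python
def clip8(value):
    """Clamp an integer to the 8-bit range 0..255."""
    return max(0, min(255, value))


def apply_pixel_op(sel, r, g, b, step=50):
    """Return (r, g, b) after the pixel-level operation selected by `sel`.

    `sel` is the 4-bit selector string from the table. `step` is a
    hypothetical brightness increment, not a value taken from the Verilog.
    """
    if sel == "0000":  # grayscale via the luminance formula
        gray = clip8(round(0.299 * r + 0.587 * g + 0.114 * b))
        return gray, gray, gray
    if sel == "0001":  # increase brightness, clip at 255
        return clip8(r + step), clip8(g + step), clip8(b + step)
    if sel == "0010":  # decrease brightness, clip at 0
        return clip8(r - step), clip8(g - step), clip8(b - step)
    if sel == "0011":  # color inversion
        return 255 - r, 255 - g, 255 - b
    if sel == "0100":  # red filter
        return r, 0, 0
    if sel == "0101":  # blue filter
        return 0, 0, b
    if sel == "0110":  # green filter
        return 0, g, 0
    if sel == "0111":  # original image
        return r, g, b
    raise ValueError(f"{sel!r} is not a pixel-level selector")
```

For example, `apply_pixel_op("0011", 255, 128, 64)` returns `(0, 127, 191)`.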
These operations process each pixel using values from 3×3 neighborhood:
Kernel Layout (pixel positions):
[TL] [T] [TR]
[L] [C] [R]
[BL] [B] [BR]
| Operation | sel_module (SW3:SW0) | Kernel | Purpose |
|---|---|---|---|
| Average Blur | 1000 | [1 1 1; 1 1 1; 1 1 1] / 9 | Smoothing/noise reduction |
| Sobel Edge | 1001 | Sobel operators | Edge detection with gradient |
| Edge Detection | 1010 | [-1 -1 -1; -1 8 -1; -1 -1 -1] | Detect rapid intensity changes |
| Motion Blur (XY) | 1011 | [1 0 0; 0 1 0; 0 0 1] / 3 | Blur diagonally (TL to BR) |
| Emboss | 1100 | [-2 -1 0; -1 1 1; 0 1 2] | Create 3D embossed effect |
| Sharpen | 1101 | [0 -1 0; -1 5 -1; 0 -1 0] | Enhance edges and details |
| Motion Blur (Y) | 1110 | [1 0 0; 1 0 0; 1 0 0] / 3 | Blur vertically (top to bottom) |
| Gaussian Blur | 1111 | [1 2 1; 2 4 2; 1 2 1] / 16 | Smooth with Gaussian weighting |
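The single-kernel rows of the table can be checked with a small software reference. This is a sketch under stated assumptions: integer division and clamping to 0-255 are assumed, and the names `KERNELS` and `convolve_pixel` are hypothetical. Sobel is left out because it combines two kernels.

```python
# (kernel, divisor) per selector, copied from the table above
KERNELS = {
    "1000": ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 9),         # average blur
    "1010": ([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], 1),  # edge detection
    "1011": ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], 3),         # motion blur (XY)
    "1100": ([[-2, -1, 0], [-1, 1, 1], [0, 1, 2]], 1),      # emboss
    "1101": ([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], 1),     # sharpen
    "1110": ([[1, 0, 0], [1, 0, 0], [1, 0, 0]], 3),         # motion blur (Y)
    "1111": ([[1, 2, 1], [2, 4, 2], [1, 2, 1]], 16),        # Gaussian blur
}


def convolve_pixel(window, sel):
    """Apply the kernel for `sel` to a 3x3 window of 8-bit values.

    `window` rows are [TL, T, TR], [L, C, R], [BL, B, BR].
    """
    kernel, divisor = KERNELS[sel]
    total = sum(kernel[i][j] * window[i][j] for i in range(3) for j in range(3))
    return max(0, min(255, total // divisor))  # clamp to the 8-bit range
```

On a flat window every blur returns the input value and the edge detector returns 0, which is a quick sanity check for any kernel whose weights sum to the divisor (or to zero).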
The Zedboard is an ARM+FPGA embedded development platform featuring:
- Zynq-7000 SoC (XC7Z020)
  - Dual-core ARM Cortex-A9 processors (not used in this project)
  - Artix-7-class FPGA fabric (about 85,000 logic cells)
  - Block RAM: 4.9 Mb total (140 blocks of 36 Kb)
  - 220 DSP slices
- I/O Connectivity
  - 12-bit VGA output (4 bits each for red, green, and blue)
  - 8 slide switches (SW0-SW3 select the operation, SW7 is reset)
  - 100 MHz system clock
  - USB programming interface
Our design uses the following pins (from const1.xdc):
VGA Output (Bank 33 - 3.3V)
Clock Input (GCLK): Y9 (100 MHz)
VGA Red[0-3]: V20, U20, V19, V18
VGA Green[0-3]: AB22, AA22, AB21, AA21
VGA Blue[0-3]: Y21, Y20, AB20, AB19
VGA Hsync: AA19
VGA Vsync: Y19
Control Inputs (Bank 35 - 1.8V)
sel_module[0-3] (SW0-3): F22, G22, H22, F21
reset (SW7): M15
The Video Graphics Array (VGA) standard defines timing for analog video output:
Horizontal Timing (in pixel clocks):
├─ Visible pixels: 640
├─ Front porch: 16
├─ Hsync pulse: 96
└─ Back porch: 48
Total: 800 pixel clocks per line
Vertical Timing (in scan lines):
├─ Visible lines: 480
├─ Front porch: 10
├─ Vsync pulse: 2
└─ Back porch: 33
Total: 525 lines per frame
- Hsync = 0 during 96-pixel pulse, 1 otherwise
- Vsync = 0 during 2-line pulse, 1 otherwise
- RGB data valid only when not in blanking interval
- Blanking Interval = H or V porch/sync period
Our Verilog implementation generates these timing signals with hardware counters.
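The timing rules above can be written as a small software model. This Python sketch only restates the porch and sync widths listed in this section; the function name `vga_signals` is an assumption and the code is not a translation of the Verilog.

```python
# 640x480 @ 60 Hz timing from the lists above (pixel clocks / scan lines)
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # 800
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # 525


def vga_signals(hc, vc):
    """Return (hsync, vsync, active) for pixel column hc and scan line vc."""
    h_sync_start = H_VISIBLE + H_FRONT
    v_sync_start = V_VISIBLE + V_FRONT
    # Both sync signals are active low during their pulse
    hsync = 0 if h_sync_start <= hc < h_sync_start + H_SYNC else 1
    vsync = 0 if v_sync_start <= vc < v_sync_start + V_SYNC else 1
    # RGB data is valid only outside the blanking intervals
    active = hc < H_VISIBLE and vc < V_VISIBLE
    return hsync, vsync, active
```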
We use Xilinx Blk_Mem_Gen IP core with the following configuration:
Memory Type: Single Port RAM
Width: 96 bits per word
Depth: 32,768 words (for 160×200 image with 9-pixel data)
Memory Size: 32,768 × 96 bits = 3,145,728 bits (384 KiB, about 3 Mb)
Address: 15-bit (0 to 32,767)
Read Latency: 1 cycle (registered output)
Operating Mode: WRITE_FIRST
Initial File: .coe file (COE format)
Data Organization (96 bits):
Bits 95-88: Blue value from top-left neighbor (leftup)
Bits 87-80: Green value from left neighbor
Bits 79-72: Red value from right neighbor
Bits 71-64: Blue value from top neighbor (up)
Bits 63-56: Blue value from bottom neighbor (down)
Bits 55-48: Blue value from top-left neighbor (leftup)
Bits 47-40: Blue value from bottom-left neighbor (leftdown)
Bits 39-32: Blue value from top-right neighbor (rightup)
Bits 31-24: Blue value from bottom-right neighbor (rightdown)
Bits 23-16: Blue channel of current pixel
Bits 15-8: Green channel of current pixel
Bits 7-0: Red channel of current pixel
Pixel Layout (as matrix):
[leftup] [up] [rightup]
[left] [center] [right]
[leftdown][down] [rightdown]
This organization allows single-cycle access to the center pixel and all 8 of its neighbors.
Input Image (BMP/JPG/PNG)
↓
coe_generator.py Script
↓
COE File (Binary pixel data)
↓
FPGA BRAM Initialization
↓
[FPGA Pipeline] ←← Clock signal (100 MHz)
├─ Address Counter (generates pixel addresses)
├─ BRAM Output (96-bit pixel data with neighbors)
├─ Pixel Processing (convolution or simple operation)
├─ Output Formatter (4-bit RGB for VGA)
└─ VGA Controller (generates sync signals & RGB data)
↓
VGA Monitor Output (Real-time display @60Hz)
For convolution operations, the system:
- Reads pixel data from BRAM in parallel (96-bit word)
- Extracts 9 pixel neighborhood from the 96-bit word
- Applies kernel weights (sum of weighted products)
- Clamps result to valid range (0-255 or 0-1024 depending on operation)
- Quantizes to 4-bit per channel (0-15 for VGA display)
- Outputs to VGA in real-time at pixel clock rate
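The clamp and quantize steps in the list above can be sketched in Python as follows. The helper names are hypothetical; the shift by 4 assumes the divide-by-16 quantization that the Gaussian blur branch uses to reach 4 bits per channel.

```python
def clamp(value, low=0, high=255):
    """Limit a convolution result to the valid 8-bit range."""
    return max(low, min(high, value))


def to_vga_nibble(value8):
    """Keep the top 4 bits of an 8-bit value (same as dividing by 16)."""
    return clamp(value8) >> 4
```

A full-scale value of 255 maps to 15 and mid-gray 128 maps to 8, so the display keeps 16 levels per channel.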
Performance:
- Pixel clock: ~25 MHz (VGA 640×480 @ 60Hz)
- Processing: Fully combinatorial (single-cycle)
- Real-time capability: Yes, can display full frames at 60 FPS
- Xilinx Vivado Design Suite (2018.3 or later)
  - Free WebPACK version sufficient for XC7Z020
  - Download: https://www.xilinx.com/support/download.html
  - Install with Zynq-7000 and simulation tools support
- Python 3.6+
  pip install opencv-python
- Git (for cloning repository)
- Text Editor or IDE (for editing Verilog)
  - Vivado IDE included
  - VSCode with Verilog extensions (optional)
- Zedboard FPGA Development Board
  - Zynq-7020 FPGA
  - 512MB DDR3 RAM
  - Micro-USB for programming
  - ~$200-300
- VGA Monitor
  - Standard 640×480 or higher resolution
  - VGA connector (D-sub 15)
  - Any modern monitor with VGA adapter
- USB Cable
  - Micro-USB (for FPGA programming)
  - Included with Zedboard
- Power Supply
  - 12V, 2.5A minimum
  - Included with Zedboard
- Computer
  - Windows 10/11, Linux, or macOS
  - 50 GB free disk space (for Vivado)
  - 8 GB RAM minimum (16 GB recommended)
# Download from: https://www.xilinx.com/support/download.html
# WebPACK version (free) is sufficient
# Follow installer prompts:
# - Select "Vivado Design Suite"
# - Select "Zynq-7000" device support
# - Select "Install for Linux/Windows"
# Clone or download this repository
git clone https://github.com/yourusername/Zedboard-Image-Processing-FPGA.git
cd Zedboard-Image-Processing-FPGA
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install opencv-python
Xilinx provides board definition files for automatic constraint generation:
# Download Zedboard board files from:
# https://github.com/Xilinx/XilinxBoardStore
# Or manually add to Vivado:
# ~/.Xilinx/Vivado/2024.1/data/boards/board_files/zedboard/
Your image should ideally be 160×200 pixels for optimal display. The script accepts any resolution and writes one COE word per pixel, so keep the total at or below 32,768 pixels (the BRAM depth).
# Resize an existing image to 160×200 using Python (requires Pillow: pip install pillow)
python3 - input.jpg resized.bmp << 'EOF'
from PIL import Image
import sys
# With "python3 -", the arguments after the dash arrive in sys.argv[1:]
input_path = sys.argv[1] if len(sys.argv) > 1 else "input.jpg"
output_path = sys.argv[2] if len(sys.argv) > 2 else "resized.bmp"
img = Image.open(input_path)
img = img.resize((160, 200), Image.Resampling.LANCZOS)
img.save(output_path, format='BMP')
print(f"Resized image saved to {output_path}")
EOF
# Example:
# python3 - your_image.jpg test_images/my_image.bmp << 'EOF' ... EOF
Important Notes:
- Use 24-bit BMP, PNG, or JPG formats (avoids palette issues)
- Images with more than 32,768 pixels exceed the BRAM depth and must be resized first
- 160×200 is the reference size used in this project
- The script writes one COE word per pixel and does not resize the image
The coe_generator.py script is your single utility for image conversion:
# Basic usage
python scripts/coe_generator.py input_image.jpg output_image.coe
# Examples:
python scripts/coe_generator.py test_images/flower.bmp coe_files/flower.coe
python scripts/coe_generator.py test_images/photo.png coe_files/photo.coe
# With full paths
python scripts/coe_generator.py /path/to/image.jpg /path/to/output.coe
# Show help
python scripts/coe_generator.py --help
python scripts/coe_generator.py --version
Output:
Starting conversion: 'test_images/flower.bmp' -> 'coe_files/flower.coe'
Successfully wrote COE file to 'coe_files/flower.coe' with 32000 pixel entries.
Generated File Format:
memory_initialization_radix=2;
memory_initialization_vector=
000000000000000000000000000000000000000000000000000000000000000000000000010101010010101101100101,
000000000000000000000000000000000000000000000000000000000000000000000000010101111010110101110101,
...
000000000000000000000000000000000000000000000000000000000000000000000000011001100110011001100110;
Each line represents one pixel in 96-bit binary format.
# On Linux/macOS:
vivado fpga_design/VGA_1.xpr &
# On Windows:
# Double-click fpga_design/VGA_1.xpr
In Vivado:
- Open IP Sources
  - In the Design Sources panel (left)
  - Expand: Design Sources → IP → image
  - Double-click image.xci (Block RAM IP)
- Configure Block RAM
  - Click "Edit"
  - In IP Customization window:
    - Find parameter: "Coe_File"
    - Set value to: path/to/your/coe_files/flower.coe
    - Click "OK" to save
- Regenerate IP
  - Right-click on image.xci in IP Sources
  - Select "Regenerate"
  - Wait for completion
Constraints are already defined in const1.xdc:
# VGA Output pins (verify in Device window):
set_property PACKAGE_PIN Y21 [get_ports {blue[0]}]; # VGA-B1
set_property PACKAGE_PIN Y20 [get_ports {blue[1]}]; # VGA-B2
... (all VGA pins defined)
# DIP Switch inputs (verify):
set_property PACKAGE_PIN F22 [get_ports {sel_module[0]}]; # SW0
set_property PACKAGE_PIN G22 [get_ports {sel_module[1]}]; # SW1
set_property PACKAGE_PIN H22 [get_ports {sel_module[2]}]; # SW2
set_property PACKAGE_PIN F21 [get_ports {sel_module[3]}]; # SW3
set_property PACKAGE_PIN M15 [get_ports {reset}]; # SW7
In Vivado:
- Run Synthesis
  - Click: Flow → Run Synthesis
  - Wait ~2-5 minutes
  - Should complete without errors
  - Review synthesis log for warnings
- Resolve Common Issues
  - Unused logic warnings: Normal (not all filters used)
  - Undriven net warnings: Check port connections
  - Signal widths: Verify RTL schematic
- Run Implementation
  - Click: Flow → Run Implementation
  - Wait ~3-10 minutes
  - Review placement and routing utilization
- Check Resource Usage
  - Expected utilization:
    - LUTs: ~20%
    - Registers: ~15%
    - BRAM: ~90% (mostly image storage)
    - DSPs: <5%
- Generate Bitstream
  - Click: Flow → Generate Bitstream
  - Wait ~2-3 minutes
  - Creates: fpga_design/VGA_1.runs/impl_1/vga_syncIndex.bit
- Connect Zedboard
  - Micro-USB to computer (labeled USB PROG)
  - Power on (green LED indicates 3.3V)
  - Connect VGA monitor to FPGA VGA port
- Program Device
  - In Vivado: Tools → Program Device
  - Select: Zedboard XC7Z020
  - Select bitstream: vga_syncIndex.bit
  - Click "Program"
  - Wait ~30 seconds for completion
  - Monitor should show processed image
Once programmed, the system responds to physical switches:
| Switch | Port | Function |
|---|---|---|
| SW0 | sel_module[0] | Operation selection bit 0 |
| SW1 | sel_module[1] | Operation selection bit 1 |
| SW2 | sel_module[2] | Operation selection bit 2 |
| SW3 | sel_module[3] | Operation selection bit 3 |
| SW7 | reset | Reset signal (active high) |
Combine SW0-SW3 to select operation:
Binary (SW3:SW0) → Operation
0000 → RGB to Grayscale
0001 → Increase Brightness
0010 → Decrease Brightness
0011 → Color Inversion
0100 → Red Filter
0101 → Blue Filter
0110 → Green Filter
0111 → Original Image
1000 → Average Blur
1001 → Sobel Edge Detection
1010 → Edge Detection
1011 → Motion Blur (XY)
1100 → Emboss
1101 → Sharpen
1110 → Motion Blur (Y)
1111 → Gaussian Blur
# To change operation:
1. Toggle switches SW0-SW3 to desired binary value
2. Watch VGA monitor - image updates within 1 frame (~16ms)
3. No system reset needed between operations
4. Processing happens in real-time at 60 FPS
# To reset processing:
1. Toggle SW7 (reset switch)
2. Hold for 100ms
3. Processing resumes
Example:
To select "Gaussian Blur" (1111):
- Set SW0, SW1, SW2, SW3 = ON (up position)
- Result: 4-bit value = 1111 (binary) = 15 (decimal)
- Monitor displays Gaussian-blurred image in real-time
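The switch-to-operation mapping above amounts to reading SW3:SW0 as a 4-bit index. A minimal Python sketch of that decoding follows; the names `OPERATIONS` and `selected_operation` are hypothetical, and the list order is taken from the mapping above.

```python
# Operation names indexed by the 4-bit value of SW3:SW0
OPERATIONS = [
    "RGB to Grayscale", "Increase Brightness", "Decrease Brightness",
    "Color Inversion", "Red Filter", "Blue Filter", "Green Filter",
    "Original Image", "Average Blur", "Sobel Edge Detection",
    "Edge Detection", "Motion Blur (XY)", "Emboss", "Sharpen",
    "Motion Blur (Y)", "Gaussian Blur",
]


def selected_operation(sw3, sw2, sw1, sw0):
    """Return (sel_module value, operation name) for the switch states (0 or 1)."""
    sel = (sw3 << 3) | (sw2 << 2) | (sw1 << 1) | sw0
    return sel, OPERATIONS[sel]
```

With all four switches on, `selected_operation(1, 1, 1, 1)` returns `(15, "Gaussian Blur")`.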
These operations process each pixel independently without neighboring pixel information.
Formula:
Gray = 0.299 * R + 0.587 * G + 0.114 * B
Implementation:
red_o = (tred >> 2) + (tred >> 5) + (tgreen >> 1) +
        (tgreen >> 4) + (tblue >> 4) + (tblue >> 5);
Use cases: Black & white processing, edge detection preprocessing
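The shift-and-add expression replaces the floating-point weights with sums of powers of two: 1/4 + 1/32 ≈ 0.281 for red, 1/2 + 1/16 ≈ 0.563 for green, and 1/16 + 1/32 ≈ 0.094 for blue. The Python sketch below mirrors that expression next to the exact formula so the approximation error can be inspected; the function names are assumptions.

```python
def gray_shift_add(r, g, b):
    """Mirror of the shift-and-add grayscale expression shown above."""
    return (r >> 2) + (r >> 5) + (g >> 1) + (g >> 4) + (b >> 4) + (b >> 5)


def gray_reference(r, g, b):
    """Exact luminance formula, for comparison."""
    return 0.299 * r + 0.587 * g + 0.114 * b


if __name__ == "__main__":
    for pixel in [(255, 255, 255), (255, 128, 64), (0, 0, 0)]:
        print(pixel, gray_shift_add(*pixel), round(gray_reference(*pixel), 1))
```

Because the approximate weights sum to about 0.94, pure white comes out as 234 instead of 255, so the result never overflows 8 bits.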
Red Filter:
red_o = tred; // Keep red channel
green_o = 0; // Zero green
blue_o = 0; // Zero blue
Green/Blue: Similar channel isolation
Use cases: Color segmentation, channel extraction, debugging
These use 3×3 kernel convolution - weighted sum of 9-pixel neighborhood:
Output = Σ(Kernel[i,j] * Pixel[i,j]) for i,j ∈ {-1,0,1}
Kernel:
[ 1 1 1 ]
[ 1 1 1 ] / 9
[ 1 1 1 ]
Effect: Smooths image by averaging neighbors, reduces noise
Implementation:
r = (gray + left + right + up + down + leftup +
     leftdown + rightup + rightdown) / 9;
Sobel Gx (vertical edges):
[-1 0 1]
[-2 0 2]
[-1 0 1]
Sobel Gy (horizontal edges):
[-1 -2 -1]
[ 0 0 0]
[ 1 2 1]
Magnitude: Sqrt(Gx² + Gy²) (approximated in hardware)
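The source does not say which approximation the hardware uses. A common choice that avoids the square root is |Gx| + |Gy|, and the Python sketch below compares it with the exact magnitude. Treat it as an illustration of that assumed technique, not as a description of the Verilog; the names are hypothetical.

```python
import math

SOBEL_GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(window):
    """Return (exact, approx) edge magnitudes for a 3x3 grayscale window.

    exact  = sqrt(Gx^2 + Gy^2), rounded
    approx = |Gx| + |Gy| (assumed hardware-friendly substitute)
    Both are clamped to 255.
    """
    gx = sum(SOBEL_GX[i][j] * window[i][j] for i in range(3) for j in range(3))
    gy = sum(SOBEL_GY[i][j] * window[i][j] for i in range(3) for j in range(3))
    exact = round(math.hypot(gx, gy))
    approx = abs(gx) + abs(gy)
    return min(255, exact), min(255, approx)
```

The two values agree on purely horizontal or vertical edges and differ most on diagonals, where the sum overestimates by up to a factor of about 1.41.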
Kernel:
[ 0 -1 0]
[-1 5 -1]
[ 0 -1 0]
Effect: Enhances edges and details by subtracting blurred version
Kernel:
[ 1 2 1 ]
[ 2 4 2 ] / 16
[ 1 2 1 ]
Effect: Smoother blur than average filter, weighted toward center pixel
The coe_generator.py script is the single point of entry for converting images to FPGA-compatible binary format. It elegantly handles all image types and sizes through a clean, well-designed pipeline.
User Command Line
↓
argparse validates inputs
↓
image_to_coe() function
↓
├─ cv2.imread() loads image in BGR order
├─ Iterates through every pixel
├─ Extracts B, G, R channels
├─ Converts each to 8-bit binary string
├─ Pads with 72 zero bits (for future neighbor data)
├─ Formats as Xilinx COE standard
└─ Writes to output file
↓
Xilinx-compatible COE file
1. _convert_channel_to_8bit_binary(channel_value: int) -> str
Converts a single color channel (0-255) to 8-bit binary:
def _convert_channel_to_8bit_binary(channel_value: int) -> str:
if not 0 <= channel_value <= 255:
raise ValueError(f"Channel value {channel_value} must be between 0 and 255.")
return bin(channel_value)[2:].zfill(8)
# Examples:
_convert_channel_to_8bit_binary(0) # '00000000'
_convert_channel_to_8bit_binary(255) # '11111111'
_convert_channel_to_8bit_binary(128) # '10000000'
_convert_channel_to_8bit_binary(42) # '00101010'
2. image_to_coe(image_path: PathLike, coe_path: PathLike) -> None
Main conversion function with comprehensive error handling:
import os
import cv2

def image_to_coe(image_path, coe_path):
    # 1. Validate input file exists
    img_path_str = str(image_path)
    if not os.path.exists(img_path_str):
        raise FileNotFoundError(...)
    # 2. Load image with OpenCV
    image = cv2.imread(img_path_str)  # Returns BGR (not RGB!)
    if image is None:  # cv2.imread returns None when the file cannot be decoded
        raise FileNotFoundError(...)
    # 3. Process each pixel
    pixel_binary_strings = []
    for row_idx, row in enumerate(image):
        for col_idx, pixel_bgr in enumerate(row):
            b_channel, g_channel, r_channel = pixel_bgr
            # Convert each channel to 8-bit binary
            binary_b = _convert_channel_to_8bit_binary(int(b_channel))
            binary_g = _convert_channel_to_8bit_binary(int(g_channel))
            binary_r = _convert_channel_to_8bit_binary(int(r_channel))
            # Combine: BGR order (as loaded by OpenCV)
            combined_pixel_binary = binary_b + binary_g + binary_r
            # Pad with 72 zero bits
            padded_binary_pixel = '0' * 72 + combined_pixel_binary
            pixel_binary_strings.append(padded_binary_pixel)
    # 4. Write to COE file with proper formatting
    with open(coe_path, "w") as coe_file:
        coe_file.write("memory_initialization_radix=2;\n")
        coe_file.write("memory_initialization_vector=\n")
        for i, binary_string in enumerate(pixel_binary_strings):
            coe_file.write(binary_string)
            if i < len(pixel_binary_strings) - 1:
                coe_file.write(',\n')
            else:
                coe_file.write(';\n')  # Last entry ends with semicolon
3. main_cli()
Command-line interface with argument parsing:
python scripts/coe_generator.py input.jpg output.coe --version
Features:
- Input validation with helpful error messages
- Output path verification
- Version tracking (--version flag)
- Detailed exception handling
Each pixel becomes a 96-bit binary string:
Bit positions: 95-88 87-80 79-72 71-64 63-56 55-48 47-40 39-32 31-24 23-16 15-8 7-0
Data: [Padding ] [B8] [G8] [R8]
Description: 72 zero bits (reserved for future neighbor pixel data) | Blue Green Red
└─ 24 bits of pixel color (BGR order)
Example Conversion:
Input Pixel: RGB = (255, 128, 64) [Red=255, Green=128, Blue=64]
Step 1: Convert channels to binary
Red = 255 → 11111111
Green = 128 → 10000000
Blue = 64 → 01000000
Step 2: Reorder to BGR (OpenCV format)
BGR order = 01000000 10000000 11111111
Step 3: Pad with 72 zeros
96-bit = 000000000000000000000000000000000000000000000000000000000000000000000000010000001000000011111111
Step 4: Format for COE
Line in .coe = 000000000000000000000000000000000000000000000000000000000000000000000000010000001000000011111111,
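The worked example can be reproduced with a few lines of Python. This is a sketch of the packing rule described above (72 zero bits, then blue, green, red), not the script's actual code; `pixel_to_coe_word` is a hypothetical helper name.

```python
def pixel_to_coe_word(r, g, b, pad_bits=72):
    """Return the 96-bit binary string for one pixel: padding + B + G + R."""
    for name, value in (("red", r), ("green", g), ("blue", b)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} value {value} must be between 0 and 255")
    return "0" * pad_bits + format(b, "08b") + format(g, "08b") + format(r, "08b")


if __name__ == "__main__":
    word = pixel_to_coe_word(255, 128, 64)
    print(len(word), word[-24:])
```

For the example pixel RGB = (255, 128, 64) the word is 96 characters long and ends in `010000001000000011111111`.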
The 72-bit padding is reserved for future extensions that could include:
- 8 neighboring pixel values (8 neighbors × 8 bits = 64 bits)
- Additional metadata or processing flags (8 bits)
This design allows seamless upgrades to kernel-based operations without changing the core architecture.
# Type checking
if not hasattr(pixel_bgr, '__len__') or len(pixel_bgr) != 3:
raise ValueError(f"Pixel has {len(pixel_bgr)} components, expected 3")
# Value range checking
if not 0 <= channel_value <= 255:
    raise ValueError(f"Channel value {channel_value} out of range")
The script provides meaningful error messages at each stage:
Error Types:
├─ FileNotFoundError
│ └─ Input image doesn't exist or can't be read
├─ IOError
│ └─ Output COE file can't be written
├─ ValueError
│ └─ Pixel data is invalid (out of range, wrong format)
└─ Exception
└─ Unexpected OpenCV or processing errors
Example Error Output:
Error: Input image file not found at '/path/to/image.jpg'
Please ensure the input file path is correct and the file exists.
Data Error: Pixel at (10,20) has 4 components, expected 3 (BGR)
Please ensure the input image is a valid BGR image and pixel values are correct.
File I/O Error: Permission denied: '/output/flower.coe'
Please ensure you have write permissions for the output path and the path is valid.
For multiple images:
#!/bin/bash
# Convert all BMP files to COE
for image in test_images/*.bmp; do
output="coe_files/$(basename "$image" .bmp).coe"
python scripts/coe_generator.py "$image" "$output"
done
Our implementation generates VGA 640×480 @ 60Hz timing:
// Horizontal counter increments every pixel clock
always @(posedge pixel_clk)
hc <= hreset ? 0 : hc + 1;
// Compare values mark the count just before hsync changes; with a registered hsync the pulse spans pixels 656-751
assign hsyncon = (hc == 655);
assign hsyncoff = (hc == 751);
// hblank for front/back porch and sync
always @(posedge pixel_clk)
hblank <= hreset ? 0 : hblankon ? 1 : hblank;
// Similar logic for vertical timing
Timing Diagram:
Horizontal (one line):
0────639│640─655│656─────751│752─799│
Active  │Front  │ Hsync     │ Back  │
Pixels  │Porch  │ (pulse)   │ Porch │
↑ hblankon ↑ hreset
Frame Rate:
= Pixel Clock / (800 × 525 pixels per frame)
= 25.175 MHz / 420,000 ≈ 59.94 Hz
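The same arithmetic can be run for the clock this design actually generates. The Verilog in this document divides the 100 MHz system clock by four, which gives 25 MHz rather than the nominal 25.175 MHz. The Python sketch below (function name assumed) shows the resulting refresh rates.

```python
def frame_rate_hz(pixel_clock_hz, h_total=800, v_total=525):
    """Frames per second for a given pixel clock and total frame size."""
    return pixel_clock_hz / (h_total * v_total)


if __name__ == "__main__":
    print(round(frame_rate_hz(25_175_000), 2))       # nominal VGA pixel clock
    print(round(frame_rate_hz(100_000_000 / 4), 2))  # 100 MHz divided by 4
```

The divided clock yields roughly 59.5 Hz, slightly below the nominal 59.94 Hz.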
The Coefficient (COE) file is Xilinx's ASCII format for initializing BRAM:
memory_initialization_radix=2;
memory_initialization_vector=
011101010110100101100101...; (comma-separated 96-bit binary words; the last one ends with a semicolon)
Our 96-bit Format:
[72 zero bits for padding][24-bit BGR pixel]
Bits 95-88: Padding (zeros)
Bits 87-80: Padding (zeros)
Bits 79-72: Padding (zeros)
Bits 71-64: Padding (zeros)
Bits 63-56: Padding (zeros)
Bits 55-48: Padding (zeros)
Bits 47-40: Padding (zeros)
Bits 39-32: Padding (zeros)
Bits 31-24: Padding (zeros)
Bits 23-16: Blue channel (8-bit)
Bits 15-8: Green channel (8-bit)
Bits 7-0: Red channel (8-bit)
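To inspect a generated file, a word in this layout can be split back into its fields. The Python sketch below assumes exactly the bit positions listed above; `decode_coe_word` is a hypothetical name, and the fourth return value is the integer held in the 72 upper bits (zero for files produced as described).

```python
def decode_coe_word(word):
    """Split a 96-bit binary string into (red, green, blue, upper_bits)."""
    if len(word) != 96 or set(word) - {"0", "1"}:
        raise ValueError("expected a 96-character string of 0s and 1s")
    value = int(word, 2)
    red = value & 0xFF            # bits 7-0
    green = (value >> 8) & 0xFF   # bits 15-8
    blue = (value >> 16) & 0xFF   # bits 23-16
    upper_bits = value >> 24      # bits 95-24 (padding)
    return red, green, blue, upper_bits
```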
File Size Calculation:
For 160×200 image:
- Total pixels = 32,000
- BRAM words needed = 32,000 (1 word per pixel)
- Bits per word = 96
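- BRAM contents = 32,000 × 12 bytes = 384,000 bytes (384 KB)
- COE file on disk ≈ 32,000 × 98 bytes ≈ 3.1 MB (96 ASCII digits plus a separator and a newline per word)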
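The 12 bytes per word describe the data held in BRAM. The COE text file stores every bit as one ASCII character, so it is roughly eight times larger on disk. The Python sketch below computes both figures for any image size; the function name is an assumption, and the small two-line header is ignored.

```python
def coe_sizes(width, height, bits_per_word=96):
    """Return (words, bram_bytes, approx_file_bytes) for a radix-2 COE file."""
    words = width * height
    bram_bytes = words * bits_per_word // 8
    # Each text line holds bits_per_word digits, one ',' or ';', and a newline
    approx_file_bytes = words * (bits_per_word + 2)
    return words, bram_bytes, approx_file_bytes


if __name__ == "__main__":
    print(coe_sizes(160, 200))
```

For the 160×200 reference image this gives 32,000 words, 384,000 bytes of BRAM data, and a text file of about 3.1 MB.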
The main Verilog module vga_syncIndex.v contains:
module vga_syncIndex(
input clock, // 100 MHz system clock
input reset, // Active high reset
input[3:0] sel_module, // Operation selector
output reg hsync, // VGA horizontal sync
output reg vsync, // VGA vertical sync
output reg [3:0] red, // 4-bit red output
output reg [3:0] green, // 4-bit green output
output reg [3:0] blue // 4-bit blue output
);
// Divide 100 MHz system clock to ~25 MHz pixel clock
reg clk;
always@(posedge clock) clk <= ~clk; // Divide by 2 to 50 MHz
reg pcount;
wire pixel_clk;
always @ (posedge clk) pcount <= ~pcount;
assign pixel_clk = (pcount == 0); // Further divide to ~25 MHz
reg [9:0] hc, vc; // Horizontal and vertical counters
reg hblank, vblank;
// Reset counters at end of line/frame
assign hreset = ec & (hc == 799); // 800 pixels per line
assign vreset = hreset & (vc == 523); // vc counts 0-523 (524 lines; nominal VGA uses 525)
// Instantiate Block RAM IP core
image inst1(
.clka(clk),
.wea(read), // Write enable (always 0 in this design)
.addra(addra), // Address (incremented each pixel)
.dina(in1), // Input data (unused)
.douta(out2) // Output: 96-bit pixel + neighbors data
);
else if(sel_module == 4'b1111) begin // Gaussian blur
if(reset) begin
red = 0; green = 0; blue = 0;
end else begin
// Extract all 9 pixels from 96-bit BRAM word
r = (rightup + (2*up) + leftup +
(2*right) + (4*gray) + (2*left) +
rightdown + (2*down) + leftdown) / 16;
// Quantize to 4-bit for VGA
red_o = r / 16;
blue_o = r / 16;
green_o = r / 16;
red = {red_o[3:0]};
green = {green_o[3:0]};
blue = {blue_o[3:0]};
end
end
The memory organization enables single-cycle access to each pixel word:
Image Storage:
Address 0: Pixel (0,0) with padding
Address 1: Pixel (0,1) with padding
...
Address 159: Pixel (0,159) with padding
Address 160: Pixel (1,0) with padding
...
Pixel (i,j) stored at address = i*160 + j
Each address word contains:
- 72 zero bits (padding)
- 24 bits RGB pixel data (BGR order)
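The row-major addressing rule above can be checked with a short Python sketch. The constant and function names are assumptions; the 160×200 size and the 15-bit address width come from this document.

```python
IMAGE_WIDTH = 160
IMAGE_HEIGHT = 200
ADDRESS_BITS = 15  # BRAM depth of 32,768 words


def pixel_address(row, col):
    """Return the BRAM address of pixel (row, col) in row-major order."""
    if not (0 <= row < IMAGE_HEIGHT and 0 <= col < IMAGE_WIDTH):
        raise IndexError(f"pixel ({row},{col}) is outside the image")
    return row * IMAGE_WIDTH + col
```

The last pixel, (199, 159), lands at address 31,999, which fits in the 15-bit address range.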
Usage:
python scripts/coe_generator.py <input_image> <output_coe> [--version]
Parameters:
- input_image (positional): Path to input image (JPG, PNG, BMP, etc.)
- output_coe (positional): Path for output COE file
- --version: Display script version
Features:
- ✅ Supports all common image formats (OpenCV compatible)
- ✅ Automatic BGR → 96-bit conversion
- ✅ Comprehensive error handling with helpful messages
- ✅ Progress reporting with pixel count
- ✅ Type hints for code clarity
- ✅ Scalable to any image resolution
Example Usage:
# Single image conversion
python scripts/coe_generator.py flower.jpg coe_files/flower.coe
# Batch conversion with shell script
for img in *.jpg; do
python scripts/coe_generator.py "$img" "coe_files/${img%.jpg}.coe"
done
# With different paths
python scripts/coe_generator.py /input/path/image.png /output/path/image.coe
Return Values & Exit Codes:
Exit Code 0: Success - COE file generated
Exit Code 1: Error - File not found, permission, or data error
Possible Causes:
- Monitor not powered on
  - Solution: Check monitor power and VGA cable connection
- FPGA not programmed
  - Check LED indicators:
    - Green: Power OK
    - Red: Programming progress
    - Green again: Programming complete
  - Reprogram device in Vivado
- Wrong COE file loaded
  - Verify in Vivado project
  - Check path in image.xci IP configuration
  - Regenerate IP core
- VGA cable issues
  - Test monitor with another computer
  - Try different VGA cable
  - Check pin connector for bent/broken pins
Diagnostic Steps:
# In Vivado, check:
1. Bitstream generation completed without errors
2. Device programmed (green LED confirms)
3. No timing violations in implementation report
4. BRAM initialization file exists and is valid
Possible Causes:
- Image dimensions wrong
  - Expected: 160×200 pixels
  - Solution: Verify image size or resize
    python3 -c "from PIL import Image; print(Image.open('image.bmp').size)"
- COE file corrupted
  - Regenerate with Python script
  - Verify file size: about 3.1 MB for a 160×200 image (the radix-2 COE stores each bit as an ASCII character)
  - Check first and last lines of COE file
- Memory address exceeds bounds
  - Check image size doesn't exceed 32,768 pixels
  - For 160×200: 32,000 pixels (OK)
  - For 200×200: 40,000 pixels (TOO LARGE - adjust)
- Incorrect filter selected
  - Verify switch positions match intended operation
  - Check Verilog sel_module mapping
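The COE checks in the list above can be automated. The Python sketch below is a hypothetical helper, not part of coe_generator.py: it takes the file contents as a string and reports header, terminator, word-width, and depth problems, assuming the two-line header and one-word-per-line layout shown in this document.

```python
def validate_coe(text, word_bits=96, max_words=32768):
    """Return a list of problems found in radix-2 COE text (empty if none)."""
    problems = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "memory_initialization_radix=2;":
        problems.append("missing radix=2 header")
    if len(lines) < 2 or lines[1] != "memory_initialization_vector=":
        problems.append("missing vector header")
    words = lines[2:]
    if not words:
        problems.append("no data words")
        return problems
    if len(words) > max_words:
        problems.append(f"{len(words)} words exceed BRAM depth {max_words}")
    for index, entry in enumerate(words):
        # Every word ends with ',' except the last, which ends with ';'
        terminator = ";" if index == len(words) - 1 else ","
        if not entry.endswith(terminator):
            problems.append(f"word {index}: expected '{terminator}' terminator")
        bits = entry.rstrip(",;")
        if len(bits) != word_bits or set(bits) - {"0", "1"}:
            problems.append(f"word {index}: not a {word_bits}-bit binary string")
    return problems
```

A typical call would be `validate_coe(open("coe_files/flower.coe").read())`, using the example path from earlier in this document.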
Common Errors:
FileNotFoundError: Error: Input image file not found at '...'
→ Solution: Check file path is correct and file exists
python scripts/coe_generator.py --help
IOError: File I/O Error: Permission denied
→ Solution: Check write permissions on output directory
chmod 755 coe_files/
ValueError: Data Error: Pixel at (0,0) has 4 components
→ Solution: Image might be RGBA instead of RGB
Convert with: python3 << 'EOF'
from PIL import Image
img = Image.open('image.png').convert('RGB')
img.save('image_fixed.bmp')
EOF
Common errors:
ERROR: Unresolved reference to 'image'
→ Solution: Regenerate IP core (right-click → Regenerate)
ERROR: Port 'sel_module[4]' not found
→ Solution: sel_module is 4-bit [3:0], not [4:0]
ERROR: Unknown module 'vga_syncIndex'
→ Solution: Ensure VGA.v is added to project sources
Causes:
- Timing violation - Logic running slower than clock
- Oscillation - Unresolved feedback loop
- High resource utilization - Design too complex
Solution:
# Check timing report:
# In Vivado: Report → Timing Summary
# Look for "Worst Negative Slack" (should be > 0)
# If fails:
1. Increase clock period (reduce frequency)
2. Use pipelining for deep logic
3. Reduce logic complexity
| Metric | Value |
|---|---|
| System Clock | 100 MHz |
| Pixel Clock | ~25.175 MHz |
| Pixels per Frame | 640 × 480 = 307,200 |
| Frame Rate | 59.94 FPS (~60 Hz) |
| Processing Latency | <1 pixel clock (~40 ns) - combinatorial |
| Image Update Rate | 16.67 ms per frame |
| Resource | Used | Available | % |
|---|---|---|---|
| LUT | ~5,600 | 28,800 | 19% |
| Registers | ~4,200 | 57,600 | 7% |
| BRAM | 43 × 36K | 48 | 90% |
| DSP | 0 | 560 | 0% |
BRAM Read Latency: 1 cycle (~10 ns at 100 MHz clock)
BRAM Throughput: 1 word per cycle @ 100 MHz
= 96 bits × 100 MHz = 9.6 Gbit/s (1.2 GB/s) peak bandwidth
Pixel Throughput:
- 640 × 480 @ 60 Hz = 18.43 MP/s
- Each pixel reads 1 BRAM word
- Limited by VGA output rate, not FPGA processing
We welcome contributions! Areas for enhancement:
- Additional Filters
  - Bilateral filtering
  - Median filtering
  - Morphological operations
- Performance Improvements
  - Pipeline architecture
  - Parallel multi-pixel processing
  - Custom memory hierarchies
- User Interface
  - Add buttons for brightness adjustment
  - LED indicators for operation status
  - On-screen display of current operation
- Documentation
  - Detailed kernel mathematics
  - Verilog optimization techniques
  - Advanced FPGA concepts
To Contribute:
# Fork repository
git clone https://github.com/yourusername/fork.git
cd fork
# Create feature branch
git checkout -b feature/awesome-filter
# Make changes
# Commit with clear messages
git commit -am "Add awesome-filter operation"
# Push and create Pull Request
git push origin feature/awesome-filter
This project is licensed under the Apache License 2.0. See LICENSE file for details.
You are free to:
- Use commercially
- Modify the code
- Distribute
- Use privately
You must:
- Include license and copyright notice
- State significant changes made
- "Digital Design and Computer Architecture" by Harris & Harris
  - Comprehensive HDL design fundamentals
  - Timing analysis and optimization
- "FPGA Prototyping by Verilog Examples" by Chu
  - Practical Verilog examples
  - Real hardware implementations
- Xilinx Documentation
  - Vivado User Guide: https://docs.xilinx.com/
  - Zynq-7000 Technical Reference Manual
- "Digital Image Processing" by Gonzalez & Woods
  - Convolution theory and applications
  - Filter design mathematics
- OpenCV Documentation
  - Standard filter implementations
  - Algorithm references
- VGA Timing Specifications
  - https://en.wikipedia.org/wiki/Video_Graphics_Array
  - Detailed timing diagrams
- VESA Standards (Video Electronics Standards Association)
  - Official VGA timing specifications
- Zedboard Community Wiki
  - https://www.zedboard.org
  - Getting started guides
- Zynq-7000 User Guide
  - https://docs.xilinx.com/
  - Detailed hardware specifications
- Python Image Libraries
  - OpenCV: https://opencv.org/
  - PIL/Pillow: https://python-pillow.org/
  - NumPy: https://numpy.org/
Original Author: https://github.com/Gowtham1729
This version (modified): https://github.com/infinitecoder1729
Project Base: Image Processing Toolbox (Basys 3 adaptation)
Zedboard Adaptation: Modified for Zynq-7000 platform with enhanced VGA support
Credits:
- Zedboard community resources
- Xilinx education materials
- OpenCV algorithm references
Q: Can I use a different image size?
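A: Yes, as long as the image has at most 32,768 pixels (the BRAM depth). The COE generator writes one word per pixel at whatever resolution it is given. For optimal display, resize to 160×200.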
Q: Can I add real-time parameter adjustment?
A: Yes! Use additional switches or buttons to pass parameters to processing modules. Add an input bus for brightness, blur radius, etc.
Q: What if I don't have a Zedboard?
A: This project can adapt to any Xilinx FPGA with:
- ≥ 384 KB (about 3 Mb) of BRAM for the 32,768 × 96-bit image memory
- ≥ 10,000 logic cells
- VGA output pins available
- Similar development tools (Vivado)
Q: Can I use this for video processing?
A: Yes! Stream frames continuously by updating BRAM contents and maintaining VGA timing synchronization.
Q: How can I optimize further?
A: Consider pipelining, parallel pixel processing, or custom DSP implementations for specific filters.
Q: Does the COE generator support different image formats?
A: Yes! OpenCV supports JPG, PNG, BMP, TIFF, and most common formats automatically.
For issues, questions, or suggestions:
- GitHub Issues: Create an issue with:
  - Problem description
  - Steps to reproduce
  - System information (Vivado version, OS, board)
  - Error messages/logs
- Documentation: Check docs/ folder first
- Zedboard Forums: https://www.zedboard.org/forums
Last Updated: December 2025
Project Status: Active & Maintained
Vivado Compatibility: 2018.3+
Python Version: 3.6+
Happy FPGA Development! 🚀