A complete end-to-end system for recognizing maze images, extracting maze structure, and finding the optimal path from start to exit using computer vision and path-finding algorithms.
This project implements a complete maze solver system with the following capabilities:
- Image Acquisition: Read maze images from files or camera input
- Image Preprocessing: Enhance and clean the image to extract maze structure
- Structure Extraction: Convert the maze into a graph representation with nodes and edges
- Path Finding: Use A* algorithm to find the optimal path from start to exit
- Visualization: Display the solved maze overlaid on the original image
- Static Images: PNG, JPG, JPEG, BMP
- Camera Capture: With perspective correction
The system is divided into 4 main pipelines:
Input Image
↓
[Acquisition Pipeline] → Image Validation & Loading
↓
[Preprocessing Pipeline] → Image Enhancement & Binary Conversion
↓
[Structure Extraction Pipeline] → Graph Generation
↓
[Pathfinding Module] → A* Search Algorithm
↓
[Visualization] → Solution Output
- Loads images from file paths
- Validates image format and existence
- Returns image as numpy array
- Grayscale Conversion: Converts RGB to grayscale
- Perspective Correction: Corrects skewed images from camera capture
- Denoising: Median filtering to remove salt-and-pepper noise
- Thresholding: Otsu's method to create binary image
- Morphological Operations: Dilation and erosion for edge enhancement
- Background Removal: Isolates maze region from borders
- Endpoint Detection: Identifies maze start and exit points
- Skeletonization: Creates a 1-pixel wide representation of maze paths
- Cell Segmentation: Divides skeleton into grid cells
- Node Detection: Identifies junction points and endpoints
- Graph Construction: Builds adjacency list of connected nodes
- A Algorithm*: Finds optimal path using Manhattan distance heuristic
- Returns ordered list of nodes from start to goal
- Purpose: Convert grayscale image to binary (black/white)
- Method: Automatically finds optimal threshold value by minimizing within-class variance
- Implementation:
src/preprocessing/thresholding.py - Reference: Otsu, N. (1979). A threshold selection method from gray-level histograms
- Erosion: Shrinks white regions to remove thin noise
- Dilation: Expands white regions to fill small gaps
- Purpose: Clean binary image and connect broken paths
- Implementation:
src/preprocessing/morphology.py
- Purpose: Reduce maze paths to 1-pixel width while preserving connectivity
- Algorithm: Used from scikit-image (medial axis transform)
- Implementation:
src/structure_extraction/sturcture_extraction_pipeline.py
- Purpose: Detect boundaries in perspective correction
- Implementation:
src/preprocessing/edge_detection.py - Parameters: Sigma=2.0 with adaptive thresholding
- Purpose: Divide maze into regular grid cells
- Method: Fixed cell size (9x9 pixels) with threshold-based cell activation
- Implementation:
src/structure_extraction/sturcture_extraction_pipeline.py
- Purpose: Find optimal path from start to goal node
- Heuristic: Manhattan distance (|x1-x2| + |y1-y2|)
- Time Complexity: O(b^d) where b is branching factor, d is depth
- Space Complexity: O(b^d)
- Implementation:
src/shortest-path/AStar.py - Key Advantage: Faster than Dijkstra with admissible heuristic
- Purpose: Correct skewed images from camera capture
- Method: Detects paper corners and applies perspective transform
- Implementation:
src/preprocessing/extract_paper.py
- Purpose: Find maze start and exit points
- Method: Scans borders for gaps in skeleton, finds 2 farthest gaps
- Implementation:
src/preprocessing/detect_endpoints.py
| Library | Version | Purpose |
|---|---|---|
| NumPy | >=1.20.0 | Numerical computations and array operations |
| scikit-image | >=0.18.0 | Image processing (thresholding, morphology, skeletonization, edge detection) |
| Matplotlib | >=3.3.0 | Image visualization and result display |
| Python | >=3.8 | Core language |
pip install numpy scikit-image matplotlibconda install numpy scikit-image matplotlib- Python 3.8 or higher
- pip or conda package manager
git clone <repository-url>
cd maze-solver# Using venv
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Using conda
conda create -n maze-solver python=3.9
conda activate maze-solverpip install numpy scikit-image matplotlibOr create a requirements file:
pip freeze > requirements.txt
pip install -r requirements.txtpython -c "import numpy, skimage, matplotlib; print('All libraries installed successfully!')"For camera input, set the flag to y:
- Enables perspective correction for skewed images
- Use for images taken with phone cameras at angles
For static images, set the flag to n:
- Skips perspective correction
- Use for scanned or properly aligned images
maze-solver/
├── solve_maze.py # Main solver with visualization
├── README.md # This file
├── Phase 1 Proposal.pdf # Original project proposal
│
├── src/
│ ├── acquisition/ # Image input handling
│ │ ├── __init__.py
│ │ ├── acquisition_pipeline.py # Main acquisition logic
│ │ └── validate_image.py # Image validation
│ │
│ ├── preprocessing/ # Image enhancement & binary conversion
│ │ ├── __init__.py
│ │ ├── preporcess_pipeline.py # Main preprocessing flow
│ │ ├── detect_endpoints.py # Start/end point detection
│ │ ├── edge_detection.py # Canny edge detection
│ │ ├── extract_paper.py # Perspective correction
│ │ ├── filters.py # Median filtering
│ │ ├── histogram.py # Histogram equalization
│ │ ├── localization.py # Maze localization
│ │ ├── morphology.py # Erosion & dilation
│ │ ├── remove_background.py # Background removal
│ │ └── thresholding.py # Otsu thresholding
│ │
│ ├── structure_extraction/ # Maze graph generation
│ │ ├── __init__.py
│ │ ├── sturcture_extraction_pipeline.py # Main extraction logic
│ │ └── find_nearest_node.py # Node mapping utility
│ │
│ ├── shortest-path/ # Pathfinding algorithm
│ │ ├── AStar.py # A* implementation
│ │ └── test.py # Algorithm tests
│ │
│ ├── visualization/ # Result visualization
│ │ ├── visualize_on_image.py # Overlay visualization
│ │ └── test.png # Test visualization output
│ │
│ └── utils/ # Utility functions
│ ├── __init__.py
│ └── show.py # Image display utilities
│
├── samples/ # Test images
├── easy/ # Simple mazes
│ ├── image.png
│ ├── image2.png
│ ├── image8.png
│ ├── image10.png
│ └── image12.png
├── mobile/ # Mobile camera captures
│ └── image15.png
└── no_path/ # Mazes with no solution
└── image.png
- Description: Simple, well-defined maze structures
- Expected Behavior: Quick solution with clear paths
- Files: image.png, image2.png, image8.png, image10.png, image12.png
- Test Purpose: Verify basic functionality and correctness
Example Test:
python solve_maze.py
# Enter: easy/image.png
# Enter: n
# Expected: Solution found with few nodes- Description: Mazes photographed with mobile camera (perspective skew)
- Expected Behavior: Perspective correction handles skew
- Files: image15.png
- Test Purpose: Validate perspective correction and camera robustness
Example Test:
python solve_maze.py
# Enter: mobile/image15.png
# Enter: y
# Expected: Perspective corrected, solution found- Description: Disconnected maze with no valid solution
- Expected Behavior: Return None or "No path found" message
- Files: image.png
- Test Purpose: Verify error handling and edge case management
Example Test:
python solve_maze.py
# Enter: no_path/image.png
# Enter: n
# Expected: "No path found" message- ✅ Robust preprocessing: Handles various image qualities
- ✅ Optimal path-finding: A* guarantees shortest path
- ✅ Flexible input: Works with both static images and camera captures
- ✅ Clear visualization: Easy to interpret results
- ✅ Modular design: Easy to modify individual components
⚠️ Perspective correction: Works best with clear paper boundaries⚠️ Mobile images: Requirescamera=yflag for best results⚠️ Complex mazes: Processing time increases with larger images⚠️ Endpoint detection: May fail with unusual maze configurations⚠️ Fixed cell size: 9x9 pixel cells work best for 100-200 DPI images
-
Image Resolution Dependent: Cell segmentation uses fixed 9x9 pixel cells
- Solution: Adjust
cell_sizeparameter insturcture_extraction_pipeline.py
- Solution: Adjust
-
Perspective Correction: Requires clear paper boundaries and good lighting
- Solution: Use well-lit images with visible paper edges
-
Endpoint Detection: Assumes exactly 2 endpoints (start and exit)
- Failure Case: Multi-exit mazes or unclear borders
-
Fixed Grid Structure: Works best with regular maze layouts
- Failure Case: Circular or hexagonal mazes
-
No Maze Validation: Doesn't verify maze structure before solving
- Impact: Invalid mazes may produce unexpected results
| Alyaa Ali | Shady Mohamed | Hagar Abdelsalam | Saleh Ahmed |
