A lightweight computer vision project that stitches multiple images into a single panorama. This implementation builds the full pipeline from scratch using feature matching, homography estimation, and blending techniques, without relying on high-level stitching APIs.
This project constructs a multi-image stitching system that takes a sequence of overlapping images and produces a unified panoramic result.
Instead of using built-in stitching functions, the pipeline explicitly implements:
- Feature Detection (ORB)
- Feature Matching (KNN + Ratio Test)
- Robust Homography Estimation (RANSAC)
- Sequential Global Alignment (accumulated pairwise homographies)
- Distance-based Blending
- ORB is used for fast and efficient feature extraction
- KNN matching with Lowe's Ratio Test helps improve matching reliability
- RANSAC removes outliers
- Inlier ratio is used to reject unreliable transformations (no fallback strategy)
- A central reference frame is selected
- Homographies are accumulated left and right to align all images
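A minimal sketch of the accumulation step, assuming `pairwise[i]` holds the homography mapping image i+1 into image i's frame (the actual storage layout in `main.py` may differ):

```python
import numpy as np

def accumulate_homographies(pairwise, center):
    """Compute one global homography per image, relative to the center frame.

    pairwise: list of 3x3 arrays; pairwise[i] maps image i+1 into image i.
    center:   index of the reference image (identity transform).
    """
    n = len(pairwise) + 1
    H_global = [np.eye(3) for _ in range(n)]
    # images right of the center: chain i -> i-1 -> ... -> center
    for i in range(center + 1, n):
        H_global[i] = H_global[i - 1] @ pairwise[i - 1]
    # images left of the center: invert each link to go i -> i+1 -> ... -> center
    for i in range(center - 1, -1, -1):
        H_global[i] = H_global[i + 1] @ np.linalg.inv(pairwise[i])
    return H_global
```

Choosing a central reference keeps the accumulated perspective distortion roughly symmetric instead of letting it grow toward one end of the chain.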
- Weighted blending using distance transform
- Reduces visible seams compared to naive averaging, but does not handle exposure or color differences
- Canvas is estimated from warped boundaries with additional padding
Image_stitching/
├── main.py               # Main stitching pipeline
├── capture.py            # Image capture utility
├── images/               # Input images
└── panorama_result.jpg   # Output panorama
- Place input images in `./images/`, or use `capture.py` to capture images from a webcam.
- Run: `python main.py`
- Output: `panorama_result.jpg` (full resolution) and a resized preview window
Load Images
  ↓
Feature Extraction (ORB)
  ↓
Feature Matching (KNN + Ratio Test)
  ↓
Homography Estimation (RANSAC)
  ↓
Global Homography Accumulation
  ↓
Warping
  ↓
Distance-based Blending
  ↓
Final Panorama
- Ratio test threshold: 0.7
- Top 100 matches used to reduce noise
Inlier-ratio check used to reject unreliable homographies:

    inlier_ratio = np.sum(mask) / len(mask)
    if inlier_ratio < 0.3:
        return None

Distance-transform weighting used for blending:

    dist = cv.distanceTransform(mask, cv.DIST_L2, 5)
    weight = dist / dist.max()

Pixels closer to the image center are weighted more heavily.
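The per-image weight maps can then be combined into a normalized weighted average. A minimal sketch, with illustrative variable names not taken from `main.py`:

```python
import numpy as np

def blend_weighted(warped, weights, eps=1e-8):
    """Normalized weighted average of already-warped images.

    warped:  list of (H, W, 3) canvases, all on the shared panorama canvas
    weights: matching list of (H, W) distance-transform weight maps
    """
    acc = np.zeros(warped[0].shape, dtype=np.float64)
    wsum = np.zeros(warped[0].shape[:2], dtype=np.float64)
    for img, w in zip(warped, weights):
        acc += img.astype(np.float64) * w[..., None]
        wsum += w
    # eps avoids division by zero where no image covers a pixel
    out = acc / (wsum[..., None] + eps)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```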
Based on the implementation and observed results, several practical limitations were identified:
If the camera undergoes translation instead of pure rotation:
- Near and far objects shift differently
- A single homography cannot model the scene
Result:
- Misalignment in overlapping regions
- Ghosting artifacts
In low-texture or repetitive scenes:
- Walls and ceilings lack distinctive texture
- Repetitive structures (e.g., bookshelves) produce ambiguous matches
Result:
- Incorrect feature matches
- Unstable homography estimation
Current method:
- Distance-based feathering
Limitations:
- No exposure compensation
- Cannot fully eliminate seams with misalignment
- Homography assumes planar scenes
- Real-world environments are 3D
Result:
- Warping distortion, especially at edges
- Warping introduces empty regions
- Cropping reduces final field of view
This implementation assumes that input images are already arranged in the correct sequential order.
- Images are stitched in a linear chain (i β i+1)
- No global matching or graph-based ordering is performed
- No similarity-based reordering is applied
Implications:
- Incorrect input order leads to:
- Matching failure
- Invalid homography estimation
- Severe distortion or complete stitching breakdown
Requirement:
- Images must be captured and stored in order (e.g., left-to-right or right-to-left)
- File naming should reflect capture sequence
- Keep camera position fixed
- Rotate around optical center
- Maintain 50-60% overlap
- Multi-band (Laplacian pyramid) blending: separates low/high-frequency components for seamless transitions
- Exposure compensation: balances brightness across images
- Cylindrical or spherical projection: reduces distortion for rotational panoramas
- Bundle Adjustment (global optimization)
- Graph-based stitching
- Learning-based feature extraction (e.g., SuperPoint)
- Python
- OpenCV
- NumPy
