A Python-based document scanner that automatically detects document boundaries, applies perspective correction, and generates high-quality scanned images using OpenCV.
- Automatic Document Detection: Uses advanced contour detection to identify document boundaries
- Perspective Correction: Applies four-point transformation for proper document alignment
- Multiple Quality Options: Generates 7 processing variants so you can pick the one that suits the document
- Enhanced Image Processing: Includes noise reduction, contrast enhancement, and sharpening
- Flexible Output: Supports custom output directories with organized file structure
- Debug Mode: Optional intermediate image saving for troubleshooting
- Command Line Interface: Easy-to-use CLI with comprehensive options
- Robust Fallbacks: Avoids blank outputs by using a full-frame fallback when the detected region is too small
- RECOMMENDED Control: Choose which processed variant is saved as RECOMMENDED via --prefer
- Profiles: Bias auto selection for tables vs text via --doc-type
- Printer-Friendly Conversion: Remove background colors to save ink while preserving text and images
- Python 3.7 or higher
- OpenCV (cv2)
- NumPy
- Clone the repository:
git clone https://github.com/LiteObject/doc-scanner.git
cd doc-scanner
- Create a virtual environment (recommended):
python -m venv .venv
# On Windows:
.venv\Scripts\activate
# On macOS/Linux:
source .venv/bin/activate
- Install dependencies:
pip install opencv-python numpy
python scanner.py input_image.jpg
Remove background colors to save printer ink while keeping text and images readable:
# Basic usage - auto-detect and remove background
python make_printable.py document.jpg
# Use aggressive background removal for heavily colored documents
python make_printable.py document.jpg --method aggressive
# Uneven lighting or shaded pages
python make_printable.py document.jpg --method adaptive
# Pages with strong colored backgrounds
python make_printable.py document.jpg --method color
# Set custom threshold for fine control
python make_printable.py document.jpg --method custom --threshold 210
# Skip enhancement for faster processing
python make_printable.py document.jpg --no-enhance
# Enable debug mode to see before/after comparison
python make_printable.py document.jpg --debug
# Specify output directory
python make_printable.py document.jpg --output ./printable_docs
# Specify custom output directory
python scanner.py document.jpg --output ./my_scans
# Enable debug mode
python scanner.py document.jpg --debug
# Combine options
python scanner.py document.jpg --output ./scans --debug
# Tune detection thresholds
python scanner.py document.jpg --min-area 1500 --fallback-min-area 600 --min-area-frac 0.05
# Choose the RECOMMENDED variant
# Force clean binary (default behavior)
python scanner.py document.jpg --prefer combined
# Use enhanced grayscale as recommended
python scanner.py document.jpg --prefer grayscale
# Let the tool auto-select based on content scoring
python scanner.py document.jpg --prefer auto
# Bias auto selection for tables (crisp B/W lines) or text (smoother)
python scanner.py document.jpg --prefer auto --doc-type table
python scanner.py document.jpg --prefer auto --doc-type text
- input_file: Path to the input image file (required)
- --output, -o: Custom output directory path (optional)
- --debug: Enable debug mode to save intermediate processing images (optional)
- --min-area: Minimum contour area (in resized-pixels) to accept as the document (default: 1000)
- --fallback-min-area: Minimum area to allow a fallback quadrilateral if no primary match is found (default: 500)
- --min-area-frac: If the selected quadrilateral covers less than this fraction of the resized image, use full-frame fallback (default: 0.04 = 4%)
- --prefer: Which variant to save as RECOMMENDED. Options: combined (default), grayscale, original, otsu, adaptive-mean, adaptive-gaussian, niblack, auto (score-based)
- --doc-type: Biases auto selection. Options: auto (default), table (favor crisp B/W and structured edges), text (favor smoother grayscale)
- --help, -h: Show help message and usage examples
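The options above map naturally onto argparse. The following is a minimal sketch of such a parser, not the actual code in scanner.py: the function name build_parser, the help strings, and the float types are assumptions made for illustration, while the flag names, choices, and defaults are taken from the list above.

```python
import argparse

# Variant names as documented for --prefer.
VARIANTS = ["combined", "grayscale", "original", "otsu",
            "adaptive-mean", "adaptive-gaussian", "niblack", "auto"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="scanner.py",
        description="Detect a document in a photo and save scanned variants.",
    )
    parser.add_argument("input_file", help="Path to the input image file")
    parser.add_argument("--output", "-o", default=None,
                        help="Custom output directory path")
    parser.add_argument("--debug", action="store_true",
                        help="Save intermediate processing images")
    parser.add_argument("--min-area", type=float, default=1000)
    parser.add_argument("--fallback-min-area", type=float, default=500)
    parser.add_argument("--min-area-frac", type=float, default=0.04)
    parser.add_argument("--prefer", choices=VARIANTS, default="combined")
    parser.add_argument("--doc-type", choices=["auto", "table", "text"],
                        default="auto")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
args = build_parser().parse_args(
    ["document.jpg", "--prefer", "auto", "--doc-type", "table",
     "--min-area-frac", "0.05"]
)
print(args.prefer, args.doc_type, args.min_area_frac)
```

Restricting --prefer and --doc-type with choices makes argparse reject misspelled variant names before any image processing starts.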
The scanner creates an organized directory structure with multiple quality options:
output_directory/
├── scanned_output_YYYYMMDD_HHMMSS/
│   ├── RECOMMENDED_scanned_document.jpg     # Main result
│   ├── GRAYSCALE_enhanced.jpg               # Enhanced grayscale version
│   ├── quality_comparison/                  # All processing versions
│   │   ├── 00_original_perspective_corrected.jpg
│   │   ├── 01_enhanced_grayscale.jpg
│   │   ├── 02_otsu_threshold.jpg
│   │   ├── 03_adaptive_mean.jpg
│   │   ├── 04_adaptive_gaussian.jpg
│   │   ├── 05_niblack_local.jpg
│   │   ├── 06_combined_optimized.jpg
│   │   └── README.txt                       # Selection guide
│   └── debug_processing/                    # Debug images (if --debug enabled)
│       ├── debug_01_resized.jpg
│       ├── debug_02_gray.jpg
│       ├── debug_03_edges.jpg
│       ├── debug_04_warped.jpg
│       ├── debug_detected_contour.jpg
│       ├── debug_alt_edges_*.jpg
│       └── debug_region_info.txt            # Area stats and fallback info
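A timestamped run folder like the one above can be created with the standard library alone. This is a sketch under assumptions: the helper name make_run_dirs and its parameters are hypothetical, and only the folder names come from the layout shown above.

```python
import os
import tempfile
from datetime import datetime


def make_run_dirs(base_dir, debug=False, now=None):
    """Create scanned_output_YYYYMMDD_HHMMSS/ with its subfolders (hypothetical helper)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    run_dir = os.path.join(base_dir, f"scanned_output_{stamp}")
    os.makedirs(os.path.join(run_dir, "quality_comparison"), exist_ok=True)
    if debug:
        # The debug folder is only created when debug output is requested.
        os.makedirs(os.path.join(run_dir, "debug_processing"), exist_ok=True)
    return run_dir


# Demonstrate inside a temporary directory with a fixed timestamp.
with tempfile.TemporaryDirectory() as base:
    run_dir = make_run_dirs(base, debug=True, now=datetime(2024, 1, 31, 9, 5, 7))
    run_name = os.path.basename(run_dir)
    subdirs = sorted(os.listdir(run_dir))

print(run_name, subdirs)
```

Using a second-resolution timestamp keeps repeated runs from overwriting each other, as long as two runs do not start within the same second.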
The scanner generates multiple versions using different image processing techniques:
- Original Perspective Corrected: Document after perspective transformation only
- Enhanced Grayscale: Noise reduction, contrast enhancement, and sharpening applied
- Otsu Threshold: Black and white using automatic threshold detection
- Adaptive Mean: Black and white using adaptive mean thresholding
- Adaptive Gaussian: Black and white using adaptive Gaussian thresholding
- Niblack Local: Black and white using Niblack-like local thresholding
- Combined Optimized: Recommended version with morphological cleanup
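To illustrate what "automatic threshold detection" means for the Otsu variant, here is a dependency-free sketch of Otsu's method on a flat list of 8-bit gray values. It is for explanation only: the scanner itself relies on OpenCV, and the function name otsu_threshold and the sample data are made up for this example.

```python
def otsu_threshold(pixels):
    """Return the threshold t in 0..255 that maximizes between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    """
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_dark = 0   # number of pixels in the dark class so far
    sum_dark = 0      # sum of gray values in the dark class so far
    for t in range(256):
        weight_dark += hist[t]
        if weight_dark == 0:
            continue
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break
        sum_dark += t * hist[t]
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        between_var = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Synthetic "page": 100 dark ink pixels and 200 bright paper pixels.
sample = [30] * 50 + [40] * 50 + [200] * 100 + [220] * 100
t = otsu_threshold(sample)
binary = [255 if value > t else 0 for value in sample]
print(t, binary.count(255))
```

Because the threshold is derived from the histogram of the whole image, this approach works best with even lighting; the adaptive and Niblack variants exist for pages where that assumption fails.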
- Preprocessing: Resize, convert to grayscale, apply Gaussian blur
- Edge Detection: Canny edge detection with multiple parameter sets
- Contour Detection: Find and analyze document boundaries
- Perspective Correction: Four-point transformation to correct document perspective
- Quality Enhancement: Apply denoising, CLAHE, and unsharp masking
- Thresholding: Multiple techniques for optimal text/background separation
- Post-processing: Morphological operations for cleanup
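The perspective-correction step needs the four detected corners in a consistent order and a target size for the flattened page. The sketch below shows one common way to derive both in plain Python; it is an illustration, not the scanner's own code, and the function names and sample coordinates are hypothetical. In an OpenCV pipeline the ordered corners and target size would then typically be passed to cv2.getPerspectiveTransform and cv2.warpPerspective.

```python
import math


def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    by_sum = sorted(points, key=lambda p: p[0] + p[1])    # x + y
    by_diff = sorted(points, key=lambda p: p[1] - p[0])   # y - x
    top_left, bottom_right = by_sum[0], by_sum[-1]
    top_right, bottom_left = by_diff[0], by_diff[-1]
    return [top_left, top_right, bottom_right, bottom_left]


def edge_length(a, b):
    """Euclidean distance between two points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def target_size(ordered):
    """Pick the longer of each pair of opposite edges as output width and height."""
    tl, tr, br, bl = ordered
    width = max(edge_length(br, bl), edge_length(tr, tl))
    height = max(edge_length(tr, br), edge_length(tl, bl))
    return int(round(width)), int(round(height))


# Corners of a slightly skewed page, in arbitrary order.
corners = [(320, 40), (30, 60), (350, 420), (10, 400)]
ordered = order_corners(corners)
print(ordered, target_size(ordered))
```

The sum and difference trick assumes the page is not rotated by close to 45 degrees; for strongly rotated documents a more robust ordering would be needed.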
The scanner includes comprehensive error handling for:
- File not found errors
- Invalid image formats
- Image loading failures
- Contour detection issues
- Perspective transformation problems
- File I/O errors
Enable debug mode with --debug to save intermediate processing images:
- Resized input image, grayscale, and initial edges
- Alternative edge images across multiple Canny thresholds
- Detected contour overlay
- Warped image (or full-frame fallback) and region stats
When the detected quadrilateral covers less than a configurable fraction of the resized image (--min-area-frac, default 4%), the scanner skips perspective warp and processes the full original frame. This prevents blank or near-blank outputs from tiny/noisy contours.
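The fallback decision can be sketched as an area-fraction check. This is a simplified illustration of the rule described above, assuming the quadrilateral is given as four (x, y) corners in the resized image; the function names are hypothetical, and the scanner may compute the area differently (for example from the OpenCV contour).

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    twice_area = 0.0
    count = len(points)
    for i in range(count):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % count]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def use_full_frame(quad, image_width, image_height, min_area_frac=0.04):
    """Return True when the quad is too small and the full frame should be used."""
    fraction = polygon_area(quad) / float(image_width * image_height)
    return fraction < min_area_frac


# A tiny, probably noisy detection in a 500x400 resized image.
tiny_quad = [(10, 10), (60, 10), (60, 40), (10, 40)]
# A detection that covers most of the frame.
page_quad = [(20, 20), (480, 20), (480, 380), (20, 380)]

print(use_full_frame(tiny_quad, 500, 400))   # fallback applies
print(use_full_frame(page_quad, 500, 400))   # normal perspective warp
```

Raising min_area_frac makes the fallback trigger more often, which matches the tuning advice for --min-area-frac.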
- OpenCV (cv2): Computer vision and image processing
- NumPy: Array operations and mathematical computations
- argparse: Command line argument parsing
- datetime: Timestamp generation for output folders
- os/sys: File system operations and system interactions
- Four-Point Transformation: Perspective correction using homography
- Adaptive Thresholding: Multiple techniques for varying lighting conditions
- Non-Local Means Denoising: Advanced noise reduction
- CLAHE: Contrast Limited Adaptive Histogram Equalization
- Morphological Operations: Image cleanup and enhancement
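As an illustration of local (adaptive) thresholding, here is a small pure-Python sketch in the spirit of Niblack's method, where each pixel is compared against mean + k * std of its neighborhood. The window size, the value of k, and the function name are assumptions chosen for the example and are not taken from the scanner's source.

```python
import math


def niblack_like(gray_rows, window=3, k=-0.2):
    """Binarize a 2D list of gray values using a per-pixel local threshold."""
    height, width = len(gray_rows), len(gray_rows[0])
    radius = window // 2
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Collect the neighborhood, clipped at the image borders.
            values = [
                gray_rows[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(values) / len(values)
            std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
            threshold = mean + k * std
            # Pixels at or above the local threshold become white paper.
            out_row.append(255 if gray_rows[y][x] >= threshold else 0)
        result.append(out_row)
    return result


# A bright patch with one dark "ink" pixel in the middle.
patch = [
    [200, 200, 200],
    [200, 50, 200],
    [200, 200, 200],
]
print(niblack_like(patch))
```

Because the threshold follows the local mean, a shadow across the page shifts the threshold with it, which is why local methods cope better with uneven lighting than a single global threshold.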
- "No contours found": Ensure the document has clear edges and good contrast
- "No rectangular contour found": Try with better lighting or clearer document boundaries
- Poor scan quality or harsh-looking "recommended":
  - The scanner scores all variants and skips near-blank candidates.
  - If the recommended looks too bold/harsh, try --prefer grayscale, or keep --prefer combined and adjust brightness/contrast in a viewer.
  - If results still look weak, raise --min-area-frac (e.g., 0.06–0.1) to force full-frame processing more often, or increase --min-area (e.g., 1500–3000).
  - Note: The console prints which variant was saved as RECOMMENDED (for example, Variant used: combined).
- File permission errors: Ensure write permissions for the output directory
- Use good lighting with minimal shadows
- Ensure the document has clear, straight edges
- Place the document on a contrasting background
- Keep the camera/phone steady when taking the photo
- Avoid reflections and glare on the document surface
The main document scanner that detects boundaries and applies perspective correction.
Prepares documents for printing by:
- Removing background colors to save ink
- Preserving text and image quality
- Creating multiple versions (color w/ white background, text-optimized, grayscale, B&W)
- Estimating ink savings
- Providing enhanced versions for better print quality
- input_file: Path to the input image file (required)
- --output, -o: Output directory (default: ./output)
- --method, -m: Background removal method
  - auto: Automatically determine threshold based on background
  - light: Light background removal (keeps more detail)
  - aggressive: Aggressive removal (maximum ink saving)
  - adaptive: Adaptive thresholding for uneven lighting
  - color: HSV color-based removal for colored backgrounds
  - custom: Use custom threshold value
- --threshold, -t: Custom threshold value (0-255) for method=custom
- --no-enhance: Skip enhancement step for faster processing
- --debug: Save debug images including before/after comparison
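To make the custom threshold and the ink-saving estimate more concrete, here is a plain-Python sketch on a tiny grayscale image. It assumes that pixels at or above the threshold are treated as background and turned white, and that ink use is proportional to darkness; both are simplifying assumptions for illustration, and the function names are hypothetical rather than taken from make_printable.py.

```python
def remove_background(gray_rows, threshold=210):
    """Turn every pixel at or above the threshold into pure white (255)."""
    return [[255 if v >= threshold else v for v in row] for row in gray_rows]


def ink_coverage(gray_rows):
    """Rough ink estimate in 0..1: 0 is a blank page, 1 is solid black."""
    values = [v for row in gray_rows for v in row]
    return sum(255 - v for v in values) / (255.0 * len(values))


def ink_saving(before, after):
    """Fraction of the estimated ink saved by the cleanup step."""
    base = ink_coverage(before)
    if base == 0:
        return 0.0
    return 1.0 - ink_coverage(after) / base


# Light gray paper (225-230) with a few dark text pixels (20-40).
page = [
    [230, 230, 20, 230],
    [225, 40, 30, 228],
]
cleaned = remove_background(page, threshold=210)
print(cleaned)
print(round(ink_saving(page, cleaned), 3))
```

A lower threshold removes more of the background but starts to erase light text and pale image regions, which is the trade-off between the light and aggressive methods.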
output_directory/
├── printable_YYYYMMDD_HHMMSS/
│   ├── PRINTABLE_document.jpg                   # Recommended for printing
│   ├── versions/
│   │   ├── 01_document_background_removed.jpg
│   │   ├── 02_document_enhanced.jpg             # If enhancement enabled
│   │   ├── 03_document_text_optimized.jpg
│   │   ├── 04_document_black_white.jpg          # Maximum ink saving (adaptive)
│   │   └── 05_document_grayscale.jpg
│   ├── debug/                                   # If --debug enabled
│   │   ├── original.jpg
│   │   ├── mask.jpg
│   │   └── before_after_comparison.jpg
│   └── README.txt                               # Usage guide
This project is open source. Please check the license file for specific terms.
Potential improvements for future versions:
- Batch processing for multiple documents
- GUI interface for easier use
- Additional image enhancement algorithms
- Support for different output formats (PDF, TIFF)
- Configuration file support
- Performance optimizations for large images