An intelligent screen magnification application that combines eye tracking, optical character recognition (OCR), and importance-based magnification to provide adaptive screen zoom functionality.
Context Magnifier is a desktop application that determines the most relevant areas of your screen to magnify based on multiple input sources and content analysis. By combining computer vision, eye tracking, and screen content analysis, it serves as both an accessibility and a productivity tool.
The application consists of several interconnected modules organized in a multi-process architecture:
- Purpose: Central coordination system that manages input from multiple sources
- Key Features:
- Multiprocessing-based shared memory for eye tracking coordinates
- Weighted coordinate selection from mouse position, eye tracking, and importance mapping
- Continuous screen analysis updates with configurable intervals
- Thread-safe calibration data management
- Technical Details:
- Uses `multiprocessing.Value` with `ctypes.c_double` for shared memory
- Implements weighted average for importance-based coordinate selection
- Supports dynamic importance threshold filtering (configurable, default: 0.7)
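The shared-memory value and the weighted coordinate selection described above can be sketched as follows (the weights, field names, and `blend_coordinates` helper are illustrative assumptions, not the actual implementation):

```python
import ctypes
from multiprocessing import Value

# Shared gaze coordinates, writable from the eye-tracking process
# (illustrative; the actual fields in CoordinateManager may differ)
gaze_x = Value(ctypes.c_double, 0.0)
gaze_y = Value(ctypes.c_double, 0.0)

def blend_coordinates(mouse, gaze, importance, weights=(0.4, 0.3, 0.3)):
    """Weighted average of mouse, gaze, and importance-map coordinates."""
    wm, wg, wi = weights
    x = wm * mouse[0] + wg * gaze[0] + wi * importance[0]
    y = wm * mouse[1] + wg * gaze[1] + wi * importance[1]
    return x, y
```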
- Purpose: Real-time screen magnification with adaptive positioning
- Key Features:
- Frameless, always-on-top magnification window (1000x562 default resolution)
- Variable zoom levels with smooth scaling (2.5x default, 0.1x increments)
- Context menu for feature toggles and settings
- Multiple positioning modes: follow-mouse, fixed position, importance-based
- Technical Implementation:
- Built on PySide6 with 30ms update intervals
- Uses QTimer for real-time screen capture and magnification
- Signal-slot architecture for inter-component communication
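The mouse-wheel zoom stepping can be sketched as below; the 0.1x increment and 2.5x default come from the list above, while the clamp bounds and helper name are assumptions:

```python
def adjust_zoom(level, steps, increment=0.1, minimum=1.0, maximum=10.0):
    """Step the zoom level by whole increments, clamped to a sane range."""
    new_level = level + steps * increment
    return round(max(minimum, min(maximum, new_level)), 2)
```

For example, one wheel notch up from the 2.5x default would yield 2.6x, while scrolling far down stops at the assumed 1.0x floor.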
- Purpose: Configuration interface for system parameters
- Key Features:
- Real-time parameter adjustment for OCR analysis
- Eye tracking calibration file management
- Grid-based settings layout with custom typography
- Cross-process communication via multiprocessing queues
- Purpose: Webcam-based gaze detection and calibration
- Key Features:
- 13-point calibration system (center, corners, edges, mid-points)
- Persistent calibration data storage in JSON format
- Real-time gaze coordinate calculation with configurable FPS (4-10 FPS)
- Integration with GazeTracking library for pupil detection
- Calibration Process:
- Multi-point screen calibration with tkinter GUI
- Bilinear interpolation for gaze-to-screen mapping
- Support for pre-saved calibration profiles
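A four-corner version of the bilinear gaze-to-screen mapping might look like this (the corner keys and helper name are assumptions; the actual system interpolates over 13 calibration points):

```python
def bilinear_map(hr, vr, cal):
    """Map horizontal/vertical gaze ratios in [0, 1] to screen pixels
    using four corner calibration points ('tl', 'tr', 'bl', 'br')."""
    # Interpolate x along the top and bottom edges, then blend vertically
    top_x = cal["tl"][0] + hr * (cal["tr"][0] - cal["tl"][0])
    bot_x = cal["bl"][0] + hr * (cal["br"][0] - cal["bl"][0])
    x = top_x + vr * (bot_x - top_x)
    # Interpolate y along the left and right edges, then blend horizontally
    left_y = cal["tl"][1] + vr * (cal["bl"][1] - cal["tl"][1])
    right_y = cal["tr"][1] + vr * (cal["br"][1] - cal["tr"][1])
    y = left_y + hr * (right_y - left_y)
    return x, y
```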
- Purpose: Intelligent content analysis for importance mapping
- Key Features:
- Grid-based screen segmentation (configurable: 7x14 to 16x9)
- Multi-factor importance scoring system
- OCR-based text detection and classification
- Weighted importance calculation for UI elements
The system uses a sophisticated scoring algorithm that considers multiple factors:
- Font Size Analysis: Inverse relationship scoring (smaller text = higher importance)
- Base size: 20px, with 1.0-4.0x multiplier range
- Text Confidence: Pytesseract confidence threshold filtering (default: 25)
- Text Classification: Contextual importance based on content type:
- Confirmation text ("OK", "Submit", "Accept"): 3.0x weight
- Error messages: 2.5x weight
- Titles and headers: 1.5x weight
- Buttons: 3.0x weight
- Input fields: 2.0x weight
- Checkboxes: 1.0x weight
- Text Length: Longer text receives higher importance (1.5x weight)
- Content Density: Character density per grid cell (0.2x weight)
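A hedged sketch of how these factors might combine (the function name, clamping, and normalisations are assumptions; the real logic lives in `ocr/main.py`):

```python
# Assumed classification weights, taken from the list above
CLASS_WEIGHTS = {
    "confirmation": 3.0, "error": 2.5, "title": 1.5,
    "button": 3.0, "input": 2.0, "checkbox": 1.0,
}
BASE_FONT_PX = 20  # base size from the font-size analysis above

def score_cell(font_px, confidence, kind, text_len, density):
    """Illustrative importance score for one grid cell."""
    if confidence < 25:            # pytesseract confidence threshold
        return 0.0
    # Smaller text scores higher, clamped to the 1.0-4.0x range
    size_factor = min(max(BASE_FONT_PX / max(font_px, 1), 1.0), 4.0)
    classification_weight = CLASS_WEIGHTS.get(kind, 1.0)
    length_weight = 1.0 + 1.5 * min(text_len / 100, 1.0)   # longer = higher
    density_weight = 1.0 + 0.2 * density                    # chars per cell
    confidence_weight = confidence / 100
    return (size_factor * confidence_weight *
            classification_weight * length_weight * density_weight)
```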
```python
importance_score = (
    size_factor * confidence_weight *
    classification_weight * length_weight *
    density_weight
)
```

- Process 1: Main magnification window and coordinate management
- Process 2: Settings interface and configuration management
- Communication: Thread-safe multiprocessing queues for settings and commands
- Eye Tracking Thread: Dedicated thread for webcam processing and gaze calculation
- Continuous Update Thread: Background thread for periodic importance map updates
- UI Thread: Main Qt application thread for magnification display
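The dedicated eye-tracking thread pattern can be sketched like this, with a fixed placeholder standing in for the real gaze estimate:

```python
import queue
import threading
import time

coords = queue.Queue(maxsize=1)    # holds only the freshest gaze coordinate

def tracking_loop(stop: threading.Event) -> None:
    while not stop.is_set():
        xy = (960.0, 540.0)        # placeholder for a real gaze estimate
        try:
            coords.get_nowait()    # drop the stale coordinate, if any
        except queue.Empty:
            pass
        coords.put(xy)
        time.sleep(0.1)            # ~10 FPS, the upper end of the configured range

stop = threading.Event()
worker = threading.Thread(target=tracking_loop, args=(stop,), daemon=True)
worker.start()
time.sleep(0.25)
latest = coords.get()              # consumer side: read the freshest coordinate
stop.set()
worker.join(timeout=1)
```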
- Shared Memory: `multiprocessing.Value` for eye tracking coordinates
- Grid Caching: NumPy arrays for importance matrices and cell data
- Resource Cleanup: Proper thread termination and webcam resource management
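Locating the hotspot in a cached importance matrix might look like this (the function name and default screen size are assumptions):

```python
import numpy as np

def most_important_cell(importance, screen_w=1920, screen_h=1080):
    """Return the screen-centre of the highest-scoring grid cell."""
    rows, cols = importance.shape
    r, c = np.unravel_index(np.argmax(importance), importance.shape)
    # Centre of the winning cell in screen coordinates
    x = (c + 0.5) * screen_w / cols
    y = (r + 0.5) * screen_h / rows
    return x, y
```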
- PySide6: Qt-based GUI framework for cross-platform desktop applications
- OpenCV (cv2): Computer vision library for image processing and webcam handling
- NumPy: Numerical computing for matrix operations and coordinate calculations
- Pillow (PIL): Image processing library for screen capture functionality
- Pytesseract: OCR engine for text recognition and analysis
- dlib 19.24.4: Machine learning library for facial landmark detection
- Custom GazeTracking: Modified gaze tracking implementation for pupil detection
- Python: 3.8+ (tested with 3.12.10)
- Operating System: Cross-platform (Windows, macOS, Linux)
- Hardware: Webcam required for eye tracking functionality
- Memory: Minimum 4GB RAM (8GB+ recommended for real-time processing)
The screen analysis grid can be adjusted for different screen resolutions and use cases:
- Default: 7x14 grid (98 cells)
- High Resolution: 16x9 grid (144 cells)
- Performance: 5x10 grid (50 cells)
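Mapping a screen coordinate to its grid cell under any of these presets can be sketched as (helper name assumed):

```python
def cell_for_point(x, y, screen_w, screen_h, rows=7, cols=14):
    """Map a screen coordinate to its (row, col) grid cell,
    clamping points on the far edges into the last cell."""
    col = min(int(x / screen_w * cols), cols - 1)
    row = min(int(y / screen_h * rows), rows - 1)
    return row, col
```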
All importance factors are configurable through the settings interface:
- Button importance: 3.0 (default)
- Input field importance: 2.0 (default)
- Error text importance: 2.5 (default)
- Confirmation text importance: 3.0 (default)
- Title importance: 1.5 (default)
- Magnification: 30ms (33 FPS)
- Importance Map: 5 seconds (configurable)
- Eye Tracking: 100-250ms (4-10 FPS)
- Screen Capture: ~10ms (1920x1080)
- OCR Analysis: 2-4 seconds (full screen)
- Importance Calculation: ~50ms
- Gaze Detection: ~25ms per frame
- Base Application: ~50MB
- With Eye Tracking: ~100MB
- With Continuous Updates: ~150MB
- Peak (during OCR): ~300MB
- Idle: 2-5% (single core)
- Active Magnification: 5-10% (single core)
- With Eye Tracking: 10-15% (single core)
- During OCR Analysis: 25-50% (multi-core)
- Python 3.8 or higher
- Webcam (for eye tracking)
- Tesseract OCR engine
1. Clone the repository:

   ```shell
   git clone <repository-url>
   cd context-magnifier
   ```

2. Install Python dependencies:

   ```shell
   pip install PySide6 opencv-python numpy pillow pytesseract
   ```

3. Install GazeTracking dependencies:

   ```shell
   cd facial_recognition/GazeTracking
   pip install -r requirements.txt
   ```

4. Install Tesseract OCR:
   - macOS: `brew install tesseract`
   - Ubuntu: `sudo apt-get install tesseract-ocr`
   - Windows: Download from the official Tesseract repository
1. Start the application:

   ```shell
   python main.py
   ```

2. Configure settings (optional):
   - Adjust grid dimensions for your screen resolution
   - Modify importance weights based on your workflow
   - Set update intervals based on performance requirements

3. Calibrate eye tracking (optional):
   - Enable eye tracking from the context menu
   - Follow the 13-point calibration process
   - Save calibration data for future sessions
- Launch: Run `python main.py` from the project directory
- Magnify: The magnification window automatically follows important screen areas
- Configure: Right-click the magnification window for options
- Settings: Use the separate settings window for advanced configuration
- Right-click menu: Toggle eye tracking, importance mapping, and fixed positioning
- Mouse wheel: Adjust zoom level (when magnification window is focused)
- Settings window: Real-time parameter adjustment and calibration management
- Continuous Updates: Automatic importance map regeneration at configurable intervals
- Multi-source Coordination: Intelligent blending of mouse, eye tracking, and importance data
- Calibration Persistence: Save and load eye tracking calibration profiles
- OCR not working: Ensure Tesseract is installed and in system PATH
- Eye tracking fails: Check webcam permissions and lighting conditions
- High CPU usage: Reduce update frequencies or grid resolution
- Memory leaks: Restart application periodically during intensive use
- Reduce grid dimensions for better performance
- Increase update intervals for lower CPU usage
- Disable continuous updates when not needed
- Use fixed positioning mode for consistent performance
```
context-magnifier/
├── main.py                    # Application entry point
├── coordinate_manager.py      # Core coordinate management
├── calibration_data.json      # Eye tracking calibration data
├── app/                       # GUI components
│   ├── main_window.py         # Settings interface
│   ├── zoom_window.py         # Magnification window
│   └── core.py                # Shared utilities
├── facial_recognition/        # Eye tracking system
│   ├── main.py                # Eye tracking implementation
│   ├── calibrate.py           # Calibration utilities
│   └── GazeTracking/          # Modified GazeTracking library
├── ocr/                       # Screen analysis
│   └── main.py                # OCR and importance calculation
├── fonts/                     # Custom typography
└── assets/                    # Static resources
```
- Custom Importance Algorithms: Modify the scoring functions in `ocr/main.py`
- Additional Input Sources: Extend `CoordinateManager` with new coordinate sources
- UI Customization: Modify Qt stylesheets and layouts in the `app/` directory
- Analysis Plugins: Add new content analysis modules to the OCR system
This project incorporates modified components from the GazeTracking library and uses various open-source dependencies. Please refer to individual component licenses for specific terms and conditions.