Context Magnifier

A screen magnification application that combines eye tracking, optical character recognition (OCR), and importance-based content scoring to provide adaptive screen zoom.

Overview

Context Magnifier is a desktop application that determines which areas of your screen are most relevant to magnify, based on multiple input sources and content analysis. It combines computer vision, eye tracking, and OCR-based screen analysis into a single accessibility and productivity tool.

Architecture

Core Components

The application consists of several interconnected modules organized in a multi-process architecture:

1. Coordinate Management (coordinate_manager.py)

  • Purpose: Central coordination system that manages input from multiple sources
  • Key Features:
    • Multiprocessing-based shared memory for eye tracking coordinates
    • Weighted coordinate selection from mouse position, eye tracking, and importance mapping
    • Continuous screen analysis updates with configurable intervals
    • Thread-safe calibration data management
  • Technical Details:
    • Uses multiprocessing.Value with ctypes.c_double for shared memory
    • Implements weighted average for importance-based coordinate selection
    • Supports dynamic importance threshold filtering (configurable, default: 0.7)
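
A minimal sketch of the shared-memory and weighted-blending pattern described above (the pick_target helper and the weight values are illustrative, not the module's actual API):

    import ctypes
    from multiprocessing import Value

    # Eye-tracking coordinates shared across processes, as in coordinate_manager.py
    gaze_x = Value(ctypes.c_double, 0.0)
    gaze_y = Value(ctypes.c_double, 0.0)

    def pick_target(mouse_xy, importance_xy, gaze_weight=0.4, importance_weight=0.3):
        """Blend mouse, gaze, and importance-map coordinates into one target point."""
        mouse_weight = 1.0 - gaze_weight - importance_weight
        x = (mouse_weight * mouse_xy[0]
             + gaze_weight * gaze_x.value
             + importance_weight * importance_xy[0])
        y = (mouse_weight * mouse_xy[1]
             + gaze_weight * gaze_y.value
             + importance_weight * importance_xy[1])
        return x, y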

2. Screen Magnification (app/zoom_window.py)

  • Purpose: Real-time screen magnification with adaptive positioning
  • Key Features:
    • Frameless, always-on-top magnification window (1000x562 default resolution)
    • Variable zoom levels with smooth scaling (2.5x default, 0.1x increments)
    • Context menu for feature toggles and settings
    • Multiple positioning modes: follow-mouse, fixed position, importance-based
  • Technical Implementation:
    • Built on PySide6 with 30ms update intervals
    • Uses QTimer for real-time screen capture and magnification
    • Signal-slot architecture for inter-component communication
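
A hedged PySide6 sketch of this capture-and-scale loop (the window geometry, zoom factor, and fixed capture origin are placeholders; the real window recenters on the coordinates supplied by the coordinate manager):

    import sys
    from PySide6.QtCore import Qt, QTimer
    from PySide6.QtGui import QGuiApplication
    from PySide6.QtWidgets import QApplication, QLabel

    app = QApplication(sys.argv)
    view = QLabel()
    view.setWindowFlags(Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint)
    view.resize(1000, 562)
    zoom = 2.5

    def refresh():
        screen = QGuiApplication.primaryScreen()
        # Grab a small region of the desktop and scale it up to the window size
        src_w, src_h = int(1000 / zoom), int(562 / zoom)
        shot = screen.grabWindow(0, 400, 300, src_w, src_h)
        view.setPixmap(shot.scaled(view.size(), Qt.KeepAspectRatio))

    timer = QTimer()
    timer.timeout.connect(refresh)
    timer.start(30)  # 30 ms updates, roughly 33 FPS

    view.show()
    sys.exit(app.exec())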

3. Settings Interface (app/main_window.py)

  • Purpose: Configuration interface for system parameters
  • Key Features:
    • Real-time parameter adjustment for OCR analysis
    • Eye tracking calibration file management
    • Grid-based settings layout with custom typography
    • Cross-process communication via multiprocessing queues

4. Eye Tracking System (facial_recognition/main.py)

  • Purpose: Webcam-based gaze detection and calibration
  • Key Features:
    • 13-point calibration system (center, corners, edges, mid-points)
    • Persistent calibration data storage in JSON format
    • Real-time gaze coordinate calculation with configurable FPS (4-10 FPS)
    • Integration with GazeTracking library for pupil detection
  • Calibration Process:
    • Multi-point screen calibration with tkinter GUI
    • Bilinear interpolation for gaze-to-screen mapping
    • Support for pre-saved calibration profiles
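
A simplified, corner-based form of that gaze-to-screen interpolation (the calib dictionary layout and example values are assumptions for illustration; the real calibration uses the full multi-point data):

    def gaze_to_screen(h_ratio, v_ratio, calib, screen_w=1920, screen_h=1080):
        """Map gaze ratios from the tracker to screen pixel coordinates.

        calib holds the ratios observed while looking at the screen edges,
        e.g. {"left": 0.75, "right": 0.35, "top": 0.60, "bottom": 0.40}.
        """
        tx = (h_ratio - calib["left"]) / (calib["right"] - calib["left"])
        ty = (v_ratio - calib["top"]) / (calib["bottom"] - calib["top"])
        tx = min(max(tx, 0.0), 1.0)  # clamp to the visible screen
        ty = min(max(ty, 0.0), 1.0)
        return tx * screen_w, ty * screen_h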

5. Screen Analysis Engine (ocr/main.py)

  • Purpose: Intelligent content analysis for importance mapping
  • Key Features:
    • Grid-based screen segmentation (configurable: 7x14 to 16x9)
    • Multi-factor importance scoring system
    • OCR-based text detection and classification
    • Weighted importance calculation for UI elements

Importance Scoring Algorithm

The system uses a sophisticated scoring algorithm that considers multiple factors:

Text-Based Factors

  • Font Size Analysis: Inverse relationship scoring (smaller text = higher importance)
    • Base size: 20px, with 1.0-4.0x multiplier range
  • Text Confidence: Pytesseract confidence threshold filtering (default: 25)
  • Text Classification: Contextual importance based on content type:
    • Confirmation text ("OK", "Submit", "Accept"): 3.0x weight
    • Error messages: 2.5x weight
    • Titles and headers: 1.5x weight
    • Buttons: 3.0x weight
    • Input fields: 2.0x weight
    • Checkboxes: 1.0x weight

Content Density Factors

  • Text Length: Longer text receives higher importance (1.5x weight)
  • Content Density: Character density per grid cell (0.2x weight)

Final Score Calculation

importance_score = (
    size_factor * confidence_weight * 
    classification_weight * length_weight * 
    density_weight
)
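
A hedged sketch of how these factors can combine per grid cell (the classify helper and the exact accumulation are simplified stand-ins for the logic in ocr/main.py; the weights mirror the defaults listed above):

    import pytesseract

    CLASS_WEIGHTS = {"confirmation": 3.0, "error": 2.5, "title": 1.5,
                     "button": 3.0, "input": 2.0, "checkbox": 1.0, "plain": 1.0}

    def classify(text):
        """Very rough content-type guess; the real classifier is more involved."""
        if text.strip().lower() in {"ok", "submit", "accept"}:
            return "confirmation"
        if "error" in text.lower():
            return "error"
        return "plain"

    def cell_importance(cell_image, base_size=20, min_confidence=25):
        # cell_image: a PIL.Image crop of one grid cell
        data = pytesseract.image_to_data(cell_image, output_type=pytesseract.Output.DICT)
        score = 0.0
        for text, conf, height in zip(data["text"], data["conf"], data["height"]):
            if not text.strip() or float(conf) < min_confidence:
                continue  # confidence threshold filtering
            size_factor = min(max(base_size / max(int(height), 1), 1.0), 4.0)  # smaller text scores higher
            length_weight = 1.0 + 1.5 * min(len(text) / 50.0, 1.0)
            score += size_factor * CLASS_WEIGHTS[classify(text)] * length_weight
        chars = sum(len(t) for t in data["text"])
        density_weight = 1.0 + 0.2 * (chars / float(cell_image.width * cell_image.height))
        return score * density_weight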

System Architecture Patterns

Multi-Process Communication

  • Process 1: Main magnification window and coordinate management
  • Process 2: Settings interface and configuration management
  • Communication: Thread-safe multiprocessing queues for settings and commands
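
A bare-bones sketch of that layout (run_magnifier and run_settings_ui are placeholders for the actual entry points):

    import time
    from multiprocessing import Process, Queue

    def run_magnifier(settings_queue):
        while True:
            if not settings_queue.empty():
                update = settings_queue.get()   # e.g. {"zoom": 3.0}
                print("apply settings:", update)
            time.sleep(0.05)

    def run_settings_ui(settings_queue):
        settings_queue.put({"zoom": 3.0})       # sent when the user changes a value

    if __name__ == "__main__":
        queue = Queue()
        magnifier = Process(target=run_magnifier, args=(queue,), daemon=True)
        magnifier.start()
        run_settings_ui(queue)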

Thread Management

  • Eye Tracking Thread: Dedicated thread for webcam processing and gaze calculation
  • Continuous Update Thread: Background thread for periodic importance map updates
  • UI Thread: Main Qt application thread for magnification display
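
For example, the continuous-update thread can be expressed roughly like this (refresh_importance_map is a placeholder for the real analysis call):

    import threading

    stop_event = threading.Event()

    def refresh_importance_map():
        pass  # re-run the OCR/importance analysis here

    def continuous_updates(interval=5.0):
        while not stop_event.is_set():
            refresh_importance_map()
            stop_event.wait(interval)   # periodic, but interruptible on shutdown

    worker = threading.Thread(target=continuous_updates, daemon=True)
    worker.start()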

Memory Management

  • Shared Memory: multiprocessing.Value for eye tracking coordinates
  • Grid Caching: Numpy arrays for importance matrices and cell data
  • Resource Cleanup: Proper thread termination and webcam resource management
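
The webcam cleanup side can be illustrated with standard OpenCV patterns (the actual teardown in the eye-tracking module may differ in detail):

    import cv2

    cap = cv2.VideoCapture(0)      # webcam used for gaze detection
    try:
        ok, frame = cap.read()     # ...normal processing loop...
    finally:
        cap.release()              # always free the webcam on shutdown
        cv2.destroyAllWindows()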

Technical Dependencies

Core Dependencies

  • PySide6: Qt-based GUI framework for cross-platform desktop applications
  • OpenCV (cv2): Computer vision library for image processing and webcam handling
  • NumPy: Numerical computing for matrix operations and coordinate calculations
  • Pillow (PIL): Image processing library for screen capture functionality
  • Pytesseract: OCR engine for text recognition and analysis

Eye Tracking Dependencies

  • dlib 19.24.4: Machine learning library for facial landmark detection
  • Custom GazeTracking: Modified gaze tracking implementation for pupil detection

System Requirements

  • Python: 3.8+ (tested with 3.12.10)
  • Operating System: Cross-platform (Windows, macOS, Linux)
  • Hardware: Webcam required for eye tracking functionality
  • Memory: Minimum 4GB RAM (8GB+ recommended for real-time processing)

Configuration and Customization

Grid Configuration

The screen analysis grid can be adjusted for different screen resolutions and use cases:

  • Default: 7x14 grid (98 cells)
  • High Resolution: 16x9 grid (144 cells)
  • Performance: 5x10 grid (50 cells)
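
A quick sketch of how a grid translates to pixel regions on a 1920x1080 screen (the helper name is illustrative, not the function used in ocr/main.py):

    def grid_cells(screen_w=1920, screen_h=1080, rows=7, cols=14):
        """Yield (left, top, right, bottom) bounds for each analysis cell."""
        cell_w, cell_h = screen_w // cols, screen_h // rows
        for r in range(rows):
            for c in range(cols):
                yield (c * cell_w, r * cell_h, (c + 1) * cell_w, (r + 1) * cell_h)

    cells = list(grid_cells())   # default 7x14 grid -> 98 cells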

Importance Weights

All importance factors are configurable through the settings interface:

  • Button importance: 3.0 (default)
  • Input field importance: 2.0 (default)
  • Error text importance: 2.5 (default)
  • Confirmation text importance: 3.0 (default)
  • Title importance: 1.5 (default)

Update Intervals

  • Magnification: 30ms (33 FPS)
  • Importance Map: 5 seconds (configurable)
  • Eye Tracking: 100-250ms (4-10 FPS)

Performance Characteristics

Processing Times

  • Screen Capture: ~10ms (1920x1080)
  • OCR Analysis: 2-4 seconds (full screen)
  • Importance Calculation: ~50ms
  • Gaze Detection: ~25ms per frame

Memory Usage

  • Base Application: ~50MB
  • With Eye Tracking: ~100MB
  • With Continuous Updates: ~150MB
  • Peak (during OCR): ~300MB

CPU Usage

  • Idle: 2-5% (single core)
  • Active Magnification: 5-10% (single core)
  • With Eye Tracking: 10-15% (single core)
  • During OCR Analysis: 25-50% (multi-core)

Installation and Setup

Prerequisites

  1. Python 3.8 or higher
  2. Webcam (for eye tracking)
  3. Tesseract OCR engine

Installation Steps

  1. Clone the repository:

    git clone <repository-url>
    cd context-magnifier
  2. Install Python dependencies:

    pip install PySide6 opencv-python numpy pillow pytesseract
  3. Install GazeTracking dependencies:

    cd facial_recognition/GazeTracking
    pip install -r requirements.txt
  4. Install Tesseract OCR and make sure the tesseract executable is on your system PATH (for example, brew install tesseract on macOS or sudo apt install tesseract-ocr on Debian/Ubuntu).

First Run

  1. Start the application:

    python main.py
  2. Configure settings (optional):

    • Adjust grid dimensions for your screen resolution
    • Modify importance weights based on your workflow
    • Set update intervals based on performance requirements
  3. Calibrate eye tracking (optional):

    • Enable eye tracking from the context menu
    • Follow the 13-point calibration process
    • Save calibration data for future sessions

Usage

Basic Operation

  1. Launch: Run python main.py from the project directory
  2. Magnify: The magnification window automatically follows important screen areas
  3. Configure: Right-click the magnification window for options
  4. Settings: Use the separate settings window for advanced configuration

Feature Controls

  • Right-click menu: Toggle eye tracking, importance mapping, and fixed positioning
  • Mouse wheel: Adjust zoom level (when magnification window is focused)
  • Settings window: Real-time parameter adjustment and calibration management

Advanced Features

  • Continuous Updates: Automatic importance map regeneration at configurable intervals
  • Multi-source Coordination: Intelligent blending of mouse, eye tracking, and importance data
  • Calibration Persistence: Save and load eye tracking calibration profiles

Troubleshooting

Common Issues

  1. OCR not working: Ensure Tesseract is installed and in system PATH
  2. Eye tracking fails: Check webcam permissions and lighting conditions
  3. High CPU usage: Reduce update frequencies or grid resolution
  4. Memory leaks: Restart application periodically during intensive use

Performance Optimization

  • Reduce grid dimensions for better performance
  • Increase update intervals for lower CPU usage
  • Disable continuous updates when not needed
  • Use fixed positioning mode for consistent performance

Development

Code Structure

context-magnifier/
├── main.py                 # Application entry point
├── coordinate_manager.py   # Core coordinate management
├── calibration_data.json  # Eye tracking calibration data
├── app/                   # GUI components
│   ├── main_window.py     # Settings interface
│   ├── zoom_window.py     # Magnification window
│   └── core.py            # Shared utilities
├── facial_recognition/    # Eye tracking system
│   ├── main.py           # Eye tracking implementation
│   ├── calibrate.py      # Calibration utilities
│   └── GazeTracking/     # Modified GazeTracking library
├── ocr/                  # Screen analysis
│   └── main.py          # OCR and importance calculation
├── fonts/               # Custom typography
└── assets/             # Static resources

Extension Points

  • Custom Importance Algorithms: Modify ocr/main.py scoring functions
  • Additional Input Sources: Extend CoordinateManager with new coordinate sources
  • UI Customization: Modify Qt stylesheets and layouts in app/ directory
  • Analysis Plugins: Add new content analysis modules to the OCR system
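
As an illustration of the second point, a new coordinate source could look roughly like this (CoordinateManager's real interface is not shown here, so treat the shape as a design sketch only):

    class JoystickSource:
        """Hypothetical extra input source returning (x, y) screen coordinates."""
        def read(self):
            return 960.0, 540.0   # would poll real hardware here

    def blended_target(sources_with_weights):
        """Weighted average over any number of coordinate sources."""
        total = sum(weight for _, weight in sources_with_weights)
        x = y = 0.0
        for source, weight in sources_with_weights:
            sx, sy = source.read()
            x += sx * weight
            y += sy * weight
        return x / total, y / total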

License and Attribution

This project incorporates modified components from the GazeTracking library and uses various open-source dependencies. Please refer to individual component licenses for specific terms and conditions.
