Skip to content

BackProAI/ui_version_value_creator

Repository files navigation

Document Processor Pro - AI-Powered Document Analysis & Modification

A state-of-the-art GUI application that uses OpenAI's GPT-4o vision model to analyze handwritten annotations on PDF documents and intelligently apply those changes to Word documents.

πŸš€ Key Features

Advanced AI Analysis

  • GPT-4o Vision Integration: Sophisticated analysis of handwritten annotations, markings, and corrections
  • Intelligent Relationship Detection: Identifies connections between arrows, strikethroughs, and corrections
  • Context-Aware Processing: Understands the intent behind different types of markings

Smart Document Modification

  • Handwriting Recognition: Converts handwritten text to digital text and places it appropriately
  • Strikethrough Processing: Detects crossed-out text and applies replacements or deletions
  • Cross Mark Analysis: Distinguishes between small item deletions and large section removals
  • Arrow Following: Tracks arrows to understand correction relationships
  • Proximity Analysis: Uses spatial relationships to determine annotation intent

Modern GUI Interface

  • Intuitive Design: Clean, professional interface with real-time preview
  • Progress Tracking: Live progress bars and detailed logging
  • Results Visualization: Comprehensive analysis results with confidence scoring
  • Batch Processing: Handle multiple documents efficiently
  • Export Options: Save results in multiple formats (Word, Excel, JSON)

πŸ› οΈ Quick Start

Option 1: Easy Windows Launch

  1. Double-click start.bat to launch the application
  2. The setup wizard will guide you through initial configuration
  3. Add your OpenAI API key when prompted

Option 2: Manual Setup

  1. Install Python 3.8+ from python.org

  2. Install Dependencies:

    pip install -r requirements.txt
  3. Configure API Key:

    • Edit .env file and add your OpenAI API key:
    OPENAI_API_KEY=your-actual-api-key-here
    
  4. Launch Application:

    python launcher.py

πŸ“‹ How It Works

Step 1: Document Analysis

  1. Upload PDF: Select the PDF with handwritten annotations
  2. Upload Word Document: Choose the Word document to modify
  3. Configure Settings: Adjust detection settings and confidence thresholds

Step 2: AI Processing

  1. PDF Chunking: Breaks PDF into analyzable image chunks
  2. GPT-4o Analysis: Each chunk is analyzed for:
    • Handwritten text and corrections
    • Strikethrough text and deletions
    • Cross marks (small deletions vs. large section removal)
    • Arrows connecting annotations to text
    • Highlighting and emphasis marks
    • Margin notes and annotations

Step 3: Intelligent Matching

  1. Text Matching: AI matches PDF annotations to Word document content
  2. Relationship Analysis: Determines relationships between markings:
    • Handwriting near strikethrough = replacement
    • Arrows pointing from corrections to text
    • Large crosses = delete entire paragraphs
    • Small crosses = delete specific items

Step 4: Document Modification

  1. Smart Application: Applies changes based on detected intent:
    • Replace: Strikethrough text replaced with handwritten corrections
    • Delete: Crossed-out sections removed completely
    • Insert: Handwritten additions placed appropriately
    • Highlight: Important sections emphasized
    • Comment: Uncertain annotations added as reviewable comments

🎯 Use Cases

  • Document Review: Apply handwritten feedback to digital documents
  • Manuscript Editing: Convert paper edits to digital format
  • Form Processing: Digitize handwritten form corrections
  • Contract Modification: Apply legal annotations and changes
  • Academic Papers: Process reviewer comments and corrections
  • Business Documents: Convert meeting notes and annotations

πŸ“ Project Structure

β”œβ”€β”€ gui_main.py                    # Main GUI application
β”œβ”€β”€ launcher.py                    # Setup wizard and launcher
β”œβ”€β”€ document_parser.py             # Core document processing engine
β”œβ”€β”€ chunk_analyzer.py              # GPT-4o vision analysis
β”œβ”€β”€ advanced_word_processor.py     # Intelligent Word document modification
β”œβ”€β”€ document_preprocessor.py       # PDF to image conversion and chunking
β”œβ”€β”€ word_processor.py              # Basic Word document operations
β”œβ”€β”€ config.yaml                    # Configuration settings
β”œβ”€β”€ .env                          # API keys and environment variables
β”œβ”€β”€ requirements.txt              # Python dependencies
β”œβ”€β”€ setup.py                      # Initial setup script
β”œβ”€β”€ example_usage.py              # Usage examples
β”œβ”€β”€ start.bat                     # Windows launcher script
└── README.md                     # This file

βš™οΈ Configuration Options

Analysis Settings (config.yaml)

analysis:
  detect_handwriting: true          # Detect handwritten text
  detect_strikethrough: true        # Detect crossed-out text
  detect_crosses: true              # Detect X marks and crosses
  detect_arrows: true               # Detect arrows and connections
  detect_highlighting: true         # Detect highlighted text
  confidence_threshold: 0.7         # Minimum confidence for detections

Document Processing

document:
  chunk_size: 512                   # Image chunk size in pixels
  overlap: 64                       # Overlap between chunks
  supported_formats: [".pdf", ".png", ".jpg", ".jpeg", ".tiff"]

Word Document Output

word_processing:
  preserve_formatting: true         # Keep original formatting
  highlight_detected_items: true    # Highlight changes in output
  add_comments: true               # Add comments for uncertain items
  track_changes: false             # Enable Word change tracking

πŸ”§ Advanced Features

Custom AI Prompts

The system uses sophisticated prompts to guide GPT-4o analysis:

  • Relationship detection between markings
  • Size-based deletion scope determination
  • Context-aware text placement
  • Confidence-based action selection

Batch Processing

  • Process multiple PDF-Word document pairs
  • Consistent settings across all documents
  • Detailed processing reports
  • Error handling and recovery

Export Options

  • Modified Word Documents: Final documents with applied changes
  • Analysis Reports: Detailed breakdown of all detections
  • Excel Summaries: Tabular data for further analysis
  • JSON Results: Raw analysis data for custom processing

🚨 Requirements

  • Python: 3.8 or higher
  • OpenAI API Key: GPT-4o vision access required
  • Operating System: Windows 10/11 (primary), macOS, Linux
  • Memory: 4GB RAM minimum, 8GB recommended
  • Storage: 1GB free space for processing

πŸ’‘ Tips for Best Results

  1. High-Quality PDFs: Use PDFs with clear, readable text and annotations
  2. Consistent Handwriting: Clear, legible handwriting improves accuracy
  3. Good Contrast: Ensure annotations are clearly visible against the background
  4. Logical Annotations: Use consistent marking patterns (arrows, crosses, etc.)
  5. API Limits: Be mindful of OpenAI API usage limits for large documents

πŸ› Troubleshooting

Common Issues

  • API Key Error: Verify your OpenAI API key in the .env file
  • Dependencies Missing: Run pip install -r requirements.txt
  • PDF Processing Error: Ensure PDF is not password-protected
  • Memory Error: Reduce chunk size in configuration for large documents

Getting Help

  1. Check the processing log in the GUI for detailed error information
  2. Verify all dependencies are properly installed
  3. Ensure your OpenAI API key has GPT-4o vision access
  4. Try processing a smaller test document first

πŸ“œ License

This project is licensed under the MIT License - see the LICENSE file for details.

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


Document Processor Pro - Bridging the gap between handwritten annotations and digital documents with AI-powered intelligence.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published