# Credit OCR System - Application Setup

This notebook demonstrates the simplest way to start and use the Credit OCR System.

## Overview

The Credit OCR System starts with a single command that takes care of:
- Infrastructure setup (Docker containers)
- AI model downloads
- Service orchestration
- Web interface launch


## 1. Prerequisites Check

Before starting, let's verify the prerequisites are available.


In [None]:
import subprocess
import sys
from pathlib import Path

# Locate the project root: either the current directory or its parent
project_root = Path.cwd()
if project_root.name != 'credit-ocr-system':
    project_root = project_root.parent
    if project_root.name != 'credit-ocr-system':
        # Raise instead of calling sys.exit(), which is meant for scripts rather than notebook kernels
        raise RuntimeError("Please run this notebook from the credit-ocr-system directory")

print(f"Project root: {project_root}")

# Check Docker
try:
    result = subprocess.run(['docker', '--version'], capture_output=True, text=True)
    if result.returncode == 0:
        print(f"Docker: {result.stdout.strip()}")
    else:
        print("Docker not available")
except FileNotFoundError:
    print("Docker not found")

# Check Python version
python_version = sys.version_info
if python_version >= (3, 9):
    print(f"Python: {python_version.major}.{python_version.minor}")
else:
    print(f"Python 3.9+ required, found {python_version.major}.{python_version.minor}")

# Check if startup script exists
startup_script = project_root / "start_credit_ocr.py"
if startup_script.exists():
    print(f"Startup script: {startup_script}")
else:
    print("start_credit_ocr.py not found")


## 2. Start the System

**Note**: The first run may take 5-10 minutes while the AI model downloads. Option 1 keeps the system attached to your terminal; Option 2 starts it as a background process from the notebook.

**Option 1: Start from Terminal** (Recommended)
```bash
python3 start_credit_ocr.py
```

**Option 2: Start from Notebook** (Advanced)
To start from the notebook instead, uncomment the code in the cell below and run it:


In [None]:
# UNCOMMENT THE LINES BELOW TO START FROM NOTEBOOK
# WARNING: This will run in the background and may be harder to stop

# import os
# import subprocess
# import sys

# print("Starting Credit OCR System...")
# print("This may take 5-10 minutes on first run (downloading AI model)")
# print("")

# # Change to project root
# os.chdir(project_root)

# # Start the system. Output goes to a log file rather than an unread pipe,
# # which could fill up and block the child process.
# # The log file name is an arbitrary choice.
# log_file = open("start_credit_ocr.log", "w")
# process = subprocess.Popen(
#     [sys.executable, "start_credit_ocr.py"],
#     stdout=log_file,
#     stderr=subprocess.STDOUT,
# )

# print("System is starting in the background...")
# print("Check start_credit_ocr.log or run the next cell to verify it's ready.")

print("For best experience, start the system from terminal:")
print("   python3 start_credit_ocr.py")
print("")
print("Then continue with the next cells to verify it's running.")
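
Before starting, it can help to confirm that nothing else is already listening on the port the system uses. The sketch below assumes the default address `127.0.0.1:8000` used by the verification cell in this notebook; the helper name `port_in_use` is hypothetical.

```python
import socket


def port_in_use(host: str = "127.0.0.1", port: int = 8000, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; not part of the Credit OCR System.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout_s)
        # connect_ex() returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if port_in_use():
    print("Port 8000 is already in use: the system may already be running.")
else:
    print("Port 8000 is free.")
```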


## 3. Verify System is Running

Let's check if the system is up and running by testing the health endpoint.


In [None]:
import requests
import time

API_BASE_URL = "http://127.0.0.1:8000"

def check_system_health():
    """Check if the system is running and healthy."""
    try:
        response = requests.get(f"{API_BASE_URL}/api/v1/health", timeout=5)
        if response.status_code == 200:
            health_data = response.json()
            print(f"System Status: {health_data.get('status', 'unknown')}")
            print("\nService Status:")
            services = health_data.get('services', {})
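            for service, status in services.items():
                # Test for "unhealthy" first: it contains the substring "healthy"
                status_text = str(status).lower()
                if "unhealthy" in status_text:
                    status_icon = "ERROR"
                elif "healthy" in status_text:
                    status_icon = "OK"
                elif "starting" in status_text or "downloading" in status_text:
                    status_icon = "STARTING"
                else:
                    status_icon = "ERROR"
                print(f"   {status_icon} {service}: {status}")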
            return health_data.get('status') == 'healthy'
        else:
            print(f"Health check failed: HTTP {response.status_code}")
            return False
    except requests.exceptions.ConnectionError:
        print("Cannot connect to the system. Is it running?")
        return False
    except Exception as e:
        print(f"Health check error: {e}")
        return False

# Check system health
is_healthy = check_system_health()

if is_healthy:
    print("\nSystem is ready!")
    print(f"Web Interface: {API_BASE_URL}/")
    print(f"API Docs: {API_BASE_URL}/docs")
else:
    print("\nSystem may still be starting up...")
    print("If you just started it, wait a few minutes and run this cell again.")
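
Rather than re-running the cell by hand, the health check can be polled until it succeeds or a deadline passes. This is a minimal sketch: `wait_until_healthy` is a hypothetical helper, and the 10-minute default timeout is an assumption based on the first-run download estimate given earlier.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout_s: float = 600.0,
    interval_s: float = 10.0,
) -> bool:
    """Call check() repeatedly until it returns True or timeout_s elapses.

    Hypothetical helper for illustration; not part of the Credit OCR System.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        # Give up if another sleep would take us past the deadline
        if time.monotonic() + interval_s > deadline:
            return False
        time.sleep(interval_s)


# Example with the health check defined in this notebook:
# wait_until_healthy(check_system_health)
```

Passing the check as a callable keeps the loop independent of how health is determined, so the same helper works with any function that returns `True` once the system is ready.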


## 4. Open Web Interface

Once the system is healthy, you can access the web interface.


In [None]:
import webbrowser

# Open the web interface
web_url = f"{API_BASE_URL}/"
print(f"Opening web interface: {web_url}")

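try:
    # webbrowser.open() usually reports failure by returning False, not by raising
    opened = webbrowser.open(web_url)
except Exception as e:
    print(f"Could not open browser automatically: {e}")
    opened = False

if opened:
    print("Web browser should open automatically")
else:
    print(f"Please manually open: {web_url}")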

print("\nIn the web interface you can:")
print("   • Upload PDF documents")
print("   • Monitor processing progress")
print("   • View extracted fields")
print("   • See OCR visualizations")
print("   • Check service status")


## 5. Next Steps

**Congratulations!** Your Credit OCR System is now running.

### What to do next:

1. **Use the Web Interface** at http://127.0.0.1:8000/
   - Upload PDF documents
   - Watch real-time processing
   - View extracted data and visualizations

2. **Explore the API** at http://127.0.0.1:8000/docs
   - Interactive API documentation
   - Test endpoints directly

### Stopping the System:

When you're done, stop the system by pressing `Ctrl+C` in the terminal where you started it.
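
If the system was started from the notebook instead, there is no terminal to press `Ctrl+C` in. A minimal sketch for stopping a `subprocess.Popen` object such as the `process` variable from the optional start cell is shown below. `stop_process` is a hypothetical helper. Note that `terminate()` sends SIGTERM on POSIX rather than the SIGINT that `Ctrl+C` sends, and whether `start_credit_ocr.py` shuts down its containers cleanly on that signal is an assumption worth verifying.

```python
import subprocess


def stop_process(process: subprocess.Popen, timeout_s: float = 30.0) -> int:
    """Ask the process to exit, force-kill it if it does not, and return its exit code.

    Hypothetical helper for illustration; not part of the Credit OCR System.
    """
    if process.poll() is None:  # still running
        process.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
        try:
            process.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            process.kill()  # last resort if the process ignores the request
            process.wait()
    return process.returncode


# Example with the object created in the optional start cell:
# stop_process(process)
```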

---

**Happy document processing!**
