# 🧮 Handwritten Equation Solver - Google Colab Training

**High-accuracy CNN ensemble for handwritten arithmetic recognition**

Expected accuracy:
- Digit Recognition: 98-99%
- Operator Recognition: 95-98%
- Overall System: 92-98%

⚡ **GPU Recommended** - Training time: ~15-20 minutes on a GPU, 60+ minutes on CPU

## 📋 Step 1: Environment Setup

First, let's check GPU availability and install dependencies.

In [None]:
# Check GPU and system info
!nvidia-smi

import torch
print(f'PyTorch version: {torch.__version__}')
print(f'CUDA available: {torch.cuda.is_available()}')
if torch.cuda.is_available():
    print(f'GPU: {torch.cuda.get_device_name(0)}')
    print(f'GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB')
else:
    print('⚠️ No GPU detected. Training will be slower on CPU.')

In [None]:
# Install required packages
!pip install torch torchvision opencv-python-headless pillow numpy scipy scikit-image gradio requests torchmetrics

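# Spot-check that the core image libraries import (this does not verify every package above)
import cv2, numpy as np, PIL
print(f'✅ Imports OK: OpenCV {cv2.__version__}, NumPy {np.__version__}, Pillow {PIL.__version__}')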
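
The verification above imports only `cv2`, `numpy`, and `PIL`. As a minimal sketch of a fuller check, the helper below looks up every package from the install line without importing it. The name `missing_packages` is hypothetical; the mapping is needed because several pip names differ from their import names.

```python
import importlib.util

# pip distribution name -> importable module name (they differ for several packages)
PIP_TO_MODULE = {
    "torch": "torch",
    "torchvision": "torchvision",
    "opencv-python-headless": "cv2",
    "pillow": "PIL",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-image": "skimage",
    "gradio": "gradio",
    "requests": "requests",
    "torchmetrics": "torchmetrics",
}

def missing_packages(pip_to_module=PIP_TO_MODULE):
    """Return the pip names whose module cannot be found, without importing them."""
    return [
        pip_name
        for pip_name, module_name in pip_to_module.items()
        if importlib.util.find_spec(module_name) is None
    ]

missing = missing_packages()
print("Missing packages:", missing if missing else "none")
```

An empty result means every module can be located; it does not prove that each one imports without errors.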

## 📁 Step 2: Upload Project Files

Upload your project files or clone from GitHub.

In [None]:
# Option 1: Clone from GitHub (replace with your repo URL)
# !git clone https://github.com/your-username/handwritten-equation-solver.git
# %cd handwritten-equation-solver

# Option 2: Upload files manually
from google.colab import files
import zipfile
import os

print('📁 Upload your project ZIP file:')
uploaded = files.upload()

# Extract uploaded ZIP
for filename in uploaded.keys():
    if filename.endswith('.zip'):
        with zipfile.ZipFile(filename, 'r') as zip_ref:
            zip_ref.extractall('.')
        print(f'✅ Extracted {filename}')
        break

# Navigate to project directory
if os.path.exists('hes'):
    %cd hes
elif os.path.exists('handwritten-equation-solver'):
    %cd handwritten-equation-solver

!ls -la
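
The cell above assumes the archive unpacks into `hes` or `handwritten-equation-solver`. If your ZIP uses a different folder name, one option is to search for a known script instead. This is a sketch only: `find_project_root` is a hypothetical helper, and using `train_digits.py` as the marker file is an assumption based on the scripts used in later steps.

```python
import os

def find_project_root(start=".", marker="train_digits.py", max_depth=2):
    """Return the first directory under `start` (top-down walk) that contains `marker`."""
    start = os.path.abspath(start)
    base_depth = start.rstrip(os.sep).count(os.sep)
    for dirpath, dirnames, filenames in os.walk(start):
        if marker in filenames:
            return dirpath
        depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirnames[:] = []  # stop descending below max_depth
        else:
            # skip hidden directories and keep the visiting order deterministic
            dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
    return None

root = find_project_root()
if root:
    os.chdir(root)
    print(f"Project root: {root}")
else:
    print("Marker file not found; check the uploaded archive.")
```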

## 🏗️ Step 3: Create Project Structure

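Create the `models` and `data` directories. The cell is safe to run after an upload because existing directories are left untouched. Note that it creates directories only; the later steps still need the project scripts (`data_gen.py`, `train_digits.py`, `train_operators.py`, `predict.py`, `app.py`).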

In [None]:
# Create project structure
import os

# Create directories
os.makedirs('models', exist_ok=True)
os.makedirs('data/operators', exist_ok=True)
os.makedirs('data/mnist', exist_ok=True)
os.makedirs('data/emnist', exist_ok=True)

print('📁 Project structure created:')
!tree . -L 2 2>/dev/null || find . -maxdepth 2 -type d -name '.?*' -prune -o -type d -print | head -10

## 🎯 Step 4: Generate Training Data

Generate realistic handwritten-style operator data.

In [None]:
%%time
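# Generate operator training data
import os

# Load the generator script into the notebook namespace
# (it is expected to define generate_operators)
if os.path.exists('data_gen.py'):
    with open('data_gen.py') as f:
        exec(f.read())
else:
    print('⚠️ data_gen.py not found. Upload the project files in Step 2 first.')

# Generate data
try:
    generate_operators(samples_per_class=1000)
    print('✅ Operator data generated successfully')
except Exception as e:
    print(f'❌ Error generating data: {e}')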
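
The cell above runs `data_gen.py` with `exec`, which places its names directly in the notebook namespace. An alternative, sketched here with the standard `importlib` machinery, is to load the script as a module object. `load_script` is a hypothetical helper, and the commented usage assumes `data_gen.py` defines `generate_operators`, as the call above implies.

```python
import importlib.util
import os

def load_script(path, module_name=None):
    """Import a .py file as a module object; raise FileNotFoundError if it is missing."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{path} not found; upload the project files first")
    name = module_name or os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    # Runs the file with __name__ set to `name`, so '__main__' guards do not fire
    spec.loader.exec_module(module)
    return module

# Usage in the notebook (assumes data_gen.py defines generate_operators):
# data_gen = load_script('data_gen.py')
# data_gen.generate_operators(samples_per_class=1000)
```

The trade-off is that functions are reached through the module (`data_gen.generate_operators`) rather than as bare names, and a script that does its work only under `if __name__ == '__main__':` will not run when loaded this way.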

## 🚀 Step 5: Train Models

Train both the digit and operator recognition models with an ensemble approach.

In [None]:
%%time
# Train digit recognition model (EMNIST dataset)
import os
import traceback

print('🔢 Training digit recognition model...')

try:
    if not os.path.exists('train_digits.py'):
        raise FileNotFoundError('train_digits.py not found; upload the project files in Step 2')
    with open('train_digits.py') as f:
        exec(f.read())  # expected to define train_digits()
    train_digits()
    print('✅ Digit model training completed')
except Exception as e:
    print(f'❌ Digit training failed: {e}')
    traceback.print_exc()

In [None]:
%%time
# Train operator recognition model
import os
import traceback

print('➕ Training operator recognition model...')

try:
    if not os.path.exists('train_operators.py'):
        raise FileNotFoundError('train_operators.py not found; upload the project files in Step 2')
    with open('train_operators.py') as f:
        exec(f.read())  # expected to define train_operators()
    train_operators()
    print('✅ Operator model training completed')
except Exception as e:
    print(f'❌ Operator training failed: {e}')
    traceback.print_exc()

## 🧪 Step 6: Test the Models

Test the trained models on sample equations rendered with a standard font.

In [None]:
# Test the trained models
import os
import matplotlib.pyplot as plt
import numpy as np
import torch
from PIL import Image, ImageDraw, ImageFont

# Load solver
try:
    if not os.path.exists('predict.py'):
        raise FileNotFoundError('predict.py not found; upload the project files in Step 2')
    with open('predict.py') as f:
        exec(f.read())  # expected to define the Solver class
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    solver = Solver(device=device)
    print('✅ Solver loaded successfully')
except Exception as e:
    print(f'❌ Failed to load solver: {e}')
    solver = None

In [None]:
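# Create test images
def create_test_image(text, size=(200, 80)):
    """Render `text` in black on a white grayscale canvas, centered."""
    img = Image.new('L', size, color=255)
    draw = ImageDraw.Draw(img)

    # Try common font locations; fall back to Pillow's built-in font
    try:
        font = ImageFont.truetype('/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf', 40)
    except OSError:
        try:
            font = ImageFont.truetype('arial.ttf', 40)
        except OSError:
            font = ImageFont.load_default()

    # Center text. textbbox returns (left, top, right, bottom); left and top
    # can be non-zero, so subtract them to center the visible glyphs.
    bbox = draw.textbbox((0, 0), text, font=font)
    w, h = bbox[2] - bbox[0], bbox[3] - bbox[1]
    x = (size[0] - w) // 2 - bbox[0]
    y = (size[1] - h) // 2 - bbox[1]

    draw.text((x, y), text, fill=0, font=font)
    return img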

# Test equations
test_equations = ['2+3', '7-4', '5×2', '8÷2', '9+1']

if solver:
    fig, axes = plt.subplots(1, len(test_equations), figsize=(15, 3))
    
    for i, eq in enumerate(test_equations):
        # Create test image
        test_img = create_test_image(eq)
        
        # Predict
        try:
            expression, result = solver.predict_image(test_img)
            title = f'Input: {eq}\nDetected: {expression}\nResult: {result}'
        except Exception as e:
            title = f'Input: {eq}\nError: {str(e)[:20]}'
        
        # Display
        axes[i].imshow(test_img, cmap='gray')
        axes[i].set_title(title, fontsize=8)
        axes[i].axis('off')
    
    plt.tight_layout()
    plt.show()
else:
    print('❌ Cannot test - solver not loaded')
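
`solver.predict_image` returns both the detected expression and its numeric result, but the notebook does not show how the result is computed. As a rough sketch of one safe way to evaluate a recognized string, the hypothetical helper below parses it with `ast` and allows only basic arithmetic. It is an illustration, not the project's `Solver` logic.

```python
import ast
import operator

# Map the symbols used in the test equations to Python operators
SYMBOLS = {"×": "*", "÷": "/"}
ALLOWED = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate_expression(text):
    """Evaluate a string such as '5×2' using only +, -, *, / on numeric literals."""
    for symbol, replacement in SYMBOLS.items():
        text = text.replace(symbol, replacement)

    def _eval(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED:
            return ALLOWED[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported expression: {text!r}")

    return _eval(ast.parse(text, mode="eval").body)

print(evaluate_expression("5×2"))  # 10
print(evaluate_expression("8÷2"))  # 4.0
```

Walking the syntax tree instead of calling `eval` means a misrecognized string cannot execute arbitrary code; anything other than numbers and the four operators raises `ValueError`.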

## 🌐 Step 7: Launch Web Interface

Launch Gradio interface for interactive testing.

In [None]:
# Launch Gradio interface
import os

try:
    if not os.path.exists('app.py'):
        raise FileNotFoundError('app.py not found; upload the project files in Step 2')
    print('🌐 Launching web interface...')
    with open('app.py') as f:
        exec(f.read())  # app.py is expected to build and launch the interface
except Exception as e:
    print(f'❌ Failed to launch interface: {e}')
    print('You can still download the models and use them locally.')

## 📥 Step 8: Download Trained Models

Download the trained models to use locally.

In [None]:
# Check available models
import os

print('📁 Available model files:')
model_files = []
for file in os.listdir('models'):
    if file.endswith(('.pth', '.json')):
        size = os.path.getsize(f'models/{file}') / (1024*1024)
        print(f'  {file} ({size:.1f} MB)')
        model_files.append(f'models/{file}')

print(f'\n📊 Total model files: {len(model_files)}')

In [None]:
# Download models
from google.colab import files
import zipfile

# Create ZIP of all models
with zipfile.ZipFile('trained_models.zip', 'w') as zipf:
    for file in model_files:
        if os.path.exists(file):
            zipf.write(file, os.path.basename(file))
            print(f'✅ Added {file} to ZIP')

print('\n📦 Downloading trained models...')
files.download('trained_models.zip')

print('✅ Download started. Check your browser downloads.')

## 📊 Training Summary

### Models Trained:
- **Digit Recognition**: EfficientCNN on EMNIST dataset
- **Operator Recognition**: EfficientCNN on synthetic handwritten data
- **Ensemble Models**: Multiple architectures for higher accuracy

### Expected Performance:
- **Digit Accuracy**: 98-99%
- **Operator Accuracy**: 95-98%
- **Overall System**: 92-98%
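
As a rough sanity check on how the per-symbol figures relate to whole-expression accuracy, the sketch below multiplies them for a three-symbol equation such as `2+3`. It assumes independent errors and ignores segmentation and any correction stage, so it is not how the figures above were derived.

```python
def expression_accuracy(digit_acc, operator_acc, n_digits=2, n_operators=1):
    """Probability that every symbol is right, assuming independent per-symbol errors."""
    return (digit_acc ** n_digits) * (operator_acc ** n_operators)

low = expression_accuracy(0.98, 0.95)   # pessimistic end of both ranges
high = expression_accuracy(0.99, 0.98)  # optimistic end of both ranges
print(f"{low:.1%} to {high:.1%}")       # 91.2% to 96.0%
```

Post-processing such as the grammar correction listed under Key Features could lift the real figure above this naive estimate, while segmentation errors could pull it below.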

### Key Features:
- Stroke Width Transform segmentation
- Ensemble + Test-Time Augmentation
- Context-aware prediction
- Grammar correction
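
To illustrate the "Ensemble + Test-Time Augmentation" idea in general terms, the sketch below averages class probabilities over every combination of model and augmented view. Everything here is hypothetical: models are plain callables that return a list of probabilities, and the stub models stand in for trained networks. The project's actual implementation may differ.

```python
def average_probabilities(prob_vectors):
    """Element-wise mean of equally long probability vectors."""
    if not prob_vectors:
        raise ValueError("need at least one probability vector")
    n = len(prob_vectors)
    return [sum(column) / n for column in zip(*prob_vectors)]

def ensemble_tta_predict(models, image, augmentations, labels):
    """Average class probabilities over every (model, augmented view) pair."""
    votes = [model(augment(image)) for model in models for augment in augmentations]
    mean = average_probabilities(votes)
    best = max(range(len(mean)), key=mean.__getitem__)
    return labels[best], mean[best]

# Stub models standing in for trained networks (illustrative values only)
labels = ["+", "-"]
model_a = lambda img: [0.6, 0.4]
model_b = lambda img: [0.2, 0.8]
identity = lambda img: img  # the simplest "augmentation": leave the image unchanged

label, confidence = ensemble_tta_predict([model_a, model_b], None, [identity], labels)
print(label, round(confidence, 2))  # - 0.6
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh an uncertain one, as in the stub example where `model_b` decides the outcome.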

### Usage:
1. Download the models
2. Use with the Gradio interface or integrate into your application
3. For local use: `python app.py`
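
For the first usage step, note that Step 8 stores the files in `trained_models.zip` without their `models/` prefix. A minimal sketch for restoring them locally follows; `restore_models` is a hypothetical helper, and it assumes `app.py` looks for weights in a `models` directory next to it, as in the notebook layout.

```python
import os
import zipfile

def restore_models(archive="trained_models.zip", target_dir="models"):
    """Extract the .pth and .json files from the flat archive into `target_dir`."""
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        names = [n for n in zf.namelist() if n.endswith((".pth", ".json"))]
        zf.extractall(target_dir, members=names)
    return sorted(names)

# Usage, run from the project directory before `python app.py`:
# print(restore_models())
```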