# Notebook 06: Integration Experiments

This notebook integrates all components into a complete end-to-end allergen detection pipeline and tests the system comprehensively.

**Pipeline Components:**
- SimpleOCREngine (text extraction)
- Trained NER Model (allergen entity recognition)
- Allergen Dictionary (synonym mapping)
- Detection confidence scoring

**Experiments:**
1. End-to-end pipeline integration
2. Sample image testing
3. Error analysis and categorization
4. Performance optimization
5. Batch processing benchmarks
6. Comprehensive detection reports

## Prerequisites

**Required:**
- Completed Notebook 02 (OCR preprocessing)
- Completed Notebook 04 (NER model training)
- Trained model in `models/ner_model/`
- Test images in `data/raw/` or `data/ocr_results/test_samples/`

**Dependencies:**
- transformers, torch, easyocr, opencv-python (imported as `cv2`), pandas, numpy, Pillow, tqdm, matplotlib, seaborn

Run cells in sequential order.
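
The prerequisites above can be checked before running anything else. The snippet below is a minimal sketch, assuming the directory layout listed above relative to the project root; `check_prerequisites` is a hypothetical helper and not part of the project's source code.

```python
from pathlib import Path


def check_prerequisites(root: Path) -> dict:
    """Report which of the expected inputs exist under the project root.

    The paths mirror the prerequisites listed above; the helper itself is
    a hypothetical convenience, not part of the project's source code.
    """
    expected = {
        "ner_model": root / "models" / "ner_model",
        "raw_images": root / "data" / "raw",
        "test_samples": root / "data" / "ocr_results" / "test_samples",
    }
    return {name: path.exists() for name, path in expected.items()}


# Assumes the current working directory is the project root.
status = check_prerequisites(Path.cwd())
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Only one of the two image locations needs to exist, so a single `missing` entry for `raw_images` or `test_samples` is not necessarily a problem.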

## Section A: Setup and Environment Configuration

In [None]:
# Step 1: Setup paths and environment

import sys
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Detect environment and set paths
ROOT = Path.cwd()
if ROOT.name == "notebooks":
    ROOT = ROOT.parent

SRC = ROOT / "src"
DATA_DIR = ROOT / "data"
MODELS_DIR = ROOT / "models"
RESULTS_DIR = ROOT / "results"

# Add src to path for imports
if str(SRC) not in sys.path:
    sys.path.insert(0, str(SRC))

print("✓ Environment setup complete")
print(f"Root directory: {ROOT}")
print(f"Source code: {SRC}")
print(f"Data directory: {DATA_DIR}")
print(f"Models directory: {MODELS_DIR}")

In [None]:
# Step 2: Import required libraries

import json
import torch
import numpy as np
import pandas as pd
from PIL import Image
import cv2
from transformers import AutoTokenizer, AutoModelForTokenClassification
from typing import List, Dict, Tuple
import time
from tqdm.auto import tqdm
from collections import defaultdict
import matplotlib.pyplot as plt
import seaborn as sns

# Local imports
from ocr.simple_ocr_engine import SimpleOCREngine

print("✓ Libraries imported successfully")
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

## Step 3: Load All Pipeline Components

Load the trained NER model, OCR engine, allergen dictionary, and configure the complete detection pipeline.
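
The loading cells in this step expect two JSON files. Their exact contents are not shown in this notebook, so the snippet below is only a sketch of the shapes the code appears to assume: a dictionary mapping each allergen type to a list of synonyms, and a label mapping with an `id2label` section. The entries and label names (`B-ALLERGEN`, `I-ALLERGEN`) are hypothetical placeholders.

```python
import json

# Hypothetical file contents; the real files may hold different entries.
example_dictionary_json = '{"milk": ["milk", "whey", "casein"], "egg": ["egg", "albumin"]}'
example_mapping_json = """
{
  "id2label": {"0": "O", "1": "B-ALLERGEN", "2": "I-ALLERGEN"},
  "label2id": {"O": 0, "B-ALLERGEN": 1, "I-ALLERGEN": 2}
}
"""

example_dictionary = json.loads(example_dictionary_json)
example_mapping = json.loads(example_mapping_json)

# JSON object keys are always strings, so the ids must be converted back to int.
example_id2label = {int(k): v for k, v in example_mapping["id2label"].items()}
# Inverting id2label avoids depending on how "label2id" is laid out in the file.
example_label2id = {label: idx for idx, label in example_id2label.items()}

print(example_id2label)
print(example_label2id)
```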

In [None]:
# Step 3a: Load allergen dictionary and label mapping

# Load allergen dictionary
allergen_dict_path = DATA_DIR / "allergen_dictionary.json"
with open(allergen_dict_path, 'r') as f:
    allergen_dictionary = json.load(f)

print(f"✓ Loaded allergen dictionary with {len(allergen_dictionary)} allergen types")
print(f"  Allergen types: {', '.join(allergen_dictionary.keys())}")

# Load label mapping
label_mapping_path = DATA_DIR / "ner_training" / "label_mapping.json"
with open(label_mapping_path, 'r') as f:
    label_mapping = json.load(f)

id2label = {int(k): v for k, v in label_mapping["id2label"].items()}
# Invert id2label so the result is label -> id regardless of how "label2id" is stored in the file
label2id = {label: idx for idx, label in id2label.items()}

print(f"✓ Loaded label mapping with {len(id2label)} labels")
print(f"  Labels: {list(id2label.values())}")

In [None]:
# Step 3b: Load trained NER model and tokenizer

model_path = MODELS_DIR / "ner_model"

if not model_path.exists():
    print("⚠️  Model not found at models/ner_model/")
    print("   Checking experiments folder...")
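    experiments_dir = MODELS_DIR / "experiments"
    # Recent transformers releases save weights as model.safetensors by default,
    # so look for both file names.
    experiment_models = sorted(
        list(experiments_dir.glob("**/pytorch_model.bin"))
        + list(experiments_dir.glob("**/model.safetensors"))
    )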
    if experiment_models:
        model_path = experiment_models[0].parent
        print(f"   Using model from: {model_path}")
    else:
        raise FileNotFoundError("No trained model found. Please run Notebook 04 first.")

try:
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForTokenClassification.from_pretrained(model_path)
    
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)
    model.eval()
    
    print(f"✓ Loaded NER model from: {model_path}")
    print(f"  Model device: {device}")
    print(f"  Number of labels: {model.config.num_labels}")
    
except Exception as e:
    print(f"❌ Error loading model: {e}")
    raise

In [None]:
# Step 3c: Initialize OCR engine

try:
    ocr_engine = SimpleOCREngine(lang_list=["en"], gpu=torch.cuda.is_available())
    print("✓ OCR engine initialized")
    print("  Languages: English")
    print(f"  GPU enabled: {torch.cuda.is_available()}")
except Exception as e:
    print(f"⚠️  Error initializing OCR: {e}")
    print("  OCR will be skipped, using ground truth text instead.")

## Section B: Build Integration Pipeline

Create the complete end-to-end allergen detection pipeline that integrates all components.
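
The NER step in this section relies on BIO tagging: a `B-` label opens an entity, following `I-` labels extend it, and anything else closes it. The snippet below is a simplified word-level sketch of that decoding, without a model or character offsets. The label name `ALLERGEN` is a placeholder, and the check that an `I-` tag matches the open entity's type is an addition that the notebook's own helper does not perform.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge B-/I- tagged tokens into (entity_text, entity_type) spans."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A B- tag closes any open span and starts a new one.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], label[2:]
        elif label.startswith("I-") and current_tokens and label[2:] == current_type:
            # An I- tag extends the open span only if the type matches.
            current_tokens.append(token)
        else:
            # "O" (or an orphaned I- tag) closes the open span.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    # Flush a span that runs to the end of the sequence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


tokens = ["contains", "soy", "lecithin", "and", "milk"]
labels = ["O", "B-ALLERGEN", "I-ALLERGEN", "O", "B-ALLERGEN"]
print(decode_bio(tokens, labels))
# → [('soy lecithin', 'ALLERGEN'), ('milk', 'ALLERGEN')]
```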

In [None]:
# Step 4: Define helper functions for text processing and NER

def clean_text(text: str) -> str:
    """Clean and normalize extracted text."""
    if not text:
        return ""
    
    # Remove excessive whitespace
    text = ' '.join(text.split())
    
    # Remove underscores (common OCR artifact)
    text = text.replace('_', '')
    
    return text.strip()


def run_ner_prediction(text: str, tokenizer, model, device) -> List[Tuple[str, str, float]]:
    """
    Run NER model on text and return detected entities.
    
    Returns:
        List of (entity_text, label, confidence) tuples
    """
    if not text or len(text.strip()) < 3:
        return []
    
    # Tokenize
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
        padding=True,
        return_offsets_mapping=True
    )
    
    offset_mapping = inputs.pop("offset_mapping")[0].numpy()
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    # Predict
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits[0]
        probs = torch.softmax(logits, dim=-1)
        predictions = torch.argmax(logits, dim=-1).cpu().numpy()
        confidences = probs.max(dim=-1).values.cpu().numpy()
    
    # Extract entities
    entities = []
    current_entity = None
    current_label = None
    current_start = None
    current_end = None
    current_conf = []
    
    for idx, (pred, conf, (start, end)) in enumerate(zip(predictions, confidences, offset_mapping)):
        # Skip special tokens
        if start == end:
            continue
        
        label = id2label[pred]
        
        # Check if this is an allergen entity
        if label.startswith('B-'):
            # Save previous entity if exists
            if current_entity:
                entities.append((
                    current_entity,
                    current_label,
                    float(np.mean(current_conf))
                ))
            
            # Start new entity
            current_entity = text[start:end]
            current_label = label[2:]  # Remove B- prefix
            current_start = start
            current_end = end
            current_conf = [conf]
            
        elif label.startswith('I-') and current_entity:
            # Continue current entity
            current_end = end
            current_entity = text[current_start:current_end]
            current_conf.append(conf)
        
        else:
            # Not an entity - save current if exists
            if current_entity:
                entities.append((
                    current_entity,
                    current_label,
                    float(np.mean(current_conf))
                ))
                current_entity = None
                current_label = None
                current_conf = []
    
    # Don't forget last entity
    if current_entity:
        entities.append((
            current_entity,
            current_label,
            float(np.mean(current_conf))
        ))
    
    return entities


def map_to_standard_allergens(entities: List[Tuple[str, str, float]], allergen_dict: Dict) -> Dict:
    """
    Map detected entities to standard allergen categories.
    
    Returns:
        Dict with detected allergens, confidence, and details
    """
    detected_allergens = defaultdict(list)
    
    for entity_text, label, confidence in entities:
        entity_lower = entity_text.lower().strip()
        
        # Try to match against dictionary
        matched = False
        for allergen_type, synonyms in allergen_dict.items():
            for synonym in synonyms:
                if synonym.lower() in entity_lower or entity_lower in synonym.lower():
                    detected_allergens[allergen_type].append({
                        'text': entity_text,
                        'confidence': confidence,
                        'label': label
                    })
                    matched = True
                    break
            if matched:
                break
        
        # If no match, still include under "unknown"
        if not matched and label != 'O':
            detected_allergens['unknown'].append({
                'text': entity_text,
                'confidence': confidence,
                'label': label
            })
    
    return dict(detected_allergens)

print("✓ Helper functions defined")
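
The dictionary lookup in `map_to_standard_allergens` uses substring containment in both directions, so a synonym such as "egg" also matches an entity like "eggplant". One possible alternative, sketched below under the assumption that synonyms should match only as whole words or phrases, uses regular-expression word boundaries. `match_allergen` and the example dictionary are hypothetical and not part of the pipeline.

```python
import re
from typing import Dict, List, Optional


def match_allergen(entity_text: str, allergen_dict: Dict[str, List[str]]) -> Optional[str]:
    """Return the first allergen type whose synonym appears as a whole word or phrase."""
    entity_lower = entity_text.lower().strip()
    for allergen_type, synonyms in allergen_dict.items():
        for synonym in synonyms:
            # \b anchors keep "egg" from matching inside "eggplant".
            pattern = r"\b" + re.escape(synonym.lower()) + r"\b"
            if re.search(pattern, entity_lower):
                return allergen_type
    return None


example_dict = {"egg": ["egg", "albumin"], "milk": ["milk", "whey"]}
print(match_allergen("Egg yolk", example_dict))     # → egg
print(match_allergen("eggplant", example_dict))     # → None
print(match_allergen("whey powder", example_dict))  # → milk
```

The trade-off is that inflected forms such as "eggs" no longer match unless they are listed as synonyms, so the dictionary would need to carry plurals explicitly.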

In [None]:
# Step 5: Build complete end-to-end pipeline

def detect_allergens_from_image(
    image_path: str,
    use_ocr: bool = True,
    ground_truth_text: str = None,
    verbose: bool = False
) -> Dict:
    """
    Complete end-to-end allergen detection pipeline.
    
    Args:
        image_path: Path to product image
        use_ocr: Whether to run OCR (False = use ground_truth_text)
        ground_truth_text: Ground truth text if use_ocr=False
        verbose: Print detailed progress
    
    Returns:
        Dict with detection results, timing, and confidence
    """
    results = {
        'image_path': str(image_path),
        'success': False,
        'error': None,
        'timings': {},
        'raw_text': '',
        'cleaned_text': '',
        'entities_found': [],
        'detected_allergens': {},
        'total_allergens': 0,
        'avg_confidence': 0.0
    }
    
    try:
        # Step 1: Extract text (OCR or ground truth)
        t0 = time.time()
        
        if use_ocr and 'ocr_engine' in globals():
            if verbose:
                print(f"Running OCR on {Path(image_path).name}...")
            
            img = cv2.imread(str(image_path))
            if img is None:
                raise ValueError(f"Could not load image: {image_path}")
            
            raw_text = ocr_engine.extract(img)
            results['timings']['ocr'] = time.time() - t0
            
        else:
            if ground_truth_text is None:
                raise ValueError("use_ocr=False requires ground_truth_text")
            raw_text = ground_truth_text
            results['timings']['ocr'] = 0.0
        
        results['raw_text'] = raw_text
        
        if verbose:
            print(f"  Text extracted ({len(raw_text)} chars)")
        
        # Step 2: Clean text
        t0 = time.time()
        cleaned_text = clean_text(raw_text)
        results['cleaned_text'] = cleaned_text
        results['timings']['cleaning'] = time.time() - t0
        
        if not cleaned_text or len(cleaned_text) < 3:
            results['error'] = 'Text too short or empty'
            return results
        
        # Step 3: Run NER prediction
        t0 = time.time()
        entities = run_ner_prediction(cleaned_text, tokenizer, model, device)
        results['entities_found'] = entities
        results['timings']['ner'] = time.time() - t0
        
        if verbose:
            print(f"  Found {len(entities)} entities")
        
        # Step 4: Map to standard allergens
        t0 = time.time()
        detected_allergens = map_to_standard_allergens(entities, allergen_dictionary)
        results['detected_allergens'] = detected_allergens
        results['timings']['mapping'] = time.time() - t0
        
        # Step 5: Calculate summary statistics
        all_confidences = []
        for allergen_type, detections in detected_allergens.items():
            for det in detections:
                all_confidences.append(det['confidence'])
        
        results['total_allergens'] = len(detected_allergens)
        results['avg_confidence'] = float(np.mean(all_confidences)) if all_confidences else 0.0
        results['total_time'] = sum(results['timings'].values())
        results['success'] = True
        
        if verbose:
            print(f"  Detected {results['total_allergens']} allergen types")
            print(f"  Total time: {results['total_time']:.3f}s")
        
    except Exception as e:
        results['error'] = str(e)
        if verbose:
            print(f"  ❌ Error: {e}")
    
    return results

print("✓ End-to-end pipeline function defined")
print("  Usage: detect_allergens_from_image(image_path, use_ocr=True)")

## Section C: Test on Sample Images

Test the pipeline on sample images and analyze results.
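
The cells in this section look for an optional ground-truth text file stored next to each image under the same name with a `.txt` suffix. The sketch below shows that pairing on temporary files; `load_ground_truth` is a hypothetical helper and the file names are made up.

```python
import tempfile
from pathlib import Path
from typing import Optional


def load_ground_truth(image_path: Path) -> Optional[str]:
    """Return the text of a same-named .txt file next to the image, if any."""
    txt_path = image_path.with_suffix(".txt")
    if txt_path.exists():
        return txt_path.read_text(encoding="utf-8")
    return None


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # One image with a sidecar text file, one without.
    (folder / "label_a.jpg").touch()
    (folder / "label_a.txt").write_text("Contains: milk, soy", encoding="utf-8")
    (folder / "label_b.jpg").touch()

    for image in sorted(folder.glob("*.jpg")):
        print(image.name, "->", load_ground_truth(image))
```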

In [None]:
# Step 6: Load test samples

# Look for test samples in multiple locations
test_samples_dir = DATA_DIR / "ocr_results" / "test_samples"
raw_images_dir = DATA_DIR / "raw"

test_images = []

if test_samples_dir.exists():
    test_images.extend(list(test_samples_dir.glob("*.jpg")))
    test_images.extend(list(test_samples_dir.glob("*.png")))

if len(test_images) < 10 and raw_images_dir.exists():
    raw_imgs = list(raw_images_dir.glob("*.jpg"))
    test_images.extend(raw_imgs[:min(20, len(raw_imgs))])

# Remove duplicates
# Sort after de-duplicating so the sample order is the same on every run
test_images = sorted(set(test_images))

print(f"✓ Found {len(test_images)} test images")
if test_images:
    print(f"  Sample paths:")
    for img in test_images[:5]:
        print(f"    - {img.name}")

In [None]:
# Step 7: Run pipeline on sample images

print("=" * 70)
print("RUNNING PIPELINE ON SAMPLE IMAGES")
print("=" * 70)

# Test on first 10 images (or fewer if less available)
n_samples = min(10, len(test_images))
sample_results = []

for i, img_path in enumerate(tqdm(test_images[:n_samples], desc="Processing images"), 1):
    print(f"\n📦 Sample {i}/{n_samples}: {img_path.name}")
    
    # Check if ground truth text exists
    txt_path = img_path.with_suffix('.txt')
    ground_truth_text = None
    
    if txt_path.exists():
        with open(txt_path, 'r', encoding='utf-8') as f:
            ground_truth_text = f.read()
    
    # Run pipeline
    result = detect_allergens_from_image(
        img_path,
        use_ocr=('ocr_engine' in globals()),
        ground_truth_text=ground_truth_text,
        verbose=True
    )
    
    sample_results.append(result)
    
    # Print summary
    if result['success']:
        print("  ✓ Success!")
        print(f"    Text: {result['cleaned_text'][:100]}...")
        print(f"    Allergens detected: {result['total_allergens']}")
        if result['detected_allergens']:
            for allergen_type, detections in result['detected_allergens'].items():
                print(f"      - {allergen_type}: {[d['text'] for d in detections]}")
        print(f"    Avg confidence: {result['avg_confidence']:.2%}")
        print(f"    Total time: {result['total_time']:.3f}s")
    else:
        print(f"  ❌ Failed: {result['error']}")

print(f"\n✓ Processed {n_samples} samples")
print(f"  Success rate: {sum(r['success'] for r in sample_results)}/{n_samples}")

## Section D: Error Analysis and Categorization

Analyze failures and categorize error types to identify improvement areas.
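
The categorization can be expressed as a single function that assigns each result to exactly one category. The sketch below assumes the result dictionaries produced by `detect_allergens_from_image` and treats detections stored under `'unknown'` as unmapped; `categorize_result` and the example records are hypothetical.

```python
from collections import Counter
from typing import Dict


def categorize_result(result: Dict, min_confidence: float = 0.5) -> str:
    """Assign one pipeline result to a single error category."""
    if not result["success"]:
        # No usable text, or an exception somewhere in the pipeline.
        return "ocr_failure"
    mapped = [t for t in result["detected_allergens"] if t != "unknown"]
    if not mapped:
        # Distinguish "NER found nothing" from "entities found but unmapped".
        return "mapping_failure" if result["entities_found"] else "ner_failure"
    if result["avg_confidence"] < min_confidence:
        return "low_confidence"
    return "success"


examples = [
    {"success": False, "entities_found": [], "detected_allergens": {}, "avg_confidence": 0.0},
    {"success": True, "entities_found": [], "detected_allergens": {}, "avg_confidence": 0.0},
    {"success": True, "entities_found": [("xanthan", "ALLERGEN", 0.9)],
     "detected_allergens": {"unknown": [{"text": "xanthan"}]}, "avg_confidence": 0.9},
    {"success": True, "entities_found": [("milk", "ALLERGEN", 0.4)],
     "detected_allergens": {"milk": [{"text": "milk"}]}, "avg_confidence": 0.4},
    {"success": True, "entities_found": [("milk", "ALLERGEN", 0.95)],
     "detected_allergens": {"milk": [{"text": "milk"}]}, "avg_confidence": 0.95},
]
print(Counter(categorize_result(r) for r in examples))
```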

In [None]:
# Step 8: Categorize errors and analyze failure patterns

error_categories = {
    'ocr_failure': [],      # OCR couldn't extract text
    'ner_failure': [],      # NER didn't detect entities
    'mapping_failure': [],  # Entities found but not mapped
    'low_confidence': [],   # Detection but low confidence
    'success': []           # Successful detection
}

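for result in sample_results:
    img_name = Path(result['image_path']).name
    # Entities that matched no dictionary entry are stored under 'unknown',
    # so only the remaining keys count as mapped allergen types.
    mapped_types = [t for t in result['detected_allergens'] if t != 'unknown']
    
    if not result['success']:
        # Empty OCR output and upstream exceptions (e.g. an unreadable image)
        # are both counted here; inspect result['error'] to tell them apart.
        error_categories['ocr_failure'].append(img_name)
    
    elif not mapped_types:
        if len(result['entities_found']) == 0:
            error_categories['ner_failure'].append(img_name)
        else:
            error_categories['mapping_failure'].append(img_name)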
    
    elif result['avg_confidence'] < 0.5:
        error_categories['low_confidence'].append(img_name)
    
    else:
        error_categories['success'].append(img_name)

print("=" * 70)
print("ERROR ANALYSIS")
print("=" * 70)

for category, items in error_categories.items():
    print(f"\n{category.upper().replace('_', ' ')}: {len(items)}")
    if items and len(items) <= 5:
        for item in items:
            print(f"  - {item}")
    elif items:
        print(f"  - {items[0]}")
        print(f"  - {items[1]}")
        print(f"  ... and {len(items) - 2} more")

# Calculate statistics
total_samples = len(sample_results)
success_rate = len(error_categories['success']) / total_samples if total_samples > 0 else 0

print(f"\n{'=' * 70}")
print(f"SUMMARY STATISTICS")
print(f"{'=' * 70}")
print(f"Total samples: {total_samples}")
print(f"Success rate: {success_rate:.1%}")
# Guard against division by zero when no samples were processed
for label, key in [
    ("OCR failures", 'ocr_failure'),
    ("NER failures", 'ner_failure'),
    ("Mapping failures", 'mapping_failure'),
    ("Low confidence", 'low_confidence'),
]:
    count = len(error_categories[key])
    share = count / total_samples if total_samples > 0 else 0
    print(f"{label}: {count} ({share:.1%})")

## Section E: Performance Benchmarking

Measure pipeline performance: per-stage timing and overall throughput.
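
Per-stage timings can also be collected with a small context manager instead of repeated `t0 = time.time()` bookkeeping. The sketch below is one way to do this, using `time.perf_counter`, which is intended for measuring short intervals; `timed` is a hypothetical helper and the two stages are stand-ins for real pipeline steps.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed(stage: str, timings: Dict[str, float]) -> Iterator[None]:
    """Record the wall-clock duration of the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Stored even if the block raises, so failed stages are still timed.
        timings[stage] = time.perf_counter() - start


timings: Dict[str, float] = {}
with timed("cleaning", timings):
    cleaned = " ".join("  Contains:   milk,  soy ".split())
with timed("sleep", timings):
    time.sleep(0.01)  # stand-in for a slower stage such as OCR

total = sum(timings.values())
for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.4f}s ({seconds / total:.0%} of total)")
```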

In [None]:
# Step 9: Analyze pipeline performance and bottlenecks

# Extract timing data
timing_data = {
    'ocr': [],
    'cleaning': [],
    'ner': [],
    'mapping': [],
    'total': []
}

for result in sample_results:
    if result['success'] and 'timings' in result:
        for key in timing_data.keys():
            if key == 'total':
                timing_data[key].append(result.get('total_time', 0))
            else:
                timing_data[key].append(result['timings'].get(key, 0))

print("=" * 70)
print("PERFORMANCE BENCHMARKS")
print("=" * 70)

for component, times in timing_data.items():
    if times:
        avg_time = np.mean(times)
        std_time = np.std(times)
        min_time = np.min(times)
        max_time = np.max(times)
        
        print(f"\n{component.upper()}:")
        print(f"  Average: {avg_time:.3f}s")
        print(f"  Std Dev: {std_time:.3f}s")
        print(f"  Min: {min_time:.3f}s")
        print(f"  Max: {max_time:.3f}s")
        
        if component != 'total':
            total_avg = np.mean(timing_data['total'])
            pct = (avg_time / total_avg * 100) if total_avg > 0 else 0
            print(f"  % of total: {pct:.1f}%")

# Calculate throughput
if timing_data['total']:
    avg_total_time = np.mean(timing_data['total'])
    throughput = 1.0 / avg_total_time if avg_total_time > 0 else 0
    
    print(f"\n{'=' * 70}")
    print(f"THROUGHPUT")
    print(f"{'=' * 70}")
    print(f"Average time per image: {avg_total_time:.3f}s")
    print(f"Images per second: {throughput:.2f}")
    print(f"Images per minute: {throughput * 60:.1f}")
    print(f"Estimated time for 1000 images: {(avg_total_time * 1000) / 60:.1f} minutes")

In [None]:
# Step 10: Visualize performance breakdown

if timing_data['ocr']:
    # Calculate average time for each component
    component_times = {
        'OCR': np.mean(timing_data['ocr']),
        'Cleaning': np.mean(timing_data['cleaning']),
        'NER': np.mean(timing_data['ner']),
        'Mapping': np.mean(timing_data['mapping'])
    }
    
    # Create pie chart
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
    
    # Pie chart of time distribution
    colors = ['#ff9999', '#66b3ff', '#99ff99', '#ffcc99']
    ax1.pie(
        component_times.values(),
        labels=component_times.keys(),
        autopct='%1.1f%%',
        colors=colors,
        startangle=90
    )
    ax1.set_title('Pipeline Time Distribution', fontsize=14, fontweight='bold')
    
    # Bar chart of absolute times
    ax2.bar(component_times.keys(), component_times.values(), color=colors)
    ax2.set_ylabel('Time (seconds)', fontsize=12)
    ax2.set_title('Average Time per Component', fontsize=14, fontweight='bold')
    ax2.grid(axis='y', alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    print("✓ Performance visualization complete")

## Section F: Comprehensive Detection Report

Generate detailed reports with visualizations and metrics.
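
When reading the report, it helps to keep two counts apart: the number of images in which an allergen type appears and the number of individual detections of that type. The sketch below computes both on made-up results shaped like the `detected_allergens` field.

```python
from collections import defaultdict
from statistics import mean

# Made-up results; only the field used by the aggregation is included.
example_results = [
    {"detected_allergens": {"milk": [{"confidence": 0.9}, {"confidence": 0.7}]}},
    {"detected_allergens": {"milk": [{"confidence": 0.8}], "soy": [{"confidence": 0.6}]}},
    {"detected_allergens": {}},
]

images_per_type = defaultdict(int)   # images containing the allergen type
confidences = defaultdict(list)      # one entry per individual detection

for result in example_results:
    for allergen_type, detections in result["detected_allergens"].items():
        images_per_type[allergen_type] += 1
        confidences[allergen_type].extend(d["confidence"] for d in detections)

for allergen_type in sorted(images_per_type):
    print(
        f"{allergen_type}: images={images_per_type[allergen_type]}, "
        f"detections={len(confidences[allergen_type])}, "
        f"avg confidence={mean(confidences[allergen_type]):.2f}"
    )
```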

In [None]:
# Step 11: Generate comprehensive detection report

# Aggregate allergen detection statistics
allergen_stats = defaultdict(int)
confidence_by_allergen = defaultdict(list)

for result in sample_results:
    if result['success'] and result['detected_allergens']:
        for allergen_type, detections in result['detected_allergens'].items():
            allergen_stats[allergen_type] += 1
            for det in detections:
                confidence_by_allergen[allergen_type].append(det['confidence'])

print("=" * 70)
print("ALLERGEN DETECTION REPORT")
print("=" * 70)

print(f"\nTotal images processed: {len(sample_results)}")
print(f"Images with allergens detected: {sum(1 for r in sample_results if r.get('total_allergens', 0) > 0)}")

print(f"\n{'Allergen Type':<20} {'Count':<10} {'Avg Confidence':<15}")
print("-" * 50)

for allergen_type in sorted(allergen_stats.keys()):
    count = allergen_stats[allergen_type]
    avg_conf = np.mean(confidence_by_allergen[allergen_type])
    print(f"{allergen_type:<20} {count:<10} {avg_conf:.2%}")

# Visualize allergen distribution
if allergen_stats:
    fig, ax = plt.subplots(figsize=(12, 6))
    
    allergen_types = list(allergen_stats.keys())
    counts = [allergen_stats[at] for at in allergen_types]
    
    bars = ax.barh(allergen_types, counts, color='skyblue')
    # allergen_stats is incremented once per image, not once per detection
    ax.set_xlabel('Number of Images with Detection', fontsize=12)
    ax.set_title('Allergen Detection Frequency', fontsize=14, fontweight='bold')
    ax.grid(axis='x', alpha=0.3)
    
    # Add count labels on bars
    for bar, count in zip(bars, counts):
        width = bar.get_width()
        ax.text(width, bar.get_y() + bar.get_height()/2, 
                f' {count}', ha='left', va='center', fontweight='bold')
    
    plt.tight_layout()
    plt.show()
    
    print("\n✓ Allergen distribution visualization complete")

In [None]:
# Step 12: Save detailed results to JSON

# Prepare results directory
integration_results_dir = RESULTS_DIR / "integration_experiments"
integration_results_dir.mkdir(parents=True, exist_ok=True)

# Create timestamp for this run
from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

# Prepare results summary
results_summary = {
    'timestamp': timestamp,
    'total_samples': len(sample_results),
    'successful_detections': sum(1 for r in sample_results if r['success'] and r['total_allergens'] > 0),
    'failed_samples': sum(1 for r in sample_results if not r['success']),
    'error_categories': {k: len(v) for k, v in error_categories.items()},
    'allergen_statistics': dict(allergen_stats),
    'average_confidence': float(np.mean([r['avg_confidence'] for r in sample_results if r['success']])) if any(r['success'] for r in sample_results) else 0.0,
    'performance_metrics': {
        'avg_total_time': float(np.mean(timing_data['total'])) if timing_data['total'] else 0.0,
        'avg_ocr_time': float(np.mean(timing_data['ocr'])) if timing_data['ocr'] else 0.0,
        'avg_ner_time': float(np.mean(timing_data['ner'])) if timing_data['ner'] else 0.0,
        'throughput_per_second': float(1.0 / np.mean(timing_data['total'])) if timing_data['total'] and np.mean(timing_data['total']) > 0 else 0.0
    },
    'detailed_results': sample_results
}

# Save to JSON
output_path = integration_results_dir / f"integration_results_{timestamp}.json"
with open(output_path, 'w', encoding='utf-8') as f:
    json.dump(results_summary, f, indent=2, ensure_ascii=False)

print(f"✓ Results saved to: {output_path}")
print(f"  Total size: {output_path.stat().st_size / 1024:.1f} KB")

## Section G: Optional - Batch Processing Test

Run the pipeline on a larger batch for more comprehensive metrics (optional; can be slow).
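
For datasets larger than a single slice, the image list can be walked in fixed-size chunks so that results can be saved or inspected between chunks. The sketch below shows only the chunking; `chunked` is a hypothetical helper and the file names are placeholders.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


image_names = [f"img_{i:03d}.jpg" for i in range(7)]
for batch in chunked(image_names, 3):
    # The last chunk may be shorter than the requested size.
    print(len(batch), batch[0], "...", batch[-1])
```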

In [None]:
# Step 13: Batch processing test (OPTIONAL - set ENABLE_BATCH = True to run)

ENABLE_BATCH = True   # Set to False to skip the batch test
BATCH_SIZE = 25       # Number of images to process in batch (reasonable for testing)

if ENABLE_BATCH and len(test_images) > n_samples:
    batch_images = test_images[n_samples:n_samples + BATCH_SIZE]
    batch_results = []
    
    # Report the actual batch length, which can be smaller than BATCH_SIZE
    print("=" * 70)
    print(f"BATCH PROCESSING TEST - {len(batch_images)} IMAGES")
    print("=" * 70)
    
    start_time = time.time()
    
    for img_path in tqdm(batch_images, desc="Batch processing"):
        # Check for ground truth
        txt_path = img_path.with_suffix('.txt')
        ground_truth = None
        if txt_path.exists():
            with open(txt_path, 'r', encoding='utf-8') as f:
                ground_truth = f.read()
        
        # Run pipeline
        result = detect_allergens_from_image(
            img_path,
            use_ocr=('ocr_engine' in globals()),
            ground_truth_text=ground_truth,
            verbose=False
        )
        batch_results.append(result)
    
    total_time = time.time() - start_time
    
    # Calculate batch metrics
    batch_success = sum(1 for r in batch_results if r['success'])
    batch_allergens_detected = sum(1 for r in batch_results if r.get('total_allergens', 0) > 0)
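    # Avoid NaN when no image in the batch was processed successfully
    batch_confidences = [r['avg_confidence'] for r in batch_results if r['success']]
    batch_avg_confidence = float(np.mean(batch_confidences)) if batch_confidences else 0.0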
    
    print("\n✓ Batch processing complete")
    print(f"  Total time: {total_time:.2f}s")
    print(f"  Avg time per image: {total_time/len(batch_images):.3f}s")
    print(f"  Success rate: {batch_success/len(batch_images):.1%}")
    print(f"  Detection rate: {batch_allergens_detected/len(batch_images):.1%}")
    print(f"  Avg confidence: {batch_avg_confidence:.2%}")
    
    # Save batch results
    batch_output_path = integration_results_dir / f"batch_results_{timestamp}.json"
    with open(batch_output_path, 'w', encoding='utf-8') as f:
        json.dump(batch_results, f, indent=2, ensure_ascii=False)
    
    print(f"  Batch results saved to: {batch_output_path}")
else:
    print("ℹ️  Batch processing skipped: ENABLE_BATCH is False or no images remain beyond the first n_samples.")

## Summary and Next Steps

This notebook successfully integrated all pipeline components and tested the end-to-end allergen detection system.

**Key Findings:**
- Pipeline successfully processes images from OCR → NER → Allergen Detection
- Performance metrics collected for optimization
- Error patterns identified for improvement

**Next Steps:**
1. **Optimization:** Address bottlenecks identified in performance analysis
2. **Error Reduction:** Improve OCR quality for failed cases
3. **Confidence Tuning:** Adjust thresholds based on confidence analysis
4. **Scale Testing:** Run on full dataset (Notebook 07: App Interface)
5. **Deployment:** Package pipeline for production use
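
For the confidence-tuning step, one simple starting point is to drop detections below a chosen threshold before reporting. The sketch below assumes the `detected_allergens` structure returned by the pipeline; `filter_by_confidence`, the example values, and the threshold of 0.5 are illustrative and not tuned.

```python
from typing import Dict, List


def filter_by_confidence(
    detected: Dict[str, List[dict]], threshold: float
) -> Dict[str, List[dict]]:
    """Keep only detections at or above `threshold`; drop types left empty."""
    filtered: Dict[str, List[dict]] = {}
    for allergen_type, detections in detected.items():
        kept = [d for d in detections if d["confidence"] >= threshold]
        if kept:
            filtered[allergen_type] = kept
    return filtered


example = {
    "milk": [{"text": "whey", "confidence": 0.93}, {"text": "milk", "confidence": 0.41}],
    "soy": [{"text": "soy", "confidence": 0.38}],
}
print(filter_by_confidence(example, threshold=0.5))
# → {'milk': [{'text': 'whey', 'confidence': 0.93}]}
```

For allergen labelling, a missed allergen is usually more costly than a false alarm, so any threshold should be checked against labelled data before it is relied on.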

**Outputs Generated:**
- `results/integration_experiments/integration_results_*.json` - Detailed results
- Performance visualizations and metrics
- Error analysis and categorization

**Usage for New Images:**
```python
result = detect_allergens_from_image("path/to/image.jpg", use_ocr=True)
print(result['detected_allergens'])
```