# Describe My Environment - Core Functionality Demo

This notebook demonstrates the core machine learning pipeline of the **Describe My Environment** project.

**Pipeline Steps:**
1. **Input:** Load a test image
2. **Reflex Loop:** Detect and track objects (YOLO11)
3. **Cognitive Loop:** Generate scene caption (BLIP) and narration (Llama 3.2)
4. **Output:** Visualize results and narration
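
The steps above can be read as a simple data flow. The sketch below is only an illustration of that flow: `run_pipeline` and its three callable parameters are hypothetical stand-ins, not part of the project's API, and the stub stages return canned values so the block runs without loading any model. The context keys mirror the ones used later in this notebook.

```python
from typing import Any, Callable, Dict, List

Detection = Dict[str, Any]


def run_pipeline(
    frame: Any,
    detect: Callable[[Any], List[Detection]],
    caption: Callable[[Any], str],
    narrate: Callable[[Dict[str, Any]], str],
) -> Dict[str, Any]:
    """Chain the stages: detections and caption feed the narrator."""
    detections = detect(frame)   # Reflex Loop: object detection
    scene = caption(frame)       # Cognitive Loop: scene caption
    context = {
        "scene_description": scene,
        "objects": detections,
        "hazards": [],
    }
    narration = narrate(context)  # Cognitive Loop: narration
    return {"detections": detections, "caption": scene, "narration": narration}


# Stub stages (hypothetical) standing in for YOLO, BLIP and the LLM.
result = run_pipeline(
    frame=None,
    detect=lambda f: [{"class_name": "chair", "confidence": 0.9, "box": [0, 0, 10, 10]}],
    caption=lambda f: "a chair in a room",
    narrate=lambda ctx: f"{ctx['scene_description']} ({len(ctx['objects'])} object)",
)
print(result["narration"])
```

Passing the stages in as callables keeps the flow testable without the models; the cells that follow use the project's real classes instead.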

In [None]:
# Setup and Imports
import os
import sys
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

# Add project root to path (assumes the notebook is run from the repository root)
project_root = os.path.abspath(os.getcwd())
if project_root not in sys.path:
    sys.path.append(project_root)

# Import project modules
from src.reflex_loop.tracker import YOLOTracker
from src.cognitive_loop.scene_composer import SceneComposer
from src.cognitive_loop.narrator import LLMNarrator
from src.config import NARRATION_CONFIG

print("✅ Imports complete")

## 1. Load Test Image

In [None]:
# Load a sample image
image_path = os.path.join("test_images", "test_image_0.jpg")

# Read with OpenCV (BGR); cv2.imread returns None if the file is missing or unreadable
frame = cv2.imread(image_path)
if frame is None:
    # Stop here so later cells do not fail with an undefined `frame`
    raise FileNotFoundError(f"❌ Could not read image: {image_path}")

# Convert to RGB for display
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(10, 6))
plt.imshow(frame_rgb)
plt.axis('off')
plt.title("Input Image")
plt.show()

## 2. Reflex Loop: Object Detection
We use **YOLO11n** to detect objects in the scene. This simulates the "Reflex Loop", which provides low-latency awareness.

In [None]:
# Initialize Tracker
print("⏳ Initializing YOLO tracker...")
tracker = YOLOTracker(model_path="yolo11n.pt")

# Run detection
detections = tracker.process_frame(frame)
print(f"✅ Detected {len(detections)} objects")

# Visualize detections
annotated_frame = frame_rgb.copy()

for det in detections:
    box = det['box']
    x1, y1, x2, y2 = map(int, box)
    label = f"{det['class_name']} {det['confidence']:.2f}"
    
    # Draw box
    cv2.rectangle(annotated_frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    # Draw label (clamp y so the text stays visible when the box touches the top edge)
    cv2.putText(annotated_frame, label, (x1, max(y1 - 10, 15)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

plt.figure(figsize=(10, 6))
plt.imshow(annotated_frame)
plt.axis('off')
plt.title("Reflex Loop: Detections")
plt.show()
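
The Reflex Loop is described above as low-latency. One way to check that on a given machine is to time repeated calls. The helper below is a minimal sketch using only the standard library; `median_latency_ms` is a hypothetical name, not a project function, and the workload is a stand-in so the block runs without a model.

```python
import statistics
import time
from typing import Any, Callable


def median_latency_ms(fn: Callable[..., Any], *args: Any, repeats: int = 5) -> float:
    """Return the median wall-clock time of fn(*args) in milliseconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload; replace with the detector call when running the notebook.
latency = median_latency_ms(sum, range(10_000))
print(f"median latency: {latency:.3f} ms")
```

In this notebook the equivalent call would be `median_latency_ms(tracker.process_frame, frame)`. The median is used rather than the mean because the first call to a model often includes one-off warm-up cost.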

## 3. Cognitive Loop: Scene Understanding
We use **BLIP** to generate a descriptive caption of the scene.

In [None]:
# Initialize Scene Composer (BLIP)
print("⏳ Initializing Scene Composer (BLIP)...")
scene_composer = SceneComposer()

# Generate caption
caption = scene_composer.describe_scene(frame)
print(f"\n📝 BLIP Caption: \"{caption}\"")

## 4. Cognitive Loop: AI Narration
We use **Llama 3.2 (via Ollama)** to synthesize the detection data and scene caption into a helpful narration for the user.
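
Narration depends on a running Ollama server. The internals of `LLMNarrator.check_connection()` are not shown in this notebook, so the block below is only a rough, independent illustration of what such a check could look like. It assumes Ollama's default local address `http://localhost:11434` and its `/api/tags` endpoint (which lists local models); `ollama_reachable` is a hypothetical helper, not the project's implementation.

```python
import http.client
import urllib.request


def ollama_reachable(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if base_url/api/tags answers with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return False


print("Ollama reachable:", ollama_reachable())
```

A short timeout keeps the notebook responsive when the server is down; a result of `False` only means the endpoint did not answer, not why.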

In [None]:
# Initialize Narrator
print("⏳ Initializing Narrator (Ollama)...")
narrator = LLMNarrator()

# Check if Ollama is running
if not narrator.check_connection():
    print("⚠️  Ollama is not running. Please start it with 'ollama serve' to see narration.")
else:
    # Prepare context
    context = {
        "scene_description": caption,
        "objects": [
            {"class_name": d['class_name'], "confidence": d['confidence'], "box": d['box']} 
            for d in detections
        ],
        "hazards": [],  # No hazards for this static image test
    }
    
    print("\n🤖 Generating Narration...")
    narration = narrator.generate_narration(context)
    
    print("\n" + "="*50)
    print("💬 FINAL NARRATION:")
    print("="*50)
    print(narration)
    print("="*50)
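
How `LLMNarrator` turns the context dictionary into a prompt is not shown in this notebook. As a hedged illustration of the general idea, the sketch below flattens a context of the same shape into plain text. The function `build_prompt`, the `min_confidence` threshold, the prompt wording, and the assumption that hazards are plain strings are all hypothetical and may differ from the project's actual narrator.

```python
from typing import Any, Dict


def build_prompt(context: Dict[str, Any], min_confidence: float = 0.5) -> str:
    """Flatten a narration context into a plain-text prompt (hypothetical format)."""
    # Keep only detections at or above the confidence threshold.
    objects = [
        f"{obj['class_name']} ({obj['confidence']:.0%})"
        for obj in context.get("objects", [])
        if obj["confidence"] >= min_confidence
    ]
    lines = [
        f"Scene: {context.get('scene_description', 'unknown')}",
        f"Objects: {', '.join(objects) if objects else 'none'}",
        f"Hazards: {', '.join(context.get('hazards', [])) or 'none'}",
        "Describe the surroundings in one or two short sentences.",
    ]
    return "\n".join(lines)


# Made-up example data, not output from the models above.
example = {
    "scene_description": "a dog sitting on a couch",
    "objects": [
        {"class_name": "dog", "confidence": 0.91, "box": [40, 60, 300, 380]},
        {"class_name": "remote", "confidence": 0.32, "box": [310, 200, 350, 230]},
    ],
    "hazards": [],
}
print(build_prompt(example))
```

Filtering low-confidence detections before prompting is one design choice among several; it trades recall for a shorter prompt that is less likely to mention objects that are not there.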