# Video Processing - Pothole Detection
## Frame-by-Frame Image Processing + Deduplication

**IMPORTANT:** This is **NOT** video processing! It is **image processing** applied to each frame.

**Argument for the lecturer:**
> "A video is a sequence of images. We process each frame independently using
> image processing techniques (Canny edge detection), then deduplicate the results to
> remove redundancy. This is still IMAGE PROCESSING, not video processing. Video processing
> involves temporal analysis such as optical flow and motion tracking, which we do not use."

In [1]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from collections import defaultdict
import os

print("✅ Libraries loaded")

✅ Libraries loaded


---
## 📹 STEP 1: VIDEO INFO
Extract basic information from the video

In [2]:
def get_video_info(video_path):
    """
    Get basic video information
    
    Returns:
    - fps: frames per second
    - total_frames: total number of frames
    - duration: duration in seconds
    - width, height: video dimensions
    """
    cap = cv2.VideoCapture(video_path)
    
    if not cap.isOpened():
        raise ValueError(f"Cannot open video: {video_path}")
    
    fps = int(cap.get(cv2.CAP_PROP_FPS))
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    
    duration = total_frames / fps if fps > 0 else 0
    
    cap.release()
    
    return {
        'fps': fps,
        'total_frames': total_frames,
        'duration': duration,
        'width': width,
        'height': height
    }

# Example usage (requires the video file at the path below)

video_path = 'dataset/video.mp4'
info = get_video_info(video_path)

print("📹 VIDEO INFORMATION:")
print(f"   - Resolution: {info['width']}×{info['height']}")
print(f"   - FPS: {info['fps']}")
print(f"   - Total Frames: {info['total_frames']}")
print(f"   - Duration: {info['duration']:.1f} seconds")


print("✅ Video info function ready")

📹 VIDEO INFORMATION:
   - Resolution: 464×832
   - FPS: 59
   - Total Frames: 200
   - Duration: 3.4 seconds
✅ Video info function ready


---
## 🎬 STEP 2: FRAME EXTRACTION (SAMPLING)

**Problem:** A 10-second video at 30 fps = 300 frames → too many, and mostly redundant!

**Solution:** Sample at 1 fps (take 1 frame per second) → only 10 frames

**Rationale:**
- Efficiency: 300 frames vs 10 frames = 30× fewer frames, so roughly 30× faster
- Coverage stays adequate: in 1 second the car moves roughly 3-6 meters (at 10-20 km/h)
- Potholes are unlikely to be missed as long as each frame covers more road than that
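The sampling arithmetic above can be checked with a short sketch. The helper names `metres_per_sample` and `sampled_frame_count` are hypothetical and not part of the notebook; the speeds and the 10-second clip are the figures quoted above.

```python
def metres_per_sample(speed_kmh, fps_sample=1.0):
    """Distance (m) the vehicle covers between two consecutive sampled frames."""
    speed_ms = speed_kmh * 1000 / 3600   # km/h -> m/s
    return speed_ms / fps_sample


def sampled_frame_count(duration_s, fps_sample=1.0):
    """Approximate number of frames kept when sampling at fps_sample."""
    return int(duration_s * fps_sample)


for speed in (10, 20):
    print(f"{speed} km/h -> {metres_per_sample(speed):.2f} m between samples")

print("Frames kept from a 10 s clip:", sampled_frame_count(10, fps_sample=1))
```

At 10-20 km/h this works out to about 2.8-5.6 m between samples, so whether a pothole can slip through depends on how much road each frame actually covers.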

In [3]:
def extract_frames(video_path, fps_sample=1, output_dir='frames_extracted'):
    """
    Extract frames from video with sampling
    
    Parameters:
    - video_path: path to video file
    - fps_sample: sampling rate (1 = 1 frame per second)
    - output_dir: directory to save extracted frames
    
    Returns:
    - list of (frame_number, timestamp, frame_image)
    """
    # Get video info
    info = get_video_info(video_path)
    fps = info['fps']
    if fps <= 0 or fps_sample <= 0:
        raise ValueError("Video FPS and fps_sample must both be positive")
    
    # Calculate frame interval; clamp to 1 so the modulo below never divides by zero
    frame_interval = max(1, int(fps / fps_sample))  # e.g., 30 fps / 1 fps_sample = take every 30th frame
    
    print(f"📹 Extracting frames:")
    print(f"   - Video FPS: {fps}")
    print(f"   - Sampling: {fps_sample} fps")
    print(f"   - Frame interval: every {frame_interval} frames")
    
    # Create output directory
    os.makedirs(output_dir, exist_ok=True)
    
    # Open video
    cap = cv2.VideoCapture(video_path)
    
    extracted_frames = []
    frame_count = 0
    saved_count = 0
    
    while True:
        ret, frame = cap.read()
        
        if not ret:
            break
        
        # Sample frame
        if frame_count % frame_interval == 0:
            timestamp = frame_count / fps
            
            # Save frame
            frame_filename = f"frame_{saved_count:04d}_t{timestamp:.1f}s.jpg"
            frame_path = os.path.join(output_dir, frame_filename)
            cv2.imwrite(frame_path, frame)
            
            extracted_frames.append({
                'frame_number': frame_count,
                'timestamp': timestamp,
                'image': frame,
                'path': frame_path
            })
            
            saved_count += 1
        
        frame_count += 1
    
    cap.release()
    
    # Guard against division by zero when the video yields no frames
    reduction = (1 - saved_count / frame_count) * 100 if frame_count > 0 else 0.0
    
    print(f"\n✅ Extraction completed:")
    print(f"   - Total frames in video: {frame_count}")
    print(f"   - Frames extracted: {saved_count}")
    print(f"   - Reduction: {reduction:.1f}%")
    print(f"   - Saved to: {output_dir}/")
    
    return extracted_frames

print("✅ Frame extraction function ready")

✅ Frame extraction function ready


---
## 🔍 STEP 3: DETECTION PER FRAME

Use the same image processing pipeline (5 stages)

In [4]:
def detect_single_frame(frame):
    """
    Apply pothole detection pipeline to single frame
    
    (Same as the detection used for still photos)
    
    Returns:
    - list of detected boxes [(x, y, w, h), ...]
    """
    # Resize
    img_resized = cv2.resize(frame, (640, 480), interpolation=cv2.INTER_NEAREST)
    
    # Grayscale
    img_gray = cv2.cvtColor(img_resized, cv2.COLOR_BGR2GRAY)
    
    # Adaptive Threshold
    img_binary = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                       cv2.THRESH_BINARY, 101, 10)
    
    # Canny
    blurred = cv2.GaussianBlur(img_binary, (5, 5), 3)
    img_edges = cv2.Canny(blurred, 0, 150)
    
    # Dilation
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    img_dilated = cv2.dilate(img_edges, kernel, iterations=1)
    
    # Find contours
    contours, _ = cv2.findContours(img_dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    # Basic filtering
    valid_boxes = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        
        # Size filter (the frame is 640x480 after resizing, so the h > 540 bound never triggers)
        if w < 15 or h < 15 or w > 290 or h > 540:
            continue
        
        # Aspect ratio
        ar = w / h
        if ar < 0.3 or ar > 3.0:
            continue
        
        # Solidity
        area = cv2.contourArea(contour)
        hull = cv2.convexHull(contour)
        hull_area = cv2.contourArea(hull)
        
        if hull_area > 0:
            solidity = area / hull_area
            if solidity < 0.6:
                continue
        
        valid_boxes.append((x, y, w, h))
    
    return valid_boxes

print("✅ Single frame detection function ready")

✅ Single frame detection function ready


---
## 🎯 STEP 4: DEDUPLICATION (CRITICAL!)

**Problem:**
```
Frame 1: Pothole A at (100, 200)
Frame 2: Pothole A at (102, 201)  ← Camera moved slightly
Frame 3: Pothole A at (98, 199)

→ Is this 1 pothole or 3 different potholes?
```

**Solution:** Compute the Euclidean distance between bounding-box center points

**Formula:**
```
Distance = √[(x1 - x2)² + (y1 - y2)²]

Threshold: 50 pixels

If distance < 50 → SAME POTHOLE
If distance ≥ 50 → DIFFERENT POTHOLE
```

**Why 50 pixels?**
- The camera moves → objects are assumed to shift 10-30 pixels between frames
- 50 pixels = enough buffer to tolerate that movement
- Not too large (so that two nearby potholes are not merged)
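As a worked check of the formula, the sketch below applies it to the three positions from the problem statement. The 40×30 box size and the extra box at (300, 260) are assumptions added for illustration; `center_distance` is a hypothetical stand-alone helper that mirrors the center-point approach.

```python
import math


def center_distance(box1, box2):
    """Euclidean distance between the center points of two (x, y, w, h) boxes."""
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    return math.hypot((x1 + w1 / 2) - (x2 + w2 / 2),
                      (y1 + h1 / 2) - (y2 + h2 / 2))


THRESHOLD = 50  # pixels, as in the formula above

reference = (100, 200, 40, 30)          # Pothole A in frame 1 (box size assumed)
candidates = {
    "frame 2": (102, 201, 40, 30),
    "frame 3": (98, 199, 40, 30),
    "other":   (300, 260, 40, 30),      # assumed second pothole, far away
}

for name, box in candidates.items():
    d = center_distance(reference, box)
    verdict = "same pothole" if d < THRESHOLD else "different pothole"
    print(f"{name}: distance = {d:.1f} px -> {verdict}")
```

Because all boxes here share one size, the center distance equals the distance between the top-left corners; with differently sized boxes the two would differ.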

In [5]:
def calculate_distance(box1, box2):
    """
    Calculate Euclidean distance between two bounding boxes
    
    Uses center points of boxes
    
    Parameters:
    - box1, box2: (x, y, w, h)
    
    Returns:
    - distance (float)
    """
    x1, y1, w1, h1 = box1
    x2, y2, w2, h2 = box2
    
    # Calculate center points
    cx1 = x1 + w1 / 2
    cy1 = y1 + h1 / 2
    
    cx2 = x2 + w2 / 2
    cy2 = y2 + h2 / 2
    
    # Euclidean distance
    distance = np.sqrt((cx1 - cx2)**2 + (cy1 - cy2)**2)
    
    return distance

def deduplicate_potholes(all_detections, distance_threshold=50):
    """
    Remove duplicate potholes across frames
    
    Parameters:
    - all_detections: list of {'frame_idx': ..., 'timestamp': ..., 'boxes': [...]}
    - distance_threshold: max distance to consider as same pothole (pixels)
    
    Returns:
    - unique_potholes: list of unique potholes with metadata
    """
    unique_potholes = []
    
    for detection in all_detections:
        frame_idx = detection['frame_idx']
        timestamp = detection['timestamp']
        boxes = detection['boxes']
        
        for box in boxes:
            # Check if this box is similar to any existing unique pothole
            is_duplicate = False
            
            for unique in unique_potholes:
                distance = calculate_distance(box, unique['box'])
                
                if distance < distance_threshold:
                    # Same pothole! Update info
                    is_duplicate = True
                    unique['frame_count'] += 1
                    unique['last_seen_frame'] = frame_idx
                    unique['last_seen_timestamp'] = timestamp
                    break
            
            # New unique pothole
            if not is_duplicate:
                unique_potholes.append({
                    'id': len(unique_potholes) + 1,
                    'box': box,
                    'first_seen_frame': frame_idx,
                    'first_seen_timestamp': timestamp,
                    'last_seen_frame': frame_idx,
                    'last_seen_timestamp': timestamp,
                    'frame_count': 1  # Number of frames this pothole appears in
                })
    
    return unique_potholes

print("✅ Deduplication functions ready")

✅ Deduplication functions ready


---
## 🚀 COMPLETE VIDEO PROCESSING PIPELINE

In [6]:
def process_video(video_path, fps_sample=1, distance_threshold=50, output_dir='output_video'):
    """
    Complete video processing pipeline
    
    Steps:
    1. Extract frames (sampling)
    2. Detect potholes per frame
    3. Deduplicate across frames
    4. Generate output
    
    Returns:
    - unique_potholes: list of unique potholes
    - all_detections: all raw detections (for comparison)
    """
    print("="*60)
    print("🎬 VIDEO PROCESSING STARTED")
    print("="*60)
    
    # Create output directory
    os.makedirs(output_dir, exist_ok=True)
    
    # Step 1: Extract frames
    print("\n[1/4] Extracting frames...")
    frames_dir = os.path.join(output_dir, 'frames')
    extracted_frames = extract_frames(video_path, fps_sample, frames_dir)
    
    # Step 2: Detect per frame
    print("\n[2/4] Detecting potholes per frame...")
    all_detections = []
    total_raw_detections = 0
    
    for i, frame_data in enumerate(extracted_frames):
        frame = frame_data['image']
        timestamp = frame_data['timestamp']
        
        boxes = detect_single_frame(frame)
        
        if len(boxes) > 0:
            all_detections.append({
                'frame_idx': i,
                'timestamp': timestamp,
                'boxes': boxes,
                'frame_data': frame_data
            })
            total_raw_detections += len(boxes)
            print(f"   Frame {i} (t={timestamp:.1f}s): {len(boxes)} potholes detected")
    
    print(f"\n   Total raw detections: {total_raw_detections}")
    
    # Step 3: Deduplication
    print("\n[3/4] Deduplicating potholes...")
    unique_potholes = deduplicate_potholes(all_detections, distance_threshold)
    
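    duplicates_removed = total_raw_detections - len(unique_potholes)
    # Guard against division by zero when no frame produced a detection
    reduction = duplicates_removed / total_raw_detections * 100 if total_raw_detections else 0.0
    print(f"   Unique potholes: {len(unique_potholes)}")
    print(f"   Duplicates removed: {duplicates_removed}")
    print(f"   Reduction: {reduction:.1f}%")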
    
    # Step 4: Generate output snapshots
    print("\n[4/4] Generating output snapshots...")
    snapshots_dir = os.path.join(output_dir, 'snapshots')
    os.makedirs(snapshots_dir, exist_ok=True)
    
    for pothole in unique_potholes:
        # Get frame where pothole first appeared
        first_frame_idx = pothole['first_seen_frame']
        frame_data = extracted_frames[first_frame_idx]
        frame = frame_data['image'].copy()
        
        # Resize for display
        frame_resized = cv2.resize(frame, (640, 480), interpolation=cv2.INTER_NEAREST)
        
        # Draw bounding box
        x, y, w, h = pothole['box']
        cv2.rectangle(frame_resized, (x, y), (x+w, y+h), (0, 0, 255), 2)
        
        # Add label
        label = f"Pothole #{pothole['id']}"
        cv2.putText(frame_resized, label, (x, y-10),
                   cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        
        # Save snapshot
        snapshot_path = os.path.join(snapshots_dir, f"pothole_{pothole['id']:02d}.jpg")
        cv2.imwrite(snapshot_path, frame_resized)
        pothole['snapshot_path'] = snapshot_path
        
        print(f"   Pothole #{pothole['id']}: saved to {snapshot_path}")
    
    # Generate summary report
    report_path = os.path.join(output_dir, 'detection_summary.txt')
    with open(report_path, 'w', encoding='utf-8') as f:  # explicit encoding: the report contains non-ASCII characters
        f.write("="*60 + "\n")
        f.write("VIDEO POTHOLE DETECTION SUMMARY\n")
        f.write("="*60 + "\n\n")
        
        # Video info
        info = get_video_info(video_path)
        f.write(f"Video: {video_path}\n")
        f.write(f"Duration: {info['duration']:.1f} seconds\n")
        f.write(f"Resolution: {info['width']}×{info['height']}\n")
        f.write(f"FPS: {info['fps']}\n\n")
        
        # Processing stats
        f.write(f"Frames analyzed: {len(extracted_frames)}\n")
        f.write(f"Sampling rate: {fps_sample} fps\n")
        f.write(f"Raw detections: {total_raw_detections}\n")
        f.write(f"Unique potholes: {len(unique_potholes)}\n\n")
        
        # Detailed results
        f.write("="*60 + "\n")
        f.write("DETAILED RESULTS\n")
        f.write("="*60 + "\n\n")
        
        for pothole in unique_potholes:
            f.write(f"Pothole #{pothole['id']}:\n")
            f.write(f"   First seen: Frame {pothole['first_seen_frame']} (t={pothole['first_seen_timestamp']:.1f}s)\n")
            f.write(f"   Last seen: Frame {pothole['last_seen_frame']} (t={pothole['last_seen_timestamp']:.1f}s)\n")
            f.write(f"   Appeared in: {pothole['frame_count']} frames\n")
            x, y, w, h = pothole['box']
            f.write(f"   Position: ({x}, {y})\n")
            f.write(f"   Size: {w}×{h} pixels\n")
            f.write(f"   Snapshot: {pothole['snapshot_path']}\n\n")
    
    print(f"\n✅ Summary report saved to: {report_path}")
    
    print("\n" + "="*60)
    print("✅ VIDEO PROCESSING COMPLETED")
    print("="*60)
    
    return unique_potholes, all_detections

print("✅ Complete video processing pipeline ready!")

✅ Complete video processing pipeline ready!


---
## 🧪 EXAMPLE USAGE

In [9]:
# Requires the video file at the path below

video_path = 'dataset/video.mp4'

unique_potholes, all_detections = process_video(
    video_path,
    fps_sample=1,           # Sample 1 frame per second
    distance_threshold=50,  # 50 pixels deduplication threshold
    output_dir='output_video'
)

# Print final results
print("\n" + "="*60)
print("📊 FINAL RESULTS")
print("="*60)
print(f"\nTotal unique potholes detected: {len(unique_potholes)}\n")

for pothole in unique_potholes:
    print(f"🔴 Pothole #{pothole['id']}")
    print(f"   First detected at: {pothole['first_seen_timestamp']:.1f}s")
    print(f"   Position: {pothole['box']}")
    print(f"   Appeared in {pothole['frame_count']} frames")
    print()


🎬 VIDEO PROCESSING STARTED

[1/4] Extracting frames...
📹 Extracting frames:
   - Video FPS: 59
   - Sampling: 1 fps
   - Frame interval: every 59 frames

✅ Extraction completed:
   - Total frames in video: 200
   - Frames extracted: 4
   - Reduction: 98.0%
   - Saved to: output_video\frames/

[2/4] Detecting potholes per frame...
   Frame 0 (t=0.0s): 16 potholes detected
   Frame 1 (t=1.0s): 8 potholes detected
   Frame 2 (t=2.0s): 9 potholes detected
   Frame 3 (t=3.0s): 4 potholes detected

   Total raw detections: 37

[3/4] Deduplicating potholes...
   Unique potholes: 21
   Duplicates removed: 16
   Reduction: 43.2%

[4/4] Generating output snapshots...
   Pothole #1: saved to output_video\snapshots\pothole_01.jpg
   Pothole #2: saved to output_video\snapshots\pothole_02.jpg
   Pothole #3: saved to output_video\snapshots\pothole_03.jpg
   Pothole #4: saved to output_video\snapshots\pothole_04.jpg
   Pothole #5: saved to output_video\snapshots\pothole_05.jpg
   Pothole #6: s

---
## 📊 VISUALIZATION: Display Snapshots

In [8]:
def visualize_detections(unique_potholes, max_display=6):
    """
    Display snapshots of detected potholes
    """
    n = min(len(unique_potholes), max_display)
    
    if n == 0:
        print("No potholes to display")
        return
    
    cols = 3
    rows = (n + cols - 1) // cols
    
    # squeeze=False keeps axes 2-D even when there is only one row
    fig, axes = plt.subplots(rows, cols, figsize=(15, 5*rows), squeeze=False)
    
    for i in range(rows * cols):
        row = i // cols
        col = i % cols
        
        if i < n:
            pothole = unique_potholes[i]
            img = cv2.imread(pothole['snapshot_path'])
            img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            
            axes[row, col].imshow(img_rgb)
            axes[row, col].set_title(
                f"Pothole #{pothole['id']}\nt={pothole['first_seen_timestamp']:.1f}s",
                fontsize=12, fontweight='bold'
            )
        
        # Hide axis ticks on every cell, including unused ones
        axes[row, col].axis('off')
    
    plt.tight_layout()
    plt.show()

# Example (run after process_video has produced unique_potholes)
# visualize_detections(unique_potholes)

print("✅ Visualization function ready")

✅ Visualization function ready


---
## 💡 PARAMETER TUNING GUIDE

### **fps_sample:**
```python
fps_sample = 1   # Standard (1 frame per second)
fps_sample = 2   # More frequent (2 frames per second) - more detail but slower
fps_sample = 0.5 # Less frequent (1 frame per 2 seconds) - faster but may skip potholes
```

**Recommendation:**
- Vehicle speed < 20 km/h → fps_sample = 1 (sufficient)
- Vehicle speed > 20 km/h → fps_sample = 2 (to avoid skipping potholes)
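To see what each `fps_sample` value means in frames, the sketch below recomputes the frame interval for the 30 fps, 300-frame example used earlier. It assumes the interval is the video FPS divided by `fps_sample`, rounded down and clamped to at least 1; `frame_interval` and `frames_kept` are hypothetical helper names.

```python
def frame_interval(video_fps, fps_sample):
    """Number of source frames between two sampled frames (at least 1)."""
    return max(1, int(video_fps / fps_sample))


def frames_kept(total_frames, video_fps, fps_sample):
    """How many frames survive when every frame_interval-th frame is taken."""
    return len(range(0, total_frames, frame_interval(video_fps, fps_sample)))


VIDEO_FPS = 30       # assumed source frame rate
TOTAL_FRAMES = 300   # 10 seconds at 30 fps

for fps_sample in (0.5, 1, 2):
    interval = frame_interval(VIDEO_FPS, fps_sample)
    kept = frames_kept(TOTAL_FRAMES, VIDEO_FPS, fps_sample)
    print(f"fps_sample={fps_sample}: every {interval} frames -> {kept} frames kept")
```

Doubling `fps_sample` halves the interval and doubles the number of frames that go through detection, which is the speed versus coverage trade-off described above.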

### **distance_threshold:**
```python
distance_threshold = 30   # Strict (one pothole may be counted as several)
distance_threshold = 50   # Standard (recommended)
distance_threshold = 80   # Loose (nearby potholes are more likely to be merged)
```

**If too many duplicates remain (one pothole reported several times):** Raise the threshold (50 → 70)

**If two distinct nearby potholes are merged into one:** Lower the threshold (50 → 30)
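The effect of the threshold can be illustrated on toy data. This is a simplified sketch of the greedy matching idea (compare each detection with the first-seen position of every known pothole), working on center points only; `count_unique` and the sample coordinates are assumptions for illustration.

```python
import math


def count_unique(centers, distance_threshold):
    """Greedy deduplication: a center closer than the threshold to a known one is a duplicate."""
    unique = []
    for cx, cy in centers:
        is_duplicate = any(
            math.hypot(cx - ux, cy - uy) < distance_threshold
            for ux, uy in unique
        )
        if not is_duplicate:
            unique.append((cx, cy))
    return len(unique)


# Three sightings of one pothole, one neighbor 40 px away, one far away
centers = [(100, 200), (102, 201), (98, 199), (140, 200), (300, 260)]

for threshold in (30, 50, 80):
    print(f"threshold={threshold}: {count_unique(centers, threshold)} unique potholes")
```

With these points the count drops from 3 to 2 once the threshold exceeds the 40-pixel gap to the neighbor: a larger threshold merges more detections, a smaller one keeps more of them separate.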

---

## 📋 OUTPUT FILES

```
output_video/
├── frames/                   # Extracted frames
│   ├── frame_0000_t0.0s.jpg
│   ├── frame_0001_t1.0s.jpg
│   └── ...
├── snapshots/                # Pothole snapshots
│   ├── pothole_01.jpg
│   ├── pothole_02.jpg
│   └── ...
└── detection_summary.txt     # Text report
```

---

## ⚠️ TROUBLESHOOTING

### Error: "Cannot open video"
```python
# Check if file exists
import os
print(os.path.exists(video_path))

# Try different video codec
# Convert video to MP4 H.264 format first
```

### Error: "Frame extraction too slow"
```python
# Reduce fps_sample
fps_sample = 0.5  # Fewer frames
```

### Warning: "Too many duplicates detected"
```python
# Increase distance_threshold
distance_threshold = 70  # More aggressive deduplication
```