# 📓 Notebook Manager

The cell below initializes the widgets used to manage your research notebook. Run it to enable:
- Exporting cells tagged with `export` into a `clean` notebook
- Generating a dynamic Table of Contents (TOC)
- Exporting the notebook to GitHub-compatible Markdown

➡️ **Be sure to execute the next cell before continuing with any editing or exporting.**

In [5]:
# Cell 1 - Workflow Tools
import sys
sys.path.insert(0, '../../lib')

from notebook_tools import TOCWidget, ExportWidget
import ipywidgets as widgets


# Create widget instances
toc = TOCWidget()
export = ExportWidget()

# Create horizontal layout
left_side = widgets.VBox([toc.button, export.button, toc.status])
right_side = widgets.VBox([toc.output, export.output])

# Display side by side
display(widgets.HBox([left_side, right_side]))


# 📑 Table of Contents (Auto-Generated)

This section is filled in with a table of contents for your research notebook when you click the **Generate TOC** button. The table of contents helps you navigate your data collection, analysis, and findings as your citizen science project develops.

➡️ **Do not edit this cell manually. It will be overwritten automatically.**


<!-- TOC -->
# Table of Contents


<!-- /TOC -->


## 🔧 Environment Setup

This cell establishes the batch preprocessing environment by:

1. **Importing Required Libraries**
  - OpenCV (cv2) for video processing and frame extraction
  - NumPy for array operations
  - Pandas for organizing metadata and results
  - Pathlib for cross-platform file path handling
  - JSON for checkpoint persistence
  - Datetime for timestamp parsing and filtering
  - Logging for process tracking

2. **Setting System Paths**
  - Adding the mlops modules in `../lib` to the Python path
  - Verifying access to preprocessing utilities

3. **Initializing Checkpoint System**
  - Loading any previous processing state
  - Setting up progress tracking variables
  - Establishing failure recovery mechanism

**Note**: Run this cell before any batch processing cell so that all dependencies are available.

In [10]:
# Cell 2 - Environment Setup
import numpy as np
import pandas as pd
from pathlib import Path
import json
from datetime import datetime, timedelta
import logging
import os
import sys

# Add mlops modules to path
sys.path.insert(0, '../lib')

# Setup logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)

# Check for OpenCV
try:
    import cv2
    print(f"✓ OpenCV version: {cv2.__version__}")
except ImportError:
    print("⚠️ OpenCV not installed. Install with: pip install opencv-python")
    print("   Continuing without video processing capabilities...")
    cv2 = None

print(f"✓ Python version: {sys.version.split()[0]}")
print(f"✓ Working directory: {os.getcwd()}")

⚠️ OpenCV not installed. Install with: pip install opencv-python
   Continuing without video processing capabilities...
✓ Python version: 3.12.9
✓ Working directory: /home/trauco/v3-traffic-vision/notebooks/MLOps
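
The setup list mentions verifying access to the preprocessing utilities. One way to do that without importing anything heavy is sketched below, assuming a plain availability check is enough. The helper names `check_modules` and `missing_modules` are hypothetical and are not part of the notebook's `lib` code.

```python
import importlib.util
from pathlib import Path


def check_modules(names):
    """Return {module_name: True/False} without importing the modules themselves."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when the parent package of a dotted name is missing,
            # or when an already-loaded module has no usable spec.
            status[name] = False
    return status


def missing_modules(names):
    """List the module names that could not be located."""
    return [name for name, ok in check_modules(names).items() if not ok]


if __name__ == "__main__":
    lib_dir = Path("../lib")  # same relative path the setup cell adds to sys.path
    print(f"lib directory exists: {lib_dir.resolve().is_dir()}")
    for name, ok in check_modules(["numpy", "pandas", "cv2"]).items():
        print(f"{'ok' if ok else 'missing'}: {name}")
```

A non-empty result from `missing_modules([...])` could be used to stop early with a clear message instead of failing partway through a batch.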


## 📐 Batch Processing Configuration

Define the core parameters for daily batch preprocessing:

- **Target Time**: Extract frames from videos closest to 12:00 PM EST
- **Date Filter**: Process only videos from yesterday (full calendar day)
- **Frame Count**: Number of frames to extract per video
- **Input Path**: Base directory containing camera subdirectories
- **Output Path**: Where to save extracted frames
- **File Pattern**: Expected video filename format (`CAMERA_YYYYMMDD_HHMMSS.mp4`)

This configuration serves as the single source of truth for the batch processing workflow.

In [11]:
# Cell 3 - Batch Processing Configuration
from datetime import datetime, timedelta
from pathlib import Path

# Configuration
CONFIG = {
    # Time targeting
    'TARGET_TIME': '120000',  # 12:00:00 in 24-hour format (HHMMSS)
    'TARGET_HOUR': 12,

    # Date filtering - yesterday only (relative to this machine's local clock)
    'PROCESS_DATE': (datetime.now() - timedelta(days=1)).strftime('%Y%m%d'),

    # Frame extraction
    'FRAMES_PER_VIDEO': 10,

    # Paths
    'INPUT_DIR': Path.home() / 'traffic-recordings',
    'OUTPUT_DIR': Path('batch_processed_frames'),  # relative to the working directory

    # File patterns
    'VIDEO_PATTERN': '*_{date}_*.mp4',  # glob pattern; format with date=PROCESS_DATE
    'FILENAME_FORMAT': '{camera}_{date}_{time}.mp4',  # expected filename layout
}

# Display configuration
print("Batch Processing Configuration:")
print(f"  Target Date: {CONFIG['PROCESS_DATE']}")
print(f"  Target Time: {CONFIG['TARGET_TIME']} (12:00:00)")
print(f"  Frames per video: {CONFIG['FRAMES_PER_VIDEO']}")
print(f"  Input: {CONFIG['INPUT_DIR']}")
print(f"  Output: {CONFIG['OUTPUT_DIR']}")

Batch Processing Configuration:
  Target Date: 20250620
  Target Time: 120000 (12:00:00)
  Frames per video: 10
  Input: /home/trauco/traffic-recordings
  Output: batch_processed_frames
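
The date filter and time targeting described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the notebook's implementation: "EST" is taken as a fixed UTC-5 offset, filename timestamps are assumed to be in that same zone, and the helper names (`yesterday_est`, `parse_video_name`, `closest_to_target`) and the camera IDs in the comments are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
import re

# Assumption: "EST" means a fixed UTC-5 offset (no daylight saving).
EST = timezone(timedelta(hours=-5), "EST")

# Matches names like CAM01_20250620_115930.mp4 (the camera ID is hypothetical).
VIDEO_NAME = re.compile(r"^(?P<camera>.+)_(?P<date>\d{8})_(?P<time>\d{6})\.mp4$")


def yesterday_est(now=None):
    """Return yesterday's date in EST as a YYYYMMDD string (expects an aware datetime)."""
    now = now or datetime.now(timezone.utc)
    return (now.astimezone(EST) - timedelta(days=1)).strftime("%Y%m%d")


def parse_video_name(name):
    """Split a filename into (camera, timestamp); return None if it does not match."""
    match = VIDEO_NAME.match(name)
    if match is None:
        return None
    try:
        stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    except ValueError:  # digits are present but do not form a real date/time
        return None
    return match["camera"], stamp


def closest_to_target(names, process_date, target_time="120000"):
    """For each camera, pick the file on process_date nearest to target_time."""
    target = datetime.strptime(process_date + target_time, "%Y%m%d%H%M%S")
    best = {}
    for name in names:
        parsed = parse_video_name(Path(name).name)
        if parsed is None or parsed[1].date() != target.date():
            continue  # skip unrelated files and videos from other days
        camera, stamp = parsed
        gap = abs(stamp - target)
        if camera not in best or gap < best[camera][0]:
            best[camera] = (gap, name)
    return {camera: name for camera, (gap, name) in best.items()}
```

With the configuration cell, a call such as `closest_to_target(file_names, CONFIG['PROCESS_DATE'], CONFIG['TARGET_TIME'])` would return one filename per camera. If the recordings follow daylight saving time, `zoneinfo.ZoneInfo("America/New_York")` may be a better fit than the fixed offset.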


## 💾 Initialize Checkpoint System

Create checkpoint functionality to track processing progress and enable recovery from interruptions. This system saves state after each video completes, allowing the workflow to resume from the last successful video if stopped.
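
A minimal sketch of such a checkpoint follows, assuming state is kept in a small JSON file. The file name `batch_checkpoint.json`, the state layout, and the function names are hypothetical; the notebook does not specify them.

```python
import json
import os
from pathlib import Path

# Hypothetical checkpoint location; the notebook does not specify one.
CHECKPOINT_FILE = Path("batch_checkpoint.json")


def load_checkpoint(path=CHECKPOINT_FILE):
    """Load saved state, or start fresh if no readable checkpoint exists."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {"completed": [], "failed": {}}


def save_checkpoint(state, path=CHECKPOINT_FILE):
    """Write state to a temporary file, then swap it into place in one step."""
    path = Path(path)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp, path)  # avoids leaving a half-written checkpoint behind


def run_batch(videos, process_video, path=CHECKPOINT_FILE):
    """Process videos in order, skipping those already marked completed."""
    state = load_checkpoint(path)
    done = set(state["completed"])
    for video in videos:
        if video in done:
            continue
        try:
            process_video(video)
        except Exception as exc:  # record the failure and move on to the next video
            state["failed"][video] = str(exc)
        else:
            state["completed"].append(video)
            state["failed"].pop(video, None)
        save_checkpoint(state, path)  # persist after every video
    return state
```

Because the state is saved after every video, rerunning `run_batch` with the same list after an interruption skips the completed entries and retries only the ones that failed or were never reached.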