## Comma AI Calibration Challenge

The purpose of this notebook is to work out how to turn the video files into usable data. I have 5 HEVC video files, each 1 minute long. Each video is recorded at 20 frames per second, which corresponds to 20 × 60 = 1200 frames per video. Each HEVC file is read in as its 1200 individual frames; the following code block imports the packages used for that and for the later steps.

In [20]:
# Import packages
import os 
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import Sequence
from tensorflow.keras.preprocessing.image import load_img, img_to_array



ModuleNotFoundError: No module named 'tensorflow'
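
The frame-extraction step itself is not shown in this cell. As a rough sketch only, the expected frame count and an `ffmpeg` command for dumping one video to JPEGs could be built as below. The helper names, the `labeled/0.hevc` input path, and the `frame_%04d.jpg` naming are assumptions for illustration; the `extracted_frames/video_0` layout follows the loader used later in the notebook. The command is only constructed here, not executed.

```python
import os


def expected_frame_count(fps: int, duration_s: int) -> int:
    """Number of frames in a clip recorded at a constant frame rate."""
    return fps * duration_s


def build_ffmpeg_command(video_path: str, out_dir: str) -> list:
    """Build (but do not run) an ffmpeg call that writes one JPEG per frame.

    Zero-padded names (frame_0000.jpg, ...) keep lexicographic order
    identical to frame order, which matters when the files are sorted later.
    """
    pattern = os.path.join(out_dir, "frame_%04d.jpg")  # hypothetical naming scheme
    return ["ffmpeg", "-i", video_path, "-start_number", "0", pattern]


print(expected_frame_count(20, 60))
cmd = build_ffmpeg_command("labeled/0.hevc", "extracted_frames/video_0")  # assumed paths
print(" ".join(cmd))
```

To actually run the extraction, the list could be passed to `subprocess.run(cmd, check=True)`, assuming `ffmpeg` is installed and the output directory already exists.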

----

### Data Preprocessing and Data Augmentation 

In the following code block I take the pitch and yaw angles from each of the label text files and merge them into a single DataFrame, with additional columns for the video name and the frame index.

In [19]:
# Path to the directory containing the text files
directory = 'labeled'

# List to store the data (video, frame, pitch, yaw)
data = []

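# Iterate over all files in the directory (sorted, because os.listdir returns
# entries in arbitrary order and the row order should be reproducible)
for filename in sorted(os.listdir(directory)):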
    if filename.endswith(".txt"):
        # Extract the video title from the filename (e.g., "0" from "0.txt")
        video_title = os.path.splitext(filename)[0]
        
        # Create the full file path
        filepath = os.path.join(directory, filename)
        
        # Open and read the contents of the text file
        with open(filepath, 'r') as file:
            content = file.read().strip()
            
            # Assuming the content contains alternating pitch and yaw values for each frame
            values = list(map(float, content.split()))
            
            # Iterate over the values two at a time (pitch, yaw) and generate frame data;
            # the "- 1" skips a trailing unpaired value instead of raising IndexError
            for i in range(0, len(values) - 1, 2):
                frame_number = i // 2  # Frame index
                pitch = values[i]
                yaw = values[i + 1]
                
                # Append a row of data (video title, frame number, pitch, yaw) to the list
                data.append([video_title, frame_number, pitch, yaw])

# Create a DataFrame from the data
df = pd.DataFrame(data, columns=['video', 'frame', 'pitch', 'yaw'])
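
To make the pairing logic concrete, here is a small self-contained sketch of the same parsing applied to an in-memory string rather than a file. The function name `parse_labels` and the sample values are made up for illustration; it assumes, as the cell above does, that each file alternates pitch and yaw values separated by whitespace.

```python
import math


def parse_labels(text: str) -> list:
    """Turn whitespace-separated 'pitch yaw' pairs into (frame, pitch, yaw) tuples."""
    values = [float(tok) for tok in text.split()]
    rows = []
    # Step through the flat list two values at a time;
    # a trailing unpaired value is ignored rather than raising IndexError.
    for i in range(0, len(values) - 1, 2):
        rows.append((i // 2, values[i], values[i + 1]))
    return rows


# Made-up sample: three frames, the last one without usable labels
sample = "0.03 0.01\n0.04 0.02\nnan nan\n"
rows = parse_labels(sample)

print(rows[:2])
print(math.isnan(rows[2][1]))  # float() accepts the string "nan"
```

Because `float("nan")` parses without error, frames with missing labels end up as NaN rows in the DataFrame and have to be filtered out explicitly later.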



In [21]:

# Define paths
extracted_frames_dir = 'extracted_frames'
labeled_dir = 'labeled'
unlabeled_dir = 'unlabeled'

# Function to load image paths and corresponding labels
def load_labeled_data(extracted_frames_dir, labeled_dir):
    image_paths = []
    pitches = []
    yaws = []
    
    # Iterate over each labeled video
    for video_id in range(5):  # Videos 0 to 4
        video_folder = f"video_{video_id}"
        video_frames_dir = os.path.join(extracted_frames_dir, video_folder)
        label_file = os.path.join(labeled_dir, f"{video_id}.txt")
        
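        if not os.path.exists(label_file):
            print(f"Warning: Label file {label_file} does not exist.")
            continue

        # os.listdir below would raise FileNotFoundError on a missing frame folder
        if not os.path.isdir(video_frames_dir):
            print(f"Warning: Frame directory {video_frames_dir} does not exist.")
            continue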
        
        # Load labels from the txt file
        labels = np.loadtxt(label_file)
        if labels.ndim == 1:
            labels = labels.reshape(-1, 2)  # Ensure it's a 2D array
        
        # Check number of frames and labels match
        # (sorted() is lexicographic, so this assumes zero-padded frame filenames)
        frame_files = sorted([f for f in os.listdir(video_frames_dir) if f.endswith('.jpg')])
        num_frames = len(frame_files)
        num_labels = labels.shape[0]
        
        if num_frames != num_labels:
            print(f"Warning: Number of frames ({num_frames}) and labels ({num_labels}) do not match for video {video_id}.")
            min_length = min(num_frames, num_labels)
            frame_files = frame_files[:min_length]
            labels = labels[:min_length]
        
        # Assign labels to each frame
        for frame_file, (pitch, yaw) in zip(frame_files, labels):
            frame_path = os.path.join(video_frames_dir, frame_file)
            image_paths.append(frame_path)
            pitches.append(pitch)
            yaws.append(yaw)
    
    return np.array(image_paths), np.array(pitches), np.array(yaws)

# Load all labeled data
image_paths, pitches, yaws = load_labeled_data(extracted_frames_dir, labeled_dir)

print(f"Total labeled frames: {len(image_paths)}")

# Remove frames whose pitch or yaw label is NaN
valid_indices = ~np.isnan(pitches) & ~np.isnan(yaws)
image_paths = image_paths[valid_indices]
pitches = pitches[valid_indices]
yaws = yaws[valid_indices]

print(f"Total labeled frames after removing NaNs: {len(image_paths)}")


Total labeled frames: 5996
Total labeled frames after removing NaNs: 5019
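
The counts above mean that 5996 - 5019 = 977 frames had a NaN pitch or yaw and were dropped. The masking step can be illustrated on a toy example; the arrays below are made up and only stand in for `image_paths`, `pitches`, and `yaws`.

```python
import numpy as np

# Made-up stand-ins for the arrays returned by load_labeled_data
paths = np.array(["f0.jpg", "f1.jpg", "f2.jpg", "f3.jpg"])
pitch = np.array([0.03, np.nan, 0.04, 0.05])
yaw = np.array([0.01, 0.02, np.nan, 0.02])

# A frame is kept only if both of its labels are real numbers
valid = ~np.isnan(pitch) & ~np.isnan(yaw)

# The same boolean mask is applied to every array so they stay aligned
kept_paths = paths[valid]
kept_pitch = pitch[valid]
kept_yaw = yaw[valid]

print(valid.tolist())
print(kept_paths.tolist())
print(f"dropped {int((~valid).sum())} of {len(valid)} frames")
```

Applying one shared mask to all three arrays is what keeps each image path paired with its own labels after filtering.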


----

### Data Augmentation and Generator