# Sign Language Data Exploration Notebook

This notebook explores sign language video data for the Sign2Text project. We'll load, visualize, and analyze the data to understand its characteristics before model development.

In [None]:
# Import necessary libraries
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import mediapipe as mp
from pathlib import Path
from tqdm.notebook import tqdm
import plotly.express as px
import plotly.graph_objects as go
from IPython.display import HTML, display
from matplotlib.animation import FuncAnimation

# Add parent directory to path for imports
sys.path.append('..')
from data_processing.preprocess import SignVideoProcessor

# Set up plotting
plt.style.use('ggplot')
%matplotlib inline

# Configure Mediapipe
mp_drawing = mp.solutions.drawing_utils
mp_holistic = mp.solutions.holistic

## 1. Data Loading

First, let's set up paths to our data and load it. The code below expects a directory of sign language videos and, optionally, an annotations CSV. If the CSV is missing, a placeholder annotations table is built from the video files.
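
The cells in this notebook read a `filename` column and a `label` column from the annotations table, and use a `filepath` column when one is present. As a minimal sketch of checking that layout before going further (the helper name `missing_annotation_columns`, the constant `REQUIRED_COLUMNS`, and the sample rows are illustrative assumptions, not part of the project):

```python
import pandas as pd

# Columns the notebook reads; 'filepath' is optional (hypothetical constant name)
REQUIRED_COLUMNS = ("filename", "label")

def missing_annotation_columns(df, required=REQUIRED_COLUMNS):
    """Return the required columns that are absent from the annotations table."""
    return [col for col in required if col not in df.columns]

# Made-up rows, for illustration only
example = pd.DataFrame({
    "filename": ["hello_001", "thanks_004"],
    "label": ["hello", "thank_you"],
})
print(missing_annotation_columns(example))                        # []
print(missing_annotation_columns(example.drop(columns="label")))  # ['label']
```

An empty list means the later cells should find the columns they index into.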

In [None]:
# Set paths to data
# Update these paths to point to your actual data
DATA_ROOT = "../../../data"  # Change this to your data directory
VIDEO_DIR = os.path.join(DATA_ROOT, "videos")
ANNOTATIONS_FILE = os.path.join(DATA_ROOT, "annotations.csv")

# Check if paths exist
print(f"DATA_ROOT exists: {os.path.exists(DATA_ROOT)}")
print(f"VIDEO_DIR exists: {os.path.exists(VIDEO_DIR)}")
print(f"ANNOTATIONS_FILE exists: {os.path.exists(ANNOTATIONS_FILE)}")

In [None]:
# Load annotations if available
annotations = None
if os.path.exists(ANNOTATIONS_FILE):
    annotations = pd.read_csv(ANNOTATIONS_FILE)
    print(f"Loaded {len(annotations)} annotations")
    display(annotations.head())
else:
    print("Annotations file not found. We'll create one from video files.")
    
    # Create a list of video files
    video_files = []
    for ext in [".mp4", ".avi", ".mov"]:
        video_files.extend(list(Path(VIDEO_DIR).glob(f"**/*{ext}")))
    
    # Create a dataframe with file information
    annotations = pd.DataFrame({
        "filename": [p.stem for p in video_files],
        "filepath": [str(p) for p in video_files],
        # Placeholder for labels - you'll need to add actual labels
        "label": ["unknown" for _ in video_files]
    })
    
    print(f"Created annotations for {len(annotations)} video files")
    display(annotations.head())

## 2. Basic Data Analysis

Let's analyze the dataset to understand its characteristics.
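
Alongside the bar chart, a single imbalance number can be handy when comparing dataset versions. The sketch below is one possible summary, assuming labels are available as any iterable of strings (for example `annotations['label']`); the function name `class_balance` and the sample labels are made up for illustration.

```python
from collections import Counter

def class_balance(labels):
    """Summarize label counts: class count, largest and smallest class, and their ratio."""
    counts = Counter(labels)
    if not counts:
        return None
    largest = max(counts.values())
    smallest = min(counts.values())
    return {
        "num_classes": len(counts),
        "largest": largest,
        "smallest": smallest,
        "imbalance_ratio": largest / smallest,
    }

# Made-up labels, for illustration only
summary = class_balance(["hello", "hello", "hello", "yes", "no", "no"])
print(summary)
```

A ratio close to 1 indicates balanced classes; a large ratio suggests that resampling or class weighting may be worth considering later.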

In [None]:
# Analyze the distribution of sign classes
if 'label' in annotations.columns:
    plt.figure(figsize=(12, 6))
    counts = annotations['label'].value_counts()
    
    # If there are many classes, show only top 30
    if len(counts) > 30:
        counts = counts.nlargest(30)
        title = "Distribution of Top 30 Sign Classes"
    else:
        title = "Distribution of Sign Classes"
        
    sns.barplot(x=counts.index, y=counts.values)
    plt.title(title)
    plt.xlabel('Sign Class')
    plt.ylabel('Count')
    plt.xticks(rotation=90)
    plt.tight_layout()
    plt.show()
    
    # Print some statistics
    print(f"Number of unique classes: {annotations['label'].nunique()}")
    print(f"Most common class: {annotations['label'].value_counts().idxmax()} ({annotations['label'].value_counts().max()} instances)")
    print(f"Least common class: {annotations['label'].value_counts().idxmin()} ({annotations['label'].value_counts().min()} instances)")

In [None]:
# Analyze video properties
def get_video_properties(video_path):
    """Extract basic properties of a video file."""
    try:
        cap = cv2.VideoCapture(video_path)
        if not cap.isOpened():
            return None
        
        # Extract properties
        width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = cap.get(cv2.CAP_PROP_FPS)
        frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        duration = frame_count / fps if fps > 0 else 0
        
        cap.release()
        
        return {
            'width': width,
            'height': height,
            'fps': fps,
            'frame_count': frame_count,
            'duration': duration
        }
    except Exception as e:
        print(f"Error processing {video_path}: {e}")
        return None

# Sample a subset of videos for analysis
sample_size = min(100, len(annotations))  # Analyze up to 100 videos
sample_videos = annotations.sample(sample_size, random_state=42)

# Collect video properties
video_properties = []
for idx, row in tqdm(sample_videos.iterrows(), total=len(sample_videos), desc="Analyzing videos"):
    video_path = row['filepath'] if 'filepath' in row else os.path.join(VIDEO_DIR, row['filename'])
    if not os.path.exists(video_path) and 'filepath' not in row:
        # Try common extensions if filepath not specified
        for ext in [".mp4", ".avi", ".mov"]:
            test_path = os.path.join(VIDEO_DIR, f"{row['filename']}{ext}")
            if os.path.exists(test_path):
                video_path = test_path
                break
    
    props = get_video_properties(video_path)
    if props:
        props['filename'] = row['filename']
        # Keep the resolved path so later cells can reopen the same file
        props['filepath'] = video_path
        if 'label' in row:
            props['label'] = row['label']
        video_properties.append(props)

# Convert to DataFrame
video_props_df = pd.DataFrame(video_properties)
display(video_props_df.head())

# Display summary statistics
display(video_props_df.describe())

In [None]:
# Visualize distributions of video properties
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Duration distribution
sns.histplot(video_props_df['duration'], kde=True, ax=axes[0, 0])
axes[0, 0].set_title('Video Duration Distribution (seconds)')
axes[0, 0].set_xlabel('Duration (s)')

# Frame count distribution
sns.histplot(video_props_df['frame_count'], kde=True, ax=axes[0, 1])
axes[0, 1].set_title('Frame Count Distribution')
axes[0, 1].set_xlabel('Number of Frames')

# Resolution distribution
video_props_df['resolution'] = video_props_df['width'].astype(str) + 'x' + video_props_df['height'].astype(str)
res_counts = video_props_df['resolution'].value_counts()
sns.barplot(x=res_counts.index, y=res_counts.values, ax=axes[1, 0])
axes[1, 0].set_title('Video Resolution Distribution')
axes[1, 0].set_xlabel('Resolution')
axes[1, 0].set_ylabel('Count')
axes[1, 0].tick_params(axis='x', rotation=90)

# FPS distribution
sns.histplot(video_props_df['fps'], kde=True, ax=axes[1, 1])
axes[1, 1].set_title('FPS Distribution')
axes[1, 1].set_xlabel('Frames Per Second')

plt.tight_layout()
plt.show()

## 3. Video Visualization

Let's visualize some sample videos from the dataset.
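
The `display_video_frames` helper below samples frames with an integer stride of `frame_count // max_frames`. For a 39-frame clip and 20 requested frames that stride is 1, so only the first 20 frames are shown. As an alternative sketch (the function name `evenly_spaced_indices` is hypothetical), frame indices can be spread across the whole clip with integer arithmetic:

```python
def evenly_spaced_indices(frame_count, num_samples):
    """Return up to num_samples distinct frame indices spread across the whole clip."""
    if frame_count <= 0 or num_samples <= 0:
        return []
    # Never ask for more samples than there are frames
    num_samples = min(num_samples, frame_count)
    return [(i * frame_count) // num_samples for i in range(num_samples)]

print(evenly_spaced_indices(39, 20)[-3:])  # [33, 35, 37]
print(evenly_spaced_indices(5, 20))        # [0, 1, 2, 3, 4]
```

The returned indices could then be compared against the running frame counter while reading the video, in place of the modulo test on the stride.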

In [None]:
def display_video_frames(video_path, max_frames=20):
    """Display up to max_frames frames from a video, sampled at a fixed stride."""
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        print(f"Could not open video: {video_path}")
        return
    
    # Get video properties
    frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    duration = frame_count / fps if fps > 0 else 0
    
    print(f"Video: {os.path.basename(video_path)}")
    print(f"Duration: {duration:.2f} seconds, Frames: {frame_count}, FPS: {fps:.1f}")
    
    # Calculate sampling interval to get desired number of frames
    if frame_count > max_frames:
        sample_interval = frame_count // max_frames
    else:
        sample_interval = 1
    
    frames = []
    frame_idx = 0
    
    # Read frames
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
            
        if frame_idx % sample_interval == 0:
            # Convert from BGR to RGB for display
            frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frames.append(frame_rgb)
            
            if len(frames) >= max_frames:
                break
                
        frame_idx += 1
    
    cap.release()
    
    if not frames:
        print("No frames could be read from the video.")
        return
    
    # Display frames in a grid (up to 4 frames per row)
    cols = min(4, len(frames))
    rows = (len(frames) + cols - 1) // cols
    # squeeze=False always returns a 2D array of axes, so flatten() works for any grid shape
    fig, axes = plt.subplots(rows, cols, figsize=(15, 3*rows), squeeze=False)
    axes = axes.flatten()
        
        
    for i, frame in enumerate(frames):
        if i < len(axes):
            axes[i].imshow(frame)
            axes[i].set_title(f"Frame {i*sample_interval}")
            axes[i].axis('off')
    
    # Hide unused subplots
    for i in range(len(frames), len(axes)):
        fig.delaxes(axes[i])
        
    plt.tight_layout()
    plt.show()

In [None]:
# Select a random video to visualize
if len(video_properties) > 0:
    random_video = video_props_df.sample(1).iloc[0]
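    # Prefer a stored file path when available; otherwise build one from the filename
    video_path = random_video.get('filepath') or os.path.join(VIDEO_DIR, random_video['filename'])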
    
    # Try common extensions if needed
    if not os.path.exists(video_path):
        for ext in [".mp4", ".avi", ".mov"]:
            test_path = os.path.join(VIDEO_DIR, f"{random_video['filename']}{ext}")
            if os.path.exists(test_path):
                video_path = test_path
                break
    
    if os.path.exists(video_path):
        display_video_frames(video_path)
    else:
        print(f"Could not find video file for {random_video['filename']}")

## 4. Hand Landmark Detection

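Let's explore landmark detection using MediaPipe Holistic, which returns pose landmarks plus separate left-hand and right-hand landmarks for each frame. Hand landmarks are the main input for sign language recognition.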
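
MediaPipe's hand model reports 21 landmarks per hand, each with `x`, `y`, and `z` fields. For numerical work it is often easier to hold them in an array. The sketch below shows one way to do that conversion; the helper name `hand_to_array` is an assumption, and a `SimpleNamespace` stand-in replaces a real MediaPipe result so the example runs without MediaPipe installed.

```python
from types import SimpleNamespace
import numpy as np

NUM_HAND_LANDMARKS = 21  # MediaPipe's hand model has 21 landmarks per hand

def hand_to_array(hand_landmarks):
    """Convert an object with a .landmark sequence of (x, y, z) points to a (21, 3) array.

    Returns an all-NaN array when the hand was not detected (None).
    """
    if hand_landmarks is None:
        return np.full((NUM_HAND_LANDMARKS, 3), np.nan)
    return np.array([[p.x, p.y, p.z] for p in hand_landmarks.landmark], dtype=float)

# Stand-in for a detected hand (made-up coordinates)
fake_hand = SimpleNamespace(
    landmark=[SimpleNamespace(x=0.01 * i, y=0.02 * i, z=0.0)
              for i in range(NUM_HAND_LANDMARKS)]
)
print(hand_to_array(fake_hand).shape)        # (21, 3)
print(np.isnan(hand_to_array(None)).all())   # True
```

Using NaN for missing hands keeps every frame the same shape, which makes it straightforward to stack frames into a `(num_frames, 21, 3)` array.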

In [None]:
def detect_landmarks(video_path, max_frames=20):
    """Detect hand and pose landmarks in a video using MediaPipe."""
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        print(f"Could not open video: {video_path}")
        return
    
    # Initialize MediaPipe Holistic
    with mp_holistic.Holistic(
        min_detection_confidence=0.5,
        min_tracking_confidence=0.5) as holistic:
        
        frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        
        # Calculate sampling interval
        if frame_count > max_frames:
            sample_interval = frame_count // max_frames
        else:
            sample_interval = 1
        
        frames = []
        landmarks = []
        frame_idx = 0
        
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:
                break
                
            if frame_idx % sample_interval == 0:
                # Convert to RGB for MediaPipe
                image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                
                # Process frame with MediaPipe
                results = holistic.process(image)
                
                # Draw landmarks on the image
                annotated_image = image.copy()
                
                # Draw pose landmarks
                mp_drawing.draw_landmarks(
                    annotated_image, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
                
                # Draw left hand landmarks
                mp_drawing.draw_landmarks(
                    annotated_image, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
                
                # Draw right hand landmarks
                mp_drawing.draw_landmarks(
                    annotated_image, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
                
                frames.append(annotated_image)
                landmarks.append({
                    'pose': results.pose_landmarks,
                    'left_hand': results.left_hand_landmarks,
                    'right_hand': results.right_hand_landmarks
                })
                
                if len(frames) >= max_frames:
                    break
            
            frame_idx += 1
    
    cap.release()
    
    if not frames:
        print("No frames could be read from the video.")
        return landmarks
    
    # Display frames in a grid (up to 4 frames per row)
    cols = min(4, len(frames))
    rows = (len(frames) + cols - 1) // cols
    # squeeze=False always returns a 2D array of axes, so flatten() works for any grid shape
    fig, axes = plt.subplots(rows, cols, figsize=(15, 3*rows), squeeze=False)
    axes = axes.flatten()
        
        
    for i, frame in enumerate(frames):
        if i < len(axes):
            axes[i].imshow(frame)
            axes[i].set_title(f"Frame {i*sample_interval}")
            axes[i].axis('off')
    
    # Hide unused subplots
    for i in range(len(frames), len(axes)):
        fig.delaxes(axes[i])
        
    plt.tight_layout()
    plt.show()
    
    return landmarks

In [None]:
# Detect landmarks in a sample video
if len(video_properties) > 0:
    random_video = video_props_df.sample(1).iloc[0]
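    # Prefer a stored file path when available; otherwise build one from the filename
    video_path = random_video.get('filepath') or os.path.join(VIDEO_DIR, random_video['filename'])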
    
    # Try common extensions if needed
    if not os.path.exists(video_path):
        for ext in [".mp4", ".avi", ".mov"]:
            test_path = os.path.join(VIDEO_DIR, f"{random_video['filename']}{ext}")
            if os.path.exists(test_path):
                video_path = test_path
                break
    
    if os.path.exists(video_path):
        landmarks = detect_landmarks(video_path)
    else:
        print(f"Could not find video file for {random_video['filename']}")

## 5. Landmark Analysis

Let's analyze the detected landmarks for potential features.
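
One simple quantity that can be derived from a wrist trajectory is its total path length. The sketch below assumes positions are stored as `(x, y, z)` tuples with `(None, None, None)` for frames without a detection, the same layout the analysis function in this section builds; the helper name `path_length` and the sample track are hypothetical.

```python
import math

def path_length(positions):
    """Sum the distances between consecutive detected positions.

    Frames without a detection hold (None, None, None) and are skipped,
    so each gap is bridged by a straight line.
    """
    detected = [p for p in positions if p[0] is not None]
    return sum(math.dist(a, b) for a, b in zip(detected, detected[1:]))

# Made-up track: one missing frame, then two moves of length 5 and 12
track = [(0.0, 0.0, 0.0), (None, None, None), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(path_length(track))  # 17.0
```

Because MediaPipe reports `x` and `y` normalized to the image width and height, the result is in normalized units rather than pixels.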

In [None]:
def analyze_hand_landmarks(landmarks):
    """Analyze hand landmarks from multiple frames."""
    if not landmarks:
        return
    
    # Count frames with detected hands
    left_hand_detected = sum(1 for l in landmarks if l['left_hand'] is not None)
    right_hand_detected = sum(1 for l in landmarks if l['right_hand'] is not None)
    both_hands_detected = sum(1 for l in landmarks 
                             if l['left_hand'] is not None and l['right_hand'] is not None)
    
    print(f"Total frames analyzed: {len(landmarks)}")
    print(f"Frames with left hand detected: {left_hand_detected} ({left_hand_detected/len(landmarks)*100:.1f}%)")
    print(f"Frames with right hand detected: {right_hand_detected} ({right_hand_detected/len(landmarks)*100:.1f}%)")
    print(f"Frames with both hands detected: {both_hands_detected} ({both_hands_detected/len(landmarks)*100:.1f}%)")
    
    # Extract and plot hand movements
    left_hand_positions = []
    right_hand_positions = []
    
    for frame_landmarks in landmarks:
        # Get left hand wrist position
        if frame_landmarks['left_hand'] is not None:
            wrist = frame_landmarks['left_hand'].landmark[0]  # Wrist is landmark 0
            left_hand_positions.append((wrist.x, wrist.y, wrist.z))
        else:
            left_hand_positions.append((None, None, None))
            
        # Get right hand wrist position
        if frame_landmarks['right_hand'] is not None:
            wrist = frame_landmarks['right_hand'].landmark[0]  # Wrist is landmark 0
            right_hand_positions.append((wrist.x, wrist.y, wrist.z))
        else:
            right_hand_positions.append((None, None, None))
    
    # Plot hand trajectories (2D)
    plt.figure(figsize=(12, 6))
    
    # Left hand
    left_x = [p[0] for p in left_hand_positions if p[0] is not None]
    left_y = [p[1] for p in left_hand_positions if p[1] is not None]
    if left_x and left_y:
        plt.plot(left_x, left_y, 'b-', alpha=0.7, label='Left Hand')
        plt.scatter(left_x[0], left_y[0], color='blue', s=100, marker='o', label='Start')
        plt.scatter(left_x[-1], left_y[-1], color='blue', s=100, marker='x', label='End')
    
    # Right hand
    right_x = [p[0] for p in right_hand_positions if p[0] is not None]
    right_y = [p[1] for p in right_hand_positions if p[1] is not None]
    if right_x and right_y:
        plt.plot(right_x, right_y, 'r-', alpha=0.7, label='Right Hand')
        plt.scatter(right_x[0], right_y[0], color='red', s=100, marker='o')
        plt.scatter(right_x[-1], right_y[-1], color='red', s=100, marker='x')
    
    plt.title('Hand Trajectory')
    plt.xlabel('X Coordinate')
    plt.ylabel('Y Coordinate')
    plt.gca().invert_yaxis()  # Invert Y-axis to match image coordinates
    plt.legend()
    plt.grid(True)
    plt.show()
    
    # Create a 3D visualization if we have enough data points.
    # left_x/left_y and right_x/right_y from the 2D plot above are reused; only z is new.
    left_z = [p[2] for p in left_hand_positions if p[2] is not None]
    right_z = [p[2] for p in right_hand_positions if p[2] is not None]
    
    if len(left_x) > 5 or len(right_x) > 5:
        fig = plt.figure(figsize=(12, 10))
        ax = fig.add_subplot(111, projection='3d')
        
        if len(left_x) > 5:
            ax.plot(left_x, left_y, left_z, 'b-', alpha=0.7, label='Left Hand')
            ax.scatter(left_x[0], left_y[0], left_z[0], color='blue', s=100, marker='o')
            ax.scatter(left_x[-1], left_y[-1], left_z[-1], color='blue', s=100, marker='x')
            
        if len(right_x) > 5:
            ax.plot(right_x, right_y, right_z, 'r-', alpha=0.7, label='Right Hand')
            ax.scatter(right_x[0], right_y[0], right_z[0], color='red', s=100, marker='o')
            ax.scatter(right_x[-1], right_y[-1], right_z[-1], color='red', s=100, marker='x')
            
        ax.set_title('3D Hand Trajectory')
        ax.set_xlabel('X Coordinate')
        ax.set_ylabel('Y Coordinate')
        ax.set_zlabel('Z Coordinate')
        ax.legend()
        plt.show()
    
    return {
        'left_hand_positions': left_hand_positions,
        'right_hand_positions': right_hand_positions
    }

In [None]:
# Analyze landmarks if available
if 'landmarks' in locals() and landmarks:
    hand_movement = analyze_hand_landmarks(landmarks)

## 6. Feature Engineering

Based on our analysis, let's explore potential features for sign language recognition.
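
Raw landmark coordinates depend on where the signer stands and how large the hand appears in the frame. A common way to reduce that dependence, sketched here under the assumption that a hand is available as a `(21, 3)` array, is to move the wrist to the origin and divide by the wrist-to-middle-finger-MCP distance. The function name `normalize_hand` and the random test hand are illustrative only.

```python
import numpy as np

WRIST = 0        # MediaPipe hand landmark index for the wrist
MIDDLE_MCP = 9   # MediaPipe hand landmark index for the middle-finger MCP joint

def normalize_hand(points, eps=1e-8):
    """Translate so the wrist is the origin, then scale by the wrist-to-middle-MCP distance."""
    points = np.asarray(points, dtype=float)
    centered = points - points[WRIST]
    scale = np.linalg.norm(centered[MIDDLE_MCP])
    # eps guards against division by zero for degenerate input
    return centered / max(scale, eps)

# Random stand-in for a detected hand, then the same hand shifted and enlarged
rng = np.random.default_rng(0)
hand = rng.random((21, 3))
shifted_and_scaled = 2.5 * hand + np.array([0.3, -0.1, 0.0])
print(np.allclose(normalize_hand(hand), normalize_hand(shifted_and_scaled)))  # True
```

This handles translation and uniform scale only; rotation of the hand relative to the camera is left untouched.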

In [None]:
def extract_hand_features(landmarks):
    """Extract useful features from hand landmarks."""
    if not landmarks:
        return None
    
    features = []
    
    for frame_idx, frame_landmarks in enumerate(landmarks):
        frame_features = {
            'frame_idx': frame_idx,
            'left_hand_present': frame_landmarks['left_hand'] is not None,
            'right_hand_present': frame_landmarks['right_hand'] is not None,
        }
        
        # Left hand features
        if frame_landmarks['left_hand'] is not None:
            # Calculate hand centroid
            x_sum = y_sum = z_sum = 0
            for landmark in frame_landmarks['left_hand'].landmark:
                x_sum += landmark.x
                y_sum += landmark.y
                z_sum += landmark.z
            
            num_landmarks = len(frame_landmarks['left_hand'].landmark)
            frame_features['left_hand_centroid_x'] = x_sum / num_landmarks
            frame_features['left_hand_centroid_y'] = y_sum / num_landmarks
            frame_features['left_hand_centroid_z'] = z_sum / num_landmarks
            
            # Calculate spread of hand (variance as a measure of how open the hand is)
            x_vars = sum((landmark.x - frame_features['left_hand_centroid_x'])**2 
                        for landmark in frame_landmarks['left_hand'].landmark) / num_landmarks
            y_vars = sum((landmark.y - frame_features['left_hand_centroid_y'])**2 
                        for landmark in frame_landmarks['left_hand'].landmark) / num_landmarks
            
            frame_features['left_hand_spread'] = (x_vars + y_vars) ** 0.5
        else:
            # Set default values when hand not present
            frame_features['left_hand_centroid_x'] = None
            frame_features['left_hand_centroid_y'] = None
            frame_features['left_hand_centroid_z'] = None
            frame_features['left_hand_spread'] = None
        
        # Right hand features (similar to left hand)
        if frame_landmarks['right_hand'] is not None:
            # Calculate hand centroid
            x_sum = y_sum = z_sum = 0
            for landmark in frame_landmarks['right_hand'].landmark:
                x_sum += landmark.x
                y_sum += landmark.y
                z_sum += landmark.z
            
            num_landmarks = len(frame_landmarks['right_hand'].landmark)
            frame_features['right_hand_centroid_x'] = x_sum / num_landmarks
            frame_features['right_hand_centroid_y'] = y_sum / num_landmarks
            frame_features['right_hand_centroid_z'] = z_sum / num_landmarks
            
            # Calculate spread of hand
            x_vars = sum((landmark.x - frame_features['right_hand_centroid_x'])**2 
                        for landmark in frame_landmarks['right_hand'].landmark) / num_landmarks
            y_vars = sum((landmark.y - frame_features['right_hand_centroid_y'])**2 
                        for landmark in frame_landmarks['right_hand'].landmark) / num_landmarks
            
            frame_features['right_hand_spread'] = (x_vars + y_vars) ** 0.5
        else:
            # Set default values when hand not present
            frame_features['right_hand_centroid_x'] = None
            frame_features['right_hand_centroid_y'] = None
            frame_features['right_hand_centroid_z'] = None
            frame_features['right_hand_spread'] = None
            
        # Hand distance feature (when both hands are present)
        if frame_features['left_hand_present'] and frame_features['right_hand_present']:
            dx = frame_features['left_hand_centroid_x'] - frame_features['right_hand_centroid_x']
            dy = frame_features['left_hand_centroid_y'] - frame_features['right_hand_centroid_y']
            dz = frame_features['left_hand_centroid_z'] - frame_features['right_hand_centroid_z']
            frame_features['hand_distance'] = (dx**2 + dy**2 + dz**2)**0.5
        else:
            frame_features['hand_distance'] = None
        
        features.append(frame_features)
    
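    # Convert to DataFrame and coerce feature columns to float so missing values become NaN.
    # A column that is None in every frame would otherwise be object-typed,
    # which matplotlib may not be able to plot.
    df = pd.DataFrame(features)
    numeric_cols = [c for c in df.columns
                    if c not in ('frame_idx', 'left_hand_present', 'right_hand_present')]
    df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors='coerce')
    return df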

In [None]:
# Extract features if landmarks are available
if 'landmarks' in locals() and landmarks:
    features_df = extract_hand_features(landmarks)
    display(features_df.head())
    
    # Visualize features
    plt.figure(figsize=(12, 8))
    
    # Create subplot for hand presence
    plt.subplot(3, 1, 1)
    plt.plot(features_df['frame_idx'], features_df['left_hand_present'], 'b-', label='Left Hand')
    plt.plot(features_df['frame_idx'], features_df['right_hand_present'], 'r-', label='Right Hand')
    plt.title('Hand Presence')
    plt.xlabel('Frame')
    plt.ylabel('Present')
    plt.legend()
    plt.grid(True)
    
    # Create subplot for hand spread
    plt.subplot(3, 1, 2)
    plt.plot(features_df['frame_idx'], features_df['left_hand_spread'], 'b-', label='Left Hand Spread')
    plt.plot(features_df['frame_idx'], features_df['right_hand_spread'], 'r-', label='Right Hand Spread')
    plt.title('Hand Spread (measure of how open the hand is)')
    plt.xlabel('Frame')
    plt.ylabel('Spread')
    plt.legend()
    plt.grid(True)
    
    # Create subplot for hand distance
    plt.subplot(3, 1, 3)
    plt.plot(features_df['frame_idx'], features_df['hand_distance'], 'g-')
    plt.title('Distance Between Hands')
    plt.xlabel('Frame')
    plt.ylabel('Distance')
    plt.grid(True)
    
    plt.tight_layout()
    plt.show()

## 7. Batch Processing

Let's process multiple videos to get a more comprehensive view of the dataset.
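
Once per-frame features from several videos sit in one table, a per-video summary makes them easier to compare than overlaid line plots. The sketch below assumes a table with a `video_name` column and a per-frame feature column such as `left_hand_spread`, matching the columns produced in this section; the helper name `summarize_per_video` and the demo values are made up.

```python
import pandas as pd

def summarize_per_video(features, column="left_hand_spread"):
    """Aggregate one per-frame feature into per-video mean, std, and detection count."""
    return features.groupby("video_name")[column].agg(["mean", "std", "count"])

# Made-up values for illustration; NaN marks a frame where the hand was not detected
demo = pd.DataFrame({
    "video_name": ["a.mp4", "a.mp4", "a.mp4", "b.mp4", "b.mp4"],
    "left_hand_spread": [0.10, 0.20, float("nan"), 0.30, 0.50],
})
summary = summarize_per_video(demo)
print(summary)
```

The `count` column excludes NaN values, so it doubles as the number of frames in which the hand was detected.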

In [None]:
def process_multiple_videos(video_paths, max_videos=5):
    """Process multiple videos to extract features."""
    # Limit to max_videos
    video_paths = video_paths[:max_videos]
    
    all_features = []
    all_video_names = []
    
    # Initialize processor
    processor = SignVideoProcessor()
    
    for path in tqdm(video_paths, desc="Processing videos"):
        video_name = os.path.basename(path)
        
        try:
            # Process video
            result = processor.process_video(path, max_frames=30)
            
            # Extract hand landmarks
            if 'hand_landmarks' in result:
                # Convert to list of dicts
                landmarks = []
                for i in range(len(result['hand_landmarks'])):
                    landmarks.append({
                        'left_hand': result['hand_landmarks'][i]['left'],
                        'right_hand': result['hand_landmarks'][i]['right'],
                        'pose': None  # No pose in this processor output
                    })
                
                # Extract features
                features = extract_hand_features(landmarks)
                if features is not None:
                    features['video_name'] = video_name
                    all_features.append(features)
                    all_video_names.append(video_name)
        except Exception as e:
            print(f"Error processing {video_name}: {e}")
    
    # Concatenate features
    if all_features:
        combined_features = pd.concat(all_features, ignore_index=True)
        return combined_features, all_video_names
    else:
        return None, all_video_names

In [None]:
# Process multiple videos
# Get paths to a few videos
sample_size = min(5, len(video_props_df))
sample_videos = video_props_df.sample(sample_size, random_state=42)

video_paths = []
for idx, row in sample_videos.iterrows():
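    # Prefer a stored file path when available; otherwise build one from the filename
    video_path = row.get('filepath') or os.path.join(VIDEO_DIR, row['filename'])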
    
    # Try common extensions if needed
    if not os.path.exists(video_path):
        for ext in [".mp4", ".avi", ".mov"]:
            test_path = os.path.join(VIDEO_DIR, f"{row['filename']}{ext}")
            if os.path.exists(test_path):
                video_path = test_path
                break
    
    if os.path.exists(video_path):
        video_paths.append(video_path)

if video_paths:
    batch_features, video_names = process_multiple_videos(video_paths)
    
    if batch_features is not None:
        print(f"Processed {len(video_names)} videos")
        display(batch_features.head())
        
        # Visualize hand spread across videos
        plt.figure(figsize=(12, 6))
        for video_name in video_names:
            video_data = batch_features[batch_features['video_name'] == video_name]
            plt.plot(video_data['frame_idx'], video_data['left_hand_spread'], '-', label=f"{video_name} (Left)")
        
        plt.title('Left Hand Spread Comparison')
        plt.xlabel('Frame')
        plt.ylabel('Spread')
        plt.legend()
        plt.grid(True)
        plt.show()
else:
    print("No video paths found")

## 8. Conclusions

Based on our exploration, we can draw the following conclusions for the Sign2Text project:

### Key Findings

1. **Video Characteristics**: Our dataset consists of videos with varying durations, resolutions, and frame rates. Most videos are [summary of analysis].

2. **Hand Detection**: MediaPipe successfully detects hand landmarks in most frames, with [percentage]% of frames having at least one hand detected.

3. **Candidate Features**: Features worth evaluating for sign recognition include:
   - Hand trajectory (position over time)
   - Hand spread (openness of the hand)
   - Relative position between hands
   - Hand velocity and acceleration (not computed in this notebook)

4. **Data Quality**: Overall, the dataset appears [assessment of quality] for training a sign language recognition model. Some challenges include [list challenges if any].
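
Hand velocity and acceleration appear in the feature list above but are not computed in the earlier cells. A minimal finite-difference sketch follows, assuming centroid positions sampled at a constant rate; the function name `velocity_and_acceleration` and the synthetic track are hypothetical.

```python
import numpy as np

def velocity_and_acceleration(positions, fps):
    """Estimate per-frame velocity and acceleration by finite differences.

    positions: (num_frames, dims) array of coordinates sampled at a constant rate.
    fps: sampling rate of those positions in frames per second.
    """
    positions = np.asarray(positions, dtype=float)
    dt = 1.0 / fps
    velocity = np.gradient(positions, dt, axis=0)
    acceleration = np.gradient(velocity, dt, axis=0)
    return velocity, acceleration

# Synthetic constant-velocity motion: x advances 0.02 per frame at 25 fps (0.5 units/second)
t = np.arange(10) / 25.0
positions = np.stack([0.5 * t, np.zeros_like(t)], axis=1)
velocity, acceleration = velocity_and_acceleration(positions, fps=25)
print(np.allclose(velocity[:, 0], 0.5), np.allclose(acceleration, 0.0))  # True True
```

If frames were subsampled before landmark extraction, the effective rate is the video's FPS divided by the sampling stride, and that value should be passed as `fps`. NaN entries from missing detections propagate into neighbouring estimates, so gaps may need interpolation first.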

### Next Steps

Based on this exploration, we recommend the following next steps:

1. **Data Processing Pipeline**:
   - Implement consistent preprocessing to handle videos of different durations and resolutions
   - Extract and normalize hand landmarks using MediaPipe
   - Implement data augmentation for better model generalization

2. **Feature Engineering**:
   - Create temporal features capturing hand movement patterns
   - Include both raw landmarks and engineered features
   - Normalize features to be invariant to person size and camera position

3. **Model Architecture**:
   - Use a hybrid CNN-LSTM or Transformer architecture
   - Combine visual features from frames with landmark features
   - Consider attention mechanisms to focus on important frames in the sequence

4. **Evaluation Metrics**:
   - Track both frame-level and sequence-level accuracy
   - Use Word Error Rate (WER) for text output evaluation
   - Consider user studies for real-world effectiveness
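
Word Error Rate, mentioned under evaluation metrics, is the word-level edit distance between the reference and the predicted text, divided by the number of reference words. The sketch below is a plain-Python illustration that assumes whitespace tokenization; the function name `word_error_rate` and the example sentences are made up.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)

print(word_error_rate("my name is sam", "my name sam"))  # 0.25
print(word_error_rate("thank you", "thank you"))         # 0.0
```

WER can exceed 1.0 when the prediction contains many extra words, so it should not be read as a percentage of words that are wrong.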