# Video Processing for Eye Tracking Analysis

This notebook processes eye tracking data from a research study that combines video recordings with gaze tracking information. The main functions include:
- Converting WMV video files to MP4 format
- Processing eye tracking data from CSV files
- Overlaying gaze positions on video recordings
- Chopping videos based on annotation markers

## Prerequisites

- Required libraries: pandas, cv2 (OpenCV), numpy
- Custom utilities from the ivr_utils package
- Access to the data directory containing participant recordings

## Data Organization

The code expects:

- A server data directory containing:
  - Eyetracking data in CSV format
  - Video recordings in WMV format
- CSV files contain columns:
  - Timestamp: Time in milliseconds
  - Gaze X/Y: Screen coordinates of gaze position
  - SlideEvent: Events like "StartMedia"
  - Respondent Annotations active: Used for video chopping
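
To illustrate how the `Respondent Annotations active` column could drive chopping, the sketch below finds the start and end timestamps of each contiguous annotated run. It assumes the column is empty (NaN) when no annotation is active and holds a label otherwise. That encoding, the function name `annotation_segments`, and the sample values are assumptions for illustration; they are not taken from the study data or from `ivr_utils`.

```python
import pandas as pd


def annotation_segments(
    df: pd.DataFrame, col: str = "Respondent Annotations active"
) -> pd.DataFrame:
    """Return one row per contiguous run of rows where `col` is non-null."""
    active = df[col].notna()
    # A new run starts wherever the active flag changes value.
    run_id = active.astype(int).diff().ne(0).cumsum()
    return (
        df.loc[active]
        .groupby(run_id[active])["Timestamp"]
        .agg(start_ms="min", end_ms="max")
        .reset_index(drop=True)
    )


# Synthetic sample (hypothetical values, timestamps in milliseconds).
sample = pd.DataFrame(
    {
        "Timestamp": [0, 10, 20, 30, 40, 50, 60],
        "Respondent Annotations active": [
            None, "Scene A", "Scene A", None, None, "Scene B", "Scene B",
        ],
    }
)
segments = annotation_segments(sample)
print(segments)
```

Each row of `segments` gives a candidate cut range; converting those millisecond bounds to frame indices depends on the frame rate of the video being chopped.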

## Potential Improvements
Consider adding:
- Error handling for missing or corrupted files
- Progress indicators for long processing operations
- Validation of input data format and content
- Configuration options for visualization parameters
- Memory optimization for large video files
- Batch processing capabilities for multiple participants
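
As one possible starting point for the input validation item, the sketch below checks a loaded DataFrame before any video work begins. The function name `validate_gaze_frame` and the specific checks (required columns, presence of a `StartMedia` event, sorted timestamps) are assumptions about what would be useful here, not part of `ivr_utils`.

```python
import pandas as pd

REQUIRED_COLUMNS = [
    "Timestamp",
    "SlideEvent",
    "Gaze X",
    "Gaze Y",
    "Respondent Annotations active",
]


def validate_gaze_frame(df: pd.DataFrame) -> None:
    """Raise ValueError if the frame is missing columns or looks malformed."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if not (df["SlideEvent"] == "StartMedia").any():
        raise ValueError("No 'StartMedia' event found in SlideEvent")
    # Assumption: samples are expected to be in chronological order.
    if not df["Timestamp"].is_monotonic_increasing:
        raise ValueError("Timestamp column is not sorted in increasing order")


# Synthetic frame (hypothetical values) that passes every check.
good = pd.DataFrame(
    {
        "Timestamp": [0, 8, 16],
        "SlideEvent": [None, "StartMedia", None],
        "Gaze X": [None, 960.0, 962.5],
        "Gaze Y": [None, 540.0, 541.0],
        "Respondent Annotations active": [None, None, None],
    }
)
validate_gaze_frame(good)  # returns None when the frame is acceptable
```

Calling it right after `pd.read_csv` would surface a malformed export before the long-running conversion and overlay steps start.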

In [1]:
import pandas as pd
import cv2
from pathlib import Path
from typing import Tuple
from ivr_utils.ivr_utils import (
    find_participant_files,
    convert_wmv_to_mp4,
    pyav_timestamps,
    process_video,
)
import numpy as np

import logging

# Configure logging with timestamp, level and message
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(levelname)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)


# Constants
PARTICIPANT_ID = "P48"

local_data_dir = Path.cwd().parent / "data"
server_data_dir = Path("/Volumes/ritd-ag-project-rd01wq-tober63/SSID IVR Study 1/")
output_dir = local_data_dir.joinpath("output/2025-01-21-test/")

# Check that the server data directory exists
if not server_data_dir.is_dir():
    raise FileNotFoundError(f"Server data directory not found: {server_data_dir}")

# Create the output directory if it does not exist
output_dir.mkdir(parents=True, exist_ok=True)

## Find Files for participant

In [2]:
eyetracking_dir = server_data_dir / "Eyetracking"

part_csv_path, part_wmv_path = find_participant_files(PARTICIPANT_ID, eyetracking_dir)
print(f"Participant CSV: {part_csv_path}")
print(f"Participant WMV: {part_wmv_path}")

Scenario video directory: /Volumes/ritd-ag-project-rd01wq-tober63/SSID IVR Study 1/Eyetracking/SSID AV1/Gaze Replays
Found video files: [PosixPath('/Volumes/ritd-ag-project-rd01wq-tober63/SSID IVR Study 1/Eyetracking/SSID AV1/Gaze Replays/Scene_P48_ScreenRecording-1_(0,OTHER,1005).wmv')]
Participant CSV: /Volumes/ritd-ag-project-rd01wq-tober63/SSID IVR Study 1/Eyetracking/SSID AV1/Sensor Data/001_P48.csv
Participant WMV: /Volumes/ritd-ag-project-rd01wq-tober63/SSID IVR Study 1/Eyetracking/SSID AV1/Gaze Replays/Scene_P48_ScreenRecording-1_(0,OTHER,1005).wmv


## Import CSV file for participant

In [None]:
needed_columns = ["Timestamp", "SlideEvent", "Gaze X", "Gaze Y", "Respondent Annotations active"]
points = pd.read_csv(part_csv_path, skiprows=lambda x: x < 26, usecols=needed_columns, engine="c")


# Find the timestamp of the first "StartMedia" event
start_rows = points[points["SlideEvent"] == "StartMedia"]
if start_rows.empty:
    raise ValueError(f"No 'StartMedia' event found in {part_csv_path}")
timestamp_diff = start_rows["Timestamp"].iloc[0]

# Drop samples that have no gaze position
points = points.dropna(subset=["Gaze X", "Gaze Y"])

# Subtract the StartMedia timestamp so that timestamps are relative to it
points["Timestamp"] = points["Timestamp"] - timestamp_diff



## Checking timestamps

In [11]:
# av_timestamps = pyav_timestamps(output_chopping_path)

# print(f"Number of timestamps from AV (i.e. n frames): {len(av_timestamps)}")
# print(av_timestamps[:10])

# tdiff = np.diff(av_timestamps)
# print(tdiff[:10])
# print(f"Average time difference: {np.mean(tdiff)}")
# print(f"Standard deviation of time difference: {np.std(tdiff)}")
# print(f"Equivalent FPS {1 / (np.mean(tdiff) / 1000)}")

## Convert WMV to MP4

In [12]:
variablefps_result = convert_wmv_to_mp4(
    part_wmv_path,
    output_dir.joinpath("output_test_variablefps.mp4"),
    output_fps=None,
)
variablefps_result.print_status()


2025-01-21 11:33:19 - INFO - Checking existing MP4 file
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x1087f8380] moov atom not found
OpenCV: Couldn't read video stream from file "/Users/mitch/Documents/UCL/Papers_2025/SSID_IVR_Study/data/output/2025-01-21-test/output_test_variablefps.mp4"
2025-01-21 11:33:19 - INFO - Converting WMV to MP4
frame=19405 fps= 50 q=-0.0 size=  995584KiB time=00:37:06.64 bitrate=3662.8kbits/s speed=5.77x    

Conversion successful


frame=19417 fps= 50 q=-0.0 Lsize=  995856KiB time=00:37:07.64 bitrate=3662.2kbits/s speed=5.76x    


In [13]:
fixedfps_result = convert_wmv_to_mp4(
    part_wmv_path,
    output_dir.joinpath("output_test_fixedfps.mp4"),
    output_fps=30,
)
fixedfps_result.print_status()


2025-01-21 11:39:46 - INFO - Converting WMV to MP4
[vf#0:0 @ 0x138623b90] More than 1000 frames duplicated45.83 bitrate=3569.0kbits/s dup=997 drop=0 speed=2.07x    
[vf#0:0 @ 0x138623b90] More than 10000 frames duplicated8.40 bitrate=4570.4kbits/s dup=9982 drop=0 speed=2.15x    
frame=66799 fps= 65 q=-0.0 size= 1202688KiB time=00:37:06.60 bitrate=4424.9kbits/s dup=47408 drop=0 speed=2.17x    

Conversion successful


frame=66831 fps= 65 q=-0.0 Lsize= 1203251KiB time=00:37:07.66 bitrate=4424.8kbits/s dup=47414 drop=0 speed=2.17x    
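
The `dup=47414` figure in the log above follows from the frame counts of the two conversions. The short calculation below is only a check on the logged numbers; the variable names are illustrative.

```python
# Values copied from the final ffmpeg progress lines of the two conversions above.
source_frames = 19417        # frames written by the variable-fps conversion
fixed_frames = 66831         # frames written by the 30 fps conversion
duration_s = 37 * 60 + 7.66  # 00:37:07.66 from the fixed-fps log

duplicated = fixed_frames - source_frames    # frames repeated to fill the gaps
avg_source_fps = source_frames / duration_s  # mean frame rate of the recording
expected_frames = duration_s * 30            # frames needed to cover the duration at 30 fps

print(f"duplicated frames: {duplicated}")
print(f"average source fps: {avg_source_fps:.2f}")
print(f"expected frames at 30 fps: {expected_frames:.0f}")
```

The difference matches the logged `dup` count exactly, so about 71% of the frames in the fixed-fps file are repeats of an earlier frame. This is worth keeping in mind when interpreting per-frame gaze overlays.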


## Chopping and Overlaying Gaze point

In [None]:
N_FRAMES_PROC = 30 * 120  # 30 fps * 120 seconds = 3600 frames

# input_video_path = result.output_path
input_video_path = output_dir.joinpath("output_test_fixedfps.mp4")
output_chopping_path = output_dir.joinpath("output_chopped_test2_fixedfps.mp4")
output_overlay_path = output_dir.joinpath("output_overlay_test_fixedfps.mp4")

process_video(
    input_video_path, output_chopping_path, output_overlay_path, points, N_FRAMES_PROC
)


2025-01-21 12:40:06 - INFO - process_video - Reading video from /Users/mitch/Documents/UCL/Papers_2025/SSID_IVR_Study/data/output/2025-01-21-test/output_test_fixedfps.mp4
2025-01-21 12:40:06 - DEBUG - process_video - instantiating VideoWriter with /Users/mitch/Documents/UCL/Papers_2025/SSID_IVR_Study/data/output/2025-01-21-test/output_chopped_test2_fixedfps.mp4
2025-01-21 12:40:06 - DEBUG - process_video - Instatiating VideoWriter with /Users/mitch/Documents/UCL/Papers_2025/SSID_IVR_Study/data/output/2025-01-21-test/output_chopped_test2_fixedfps.mp4
2025-01-21 12:40:06 - DEBUG - process_video - Instatiating VideoWriter with /Users/mitch/Documents/UCL/Papers_2025/SSID_IVR_Study/data/output/2025-01-21-test/output_overlay_test_fixedfps.mp4


  0%|          | 0/3600 [00:00<?, ?it/s]

2025-01-21 12:41:49 - DEBUG - chopping_video - releasing resources
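
The internals of `process_video` are not shown in this notebook. As a rough sketch of how gaze samples could be matched to the frames of a fixed-fps video, the example below assigns each frame the gaze sample nearest in time using `pandas.merge_asof`. The helper name `gaze_per_frame`, the 30 fps default, and the synthetic data are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def gaze_per_frame(points: pd.DataFrame, n_frames: int, fps: float = 30.0) -> pd.DataFrame:
    """Pick, for each frame, the gaze sample closest in time (timestamps in ms)."""
    frames = pd.DataFrame(
        {
            "frame": np.arange(n_frames),
            "frame_ms": np.arange(n_frames) * 1000.0 / fps,
        }
    )
    # merge_asof needs both keys sorted and of the same dtype.
    gaze = points.sort_values("Timestamp").astype({"Timestamp": "float64"})
    return pd.merge_asof(
        frames, gaze, left_on="frame_ms", right_on="Timestamp", direction="nearest"
    )


# Synthetic gaze samples every 10 ms (hypothetical values).
demo_points = pd.DataFrame(
    {
        "Timestamp": np.arange(0, 200, 10),
        "Gaze X": np.linspace(100, 290, 20),
        "Gaze Y": np.full(20, 540.0),
    }
)
matched = gaze_per_frame(demo_points, n_frames=4)
print(matched[["frame", "frame_ms", "Timestamp", "Gaze X", "Gaze Y"]])
```

In an overlay loop, the `Gaze X` and `Gaze Y` values of each matched row could then be rounded to integers and passed to a drawing call such as `cv2.circle` before the frame is written out.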
