# Model Training and Evaluation

## Notebook Overview

This notebook provides a reproducible workflow for training and evaluating VSLNet and VSLBase models on the Ego4D Natural Language Queries (NLQ) task. It is designed to run in a Google Colab environment and is organized into three main sections, followed by an optional step for saving results:

1.  **Experiment Set-up:** This section handles all the initial configuration. It mounts Google Drive, sets up a dynamic configuration for our experiments (allowing easy selection of models, features, and encoders), clones the model's source code, installs the necessary dependencies, and unpacks the dataset into the local Colab environment for fast access.

2.  **Symbolic Links & Data Preparation:** Here, we prepare the data for training. This involves creating symbolic links to point the training scripts to the correct data and feature directories. We then run the `prepare_ego4d_dataset.py` script, which preprocesses the annotation files into the format required by the model's data loaders.

3.  **Training:** This section launches the model training. It shows the command that executes the `main.py` script with the parameters defined in our configuration. By changing the configuration in Section 1, this command can be used to reproduce any of the experiments documented in our report.

4.  **Save Results to Google Drive (Optional):** After training, this step copies the experiment folder from the local Colab storage to Google Drive for permanent storage.

## 1. Experiment Set-up

### 1.1. Mount Google Drive
We begin by mounting Google Drive to access our datasets and save our experiment results.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

### 1.2. Define Experiment Configuration
This is the main control cell for all our experiments. Modify the variables in this cell to select the model, features, text encoder, and run number. The `vars.sh` script, which controls the entire workflow, will be generated automatically based on these settings.

In [None]:
# --- CHOOSE YOUR EXPERIMENT CONFIGURATION ---

MODEL_USED = "vsl_net"  # Options: "vsl_net", "vsl_base"
FEATURE_TYPE = "egovlp" # Options: "egovlp", "omnivore"
TEXT_ENCODER = "bert"   # Options: "bert", "glove"
RUN_NUMBER = 1          # An integer for the run number, e.g., 1, 2, 3

# --- AUTO-GENERATED SETTINGS ---

# Set feature directory and dimension based on FEATURE_TYPE
if FEATURE_TYPE == "egovlp":
    FEATURE_DIR_NAME = "egovlp_fp16"
    VISUAL_FEATURE_DIM = 256
elif FEATURE_TYPE == "omnivore":
    FEATURE_DIR_NAME = "omnivore_video_swinl_fp16"
    VISUAL_FEATURE_DIM = 1536
else:
    raise ValueError("Invalid FEATURE_TYPE selected.")

# Construct the unique experiment name
EXPERIMENT_NAME = f"{MODEL_USED}_{FEATURE_TYPE}_{TEXT_ENCODER}_run{RUN_NUMBER}"

# --- GENERATE THE vars.sh SCRIPT CONTENT ---

# Start the string with the shebang so that it is the first line of the file
vars_sh_content = f"""#!/bin/bash

# --- Dynamic Experiment Configuration ---
export NAME={EXPERIMENT_NAME}
export MODEL_NAME={MODEL_USED} # vsl_net or vsl_base
export VISUAL_FEATURE_TYPE={FEATURE_TYPE} # egovlp or omnivore
export TEXT_ENCODER_TYPE={TEXT_ENCODER} # bert or glove
export VISUAL_FEATURE_DIM={VISUAL_FEATURE_DIM}

# --- Static Path Configuration ---
export FEATURE_SOURCE_ZIP_PATH=/content/drive/MyDrive/EgoVisionProject/Data # change this to the directory that contains ego4d_data.zip
export DRIVE_ZIP_FILENAME=ego4d_data.zip
export MODEL_BASE_DIR=/content/drive/MyDrive/EgoVisionProject/Experiments
export LOCAL_DATA_ROOT=/content/data


# --- Derived Path Configuration ---
export TASK_NAME=nlq_official_v1_$NAME
export BASE_DIR=$LOCAL_DATA_ROOT/dataset/$TASK_NAME
export FEATURE_BASE_DIR=$LOCAL_DATA_ROOT/features/$TASK_NAME/official
export FEATURE_DIR=$LOCAL_DATA_ROOT/ego4d_data/v1/{FEATURE_DIR_NAME}
export LOCAL_ANNOTATIONS_DIR=$LOCAL_DATA_ROOT/ego4d_data/v1/annotations
export LOCAL_TRAIN_SPLIT=$LOCAL_ANNOTATIONS_DIR/nlq_train.json
export LOCAL_VAL_SPLIT=$LOCAL_ANNOTATIONS_DIR/nlq_val.json
export LOCAL_TEST_SPLIT=$LOCAL_ANNOTATIONS_DIR/nlq_test_unannotated.json
export LOCAL_MODEL_DIR=$LOCAL_DATA_ROOT/experiments
"""

# Write the content to the vars.sh file
with open("vars.sh", "w") as f:
    f.write(vars_sh_content)

print("vars.sh file generated successfully for experiment:")
print(f"--> {EXPERIMENT_NAME}")
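The configuration cell only validates `FEATURE_TYPE`. As a minimal sketch, the same check could be extended to the other settings before `vars.sh` is written. The helper names `build_experiment_name` and `all_experiment_names` are hypothetical, the allowed values are copied from the "Options" comments above, and the requirement that the run number be a positive integer is an assumption.

```python
from itertools import product

# Allowed values, copied from the "Options" comments in the configuration cell.
MODELS = ("vsl_net", "vsl_base")
FEATURES = ("egovlp", "omnivore")
TEXT_ENCODERS = ("bert", "glove")


def build_experiment_name(model, feature, encoder, run):
    """Validate the four settings and return the experiment name."""
    if model not in MODELS:
        raise ValueError(f"Invalid MODEL_USED: {model!r}")
    if feature not in FEATURES:
        raise ValueError(f"Invalid FEATURE_TYPE: {feature!r}")
    if encoder not in TEXT_ENCODERS:
        raise ValueError(f"Invalid TEXT_ENCODER: {encoder!r}")
    # Assumption: run numbers start at 1.
    if not isinstance(run, int) or run < 1:
        raise ValueError(f"RUN_NUMBER must be a positive integer, got {run!r}")
    return f"{model}_{feature}_{encoder}_run{run}"


def all_experiment_names(run=1):
    """List the name of every model/feature/encoder combination for one run."""
    return [
        build_experiment_name(m, f, e, run)
        for m, f, e in product(MODELS, FEATURES, TEXT_ENCODERS)
    ]


# Print the eight combinations that the configuration cell can produce.
for name in all_experiment_names():
    print(name)
```

A typo such as `"vslnet"` then fails immediately with a `ValueError` instead of surfacing later as a missing path or an unknown model name.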

### 1.4. Clone Model Repository
Next, we clone our GitHub repository and keep only its `VSLNet_Code` folder, which contains the core Python scripts for the model, training, and evaluation. The cell skips the clone if the folder already exists.

In [None]:
%%bash

# Clone the repository (if it doesn't already exist)
if [ ! -d "VSLNet_Code" ]; then
  git clone https://github.com/pietrogiancristofaro2001/ego4d-nlq-project.git
  # The clone contains the whole project; keep only the code folder
  mv ego4d-nlq-project/VSLNet_Code .
  rm -rf ego4d-nlq-project
  echo "Repository cloned."
else
  echo "Repository already exists."
fi


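Now, we change the notebook's working directory to `VSLNet_Code` using the `%cd` magic command, so that all subsequent cells run from this path and can call the scripts and utilities directly. Because the later cells run `source vars.sh` from this directory, we also copy the generated `vars.sh` into it.

In [None]:
%cd VSLNet_Code
# vars.sh was written to the parent directory in Section 1.2; copy it here so that `source vars.sh` works
!cp ../vars.sh .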
import os
print(f"Current directory changed to: {os.getcwd()}")

### 1.5. Install Dependencies
We install all the necessary Python libraries for the project. These are listed in the `requirements.txt` file and include packages like PyTorch, Transformers, and others.

In [None]:
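%%bash
# `%%capture` is an IPython cell magic and is not valid inside a bash cell;
# pip's --quiet flag keeps the output short instead.
pip install --quiet pandas matplotlib seaborn nltk submitit torch torchaudio torchvision tqdm transformers tensorboard Pillow terminaltables accelerate bitsandbytes sentencepiece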

### 1.6. Unpack Dataset
We copy the `ego4d_data.zip` file from Drive to the local Colab storage and unzip it. This significantly speeds up data access during training.

In [None]:
%%bash
source vars.sh

# Create the local destination directory
mkdir -p "$LOCAL_DATA_ROOT"

# Full path of the zip file on Drive
DRIVE_ZIP_FILE_PATH="$FEATURE_SOURCE_ZIP_PATH/$DRIVE_ZIP_FILENAME"
# Temporary local path to copy the zip file to
LOCAL_TEMP_ZIP_FILE="/content/$DRIVE_ZIP_FILENAME"

if [ -f "$DRIVE_ZIP_FILE_PATH" ]; then
    echo "Copying $DRIVE_ZIP_FILENAME..."
    cp "$DRIVE_ZIP_FILE_PATH" "$LOCAL_TEMP_ZIP_FILE"

    echo "Extracting file to $LOCAL_DATA_ROOT..."
    # -o overwrites existing files, -q for quiet mode
    unzip -o -q "$LOCAL_TEMP_ZIP_FILE" -d "$LOCAL_DATA_ROOT"

    echo "Removing temporary zip file..."
    rm "$LOCAL_TEMP_ZIP_FILE"

    echo "Data setup complete."
else
    echo "ERROR: File not found at $DRIVE_ZIP_FILE_PATH"
    exit 1
fi
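Before moving on, it can help to confirm that the archive unpacked into the layout that `vars.sh` expects. The following is a minimal sketch: the function name `missing_dataset_paths` is hypothetical, and the expected paths are taken from the derived path settings in `vars.sh`.

```python
from pathlib import Path


def missing_dataset_paths(data_root, feature_dir_name):
    """Return the expected dataset paths that do not exist under data_root."""
    v1 = Path(data_root) / "ego4d_data" / "v1"
    expected = [
        v1 / "annotations" / "nlq_train.json",
        v1 / "annotations" / "nlq_val.json",
        v1 / "annotations" / "nlq_test_unannotated.json",
        v1 / feature_dir_name,
    ]
    return [str(path) for path in expected if not path.exists()]
```

In the notebook this could be called as `missing_dataset_paths("/content/data", FEATURE_DIR_NAME)`; an empty list means that every expected file and folder is in place.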

## 2. Symbolic Links & Data Preparation

### 2.1. Create Symbolic Links
The training scripts expect the data and feature directories to be in specific locations. We create symbolic links (`ln -s`) to point from the expected locations to our actual data folders in the local Colab storage. This avoids modifying the core scripts. We create links for the annotations and the video features, and, when the experiment uses GloVe, for the GloVe word embeddings.

In [None]:
%%bash
source vars.sh

# --- Symbolic Link for Data/Annotations ---
# The script expects a 'data' folder in the current directory
DATA_LINK_TARGET_DIR="data"
mkdir -p "$DATA_LINK_TARGET_DIR"
# Link the 'nlq_official_v1' folder inside it, pointing to our annotations
ln -sfn "$LOCAL_ANNOTATIONS_DIR" "$DATA_LINK_TARGET_DIR/nlq_official_v1"
echo "Created symlink for annotations:"
ls -l "$DATA_LINK_TARGET_DIR"


# --- Symbolic Link for Video Features ---
# The script expects a 'features' folder in the current directory
FEATURE_LINK_TARGET_DIR="features"
mkdir -p "$FEATURE_LINK_TARGET_DIR"
# Link the specific feature directory (e.g., egovlp_fp16) inside it
ln -sfn "$FEATURE_DIR" "$FEATURE_LINK_TARGET_DIR/$VISUAL_FEATURE_TYPE"
echo -e "\nCreated symlink for video features:"
ls -l "$FEATURE_LINK_TARGET_DIR"


# --- Symbolic Link for GloVe ---
# Only create this link if the experiment uses GloVe
if [ "$TEXT_ENCODER_TYPE" == "glove" ]; then
  GLOVE_FILE_PATH="$LOCAL_DATA_ROOT/ego4d_data/v1/glove_encoder/glove.840B.300d.txt"
  EXPECTED_GLOVE_PARENT_DIR="data/features"
  mkdir -p "$EXPECTED_GLOVE_PARENT_DIR"
  ln -sfn "$GLOVE_FILE_PATH" "$EXPECTED_GLOVE_PARENT_DIR/glove.840B.300d.txt"
  echo -e "\nCreated symlink for GloVe embeddings:"
  ls -l "$EXPECTED_GLOVE_PARENT_DIR"
fi
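A symbolic link can be created successfully even when its target does not exist, so `ls -l` alone does not prove that the links work. As a small sketch, the links created above could be checked from Python; the function name `broken_symlinks` is hypothetical.

```python
import os


def broken_symlinks(paths):
    """Return the paths that are not symlinks or whose target does not exist."""
    # os.path.exists follows the link, so it is False for a dangling symlink.
    return [p for p in paths if not (os.path.islink(p) and os.path.exists(p))]
```

For the links in this section, a call such as `broken_symlinks(["data/nlq_official_v1", f"features/{FEATURE_TYPE}"])` should return an empty list when run from the `VSLNet_Code` directory.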

### 2.2. Run Data Preprocessing Script
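Now we run the `prepare_ego4d_dataset.py` script. This script reads the raw JSON annotation files (`nlq_train.json`, etc.), processes them, and saves them to `$BASE_DIR` in a format optimized for training. As its arguments indicate, it also reads the video features from `$FEATURE_DIR` and writes clip-level features to `$FEATURE_BASE_DIR`. The text queries are pre-processed using the chosen text encoder (BERT or GloVe).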

In [None]:
%%bash
source vars.sh

echo "Creating output directories..."
mkdir -p "$BASE_DIR"
mkdir -p "$FEATURE_BASE_DIR"

echo "Running data preparation script..."
python utils/prepare_ego4d_dataset.py \
    --input_train_split "$LOCAL_TRAIN_SPLIT" \
    --input_val_split "$LOCAL_VAL_SPLIT" \
    --input_test_split "$LOCAL_TEST_SPLIT" \
    --video_feature_read_path "$FEATURE_DIR" \
    --clip_feature_save_path "$FEATURE_BASE_DIR" \
    --output_save_path "$BASE_DIR"

echo "Data preparation finished."
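A quick way to confirm that the preparation step produced output is to count the files it wrote. The sketch below makes no assumption about the names of those files; the function name `summarize_dir` is hypothetical.

```python
from pathlib import Path


def summarize_dir(path):
    """Return (number of files, total size in bytes) found below path."""
    root = Path(path)
    if not root.is_dir():
        return 0, 0
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)
```

Calling it on the directories behind `$BASE_DIR` and `$FEATURE_BASE_DIR` should report a non-zero file count for both once the script has finished.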

## 3. Training

### 3.1. Launching the Training Script
This is the core step: we execute `main.py` with the parameters configured in `vars.sh` and the hyper-parameters defined at the top of the cell. The command starts training for the selected model, features, and text encoder. Model checkpoints and prediction results are written to the local Colab directory `$LOCAL_MODEL_DIR/$NAME`; Section 4 copies them to Google Drive.

In [None]:
%%bash
source vars.sh

# --- Hyper-parameter Configuration ---
export DATALOADER_WORKERS=4
export NUM_WORKERS=4
export BATCH_SIZE=128
export DIM=128
export NUM_EPOCH=100
export MAX_POS_LEN=128
export INIT_LR=0.001

# --- Construct TensorBoard Log Name ---
export TB_LOG_NAME="${NAME}_bs${BATCH_SIZE}_dim${DIM}_epoch${NUM_EPOCH}_ilr${INIT_LR}"

# Create local directories for saving models and logs, if they don't exist
mkdir -p "$LOCAL_MODEL_DIR/$NAME"

echo "--- Starting Training ---"
echo "Experiment Name: $NAME"
echo "Model: $MODEL_NAME"
echo "Video Features: $VISUAL_FEATURE_TYPE (Dim: $VISUAL_FEATURE_DIM)"
echo "Text Encoder: $TEXT_ENCODER_TYPE"
echo "--------------------------"

python main.py \
    --task $TASK_NAME \
    --mode train \
    --predictor $TEXT_ENCODER_TYPE \
    --dim $DIM \
    --video_feature_dim $VISUAL_FEATURE_DIM \
    --max_pos_len $MAX_POS_LEN \
    --init_lr $INIT_LR \
    --epochs $NUM_EPOCH \
    --batch_size $BATCH_SIZE \
    --fv official \
    --num_workers $NUM_WORKERS \
    --data_loader_workers $DATALOADER_WORKERS \
    --model_name $MODEL_NAME \
    --visual_feature $VISUAL_FEATURE_TYPE \
    --model_dir "$LOCAL_MODEL_DIR/$NAME" \
    --eval_gt_json "$LOCAL_VAL_SPLIT" \
    --log_to_tensorboard $TB_LOG_NAME \
    --tb_log_freq 5 \
    --remove_empty_queries_from train
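For readers who prefer to launch training from a Python cell, the same command line can be assembled as a list of arguments. This is a sketch only: the function name `build_train_command` and the keys of the `cfg` dictionary are hypothetical, while the flags and the TensorBoard log name mirror the bash cell above.

```python
def build_train_command(cfg):
    """Build the main.py argument list used in the bash cell above."""
    name = cfg["name"]
    tb_log_name = (
        f"{name}_bs{cfg['batch_size']}_dim{cfg['dim']}"
        f"_epoch{cfg['epochs']}_ilr{cfg['init_lr']}"
    )
    # Flags in the same order as the bash command (dicts keep insertion order).
    args = {
        "--task": f"nlq_official_v1_{name}",
        "--mode": "train",
        "--predictor": cfg["text_encoder"],
        "--dim": cfg["dim"],
        "--video_feature_dim": cfg["visual_feature_dim"],
        "--max_pos_len": cfg["max_pos_len"],
        "--init_lr": cfg["init_lr"],
        "--epochs": cfg["epochs"],
        "--batch_size": cfg["batch_size"],
        "--fv": "official",
        "--num_workers": cfg["num_workers"],
        "--data_loader_workers": cfg["data_loader_workers"],
        "--model_name": cfg["model_name"],
        "--visual_feature": cfg["visual_feature"],
        "--model_dir": f"{cfg['local_model_dir']}/{name}",
        "--eval_gt_json": cfg["val_split"],
        "--log_to_tensorboard": tb_log_name,
        "--tb_log_freq": 5,
        "--remove_empty_queries_from": "train",
    }
    cmd = ["python", "main.py"]
    for flag, value in args.items():
        cmd += [flag, str(value)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing `init_lr` as a string such as `"0.001"` keeps the log name identical to the one built in bash, since very small floats are printed in scientific notation.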

## 4. Save Results to Google Drive (Optional)
After the training is complete, the model checkpoints and prediction files are stored in the local Colab environment. This final, optional step copies the entire experiment folder from the local storage to your specified directory on Google Drive for permanent storage.

In [None]:
%%bash
source vars.sh

# Source directory (local)
SOURCE_DIR="$LOCAL_MODEL_DIR/$NAME"

# Destination directory (on Google Drive)
DEST_DIR="$MODEL_BASE_DIR"

# Check if the local experiment directory exists
if [ -d "$SOURCE_DIR" ]; then
  echo "Copying results from $SOURCE_DIR to $DEST_DIR..."
  # Create the base destination directory on Drive if it doesn't exist
  mkdir -p "$DEST_DIR"
  # Copy the entire experiment folder recursively
  cp -r "$SOURCE_DIR" "$DEST_DIR"
  echo "Copy complete!"
  echo "You can find your results in: $DEST_DIR/$NAME"
else
  echo "ERROR: Source directory $SOURCE_DIR not found. Was the training completed?"
  exit 1
fi
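The same copy can be done from a Python cell. The sketch below uses `shutil.copytree` with `dirs_exist_ok=True` (available from Python 3.8), so re-running it after further training overwrites files in the existing Drive folder instead of failing. The function name `copy_results` is hypothetical.

```python
import shutil
from pathlib import Path


def copy_results(source_dir, dest_root):
    """Copy source_dir into dest_root/<folder name> and return the target path."""
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"Source directory {source} not found")
    target = Path(dest_root) / source.name
    # Create the destination root (e.g. the Experiments folder) if needed.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

With the paths from `vars.sh`, the call would be `copy_results(f"/content/data/experiments/{EXPERIMENT_NAME}", "/content/drive/MyDrive/EgoVisionProject/Experiments")`.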