Set the AWS credentials by defining environment variables with the `os` module. The cell assigns the AWS access key ID and secret access key so that later cells can authenticate with AWS services.

Provide your own access key ID and secret access key.

In [1]:
import os
os.environ['AWS_ACCESS_KEY_ID'] = ""
os.environ['AWS_SECRET_ACCESS_KEY'] = ""
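
As an alternative to pasting keys into the notebook, the credentials can be read interactively. The following is a minimal sketch, not part of the original notebook; `set_aws_credentials` is a hypothetical helper name, and the only assumption is that the two environment variables above are what the later cells read.

```python
import os
from getpass import getpass


def set_aws_credentials(key_id: str, secret_key: str) -> None:
    """Export AWS credentials to the environment, rejecting empty values."""
    key_id, secret_key = key_id.strip(), secret_key.strip()
    if not key_id or not secret_key:
        raise ValueError("Both the access key ID and the secret access key are required.")
    os.environ["AWS_ACCESS_KEY_ID"] = key_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = secret_key


# Interactive use in a notebook (getpass prompts without echoing the input):
# set_aws_credentials(getpass("AWS access key ID: "), getpass("AWS secret access key: "))
```

Failing early on an empty value avoids a less obvious authentication error later, when the download cells run.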

Install the AWS CLI on a Linux system and configure it with the AWS credentials set above. The bash cell downloads, unzips, installs, and configures the AWS CLI.

In [2]:
%%bash
# Install and configure the AWS CLI; the Ego4D CLI and the annotations are handled in later cells

# Set up the AWS CLI
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip -o awscliv2.zip >/dev/null
sudo ./aws/install >/dev/null 2>&1
aws configure set aws_access_key_id "$AWS_ACCESS_KEY_ID" && aws configure set aws_secret_access_key "$AWS_SECRET_ACCESS_KEY"
rm "awscliv2.zip"

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
 72 58.0M   72 42.2M    0     0   124M      0 --:--:-- --:--:-- --:--:--  124M
100 58.0M  100 58.0M    0     0   132M      0 --:--:-- --:--:-- --:--:--  132M
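
Because the install step discards its output, a quick check that the `aws` executable is actually on `PATH` can save a confusing failure later. This is a small sketch using only the standard library; `cli_available` is a hypothetical helper name.

```python
import shutil


def cli_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


# Confirm the install step worked before relying on the CLI.
if not cli_available("aws"):
    print("aws CLI not found on PATH; re-run the install cell.")
```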


Install the `ego4d` Python package with pip.



In [3]:
!pip install ego4d

Collecting ego4d
  Downloading ego4d-1.7.3.tar.gz (94 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 94.5/94.5 kB 5.5 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting boto3 (from ego4d)
  Downloading boto3-1.35.17-py3-none-any.whl.metadata (6.6 kB)
Collecting dataclasses-json (from ego4d)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting iopath (from ego4d)
  Downloading iopath-0.1.10.tar.gz (42 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.2/42.2 kB 3.4 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting botocore<1.36.0,>=1.35.17 (from boto3->ego4d)
  Downloading bo

Use the `ego4d` CLI to download the annotations and the `omnivore_video_swinl_fp16` features for the NLQ benchmark (version v1) into the `/content/ego4d_data/` directory in Colab.

In [4]:
# Download the Ego4D Annotations to ego4d_data/
!ego4d --output_directory="/content/ego4d_data" --datasets annotations omnivore_video_swinl_fp16 --benchmarks nlq --version v1 -y

Datasets to download: {'omnivore_video_swinl_fp16', 'annotations'}
Download Path: /content/ego4d_data/v1
Downloading Ego4D metadata json..
Ego4D Metadata: /content/ego4d_data/ego4d.json
Checking requested datasets and versions...
Created download directory for version 'v1' of dataset: 'omnivore_video_swinl_fp16' at: /content/ego4d_data/v1/omnivore_video_swinl_fp16
Filtering by benchmarks: ['nlq']
Created download directory for version 'v1' of dataset: 'annotations' at: /content/ego4d_data/v1/annotations
Benchmarks specified but ignored without a benchmarks field in manifest.
Retrieving object metadata from S3...
100% 1290/1290 [00:01<00:00, 972.66object/s] 
Checking if latest file versions are already downloaded...
100% 1290/1290 [00:15<00:00, 85.28file/s]
No existing videos to filter.
Downloading 1290 files..
100% 12.8G/12.8G [04:11<00:00, 240MiB/s]
Checking file integrity...
100% 12.8G/12.8G [04:11<00:00, 54.8MiB/s]
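
The download command can also be assembled programmatically, which helps when the same call is repeated with different datasets or output directories. The sketch below uses only the flags that appear in the cell above (`--output_directory`, `--datasets`, `--benchmarks`, `--version`, `-y`); `build_ego4d_command` is a hypothetical helper, not part of the `ego4d` package.

```python
from typing import List, Sequence


def build_ego4d_command(
    output_directory: str,
    datasets: Sequence[str],
    benchmarks: Sequence[str] = (),
    version: str = "v1",
) -> List[str]:
    """Assemble the argument list for the ego4d CLI call used in this notebook."""
    cmd = ["ego4d", f"--output_directory={output_directory}", "--datasets", *datasets]
    if benchmarks:
        cmd += ["--benchmarks", *benchmarks]
    cmd += ["--version", version, "-y"]  # -y is passed exactly as in the cell above
    return cmd


cmd = build_ego4d_command(
    "/content/ego4d_data",
    ["annotations", "omnivore_video_swinl_fp16"],
    ["nlq"],
)
print(" ".join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```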


Use the `ego4d` CLI to download the Omnivore Video SwinL FP16 features for the NLQ benchmark (version v1) and save them under the `/content/ego4d_data/features` directory.

In [5]:
!ego4d --output_directory="/content/ego4d_data/features" --datasets omnivore_video_swinl_fp16 --benchmarks nlq --version v1 -y

Datasets to download: {'omnivore_video_swinl_fp16'}
Download Path: /content/ego4d_data/features/v1
Downloading Ego4D metadata json..
Ego4D Metadata: /content/ego4d_data/features/ego4d.json
Checking requested datasets and versions...
Created download directory for version 'v1' of dataset: 'omnivore_video_swinl_fp16' at: /content/ego4d_data/features/v1/omnivore_video_swinl_fp16
Filtering by benchmarks: ['nlq']
Retrieving object metadata from S3...
100% 1259/1259 [00:01<00:00, 949.75object/s]
Checking if latest file versions are already downloaded...
100% 1259/1259 [00:14<00:00, 85.77file/s] 
No existing videos to filter.
Downloading 1259 files..
100% 10.3G/10.3G [03:56<00:00, 51.0MiB/s]
Checking file integrity...
100% 10.3G/10.3G [03:57<00:00, 46.7MiB/s]


Check that the NLQ annotations are in the expected location.

In [6]:
!ls /content/ego4d_data/v1/annotations | grep nlq

nlq_test_unannotated.json
nlq_train.json
nlq_val.json


Check that all video features are in the expected location.

In [7]:
!ls /content/ego4d_data/v1/omnivore_video_swinl_fp16 | wc -l

1260
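
The two shell checks above can be combined into one Python check that reports missing annotation files and counts the entries in the feature directory. This is a sketch under the directory layout shown in the outputs above; `check_download` and `EXPECTED_NLQ_FILES` are hypothetical names.

```python
from pathlib import Path
from typing import List, Tuple

# File names taken from the `ls` output above.
EXPECTED_NLQ_FILES = ("nlq_train.json", "nlq_val.json", "nlq_test_unannotated.json")


def check_download(root: str) -> Tuple[List[str], int]:
    """Return (missing annotation files, number of entries in the feature directory)."""
    base = Path(root)
    annotations = base / "annotations"
    features = base / "omnivore_video_swinl_fp16"
    missing = [name for name in EXPECTED_NLQ_FILES if not (annotations / name).is_file()]
    n_features = sum(1 for _ in features.iterdir()) if features.is_dir() else 0
    return missing, n_features


# Example for the paths used in this notebook:
# missing, n_features = check_download("/content/ego4d_data/v1")
# print("missing annotations:", missing, "| feature entries:", n_features)
```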


Clone the Git repository that contains the VSLNet and VSLBase models.

In [8]:
!git clone https://github.com/saghal/episodic-memory

Cloning into 'episodic-memory'...
remote: Enumerating objects: 840, done.
remote: Counting objects: 100% (191/191), done.
remote: Compressing objects: 100% (82/82), done.
remote: Total 840 (delta 122), reused 112 (delta 109), pack-reused 649 (from 1)
Receiving objects: 100% (840/840), 98.11 MiB | 29.38 MiB/s, done.
Resolving deltas: 100% (354/354), done.


In [9]:
%cd /content/episodic-memory/NLQ/VSLNet/

/content/episodic-memory/NLQ/VSLNet


Create a shell script (`vars.sh`) that defines the environment variables for the project: dataset paths, feature directories, and the model checkpoint directory. The script ends by changing the working directory to `/content/episodic-memory/NLQ/VSLNet`.

In [10]:
with open("vars.sh", "w") as out_f:
  out_f.write("""
export NAME=omnivore_video_fp16
export TASK_NAME=nlq_official_v1_$NAME
export BASE_DIR=data/dataset/nlq_official_v1_$NAME
export FEATURE_BASE_DIR=data/features/nlq_official_v1_$NAME/
export FEATURE_DIR=$FEATURE_BASE_DIR/video_features
export MODEL_BASE_DIR=/content/nlq_official_v1/checkpoints/

cd /content/episodic-memory/NLQ/VSLNet
"""
  )
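
To see what the variables in `vars.sh` expand to without running bash, the definitions can be resolved in Python. This is an illustrative sketch that mimics shell expansion with `string.Template`; the `DEFINITIONS` list is copied by hand from the script above and `resolve` is a hypothetical helper.

```python
from string import Template

# Definitions copied from vars.sh, in order; later entries may reference earlier ones.
DEFINITIONS = [
    ("NAME", "omnivore_video_fp16"),
    ("TASK_NAME", "nlq_official_v1_$NAME"),
    ("BASE_DIR", "data/dataset/nlq_official_v1_$NAME"),
    ("FEATURE_BASE_DIR", "data/features/nlq_official_v1_$NAME/"),
    ("FEATURE_DIR", "$FEATURE_BASE_DIR/video_features"),
    ("MODEL_BASE_DIR", "/content/nlq_official_v1/checkpoints/"),
]


def resolve(definitions):
    """Expand $VAR references in order, the way consecutive `export` lines would."""
    env = {}
    for key, raw in definitions:
        env[key] = Template(raw).substitute(env)
    return env


resolved = resolve(DEFINITIONS)
for key, value in resolved.items():
    print(f"{key}={value}")
```

Because `FEATURE_BASE_DIR` and `MODEL_BASE_DIR` end with a slash, paths built from them contain a double slash (for example `checkpoints//omnivore_video_fp16` in the training logs). POSIX path resolution treats repeated slashes as a single separator, so this is harmless.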

Source `vars.sh` to set the environment variables, print the `FEATURE_BASE_DIR` path, create that directory if it does not exist, and create a symbolic link at `FEATURE_DIR` that points to `/content/ego4d_data/v1/omnivore_video_swinl_fp16`.

In [11]:
%%bash

source vars.sh

echo $FEATURE_BASE_DIR
mkdir -p $FEATURE_BASE_DIR
ln -s /content/ego4d_data/v1/omnivore_video_swinl_fp16 $FEATURE_DIR

data/features/nlq_official_v1_omnivore_video_fp16/
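
Re-running `ln -s` when the link already exists does not simply refresh it, so the cell above is not safe to run twice. A re-runnable variant in Python might look like the sketch below; `ensure_symlink` is a hypothetical helper, and the paths in the usage comment are the ones from this notebook.

```python
import os


def ensure_symlink(target: str, link_path: str) -> None:
    """Create a symlink at link_path pointing to target, replacing a stale link."""
    link_path = link_path.rstrip("/")
    parent = os.path.dirname(link_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    if os.path.islink(link_path):
        os.unlink(link_path)  # drop the link left by an earlier run
    elif os.path.exists(link_path):
        raise FileExistsError(f"{link_path} exists and is not a symlink")
    os.symlink(target, link_path)


# Usage with the paths from this notebook:
# ensure_symlink(
#     "/content/ego4d_data/v1/omnivore_video_swinl_fp16",
#     "data/features/nlq_official_v1_omnivore_video_fp16/video_features",
# )
```

Refusing to overwrite a real file or directory keeps the helper from silently deleting data if `FEATURE_DIR` was populated some other way.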


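Source `vars.sh` and install the Python packages needed for training, including `nltk`, `submitit`, `torch`, and others. The `%%capture` magic suppresses a cell's output only when it is the first line of the cell. In the recorded run it came after `%%bash`, so bash tried to execute it as a command: the pip output is still shown and bash reports `fg: no job control`. The cell below places `%%capture` first.

In [12]:
%%capture
%%bash

source vars.sh
pip install nltk submitit torch torchaudio torchvision tqdm transformers tensorboard Pillow terminaltables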

Collecting submitit
  Downloading submitit-1.5.1-py3-none-any.whl.metadata (8.0 kB)
Collecting terminaltables
  Downloading terminaltables-3.1.10-py2.py3-none-any.whl.metadata (3.5 kB)
Downloading submitit-1.5.1-py3-none-any.whl (74 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.7/74.7 kB 4.1 MB/s eta 0:00:00
Downloading terminaltables-3.1.10-py2.py3-none-any.whl (15 kB)
Installing collected packages: terminaltables, submitit
Successfully installed submitit-1.5.1 terminaltables-3.1.10


bash: line 1: fg: no job control


In [1]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

Source `vars.sh`, then run `utils/prepare_ego4d_dataset.py` to prepare the Ego4D NLQ dataset. The script takes the paths of the train, validation, and test annotation files and of the video features, and writes the processed dataset and the clip features to the output locations defined in `vars.sh`.

In [15]:
%%bash

source vars.sh

python utils/prepare_ego4d_dataset.py \
    --input_train_split /content/ego4d_data/v1/annotations/nlq_train.json \
    --input_val_split /content/ego4d_data/v1/annotations/nlq_val.json \
    --input_test_split /content/ego4d_data/v1/annotations/nlq_test_unannotated.json \
    --video_feature_read_path $FEATURE_DIR \
    --clip_feature_save_path $FEATURE_BASE_DIR/official \
    --output_save_path $BASE_DIR

Reading [train]: /content/ego4d_data/v1/annotations/nlq_train.json
# train: 11291
Writing [train]: data/dataset/nlq_official_v1_omnivore_video_fp16/train.json
Reading [val]: /content/ego4d_data/v1/annotations/nlq_val.json
# val: 3874
Writing [val]: data/dataset/nlq_official_v1_omnivore_video_fp16/val.json
Reading [test]: /content/ego4d_data/v1/annotations/nlq_test_unannotated.json
# test: 4004
Writing [test]: data/dataset/nlq_official_v1_omnivore_video_fp16/test.json


  feature = torch.load(feature_path)
Extracting features:   0%|          | 2/1659 [00:00<01:24, 19.60it/s]
Extracting features:   0%|          | 6/1659 [00:00<01:02, 26.58it/s]
Extracting features:   1%|          | 18/1659 [00:00<00:25, 63.95it/s]
Extracting features:   2%|▏         | 26/1659 [00:00<00:23, 69.44it/s]
Extracting features:   2%|▏         | 34/1659 [00:00<00:41, 39.11it/s]
Extracting features:   2%|▏         | 40/1659 [00:00<00:39, 40.89it/s]
Extracting features:   3%|▎         | 46/1659 [00:01<00:42, 38.22it/s]
Extracting features:   3%|▎         | 51/1659 [00:01<00:48, 33.30it/s]
Extracting features:   3%|▎         | 57/1659 [00:01<00:41, 38.46it/s]
Extracting features:   4%|▍         | 63/1659 [00:01<00:37, 42.50it/s]
Extracting features:   4%|▍         | 68/1659 [00:01<00:56, 28.29it/s]
Extracting features:   4%|▍         | 72/1659 [00:01<00:53, 29.64it/s]
Extracting features:   5%|▍         | 78/1659 [00:02<00:47, 33.43it/s]
Extracting features:   5%|▍         | 82/

Extract the GloVe word embeddings zip file from Google Drive into the features directory. The cell creates the target directory if it does not exist and lists its contents after extraction. Make sure the GloVe file is located at the path used below; it can be downloaded from the [GloVe project page](https://nlp.stanford.edu/projects/glove/).

In [16]:
from google.colab import drive

drive.mount('/content/drive')

Mounted at /content/drive


In [17]:
import zipfile
import os

# Path to the GloVe archive on Google Drive
glove_path = '/content/drive/MyDrive/glove.840B.300d.zip'

# Directory where you want to extract the contents
features_dir = '/content/episodic-memory/NLQ/VSLNet/data/features'


os.makedirs(features_dir, exist_ok=True)


with zipfile.ZipFile(glove_path, 'r') as zip_ref:
    zip_ref.extractall(features_dir)

os.listdir(features_dir)

['nlq_official_v1_omnivore_video_fp16', 'glove.840B.300d.txt']
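
The extracted `glove.840B.300d.txt` is a plain-text file with one token per line followed by its vector components, separated by spaces. The sketch below shows one way to parse such lines; it is not the loader used by VSLNet, and `parse_glove_lines` is a hypothetical helper. It takes the last `dim` fields as the vector because a few tokens in this file are reported to contain spaces themselves.

```python
from typing import Dict, Iterable, List


def parse_glove_lines(lines: Iterable[str], dim: int = 300) -> Dict[str, List[float]]:
    """Parse GloVe text lines into {token: vector}, taking the last `dim` fields as the vector."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) <= dim:
            continue  # skip empty or malformed lines
        token = " ".join(parts[:-dim])
        vectors[token] = [float(x) for x in parts[-dim:]]
    return vectors


# Tiny in-memory example with 3-dimensional vectors.
sample = ["the 0.1 0.2 0.3\n", ". . . 1 2 3\n", "bad 1 2\n"]
print(parse_glove_lines(sample, dim=3))
```

For the real file, passing an open file handle (for example wrapped in `itertools.islice` to read only the first few thousand lines) avoids loading the full vocabulary into memory during a quick check.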

Train the VSLNet model with the GloVe encoder using `BATCH_SIZE=32`, `NUM_EPOCH=10`, and `INIT_LR=0.0015`.

In [None]:
%%bash

source vars.sh

# machine parameters
export DATALOADER_WORKERS=1
export NUM_WORKERS=2
export VAL_JSON_PATH="/content/ego4d_data/v1/annotations/nlq_val.json"

# hyper parameters
export BATCH_SIZE=32
export DIM=128
export NUM_EPOCH=10
export MAX_POS_LEN=128
export INIT_LR=0.0015

export TB_LOG_NAME="${NAME}_bs${BATCH_SIZE}_dim${DIM}_epoch${NUM_EPOCH}_ilr${INIT_LR}"

python main.py \
    --task $TASK_NAME \
    --predictor embedding \
    --dim $DIM \
    --mode train \
    --video_feature_dim 1536 \
    --max_pos_len $MAX_POS_LEN \
    --init_lr $INIT_LR \
    --epochs $NUM_EPOCH \
    --batch_size $BATCH_SIZE \
    --fv official \
    --num_workers $NUM_WORKERS \
    --data_loader_workers $DATALOADER_WORKERS \
    --model_dir $MODEL_BASE_DIR/$NAME \
    --eval_gt_json $VAL_JSON_PATH \
    --log_to_tensorboard $TB_LOG_NAME \
    --tb_log_freq 5 \
    --remove_empty_queries_from train

Running with Namespace(save_dir='datasets', task='nlq_official_v1_omnivore_video_fp16', eval_gt_json='/content/ego4d_data/v1/annotations/nlq_val.json', fv='official', max_pos_len=128, num_workers=2, data_loader_workers=1, word_size=None, char_size=None, word_dim=300, video_feature_dim=1536, char_dim=50, dim=128, highlight_lambda=5.0, num_heads=8, drop_rate=0.2, predictor='embedding', gpu_idx='0', seed=12345, mode='train', epochs=10, batch_size=32, num_train_steps=None, init_lr=0.0015, clip_norm=1.0, warmup_proportion=0.0, extend=0.1, period=100, text_agnostic=False, video_agnostic=False, model_dir='/content/nlq_official_v1/checkpoints//omnivore_video_fp16', model_name='vslnet', suffix=None, log_to_tensorboard='omnivore_video_fp16_bs32_dim128_epoch10_ilr0.0015', tb_log_dir='./runs', tb_log_freq=5, slurm=False, slurm_wait=False, slurm_partition='pixar', slurm_constraint='volta', slurm_gpus=1, slurm_cpus=10, slurm_timeout_min=720, slurm_log_folder='slurm_log', remove_empty_queries_from=['

2024-09-11 16:11:06.924766: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-11 16:11:07.235420: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-11 16:11:07.317825: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-11 16:11:07.816995: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
process episodic nlq train:   0%|          | 0/998 [

Train the VSLNet model with the GloVe encoder using `BATCH_SIZE=32`, `NUM_EPOCH=10`, and `INIT_LR=0.0025`.

In [None]:
%%bash

source vars.sh

# machine parameters
export DATALOADER_WORKERS=1
export NUM_WORKERS=2
export VAL_JSON_PATH="/content/ego4d_data/v1/annotations/nlq_val.json"

# hyper parameters
export BATCH_SIZE=32
export DIM=128
export NUM_EPOCH=10
export MAX_POS_LEN=128
export INIT_LR=0.0025

export TB_LOG_NAME="${NAME}_bs${BATCH_SIZE}_dim${DIM}_epoch${NUM_EPOCH}_ilr${INIT_LR}"

python main.py \
    --task $TASK_NAME \
    --predictor embedding \
    --dim $DIM \
    --mode train \
    --video_feature_dim 1536 \
    --max_pos_len $MAX_POS_LEN \
    --init_lr $INIT_LR \
    --epochs $NUM_EPOCH \
    --batch_size $BATCH_SIZE \
    --fv official \
    --num_workers $NUM_WORKERS \
    --data_loader_workers $DATALOADER_WORKERS \
    --model_dir $MODEL_BASE_DIR/$NAME \
    --eval_gt_json $VAL_JSON_PATH \
    --log_to_tensorboard $TB_LOG_NAME \
    --tb_log_freq 5 \
    --remove_empty_queries_from train

Running with Namespace(save_dir='datasets', task='nlq_official_v1_omnivore_video_fp16', eval_gt_json='/content/ego4d_data/v1/annotations/nlq_val.json', fv='official', max_pos_len=128, num_workers=2, data_loader_workers=1, word_size=None, char_size=None, word_dim=300, video_feature_dim=1536, char_dim=50, dim=128, highlight_lambda=5.0, num_heads=8, drop_rate=0.2, predictor='embedding', gpu_idx='0', seed=12345, mode='train', epochs=10, batch_size=32, num_train_steps=None, init_lr=0.0025, clip_norm=1.0, warmup_proportion=0.0, extend=0.1, period=100, text_agnostic=False, video_agnostic=False, model_dir='/content/nlq_official_v1/checkpoints//omnivore_video_fp16', model_name='vslnet', suffix=None, log_to_tensorboard='omnivore_video_fp16_bs32_dim128_epoch10_ilr0.0025', tb_log_dir='./runs', tb_log_freq=5, slurm=False, slurm_wait=False, slurm_partition='pixar', slurm_constraint='volta', slurm_gpus=1, slurm_cpus=10, slurm_timeout_min=720, slurm_log_folder='slurm_log', remove_empty_queries_from=['

2024-09-10 15:21:06.657490: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-10 15:21:06.679218: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-10 15:21:06.685649: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-10 15:21:06.701232: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
  feature = torch.load(filename).to(torch.float32).cp

Train the VSLNet model with the GloVe encoder using `BATCH_SIZE=64`, `NUM_EPOCH=20`, and `INIT_LR=0.0025`.

In [None]:
%%bash

source vars.sh

# machine parameters
export DATALOADER_WORKERS=1
export NUM_WORKERS=2
export VAL_JSON_PATH="/content/ego4d_data/v1/annotations/nlq_val.json"

# hyper parameters
export BATCH_SIZE=64
export DIM=128
export NUM_EPOCH=20
export MAX_POS_LEN=128
export INIT_LR=0.0025

export TB_LOG_NAME="${NAME}_bs${BATCH_SIZE}_dim${DIM}_epoch${NUM_EPOCH}_ilr${INIT_LR}"

python main.py \
    --task $TASK_NAME \
    --predictor embedding \
    --dim $DIM \
    --mode train \
    --video_feature_dim 1536 \
    --max_pos_len $MAX_POS_LEN \
    --init_lr $INIT_LR \
    --epochs $NUM_EPOCH \
    --batch_size $BATCH_SIZE \
    --fv official \
    --num_workers $NUM_WORKERS \
    --data_loader_workers $DATALOADER_WORKERS \
    --model_dir $MODEL_BASE_DIR/$NAME \
    --eval_gt_json $VAL_JSON_PATH \
    --log_to_tensorboard $TB_LOG_NAME \
    --tb_log_freq 5 \
    --remove_empty_queries_from train

Running with Namespace(save_dir='datasets', task='nlq_official_v1_omnivore_video_fp16', eval_gt_json='/content/ego4d_data/v1/annotations/nlq_val.json', fv='official', max_pos_len=128, num_workers=2, data_loader_workers=1, word_size=None, char_size=None, word_dim=300, video_feature_dim=1536, char_dim=50, dim=128, highlight_lambda=5.0, num_heads=8, drop_rate=0.2, predictor='embedding', gpu_idx='0', seed=12345, mode='train', epochs=20, batch_size=64, num_train_steps=None, init_lr=0.0025, clip_norm=1.0, warmup_proportion=0.0, extend=0.1, period=100, text_agnostic=False, video_agnostic=False, model_dir='/content/nlq_official_v1/checkpoints//omnivore_video_fp16', model_name='vslnet', suffix=None, log_to_tensorboard='omnivore_video_fp16_bs64_dim128_epoch20_ilr0.0025', tb_log_dir='./runs', tb_log_freq=5, slurm=False, slurm_wait=False, slurm_partition='pixar', slurm_constraint='volta', slurm_gpus=1, slurm_cpus=10, slurm_timeout_min=720, slurm_log_folder='slurm_log', remove_empty_queries_from=['

2024-09-10 15:26:46.942436: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-10 15:26:46.963782: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-10 15:26:46.970667: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-10 15:26:46.986569: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
  feature = torch.load(filename).to(torch.float32).cp

Train the VSLNet model with the GloVe encoder using `BATCH_SIZE=64`, `NUM_EPOCH=20`, and `INIT_LR=0.0015`.

In [None]:
%%bash

source vars.sh

# machine parameters
export DATALOADER_WORKERS=1
export NUM_WORKERS=2
export VAL_JSON_PATH="/content/ego4d_data/v1/annotations/nlq_val.json"

# hyper parameters
export BATCH_SIZE=64
export DIM=128
export NUM_EPOCH=20
export MAX_POS_LEN=128
export INIT_LR=0.0015

export TB_LOG_NAME="${NAME}_bs${BATCH_SIZE}_dim${DIM}_epoch${NUM_EPOCH}_ilr${INIT_LR}"

python main.py \
    --task $TASK_NAME \
    --predictor embedding \
    --dim $DIM \
    --mode train \
    --video_feature_dim 1536 \
    --max_pos_len $MAX_POS_LEN \
    --init_lr $INIT_LR \
    --epochs $NUM_EPOCH \
    --batch_size $BATCH_SIZE \
    --fv official \
    --num_workers $NUM_WORKERS \
    --data_loader_workers $DATALOADER_WORKERS \
    --model_dir $MODEL_BASE_DIR/$NAME \
    --eval_gt_json $VAL_JSON_PATH \
    --log_to_tensorboard $TB_LOG_NAME \
    --tb_log_freq 5 \
    --remove_empty_queries_from train

Running with Namespace(save_dir='datasets', task='nlq_official_v1_omnivore_video_fp16', eval_gt_json='/content/ego4d_data/v1/annotations/nlq_val.json', fv='official', max_pos_len=128, num_workers=2, data_loader_workers=1, word_size=None, char_size=None, word_dim=300, video_feature_dim=1536, char_dim=50, dim=128, highlight_lambda=5.0, num_heads=8, drop_rate=0.2, predictor='embedding', gpu_idx='0', seed=12345, mode='train', epochs=20, batch_size=64, num_train_steps=None, init_lr=0.0015, clip_norm=1.0, warmup_proportion=0.0, extend=0.1, period=100, text_agnostic=False, video_agnostic=False, model_dir='/content/nlq_official_v1/checkpoints//omnivore_video_fp16', model_name='vslnet', suffix=None, log_to_tensorboard='omnivore_video_fp16_bs64_dim128_epoch20_ilr0.0015', tb_log_dir='./runs', tb_log_freq=5, slurm=False, slurm_wait=False, slurm_partition='pixar', slurm_constraint='volta', slurm_gpus=1, slurm_cpus=10, slurm_timeout_min=720, slurm_log_folder='slurm_log', remove_empty_queries_from=['

2024-09-11 16:26:57.251370: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-11 16:26:57.275835: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-11 16:26:57.282682: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-11 16:26:57.299194: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
  feature = torch.load(filename).to(torch.float32).cp

Train the VSLNet model with the GloVe encoder using `BATCH_SIZE=64`, `NUM_EPOCH=40`, and `INIT_LR=0.0025`.

In [None]:
%%bash

source vars.sh

# machine parameters
export DATALOADER_WORKERS=1
export NUM_WORKERS=2
export VAL_JSON_PATH="/content/ego4d_data/v1/annotations/nlq_val.json"

# hyper parameters
export BATCH_SIZE=64
export DIM=128
export NUM_EPOCH=40
export MAX_POS_LEN=128
export INIT_LR=0.0025

export TB_LOG_NAME="${NAME}_bs${BATCH_SIZE}_dim${DIM}_epoch${NUM_EPOCH}_ilr${INIT_LR}"

python main.py \
    --task $TASK_NAME \
    --predictor embedding \
    --dim $DIM \
    --mode train \
    --video_feature_dim 1536 \
    --max_pos_len $MAX_POS_LEN \
    --init_lr $INIT_LR \
    --epochs $NUM_EPOCH \
    --batch_size $BATCH_SIZE \
    --fv official \
    --num_workers $NUM_WORKERS \
    --data_loader_workers $DATALOADER_WORKERS \
    --model_dir $MODEL_BASE_DIR/$NAME \
    --eval_gt_json $VAL_JSON_PATH \
    --log_to_tensorboard $TB_LOG_NAME \
    --tb_log_freq 5 \
    --remove_empty_queries_from train

Running with Namespace(save_dir='datasets', task='nlq_official_v1_omnivore_video_fp16', eval_gt_json='/content/ego4d_data/v1/annotations/nlq_val.json', fv='official', max_pos_len=128, num_workers=2, data_loader_workers=1, word_size=None, char_size=None, word_dim=300, video_feature_dim=1536, char_dim=50, dim=128, highlight_lambda=5.0, num_heads=8, drop_rate=0.2, predictor='embedding', gpu_idx='0', seed=12345, mode='train', epochs=40, batch_size=64, num_train_steps=None, init_lr=0.0025, clip_norm=1.0, warmup_proportion=0.0, extend=0.1, period=100, text_agnostic=False, video_agnostic=False, model_dir='/content/nlq_official_v1/checkpoints//omnivore_video_fp16', model_name='vslnet', suffix=None, log_to_tensorboard='omnivore_video_fp16_bs64_dim128_epoch40_ilr0.0025', tb_log_dir='./runs', tb_log_freq=5, slurm=False, slurm_wait=False, slurm_partition='pixar', slurm_constraint='volta', slurm_gpus=1, slurm_cpus=10, slurm_timeout_min=720, slurm_log_folder='slurm_log', remove_empty_queries_from=['

2024-09-10 15:34:29.826906: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-10 15:34:29.847948: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-10 15:34:29.854402: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-10 15:34:29.869498: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
  feature = torch.load(filename).to(torch.float32).cp
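
The five training cells differ only in batch size, epoch count, and initial learning rate, so the sweep can be described as data and the `main.py` arguments generated from it. The sketch below mirrors the flags and values used in the cells above; `RunConfig`, `tb_log_name`, and `build_train_args` are hypothetical names, and nothing is executed.

```python
from dataclasses import dataclass
from typing import List

NAME = "omnivore_video_fp16"
TASK_NAME = f"nlq_official_v1_{NAME}"
MODEL_BASE_DIR = "/content/nlq_official_v1/checkpoints/"
VAL_JSON_PATH = "/content/ego4d_data/v1/annotations/nlq_val.json"


@dataclass(frozen=True)
class RunConfig:
    batch_size: int
    num_epoch: int
    init_lr: float
    dim: int = 128
    max_pos_len: int = 128


# The five configurations trained in the cells above.
RUNS = [
    RunConfig(32, 10, 0.0015),
    RunConfig(32, 10, 0.0025),
    RunConfig(64, 20, 0.0025),
    RunConfig(64, 20, 0.0015),
    RunConfig(64, 40, 0.0025),
]


def tb_log_name(cfg: RunConfig) -> str:
    """Reproduce the TB_LOG_NAME pattern from the bash cells."""
    return f"{NAME}_bs{cfg.batch_size}_dim{cfg.dim}_epoch{cfg.num_epoch}_ilr{cfg.init_lr}"


def build_train_args(cfg: RunConfig) -> List[str]:
    """Build the main.py argument list for one configuration."""
    return [
        "python", "main.py",
        "--task", TASK_NAME,
        "--predictor", "embedding",
        "--dim", str(cfg.dim),
        "--mode", "train",
        "--video_feature_dim", "1536",
        "--max_pos_len", str(cfg.max_pos_len),
        "--init_lr", str(cfg.init_lr),
        "--epochs", str(cfg.num_epoch),
        "--batch_size", str(cfg.batch_size),
        "--fv", "official",
        "--num_workers", "2",
        "--data_loader_workers", "1",
        "--model_dir", f"{MODEL_BASE_DIR}/{NAME}",
        "--eval_gt_json", VAL_JSON_PATH,
        "--log_to_tensorboard", tb_log_name(cfg),
        "--tb_log_freq", "5",
        "--remove_empty_queries_from", "train",
    ]


for cfg in RUNS:
    print(tb_log_name(cfg))
# To launch a run: import subprocess; subprocess.run(build_train_args(cfg), check=True)
```

Keeping the sweep in one list makes it easy to see which combinations were tried and to add new ones without copying a whole cell.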