# Visual Question Answering Training

This notebook runs the VQA training pipeline on Google Colab with GPU acceleration.

## 1. Setup Environment

First, let's install the required dependencies with compatible versions.

In [None]:
# Remove all potentially conflicting packages
!pip uninstall -y torch torchvision torchaudio numpy transformers sentence-transformers timm

In [None]:
# Install base dependencies first
!pip install numpy==1.24.3
!pip install packaging==23.1

In [None]:
# Install PyTorch ecosystem
!pip install torch==2.0.0 torchvision==0.15.1 --index-url https://download.pytorch.org/whl/cu118

In [None]:
# Install transformers and related packages
!pip install transformers==4.28.0
!pip install sentence-transformers==2.2.2

In [None]:
# Install remaining dependencies
!pip install timm==0.6.12 \
            stable-baselines3==2.0.0 \
            opencv-python==4.7.0.72 \
            pillow==9.4.0 \
            wandb==0.15.0 \
            tqdm==4.65.0 \
            matplotlib==3.7.0

In [None]:
# Kill the kernel so Colab restarts it with the newly installed packages.
# Colab may report that the session crashed; that is expected. Continue from section 2.
import os
print("Restarting the runtime...", flush=True)
os.kill(os.getpid(), 9)

## 2. Verify Environment

After restarting the runtime, run this cell to verify all packages are properly installed.

In [None]:
import torch
import torchvision
import numpy as np
import transformers
from sentence_transformers import SentenceTransformer

print("Checking package versions:")
print(f"PyTorch: {torch.__version__}")
print(f"Torchvision: {torchvision.__version__}")
print(f"NumPy: {np.__version__}")
print(f"Transformers: {transformers.__version__}")

print("\nChecking CUDA:")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU device: {torch.cuda.get_device_name(0)}")

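print("\nTesting sentence-transformers:")
# Loading the model downloads it on first use, so this also checks network access
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
print("sentence-transformers test successful!")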
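
The printed versions still have to be compared by eye against the pins from section 1. As a minimal sketch, the comparison can be automated with the standard library's `importlib.metadata`. The `EXPECTED` table and the `find_mismatches` helper are illustrative names, not part of the repository, and the table lists only a subset of the pinned packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install cells in section 1; extend or adjust as needed.
EXPECTED = {
    "numpy": "1.24.3",
    "torch": "2.0.0",
    "transformers": "4.28.0",
    "sentence-transformers": "2.2.2",
    "timm": "0.6.12",
}


def find_mismatches(expected):
    """Return {package: (expected, installed)} for every pin that is not met."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Wheels from the PyTorch index carry a local tag such as "+cu118";
        # strip it before comparing against the pin.
        base = installed.split("+")[0] if installed else None
        if base != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in find_mismatches(EXPECTED).items():
    print(f"{name}: expected {wanted}, found {installed}")
```

If the loop prints nothing, every listed pin is satisfied. An `installed` value of `None` means the package is missing altogether.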

## 3. Clone Repository

In [None]:
# Clone the repository (replace YOUR_USERNAME with the GitHub account that hosts it)
!git clone https://github.com/YOUR_USERNAME/ML-4.git
%cd ML-4

## 4. Setup Data

Mount Google Drive and set up data directories.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

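# Link the dataset from Drive into the repository (replace path_to_your_data with your own folder).
# -f and -n let the cell be re-run without failing on an existing link.
!ln -sfn /content/gdrive/MyDrive/path_to_your_data/visual7w-images ITM_Classifier-baselines/visual7w-images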
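
A symbolic link is created even when its target does not exist, so a wrong Drive path only surfaces once training starts. The sketch below checks that the linked directory resolves and counts the images inside it. The `count_images` helper and the list of image suffixes are assumptions for illustration; adjust them to the actual dataset layout.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Count image files under root, raising if root does not resolve to a directory."""
    root = Path(root)
    if not root.is_dir():  # also False for a dangling symlink
        raise FileNotFoundError(f"{root} is missing or is a broken symlink")
    return sum(
        1
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Path used by the symlink cell above, relative to the repository root.
data_dir = Path("ITM_Classifier-baselines/visual7w-images")
try:
    print(f"Found {count_images(data_dir)} images in {data_dir}")
except FileNotFoundError as err:
    print(f"Data check failed: {err}")
```

A count of zero usually means the link points at the wrong folder level in Drive.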

## 5. Configure WandB

In [None]:
import wandb
!wandb login
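
`wandb login` prompts for an API key interactively. As an alternative sketch, the key can be placed in the `WANDB_API_KEY` environment variable, which the WandB client reads and which processes started with `!python` inherit from the kernel. The `ensure_wandb_key` helper is a hypothetical name, not part of WandB or the repository.

```python
import os
from getpass import getpass


def ensure_wandb_key(env=os.environ, prompt=getpass):
    """Prompt for the key only when WANDB_API_KEY is not already set.

    Returns True if a key was already present, False if the user was prompted.
    """
    if env.get("WANDB_API_KEY"):
        return True
    env["WANDB_API_KEY"] = prompt("WandB API key: ")
    return False


# Usage in a notebook cell (uncomment to run; the input is not echoed):
# ensure_wandb_key()
```

Taking `env` and `prompt` as parameters keeps the helper easy to test without touching the real environment.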

## 6. Run Training

In [None]:
!python ITM_Classifier-baselines/train_vqa_colab.py

## 7. Monitor Results

Training progress can be monitored in:
1. The output of the training cell above
2. The WandB dashboard

Saved models are written to your Google Drive under 'ML4_models/'.
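
To confirm that models are actually being written, the Drive folder can be listed from the notebook. This is a sketch under assumptions: it takes `ML4_models/` to sit at the top level of `MyDrive` under the mount point used in section 4, and it lists every file because the checkpoint file extension is not stated here. `newest_files` is an illustrative helper.

```python
from pathlib import Path


def newest_files(directory, limit=5):
    """Return up to `limit` files in `directory`, most recently modified first."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


# Assumed location; change it if the training script saves elsewhere.
models_dir = Path("/content/gdrive/MyDrive/ML4_models")
if models_dir.is_dir():
    for path in newest_files(models_dir):
        print(f"{path.name}  {path.stat().st_size / 1e6:.1f} MB")
else:
    print(f"{models_dir} not found; check that Drive is mounted and a model has been saved")
```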

Note: If you encounter any errors after installing dependencies:
1. Restart the runtime
2. Run all cells from the beginning in order
3. Make sure the version verification cell passes before proceeding