# Visual Question Answering Training

This notebook runs the VQA training pipeline on Google Colab with GPU acceleration.

## 1. Setup Environment

First, install the required dependencies at pinned, mutually compatible versions. Installation order matters: NumPy goes first, then PyTorch, then the remaining packages, so that a later install does not pull in a conflicting version.

In [None]:
# Remove any existing installations to avoid conflicts
!pip uninstall -y torch torchvision numpy transformers

In [None]:
# First install numpy 1.24.3 which is compatible with both torch and transformers
!pip install numpy==1.24.3

In [None]:
# Install PyTorch and torchvision with compatible CUDA version
!pip install torch==2.0.0 torchvision==0.15.0 --index-url https://download.pytorch.org/whl/cu118

In [None]:
# Install other dependencies with specific versions
!pip install transformers==4.28.0 \
            timm==0.6.12 \
            stable-baselines3==2.0.0 \
            opencv-python==4.7.0.72 \
            pillow==9.4.0 \
            sentence-transformers==2.2.2 \
            wandb==0.15.0 \
            tqdm==4.65.0 \
            matplotlib==3.7.0

In [None]:
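# Restart the runtime so the newly installed packages are loaded.
# Killing the kernel process makes Colab restart it automatically; wait for the
# runtime to reconnect, then continue from the next cell.
import os
os.kill(os.getpid(), 9)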

## 2. Clone Repository and Verify Setup

In [None]:
# Clone repository
!git clone https://github.com/YOUR_USERNAME/ML-4.git
%cd ML-4

In [None]:
# Verify package versions and CUDA setup
import torch
import torchvision
import numpy as np
import transformers

print(f"PyTorch version: {torch.__version__}")
print(f"Torchvision version: {torchvision.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"Transformers version: {transformers.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
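
Printing the versions still leaves the comparison against the pins to the reader. As a minimal sketch, the check can be automated with the standard library's `importlib.metadata`. The `EXPECTED` table and the `find_mismatches` helper are illustrative names, not part of the repository, and the pins are copied from the install cells above.

```python
from importlib import metadata

# Pins copied from the install cells above (assumption: extend with the other packages as needed).
EXPECTED = {
    "numpy": "1.24.3",
    "torch": "2.0.0",
    "transformers": "4.28.0",
}

def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Ignore local build tags such as "+cu118" when comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches

# An empty dict means every pinned package is installed at the expected version.
print(find_mismatches(EXPECTED))
```

If the result is not empty, re-running the install cells and restarting the runtime is the first thing to try.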

## 3. Setup Data

Mount Google Drive and set up data directories.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

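# Link the dataset on Drive into the repository (replace path_to_your_data with your own folder).
# -f and -n let the cell be re-run without failing on, or nesting inside, an existing link.
!ln -sfn /content/gdrive/MyDrive/path_to_your_data/visual7w-images ITM_Classifier-baselines/visual7w-images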
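
A symbolic link is created even when its target does not exist, so it is worth confirming that the link resolves before starting a long training run. The sketch below counts image files behind the link. `count_images` and the suffix set are hypothetical helpers, and the assumption that the images are JPEG or PNG files should be adjusted to your copy of the dataset.

```python
from pathlib import Path

# Assumption: the dataset images use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def count_images(root):
    """Count image files under root, resolving root first if it is a symlink."""
    resolved = Path(root).resolve()
    if not resolved.is_dir():
        # Covers both a missing path and a dangling symlink.
        raise FileNotFoundError(f"{root} is missing or does not point to a directory")
    return sum(
        1
        for p in resolved.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )

# Usage, from inside the cloned repository:
# print(count_images("ITM_Classifier-baselines/visual7w-images"))
```

A count of zero or a `FileNotFoundError` usually means the Drive path in the `ln` command needs correcting.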

## 4. Configure WandB

In [None]:
import wandb

# Prompts for your API key inside the notebook
wandb.login()
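
WandB also reads a few environment variables, which is one way to choose a project or switch to offline logging without editing the training script. The sketch below sets `WANDB_PROJECT` and `WANDB_MODE`. The helper name and the project name `vqa-training` are hypothetical, and the settings only take effect if `train_vqa_colab.py` does not override them in its own `wandb.init` call, which is an assumption about the script.

```python
import os

VALID_MODES = {"online", "offline", "disabled"}

def configure_wandb_env(project, mode="online", env=os.environ):
    """Set the WandB project and mode for this kernel and the processes it launches."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    env["WANDB_PROJECT"] = project
    env["WANDB_MODE"] = mode
    return env

# Hypothetical project name; commands started with "!" inherit these variables.
configure_wandb_env("vqa-training")
```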

## 5. Run Training

In [None]:
!python ITM_Classifier-baselines/train_vqa_colab.py

## 6. Monitor Results

Training progress can be monitored in:
1. The output of the training cell above
2. The WandB dashboard

Saved models are written to your Google Drive under 'ML4_models/'.
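
To find the most recent saved model after a run, the Drive folder can be scanned by modification time. This is a minimal sketch: `latest_checkpoint` is a hypothetical helper, the `.pt` and `.pth` extensions are an assumption about how the training script names its files, and the path assumes the mount point used earlier in this notebook.

```python
from pathlib import Path

def latest_checkpoint(model_dir, patterns=("*.pt", "*.pth")):
    """Return the most recently modified checkpoint in model_dir, or None if there is none."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return None
    candidates = [p for pattern in patterns for p in model_dir.glob(pattern)]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)

# Assumed location: the Drive mount point from section 3 plus 'ML4_models/'.
print(latest_checkpoint("/content/gdrive/MyDrive/ML4_models"))
```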

Note: If you encounter any package version conflicts, try restarting the runtime after installations.