# CLIP Model - Getting Started Guide

This guide explains how to run the complete pipeline in Google Colab.

## 📋 Pipeline Overview

The complete pipeline consists of these steps:

1. **Setup** - Upload project and install dependencies
2. **Dataset Exploration** (Optional) - Understand the data
3. **Training** - Train the CLIP model
4. **Evaluation** - Evaluate model performance
5. **Export Embeddings** - Precompute embeddings for faster retrieval
6. **Retrieval** - Use the model for image/text search

## 🚀 Quick Start

**You only need to run notebooks 05, 06, 07, and 08, in that order!**

- Notebooks 01-04 are interactive/exploratory
- Notebooks 05-08 are the script equivalents (use these for the full pipeline)


## Step 1: Setup Project in Colab

First, you need to get the project code into Colab. Choose one method:


In [None]:
# OPTION 1: Clone from GitHub (if you've pushed to GitHub)
# !git clone https://github.com/yourusername/CLIP_model.git /content/CLIP_model

# OPTION 2: Upload project folder to Google Drive, then mount
# from google.colab import drive
# drive.mount('/content/drive')
# Then copy from Drive to Colab:
# !cp -r /content/drive/MyDrive/CLIP_model /content/CLIP_model

# OPTION 3: Upload directly to Colab (for small projects)
# Use Colab's file upload feature, then unzip if needed

# Verify project structure
import os
from pathlib import Path

BASE_DIR = Path('/content/CLIP_model')
if BASE_DIR.exists():
    print("✅ Project found!")
    print(f"Project directory: {BASE_DIR}")
    
    # Check important directories
    required_dirs = ['src', 'configs', 'notebooks']
    for dir_name in required_dirs:
        dir_path = BASE_DIR / dir_name
        if dir_path.exists():
            print(f"✅ {dir_name}/ exists")
        else:
            print(f"❌ {dir_name}/ missing!")
else:
    print("❌ Project not found! Please upload/clone the project first.")


## Step 2: Prepare COCO Dataset

You need the COCO 2017 dataset. The project expects this structure:

```
CLIP_model/
├── images/
│   ├── annotations_trainval2017/
│   │   └── annotations/
│   │       ├── captions_train2017.json
│   │       └── captions_val2017.json
│   ├── train2017.1/train2017/  (or train2017/)
│   └── val2017/
```

**Options:**
1. Download COCO dataset and upload to Colab/Drive
2. Use a subset if you have limited storage
3. Mount from Google Drive if you have it there
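
For option 2, one way to build a subset is to trim a captions file to its first N images and keep only the captions that belong to them. The sketch below assumes the standard COCO captions layout (top-level `images` and `annotations` lists, with each annotation carrying an `image_id`). The helper name `subset_captions` is illustrative and not part of the project.

```python
import json
from pathlib import Path


def subset_captions(src_json, dst_json, max_images):
    """Write a smaller COCO captions file keeping only the first `max_images` images.

    Hypothetical helper: assumes the standard COCO captions schema.
    """
    with open(src_json, "r", encoding="utf-8") as f:
        data = json.load(f)

    kept_images = data["images"][:max_images]
    kept_ids = {img["id"] for img in kept_images}

    subset = dict(data)  # shallow copy keeps other top-level keys such as "info"
    subset["images"] = kept_images
    subset["annotations"] = [
        ann for ann in data["annotations"] if ann["image_id"] in kept_ids
    ]

    dst_json = Path(dst_json)
    dst_json.parent.mkdir(parents=True, exist_ok=True)
    with open(dst_json, "w", encoding="utf-8") as f:
        json.dump(subset, f)

    # Report how many images and captions survived the trim.
    return len(subset["images"]), len(subset["annotations"])
```

Only the images named in the trimmed file then need to be uploaded. Whether the project's data loader accepts a differently named annotations file is not stated in this guide, so check the config before pointing it at a subset.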


In [None]:
# Check if dataset is available
from pathlib import Path

BASE_DIR = Path('/content/CLIP_model')
required_files = [
    'images/annotations_trainval2017/annotations/captions_train2017.json',
    'images/annotations_trainval2017/annotations/captions_val2017.json',
]

print("Checking dataset files...")
all_found = True
for file_path in required_files:
    full_path = BASE_DIR / file_path
    if full_path.exists():
        print(f"✅ {file_path}")
    else:
        print(f"❌ {file_path} - NOT FOUND")
        all_found = False

if all_found:
    print("\n✅ Dataset files found! You're ready to train.")
else:
    print("\n❌ Some dataset files are missing. Please download COCO dataset.")
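
If the check reports missing files, the caption annotations can be fetched from inside Colab. This is a minimal sketch, not part of the project: `fetch_and_extract` is a hypothetical helper, and the URL in the commented-out call is assumed to be the public COCO 2017 annotations archive, so verify it before relying on it.

```python
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_extract(url, dest_dir):
    """Download a zip archive from `url` and unpack it into `dest_dir`."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    archive_path = dest_dir / "download.zip"
    urllib.request.urlretrieve(url, archive_path)  # also accepts file:// URLs

    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
    archive_path.unlink()  # free disk space once the archive is unpacked
    return dest_dir


# Assumed URL of the COCO 2017 annotations archive; uncomment to download.
# fetch_and_extract(
#     "http://images.cocodataset.org/annotations/annotations_trainval2017.zip",
#     "/content/CLIP_model/images/annotations_trainval2017",
# )
```

If the archive unpacks into an `annotations/` folder, as assumed here, the two caption files should land at the paths the check above looks for.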


## Step 3: Run the Pipeline

### **Essential Notebooks (Run in this order):**

1. **`05_train_script.ipynb`** ⭐ **REQUIRED**
   - Trains the CLIP model
   - Saves checkpoints to `checkpoints/`
   - This is the main training step

2. **`06_eval_script.ipynb`** ⭐ **REQUIRED**
   - Evaluates the trained model
   - Computes Recall@1, Recall@5, Recall@10
   - Saves results to `results/`

3. **`07_export_embeddings.ipynb`** ⭐ **REQUIRED for retrieval**
   - Precomputes embeddings for faster search
   - Saves to `embeddings/` directory

4. **`08_retrieve_script.ipynb`** ⭐ **REQUIRED for using the model**
   - Text-to-image search
   - Image-to-text search
   - Uses precomputed embeddings
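
The Recall@K metrics listed for notebook 06 can be illustrated as follows. This is a generic sketch of the metric rather than the project's evaluation code. It assumes a similarity matrix in which row i scores query i against every candidate, and that the correct candidate for query i sits at index i (one caption per image, which simplifies COCO's several captions per image).

```python
def recall_at_k(similarity, ks=(1, 5, 10)):
    """Fraction of queries whose correct candidate ranks in the top k.

    `similarity[i][j]` scores query i against candidate j; the correct
    candidate for query i is assumed to be candidate i.
    """
    hits = {k: 0 for k in ks}
    for i, row in enumerate(similarity):
        # Zero-based rank = number of candidates scoring strictly higher.
        rank = sum(1 for score in row if score > row[i])
        for k in ks:
            if rank < k:
                hits[k] += 1
    n = len(similarity)
    return {k: hits[k] / n for k in ks}


scores = [
    [0.9, 0.1, 0.3],  # query 0: correct candidate ranked first
    [0.8, 0.5, 0.2],  # query 1: correct candidate ranked second
    [0.7, 0.6, 0.1],  # query 2: correct candidate ranked third
]
print(recall_at_k(scores, ks=(1, 2, 3)))
```

In this toy matrix one of three queries is answered at rank 1, two by rank 2, and all three by rank 3, so recall rises with K.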

### **Optional/Exploratory Notebooks:**

- **`01_dataset_exploration.ipynb`** - Explore dataset (optional)
- **`02_training.ipynb`** - Interactive training (alternative to 05)
- **`03_evaluation.ipynb`** - Interactive evaluation (alternative to 06)
- **`04_inference_retrieval.ipynb`** - Interactive retrieval (alternative to 08)

**Note:** Notebooks 01-04 are more interactive/educational. Notebooks 05-08 are the script equivalents and are what you need for the full pipeline.


## 📝 Important Notes

### About Python Files (`src/` directory)

**You DON'T run Python files directly!** They are imported by the notebooks.

- `src/` contains the actual implementation code
- Notebooks import from `src/` (e.g., `from src.models.clip_model import CLIPModel`)
- The notebooks are the entry points - they call the code in `src/`
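
For an import such as `from src.models.clip_model import CLIPModel` to resolve, the project root has to be on Python's import path. The notebooks presumably handle this already. The sketch below shows one common way to do it, with `add_project_to_path` as an illustrative helper name.

```python
import sys
from pathlib import Path


def add_project_to_path(base_dir="/content/CLIP_model"):
    """Put the project root at the front of sys.path (once) so `src` is importable."""
    root = str(Path(base_dir))
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


add_project_to_path()
# After this, `from src.models.clip_model import CLIPModel` can resolve,
# provided the project was copied to /content/CLIP_model.
```

Calling the helper more than once is harmless, because the path is added only if it is not already present.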

### Minimal Pipeline (What you actually need):

```
1. Upload project to Colab (src/, configs/, notebooks/)
2. Upload/download COCO dataset
3. Run 05_train_script.ipynb → trains model
4. Run 06_eval_script.ipynb → evaluates model
5. Run 07_export_embeddings.ipynb → precomputes embeddings
6. Run 08_retrieve_script.ipynb → use the model for search
```

That's it! Everything else is optional.
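
Steps 5 and 6 rest on a simple idea: once image embeddings are stored, a text query needs only one matrix product against them. The sketch below uses NumPy and random vectors as stand-ins for real model outputs. The file name `image_embeddings.npy` and the function `top_k` are illustrative and do not reflect the project's actual export format.

```python
import tempfile
from pathlib import Path

import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def top_k(query_embedding, gallery_embeddings, k=5):
    """Return indices and scores of the k gallery rows most similar to the query."""
    scores = gallery_embeddings @ query_embedding
    order = np.argsort(-scores)[:k]
    return order, scores[order]


rng = np.random.default_rng(0)

# "Export": embed the gallery once and store it (random stand-ins for image embeddings).
gallery = l2_normalize(rng.normal(size=(100, 64)))
embeddings_file = Path(tempfile.mkdtemp()) / "image_embeddings.npy"
np.save(embeddings_file, gallery)

# "Retrieve": load the stored matrix and score a query against it.
stored = np.load(embeddings_file)
# Stand-in for a text embedding: a slightly perturbed copy of gallery row 42.
query = l2_normalize(gallery[42] + 0.01 * rng.normal(size=64))
indices, scores = top_k(query, stored, k=5)
print(indices[0])  # index of the best match
```

Because the query is a lightly perturbed copy of row 42, that row should come back as the best match. With a trained model, the query would instead be the embedding of the search text.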


## 🔧 Configuration

Before training, you can adjust the config file:

- `configs/clip_coco_tiny.yaml` - Very small (for testing, ~5 min)
- `configs/clip_coco_small.yaml` - Small (for development, ~1 hour)
- `configs/clip_coco_medium.yaml` - Medium (for production, ~6 hours)
- `configs/clip_coco_full.yaml` - Full dataset (~2-3 days)

Edit the `CONFIG_PATH` variable in notebook 05 to use a different config.
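
A small lookup keeps the choice of config in one place. In this sketch, the `CONFIGS` mapping and the `resolve_config` helper are illustrative. Only the file names and `CONFIG_PATH` come from this guide, and how notebook 05 consumes `CONFIG_PATH` may differ.

```python
from pathlib import Path

BASE_DIR = Path("/content/CLIP_model")

# File names as listed above; the short keys are illustrative.
CONFIGS = {
    "tiny": "configs/clip_coco_tiny.yaml",
    "small": "configs/clip_coco_small.yaml",
    "medium": "configs/clip_coco_medium.yaml",
    "full": "configs/clip_coco_full.yaml",
}


def resolve_config(size, base_dir=BASE_DIR):
    """Map a short size name to the config path, rejecting unknown names early."""
    if size not in CONFIGS:
        raise ValueError(f"Unknown config {size!r}; choose from {sorted(CONFIGS)}")
    return Path(base_dir) / CONFIGS[size]


CONFIG_PATH = resolve_config("tiny")  # start small, then scale up
print(CONFIG_PATH)
```

Starting with the tiny config is a cheap way to confirm the whole pipeline runs before committing to a longer training job.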
