# CNN Augmentation Advanced Experiments - Colab Setup

This notebook runs the systematic augmentation experiments on a Google Colab GPU runtime.

**Steps:**
1. ✅ Verify GPU is available
2. 📂 Mount Google Drive (for dataset access)
3. 📦 Clone the repository to Colab
4. 🔧 Install dependencies (if needed)
5. 🚀 Run the training script
6. 💾 Save results (push to GitHub, keep in Drive, or download as ZIP)

## Step 1: Verify GPU

In [1]:
import torch

print("🔍 Checking GPU availability...")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
    print(f"   Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
else:
    print("⚠️  WARNING: No GPU detected! Training will be very slow.")
    print("   Make sure you selected GPU runtime when connecting.")

🔍 Checking GPU availability...
CUDA available: True
✅ GPU: Tesla T4
   Memory: 14.56 GB


## Step 2: Mount Google Drive

Mounting prompts you to authorize Colab to access your Drive. Follow the sign-in prompt and grant access (older Colab versions ask you to paste an authorization code).

In [15]:
from google.colab import drive
drive.mount('/content/drive')

# Verify dataset exists
import os
data_path = '/content/drive/MyDrive/datasets/cifar-10-batches-py'

if os.path.exists(data_path):
    print(f"✅ Dataset found at: {data_path}")
    print(f"   Files: {os.listdir(data_path)}")
else:
    print(f"❌ Dataset NOT found at: {data_path}")
    print("   Please ensure CIFAR-10 dataset is uploaded to this path in your Drive.")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
✅ Dataset found at: /content/drive/MyDrive/datasets/cifar-10-batches-py
   Files: ['test_batch', 'data_batch_3', 'data_batch_2', 'data_batch_5', 'data_batch_1', 'data_batch_4', 'batches.meta', 'readme.html']
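
The existence check above only confirms that the folder is present. A minimal sketch of a stricter check, assuming the standard file names of the Python version of CIFAR-10 (the same names shown in the listing above); `missing_cifar_files` is a hypothetical helper, not part of the repo:

```python
import os

# File names in the Python version of CIFAR-10 (matches the listing above).
EXPECTED_FILES = [f"data_batch_{i}" for i in range(1, 6)] + ["test_batch", "batches.meta"]


def missing_cifar_files(data_dir):
    """Return the expected CIFAR-10 files that are absent from data_dir."""
    if not os.path.isdir(data_dir):
        # A missing folder means every expected file is missing.
        return list(EXPECTED_FILES)
    present = set(os.listdir(data_dir))
    return [name for name in EXPECTED_FILES if name not in present]
```

Calling `missing_cifar_files(data_path)` after mounting returns an empty list when the upload is complete, which makes a partial upload easier to spot than a bare `os.path.exists` check.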


## Step 3: Clone Repository to Colab

**Option A:** If your repo is on GitHub (replace with your repo URL)

In [16]:
# Clone or update the repo directly in Drive so results persist
import os

REPO_URL = 'https://github.com/marwanmans360/cv-transition-lab.git'
REPO_DIR = '/content/drive/MyDrive/repos/cv-transition-lab'
os.makedirs(os.path.dirname(REPO_DIR), exist_ok=True)

# Pull if the repo is already cloned, otherwise clone it
if os.path.isdir(os.path.join(REPO_DIR, '.git')):
    !git -C {REPO_DIR} pull
else:
    !git clone {REPO_URL} {REPO_DIR}

%cd {REPO_DIR}

# Verify the script exists
script_path = 'src/training/CNN_Augmentation_Advanced.py'

if os.path.exists(script_path):
    print(f"✅ Script found: {script_path}")
else:
    print(f"❌ Script NOT found: {script_path}")
    print("   Current directory:", os.getcwd())
    print("   Files:", os.listdir('.'))

Already up to date.
/content/drive/MyDrive/repos/cv-transition-lab
✅ Script found: src/training/CNN_Augmentation_Advanced.py


**Option B:** If your repo is already in Google Drive (copy it to Colab's local disk for faster execution)

In [None]:
# Copy repo from Drive to Colab (faster execution)
repo_path_in_drive = '/content/drive/MyDrive/repos/cv-transition-lab'

!cp -r {repo_path_in_drive} /content/
%cd /content/cv-transition-lab

# Verify
import os
if os.path.exists('src/training/CNN_Augmentation_Advanced.py'):
    print("✅ Script found and ready to run!")
else:
    print("❌ Script not found. Check the path above.")


## Step 4: Install Dependencies (if needed)

The script requires matplotlib, seaborn, torch, and torchvision. Colab usually ships with all of them, so run this cell only if an import fails:

In [None]:
# Install/upgrade if needed (usually already available in Colab)
!pip install -q matplotlib seaborn torch torchvision
print("✅ Dependencies ready")

## Step 5: Run a CV Script! 🚀

Set the script path below and run the cell. The runner is generic: it can execute any CV training script in the repo.

Examples:
- `src/training/CNN_Augmentation_Advanced.py`
- `src/training/CNN_Plain_Deep_vs_ResNet.py`
- `src/training/Lecture_04_NN_BackProp.py`

Results will be saved wherever the script writes them (prefer Drive paths).
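
One way a script could follow the "prefer Drive paths" advice is to pick its output folder at runtime. The sketch below is an assumption about how that might look, not how the repo's scripts actually behave; `resolve_results_dir` and its default arguments are hypothetical:

```python
import os


def resolve_results_dir(name, drive_root="/content/drive/MyDrive", fallback_root="."):
    """Create and return an output folder, preferring Drive when it is mounted."""
    # Use the Drive root only if it exists (i.e. Drive is mounted in this session).
    root = drive_root if os.path.isdir(drive_root) else fallback_root
    path = os.path.join(root, name)
    os.makedirs(path, exist_ok=True)
    return path
```

With Drive mounted, `resolve_results_dir("results_advanced_aug")` returns a folder under `MyDrive`, so outputs survive a runtime reset; without Drive it falls back to the current directory.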

In [None]:
# Run any CV script by setting the path below
RUN_SCRIPT = 'src/training/CNN_Plain_Deep_vs_ResNet.py'  # Change this as needed

# Ensure CIFAR-10 path is available to scripts that expect it
import os
if 'data_path' in globals():
    os.environ['CIFAR10_PATH'] = data_path
    print(f"CIFAR10_PATH set to: {data_path}")

print(f"Running: {RUN_SCRIPT}")
%run $RUN_SCRIPT

Running: src/training/CNN_Plain_Deep_vs_ResNet.py

PLAIN DEEP CNN vs RESNET COMPARISON


ARCHITECTURE COMPARISON: Plain CNN vs ResNet

PLAIN DEEP CNN (Traditional):
----------------------------------------------------------------------

    Input
      ↓
    Conv → BN → ReLU
      ↓
    Conv → BN → ReLU
      ↓
    Conv → BN → ReLU
      ↓
    Conv → BN → ReLU
      ↓
      ...
      ↓
    Output
    
    Problem: Gradients get weaker as they backpropagate through many layers
    Result:  Hard to train, may suffer degradation
    

----------------------------------------------------------------------
RESNET (With Skip Connections):
----------------------------------------------------------------------

    Input (x)
      ↓─────────┐                ← Skip connection!
    Conv → BN → ReLU             ← F(x) path
      ↓         │
    Conv → BN   │
      ↓         │
      +─────────┘               

  \_________F(x)_________/    |


FileNotFoundError: [Errno 2] No such file or directory: 'data/cifar-10-batches-py/data_batch_1'
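
The `FileNotFoundError` above suggests that this script opens the relative path `data/cifar-10-batches-py` and does not consult the `CIFAR10_PATH` variable set by the runner cell. A minimal sketch of how a training script could resolve the dataset directory from that variable, falling back to the relative path; `resolve_cifar_dir` is a hypothetical helper and is not confirmed to exist in the repo:

```python
import os

# Relative path seen in the traceback; used when CIFAR10_PATH is not set.
DEFAULT_CIFAR_DIR = "data/cifar-10-batches-py"


def resolve_cifar_dir(default=DEFAULT_CIFAR_DIR):
    """Return the CIFAR-10 directory from CIFAR10_PATH, or the default path."""
    # An unset or empty variable falls through to the default.
    return os.environ.get("CIFAR10_PATH") or default
```

Inside the script, a batch file would then be opened with `os.path.join(resolve_cifar_dir(), "data_batch_1")` instead of a hard-coded path.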

## Step 6: Save Results

**Option A:** Push to GitHub (if you cloned from GitHub)

In [1]:
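# Run git against the repo cloned in Step 3; outside a repository git fails
# with "fatal: not a git repository"
REPO_DIR = '/content/drive/MyDrive/repos/cv-transition-lab'

# Configure git (use your own identity)
!git config --global user.email "marro.mans360@gmail.com"
!git config --global user.name "marwanmans360"

# Add and commit results
!git -C {REPO_DIR} add results_advanced_aug/
!git -C {REPO_DIR} commit -m "Add augmentation experiment results from Colab"

# Push (you may need to authenticate with a GitHub token)
!git -C {REPO_DIR} push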

fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git


**Option B:** Keep results in Google Drive (no copy needed when the repo lives in Drive)

In [None]:
# Results are already saved in Drive because the repo lives there
RESULTS_DIR = '/content/drive/MyDrive/repos/cv-transition-lab/results_advanced_aug'
!ls {RESULTS_DIR}
print("✅ Results saved in Google Drive!")
print("   Location:", RESULTS_DIR)

✅ Results copied to Google Drive!
   Location: /content/drive/MyDrive/cv-transition-lab/results_CNN/


**Option C:** Download results as ZIP

In [None]:
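# Zip and download results (from Drive repo)
import os
from google.colab import files

RESULTS_DIR = '/content/drive/MyDrive/repos/cv-transition-lab/results_advanced_aug'
ZIP_NAME = 'results_advanced_aug.zip'

# zip reports "Nothing to do!" when the directory is missing, so check first
if os.path.isdir(RESULTS_DIR):
    !zip -r {ZIP_NAME} {RESULTS_DIR}
    files.download(ZIP_NAME)
    print("✅ Download started! Check your browser's downloads folder.")
else:
    print(f"❌ Results directory not found: {RESULTS_DIR}")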


zip error: Nothing to do! (try: zip -r results_CNN.zip . -i results_CNN/)


FileNotFoundError: Cannot find file: results_CNN.zip
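
The two errors above both follow from the results folder not being found: `zip` had nothing to archive, so there was no file to download. As an alternative to the shell `zip` command, the archive can be built with the standard library, which raises a clear error up front. This is a sketch under that assumption; `zip_results` is a hypothetical helper:

```python
import os
import shutil


def zip_results(results_dir, zip_stem):
    """Archive results_dir into <zip_stem>.zip and return the archive path."""
    if not os.path.isdir(results_dir):
        raise FileNotFoundError(f"Results directory not found: {results_dir}")
    # Archive relative to the parent so the ZIP holds only the results folder,
    # not the full absolute path.
    parent, folder = os.path.split(os.path.abspath(results_dir))
    return shutil.make_archive(zip_stem, "zip", root_dir=parent, base_dir=folder)
```

In Colab, the returned path can be passed straight to `files.download(...)`.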

---

## 📊 Results Summary

After the script completes, you'll find:

- **JSON files**: `step0_baseline.json`, `step2_cutout_tuning.json`, `summary.json`
- **Model checkpoints**: `.pth` files for each experiment and seed
- **Visualizations**: 
  - `comparison.png` - Test accuracy and gap comparison
  - `training_curves.png` - Training dynamics
- **Console output**: Comprehensive summary tables and key insights

The final output will show:
- 🏆 Best augmentation strategy
- 📈 Performance improvements over baseline
- 🎓 Key insights and recommendations
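
To inspect the JSON output without rerunning the experiment, the summary file can be loaded directly. The structure of `summary.json` is not described in this notebook, so the sketch below only loads the file and leaves interpretation to the reader; `load_summary` is a hypothetical helper:

```python
import json
import os


def load_summary(results_dir, filename="summary.json"):
    """Load the summary JSON from results_dir, or return None if it is absent."""
    path = os.path.join(results_dir, filename)
    if not os.path.isfile(path):
        return None
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `load_summary(RESULTS_DIR)` returns the parsed contents once the augmentation script has finished, and `None` if the run did not get that far.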