# Phase 1: Dataset Preparation & Environment Setup
Outdoor Object Detection & Face Recognition System

This notebook handles:
- Environment setup and dependency installation
- Dataset download (LFW, WiderFace, RTTS, BDD100K)
- Preprocessing (resize to 640x640, train/val/test split)
- Data augmentation (fog, rain, low-light, motion blur)
- Dataset statistics and verification

- **Runtime**: GPU (T4) recommended for faster processing
- **Storage**: Results saved to Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

import os
PROJECT_DIR = '/content/drive/MyDrive/computer_vision'
os.makedirs(PROJECT_DIR, exist_ok=True)
print(f"Project directory: {PROJECT_DIR}")

In [None]:
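%cd /content
# Remove any previous clone so the session starts from a fresh copy of the repository
!rm -rf computer_vision_expirement
!git clone https://github.com/Ib-Programmer/computer_vision_expirement.git
%cd computer_vision_expirement
# Install the project dependencies, plus gdown for the dataset download step
!pip install -q -r requirements.txt
!pip install -q gdown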

## 1.1 Dataset Download

In [None]:
%cd /content/computer_vision_expirement
!python scripts/download_datasets.py

## 1.2 Preprocessing

In [None]:
%cd /content/computer_vision_expirement
!python scripts/preprocess_data.py
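
The internals of `scripts/preprocess_data.py` are not shown in this notebook. As a rough illustration of the train/val/test split step only, the sketch below shuffles a list of file paths with a fixed seed and slices it by ratio. The helper name `split_dataset`, the seed, and the 70/15/15 ratios are assumptions made for the example, not values taken from the script.

```python
import random


def split_dataset(paths, train=0.70, val=0.15, seed=42):
    """Shuffle paths deterministically and split them into train/val/test lists.

    Hypothetical helper: the ratios and seed are illustrative assumptions.
    Whatever remains after the train and val slices becomes the test set.
    """
    if not 0 < train < 1 or not 0 <= val < 1 or train + val >= 1:
        raise ValueError("train and val must be fractions that sum to less than 1")
    items = sorted(paths)               # sort first so the result does not depend on input order
    random.Random(seed).shuffle(items)  # local RNG, leaves the global random state untouched
    n_train = int(len(items) * train)
    n_val = int(len(items) * val)
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],
    }


splits = split_dataset([f"img_{i:04d}.jpg" for i in range(100)])
print({name: len(files) for name, files in splits.items()})
```

Fixing the seed keeps the split reproducible across Colab sessions, which matters when later phases are evaluated on the same test set.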

## 1.3 Data Augmentation

In [None]:
%cd /content/computer_vision_expirement
!python scripts/augment_data.py
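
The augmentation script itself is not reproduced here. To give a sense of what weather-style augmentations can look like, the sketch below implements three of the four listed effects with NumPy only: fog via the standard atmospheric scattering blend `I = J * t + A * (1 - t)`, low light via a gamma curve, and motion blur via a horizontal box average. The function names, default parameters, and the choice of these particular formulas are assumptions, and `scripts/augment_data.py` may work differently. Rain is omitted because it needs a streak generator that would make the example much longer.

```python
import numpy as np


def add_fog(image, transmission=0.6, airlight=0.9):
    """Blend an H x W x C uint8 image toward a uniform airlight value."""
    img = image.astype(np.float32) / 255.0
    foggy = img * transmission + airlight * (1.0 - transmission)
    return (np.clip(foggy, 0.0, 1.0) * 255.0).round().astype(np.uint8)


def darken(image, gamma=2.2):
    """Simulate low light: a gamma above 1 pushes mid-tones toward black."""
    img = image.astype(np.float32) / 255.0
    return (np.power(img, gamma) * 255.0).round().astype(np.uint8)


def motion_blur(image, length=9):
    """Horizontal motion blur: average `length` neighbouring pixels along each row.

    `length` is assumed to be odd so the window is centred on each pixel.
    """
    img = image.astype(np.float32)
    pad = length // 2
    # Repeat edge pixels so the output keeps the input width
    padded = np.pad(img, ((0, 0), (pad, pad), (0, 0)), mode="edge")
    width = img.shape[1]
    acc = np.zeros_like(img)
    for offset in range(length):
        acc += padded[:, offset:offset + width, :]
    return (acc / length).round().astype(np.uint8)


sample = np.full((4, 6, 3), 128, dtype=np.uint8)  # flat grey test image
print(add_fog(sample).mean(), darken(sample).mean(), motion_blur(sample).mean())
```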

## 1.4 Dataset Statistics

In [None]:
%cd /content/computer_vision_expirement
!python scripts/dataset_stats.py
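
For a quick sanity check that does not depend on `scripts/dataset_stats.py`, image files can be counted per top-level folder with the standard library alone. The sketch below is a minimal version of that idea. The helper name `count_images` and the extension list are assumptions, and the actual script may report different or additional statistics.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match the datasets in use
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(root):
    """Return a Counter mapping each top-level subdirectory of root to its image count."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            relative = path.relative_to(root)
            # Files sitting directly under root are grouped under "."
            group = relative.parts[0] if len(relative.parts) > 1 else "."
            counts[group] += 1
    return counts
```

In this notebook it could be called as `count_images('/content/computer_vision_expirement/datasets')` once the download and preprocessing cells have finished. A folder with zero images is a sign that a download failed.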

## 1.5 Save to Google Drive

In [None]:
import os
import shutil

# PROJECT_DIR is defined in the first cell, after Google Drive is mounted
REPO_DIR = '/content/computer_vision_expirement'


def copy_to_drive(src, dst, label):
    """Copy src to dst once; skip if src is missing or dst already exists."""
    if not os.path.exists(src):
        print(f"{label} not found at {src}, nothing to copy")
    elif os.path.exists(dst):
        # An existing destination is never refreshed; delete it first to force a re-copy
        print(f"{label} already present at {dst}, skipping copy")
    else:
        print(f"Copying {label.lower()} to Google Drive...")
        shutil.copytree(src, dst)
        print("Done!")


copy_to_drive(f'{REPO_DIR}/datasets', f'{PROJECT_DIR}/datasets', "Datasets")

# Also copy augmented outputs
copy_to_drive(f'{REPO_DIR}/outputs', f'{PROJECT_DIR}/outputs', "Augmented data")

print("\nPhase 1 Complete! Data saved to Google Drive.")
print(f"Location: {PROJECT_DIR}")

## Next Steps
- Open **Phase2_Image_Enhancement.ipynb** to evaluate enhancement models
- Datasets are saved in Google Drive and will persist across sessions