# 🔬 Smart Lens v2: Fine-Tune Existing Model with More Data

**Strategy: Transfer Learning from your trained `best.pt`**

This notebook does NOT start from scratch. It loads your already-trained model and fine-tunes it with additional open-source datasets merged with your existing data.

### Why this works:
- Your `best.pt` already has learned feature representations for Fighting, Fire, Gun, Knife
- Fine-tuning preserves those features while improving from new data
- Much faster than training from scratch (fewer epochs needed)
- Usually yields better accuracy for the same training budget, because the model starts from a good baseline

### Pipeline:
1. ✅ Mount Drive & upload your `best.pt`
2. ✅ Download your existing dataset (Roboflow)
3. ✅ Download additional open-source datasets for each class
4. ✅ Merge & deduplicate all datasets into unified YOLOv8 format
5. ✅ Fine-tune from `best.pt` with merged dataset
6. ✅ Evaluate & compare with v1
7. ✅ Export & download improved model
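
Step 4 mentions deduplication. One simple way to detect exact duplicates is to hash file contents. The sketch below is a minimal illustration, not the notebook's own merge logic. The helper name `find_duplicate_images` is hypothetical, and it only catches byte-identical files (resized or re-encoded copies of the same picture would not match).

```python
import hashlib
from pathlib import Path


def find_duplicate_images(image_dir):
    """Return (duplicate_path, first_seen_path) pairs for byte-identical files.

    Hypothetical helper: files are compared by the SHA-256 of their contents,
    so only exact copies are detected.
    """
    seen = {}        # content hash -> first file found with that content
    duplicates = []  # (later file, earlier file with identical content)
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append((path, seen[digest]))
        else:
            seen[digest] = path
    return duplicates
```

Calling `find_duplicate_images('/content/merged_dataset/train/images')` after the merge would return pairs that could be reviewed, and any removed image should have its matching label file removed as well.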

---
## 🛠️ Section 1: Environment Setup

In [None]:
# ============================================================
# 1A: Install dependencies
# ============================================================
!pip install -q ultralytics roboflow opencv-python-headless

import os, shutil, glob, yaml, json, random, time
from pathlib import Path
from datetime import datetime
from collections import Counter

import torch
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

In [None]:
# ============================================================
# 1B: Mount Google Drive (to upload/download model files)
# ============================================================
from google.colab import drive
drive.mount('/content/drive')

# Create project folder in Drive
DRIVE_DIR = '/content/drive/MyDrive/Smart-Lens-FYP'
os.makedirs(DRIVE_DIR, exist_ok=True)
os.makedirs(f'{DRIVE_DIR}/models', exist_ok=True)
print(f'Drive directory: {DRIVE_DIR}')

In [None]:
# ============================================================
# 1C: Upload your existing best.pt model
# ============================================================
from google.colab import files

MODEL_PATH = '/content/best.pt'

# Check if already in Drive
drive_model = f'{DRIVE_DIR}/models/best_v1.pt'
if os.path.exists(drive_model):
    shutil.copy(drive_model, MODEL_PATH)
    print(f'Loaded model from Drive: {drive_model}')
elif not os.path.exists(MODEL_PATH):
    print('Upload your best.pt file:')
    uploaded = files.upload()
    for name in uploaded:
        shutil.move(name, MODEL_PATH)
    # Save backup to Drive
    shutil.copy(MODEL_PATH, drive_model)
    print(f'Saved backup to Drive: {drive_model}')

print(f'Model ready: {MODEL_PATH} ({os.path.getsize(MODEL_PATH)/1024/1024:.1f} MB)')

---
## 📊 Section 2: Download & Merge Datasets

### How to add additional datasets:

1. Go to [Roboflow Universe](https://universe.roboflow.com/)
2. Search for: `gun detection`, `knife detection`, `fire detection`, `violence detection`
3. Pick a dataset (look for 1000+ images, good ratings)
4. Click **Download Dataset** → Format: **YOLOv8** → **show download code**
5. You'll get code like:
```python
rf = Roboflow(api_key="YOUR_KEY")
project = rf.workspace("WORKSPACE").project("PROJECT")
version = project.version(VERSION)
dataset = version.download("yolov8")
```
6. Add the workspace, project, version to the `ADDITIONAL_DATASETS` list in Cell 2B

All datasets are remapped to match your class IDs:
- `0` = Fighting
- `1` = Fire
- `2` = Gun
- `3` = Knife
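
As a rough illustration of this remapping, the sketch below rewrites a single YOLO label line from a source dataset's class ID to a Smart Lens class ID, matching by class name. The helper `remap_label_line` and the sample class names are hypothetical and only for illustration; the merge cell (2C) does the equivalent work file by file.

```python
def remap_label_line(line, src_names, class_map):
    """Return the label line with its class ID remapped, or None to drop it.

    line:      one YOLO label row, "cls x_center y_center width height"
    src_names: {source_class_id: source_class_name} from the source data.yaml
    class_map: {source_class_name: target_class_id}, matched case-insensitively
    """
    parts = line.split()
    if len(parts) < 5:
        return None  # malformed row
    src_name = src_names.get(int(parts[0]), "").lower()
    lookup = {name.lower(): target for name, target in class_map.items()}
    if src_name not in lookup:
        return None  # class is not one of Fighting/Fire/Gun/Knife
    parts[0] = str(lookup[src_name])
    return " ".join(parts)


# Assumed example: a source dataset with classes "Pistol" (0) and "person" (1)
src_names = {0: "Pistol", 1: "person"}
class_map = {"pistol": 2, "gun": 2}
print(remap_label_line("0 0.5 0.5 0.2 0.3", src_names, class_map))
print(remap_label_line("1 0.4 0.4 0.1 0.1", src_names, class_map))
```

The first row is kept with class `2` (Gun); the second is dropped because `person` has no entry in the map.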

In [None]:
# ============================================================
# 2A: Download your ORIGINAL dataset
# ============================================================
from roboflow import Roboflow

rf = Roboflow(api_key="7QsEv54uizzlrvPZ972Z")
project = rf.workspace("fpy").project("smart-survellaince-lens-2")
version = project.version(1)
original_ds = version.download("yolov8", location='/content/datasets/original')
print(f'Original dataset downloaded to: /content/datasets/original')

In [None]:
# ============================================================
# 2B: Download ADDITIONAL open-source datasets
# ============================================================
# Each entry downloads a dataset from Roboflow Universe and remaps
# its classes to match YOUR Smart Lens classes (0-3).
#
# HOW TO ADD A DATASET:
# 1. Go to universe.roboflow.com -> find a dataset
# 2. Click Download -> YOLOv8 -> "show download code"
# 3. Copy workspace, project, version from the code
# 4. Add an entry below with the correct class_map
#
# If a dataset fails to download, it's automatically skipped.
#
# class_map: maps SOURCE class names -> YOUR target class ID
#   Fighting=0, Fire=1, Gun=2, Knife=3

ADDITIONAL_DATASETS = [
    # ── UNCOMMENT & EDIT with real values from Roboflow Universe: ──
    #
    # {
    #     "name": "Gun Detection Dataset",
    #     "workspace": "paste-workspace-from-download-code",
    #     "project": "paste-project-from-download-code",
    #     "version": 1,
    #     "class_map": {"handgun": 2, "gun": 2, "pistol": 2, "weapon": 2},
    # },
    # {
    #     "name": "Knife Detection Dataset",
    #     "workspace": "paste-workspace-from-download-code",
    #     "project": "paste-project-from-download-code",
    #     "version": 1,
    #     "class_map": {"knife": 3, "blade": 3},
    # },
    # {
    #     "name": "Fire Smoke Detection",
    #     "workspace": "paste-workspace-from-download-code",
    #     "project": "paste-project-from-download-code",
    #     "version": 1,
    #     "class_map": {"fire": 1, "smoke": 1, "flame": 1},
    # },
    # {
    #     "name": "Violence Fighting Detection",
    #     "workspace": "paste-workspace-from-download-code",
    #     "project": "paste-project-from-download-code",
    #     "version": 1,
    #     "class_map": {"fighting": 0, "violence": 0, "fight": 0},
    # },
]

# Download each dataset
downloaded_paths = []
for i, ds_info in enumerate(ADDITIONAL_DATASETS):
    name = ds_info['name']
    print(f'\n[{i+1}/{len(ADDITIONAL_DATASETS)}] Downloading: {name}...')
    try:
        rf2 = Roboflow(api_key="7QsEv54uizzlrvPZ972Z")
        proj = rf2.workspace(ds_info['workspace']).project(ds_info['project'])
        ver = proj.version(ds_info['version'])
        dl_path = f'/content/datasets/extra_{i}'
        ver.download('yolov8', location=dl_path)
        ds_info['local_path'] = dl_path
        downloaded_paths.append(ds_info)
        n_imgs = len(glob.glob(f'{dl_path}/**/images/*', recursive=True))
        print(f'  \u2705 Downloaded {n_imgs} images to {dl_path}')
    except Exception as e:
        print(f'  \u274c Failed: {e}')
        print(f'  Skipping, continuing...')

if len(ADDITIONAL_DATASETS) == 0:
    print('No additional datasets configured.')
    print('Fine-tuning will use your original 1793 images only.')
    print('This still helps! Lower LR extracts more accuracy from existing data.')
    print('\nTo add more data: see instructions above this cell.')
else:
    print(f'\n\u2705 {len(downloaded_paths)}/{len(ADDITIONAL_DATASETS)} additional datasets downloaded')

In [None]:
# ============================================================
# 2B-ALT: Manual dataset upload (OPTIONAL)
# ============================================================
# If you downloaded a dataset ZIP manually from Roboflow Universe,
# upload it here and it will be merged automatically.
#
# Set UPLOAD_DATASETS = True to enable

import zipfile

UPLOAD_DATASETS = False  # Set to True to upload ZIP datasets manually

if UPLOAD_DATASETS:
    print('Upload YOLOv8 dataset ZIP files:')
    uploaded = files.upload()
    for zip_name in uploaded:
        extract_path = f'/content/datasets/uploaded_{Path(zip_name).stem}'
        os.makedirs(extract_path, exist_ok=True)
        with zipfile.ZipFile(zip_name, 'r') as z:
            z.extractall(extract_path)
        n = len(glob.glob(f'{extract_path}/**/images/*', recursive=True))
        print(f'Extracted {n} images to: {extract_path}')
        
        # EDIT class_map to match the source dataset!
        downloaded_paths.append({
            'name': f'Uploaded: {zip_name}',
            'local_path': extract_path,
            'class_map': {
                'gun': 2, 'handgun': 2, 'pistol': 2, 'weapon': 2,
                'knife': 3, 'blade': 3,
                'fire': 1, 'smoke': 1, 'flame': 1,
                'fighting': 0, 'violence': 0, 'fight': 0,
            },
        })
else:
    print('Manual upload disabled. Set UPLOAD_DATASETS=True to enable.')

In [None]:
# ============================================================
# 2C: MERGE all datasets into one unified dataset
# ============================================================

MERGED_DIR = '/content/merged_dataset'
os.makedirs(f'{MERGED_DIR}/train/images', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/train/labels', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/valid/images', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/valid/labels', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/test/images', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/test/labels', exist_ok=True)

TARGET_CLASSES = {0: 'Fighting', 1: 'Fire', 2: 'Gun', 3: 'Knife'}
stats = {'total_images': 0, 'total_labels': 0, 'per_class': Counter(), 'per_source': Counter()}

def copy_dataset(src_dir, prefix, class_remap=None, source_data_yaml=None):
    """Copy images+labels from a YOLOv8 dataset to merged dir."""
    src_class_names = {}
    if source_data_yaml and os.path.exists(source_data_yaml):
        with open(source_data_yaml) as f:
            src_yaml = yaml.safe_load(f)
            names = src_yaml.get('names', [])
            if isinstance(names, list):
                src_class_names = {i: n for i, n in enumerate(names)}
            elif isinstance(names, dict):
                src_class_names = names
    
    copied = 0
    for split in ['train', 'valid', 'test']:
        img_dir = os.path.join(src_dir, split, 'images')
        lbl_dir = os.path.join(src_dir, split, 'labels')
        if not os.path.isdir(img_dir):
            continue
        target_split = split
        for img_file in glob.glob(os.path.join(img_dir, '*')):
            img_name = os.path.basename(img_file)
            stem = Path(img_name).stem
            ext = Path(img_name).suffix
            lbl_file = os.path.join(lbl_dir, f'{stem}.txt')
            new_name = f'{prefix}_{stem}'
            new_img = os.path.join(MERGED_DIR, target_split, 'images', f'{new_name}{ext}')
            new_lbl = os.path.join(MERGED_DIR, target_split, 'labels', f'{new_name}.txt')
            if not os.path.exists(lbl_file):
                continue
            if class_remap and src_class_names:
                remapped_lines = []
                with open(lbl_file) as f:
                    for line in f:
                        parts = line.strip().split()
                        if len(parts) < 5:
                            continue
                        src_cls_id = int(parts[0])
                        src_cls_name = src_class_names.get(src_cls_id, '')
                        target_cls = None
                        for src_key, tgt_id in class_remap.items():
                            if src_key.lower() == src_cls_name.lower():
                                target_cls = tgt_id
                                break
                        # Classes whose names are not in class_map are dropped.
                        # Falling back to the raw source ID would mislabel them.
                        if target_cls is not None:
                            parts[0] = str(target_cls)
                            remapped_lines.append(' '.join(parts))
                            stats['per_class'][TARGET_CLASSES[target_cls]] += 1
                if not remapped_lines:
                    continue
                shutil.copy2(img_file, new_img)
                with open(new_lbl, 'w') as f:
                    f.write('\n'.join(remapped_lines) + '\n')
            else:
                shutil.copy2(img_file, new_img)
                shutil.copy2(lbl_file, new_lbl)
                with open(lbl_file) as f:
                    for line in f:
                        parts = line.strip().split()
                        if parts:
                            cls_id = int(parts[0])
                            cls_name = TARGET_CLASSES.get(cls_id, f'Unknown_{cls_id}')
                            stats['per_class'][cls_name] += 1
            copied += 1
    return copied

# Copy original dataset
print('Copying original dataset...')
n = copy_dataset('/content/datasets/original', prefix='orig')
stats['per_source']['Original'] = n
stats['total_images'] += n
print(f'  \u2705 {n} images from original dataset')

# Copy additional datasets with class remapping
for ds_info in downloaded_paths:
    name = ds_info['name']
    path = ds_info['local_path']
    remap = ds_info['class_map']
    prefix = name.replace(' ', '_').lower()[:15]
    data_yaml = os.path.join(path, 'data.yaml')
    print(f'\nMerging: {name}...')
    if os.path.exists(data_yaml):
        with open(data_yaml) as f:
            src_yaml = yaml.safe_load(f)
            print(f'  Source classes: {src_yaml.get("names", "?")}')
    n = copy_dataset(path, prefix=prefix, class_remap=remap, source_data_yaml=data_yaml)
    stats['per_source'][name] = n
    stats['total_images'] += n
    print(f'  \u2705 {n} images merged')

print(f'\n{"="*60}')
print(f'MERGE COMPLETE')
print(f'{"="*60}')
print(f'Total images: {stats["total_images"]}')
print(f'\nPer source:')
for src, count in stats['per_source'].items():
    print(f'  {src}: {count}')
print(f'\nPer class (annotations):')
for cls, count in stats['per_class'].most_common():
    print(f'  {cls}: {count}')

In [None]:
# ============================================================
# 2D: Re-split and create data.yaml
# ============================================================
all_train_imgs = glob.glob(f'{MERGED_DIR}/train/images/*')
all_valid_imgs = glob.glob(f'{MERGED_DIR}/valid/images/*')
all_test_imgs = glob.glob(f'{MERGED_DIR}/test/images/*')

print(f'Before re-split: train={len(all_train_imgs)} valid={len(all_valid_imgs)} test={len(all_test_imgs)}')

total = len(all_train_imgs) + len(all_valid_imgs) + len(all_test_imgs)
target_valid = int(total * 0.15)
target_test = int(total * 0.05)

if len(all_valid_imgs) < target_valid * 0.5:
    print('Redistributing to 80/15/5 split...')
    for imgs_path in [all_valid_imgs, all_test_imgs]:
        for img in imgs_path:
            ext = Path(img).suffix
            lbl = img.replace('/images/', '/labels/').replace(ext, '.txt')
            shutil.move(img, f'{MERGED_DIR}/train/images/{Path(img).name}')
            if os.path.exists(lbl):
                shutil.move(lbl, f'{MERGED_DIR}/train/labels/{Path(lbl).name}')
    all_imgs = glob.glob(f'{MERGED_DIR}/train/images/*')
    random.seed(42)
    random.shuffle(all_imgs)
    valid_imgs = all_imgs[:target_valid]
    test_imgs = all_imgs[target_valid:target_valid + target_test]
    for imgs, split in [(valid_imgs, 'valid'), (test_imgs, 'test')]:
        for img in imgs:
            ext = Path(img).suffix
            lbl = img.replace('/images/', '/labels/').replace(ext, '.txt')
            shutil.move(img, f'{MERGED_DIR}/{split}/images/{Path(img).name}')
            if os.path.exists(lbl):
                shutil.move(lbl, f'{MERGED_DIR}/{split}/labels/{Path(lbl).name}')

final_train = len(glob.glob(f'{MERGED_DIR}/train/images/*'))
final_valid = len(glob.glob(f'{MERGED_DIR}/valid/images/*'))
final_test = len(glob.glob(f'{MERGED_DIR}/test/images/*'))
print(f'\nFinal split: train={final_train} valid={final_valid} test={final_test}')
print(f'Total: {final_train + final_valid + final_test} images')

data_yaml = {
    'names': ['Fighting', 'Fire', 'Gun', 'Knife'],
    'nc': 4,
    'train': f'{MERGED_DIR}/train/images',
    'val': f'{MERGED_DIR}/valid/images',
    'test': f'{MERGED_DIR}/test/images',
}
data_yaml_path = f'{MERGED_DIR}/data.yaml'
with open(data_yaml_path, 'w') as f:
    yaml.dump(data_yaml, f, default_flow_style=False)

print(f'\n\u2705 data.yaml created: {data_yaml_path}')
print(yaml.dump(data_yaml, default_flow_style=False))

---
## 🎯 Section 3: Fine-Tune Model from `best.pt`

### Key differences from training from scratch:
- `model = YOLO('best.pt')` instead of `YOLO('yolov8s.pt')`
- **Lower learning rate** (0.0005 vs 0.001): preserves learned features
- **Fewer epochs** (100 vs 200): the model already has a good baseline
- **Less aggressive augmentation**: don't distort what it already knows
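
To make the learning-rate bullet concrete, the sketch below computes a per-epoch learning rate for a linear decay from `lr0` to `lr0 * lrf`, using the values configured in cell 3A (`lr0=0.0005`, `lrf=0.01`, 100 epochs). It assumes a plain linear schedule with no warmup; the trainer's actual schedule may differ, so treat the numbers as approximate. The function name `linear_lr` is hypothetical.

```python
def linear_lr(epoch, epochs=100, lr0=0.0005, lrf=0.01):
    """Learning rate at a given epoch under an assumed linear decay.

    The rate starts at lr0 and falls linearly to lr0 * lrf at the final epoch.
    """
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


for epoch in (0, 50, 100):
    print(f"epoch {epoch:3d}: lr = {linear_lr(epoch):.7f}")
```

Under these assumptions the rate ends at `lr0 * lrf`, two orders of magnitude below its starting value, which is what keeps late updates to the already-trained weights small.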

In [None]:
# ============================================================
# 3A: Configure fine-tuning parameters
# ============================================================
STRATEGY = 'full'  # 'full' or 'freeze_backbone'

FINETUNE_CONFIG = {
    'model': MODEL_PATH,
    'data': data_yaml_path,
    'epochs': 100,
    'patience': 30,
    'batch': 16,
    'imgsz': 640,
    'device': 0,
    'optimizer': 'AdamW',
    'lr0': 0.0005,
    'lrf': 0.01,
    'weight_decay': 0.0005,
    'warmup_epochs': 3,
    'hsv_h': 0.015,
    'hsv_s': 0.5,
    'hsv_v': 0.3,
    'degrees': 5.0,
    'translate': 0.15,
    'scale': 0.4,
    'shear': 2.0,
    'flipud': 0.0,
    'fliplr': 0.5,
    'mosaic': 1.0,
    'mixup': 0.1,
    'copy_paste': 0.1,
    'erasing': 0.3,
    'close_mosaic': 15,
    'freeze': 10 if STRATEGY == 'freeze_backbone' else None,
    'dropout': 0.05,
    'save': True,
    'save_period': 10,
    'plots': True,
    'project': '/content/runs',
    'name': 'smart_lens_v2',
    'exist_ok': True,
}

print(f'Strategy: {STRATEGY}')
print(f'Starting from: {FINETUNE_CONFIG["model"]}')
print(f'Dataset: {FINETUNE_CONFIG["data"]}')
print(f'Epochs: {FINETUNE_CONFIG["epochs"]}')
print(f'Learning rate: {FINETUNE_CONFIG["lr0"]}')

In [None]:
# ============================================================
# 3B: START FINE-TUNING
# ============================================================
from ultralytics import YOLO

model = YOLO(FINETUNE_CONFIG['model'])
print(f'Model loaded: {FINETUNE_CONFIG["model"]}')
print(f'Classes: {model.names}')
print(f'\n--- Starting fine-tuning... ---\n')

start_time = time.time()
train_args = {k: v for k, v in FINETUNE_CONFIG.items() if v is not None and k != 'model'}
results = model.train(**train_args)

training_time = (time.time() - start_time) / 60
print(f'\n\u2705 Fine-tuning complete! Time: {training_time:.1f} minutes')

---
## 📊 Section 4: Evaluate & Compare v1 vs v2

In [None]:
# ============================================================
# 4A: Validate the fine-tuned model
# ============================================================
from ultralytics import YOLO

best_v2_path = '/content/runs/smart_lens_v2/weights/best.pt'
model_v2 = YOLO(best_v2_path)
metrics_v2 = model_v2.val(data=data_yaml_path, split='val')

print(f'\n{"="*60}')
print(f'  Fine-tuned Model (v2) Metrics')
print(f'{"="*60}')
print(f'  mAP50:     {metrics_v2.box.map50:.4f}')
print(f'  mAP50-95:  {metrics_v2.box.map:.4f}')
print(f'  Precision: {metrics_v2.box.mp:.4f}')
print(f'  Recall:    {metrics_v2.box.mr:.4f}')
print(f'\n  Per-class AP50:')
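# box.ap50 is ordered by box.ap_class_index (classes present in the val set),
# not necessarily by class ID, so look each class ID up explicitly.
for idx, cls_id in enumerate(metrics_v2.box.ap_class_index):
    cls_name = TARGET_CLASSES.get(int(cls_id), f'Unknown_{cls_id}')
    print(f'    {cls_name:10s}: {metrics_v2.box.ap50[idx]:.4f}')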
print(f'{"="*60}')

In [None]:
# ============================================================
# 4B: Compare v1 vs v2
# ============================================================
v1_metrics = {
    'mAP50': 0.7255,
    'mAP50_95': 0.3322,
    'precision': 0.8410,
    'recall': 0.6213,
}
v2_metrics = {
    'mAP50': metrics_v2.box.map50,
    'mAP50_95': metrics_v2.box.map,
    'precision': metrics_v2.box.mp,
    'recall': metrics_v2.box.mr,
}

print(f'\n{"="*60}')
print(f'{"Metric":<15} {"v1 (Original)":>15} {"v2 (Fine-tuned)":>15} {"Change":>10}')
print(f'{chr(9472)*60}')
for key in ['mAP50', 'mAP50_95', 'precision', 'recall']:
    v1 = v1_metrics[key]
    v2 = v2_metrics[key]
    delta = v2 - v1
    arrow = '\u2191' if delta > 0 else '\u2193' if delta < 0 else '\u2192'
    print(f'{key:<15} {v1:>15.4f} {v2:>15.4f} {arrow} {abs(delta):.4f}')
print(f'{"="*60}')

if v2_metrics['mAP50'] > v1_metrics['mAP50']:
    print('\n\U0001f389 v2 is BETTER than v1!')
else:
    print('\n\u26a0\ufe0f v2 metrics are similar/lower. This is expected when fine-tuning')
    print('on the same data. Add more diverse data for bigger improvements.')

---
## 📦 Section 5: Export & Save Model

In [None]:
# ============================================================
# 5A: Export to ONNX + Save to Drive
# ============================================================
model_v2.export(format='onnx', simplify=True)
print('\u2705 ONNX export complete')

timestamp = datetime.now().strftime('%Y%m%d_%H%M')
output_dir = f'{DRIVE_DIR}/models/smart_lens_v2_{timestamp}'
os.makedirs(output_dir, exist_ok=True)

shutil.copy(best_v2_path, f'{output_dir}/best.pt')
shutil.copy(best_v2_path.replace('.pt', '.onnx'), f'{output_dir}/best.onnx')
shutil.copy('/content/runs/smart_lens_v2/weights/last.pt', f'{output_dir}/last.pt')

metrics_data = {
    'mAP50': float(metrics_v2.box.map50),
    'mAP50_95': float(metrics_v2.box.map),
    'precision': float(metrics_v2.box.mp),
    'recall': float(metrics_v2.box.mr),
    'training_time_min': round(training_time, 1),
    'timestamp': timestamp,
    'base_model': 'smart_lens_v1 (fine-tuned)',
    'strategy': STRATEGY,
    'total_images': stats['total_images'],
    'classes': list(TARGET_CLASSES.values()),
}
with open(f'{output_dir}/metrics.json', 'w') as f:
    json.dump(metrics_data, f, indent=2)

with open(f'{output_dir}/training_config.json', 'w') as f:
    json.dump(FINETUNE_CONFIG, f, indent=2, default=str)

print(f'\n\u2705 All files saved to Google Drive:')
print(f'   {output_dir}/')
for fname in os.listdir(output_dir):
    size = os.path.getsize(f'{output_dir}/{fname}') / 1024 / 1024
    print(f'   \u251c\u2500\u2500 {fname} ({size:.1f} MB)')

In [None]:
# ============================================================
# 5B: Download model to local machine
# ============================================================
from google.colab import files
files.download(f'{output_dir}/best.pt')
print('Place the downloaded best.pt in:')
print('  Smart-Lens-FYP/trained_models/smart_lens_v2_XXXX/best.pt')

---
## 💡 Section 6: Quick Test

In [None]:
# ============================================================
# 6: Visual test on sample images
# ============================================================
import matplotlib.pyplot as plt

test_images = glob.glob(f'{MERGED_DIR}/test/images/*')[:8]
fig, axes = plt.subplots(2, 4, figsize=(20, 10))
axes = axes.flatten()

for i, img_path in enumerate(test_images):
    results = model_v2.predict(img_path, conf=0.4, verbose=False)
    annotated = results[0].plot()
    axes[i].imshow(annotated[:, :, ::-1])
    axes[i].set_title(Path(img_path).stem[:30], fontsize=8)
    axes[i].axis('off')

plt.suptitle('Smart Lens v2 - Test Predictions', fontsize=16)
plt.tight_layout()
plt.savefig(f'{output_dir}/test_predictions.png', dpi=150)
plt.show()

---
## ✅ Done! Next Steps

1. **Download `best.pt`** from the output above (or from Google Drive)
2. **Place it** in `Smart-Lens-FYP/trained_models/smart_lens_v2_XXXX/`
3. **Test locally** with:
   ```bash
   python smart_lens_v2.py --source 0 --model trained_models/smart_lens_v2_XXXX/best.pt
   ```

### To add more data and train v3:
1. Go to [universe.roboflow.com](https://universe.roboflow.com/)
2. Search for gun/knife/fire/fighting datasets
3. Click Download → YOLOv8 → show download code
4. Copy workspace + project + version from the download code
5. Add entries to Cell 2B's `ADDITIONAL_DATASETS` list
6. Upload the new `best.pt` (from v2) as the starting model
7. Re-run this notebook → v3!

# 🔬 Smart Lens v2: Fine-Tune Existing Model with More Data

**Strategy: Transfer Learning from your trained `best.pt`**

This notebook does NOT start from scratch. It loads your already-trained model and fine-tunes it with additional open-source datasets merged with your existing data.

### Why this works:
- Your `best.pt` already has learned feature representations for Fighting, Fire, Gun, Knife
- Fine-tuning preserves those features while improving from new data
- Much faster than training from scratch (fewer epochs needed)
- Usually yields better accuracy for the same training budget, because the model starts from a good baseline

### Pipeline:
1. ✅ Mount Drive & upload your `best.pt`
2. ✅ Download your existing dataset (Roboflow)
3. ✅ Download additional open-source datasets for each class
4. ✅ Merge & deduplicate all datasets into unified YOLOv8 format
5. ✅ Fine-tune from `best.pt` with merged dataset
6. ✅ Evaluate & compare with v1
7. ✅ Export & download improved model

---
## 🛠️ Section 1: Environment Setup

In [None]:
# ============================================================
# 1A: Install dependencies
# ============================================================
!pip install -q ultralytics roboflow opencv-python-headless

import os, shutil, glob, yaml, json, random, time
from pathlib import Path
from datetime import datetime
from collections import Counter

import torch
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

In [None]:
# ============================================================
# 1B: Mount Google Drive (to upload/download model files)
# ============================================================
from google.colab import drive
drive.mount('/content/drive')

# Create project folder in Drive
DRIVE_DIR = '/content/drive/MyDrive/Smart-Lens-FYP'
os.makedirs(DRIVE_DIR, exist_ok=True)
os.makedirs(f'{DRIVE_DIR}/models', exist_ok=True)
print(f'Drive directory: {DRIVE_DIR}')

In [None]:
# ============================================================
# 1C: Upload your existing best.pt model
# ============================================================
# Option 1: Upload from local machine
from google.colab import files

MODEL_PATH = '/content/best.pt'

# Check if already in Drive
drive_model = f'{DRIVE_DIR}/models/best_v1.pt'
if os.path.exists(drive_model):
    shutil.copy(drive_model, MODEL_PATH)
    print(f'Loaded model from Drive: {drive_model}')
elif not os.path.exists(MODEL_PATH):
    print('Upload your best.pt file:')
    uploaded = files.upload()
    for name in uploaded:
        shutil.move(name, MODEL_PATH)
    # Save backup to Drive
    shutil.copy(MODEL_PATH, drive_model)
    print(f'Saved backup to Drive: {drive_model}')

print(f'Model ready: {MODEL_PATH} ({os.path.getsize(MODEL_PATH)/1024/1024:.1f} MB)')

---
## 📊 Section 2: Download & Merge Datasets

### Additional Open-Source Datasets (Roboflow Universe)

We'll merge your existing data with these high-quality labeled datasets:

| Class | Additional Dataset | Source | Why |
|-------|-------------------|--------|-----|
| **Gun** | Handgun Detection | Roboflow Universe | More gun angles, CCTV-style |
| **Gun** | Weapons Detection | Roboflow Universe | Various firearms |
| **Knife** | Knife Detection | Roboflow Universe | More knife variations |
| **Fire** | Fire & Smoke | Roboflow Universe | Indoor/outdoor fire scenes |
| **Fighting** | Violence Detection | Roboflow Universe | Surveillance-style fighting |

All datasets are remapped to match your class IDs:
- `0` = Fighting
- `1` = Fire
- `2` = Gun
- `3` = Knife

In [None]:
# ============================================================
# 2A: Download your ORIGINAL dataset
# ============================================================
from roboflow import Roboflow

rf = Roboflow(api_key="7QsEv54uizzlrvPZ972Z")
project = rf.workspace("fpy").project("smart-survellaince-lens-2")
version = project.version(1)
original_ds = version.download("yolov8", location='/content/datasets/original')
print(f'Original dataset downloaded to: /content/datasets/original')

In [None]:
# ============================================================
# 2B: Download ADDITIONAL open-source datasets
# ============================================================
# Method: Direct download from Roboflow Universe using public URLs.
# This cell fetches each public Universe dataset as a ZIP from its
# download URL and unzips it, rather than going through the Roboflow SDK.
#
# To add your own: go to universe.roboflow.com, find a dataset,
# click "Download" → YOLOv8 → copy the download URL.

import requests, zipfile, io

ADDITIONAL_DATASETS = [
    # ── Gun datasets ─────────────────────────────────────────
    {
        "name": "Gun Detection (Roboflow Universe)",
        "url": "https://universe.roboflow.com/ds/PLACEHOLDER_GUN?key=7QsEv54uizzlrvPZ972Z",
        "class_map": {"handgun": 2, "Handgun": 2, "gun": 2, "Gun": 2, "pistol": 2, "Pistol": 2, "weapon": 2, "Weapon": 2, "firearm": 2, "Firearm": 2, "rifle": 2, "Rifle": 2},
        "target_classes": [2],
    },
    # ── Knife datasets ────────────────────────────────────────
    {
        "name": "Knife Detection (Roboflow Universe)",
        "url": "https://universe.roboflow.com/ds/PLACEHOLDER_KNIFE?key=7QsEv54uizzlrvPZ972Z",
        "class_map": {"knife": 3, "Knife": 3, "blade": 3, "Blade": 3},
        "target_classes": [3],
    },
    # ── Fire datasets ─────────────────────────────────────────
    {
        "name": "Fire Detection (Roboflow Universe)",
        "url": "https://universe.roboflow.com/ds/PLACEHOLDER_FIRE?key=7QsEv54uizzlrvPZ972Z",
        "class_map": {"fire": 1, "Fire": 1, "smoke": 1, "Smoke": 1, "flame": 1, "Flame": 1},
        "target_classes": [1],
    },
]

# ════════════════════════════════════════════════════════════
# 🔧 HOW TO GET THE REAL DOWNLOAD URLS:
# ════════════════════════════════════════════════════════════
# 1. Go to https://universe.roboflow.com
# 2. Search for "gun detection", "knife detection", "fire detection"
# 3. Pick a dataset (look for 1000+ images, good ratings)
# 4. Click the dataset → "Download Dataset" → Format: "YOLOv8"
# 5. Choose "download zip to computer" or "show download code"
# 6. Copy the URL (looks like: https://universe.roboflow.com/ds/XXXXX?key=YYYY)
# 7. Replace the PLACEHOLDER URLs above with the real ones
#
# ALTERNATIVELY: Use the manual download method below (Cell 2B-ALT)
# ════════════════════════════════════════════════════════════

downloaded_paths = []
for i, ds_info in enumerate(ADDITIONAL_DATASETS):
    name = ds_info["name"]
    url = ds_info["url"]
    dl_path = f'/content/datasets/extra_{i}'

    print(f'\n[{i+1}/{len(ADDITIONAL_DATASETS)}] Downloading: {name}...')

    if "PLACEHOLDER" in url:
        print(f'  ⏭️ Skipping: URL is a placeholder. Replace it with a real URL from Roboflow Universe.')
        continue

    try:
        os.makedirs(dl_path, exist_ok=True)
        r = requests.get(url, stream=True, timeout=120)
        r.raise_for_status()
        z = zipfile.ZipFile(io.BytesIO(r.content))
        z.extractall(dl_path)
        ds_info["local_path"] = dl_path
        downloaded_paths.append(ds_info)
        # Count images
        n_imgs = len(glob.glob(f'{dl_path}/**/images/*', recursive=True))
        print(f'  ✅ Downloaded {n_imgs} images to {dl_path}')
    except Exception as e:
        print(f'  ❌ Failed: {e}')
        print('  Skipping, continuing...')

print(f'\n✅ {len(downloaded_paths)}/{len(ADDITIONAL_DATASETS)} additional datasets downloaded')
if len(downloaded_paths) == 0:
    print('\n⚠️ No additional datasets downloaded.')
    print('The model will fine-tune on your original dataset only (still beneficial!).')
    print('To add more data next time, replace the PLACEHOLDER URLs above (see instructions).')
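
The extraction step in the loop above can be tried without a network connection. The sketch below builds a tiny ZIP archive in memory (standing in for `r.content`), extracts it, and counts image files by extension. The helper name `extract_and_count`, the `IMAGE_EXTS` set, and the archive layout are illustrative assumptions, not part of the notebook.

```python
import io
import os
import tempfile
import zipfile

IMAGE_EXTS = {'.jpg', '.jpeg', '.png', '.bmp'}  # assumed set of image extensions

def extract_and_count(zip_bytes, dest_dir):
    """Extract an in-memory ZIP into dest_dir and count images under any images/ folder."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
        z.extractall(dest_dir)
    count = 0
    for root, _dirs, files in os.walk(dest_dir):
        if os.path.basename(root) == 'images':
            count += sum(1 for f in files if os.path.splitext(f)[1].lower() in IMAGE_EXTS)
    return count

# Build a small fake dataset archive in memory (stands in for the downloaded bytes)
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as z:
    z.writestr('data.yaml', 'names: [fire]\nnc: 1\n')
    z.writestr('train/images/a.jpg', b'fake')
    z.writestr('train/labels/a.txt', '0 0.5 0.5 0.2 0.2\n')
    z.writestr('valid/images/b.png', b'fake')

with tempfile.TemporaryDirectory() as tmp:
    n_images = extract_and_count(buf.getvalue(), tmp)
    print(f'{n_images} images extracted')
```

Counting by extension rather than with `images/*` avoids including stray non-image files in the total.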

In [None]:
# ============================================================
# 2C: MERGE all datasets into one unified dataset
# ============================================================
# This is the critical step: remap class IDs, copy images+labels,
# prefix file names per source to avoid name collisions, and create a unified data.yaml

MERGED_DIR = '/content/merged_dataset'
os.makedirs(f'{MERGED_DIR}/train/images', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/train/labels', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/valid/images', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/valid/labels', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/test/images', exist_ok=True)
os.makedirs(f'{MERGED_DIR}/test/labels', exist_ok=True)

# Our target class mapping
TARGET_CLASSES = {0: 'Fighting', 1: 'Fire', 2: 'Gun', 3: 'Knife'}
stats = {'total_images': 0, 'total_labels': 0, 'per_class': Counter(), 'per_source': Counter()}

def copy_dataset(src_dir, prefix, class_remap=None, source_data_yaml=None):
    """Copy images+labels from a YOLOv8 dataset to merged dir.
    
    class_remap: dict mapping source_class_name -> target_class_id
                 If None, assume same class IDs as target.
    source_data_yaml: path to source data.yaml to read class names
    """
    # Read source class names if we need to remap
    src_class_names = {}
    if source_data_yaml and os.path.exists(source_data_yaml):
        with open(source_data_yaml) as f:
            src_yaml = yaml.safe_load(f)
            names = src_yaml.get('names', [])
            if isinstance(names, list):
                src_class_names = {i: n for i, n in enumerate(names)}
            elif isinstance(names, dict):
                src_class_names = names
    
    copied = 0
    for split in ['train', 'valid', 'test']:
        img_dir = os.path.join(src_dir, split, 'images')
        lbl_dir = os.path.join(src_dir, split, 'labels')
        
        if not os.path.isdir(img_dir):
            continue
        
        # Keep each image in its source split; cell 2D re-splits if valid/test end up too small
        target_split = split
        
        for img_file in glob.glob(os.path.join(img_dir, '*')):
            img_name = os.path.basename(img_file)
            stem = Path(img_name).stem
            ext = Path(img_name).suffix
            lbl_file = os.path.join(lbl_dir, f'{stem}.txt')
            
            # New unique name with prefix
            new_name = f'{prefix}_{stem}'
            new_img = os.path.join(MERGED_DIR, target_split, 'images', f'{new_name}{ext}')
            new_lbl = os.path.join(MERGED_DIR, target_split, 'labels', f'{new_name}.txt')
            
            # Skip if no label file
            if not os.path.exists(lbl_file):
                continue
            
            # Remap labels if needed
            if class_remap and src_class_names:
                remapped_lines = []
                with open(lbl_file) as f:
                    for line in f:
                        parts = line.strip().split()
                        if len(parts) < 5:
                            continue
                        src_cls_id = int(parts[0])
                        src_cls_name = src_class_names.get(src_cls_id, '')
                        
                        # Try to find target class ID
                        target_cls = None
                        for src_key, tgt_id in class_remap.items():
                            if src_key.lower() == src_cls_name.lower():
                                target_cls = tgt_id
                                break
                        # No ID-based fallback: a source ID that happens to equal a target ID
                        # says nothing about the class, so unmapped classes are dropped.
                        
                        if target_cls is not None:
                            parts[0] = str(target_cls)
                            remapped_lines.append(' '.join(parts))
                            stats['per_class'][TARGET_CLASSES[target_cls]] += 1
                
                if not remapped_lines:
                    continue  # No valid labels after remap
                
                shutil.copy2(img_file, new_img)
                with open(new_lbl, 'w') as f:
                    f.write('\n'.join(remapped_lines) + '\n')
            else:
                # Direct copy (same class structure)
                shutil.copy2(img_file, new_img)
                shutil.copy2(lbl_file, new_lbl)
                # Count per class
                with open(lbl_file) as f:
                    for line in f:
                        parts = line.strip().split()
                        if parts:
                            cls_id = int(parts[0])
                            cls_name = TARGET_CLASSES.get(cls_id, f'Unknown_{cls_id}')
                            stats['per_class'][cls_name] += 1
            
            copied += 1
    
    return copied


# ── Step 1: Copy original dataset ────────────────────────────
print('Copying original dataset...')
n = copy_dataset('/content/datasets/original', prefix='orig')
stats['per_source']['Original'] = n
stats['total_images'] += n
print(f'  ✅ {n} images from original dataset')

# ── Step 2: Copy additional datasets with class remapping ────
for ds_info in downloaded_paths:
    name = ds_info['name']
    path = ds_info['local_path']
    remap = ds_info['class_map']
    prefix = name.replace(' ', '_').lower()[:15]
    
    # Find data.yaml in downloaded dataset
    data_yaml = os.path.join(path, 'data.yaml')
    
    print(f'\nMerging: {name}...')
    if os.path.exists(data_yaml):
        with open(data_yaml) as f:
            src_yaml = yaml.safe_load(f)
            print(f'  Source classes: {src_yaml.get("names", "?")}')
    
    n = copy_dataset(path, prefix=prefix, class_remap=remap, source_data_yaml=data_yaml)
    stats['per_source'][name] = n
    stats['total_images'] += n
    print(f'  ✅ {n} images merged (remapped to Smart Lens classes)')

print(f'\n{"="*60}')
print(f'MERGE COMPLETE')
print(f'{"="*60}')
print(f'Total images: {stats["total_images"]}')
print(f'\nPer source:')
for src, count in stats['per_source'].items():
    print(f'  {src}: {count}')
print(f'\nPer class (annotations):')
for cls, count in stats['per_class'].most_common():
    print(f'  {cls}: {count}')
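
The name-based remapping inside `copy_dataset` can be illustrated on a single label line. This is a minimal sketch: `remap_label_line` is a hypothetical helper, and the source class names and map below are made up for the example.

```python
def remap_label_line(line, src_names, class_map):
    """Return the label line with its class ID remapped, or None if the class is not wanted.

    src_names: dict of source class ID -> source class name (from the source data.yaml)
    class_map: dict of source class name -> target class ID (matched case-insensitively)
    """
    parts = line.split()
    if len(parts) < 5:          # class ID plus at least 4 box values expected
        return None
    src_name = src_names.get(int(parts[0]), '')
    lookup = {k.lower(): v for k, v in class_map.items()}
    target = lookup.get(src_name.lower())
    if target is None:
        return None
    return ' '.join([str(target)] + parts[1:])

src_names = {0: 'Person', 1: 'pistol'}          # hypothetical source dataset classes
class_map = {'pistol': 2, 'gun': 2}             # hypothetical map onto Gun = 2

kept = remap_label_line('1 0.5 0.5 0.2 0.3', src_names, class_map)
dropped = remap_label_line('0 0.1 0.1 0.05 0.05', src_names, class_map)
print(kept)     # 2 0.5 0.5 0.2 0.3
print(dropped)  # None
```

Annotations whose class name is not in the map are dropped, so an image only survives the merge if at least one of its boxes maps to a Smart Lens class.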

In [None]:
# ============================================================
# 2D: Re-split merged dataset (80/15/5) and create data.yaml
# ============================================================
# Some extra datasets may have put everything in train/.
# Let's do a clean 80/15/5 split of the full merged set.

all_train_imgs = glob.glob(f'{MERGED_DIR}/train/images/*')
all_valid_imgs = glob.glob(f'{MERGED_DIR}/valid/images/*')
all_test_imgs = glob.glob(f'{MERGED_DIR}/test/images/*')

print(f'Before re-split: train={len(all_train_imgs)} valid={len(all_valid_imgs)} test={len(all_test_imgs)}')

# If valid/test are too small, redistribute
total = len(all_train_imgs) + len(all_valid_imgs) + len(all_test_imgs)
target_valid = int(total * 0.15)
target_test = int(total * 0.05)

if len(all_valid_imgs) < target_valid * 0.5:
    print('Redistributing to 80/15/5 split...')
    # Move all back to train first
    for imgs_path in [all_valid_imgs, all_test_imgs]:
        for img in imgs_path:
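            # Build the label path from the stem; str.replace(ext, ...) would also
            # rewrite any earlier occurrence of the extension inside the path.
            stem = Path(img).stem
            lbl = str(Path(img).parent.parent / 'labels' / f'{stem}.txt')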
            shutil.move(img, f'{MERGED_DIR}/train/images/{Path(img).name}')
            if os.path.exists(lbl):
                shutil.move(lbl, f'{MERGED_DIR}/train/labels/{Path(lbl).name}')
    
    # Shuffle and split
    all_imgs = glob.glob(f'{MERGED_DIR}/train/images/*')
    random.seed(42)
    random.shuffle(all_imgs)
    
    valid_imgs = all_imgs[:target_valid]
    test_imgs = all_imgs[target_valid:target_valid + target_test]
    
    for imgs, split in [(valid_imgs, 'valid'), (test_imgs, 'test')]:
        for img in imgs:
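            # Build the label path from the stem; str.replace(ext, ...) would also
            # rewrite any earlier occurrence of the extension inside the path.
            stem = Path(img).stem
            lbl = str(Path(img).parent.parent / 'labels' / f'{stem}.txt')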
            shutil.move(img, f'{MERGED_DIR}/{split}/images/{Path(img).name}')
            if os.path.exists(lbl):
                shutil.move(lbl, f'{MERGED_DIR}/{split}/labels/{Path(lbl).name}')

# Final counts
final_train = len(glob.glob(f'{MERGED_DIR}/train/images/*'))
final_valid = len(glob.glob(f'{MERGED_DIR}/valid/images/*'))
final_test = len(glob.glob(f'{MERGED_DIR}/test/images/*'))
print(f'\nFinal split: train={final_train} valid={final_valid} test={final_test}')
print(f'Total: {final_train + final_valid + final_test} images')

# ── Create data.yaml ─────────────────────────────────────────
data_yaml = {
    'names': ['Fighting', 'Fire', 'Gun', 'Knife'],
    'nc': 4,
    'train': f'{MERGED_DIR}/train/images',
    'val': f'{MERGED_DIR}/valid/images',
    'test': f'{MERGED_DIR}/test/images',
}

data_yaml_path = f'{MERGED_DIR}/data.yaml'
with open(data_yaml_path, 'w') as f:
    yaml.dump(data_yaml, f, default_flow_style=False)

print(f'\n✅ data.yaml created: {data_yaml_path}')
print(yaml.dump(data_yaml, default_flow_style=False))
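
The 80/15/5 split can be sketched on a plain list of file names. The helper `split_80_15_5` and the file names are hypothetical. Unlike the cell above, this version sorts before shuffling and uses integer arithmetic for the split sizes, which keeps the result reproducible regardless of input order.

```python
import random

def split_80_15_5(items, seed=42):
    """Shuffle deterministically and return (train, valid, test) lists."""
    items = sorted(items)               # sort first so the result does not depend on input order
    rng = random.Random(seed)           # local RNG, leaves the global random state untouched
    rng.shuffle(items)
    n_valid = len(items) * 15 // 100
    n_test = len(items) * 5 // 100
    valid = items[:n_valid]
    test = items[n_valid:n_valid + n_test]
    train = items[n_valid + n_test:]
    return train, valid, test

names = [f'img_{i:03d}.jpg' for i in range(200)]   # hypothetical file names
train, valid, test = split_80_15_5(names)
print(len(train), len(valid), len(test))
```

`glob.glob` does not guarantee a stable ordering, so sorting before the seeded shuffle is what makes a re-run produce the same split.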

---
## 🎯 Section 3: Fine-Tune Model from `best.pt`

### Key differences from training from scratch:
- `model = YOLO('best.pt')` instead of `YOLO('yolov8s.pt')`
- **Lower learning rate** (0.0005 vs 0.001): preserves learned features
- **Fewer epochs** (100 vs 200): the model already has a good baseline
- **Less aggressive augmentation**: avoids distorting what the model already knows
- **Freeze early layers** option: locks the backbone so that only the later (head) layers train
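
To make the freeze option concrete: freezing the first N layers amounts to turning off gradient updates for every parameter that belongs to those layers. The sketch below shows the idea on plain parameter names. The names and the `model.<index>.` prefix convention are assumptions for illustration, and this is not the Ultralytics implementation.

```python
def frozen_parameter_names(param_names, n_freeze):
    """Return the names that belong to the first n_freeze layers.

    Assumes each name starts with 'model.<layer_index>.' (a hypothetical convention).
    """
    prefixes = tuple(f'model.{i}.' for i in range(n_freeze))
    return [name for name in param_names if name.startswith(prefixes)]

# Hypothetical parameter names for a 23-layer detector
param_names = [f'model.{i}.conv.weight' for i in range(23)]

frozen = frozen_parameter_names(param_names, n_freeze=10)
trainable = [n for n in param_names if n not in frozen]
print(len(frozen), 'frozen,', len(trainable), 'trainable')
```

The trailing dot in each prefix matters: without it, `model.1` would also match `model.10` through `model.19`.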

In [None]:
# ============================================================
# 3A: Configure fine-tuning parameters
# ============================================================

# ── Choose your strategy ─────────────────────────────────────
# STRATEGY 1: Full fine-tune (recommended - best results)
#   - All layers trainable, low learning rate
#   - Best when you have significant new data (>500 images)
#
# STRATEGY 2: Freeze backbone (faster, less risk of forgetting)
#   - Only train detection head layers
#   - Best when new data is small or very different

STRATEGY = 'full'  # Change to 'freeze_backbone' if needed

# Training config
FINETUNE_CONFIG = {
    # ── Core ──────────────────────────────────────────────────
    'model': MODEL_PATH,           # YOUR trained model, NOT yolov8s.pt
    'data': data_yaml_path,        # Merged dataset
    'epochs': 100,                 # Fewer epochs needed for fine-tune
    'patience': 30,                # Early stopping
    'batch': 16,                   # Adjust if OOM
    'imgsz': 640,
    'device': 0,
    
    # ── Learning Rate (LOWER for fine-tuning) ─────────────────
    'optimizer': 'AdamW',
    'lr0': 0.0005,                 # Half of v1's 0.001
    'lrf': 0.01,
    'weight_decay': 0.0005,
    'warmup_epochs': 3,            # Shorter warmup
    
    # ── Augmentation (LIGHTER for fine-tuning) ────────────────
    'hsv_h': 0.015,
    'hsv_s': 0.5,                  # Reduced from 0.7
    'hsv_v': 0.3,                  # Reduced from 0.4
    'degrees': 5.0,                # Reduced from 10.0
    'translate': 0.15,
    'scale': 0.4,                  # Reduced from 0.5
    'shear': 2.0,                  # Reduced from 5.0
    'flipud': 0.0,
    'fliplr': 0.5,
    'mosaic': 1.0,
    'mixup': 0.1,                  # Reduced from 0.15
    'copy_paste': 0.1,
    'erasing': 0.3,                # Reduced from 0.4
    'close_mosaic': 15,
    
    # ── Backbone freeze (Strategy 2 only) ─────────────────────
    'freeze': 10 if STRATEGY == 'freeze_backbone' else None,
    
    # ── Other ─────────────────────────────────────────────────
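    # Lower dropout for fine-tune. Ultralytics documents dropout as a classification-task
    # setting, so it may have no effect on detection training.
    'dropout': 0.05,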
    'save': True,
    'save_period': 10,
    'plots': True,
    'project': '/content/runs',
    'name': 'smart_lens_v2',
    'exist_ok': True,
}

print(f'Strategy: {STRATEGY}')
print(f'Starting from: {FINETUNE_CONFIG["model"]}')
print(f'Dataset: {FINETUNE_CONFIG["data"]}')
print(f'Epochs: {FINETUNE_CONFIG["epochs"]}')
print(f'Learning rate: {FINETUNE_CONFIG["lr0"]}')
if FINETUNE_CONFIG['freeze']:
    print(f'Frozen layers: first {FINETUNE_CONFIG["freeze"]} layers')
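
With `lr0` and `lrf` set as above, the learning rate ends at `lr0 * lrf`. The sketch below assumes a linear decay from `lr0` down to `lr0 * lrf` over the run and ignores warmup. The actual schedule depends on the Ultralytics version and on settings such as `cos_lr`, so treat the numbers as indicative only; `linear_lr` is a hypothetical helper.

```python
def linear_lr(epoch, epochs, lr0, lrf):
    """Learning rate at a given epoch under an assumed linear decay (warmup ignored)."""
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor

lr0, lrf, epochs = 0.0005, 0.01, 100
for epoch in (0, 50, 100):
    print(epoch, f'{linear_lr(epoch, epochs, lr0, lrf):.2e}')
```

Under this assumption the run starts at 5e-4 and finishes at 5e-6, two orders of magnitude lower.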

In [None]:
# ============================================================
# 3B: START FINE-TUNING
# ============================================================
from ultralytics import YOLO

# Load YOUR trained model (not a pretrained base model)
model = YOLO(FINETUNE_CONFIG['model'])
print(f'\nModel loaded: {FINETUNE_CONFIG["model"]}')
print(f'Task: {model.task}')
print(f'Classes: {model.names}')
print(f'\n--- Starting fine-tuning... ---\n')

start_time = time.time()

# Build training args (remove None values)
train_args = {k: v for k, v in FINETUNE_CONFIG.items() if v is not None and k != 'model'}

results = model.train(**train_args)

training_time = (time.time() - start_time) / 60
print(f'\n✅ Fine-tuning complete! Time: {training_time:.1f} minutes')
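
The argument filtering in the cell above decides whether `freeze` reaches the trainer at all. A minimal sketch with a reduced, hypothetical config shows the effect of each strategy; `build_train_args` and `make_config` are illustrative names.

```python
def build_train_args(config):
    """Drop unset (None) options and the 'model' key, which goes to the constructor instead."""
    return {k: v for k, v in config.items() if v is not None and k != 'model'}

def make_config(strategy):
    # Reduced, hypothetical version of FINETUNE_CONFIG
    return {
        'model': 'best.pt',
        'epochs': 100,
        'lr0': 0.0005,
        'freeze': 10 if strategy == 'freeze_backbone' else None,
    }

full_args = build_train_args(make_config('full'))
frozen_args = build_train_args(make_config('freeze_backbone'))
print(full_args)
print(frozen_args)
```

With `STRATEGY = 'full'` the `freeze` key is omitted entirely, so the trainer falls back to its own default rather than receiving an explicit `None`.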

---
## 📊 Section 4: Evaluate & Compare v1 vs v2

In [None]:
# ============================================================
# 4A: Validate the fine-tuned model
# ============================================================
from ultralytics import YOLO

# Load best fine-tuned model
best_v2_path = '/content/runs/smart_lens_v2/weights/best.pt'
model_v2 = YOLO(best_v2_path)

# Validate on merged validation set
metrics_v2 = model_v2.val(data=data_yaml_path, split='val')

print(f'\n{"="*60}')
print(f'  Fine-tuned Model (v2) Metrics')
print(f'{"="*60}')
print(f'  mAP50:     {metrics_v2.box.map50:.4f}')
print(f'  mAP50-95:  {metrics_v2.box.map:.4f}')
print(f'  Precision: {metrics_v2.box.mp:.4f}')
print(f'  Recall:    {metrics_v2.box.mr:.4f}')
print(f'\n  Per-class AP50:')
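# ap50 is ordered by ap_class_index (classes present in the validation set), not by class ID
for i, cls_id in enumerate(metrics_v2.box.ap_class_index):
    cls_name = TARGET_CLASSES.get(int(cls_id), f'Unknown_{cls_id}')
    print(f'    {cls_name:10s}: {metrics_v2.box.ap50[i]:.4f}')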
print(f'{"="*60}')
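
Per-class AP values need to be paired with the right class names. The sketch below assumes the metrics object reports AP only for classes that appear in the validation set, together with an index array giving the class ID of each entry (as `ap_class_index` does in Ultralytics). The values are made up and `per_class_ap` is a hypothetical helper.

```python
def per_class_ap(ap_values, ap_class_index, class_names):
    """Pair each AP value with the class it belongs to.

    ap_values[i] belongs to class ID ap_class_index[i]; classes with no
    validation instances simply do not appear.
    """
    return {class_names[int(c)]: float(ap) for c, ap in zip(ap_class_index, ap_values)}

class_names = {0: 'Fighting', 1: 'Fire', 2: 'Gun', 3: 'Knife'}
# Hypothetical result where 'Fire' had no validation instances
ap_class_index = [0, 2, 3]
ap_values = [0.81, 0.64, 0.70]

table = per_class_ap(ap_values, ap_class_index, class_names)
for name, ap in table.items():
    print(f'{name:10s}: {ap:.4f}')
```

Indexing the AP array by position alone would label the second entry as Fire here, even though it belongs to Gun.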

In [None]:
# ============================================================
# 4B: Compare v1 vs v2 side by side
# ============================================================

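# v1 metrics (from your first training)
# If these were measured on the original validation set, the comparison with v2
# (validated on the merged set) is only indicative.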
v1_metrics = {
    'mAP50': 0.7255,
    'mAP50_95': 0.3322,
    'precision': 0.8410,
    'recall': 0.6213,
}

v2_metrics = {
    'mAP50': metrics_v2.box.map50,
    'mAP50_95': metrics_v2.box.map,
    'precision': metrics_v2.box.mp,
    'recall': metrics_v2.box.mr,
}

print(f'\n{"="*60}')
print(f'{"Metric":<15} {"v1 (Original)":>15} {"v2 (Fine-tuned)":>15} {"Change":>10}')
print(f'{"─"*60}')
for key in ['mAP50', 'mAP50_95', 'precision', 'recall']:
    v1 = v1_metrics[key]
    v2 = v2_metrics[key]
    delta = v2 - v1
    arrow = '↑' if delta > 0 else '↓' if delta < 0 else '→'
    color_delta = f'{arrow} {abs(delta):.4f}'
    print(f'{key:<15} {v1:>15.4f} {v2:>15.4f} {color_delta:>10}')
print(f'{"="*60}')

if v2_metrics['mAP50'] > v1_metrics['mAP50']:
    print('\n🎉 v2 is BETTER than v1! Improvement achieved.')
else:
    print('\n⚠️ v2 did not improve over v1. Consider:')
    print('  - Adding more diverse data')
    print('  - Training for more epochs')
    print('  - Using freeze_backbone strategy')

---
## 📦 Section 5: Export & Save Model

In [None]:
# ============================================================
# 5A: Export to ONNX + Save to Drive
# ============================================================
import json

# Export ONNX
model_v2.export(format='onnx', simplify=True)
print('✅ ONNX export complete')

# Create output directory
timestamp = datetime.now().strftime('%Y%m%d_%H%M')
output_dir = f'{DRIVE_DIR}/models/smart_lens_v2_{timestamp}'
os.makedirs(output_dir, exist_ok=True)

# Copy model files
shutil.copy(best_v2_path, f'{output_dir}/best.pt')
shutil.copy(best_v2_path.replace('.pt', '.onnx'), f'{output_dir}/best.onnx')
shutil.copy('/content/runs/smart_lens_v2/weights/last.pt', f'{output_dir}/last.pt')

# Save metrics
metrics_data = {
    'mAP50': float(metrics_v2.box.map50),
    'mAP50_95': float(metrics_v2.box.map),
    'precision': float(metrics_v2.box.mp),
    'recall': float(metrics_v2.box.mr),
    'training_time_min': round(training_time, 1),
    'timestamp': timestamp,
    'base_model': 'smart_lens_v1 (fine-tuned)',
    'strategy': STRATEGY,
    'total_images': stats['total_images'],
    'classes': list(TARGET_CLASSES.values()),
    'per_class_annotations': dict(stats['per_class']),
    'v1_comparison': {
        'mAP50_delta': float(v2_metrics['mAP50'] - v1_metrics['mAP50']),
        'recall_delta': float(v2_metrics['recall'] - v1_metrics['recall']),
    }
}
with open(f'{output_dir}/metrics.json', 'w') as f:
    json.dump(metrics_data, f, indent=2)

# Save training config
with open(f'{output_dir}/training_config.json', 'w') as f:
    json.dump(FINETUNE_CONFIG, f, indent=2, default=str)

print(f'\n✅ All files saved to Google Drive:')
print(f'   {output_dir}/')
for f in os.listdir(output_dir):
    size = os.path.getsize(f'{output_dir}/{f}') / 1024 / 1024
    print(f'   ├── {f} ({size:.1f} MB)')
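
The training config is saved with `default=str`, which tells `json` to stringify any value it cannot encode natively. A small sketch with a hypothetical config shows what comes back when the file is read again.

```python
import json
from pathlib import PurePosixPath

config = {
    'data': PurePosixPath('/content/merged_dataset/data.yaml'),  # not JSON-serializable by default
    'epochs': 100,
    'freeze': None,
}

# default=str is only called for types json cannot encode itself
text = json.dumps(config, indent=2, default=str)
restored = json.loads(text)
print(restored)
```

Native types survive the round trip (`None` becomes `null` and comes back as `None`), while the path comes back as a plain string.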

In [None]:
# ============================================================
# 5B: Download model files to local machine
# ============================================================
from google.colab import files

# Download best.pt
files.download(f'{output_dir}/best.pt')
print('Download best.pt, then place it in:')
print('  Smart-Lens-FYP/trained_models/smart_lens_v2_XXXX/best.pt')

# Optionally download ONNX too
# files.download(f'{output_dir}/best.onnx')

---
## 💡 Section 6: Quick Test on Sample Images

In [None]:
# ============================================================
# 6: Quick visual test on test images
# ============================================================
import matplotlib.pyplot as plt
from PIL import Image

test_images = glob.glob(f'{MERGED_DIR}/test/images/*')[:8]

fig, axes = plt.subplots(2, 4, figsize=(20, 10))
axes = axes.flatten()

for i, img_path in enumerate(test_images):
    results = model_v2.predict(img_path, conf=0.4, verbose=False)
    annotated = results[0].plot()
    axes[i].imshow(annotated[:, :, ::-1])  # BGR to RGB
    axes[i].set_title(Path(img_path).stem[:30], fontsize=8)
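    axes[i].axis('off')

# Hide any unused panels when fewer than 8 test images exist
for ax in axes[len(test_images):]:
    ax.axis('off')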

plt.suptitle('Smart Lens v2: Test Predictions', fontsize=16)
plt.tight_layout()
plt.savefig(f'{output_dir}/test_predictions.png', dpi=150)
plt.show()
print('Test predictions saved to Drive.')

---
## ✅ Done! Next Steps

1. **Download `best.pt`** from the output above (or from Google Drive)
2. **Place it** in `Smart-Lens-FYP/trained_models/smart_lens_v2_XXXX/`
3. **Test locally** with:
   ```bash
   python smart_lens_v2.py --source 0 --model trained_models/smart_lens_v2_XXXX/best.pt
   ```

### If results still need improvement:
- **Add more data**: Search Roboflow Universe for more gun/fighting datasets
- **Run this notebook again**: Upload the new `best.pt` from v2 and fine-tune to v3
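- **Try YOLOv8m**: A larger base model is typically more accurate but slower (the gain depends on the dataset). If v1 was trained from `yolov8s.pt`, this `best.pt` cannot be reused, so it means training again from `yolov8m.pt`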
- **Increase epochs**: Set to 150-200 if the model is still improving at epoch 100