# 🔧 Workplace Inspection AI - YOLO Training

Training notebook for Google Colab

**Steps:**
1. Set up the environment
2. Upload the dataset
3. Prepare the data
4. Train the YOLO model
5. Evaluate the results
6. Download the trained model
7. Download all training results (optional)

**⚠️ IMPORTANT: Set the runtime to GPU!**
- Runtime → Change runtime type → GPU (T4)

## 1️⃣ Set Up the Environment

In [None]:
# Check GPU
!nvidia-smi

In [None]:
# Install dependencies
!pip install ultralytics opencv-python pillow -q

In [None]:
# Imports
import os
import shutil
from pathlib import Path
import random
from ultralytics import YOLO
import torch

print(f"✅ PyTorch version: {torch.__version__}")
print(f"✅ CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")

## 2️⃣ Upload the Dataset

**Option A: From Google Drive**
1. Upload your dataset folder to Google Drive
2. Mount Drive in the cell below

**Option B: Direct upload (small datasets)**
- Use the file upload cell below
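
To decide between the two options, it can help to measure the dataset folder on your own machine before uploading. The sketch below is only an illustration: `folder_size_mb` and `suggest_upload_option` are hypothetical helper names, and the 100 MB default is taken from the comment in the Option B cell, not from any Colab limit.

```python
from pathlib import Path


def folder_size_mb(folder):
    """Return the total size of all files under `folder` in megabytes."""
    total_bytes = sum(p.stat().st_size for p in Path(folder).rglob('*') if p.is_file())
    return total_bytes / (1024 * 1024)


def suggest_upload_option(folder, zip_limit_mb=100):
    """Suggest 'B' (ZIP upload) for small folders, otherwise 'A' (Google Drive).

    The 100 MB default is an assumption copied from the Option B comment.
    """
    return 'B' if folder_size_mb(folder) < zip_limit_mb else 'A'
```

For example, `suggest_upload_option('AI afbeeldingen')` run next to your local dataset folder returns `'A'` or `'B'`. Photos compress very little, so the ZIP will be about as large as the folder itself.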

In [None]:
# OPTION A: Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# Adjust this to the dataset location in your Drive
DATASET_SOURCE = '/content/drive/MyDrive/AI afbeeldingen'

# Copy to the Colab disk (faster than reading from Drive during training)
!cp -r "{DATASET_SOURCE}" /content/dataset_raw
print("✅ Dataset copied to Colab")

In [None]:
# OPTION B: ZIP upload (if your dataset is smaller than 100 MB)
# 1. Zip your 'AI afbeeldingen' folder locally
# 2. Upload it here:

from google.colab import files
import zipfile

print("Upload your dataset.zip...")
uploaded = files.upload()

# Unzip every uploaded ZIP archive
for filename in uploaded.keys():
    if filename.lower().endswith('.zip'):
        with zipfile.ZipFile(filename, 'r') as zip_ref:
            zip_ref.extractall('/content/dataset_raw')
        print(f"✅ {filename} extracted")

In [None]:
# Check dataset
!ls -la /content/dataset_raw
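
Besides the plain listing, a per-folder image count makes it easier to spot empty or misnamed class folders before preprocessing. This is a minimal sketch under assumptions: `count_images_per_folder` is a hypothetical helper, and the extension set mirrors the one used in the preprocessing cell.

```python
from pathlib import Path

# Same extensions as the preprocessing step accepts
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp'}


def count_images_per_folder(root):
    """Map every folder under `root` that directly contains images to its image count."""
    counts = {}
    for path in Path(root).rglob('*'):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            # Key is the folder path relative to root, e.g. 'Afbeeldingen OK'
            key = str(path.parent.relative_to(root))
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Calling `count_images_per_folder('/content/dataset_raw')` returns a dictionary you can print or compare against the folder names in `CLASS_MAPPING`.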

## 3️⃣ Prepare the Dataset for YOLO

In [None]:
# Dataset configuration
RAW_DATA_DIR = Path("/content/dataset_raw")
OUTPUT_DIR = Path("/content/yolo_dataset")
TRAIN_SPLIT = 0.8

# Class mapping - ADJUST this if your folders have different names!
CLASS_MAPPING = {
    "Afbeeldingen OK": 0,
    "Afbeeldingen NOK alles weg": 1,
    "Afbeeldingen NOK hamer weg": 2,
    "Afbeeldingen NOK schaar weg": 3,
    "Afbeeldingen NOK schaar en sleutel weg": 4,
    "Afbeeldingen NOK sleutel weg": 5
}

CLASS_NAMES = [
    "ok",
    "nok_alles_weg",
    "nok_hamer_weg",
    "nok_schaar_weg",
    "nok_schaar_sleutel_weg",
    "nok_sleutel_weg"
]
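
The ids in `CLASS_MAPPING` are used as indices into `CLASS_NAMES`, so the two must stay in sync when you edit them. The check below is a small sketch of how that could be verified; `validate_class_config` is a hypothetical helper, not part of the notebook or of Ultralytics.

```python
def validate_class_config(class_mapping, class_names):
    """Return a list of problems; an empty list means the config is consistent."""
    problems = []
    ids = sorted(class_mapping.values())
    # Every id must be a distinct, valid index into class_names
    if ids != list(range(len(class_names))):
        problems.append(
            f"class ids {ids} do not match indices 0..{len(class_names) - 1} of the names list"
        )
    if len(set(class_names)) != len(class_names):
        problems.append("class names are not unique")
    return problems
```

Running `validate_class_config(CLASS_MAPPING, CLASS_NAMES)` right after the configuration cell should return an empty list for the values shown above.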

In [None]:
# Dataset preparation function
def create_yolo_dataset():
    """Convert the raw folders into the Ultralytics classification layout.

    Ultralytics classification models read the class of an image from its
    parent folder (<split>/<class_name>/<image>). They do not use per-image
    label files or a data.yaml, so images are copied into class folders.
    """

    # Create the split directories; class folders are created while copying
    for split in ['train', 'val']:
        (OUTPUT_DIR / split).mkdir(parents=True, exist_ok=True)

    print("✅ Directory structure created\n")

    # Collect all images
    all_images = []
    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp'}

    for folder_name, class_id in CLASS_MAPPING.items():
        # Try the possible locations of the class folder
        possible_paths = [
            RAW_DATA_DIR / folder_name,
            RAW_DATA_DIR / "AI afbeeldingen" / folder_name,
        ]

        folder_path = None
        for path in possible_paths:
            if path.exists():
                folder_path = path
                break

        if not folder_path:
            print(f"⚠️  Folder not found: {folder_name}")
            continue

        # Sort so the seeded shuffle below is reproducible
        images = sorted(
            file for file in folder_path.glob('*')
            if file.suffix.lower() in image_extensions
        )

        print(f"✅ {folder_name}: {len(images)} images (class {class_id})")

        for img_path in images:
            all_images.append((img_path, class_id))

    if not all_images:
        raise FileNotFoundError(f"No images found under {RAW_DATA_DIR}")

    # Shuffle and split
    random.seed(42)
    random.shuffle(all_images)

    split_idx = int(len(all_images) * TRAIN_SPLIT)
    train_images = all_images[:split_idx]
    val_images = all_images[split_idx:]

    print(f"\n✅ Dataset split: {len(train_images)} train, {len(val_images)} val\n")

    # Copy every image into the folder of its class
    for split_name, image_list in [('train', train_images), ('val', val_images)]:
        for idx, (img_path, class_id) in enumerate(image_list):
            class_name = CLASS_NAMES[class_id]
            class_dir = OUTPUT_DIR / split_name / class_name
            class_dir.mkdir(parents=True, exist_ok=True)

            # New, unique file name
            new_name = f"{split_name}_{class_name}_{idx}{img_path.suffix}"
            shutil.copy2(img_path, class_dir / new_name)

    print("✅ Dataset processed!")

    # Write data.yaml as a human-readable summary only; training reads the folders.
    # Note: the trained model takes its class indices from the alphabetically
    # sorted folder names, so they can differ from the ids in CLASS_MAPPING.
    yaml_content = f"""# Workplace Inspection Dataset
path: {OUTPUT_DIR.absolute()}
train: train
val: val

# Classes
nc: {len(CLASS_NAMES)}
names: {CLASS_NAMES}

# Task type
task: classify
"""

    yaml_path = OUTPUT_DIR / 'data.yaml'
    with open(yaml_path, 'w') as f:
        f.write(yaml_content)

    print(f"✅ data.yaml written: {yaml_path}")

    # Classification training expects the dataset root directory as `data`
    return OUTPUT_DIR

# Run preprocessing
print("🚀 Starting dataset preprocessing...\n")
# The variable keeps the name data_yaml_path because the training cell refers
# to it; it now holds the dataset root directory instead of a YAML path.
data_yaml_path = create_yolo_dataset()
print("\n✅ Dataset ready for training!")
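
The shuffle-and-split step above splits all images in one pool, so with few photos a rare class can end up entirely in the training set or entirely in the validation set. A per-class (stratified) split avoids that. The sketch below is an alternative, not what the notebook does: `stratified_split` is a hypothetical helper that takes `(path, class_id)` pairs and is not wired into `create_yolo_dataset`.

```python
import random


def stratified_split(samples, train_fraction=0.8, seed=42):
    """Split (path, class_id) pairs so each class is divided by `train_fraction`.

    Classes with at least two samples keep at least one sample in each split.
    """
    by_class = {}
    for path, class_id in samples:
        by_class.setdefault(class_id, []).append(path)

    rng = random.Random(seed)
    train, val = [], []
    for class_id in sorted(by_class):
        paths = sorted(by_class[class_id])  # sort first so the shuffle is reproducible
        rng.shuffle(paths)
        cut = int(len(paths) * train_fraction)
        if len(paths) >= 2:
            cut = min(max(cut, 1), len(paths) - 1)
        train += [(p, class_id) for p in paths[:cut]]
        val += [(p, class_id) for p in paths[cut:]]
    return train, val
```

With 10 images of one class and 5 of another, this yields 8 + 4 training images and 2 + 1 validation images, so both classes appear in both splits.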

In [None]:
# Check the result
!ls -la /content/yolo_dataset/train
!echo "---"
!cat /content/yolo_dataset/data.yaml

## 4️⃣ Train the YOLO Model

Now we train the model. This takes roughly 15-20 minutes on a GPU, depending on the size of the dataset.

In [None]:
# Training configuration
EPOCHS = 100          # Number of training epochs
BATCH_SIZE = 16       # Batch size (increase if you have plenty of GPU memory)
IMAGE_SIZE = 640      # Input image size in pixels
MODEL_SIZE = 'n'      # 'n' (nano), 's' (small), 'm' (medium) - nano is the fastest

print("🎯 Training configuration:")
print(f"   Epochs: {EPOCHS}")
print(f"   Batch size: {BATCH_SIZE}")
print(f"   Image size: {IMAGE_SIZE}")
print(f"   Model: YOLOv8{MODEL_SIZE}-cls")

In [None]:
# Load the pretrained YOLO classification model
model = YOLO(f'yolov8{MODEL_SIZE}-cls.pt')
print("✅ YOLO model loaded")

In [None]:
# START TRAINING! 🚀
print("\n" + "="*60)
print("🚀 START TRAINING")
print("="*60 + "\n")

results = model.train(
    data=str(data_yaml_path),  # dataset location returned by create_yolo_dataset()
    epochs=EPOCHS,
    batch=BATCH_SIZE,
    imgsz=IMAGE_SIZE,
    device=0 if torch.cuda.is_available() else 'cpu',  # GPU 0, CPU as fallback
    project='runs/classify',
    name='werkplek_inspect',
    exist_ok=True,
    patience=20,  # Early stopping after 20 epochs without improvement
    save=True,
    plots=True,
    verbose=True,
    val=True
)

print("\n" + "="*60)
print("✅ TRAINING COMPLETE!")
print("="*60)

## 5️⃣ Evaluation & Results

In [None]:
# Evaluate on the validation set
print("📊 Evaluating on the validation set...\n")
metrics = model.val()

print("\n" + "="*60)
print("📈 RESULTS")
print("="*60)
print(f"Top-1 Accuracy: {metrics.top1:.2%}")
print(f"Top-5 Accuracy: {metrics.top5:.2%}")
print("="*60)
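
To make the two numbers above concrete: top-k accuracy is the fraction of images whose true class is among the k classes with the highest predicted probability. The sketch below computes it by hand on made-up probability vectors. `top_k_accuracy` is a hypothetical helper for illustration and is not how Ultralytics computes its metrics internally.

```python
def top_k_accuracy(probabilities, true_labels, k=1):
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    hits = 0
    for probs, label in zip(probabilities, true_labels):
        # Class indices ordered from most to least likely
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Illustrative values only: three images, six classes
example_probs = [
    [0.70, 0.10, 0.10, 0.05, 0.03, 0.02],
    [0.10, 0.20, 0.60, 0.05, 0.03, 0.02],
    [0.02, 0.03, 0.05, 0.10, 0.20, 0.60],
]
example_labels = [0, 1, 0]
```

With only six classes, a top-5 prediction misses only when the true class is ranked last, so top-5 accuracy will be close to 100% almost by construction. Top-1 accuracy is the more informative number for this dataset.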

In [None]:
# View the training plots
from IPython.display import Image, display
import os

results_dir = 'runs/classify/werkplek_inspect'

print("📊 Training plots:\n")

plots = [
    'results.png',
    'confusion_matrix.png',
    'val_batch0_pred.jpg'
]

for plot in plots:
    plot_path = os.path.join(results_dir, plot)
    if os.path.exists(plot_path):
        print(f"\n{plot}:")
        display(Image(filename=plot_path, width=800))
    else:
        print(f"⚠️  {plot} not found")

In [None]:
# Test on a few examples
import glob

print("🧪 Testing on a few validation images:\n")

# Load the best model
best_model = YOLO('runs/classify/werkplek_inspect/weights/best.pt')

# Take 5 validation images (searched recursively, any supported extension)
val_images = [
    p for p in glob.glob('/content/yolo_dataset/val/**/*', recursive=True)
    if p.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp'))
][:5]

for img_path in val_images:
    # Separate name so the training `results` object is not overwritten
    predictions = best_model(img_path)

    for result in predictions:
        top_class = result.probs.top1
        confidence = result.probs.top1conf.item()
        class_name = result.names[top_class]

        print(f"📸 {os.path.basename(img_path)}")
        print(f"   Prediction: {class_name}")
        print(f"   Confidence: {confidence:.2%}\n")

        # Show the image that belongs to the prediction
        display(Image(filename=img_path, width=400))

## 6️⃣ Download the Trained Model

In [None]:
# Copy the best model
best_model_path = 'runs/classify/werkplek_inspect/weights/best.pt'
output_model_path = '/content/werkplek_classifier.pt'

!cp {best_model_path} {output_model_path}

# Check the file size
import os
size_mb = os.path.getsize(output_model_path) / (1024 * 1024)
print(f"✅ Model saved: {output_model_path}")
print(f"   File size: {size_mb:.1f} MB")

In [None]:
# Download the model to your computer
from google.colab import files

print("⬇️  Downloading model...")
files.download(output_model_path)
print("✅ Model download started!")
print("\n📁 Place the file in: backend/models/werkplek_classifier.pt")

In [None]:
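# OPTIONAL: Also save the model to Google Drive
drive_output = '/content/drive/MyDrive/werkplek_classifier.pt'

# A failing shell command does not raise a Python exception, so check for
# the mounted Drive folder explicitly instead of wrapping !cp in try/except.
if os.path.isdir(os.path.dirname(drive_output)):
    shutil.copy2(output_model_path, drive_output)
    print(f"✅ Model also saved to Google Drive: {drive_output}")
else:
    print("⚠️  Google Drive is not mounted, skipping the save to Drive")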

## 7️⃣ Download All Training Results (Optional)

In [None]:
# Zip all results for download
!zip -r training_results.zip runs/classify/werkplek_inspect/

print("⬇️  Downloading training results...")
files.download('training_results.zip')
print("✅ Training results download started!")

## ✅ Done!

**Next steps:**

1. ✅ Download `werkplek_classifier.pt`
2. 📁 Place it in: `backend/models/werkplek_classifier.pt`
3. 🚀 Start the backend: `cd backend && python main.py`
4. 🌐 Start the frontend: `cd frontend && npm start`
5. 🎉 Test your application!

**Model performance:**
- Check the plots above
- Top-1 accuracy is your most important metric
- The confusion matrix shows which classes the model struggles with
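
One way to read a confusion matrix is to turn it into per-class recall: of all images that truly belong to a class, which fraction did the model label correctly? The sketch below assumes a matrix of raw counts in which rows are true classes and columns are predicted classes. Check the axis labels of the plotted matrix and transpose it first if it uses the opposite orientation. `per_class_recall` is a hypothetical helper.

```python
def per_class_recall(confusion, class_names):
    """Return {class_name: recall}; assumes rows = true classes, columns = predictions."""
    recalls = {}
    for i, name in enumerate(class_names):
        row_total = sum(confusion[i])  # number of images that truly belong to class i
        # None marks classes with no validation images at all
        recalls[name] = confusion[i][i] / row_total if row_total else None
    return recalls
```

A class with clearly lower recall than the others is a good candidate for collecting more photos.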

**Tips for improvement:**
- 📸 Collect more photos (50-100 per class)
- 🔄 Try data augmentation
- 📈 Train longer (more epochs)
- 🤖 Use a larger model (yolov8s-cls instead of yolov8n-cls)