# YOLO Tree Detection Training - Vineyard Robot

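This notebook trains a YOLOv8 model to detect the parts of one tree type at a time:
- **Vineyard**: trunk, canopy, grape clusters
- **Tangerine**: trunk, canopy, fruit

Each model also has a `post` class, so fence posts and stakes are not mistaken for trunks.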

## Steps:
1. Install dependencies
2. Upload and organize your labeled images
3. Train the model
4. Download trained weights for Raspberry Pi

In [None]:
# Cell 1: Install YOLOv8 (Ultralytics)
!pip install ultralytics
!pip install roboflow  # Optional: for easier dataset management

from ultralytics import YOLO
import os
import shutil
from google.colab import files, drive
import yaml

print("Ultralytics installed successfully!")

In [None]:
# Cell 2: Mount Google Drive (to access your photo folders)
drive.mount('/content/drive')

# Check your folders - update these paths to match your Drive structure
# Example: /content/drive/MyDrive/TreePhotos/vineyard
#          /content/drive/MyDrive/TreePhotos/tangerine

print("\nLooking for image folders...")
!ls -la /content/drive/MyDrive/

In [None]:
# Cell 3: Configure your dataset
# ================================
# EDIT THESE SETTINGS:

# Choose which tree type to train
TREE_TYPE = "vineyard"  # or "tangerine"

# Path to your images in Google Drive
# These should be folders containing your photos
IMAGES_PATH = f"/content/drive/MyDrive/TreePhotos/{TREE_TYPE}"

# Classes to detect (adjust based on your tree type)
if TREE_TYPE == "vineyard":
    CLASSES = ['vineyard_trunk', 'vineyard_canopy', 'grape_cluster', 'post']
elif TREE_TYPE == "tangerine":
    CLASSES = ['tangerine_trunk', 'tangerine_canopy', 'tangerine_fruit', 'post']
else:
    CLASSES = ['trunk', 'canopy', 'fruit', 'post']

print(f"Training for: {TREE_TYPE}")
print(f"Classes: {CLASSES}")
print(f"Images path: {IMAGES_PATH}")

## Option A: Use Roboflow for Easy Labeling (Recommended)

1. Go to [Roboflow](https://roboflow.com) and create a free account
2. Create a new project and upload your images
3. Label images by drawing bounding boxes around trunks, canopy, fruit, and posts
4. Export in "YOLOv8" format and copy the download code

In [None]:
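# Cell 4A: Download dataset from Roboflow (if using Roboflow)
# Uncomment the lines below and replace them with your own Roboflow export snippet:

# from roboflow import Roboflow
# rf = Roboflow(api_key="YOUR_API_KEY")
# project = rf.workspace("your-workspace").project("your-project")
# dataset = project.version(1).download("yolov8")
# DATASET_PATH = dataset.location

# Note: a Roboflow YOLOv8 export normally ships its own data.yaml, so Cells 4B-6
# can be skipped. In that case (assuming the usual export layout), set:
# yaml_path = f"{DATASET_PATH}/data.yaml"

print("Uncomment the code above and paste your Roboflow export code")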

## Option B: Manual Labeling with LabelImg

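If you've already labeled images with LabelImg:
1. Each image should have a matching `.txt` annotation file with the same base name
2. Each line of a `.txt` file describes one box: `class_id x_center y_center width height`, with the four coordinates normalized to 0-1 by the image width and height
3. Upload them to Google Drive, images under `TreePhotos/<tree type>/images` and labels under `TreePhotos/<tree type>/labels` (the paths Cell 4B expects)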
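
To make the annotation format concrete, here is a minimal sketch that converts a pixel-space box into one YOLO label line. The helper name `to_yolo_line` and the example numbers are illustrative assumptions; LabelImg writes these lines for you when it is set to YOLO format.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box (corner coordinates) to one YOLO label line.

    Hypothetical helper for illustration only.
    """
    # Box center and size, each divided by the image dimension to land in 0-1
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A trunk (class 0) spanning x 100-300 and y 200-440 in a 640x480 photo
print(to_yolo_line(0, 100, 200, 300, 440, 640, 480))
# prints: 0 0.312500 0.666667 0.312500 0.500000
```

The `class_id` is the zero-based position of the class in the `CLASSES` list from Cell 3.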

In [None]:
# Cell 4B: Setup dataset from manually labeled images

# Create dataset directory structure
DATASET_PATH = "/content/dataset"
os.makedirs(f"{DATASET_PATH}/images/train", exist_ok=True)
os.makedirs(f"{DATASET_PATH}/images/val", exist_ok=True)
os.makedirs(f"{DATASET_PATH}/labels/train", exist_ok=True)
os.makedirs(f"{DATASET_PATH}/labels/val", exist_ok=True)

# Path to your labeled images and annotations in Drive
# Images: .jpg, .png files
# Labels: .txt files with same name as images

SOURCE_IMAGES = f"/content/drive/MyDrive/TreePhotos/{TREE_TYPE}/images"
SOURCE_LABELS = f"/content/drive/MyDrive/TreePhotos/{TREE_TYPE}/labels"

# Check if paths exist
print(f"Looking for images in: {SOURCE_IMAGES}")
print(f"Looking for labels in: {SOURCE_LABELS}")

if os.path.exists(SOURCE_IMAGES):
    images = [f for f in os.listdir(SOURCE_IMAGES) if f.lower().endswith(('.jpg', '.jpeg', '.png'))]
    print(f"Found {len(images)} images")
else:
    print("Images folder not found! Update SOURCE_IMAGES path.")
    images = []

if os.path.exists(SOURCE_LABELS):
    labels = [f for f in os.listdir(SOURCE_LABELS) if f.endswith('.txt')]
    print(f"Found {len(labels)} label files")
else:
    print("Labels folder not found! Update SOURCE_LABELS path.")
    labels = []

In [None]:
# Cell 5: Split data into train/val and copy to dataset folder
import random

if images:
    # Shuffle reproducibly and split 80/20 (sort first: os.listdir order is arbitrary)
    images.sort()
    random.Random(42).shuffle(images)
    split_idx = int(len(images) * 0.8)
    train_images = images[:split_idx]
    val_images = images[split_idx:]

    print(f"Train: {len(train_images)} images")
    print(f"Val: {len(val_images)} images")

    # Copy files
    for img in train_images:
        shutil.copy(f"{SOURCE_IMAGES}/{img}", f"{DATASET_PATH}/images/train/{img}")
        label = img.rsplit('.', 1)[0] + '.txt'
        if os.path.exists(f"{SOURCE_LABELS}/{label}"):
            shutil.copy(f"{SOURCE_LABELS}/{label}", f"{DATASET_PATH}/labels/train/{label}")

    for img in val_images:
        shutil.copy(f"{SOURCE_IMAGES}/{img}", f"{DATASET_PATH}/images/val/{img}")
        label = img.rsplit('.', 1)[0] + '.txt'
        if os.path.exists(f"{SOURCE_LABELS}/{label}"):
            shutil.copy(f"{SOURCE_LABELS}/{label}", f"{DATASET_PATH}/labels/val/{label}")

    print("\nDataset prepared!")
else:
    print("No images found. Please check your paths.")

In [None]:
# Cell 6: Create dataset.yaml config file

dataset_config = {
    'path': DATASET_PATH,
    'train': 'images/train',
    'val': 'images/val',
    'names': {i: name for i, name in enumerate(CLASSES)}
}

yaml_path = f"{DATASET_PATH}/dataset.yaml"
with open(yaml_path, 'w') as f:
    yaml.dump(dataset_config, f, default_flow_style=False)

print(f"Created {yaml_path}:")
print("="*40)
with open(yaml_path, 'r') as f:
    print(f.read())

## If You Haven't Labeled Yet - Quick Labeling Guide

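Use this cell to generate a `classes.txt` file and print labeling instructions. LabelImg is a desktop GUI tool, so the labeling itself is done on your own computer, not in Colab: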

In [None]:
# Cell 7: Create a classes.txt file for LabelImg
# Download this and use it when labeling locally with LabelImg

classes_txt = "\n".join(CLASSES)
with open('/content/classes.txt', 'w') as f:
    f.write(classes_txt)

print("Classes for labeling:")
print("="*40)
for i, c in enumerate(CLASSES):
    print(f"  {i}: {c}")

print("\n" + "="*40)
print("LABELING INSTRUCTIONS:")
print("="*40)
print("1. Download LabelImg: pip install labelImg")
print("2. Run: labelImg")
print("3. Open Dir -> select your images folder")
print("4. Change Save Dir -> same folder or /labels subfolder")
print("5. Use 'YOLO' format (not PascalVOC)")
print("6. Draw boxes around: trunks, canopy, fruit, posts")
print("7. Press 'w' to draw box, select class, save (Ctrl+S)")
print("8. Upload labeled images+txt files to Google Drive")

files.download('/content/classes.txt')

In [None]:
# Cell 8: Train YOLOv8 Model
# ==========================

# Load pretrained YOLOv8 nano model (fastest, good for Raspberry Pi)
# Options: yolov8n.pt (nano), yolov8s.pt (small), yolov8m.pt (medium)
model = YOLO('yolov8n.pt')  # Using nano for Raspberry Pi speed

# Train the model
results = model.train(
    data=yaml_path,
    epochs=100,           # Number of training epochs (adjust as needed)
    imgsz=640,            # Image size
    batch=16,             # Batch size (reduce if out of memory)
    patience=20,          # Early stopping patience
    save=True,
    project='/content/runs',
    name=f'{TREE_TYPE}_detector',
    exist_ok=True,
    pretrained=True,
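    optimizer='auto',     # 'auto' lets Ultralytics pick the optimizer and learning rate; lr0 and momentum below are then ignored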
    verbose=True,
    seed=42,
    deterministic=True,
    single_cls=False,
    rect=False,
    cos_lr=True,          # Cosine learning rate
    close_mosaic=10,      # Disable mosaic last 10 epochs
    resume=False,
    amp=True,             # Mixed precision training
    fraction=1.0,
    profile=False,
    freeze=None,
    lr0=0.01,             # Initial learning rate
    lrf=0.01,             # Final learning rate factor
    momentum=0.937,
    weight_decay=0.0005,
    warmup_epochs=3.0,
    warmup_momentum=0.8,
    warmup_bias_lr=0.1,
    box=7.5,              # Box loss gain
    cls=0.5,              # Class loss gain
    dfl=1.5,              # DFL loss gain
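    pose=12.0,            # Pose loss gain (pose models only; unused for detection)
    kobj=1.0,             # Keypoint objectness loss gain (pose models only)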
    label_smoothing=0.0,
    nbs=64,
    hsv_h=0.015,          # Augmentation: HSV-Hue
    hsv_s=0.7,            # Augmentation: HSV-Saturation
    hsv_v=0.4,            # Augmentation: HSV-Value
    degrees=0.0,          # Rotation
    translate=0.1,        # Translation
    scale=0.5,            # Scale
    shear=0.0,
    perspective=0.0,
    flipud=0.0,           # Vertical flip
    fliplr=0.5,           # Horizontal flip
    mosaic=1.0,           # Mosaic augmentation
    mixup=0.0,
    copy_paste=0.0,
)

print("\n" + "="*50)
print("TRAINING COMPLETE!")
print("="*50)

In [None]:
# Cell 9: View training results
from IPython.display import Image, display
import glob

# Run directory (set by project and name in Cell 8)
run_path = f'/content/runs/{TREE_TYPE}_detector'

# Display training curves
if os.path.exists(f'{run_path}/results.png'):
    display(Image(filename=f'{run_path}/results.png', width=800))

# Display confusion matrix
if os.path.exists(f'{run_path}/confusion_matrix.png'):
    display(Image(filename=f'{run_path}/confusion_matrix.png', width=600))

# Display sample predictions
val_preds = glob.glob(f'{run_path}/val_batch*_pred.jpg')
for pred in val_preds[:3]:
    display(Image(filename=pred, width=800))

In [None]:
# Cell 10: Test the model on a sample image
# (uses glob, Image, and display imported in Cell 9)
from PIL import Image as PILImage

# Load best weights
best_model = YOLO(f'{run_path}/weights/best.pt')

# Pick a validation image (any supported extension, not only .jpg)
val_image_paths = sorted(
    p for p in glob.glob(f'{DATASET_PATH}/images/val/*')
    if p.lower().endswith(('.jpg', '.jpeg', '.png'))
)
if val_image_paths:
    test_img = val_image_paths[0]
    results = best_model(test_img)

    # Save and display result
    for r in results:
        im_array = r.plot()
        im = PILImage.fromarray(im_array[..., ::-1])  # BGR to RGB
        im.save('/content/test_result.jpg')

    display(Image(filename='/content/test_result.jpg', width=640))

    # Print detections, using the class names stored in the trained model
    for r in results:
        for box in r.boxes:
            cls = int(box.cls[0])
            conf = float(box.conf[0])
            print(f"Detected: {r.names[cls]} ({conf:.2f})")
else:
    print("No validation images found. Check DATASET_PATH.")

In [None]:
# Cell 11: Export model for Raspberry Pi
# ======================================

# Export to ONNX (optional - for faster inference)
# best_model.export(format='onnx')

# Copy best weights to Drive for easy transfer to Pi
output_dir = "/content/drive/MyDrive/TreeModels"
os.makedirs(output_dir, exist_ok=True)

# Copy best.pt
src_weights = f'{run_path}/weights/best.pt'
dst_weights = f'{output_dir}/{TREE_TYPE}_best.pt'
shutil.copy(src_weights, dst_weights)

print(f"\nModel saved to Google Drive: {dst_weights}")
print("\nTo use on Raspberry Pi:")
print(f"1. Download: {dst_weights}")
print("2. Copy to: /home/pi/Working Code/Updated FULL Code Standalone/models/")
print(f"3. Update config.py: YOLO_MODEL_PATH = 'models/{TREE_TYPE}_best.pt'")

In [None]:
# Cell 12: Download model directly (alternative to Drive)

# Download the trained weights
print("Downloading best.pt...")
files.download(f'{run_path}/weights/best.pt')

print("Rename it to match your TREE_TYPE and copy to Raspberry Pi")

## Training Tips

### If training is slow:
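- Enable a GPU via `Runtime > Change runtime type` (a T4 is available on Colab's free tier)
- Reduce `imgsz` to 480 or 320
- Reduce `batch` size if you run out of memory (a smaller batch does not by itself make training faster)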
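
If you are unsure whether the GPU runtime is actually active, a quick check before training can save a long CPU-only run. The sketch below assumes PyTorch is importable (it is installed as a dependency of Ultralytics); `pick_device` is a hypothetical helper name, and its return value can be passed as `device=` to `model.train`.

```python
def pick_device():
    """Return 0 (the first CUDA GPU) if one is visible, otherwise "cpu"."""
    try:
        import torch  # normally installed together with ultralytics
    except ImportError:
        return "cpu"
    return 0 if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Training device: {device}")
```

If this prints `cpu` in Colab, change the runtime type and re-run the notebook from Cell 1.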

### If accuracy is low:
- Add more training images (aim for 100+ per class)
- Make sure labels are accurate (boxes tight around objects)
- Increase `epochs` to 200-300
- Try `yolov8s.pt` (small) instead of nano
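
To see which classes are short of examples, you can count the labeled boxes per class. This is a minimal sketch that assumes YOLO-format `.txt` files in a single folder; `count_instances` is a hypothetical helper, and it counts boxes rather than images, so treat the numbers as a rough proxy for the target above.

```python
from collections import Counter
from pathlib import Path


def count_instances(labels_dir, class_names):
    """Count labeled boxes per class across all YOLO .txt files in labels_dir."""
    counts = Counter()
    for label_file in Path(labels_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            parts = line.split()
            # A valid label line has 5 fields: class_id x_center y_center width height.
            # Other lines (for example LabelImg's classes.txt entries) are skipped.
            if len(parts) == 5:
                counts[int(parts[0])] += 1
    return {name: counts.get(i, 0) for i, name in enumerate(class_names)}


# Example, using names from this notebook:
# print(count_instances("/content/dataset/labels/train", CLASSES))
```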

### For Raspberry Pi performance:
- Use YOLOv8n (nano) - fastest
- Use `imgsz=320` during inference
- Consider exporting to ONNX (see Cell 11), which can speed up inference
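
Whether these options help depends on your Pi model, so it is worth measuring rather than guessing. The sketch below is a generic timing helper, not part of Ultralytics: `measure_fps` and `predict_fn` are hypothetical names, and `predict_fn` stands for any callable that runs the detector on one frame.

```python
import time


def measure_fps(predict_fn, frames, warmup=2):
    """Time predict_fn over frames and return the average frames per second.

    The first `warmup` frames are run once untimed, because initial calls
    often include one-time setup cost.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("need at least one frame")
    for frame in frames[:warmup]:
        predict_fn(frame)
    start = time.perf_counter()
    for frame in frames:
        predict_fn(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


# Hypothetical usage on the Pi, with `model` loaded via YOLO(...) and
# `image_paths` a list of test photos:
# fps_640 = measure_fps(lambda p: model.predict(p, imgsz=640, verbose=False), image_paths)
# fps_320 = measure_fps(lambda p: model.predict(p, imgsz=320, verbose=False), image_paths)
```

Comparing the two numbers shows how much speed a smaller `imgsz` buys on your hardware; check detection quality at the smaller size as well.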

### Class labeling guide:
- **trunk**: Main stem/trunk of tree, draw box around visible trunk
- **canopy**: Leaves and branches, draw box around leaf clusters
- **fruit**: Individual fruits or fruit clusters (grapes, tangerines)
- **post**: Fence posts, poles, stakes (to distinguish from trunks!)

In [None]:
# Cell 13: Train for second tree type (optional)
# Run this if you want to train both vineyard AND tangerine

TREE_TYPE_2 = "tangerine"  # Change to the other type

if TREE_TYPE_2 == "tangerine":
    CLASSES_2 = ['tangerine_trunk', 'tangerine_canopy', 'tangerine_fruit', 'post']
else:
    CLASSES_2 = ['vineyard_trunk', 'vineyard_canopy', 'grape_cluster', 'post']

# Cells 4B-12 read TREE_TYPE and CLASSES, so point them at the second tree type
TREE_TYPE = TREE_TYPE_2
CLASSES = CLASSES_2

# Remove the first dataset copy so images from the two tree types are not mixed
shutil.rmtree("/content/dataset", ignore_errors=True)

print(f"Ready to train for: {TREE_TYPE}")
print("Now re-run cells 4B-12 (Cell 4B rebuilds SOURCE_IMAGES and SOURCE_LABELS from TREE_TYPE)")