# Elite Dangerous Core Mining - ML Detection Pipeline

This notebook trains YOLOv8 object-detection models to identify core asteroids
in Elite Dangerous screenshots in real time.

## Labeling tool recommendation

Before you can train, you need labeled screenshots - bounding boxes drawn around
every core asteroid in each image. The best free tools for this:

**Recommended: [Roboflow Annotate](https://roboflow.com)** (web-based)
- Free tier supports unlimited public projects and up to 10k images
- Drag-and-drop upload, clean bounding box UI, keyboard shortcuts
- Exports directly in YOLO format (what we need) or COCO JSON
- Also does automatic dataset splitting (train/val) and augmentation previews
- Easiest to share with others if you want help labeling

**Local alternative: [LabelImg](https://github.com/HumanSignal/labelImg)**
- Runs on Windows, totally offline, no account needed
- `pip install labelImg` then run `labelImg` from command line
- Save in YOLO format directly (one .txt per image, same filename as the image)
- Keyboard shortcut `W` to draw a box, `D` for next image - fast once you get going

**Workflow:**
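1. Play Elite Dangerous and take screenshots (F10 by default, saves `.bmp` files to
   `%USERPROFILE%\Pictures\Frontier Developments\Elite Dangerous`). Convert them to
   `.png` or `.jpg` before labeling, since the dataset cells below only look for those extensions.
2. Label each screenshot - draw a bounding box around every core you can see
   on screen (including ones the Pulse Wave Analyser (PWA) is highlighting)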
3. Export as YOLO format and drop files into the dataset folders this notebook creates
4. Run the retrain cell

## Dataset structure this notebook expects

```
datasets/
  ice/
    images/
      train/     <- ~80% of your labeled screenshots
      val/       <- ~20% for validation
    labels/
      train/     <- one .txt per image, same name as the image
      val/
  metallic/
    images/ ...
    labels/ ...
  metal-rich/
    images/ ...
    labels/ ...
  rocky/
    images/ ...
    labels/ ...
  unified/       <- copy of all images+labels combined (optional)
    images/ ...
    labels/ ...
```
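
If you label locally with LabelImg you have to do the 80/20 split yourself. Here is a minimal sketch of one way to do it. The function name `split_dataset`, the flat source folders, and the fixed seed are assumptions for illustration, not part of this notebook. It only copies images that already have a matching label file.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_images, src_labels, ring_dir, val_fraction=0.2, seed=42):
    """Copy image/label pairs into images/{train,val} and labels/{train,val}."""
    src_images, src_labels, ring_dir = Path(src_images), Path(src_labels), Path(ring_dir)

    # Keep only images that have a matching YOLO label file.
    pairs = []
    for img in sorted(src_images.iterdir()):
        if img.suffix.lower() not in {".png", ".jpg"}:
            continue
        lbl = src_labels / f"{img.stem}.txt"
        if lbl.exists():
            pairs.append((img, lbl))

    # Fixed seed so the same screenshots end up in val on every run.
    random.Random(seed).shuffle(pairs)
    n_val = round(len(pairs) * val_fraction)

    counts = {"train": 0, "val": 0}
    for i, (img, lbl) in enumerate(pairs):
        split = "val" if i < n_val else "train"
        img_dst = ring_dir / "images" / split
        lbl_dst = ring_dir / "labels" / split
        img_dst.mkdir(parents=True, exist_ok=True)
        lbl_dst.mkdir(parents=True, exist_ok=True)
        shutil.copy2(img, img_dst / img.name)
        shutil.copy2(lbl, lbl_dst / lbl.name)
        counts[split] += 1
    return counts
```

Usage would look like `split_dataset("labeled/ice/images", "labeled/ice/labels", "datasets/ice")`, where the `labeled/...` paths are placeholders for wherever your labeling tool saved its output.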

Each label .txt file has one line per core:
`0 <cx> <cy> <w> <h>` - class index (always 0 = core), then normalized center-x,
center-y, width, height (all 0.0-1.0). Roboflow and LabelImg both produce this
format automatically when you pick "YOLO" as the export format.
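
To make the normalized format concrete, here is a small sketch that turns one label line back into pixel corners for a given screenshot size. The helper name `yolo_line_to_pixels` is made up for this example.

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x1, y1, x2, y2) in pixels."""
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = float(cx), float(cy), float(w), float(h)
    # Center +/- half the size gives the corners; multiply by image size to denormalize.
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return int(cls), round(x1), round(y1), round(x2), round(y2)


# A core centered in a 1920x1080 screenshot, 10% of the width and 20% of the height:
print(yolo_line_to_pixels("0 0.5 0.5 0.1 0.2", 1920, 1080))
# -> (0, 864, 432, 1056, 648)
```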

## Strategy

We train one model per ring type (ice, metallic, rocky, metal-rich) because:
- Each ring type has exactly one asteroid shape that can contain a core
- You mine one ring type per session, so you only need one model loaded at a time
- Specialist models outperform a single multi-type model here

We also train a "unified" model on all types combined as a comparison baseline.

For each ring type we compare three YOLOv8 sizes (nano, small, medium) and keep
the one with the best mAP50. The RTX 3070 (8GB VRAM) handles all three sizes
comfortably at batch=16, img_size=640.
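
A note on the metric: mAP50 counts a predicted box as a hit when its intersection-over-union (IoU) with a labeled box is at least 0.5. The sketch below shows the IoU calculation on two made-up boxes. It is for intuition only and is not how Ultralytics computes the metric internally.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Hypothetical boxes: the prediction is shifted 20 px from the label.
label = (100, 100, 200, 200)
prediction = (120, 120, 220, 220)
print(f"IoU = {iou(label, prediction):.3f}")  # about 0.471, so a miss at the 0.5 cutoff
```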


## 1. Install dependencies

Run once. Restart the kernel after if prompted.

In [1]:
# ultralytics gives us YOLOv8 + training loop + export tools
# opencv-python for image loading and annotation drawing
# pyyaml for reading/writing data.yaml config files
# matplotlib for result visualization

import subprocess, sys

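packages = [
    "ultralytics>=8.0.0",
    "opencv-python",
    "pyyaml",
    "matplotlib",
    # Plain "torch" from PyPI can be a CPU-only build on Windows. If the GPU
    # check in the next cell fails, reinstall with the CUDA index URL shown there.
    "torch",
    "torchvision",
]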

subprocess.run(
    [sys.executable, "-m", "pip", "install", "--quiet"] + packages,
    check=True,
)
print("All packages installed.")


All packages installed.


## 2. Imports and GPU check

In [2]:
import os
import sys
import json
import shutil
import yaml
import cv2
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
from datetime import datetime
from collections import defaultdict

import torch
from ultralytics import YOLO

print(f"Python:  {sys.version}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    gpu = torch.cuda.get_device_properties(0)
    vram_gb = gpu.total_memory / 1024 ** 3
    print(f"GPU: {gpu.name}")
    print(f"VRAM: {vram_gb:.1f} GB")
    if vram_gb < 6:
        print("WARNING: less than 6 GB VRAM - reduce BATCH_SIZE to 8 if training crashes")
else:
    print("WARNING: No GPU found. Training on CPU will be very slow.")
    print("Make sure CUDA toolkit is installed and torch was installed with CUDA support.")
    print("Install command: pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121")


Creating new Ultralytics Settings v0.0.6 file  
View Ultralytics Settings with 'yolo settings' or at 'C:\Users\megakruk\AppData\Roaming\Ultralytics\settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
Python:  3.12.4 (tags/v3.12.4:8e8a4ba, Jun  6 2024, 19:30:16) [MSC v.1940 64 bit (AMD64)]
PyTorch: 2.5.1+cu121
CUDA available: True
GPU: NVIDIA GeForce RTX 3070
VRAM: 8.0 GB


## 3. Configuration

All tunable settings live here. Edit before running training.

In [None]:
# ---- MAIN CONFIG - edit this before running anything ----

# Ring types to train models for.
# "unified" trains on all types combined as a baseline comparison.
# Remove any type you don't have data for yet.
RING_TYPES = ["ice", "metallic", "rocky", "metal-rich", "unified"]

# Model sizes to compare per ring type.
# n=fastest/smallest, s=good balance, m=best accuracy but slower.
# All three fit on RTX 3070 8GB at batch=16.
MODEL_SIZES = ["yolov8n", "yolov8s", "yolov8m"]

# Training hyperparameters
IMG_SIZE    = 640   # YOLO standard input size
EPOCHS      = 100   # how many epochs to train (early stopping kicks in via PATIENCE)
BATCH_SIZE  = 16    # lower to 8 if you get CUDA out-of-memory errors
PATIENCE    = 20    # stop training if val mAP doesn't improve for this many epochs

# Confidence threshold for inference (0.0 - 1.0).
# Lower = more detections (more false positives).
# Higher = fewer, more confident detections (might miss some).
# Start at 0.4 and tune based on how the model performs in-game.
CONF_THRESHOLD = 0.4

# We only detect one thing: a core asteroid.
# YOLO treats everything else as background automatically.
CLASS_NAMES = ["core"]

# Folder layout
DATASET_ROOT = Path("datasets")
RUNS_ROOT    = Path("runs")
EXPORTS_ROOT = Path("exports")

print("Config loaded.")
print(f"Ring types: {RING_TYPES}")
print(f"Model sizes: {MODEL_SIZES}")
print(f"Training on: {'GPU' if torch.cuda.is_available() else 'CPU'}")
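
To show how `EPOCHS` and `PATIENCE` interact, here is a simplified sketch of patience-based early stopping. It assumes one validation score per epoch. Ultralytics tracks its own fitness value derived from the validation metrics, so treat this as an illustration of the idea rather than its actual implementation. `stopping_epoch` is a made-up helper.

```python
def stopping_epoch(scores, patience):
    """Return the 0-based epoch at which training would stop, or None if it never does."""
    best_score = float("-inf")
    best_epoch = 0
    for epoch, score in enumerate(scores):
        if score > best_score:
            # New best: remember it and reset the waiting period.
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row.
            return epoch
    return None


# Made-up validation scores: the best value is at epoch 2, then nothing beats it.
scores = [0.20, 0.50, 0.60, 0.58, 0.59, 0.57]
print(stopping_epoch(scores, patience=3))  # -> 5
```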


## 4. Create dataset folder structure

Run this once. Then drop your labeled screenshots into the folders it prints.

In [None]:
def create_dataset_structure():
    """Create train/val image and label folders for each ring type."""
    for ring_type in RING_TYPES:
        for split in ["train", "val"]:
            (DATASET_ROOT / ring_type / "images" / split).mkdir(parents=True, exist_ok=True)
            (DATASET_ROOT / ring_type / "labels" / split).mkdir(parents=True, exist_ok=True)
    RUNS_ROOT.mkdir(exist_ok=True)
    EXPORTS_ROOT.mkdir(exist_ok=True)

    print("Folder structure created. Drop your screenshots and labels here:\n")
    for ring_type in RING_TYPES:
        print(f"  datasets/{ring_type}/images/train/  <- training screenshots (.png or .jpg)")
        print(f"  datasets/{ring_type}/images/val/    <- validation screenshots (about 20% of total)")
        print(f"  datasets/{ring_type}/labels/train/  <- matching YOLO .txt label files")
        print(f"  datasets/{ring_type}/labels/val/")
        print()

create_dataset_structure()


## 5. COCO JSON to YOLO format converter

If your labeling tool exports COCO JSON instead of YOLO .txt files, run this.
Roboflow can export either format - YOLO is preferred and skips this step entirely.

In [None]:
def convert_coco_to_yolo(coco_json_path, output_labels_dir):
    """
    Convert a COCO-format annotations JSON to per-image YOLO .txt label files.

    COCO format: one big JSON file with all annotations
    YOLO format: one .txt file per image, each line = class cx cy w h (normalized 0-1)

    Args:
        coco_json_path: path to the _annotations.coco.json file
        output_labels_dir: folder where .txt files will be written
    """
    output_labels_dir = Path(output_labels_dir)
    output_labels_dir.mkdir(parents=True, exist_ok=True)

    with open(coco_json_path) as f:
        coco = json.load(f)

    # map COCO category ids to our YOLO class indices
    cat_map = {}
    for cat in coco["categories"]:
        if cat["name"] in CLASS_NAMES:
            cat_map[cat["id"]] = CLASS_NAMES.index(cat["name"])

    if not cat_map:
        print(f"WARNING: none of your COCO categories match CLASS_NAMES {CLASS_NAMES}")
        print(f"Found categories: {[c['name'] for c in coco['categories']]}")
        return 0

    images = {img["id"]: img for img in coco["images"]}

    anns_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        if ann["category_id"] in cat_map:
            anns_by_image[ann["image_id"]].append(ann)

    converted = 0
    # Walk every image, not just annotated ones, so screenshots without a core
    # get an empty .txt and are not reported as missing labels by validation.
    for img_id, img_info in images.items():
        anns = anns_by_image.get(img_id, [])
        W = img_info["width"]
        H = img_info["height"]
        img_name = Path(img_info["file_name"]).stem

        lines = []
        for ann in anns:
            cls = cat_map[ann["category_id"]]
            x, y, w, h = ann["bbox"]  # COCO: top-left x,y + pixel width,height
            cx = (x + w / 2) / W
            cy = (y + h / 2) / H
            nw = w / W
            nh = h / H
            lines.append(f"{cls} {cx:.6f} {cy:.6f} {nw:.6f} {nh:.6f}")

        out_path = output_labels_dir / f"{img_name}.txt"
        out_path.write_text("\n".join(lines))
        converted += 1

    print(f"Converted {converted} images.")
    print(f"Labels written to: {output_labels_dir}")
    return converted


# Example usage - uncomment and adjust paths:
# convert_coco_to_yolo(
#     coco_json_path="my_export/_annotations.coco.json",
#     output_labels_dir="datasets/ice/labels/train",
# )
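
For reference, this is roughly what the converter expects to find in the JSON. The dict below is a hypothetical one-image export, trimmed to the keys `convert_coco_to_yolo` reads. Note that the category name has to match an entry in `CLASS_NAMES` ("core"), otherwise the annotation is ignored. `coco_bbox_to_yolo` is a standalone helper written for this example.

```python
# Hypothetical minimal COCO export: one image, one core annotation.
coco = {
    "categories": [{"id": 1, "name": "core"}],
    "images": [{"id": 7, "file_name": "shot_0001.png", "width": 1920, "height": 1080}],
    "annotations": [
        # bbox is [top-left x, top-left y, width, height] in pixels
        {"image_id": 7, "category_id": 1, "bbox": [864, 432, 192, 216]},
    ],
}


def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Pixel top-left/size box -> normalized center/size box."""
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


img = coco["images"][0]
ann = coco["annotations"][0]
cx, cy, nw, nh = coco_bbox_to_yolo(ann["bbox"], img["width"], img["height"])
print(f"0 {cx:.6f} {cy:.6f} {nw:.6f} {nh:.6f}")
# -> 0 0.500000 0.500000 0.100000 0.200000
```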


## 6. Validate datasets

Run this after adding your screenshots and labels. Catches missing files, bad label format, and shows class distribution.

In [None]:
def validate_dataset(ring_type):
    """
    Check a ring type dataset for problems before training.
    Returns (stats dict, list of issues). Empty issues list means all clear.
    """
    ring_dir = DATASET_ROOT / ring_type
    issues = []
    stats = {}

    for split in ["train", "val"]:
        img_dir = ring_dir / "images" / split
        lbl_dir = ring_dir / "labels" / split

        images = list(img_dir.glob("*.png")) + list(img_dir.glob("*.jpg"))
        labels = list(lbl_dir.glob("*.txt"))

        img_stems = {p.stem for p in images}
        lbl_stems = {p.stem for p in labels}

        missing_labels = img_stems - lbl_stems
        orphan_labels  = lbl_stems - img_stems

        if missing_labels:
            issues.append(
                f"[{split}] {len(missing_labels)} image(s) have no label file: "
                + str(sorted(missing_labels)[:5])
            )
        if orphan_labels:
            issues.append(
                f"[{split}] {len(orphan_labels)} label file(s) have no matching image: "
                + str(sorted(orphan_labels)[:5])
            )

        total_cores = 0
        bad_lines = 0
        for lbl in labels:
            for line in lbl.read_text().strip().splitlines():
                line = line.strip()
                if not line:
                    continue
                parts = line.split()
                if len(parts) != 5:
                    bad_lines += 1
                    continue
                try:
                    cls = int(parts[0])
                    cx, cy, w, h = (float(v) for v in parts[1:])
                except ValueError:
                    bad_lines += 1
                    continue
                in_range = 0 <= cx <= 1 and 0 <= cy <= 1 and 0 < w <= 1 and 0 < h <= 1
                # A class index outside CLASS_NAMES would break training, so flag it too.
                if not in_range or not (0 <= cls < len(CLASS_NAMES)):
                    bad_lines += 1
                else:
                    total_cores += 1

        if bad_lines:
            issues.append(f"[{split}] {bad_lines} malformed label line(s)")

        stats[split] = {
            "images": len(images),
            "labeled": len(img_stems & lbl_stems),
            "cores": total_cores,
        }

    print(f"\n{'=' * 40}")
    print(f"  {ring_type.upper()} dataset")
    print(f"{'=' * 40}")
    for split, s in stats.items():
        ratio = s["cores"] / s["labeled"] if s["labeled"] else 0
        print(f"  {split:5}: {s['images']:4} images | {s['cores']:5} core annotations | {ratio:.1f} cores/image")

    if issues:
        print("  ISSUES:")
        for iss in issues:
            print(f"    - {iss}")
    else:
        print("  All checks passed.")

    return stats, issues


# Run validation for all ring types
for ring_type in RING_TYPES:
    validate_dataset(ring_type)


## 7. Preview labeled images

Sanity check - visually confirm your labels look correct before training.

In [None]:
def preview_labels(ring_type, split="train", n=4):
    """
    Show the first n labeled images with bounding boxes drawn on them.
    Use this to confirm your labels were imported correctly.
    """
    img_dir = DATASET_ROOT / ring_type / "images" / split
    lbl_dir = DATASET_ROOT / ring_type / "labels" / split

    images = sorted(list(img_dir.glob("*.png")) + list(img_dir.glob("*.jpg")))[:n]

    if not images:
        print(f"No images found in {img_dir}")
        return

    fig, axes = plt.subplots(1, len(images), figsize=(5 * len(images), 5))
    if len(images) == 1:
        axes = [axes]

    for ax, img_path in zip(axes, images):
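        img = cv2.imread(str(img_path))
        if img is None:
            # cv2.imread returns None for unreadable files instead of raising
            ax.set_title(f"{img_path.name} (unreadable)", fontsize=8)
            ax.axis("off")
            continue
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        H, W = img.shape[:2]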

        lbl_path = lbl_dir / (img_path.stem + ".txt")
        if lbl_path.exists():
            for line in lbl_path.read_text().strip().splitlines():
                parts = line.split()
                if len(parts) != 5:
                    continue
                _, cx, cy, w, h = float(parts[0]), float(parts[1]), float(parts[2]), float(parts[3]), float(parts[4])
                x1 = int((cx - w / 2) * W)
                y1 = int((cy - h / 2) * H)
                x2 = int((cx + w / 2) * W)
                y2 = int((cy + h / 2) * H)
                cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
                cv2.putText(img, "core", (x1, y1 - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

        ax.imshow(img)
        ax.set_title(img_path.name, fontsize=8)
        ax.axis("off")

    plt.suptitle(f"{ring_type} / {split} - label preview", fontsize=12)
    plt.tight_layout()
    plt.show()


# Change ring_type to whichever dataset you want to inspect
preview_labels("ice", split="train", n=4)


## 8. Generate data.yaml files

YOLO needs a small YAML config file that points to the dataset folders and lists class names.

In [None]:
def generate_yaml(ring_type):
    """Write the data.yaml file for a ring type dataset."""
    ring_dir = DATASET_ROOT / ring_type
    yaml_path = ring_dir / "data.yaml"

    config = {
        "path": str(ring_dir.resolve()),
        "train": "images/train",
        "val": "images/val",
        "nc": len(CLASS_NAMES),
        "names": CLASS_NAMES,
    }

    with open(yaml_path, "w") as f:
        yaml.dump(config, f, default_flow_style=False, allow_unicode=True)

    print(f"Written: {yaml_path}")
    return yaml_path


for ring_type in RING_TYPES:
    generate_yaml(ring_type)


## 9. Train models

This trains every combination of ring type x model size. Rough timings per 100 epochs
on an RTX 3070 (they grow with the number of training images):
- yolov8n: ~5-10 min
- yolov8s: ~10-20 min
- yolov8m: ~20-40 min

Results are saved under `runs/<ring_type>_<model_size>_<timestamp>/`.
The best weights checkpoint is at `weights/best.pt` inside that folder.

A ring type is skipped automatically if its dataset fails validation or has fewer
than 5 training images.


In [None]:
def train_model(ring_type, model_size):
    """
    Train one YOLOv8 model for a given ring type and model size.
    Returns (weights_path, results) or None if skipped.
    """
    yaml_path = DATASET_ROOT / ring_type / "data.yaml"
    if not yaml_path.exists():
        print(f"No data.yaml for {ring_type} - run generate_yaml first.")
        return None

    train_images = (
        list((DATASET_ROOT / ring_type / "images" / "train").glob("*.png"))
        + list((DATASET_ROOT / ring_type / "images" / "train").glob("*.jpg"))
    )
    if len(train_images) < 5:
        print(f"Skipping {ring_type}/{model_size}: only {len(train_images)} training images (need >= 5).")
        return None

    run_name = f"{ring_type}_{model_size}_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
    print(f"\nStarting: {run_name}  ({len(train_images)} training images)")

    model = YOLO(f"{model_size}.pt")  # downloads pretrained COCO weights on first run

    results = model.train(
        data=str(yaml_path),
        epochs=EPOCHS,
        imgsz=IMG_SIZE,
        batch=BATCH_SIZE,
        patience=PATIENCE,
        device=0 if torch.cuda.is_available() else "cpu",
        project=str(RUNS_ROOT),
        name=run_name,

        # -- Augmentation tuned for Elite Dangerous screenshots --
        # Asteroids rotate constantly and are viewed from any angle
        degrees=45,
        # Horizontal flip is always geometrically valid
        fliplr=0.5,
        # Vertical flip less common but fine for 3D space
        flipud=0.3,
        # Cores appear at various distances = various on-screen sizes
        scale=0.5,
        # Mosaic helps the model learn cluttered ring environments
        mosaic=1.0,
        close_mosaic=10,
        # Brightness and saturation shift for lighting variation
        # (shadow side of planet vs star-lit area)
        hsv_v=0.4,
        hsv_s=0.5,
        # Small hue shift - ring tones vary but not wildly
        hsv_h=0.015,
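        # Game renders are crisp, so no blur augmentation is requested here.
        # (blur / median_blur are not Ultralytics train arguments, and unknown
        # keys make model.train() fail, so they are intentionally not passed.)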

        save_period=10,
        verbose=True,
    )

    # Ask the trainer where it actually saved the run instead of rebuilding the path.
    best_weights = Path(model.trainer.save_dir) / "weights" / "best.pt"
    map50 = results.results_dict.get("metrics/mAP50(B)", 0)
    print(f"Done: {run_name}  mAP50={map50:.3f}  weights -> {best_weights}")
    return best_weights, results


In [None]:
# Run all ring type x model size combinations.
# Skips any ring type that doesn't have enough data yet.
# This is the main training cell - takes a while.

all_results = {}

for ring_type in RING_TYPES:
    stats, issues = validate_dataset(ring_type)
    if issues:
        print(f"Skipping {ring_type} - fix dataset issues first.")
        continue

    total_train = stats.get("train", {}).get("images", 0)
    if total_train < 5:
        print(f"Skipping {ring_type} - not enough images yet ({total_train}).")
        continue

    all_results[ring_type] = {}

    for model_size in MODEL_SIZES:
        result = train_model(ring_type, model_size)
        if result is None:
            continue

        weights_path, train_results = result
        rd = train_results.results_dict

        all_results[ring_type][model_size] = {
            "weights":    str(weights_path),
            "mAP50":      float(rd.get("metrics/mAP50(B)",     0)),
            "mAP50_95":   float(rd.get("metrics/mAP50-95(B)",  0)),
            "precision":  float(rd.get("metrics/precision(B)", 0)),
            "recall":     float(rd.get("metrics/recall(B)",    0)),
        }

# Persist results so retrain() can pick up where we left off
results_path = RUNS_ROOT / "all_results.json"
RUNS_ROOT.mkdir(exist_ok=True)
with open(results_path, "w") as f:
    json.dump(all_results, f, indent=2)

print("\nAll training runs complete. Results saved to:", results_path)
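
The later cells (comparison table, inference, retrain, export) all read `all_results.json` and pick the size with the highest mAP50. A small sketch of that structure and lookup follows. The numbers and paths are placeholders to show the shape, not real training results, and `best_model_size` is a helper written for this example.

```python
# Placeholder values purely to show the structure; real values come from training.
example_results = {
    "ice": {
        "yolov8n": {"weights": "runs/ice_yolov8n_example/weights/best.pt",
                    "mAP50": 0.50, "mAP50_95": 0.30, "precision": 0.60, "recall": 0.50},
        "yolov8s": {"weights": "runs/ice_yolov8s_example/weights/best.pt",
                    "mAP50": 0.60, "mAP50_95": 0.35, "precision": 0.65, "recall": 0.55},
        "yolov8m": {"weights": "runs/ice_yolov8m_example/weights/best.pt",
                    "mAP50": 0.55, "mAP50_95": 0.33, "precision": 0.62, "recall": 0.52},
    },
}


def best_model_size(results, ring_type, metric="mAP50"):
    """Return the model size with the highest metric, or None if nothing was trained."""
    models = results.get(ring_type, {})
    if not models:
        return None
    return max(models, key=lambda size: models[size][metric])


print(best_model_size(example_results, "ice"))    # -> yolov8s
print(best_model_size(example_results, "rocky"))  # -> None
```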


## 10. Compare model results

In [None]:
def show_comparison_table():
    """Print a formatted table of all trained model metrics."""
    results_path = RUNS_ROOT / "all_results.json"
    if not results_path.exists():
        print("No results yet - run training first.")
        return {}

    with open(results_path) as f:
        results = json.load(f)

    header = f"{'Ring':<12} {'Model':<12} {'mAP50':>8} {'mAP50-95':>10} {'Precision':>10} {'Recall':>8}"
    print(header)
    print("-" * len(header))

    best_per_ring = {}
    for ring_type, models in results.items():
        if not models:
            continue
        # Find the winner first so only one row per ring type gets the marker.
        best_size = max(models, key=lambda size: models[size]["mAP50"])
        best_per_ring[ring_type] = best_size
        for model_size, m in models.items():
            marker = " <-- best" if model_size == best_size else ""
            print(
                f"{ring_type:<12} {model_size:<12} "
                f"{m['mAP50']:>8.3f} {m['mAP50_95']:>10.3f} "
                f"{m['precision']:>10.3f} {m['recall']:>8.3f}"
                + marker
            )
        print()

    return best_per_ring


best_models = show_comparison_table()
print("Best model per ring type:", best_models)


## 11. Plot training curves

YOLOv8 saves a `results.csv` in each run folder. This plots loss and mAP over epochs.

In [None]:
import csv

def plot_training_curves(ring_type, model_size):
    """Plot loss and mAP curves from a training run's results.csv."""
    results_path = RUNS_ROOT / "all_results.json"
    if not results_path.exists():
        print("No training results found.")
        return

    with open(results_path) as f:
        all_r = json.load(f)

    weights = all_r.get(ring_type, {}).get(model_size, {}).get("weights", "")
    if not weights:
        print(f"No trained model found for {ring_type}/{model_size}")
        return

    run_dir = Path(weights).parent.parent
    csv_path = run_dir / "results.csv"
    if not csv_path.exists():
        print(f"results.csv not found at {csv_path}")
        return

    epochs, box_loss, cls_loss, map50 = [], [], [], []
    with open(csv_path) as f:
        reader = csv.DictReader(f)
        for row in reader:
            # Column names and values can be space-padded, so strip both.
            row = {k.strip(): (v or "").strip() for k, v in row.items()}
            epochs.append(int(float(row.get("epoch") or 0)))
            box_loss.append(float(row.get("train/box_loss") or 0))
            cls_loss.append(float(row.get("train/cls_loss") or 0))
            map50.append(float(row.get("metrics/mAP50(B)") or 0))

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

    ax1.plot(epochs, box_loss, label="box loss")
    ax1.plot(epochs, cls_loss, label="cls loss")
    ax1.set_xlabel("Epoch")
    ax1.set_ylabel("Loss")
    ax1.set_title(f"{ring_type} / {model_size} - training loss")
    ax1.legend()
    ax1.grid(True, alpha=0.3)

    ax2.plot(epochs, map50, color="green", label="mAP50")
    ax2.set_xlabel("Epoch")
    ax2.set_ylabel("mAP50")
    ax2.set_title(f"{ring_type} / {model_size} - validation mAP50")
    ax2.legend()
    ax2.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()


# Change these to whatever ring type and model you want to inspect
plot_training_curves("ice", "yolov8s")


## 12. Test inference on new screenshots

Test the trained model on screenshots that were NOT in the training set.

In [None]:
def detect_cores(image_path, ring_type, model_size=None, conf=CONF_THRESHOLD):
    """
    Run the trained detector on a single screenshot and display the result.

    Args:
        image_path: path to a .png or .jpg screenshot
        ring_type: which ring type model to use ("ice", "metallic", "rocky", "metal-rich", "unified")
        model_size: "yolov8n" etc, or None to auto-pick the best one
        conf: confidence threshold
    """
    results_path = RUNS_ROOT / "all_results.json"
    if not results_path.exists():
        print("No trained models found. Run training first.")
        return

    with open(results_path) as f:
        all_r = json.load(f)

    ring_models = all_r.get(ring_type, {})
    if not ring_models:
        print(f"No trained model for ring type '{ring_type}'.")
        return

    if model_size is None:
        model_size = max(ring_models, key=lambda m: ring_models[m]["mAP50"])
        print(f"Auto-selected best model: {model_size} (mAP50={ring_models[model_size]['mAP50']:.3f})")

    weights = ring_models.get(model_size, {}).get("weights", "")
    if not weights or not Path(weights).exists():
        print(f"Weights file not found: {weights}")
        return

    model = YOLO(weights)
    img = cv2.imread(str(image_path))
    if img is None:
        print(f"Could not load image: {image_path}")
        return

    preds = model.predict(img, conf=conf, verbose=False)[0]
    annotated = preds.plot()

    plt.figure(figsize=(14, 8))
    plt.imshow(cv2.cvtColor(annotated, cv2.COLOR_BGR2RGB))
    plt.title(f"{ring_type} cores | {model_size} | {len(preds.boxes)} detection(s) | conf>={conf}")
    plt.axis("off")
    plt.tight_layout()
    plt.show()

    print(f"Detections: {len(preds.boxes)}")
    for box in preds.boxes:
        xyxy = [round(v, 1) for v in box.xyxy[0].tolist()]
        print(f"  core | conf={float(box.conf):.3f} | bbox={xyxy}")

    return preds


# Usage - point at any screenshot and pick your ring type:
# detect_cores("my_screenshot.png", ring_type="ice")
# detect_cores("my_screenshot.png", ring_type="metallic", model_size="yolov8s")
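
When tuning `conf`, it helps to see how the number of surviving detections changes with the threshold. The sketch below uses made-up confidence scores in plain dicts rather than real model output. `filter_by_confidence` is a hypothetical helper that mimics what the `conf` argument does to the prediction list.

```python
def filter_by_confidence(detections, conf):
    """Keep only detections whose confidence is at least `conf`."""
    return [d for d in detections if d["conf"] >= conf]


# Made-up raw detections for one screenshot.
detections = [
    {"conf": 0.91, "bbox": (812, 400, 1010, 590)},
    {"conf": 0.45, "bbox": (120, 300, 210, 380)},
    {"conf": 0.30, "bbox": (1500, 90, 1580, 160)},
]

for conf in (0.25, 0.4, 0.6):
    kept = filter_by_confidence(detections, conf)
    print(f"conf>={conf}: {len(kept)} detection(s)")
# conf>=0.25: 3 detection(s)
# conf>=0.4: 2 detection(s)
# conf>=0.6: 1 detection(s)
```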


## 13. One-button retrain

After adding new labeled screenshots to the dataset folders, just call `retrain()`.
It re-validates, regenerates configs, trains the best-performing model size for each
ring type, and updates the results table.


In [None]:
def retrain(ring_type="all", model_size="best"):
    """
    Retrain after adding new labeled data. Drop new screenshots + labels into
    the dataset folders, then call this.

    Args:
        ring_type: "all" retrains every ring type, or pass one e.g. "ice"
        model_size: "best" reuses the model size that scored highest last time,
                    or pass a specific size like "yolov8s"
    """
    targets = RING_TYPES if ring_type == "all" else [ring_type]

    results_path = RUNS_ROOT / "all_results.json"
    prev_results = {}
    if results_path.exists():
        with open(results_path) as f:
            prev_results = json.load(f)

    for rt in targets:
        stats, issues = validate_dataset(rt)
        if issues:
            print(f"Dataset issues for '{rt}' - fix these before retraining:")
            for iss in issues:
                print(f"  {iss}")
            continue

        total_images = stats.get("train", {}).get("images", 0)
        if total_images < 5:
            print(f"Skipping '{rt}' - not enough images ({total_images}). Need at least 5.")
            continue

        # pick model size
        if model_size == "best":
            ring_prev = prev_results.get(rt, {})
            if ring_prev:
                size = max(ring_prev, key=lambda m: ring_prev[m]["mAP50"])
                print(f"Using best model size from previous run: {size}")
            else:
                size = "yolov8s"
                print(f"No previous results for '{rt}' - defaulting to {size}")
        else:
            size = model_size

        generate_yaml(rt)
        result = train_model(rt, size)

        if result:
            weights_path, train_results = result
            rd = train_results.results_dict
            if rt not in prev_results:
                prev_results[rt] = {}
            prev_results[rt][size] = {
                "weights":   str(weights_path),
                "mAP50":     float(rd.get("metrics/mAP50(B)",     0)),
                "mAP50_95":  float(rd.get("metrics/mAP50-95(B)",  0)),
                "precision": float(rd.get("metrics/precision(B)", 0)),
                "recall":    float(rd.get("metrics/recall(B)",    0)),
            }

    RUNS_ROOT.mkdir(exist_ok=True)  # may not exist yet if every ring type was skipped
    with open(results_path, "w") as f:
        json.dump(prev_results, f, indent=2)

    print("\nRetrain complete!")
    show_comparison_table()


# Examples:
# retrain()                      # retrain all ring types using their best model
# retrain("ice")                 # retrain only the ice model
# retrain("ice", "yolov8m")      # force a specific model size


## 14. Export best models to ONNX

ONNX lets the companion app run inference without needing PyTorch installed.
It also opens the door to hardware acceleration via ONNX Runtime (which supports
CUDA, DirectML on Windows, and plain CPU).


In [None]:
def export_best_models():
    """
    Export the highest-scoring model for each ring type to ONNX format.
    Output goes to exports/<ring_type>_best.onnx
    """
    results_path = RUNS_ROOT / "all_results.json"
    if not results_path.exists():
        print("No trained models to export.")
        return

    with open(results_path) as f:
        all_r = json.load(f)

    EXPORTS_ROOT.mkdir(exist_ok=True)

    for ring_type, models in all_r.items():
        if not models:
            continue

        best_size = max(models, key=lambda m: models[m]["mAP50"])
        weights   = models[best_size]["weights"]

        if not Path(weights).exists():
            print(f"Weights missing for {ring_type}/{best_size}: {weights}")
            continue

        print(f"Exporting {ring_type} ({best_size}, mAP50={models[best_size]['mAP50']:.3f})...")
        model = YOLO(weights)
        exported = model.export(
            format="onnx",
            imgsz=IMG_SIZE,
            simplify=True,
            opset=17,
            dynamic=False,  # fixed input size, easier to deploy
        )

        dest = EXPORTS_ROOT / f"{ring_type}_best.onnx"
        shutil.copy(exported, dest)
        size_mb = dest.stat().st_size / 1024 / 1024
        print(f"  Saved: {dest}  ({size_mb:.1f} MB)")

    print("\nExport complete. ONNX models are in:", EXPORTS_ROOT)
    print("Use these with onnxruntime in the companion app.")


export_best_models()
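
Because the export uses a fixed 640x640 input (`dynamic=False`), the companion app has to resize screenshots itself and map boxes back afterwards. The sketch below shows the geometry for a centered letterbox (scale to fit, pad the rest). This assumes the app preprocesses that way. The exported model does not do it for you, and both function names are made up for this example.

```python
def letterbox_params(img_w, img_h, size=640):
    """Scale factor and per-side padding to fit an image into a size x size square."""
    scale = min(size / img_w, size / img_h)
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    pad_x = (size - new_w) / 2  # padding on the left (and right)
    pad_y = (size - new_h) / 2  # padding on the top (and bottom)
    return scale, pad_x, pad_y


def to_original_coords(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from model input space back to screenshot pixels."""
    x1, y1, x2, y2 = box
    return (
        (x1 - pad_x) / scale,
        (y1 - pad_y) / scale,
        (x2 - pad_x) / scale,
        (y2 - pad_y) / scale,
    )


# A 1920x1080 screenshot shrinks to 640x360 with 140 px of padding above and below.
scale, pad_x, pad_y = letterbox_params(1920, 1080)
print(scale, pad_x, pad_y)
print(to_original_coords((320, 320, 352, 352), scale, pad_x, pad_y))
```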


## 15. Batch inference on a folder of screenshots

Useful for testing on a full session's worth of screenshots at once.


In [None]:
def batch_detect(screenshots_folder, ring_type, output_folder=None, conf=CONF_THRESHOLD):
    """
    Run detection on every image in a folder and save annotated copies.

    Args:
        screenshots_folder: folder containing .png/.jpg screenshots
        ring_type: which ring type model to use
        output_folder: where to save annotated images (defaults to screenshots_folder/detected)
        conf: confidence threshold
    """
    screenshots_folder = Path(screenshots_folder)
    if output_folder is None:
        output_folder = screenshots_folder / "detected"
    output_folder = Path(output_folder)
    output_folder.mkdir(parents=True, exist_ok=True)

    results_path = RUNS_ROOT / "all_results.json"
    if not results_path.exists():
        print("No trained models. Run training first.")
        return

    with open(results_path) as f:
        all_r = json.load(f)

    ring_models = all_r.get(ring_type, {})
    if not ring_models:
        print(f"No model for ring type '{ring_type}'")
        return

    best_size = max(ring_models, key=lambda m: ring_models[m]["mAP50"])
    weights   = ring_models[best_size]["weights"]
    model     = YOLO(weights)

    # Elite Dangerous saves F10 screenshots as .bmp, so include those too.
    images = sorted(
        p for ext in ("*.png", "*.jpg", "*.bmp") for p in screenshots_folder.glob(ext)
    )
    print(f"Running {best_size} ({ring_type}) on {len(images)} images...")

    total_cores = 0
    processed = 0
    for img_path in images:
        img = cv2.imread(str(img_path))
        if img is None:
            # cv2.imread returns None for unreadable files instead of raising
            print(f"  Skipping unreadable image: {img_path.name}")
            continue
        preds = model.predict(img, conf=conf, verbose=False)[0]
        annotated = preds.plot()
        out_path = output_folder / img_path.name
        cv2.imwrite(str(out_path), annotated)
        total_cores += len(preds.boxes)
        processed += 1

    print(f"Done. {total_cores} total core detections across {processed} screenshots.")
    print(f"Annotated images saved to: {output_folder}")


# Usage:
# batch_detect(
#     screenshots_folder=r"C:/Users/YourName/Pictures/Frontier Developments/Elite Dangerous",
#     ring_type="ice",
# )
