# 🏥 Visual AI in Healthcare with FiftyOne – Fine-tuning YOLOv8 for Stenosis Detection  
**Train, evaluate, and visualize a YOLOv8 object detection model using the curated ARCADE dataset**

This notebook is part of the **“Visual AI in Healthcare with FiftyOne”** workshop. In this hands-on exercise, you'll fine-tune a YOLOv8 object detection model for stenosis detection on a curated subset of 300 ARCADE images labeled `stenosis`, using FiftyOne and Ultralytics' YOLOv8 integration. The subset was curated with the embedding-based selection methods described in the previous notebooks.

🧠 **What you’ll learn in this notebook:**

- How to **prepare and export a YOLOv8-compatible dataset** from FiftyOne  
- How to **fine-tune YOLOv8** on a small but representative subset  
- How to **run inference on test data** using the fine-tuned model  
- How to **load predictions into FiftyOne** for evaluation  
- How to **compare model predictions** with ground truth annotations  

📚 **Part of the notebook series:**
1. `01_load_arcade_dataset.ipynb` – Load and visualize the ARCADE dataset.  
2. `02_load_deeplesion_balanced.ipynb` – Curate and balance the DeepLesion dataset.  
3. `03_vlms_analysis_arcade.ipynb` – Use VFMs such as NVLabs_CRADIOV3 for dataset understanding on ARCADE.  
4. `04_finetune_yolo8_stenosis.ipynb` – Train and integrate YOLOv8 for stenosis detection.  
5. `05_medsam2_ct_scan.ipynb` – Run MedSAM2 on CT scans for segmentation.  
6. `06_nvidia_vista_segmentation.ipynb` – Explore NVIDIA-VISTA-3D.  
7. `07_medgemma_vqa.ipynb` – Perform visual question answering and classification with MedGemma.

All notebooks are standalone but are best experienced sequentially.

### ✅ Requirements

Install the requirements for this notebook (at minimum `fiftyone` and `ultralytics`, which are imported below), then import the libraries.

In [None]:
# Step 1: Imports
import os

import numpy as np  # used by the prediction-parsing helpers later in the notebook
import fiftyone as fo
import fiftyone.utils.random as four
from ultralytics import YOLO

### 📦 Load and Prepare the ARCADE Subset for Fine-Tuning

In this step, we prepare the curated subset of the ARCADE dataset for training a YOLOv8 model.

- We first **check if an existing FiftyOne dataset** with the same name exists and delete it if necessary to avoid conflicts.
- Then we **load the dataset** from disk using the `YOLOv5Dataset` format (supported in FiftyOne for YOLOv5/YOLOv8-style annotations).
- The dataset is then **launched in the FiftyOne App** for visual inspection.
- Next, we **split the dataset** into training and validation sets (80/20) using `fiftyone.utils.random`.
- Finally, we **extract the class labels** from the `segmentations` field and **export** the dataset back to disk in YOLO format, ready for training.

> 📌 Note: This step assumes you previously exported the curated subset of ARCADE using FiftyOne. The loaded dataset will be used to fine-tune a YOLOv8 object detection model.


In [None]:
# Names of the datasets this notebook creates
dataset_names = ["arcade_subset_loaded", "arcade_subset_test"]

# Delete any existing copies so re-running the notebook does not hit a name conflict
for dataset_name in dataset_names:
    if fo.dataset_exists(dataset_name):
        fo.delete_dataset(dataset_name)
        print(f"Deleted existing dataset: {dataset_name}")
    else:
        print(f"No dataset found with name: {dataset_name}")


In [None]:
import fiftyone.types as fot

# Path to the YOLO-format dataset exported previously
dataset_dir = "arcade_yolo_subset"  # <-- this should match your previous export_dir

dataset = fo.Dataset.from_dir(
    dataset_dir=dataset_dir,
    dataset_type=fot.YOLOv5Dataset,
    split="train",  # match what was exported
    label_field="segmentations",
    name="arcade_subset_loaded",
)

# Launch FiftyOne App to visualize
session = fo.launch_app(dataset)

In [None]:
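# Step 3: Train/Val Split (80/20)
# Clear any existing sample tags so old split tags do not mix with the new ones
dataset.untag_samples(dataset.distinct("tags"))
# Tag each sample as either "train" (80%) or "val" (20%)
four.random_split(dataset, {"train": 0.8, "val": 0.2})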
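
To make the 80/20 split concrete, here is a minimal pure-Python sketch of tag-based random splitting. It is a conceptual illustration only and not FiftyOne's implementation; `random_split_tags`, the seed, and the integer sample IDs are hypothetical names and values chosen for the example.

```python
import random


def random_split_tags(sample_ids, fractions, seed=51):
    """Assign one split tag per sample so tag counts follow `fractions`."""
    ids = list(sample_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)

    tags = {}
    names = list(fractions)
    start = 0
    for i, name in enumerate(names):
        if i == len(names) - 1:
            # The last split takes the remainder so every sample gets a tag
            end = len(ids)
        else:
            end = start + round(fractions[name] * len(ids))
        for sample_id in ids[start:end]:
            tags[sample_id] = name
        start = end
    return tags


# 300 stand-in sample IDs, matching the size of the curated subset
tags = random_split_tags(range(300), {"train": 0.8, "val": 0.2})
train_count = sum(1 for t in tags.values() if t == "train")
val_count = sum(1 for t in tags.values() if t == "val")
print(train_count, val_count)
```

With 300 samples this gives 240 training and 60 validation samples.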

In [None]:
# Step 4: Extract Classes from Segmentations
label_field = "segmentations"
# distinct() returns the sorted unique labels found in the field
classes = dataset.distinct(f"{label_field}.detections.label")
print("Classes:", classes)

In [None]:
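# Step 5: Export to YOLO Format
# The training command's data= path must point to the dataset.yaml in this folder
export_dir = "./arcade_yolo"

# Export each tagged split into its own images/<split> and labels/<split> folders
for split in ["train", "val"]:
    dataset.match_tags(split).export(
        export_dir=export_dir,
        dataset_type=fo.types.YOLOv5Dataset,
        label_field=label_field,
        classes=classes,
        split=split,
    )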
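
Before training, it can help to confirm that the export produced the files the trainer will look for. The sketch below assumes the usual `YOLOv5Dataset` layout (a `dataset.yaml` at the root plus `images/<split>` and `labels/<split>` folders); `missing_yolo_paths` is a hypothetical helper written for this example, not part of FiftyOne.

```python
from pathlib import Path


def missing_yolo_paths(export_dir, splits=("train", "val")):
    """Return the expected files/folders that are absent from `export_dir`."""
    root = Path(export_dir)
    expected = [root / "dataset.yaml"]
    for split in splits:
        expected.append(root / "images" / split)
        expected.append(root / "labels" / split)
    return [str(p) for p in expected if not p.exists()]


missing = missing_yolo_paths("./arcade_yolo")
if missing:
    print("Missing from export:", missing)
else:
    print("Export layout looks complete")
```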

### 🏋️ Fine-Tune the YOLOv8 Model (Run in Terminal)

Now that the dataset is prepared and exported in YOLO format, we fine-tune a YOLOv8 model using the [Ultralytics CLI](https://docs.ultralytics.com/cli/train/).

The command below uses the `yolov8n.pt` (YOLOv8 Nano) pretrained model and trains it on our exported dataset:

```bash
yolo task=detect mode=train \
  model=yolov8n.pt \
  data=arcade_yolo/dataset.yaml \
  epochs=60 imgsz=640 batch=16
```

For more details on the fine-tuning process with Ultralytics, see this [documentation](https://docs.voxel51.com/tutorials/yolov8.html?highlight=fine%20tune).

In [None]:
!yolo task=detect mode=train model=yolov8n.pt data=arcade_yolo_subset/dataset.yaml epochs=60 imgsz=640 batch=16
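
The same training run can also be started from Python with the `YOLO` class imported at the top of the notebook. This is a sketch rather than the workshop's own method: `build_train_args` and `finetune` are hypothetical helpers, and the `data` path assumes the `dataset.yaml` written to the export directory.

```python
def build_train_args(data_yaml, epochs=60, imgsz=640, batch=16):
    """Collect the same settings the CLI command passes as key=value pairs."""
    return {"data": data_yaml, "epochs": epochs, "imgsz": imgsz, "batch": batch}


def finetune(data_yaml, weights="yolov8n.pt"):
    # Imported here so the settings helper works without Ultralytics installed
    from ultralytics import YOLO

    model = YOLO(weights)  # load the pretrained YOLOv8 Nano checkpoint
    return model.train(**build_train_args(data_yaml))


train_args = build_train_args("arcade_yolo/dataset.yaml")
print(train_args)
# finetune("arcade_yolo/dataset.yaml")  # uncomment to start training
```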

### Run Inference with Fine-Tuned YOLOv8 Model

Once training is complete, we can use the best weights to run inference on the validation set. The following command uses the `yolo` CLI to perform predictions. Replace the `model` path with the `best.pt` produced by your own training run:

```bash
yolo task=detect mode=predict \
  model=/path/ultralyrics/results/runs/detect/train2/weights/best.pt \
  source=arcade_yolo_subset/images/val \
  save_txt=True \
  save_conf=True
```

In [None]:
!yolo task=detect mode=predict model=/path/ultralyrics/results/runs/detect/train2/weights/best.pt source=arcade_yolo_subset/images/val save_txt=True save_conf=True
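
With `save_txt=True` and `save_conf=True`, each prediction file contains one line per detection: the class index, the normalized box (center x, center y, width, height), and the confidence. The sketch below parses one such line; `parse_prediction_line` is a hypothetical helper and the sample values are made up for illustration.

```python
def parse_prediction_line(line):
    """Split one YOLO prediction line into class index, box, and confidence."""
    values = line.split()
    class_index = int(values[0])
    x_center, y_center, width, height = (float(v) for v in values[1:5])
    confidence = float(values[5])
    return class_index, (x_center, y_center, width, height), confidence


# Made-up line for illustration only
sample_line = "0 0.5 0.5 0.2 0.1 0.87"
class_index, box, confidence = parse_prediction_line(sample_line)
print(class_index, box, confidence)
```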

### Load Predictions into FiftyOne

After running inference, we now load the predictions into our FiftyOne dataset for visualization and evaluation.

### Step 1: Load the validation split of the YOLOv5-style dataset

We use FiftyOne's `Dataset.from_dir()` method to load the validation set from the `arcade_yolo_subset` directory. The dataset is in `YOLOv5Dataset` format. We assign the labels to the `ground_truth` field for later comparison.

This allows us to later visualize both the ground truth and the predicted detections in the same FiftyOne session.


In [None]:
import fiftyone as fo
import fiftyone.utils.yolo as fouy

# Load original dataset
dataset_ = fo.Dataset.from_dir(
    dataset_dir="arcade_yolo_subset",
    dataset_type=fo.types.YOLOv5Dataset,
    split="val",
    label_field="ground_truth",
    name="arcade_subset_test",
)

### Step 2: Define utility functions to read and convert YOLOv8 predictions

We define the following helper functions:

- `get_prediction_filepath`: builds the filepath to the YOLOv8 prediction `.txt` file corresponding to each image.
- `add_yolo_detections`: reads the prediction file for every sample and stores the converted detections in a new field.
- `read_yolo_detections_file`: reads a prediction `.txt` file and extracts bounding box data.
- `convert_yolo_detections_to_fiftyone`: converts YOLO-format detections into FiftyOne's `Detection` objects.
- `_uncenter_boxes`: converts YOLO box format (center x/y, width, height) to FiftyOne box format (top-left x/y, width, height), in place.
- `_get_class_labels`: maps YOLO class indices to class labels using the dataset's class list.

These functions are used to transform YOLOv8 predictions into a format that can be visualized and analyzed within FiftyOne.
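
The box conversion is the step that is easiest to get wrong, so here is a small worked example on a single box. `center_to_top_left` is a hypothetical standalone version written for illustration; the notebook's `_uncenter_boxes` applies the same arithmetic to a whole NumPy array.

```python
def center_to_top_left(box):
    """Convert [center x, center y, width, height] to [top-left x, top-left y, width, height]."""
    x_center, y_center, width, height = box
    return [x_center - width / 2, y_center - height / 2, width, height]


# A box centered in the image, a quarter of the width and half of the height
converted = center_to_top_left([0.5, 0.5, 0.25, 0.5])
print(converted)  # [0.375, 0.25, 0.25, 0.5]
```

All values stay normalized to the range 0 to 1, which is what both formats expect.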


In [None]:
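def get_prediction_filepath(filepath, run_number=1):
    # Ultralytics names prediction runs "predict", "predict2", "predict3", ...
    run_num_string = "" if run_number == 1 else str(run_number)
    # Strip the directory and the extension to get the image's base name
    filename = os.path.splitext(os.path.basename(filepath))[0]
    # Adjust this root to wherever your own `runs/` directory lives
    return f"/Users/paularamos/Documents/GitHub/awesome-fiftyone/runs/detect/predict{run_num_string}/labels/{filename}.txt"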

def add_yolo_detections(
    samples,
    prediction_field,
    prediction_filepath,
    class_list
    ):

    prediction_filepaths = samples.values(prediction_filepath)
    yolo_detections = [read_yolo_detections_file(pf) for pf in prediction_filepaths]
    detections =  [convert_yolo_detections_to_fiftyone(yd, class_list) for yd in yolo_detections]
    samples.set_values(prediction_field, detections)

def read_yolo_detections_file(filepath):
    # Images with no predictions have no .txt file, so return an empty array
    if not os.path.exists(filepath):
        return np.array([])

    detections = []
    with open(filepath) as f:
        for line in f:
            values = line.split()
            if values:  # skip blank lines
                detections.append([float(v) for v in values])
    return np.array(detections)

def convert_yolo_detections_to_fiftyone(
    yolo_detections,
    class_list
    ):

    detections = []
    if yolo_detections.size == 0:
        return fo.Detections(detections=detections)

    boxes = yolo_detections[:, 1:-1]
    _uncenter_boxes(boxes)

    confs = yolo_detections[:, -1]
    labels = _get_class_labels(yolo_detections[:, 0], class_list)

    for label, conf, box in zip(labels, confs, boxes):
        detections.append(
            fo.Detection(
                label=label,
                bounding_box=box.tolist(),
                confidence=float(conf),  # cast the NumPy float to a plain Python float
            )
        )

    return fo.Detections(detections=detections)

def _uncenter_boxes(boxes):
    '''convert from center coords to corner coords'''
    boxes[:, 0] -= boxes[:, 2]/2.
    boxes[:, 1] -= boxes[:, 3]/2.

def _get_class_labels(predicted_classes, class_list):
    # Class indices are read from the .txt file as floats; cast before indexing
    indices = predicted_classes.astype(int)
    return [class_list[i] for i in indices]

### Step 3: Load YOLOv8 predictions and add them to the dataset

We generate the list of prediction filepaths for the images in the validation split using `get_prediction_filepath`.

Then, we use `add_yolo_detections()` to load predictions from disk, convert them to FiftyOne `Detection` objects, and attach them to each sample under the field `yolov8n_arcade`.

This prepares the dataset for qualitative or quantitative evaluation of YOLOv8 predictions within FiftyOne.


In [None]:
import numpy as np

filepaths = dataset_.values("filepath")
prediction_filepaths = [get_prediction_filepath(fp, run_number=2) for fp in filepaths]

dataset_.set_values(
    "yolov8n_arcade_det_filepath",
    prediction_filepaths
)

add_yolo_detections(
    dataset_,
    "yolov8n_arcade",
    "yolov8n_arcade_det_filepath",
    classes
)



In [None]:
session = fo.launch_app(dataset_, port=5151, auto=False)

### Evaluating YOLOv8 Results with FiftyOne Plugins

FiftyOne supports powerful evaluation capabilities through its plugin system. With the [`@voxel51/evaluation`](https://github.com/voxel51/fiftyone-plugins) plugin, you can evaluate detection, classification, segmentation, and regression models directly from the UI or Python SDK.

#### 🔌 Plugin Installation

To install the evaluation plugin, run the following command in your terminal:

```bash
fiftyone plugins download \
  https://github.com/voxel51/fiftyone-plugins \
  --plugin-names @voxel51/evaluation
```

### Run Detection Evaluation on YOLOv8 Predictions

Now that our YOLOv8 predictions are added to the dataset, we can evaluate them against the ground truth using FiftyOne's built-in `evaluate_detections` method.

In this example, we disable mAP computation (`compute_mAP=False`) and rely on the per-class precision, recall, and F1-score printed by `print_report()`. Because inference was run with `save_conf=True`, the loaded predictions do include confidence scores, so you can set `compute_mAP=True` if you also want mAP.

```python
results = dataset_.evaluate_detections(
    "yolov8n_arcade",          # field with model predictions
    gt_field="ground_truth",   # ground truth field
    eval_key="eval_no_conf",   # identifier for this evaluation run
    compute_mAP=False          # skip mAP; report precision/recall/F1 only
)

results.print_report()
```

In [None]:
results = dataset_.evaluate_detections(
    "yolov8n_arcade",
    gt_field="ground_truth",
    eval_key="eval_no_conf",
    compute_mAP=False,  # skip mAP; set to True to compute it as well
)
results.print_report()
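
To build intuition for the numbers in the report, the sketch below shows how precision and recall follow from IoU-based matching between predicted and ground truth boxes. It is a simplified illustration and not FiftyOne's evaluation code, which also accounts for class labels and other details; `iou`, `match_counts`, the 0.5 threshold, and the toy boxes are assumptions chosen for the example.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as [top-left x, top-left y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def match_counts(gt_boxes, pred_boxes, iou_thresh=0.5):
    """Greedy one-to-one matching; returns (true pos, false pos, false neg)."""
    unmatched_gt = list(gt_boxes)
    tp = 0
    for pred in pred_boxes:
        # Pick the still-unmatched ground truth box that overlaps this prediction most
        best = max(unmatched_gt, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_thresh:
            unmatched_gt.remove(best)
            tp += 1
    fp = len(pred_boxes) - tp
    fn = len(unmatched_gt)
    return tp, fp, fn


# Toy example: one ground truth box, one overlapping and one stray prediction
gt = [[0.0, 0.0, 0.5, 0.5]]
preds = [[0.0, 0.0, 0.5, 0.25], [0.5, 0.5, 0.25, 0.25]]
tp, fp, fn = match_counts(gt, preds)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn, precision, recall)  # 1 1 0 0.5 1.0
```

The first prediction overlaps the ground truth with an IoU of exactly 0.5 and counts as a true positive; the second overlaps nothing and counts as a false positive.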


In [None]:
dataset_.reload()
# Keep the dataset in the FiftyOne database after this session ends
dataset_.persistent = True
print(dataset_)