Real-time detection and tracking of laparoscopic instruments using YOLOv8s and ByteTracker. Each instrument receives a persistent ID across video frames, so you can follow individual tools throughout a procedure, even when they briefly leave and re-enter the frame.
- `results/demo/demo_detections.mp4`: annotated detection video
- `results/tracking/images_tracked.mp4`: ByteTracker video with persistent instrument IDs
| Area | Detail |
|---|---|
| Computer Vision | YOLOv8s fine-tuning on a domain-specific 16-class surgical dataset |
| Multi-object tracking | ByteTracker integration for persistent instrument IDs across video frames |
| Data engineering | Roboflow API download, split rebalancing (25 → 442 train images), test-set reuse |
| Model selection | Benchmarked YOLOv8n → YOLOv8s; 0.301 → 0.801 mAP50 with correct split + larger model |
| MLOps | Early-stopping (patience 15), per-class mAP reporting, weight management |
| Python packaging | Clean CLI scripts with argparse, pathlib, OpenCV video pipeline |
| Medical AI | Cholecystectomy instrument + anatomy detection (16 classes, CC BY 4.0 dataset) |
Trained on Roboflow surgery-database/laparoscopic-cholecystectomy v11 — 442 train / 10 val images, 16 classes.
Early stopped at epoch 31 (patience 15). Total training time: 1.6 hrs on CPU (AMD Ryzen 5 9600X).
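To make the patience rule concrete, here is a minimal sketch of patience-based early stopping. It is a generic illustration, not Ultralytics' own implementation, and the validation curve is hypothetical (the run's per-epoch metrics are not published here). It shows one schedule that would be consistent with stopping at epoch 31 under patience 15: the best score lands at epoch 16 and nothing improves for the next 15 epochs.

```python
def early_stop(scores, patience):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    stop_epoch is None if the patience budget is never exhausted.
    """
    best_score = float("-inf")
    best_epoch = -1
    for epoch, score in enumerate(scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
        # Stop once `patience` epochs have passed without a new best.
        if epoch - best_epoch >= patience:
            return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation curve: improves for 16 epochs, then plateaus lower.
scores = [0.05 * (i + 1) for i in range(16)] + [0.70] * 34
stop_epoch, best_epoch = early_stop(scores, patience=15)
print(stop_epoch + 1, best_epoch + 1)  # 1-indexed epochs; prints: 31 16
```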
| Class | mAP50 | mAP50-95 |
|---|---|---|
| Cautery | 0.830 | 0.731 |
| Common bile duct | 0.995 | 0.602 |
| Cystic artery | 0.588 | 0.326 |
| Cystic duct | 0.840 | 0.461 |
| Duodenum | 0.995 | 0.863 |
| Gallbladder | 0.913 | 0.695 |
| Grasper | 0.663 | 0.613 |
| Laparoscopic clip applier | 0.995 | 0.995 |
| Liver | 0.000 | 0.000 |
| Omentum | 0.995 | 0.697 |
| Right colon | 0.995 | 0.696 |
| Overall | 0.801 | 0.607 |
Liver scores 0.000: the validation split contains only 3 Liver instances, and the class looks very similar to Omentum. More diverse images are needed.
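As a sanity check on the table above, the Overall row can be reproduced as the unweighted mean of the 11 listed per-class scores. The short sketch below uses only the numbers from the table; the variable names are illustrative. It also shows how much the single Liver zero pulls the average down.

```python
# (mAP50, mAP50-95) per class, copied from the results table.
per_class = {
    "Cautery": (0.830, 0.731),
    "Common bile duct": (0.995, 0.602),
    "Cystic artery": (0.588, 0.326),
    "Cystic duct": (0.840, 0.461),
    "Duodenum": (0.995, 0.863),
    "Gallbladder": (0.913, 0.695),
    "Grasper": (0.663, 0.613),
    "Laparoscopic clip applier": (0.995, 0.995),
    "Liver": (0.000, 0.000),
    "Omentum": (0.995, 0.697),
    "Right colon": (0.995, 0.696),
}


def mean_scores(scores):
    """Unweighted mean of (mAP50, mAP50-95) over the given classes."""
    n = len(scores)
    return (sum(a for a, _ in scores.values()) / n,
            sum(b for _, b in scores.values()) / n)


overall = mean_scores(per_class)
without_liver = mean_scores({k: v for k, v in per_class.items() if k != "Liver"})
print(f"{overall[0]:.3f} {overall[1]:.3f}")              # 0.801 0.607
print(f"{without_liver[0]:.3f} {without_liver[1]:.3f}")  # 0.881 0.668
```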
| Model | Train images | mAP50 | mAP50-95 |
|---|---|---|---|
| YOLOv8n (baseline) | 25 | 0.301 | 0.198 |
| YOLOv8s (final) | 442 | 0.801 | 0.607 |
| Δ | +417 | +166% | +207% |
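The Δ row reports an absolute difference for the image count and relative changes for the two metrics. The arithmetic can be checked with a few lines (the helper name `pct_change` is illustrative). Note that both the split and the model size changed between the two rows, so the gain cannot be attributed to either factor alone.

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


print(442 - 25)                          # 417 additional training images
print(round(pct_change(0.301, 0.801)))   # 166 (mAP50)
print(round(pct_change(0.198, 0.607)))   # 207 (mAP50-95)
```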
dataset_setup.py # Download & verify dataset from Roboflow
train.py # Fine-tune YOLOv8 (any size n/s/m/l/x)
track.py # YOLOv8 + ByteTracker on video / image folder / webcam
make_demo_video.py # Build annotated MP4 + still frames from an image folder
data.yaml # Hand-crafted 7-class config (for custom splits)
dataset/data.yaml # Roboflow-generated 16-class config (used for training)
requirements.txt
results/
  demo/
    demo_detections.mp4 # Annotated detection video
    results_grid.jpg # 2×3 contact sheet
    stills/ # 6 individual annotated frames
  tracking/
    images_tracked.mp4 # ByteTracker video with persistent IDs
    images_log.csv # Per-frame CSV: frame,track_id,class,conf,x1,y1,x2,y2
pip install -r requirements.txt
# 1 — Download dataset (Roboflow API key required)
python dataset_setup.py --api-key YOUR_KEY
# 2 — Train (reproduces final model)
python train.py --model yolov8s.pt --epochs 50 --patience 15 --device cpu
# 3 — Track instruments in a video
python track.py --source surgery.mp4
# 4 — Build demo MP4 + stills from a folder of images
python make_demo_video.py --images-dir dataset/valid/images

`track.py` uses `model.track(tracker="bytetrack.yaml", persist=True)`. ByteTracker is bundled with Ultralytics, so no extra install is needed. `persist=True` keeps the tracker's association state between calls, so IDs survive momentary occlusions.
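The tracking call described above can be sketched as a frame-by-frame loop. This is a minimal illustration, not the actual contents of `track.py`: the function names `boxes_to_rows` and `track_video` are hypothetical, and the weights path is left to the caller. It relies on the Ultralytics behaviour that `boxes.id` is `None` on frames without any tracked object.

```python
def boxes_to_rows(frame_idx, boxes, names):
    """Convert one frame's tracked boxes into log rows:
    (frame, track_id, class, conf, x1, y1, x2, y2)."""
    if boxes is None or boxes.id is None:  # no tracked objects on this frame
        return []
    rows = []
    for tid, cls, conf, xyxy in zip(boxes.id.tolist(), boxes.cls.tolist(),
                                    boxes.conf.tolist(), boxes.xyxy.tolist()):
        x1, y1, x2, y2 = (int(round(v)) for v in xyxy)
        rows.append((frame_idx, int(tid), names[int(cls)], round(conf, 3),
                     x1, y1, x2, y2))
    return rows


def track_video(source, weights):
    """Yield log rows for every frame of a video (hypothetical helper)."""
    # Imported lazily so boxes_to_rows can be used without these packages.
    import cv2
    from ultralytics import YOLO

    model = YOLO(weights)
    cap = cv2.VideoCapture(source)
    frame_idx = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # persist=True carries track state over from the previous call.
            result = model.track(frame, tracker="bytetrack.yaml",
                                 persist=True, verbose=False)[0]
            yield from boxes_to_rows(frame_idx, result.boxes, result.names)
            frame_idx += 1
    finally:
        cap.release()
```

Calling `model.track` once per frame is what makes `persist=True` matter: without it, each call would start with an empty tracker and IDs would not carry over.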
Optional --save-csv exports a per-frame log:
frame, track_id, class, conf, x1, y1, x2, y2
0, 1, Duodenum, 0.966, 266, 245, 522, 358
0, 2, Gallbladder, 0.934, 93, 3, 255, 196
0, 3, Common bile duct, 0.871, 374, 158, 447, 274
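A log in this format is straightforward to post-process. The sketch below counts how many frames each track appears in, which is a first step toward time-in-frame analytics. The function name `frames_per_track` is illustrative, and the input is just the three sample rows shown above. `skipinitialspace=True` lets the reader accept the log with or without spaces after the commas.

```python
import csv
import io
from collections import defaultdict


def frames_per_track(csv_file):
    """Count the number of logged frames for each (track_id, class) pair."""
    counts = defaultdict(int)
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    for row in reader:
        counts[(int(row["track_id"]), row["class"].strip())] += 1
    return dict(counts)


# The three sample rows from the log excerpt above.
sample = io.StringIO(
    "frame, track_id, class, conf, x1, y1, x2, y2\n"
    "0, 1, Duodenum, 0.966, 266, 245, 522, 358\n"
    "0, 2, Gallbladder, 0.934, 93, 3, 255, 196\n"
    "0, 3, Common bile duct, 0.871, 374, 158, 447, 274\n"
)
counts = frames_per_track(sample)
print(counts)
```

Dividing each count by the video's frame rate would convert it to seconds on screen; the frame rate is not recorded in the log, so it would have to come from the source video.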
| ID | Name | Type |
|---|---|---|
| 0 | Cautery | Instrument |
| 1 | Common bile duct | Anatomy |
| 2 | Cystic Plate | Anatomy |
| 3 | Cystic artery | Anatomy |
| 4 | Cystic duct | Anatomy |
| 5 | Duodenum | Anatomy |
| 6 | Gallbladder | Anatomy |
| 7 | Grasper | Instrument |
| 8 | Laparoscopic clip | Instrument |
| 9 | Laparoscopic clip applier | Instrument |
| 10 | Laparoscopic scissors | Instrument |
| 11 | Liver | Anatomy |
| 12 | Omentum | Anatomy |
| 13 | Plane of dissection | Anatomy |
| 14 | Right colon | Anatomy |
| 15 | Triangle of Calot | Anatomy |
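Because the model detects both instruments and anatomy, it can be useful to restrict tracking to instrument classes only. The sketch below encodes the table above as a dictionary and derives the instrument IDs. The names `CLASSES` and `INSTRUMENT_IDS` are illustrative and not taken from the repository.

```python
# Class ID -> (name, type), copied from the class table.
CLASSES = {
    0: ("Cautery", "Instrument"),
    1: ("Common bile duct", "Anatomy"),
    2: ("Cystic Plate", "Anatomy"),
    3: ("Cystic artery", "Anatomy"),
    4: ("Cystic duct", "Anatomy"),
    5: ("Duodenum", "Anatomy"),
    6: ("Gallbladder", "Anatomy"),
    7: ("Grasper", "Instrument"),
    8: ("Laparoscopic clip", "Instrument"),
    9: ("Laparoscopic clip applier", "Instrument"),
    10: ("Laparoscopic scissors", "Instrument"),
    11: ("Liver", "Anatomy"),
    12: ("Omentum", "Anatomy"),
    13: ("Plane of dissection", "Anatomy"),
    14: ("Right colon", "Anatomy"),
    15: ("Triangle of Calot", "Anatomy"),
}

INSTRUMENT_IDS = [i for i, (_, kind) in CLASSES.items() if kind == "Instrument"]
print(INSTRUMENT_IDS)  # [0, 7, 8, 9, 10]
```

Ultralytics prediction and tracking calls accept a `classes` argument that filters results by class ID, so a list like this could be passed as `classes=INSTRUMENT_IDS`. Whether `track.py` exposes such a filter is not stated here.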
| Flag | Default | Notes |
|---|---|---|
| `--model` | `yolov8s.pt` | n/s/m/l/x; larger = more accurate |
| `--epochs` | 50 | |
| `--patience` | 15 | Early-stop if val mAP stalls |
| `--batch` | 8 | Reduce for low-RAM CPUs |
| `--device` | `cpu` | Set to `0` for NVIDIA GPU |
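For reference, the flags and defaults listed above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the table, not the actual source of `train.py`; the function name `build_parser` and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Argument parser mirroring the documented training flags (sketch)."""
    p = argparse.ArgumentParser(description="Fine-tune YOLOv8 on the surgical dataset")
    p.add_argument("--model", default="yolov8s.pt",
                   help="pretrained weights: yolov8n/s/m/l/x")
    p.add_argument("--epochs", type=int, default=50,
                   help="maximum number of training epochs")
    p.add_argument("--patience", type=int, default=15,
                   help="epochs without improvement before early stopping")
    p.add_argument("--batch", type=int, default=8,
                   help="batch size; reduce on low-RAM machines")
    p.add_argument("--device", default="cpu",
                   help="'cpu', or a CUDA device index such as '0'")
    return p


args = build_parser().parse_args([])  # empty list -> all defaults
print(args.model, args.epochs, args.patience, args.batch, args.device)
# prints: yolov8s.pt 50 15 8 cpu
```

The parsed values correspond to the `epochs`, `patience`, `batch` and `device` keyword arguments of the Ultralytics `train` method, which is presumably where they end up.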
Roboflow surgery-database/laparoscopic-cholecystectomy v11 — CC BY 4.0
452 total images of cholecystectomy (gallbladder removal) procedures.
Dataset page
- Python ≥ 3.9 (type hints in `make_demo_video.py` use `list[T]` syntax)
- PyTorch: install from pytorch.org to match your hardware (CPU / CUDA)

pip install -r requirements.txt

| Package | Min version | Purpose |
|---|---|---|
| `ultralytics` | 8.2.0 | YOLOv8 detection + ByteTracker |
| `roboflow` | 1.0.0 | Dataset download |
| `opencv-python` | 4.8.0 | Video I/O and annotation |
| `torch` | 2.0.0 | Model backend |
| `torchvision` | 0.15.0 | Image transforms |
| `numpy` | 1.24.0 | Array ops |
- GPU support: test and document CUDA training path (`--device 0`)
- CholecT50 migration: swap to the 50-video dataset for stronger generalisation
- ONNX / TensorRT export: real-time inference on edge hardware
- Per-instrument time-in-frame analytics: use the tracking CSV to chart instrument usage over a procedure
- Larger model benchmarks: compare YOLOv8m / YOLOv8l mAP vs. inference speed
- BoT-SORT tracker option: alternative to ByteTracker for re-ID after long occlusions
This project is for research and educational purposes only and does not constitute medical advice.






