## Problem Description

Bone fractures are common injuries that require accurate, timely diagnosis. Manual inspection of X-ray images by radiologists can be time-consuming and subject to human error. This project aims to automate fracture detection in X-ray images using deep learning and object detection techniques.

We use a labeled dataset of X-ray images with bounding boxes drawn around the fractures, which lets us train an object detection model based on the YOLOv8 architecture.
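For readers new to the label format: a YOLO detection label is typically one text line per box, `class x_center y_center width height`, with all four coordinates normalized to the image size. The sketch below assumes that five-field layout (this dataset's label files may differ, for example if they store polygons), and `yolo_to_pixel_box` is a hypothetical helper name, not part of any library.

```python
def yolo_to_pixel_box(label_line, img_width, img_height):
    """Convert one 'class xc yc w h' label line to (class_id, x1, y1, x2, y2) in pixels."""
    fields = label_line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    class_id = int(fields[0])
    xc, yc, w, h = (float(v) for v in fields[1:])
    # Normalized center/size -> pixel corner coordinates
    x1 = (xc - w / 2) * img_width
    y1 = (yc - h / 2) * img_height
    x2 = (xc + w / 2) * img_width
    y2 = (yc + h / 2) * img_height
    return class_id, round(x1), round(y1), round(x2), round(y2)


print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))  # (0, 240, 120, 400, 360)
```

The corner form is what drawing functions such as `cv2.rectangle` expect, so this conversion is usually the first step when overlaying labels on an image.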

## Import Libraries

In [20]:
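import os
from pathlib import Path  # used by the preprocessing and prediction cells
import cv2
import matplotlib.pyplot as plt
import pandas as pd  # used to read results.csv
from IPython.display import Image, display  # used to show predicted images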

## EDA

In [26]:
# Define the base directory (update this to your environment as needed)
base_dir = "/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8"
train_dir = os.path.join(base_dir, "train", "images")
label_dir = os.path.join(base_dir, "train", "labels")

In [25]:
# List the training image files (.jpg or .png, any letter case)
image_files = [f for f in os.listdir(train_dir) if f.lower().endswith(('.jpg', '.png'))]
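Listing the images is a natural point to check that each one has a label file. The sketch below is one possible check, assuming the usual YOLO convention that an image and its label share a file stem (`name.jpg` and `name.txt`); `find_unlabeled_images` and `IMAGE_EXTS` are hypothetical names introduced for illustration.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_unlabeled_images(image_dir, label_dir):
    """Return sorted names of images that have no same-stem .txt file in label_dir."""
    label_stems = {p.stem for p in Path(label_dir).glob("*.txt")}
    return sorted(
        p.name
        for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_EXTS and p.stem not in label_stems
    )
```

Calling `find_unlabeled_images(train_dir, label_dir)` with the directories defined above would return an empty list if every training image is paired with a label file.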

### Image Preprocessing

In [5]:
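def apply_clahe_to_folder(img_folder_path, out_folder_path):
    """Write a CLAHE contrast-enhanced grayscale copy of every .jpg in a folder."""
    img_folder = Path(img_folder_path)
    out_folder = Path(out_folder_path)
    out_folder.mkdir(parents=True, exist_ok=True)

    # Contrast Limited Adaptive Histogram Equalization, applied on an 8x8 tile grid
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))

    for img_path in img_folder.glob("*.jpg"):
        img = cv2.imread(str(img_path), cv2.IMREAD_GRAYSCALE)
        if img is None:
            # cv2.imread returns None for unreadable files; report them instead of skipping silently
            print(f"Skipped unreadable image: {img_path}")
            continue
        cl = clahe.apply(img)
        out_path = out_folder / img_path.name
        cv2.imwrite(str(out_path), cl)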

In [6]:
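# Same dataset root as base_dir in the EDA section (absolute path, so it resolves from any working directory)
original_base = "/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8"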
output_base = "preprocessed"

In [7]:
# Apply CLAHE to train/valid/test images
apply_clahe_to_folder(f"{original_base}/train/images", f"{output_base}/train/images")
apply_clahe_to_folder(f"{original_base}/valid/images", f"{output_base}/valid/images")
apply_clahe_to_folder(f"{original_base}/test/images", f"{output_base}/test/images")
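If an input path is wrong, `glob` yields nothing, so the calls above finish without error and leave empty output folders. A quick count per split would catch that. The sketch below is one way to do it; `count_files` and `compare_split_counts` are hypothetical helper names, and the default `*.jpg` pattern is an assumption that mirrors the pattern used in `apply_clahe_to_folder`.

```python
from pathlib import Path


def count_files(folder, pattern="*.jpg"):
    """Count files matching pattern; a missing folder counts as 0."""
    folder = Path(folder)
    return len(list(folder.glob(pattern))) if folder.is_dir() else 0


def compare_split_counts(src_base, dst_base, splits=("train", "valid", "test")):
    """Return {split: (n_source, n_output)} for the images folder of each split."""
    return {
        split: (
            count_files(Path(src_base) / split / "images"),
            count_files(Path(dst_base) / split / "images"),
        )
        for split in splits
    }
```

Running `compare_split_counts(original_base, output_base)` after preprocessing should give matching, non-zero pairs. A `(0, 0)` entry would suggest the source path does not exist from the current working directory.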

In [8]:
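# NOTE: nc/names must match the class ids used in the label files. The validation
# output later in this notebook lists several class names, so a single
# 'fracture' class may not match the labels as downloaded.
data_yaml_content = """
train: preprocessed/train/images
val: preprocessed/valid/images

nc: 1
names: ['fracture']
"""

with open("preprocessed/data.yaml", "w") as f:
    f.write(data_yaml_content.strip())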
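Because `nc` and `names` have to agree with the class ids stored in the label files, it can help to count the ids before training on a hand-written `data.yaml`. The sketch below assumes each label line starts with an integer class id; `count_class_ids` and `ids_outside_range` are hypothetical helper names.

```python
from collections import Counter
from pathlib import Path


def count_class_ids(label_dir):
    """Count how often each class id (first field of each line) appears in label_dir."""
    counts = Counter()
    for label_path in Path(label_dir).glob("*.txt"):
        for line in label_path.read_text().splitlines():
            fields = line.split()
            if fields:  # ignore blank lines
                counts[int(fields[0])] += 1
    return counts


def ids_outside_range(counts, nc):
    """Return the class ids that are not valid for a dataset declared with nc classes."""
    return sorted(i for i in counts if not 0 <= i < nc)
```

For example, `ids_outside_range(count_class_ids(label_dir), nc=1)` returns the ids that a one-class configuration could not represent. An empty list means the labels fit the declared `nc`.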

## Model Training

In [9]:
!yolo task=detect mode=train model=yolov8n.pt data=/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8/data.yaml epochs=25 imgsz=640 batch=16

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8n.pt to 'yolov8n.pt'...
100%|██████████████████████████████████████| 6.25M/6.25M [00:00<00:00, 83.0MB/s]
Ultralytics 8.3.156 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=25, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4,

## Model Validation

In [10]:
!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data=/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8/data.yaml

Ultralytics 8.3.156 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 72 layers, 3,007,013 parameters, 0 gradients, 8.1 GFLOPs
[34m[1mval: [0mFast image access ✅ (ping: 0.0±0.0 ms, read: 8.0±1.1 MB/s, size: 9.5 KB)
[34m[1mval: [0mScanning /kaggle/input/bone-fracture-detection-computer-vision-project/Bone[0m
                 Class     Images  Instances      Box(P          R      mAP50  m
  xa[xa < 0] = -1
  xa[xa < 0] = -1
                   all        348        204      0.315      0.288      0.285      0.105
        elbow positive         28         29       0.13      0.172      0.069     0.0218
      fingers positive         41         48      0.296      0.208      0.222     0.0602
      forearm fracture         37         43      0.491      0.488      0.455      0.212
               humerus         31         36      0.575      0.583      0.633      0.197
     shoulder fracture         19         20      0.254      0.205       0.24      0.1

In [11]:
results_path = "runs/detect/train/results.csv"

# Load into pandas
results_df = pd.read_csv(results_path)

# Show final epoch's metrics
results_df.tail(1)

| | epoch | time | train/box_loss | train/cls_loss | train/dfl_loss | metrics/precision(B) | metrics/recall(B) | metrics/mAP50(B) | metrics/mAP50-95(B) | val/box_loss | val/cls_loss | val/dfl_loss | lr/pg0 | lr/pg1 | lr/pg2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 24 | 25 | 994.761 | 1.4869 | 1.37578 | 1.49593 | 0.35364 | 0.32408 | 0.28386 | 0.09057 | 2.43787 | 2.63582 | 2.30984 | 4.5e-05 | 4.5e-05 | 4.5e-05 |
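`tail(1)` shows the last epoch, which is not necessarily the best one. The validation run on `best.pt` above reports mAP50-95 of 0.105, higher than the last-epoch value of 0.0906, which suggests the saved best checkpoint came from an earlier epoch. The sketch below picks the best row by a chosen metric using only the standard library. `best_epoch_row` is a hypothetical helper, and ranking by `metrics/mAP50-95(B)` is an assumption about which metric matters most here.

```python
import csv
import io


def best_epoch_row(csv_text, metric="metrics/mAP50-95(B)"):
    """Return the results.csv row (as a dict of strings) with the highest value of metric."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Strip whitespace because some results.csv files pad column names and values
    rows = [{k.strip(): v.strip() for k, v in row.items()} for row in reader]
    if not rows:
        raise ValueError("no data rows found")
    return max(rows, key=lambda r: float(r[metric]))
```

With the path from the cell above, `best_epoch_row(open(results_path).read())["epoch"]` would give the epoch to compare against the final one.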


Precision (0.35), recall (0.32), mAP50 (0.28), and mAP50-95 (0.09) are all low: at the last epoch the model finds only about a third of the annotated fractures, and about two-thirds of its detections are false positives. We will try to make some improvements.

## Model Improvement

### Change Model Size

We will start by switching the model from YOLOv8n to YOLOv8s and increasing the number of epochs from 25 to 50.

In [12]:
!yolo task=detect mode=train model=yolov8s.pt data=/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8/data.yaml epochs=50 imgsz=640 batch=16

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8s.pt to 'yolov8s.pt'...
100%|███████████████████████████████████████| 21.5M/21.5M [00:00<00:00, 149MB/s]
Ultralytics 8.3.156 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=50, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4,

In [13]:
results_path = "runs/detect/train2/results.csv"

# Load into pandas
results_df = pd.read_csv(results_path)

# Show final epoch's metrics
results_df.tail(1)

| | epoch | time | train/box_loss | train/cls_loss | train/dfl_loss | metrics/precision(B) | metrics/recall(B) | metrics/mAP50(B) | metrics/mAP50-95(B) | val/box_loss | val/cls_loss | val/dfl_loss | lr/pg0 | lr/pg1 | lr/pg2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 49 | 50 | 3032.93 | 0.95342 | 0.62423 | 1.22729 | 0.28985 | 0.30012 | 0.26691 | 0.09543 | 2.79225 | 3.22024 | 3.20707 | 2.7e-05 | 2.7e-05 | 2.7e-05 |


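Precision, recall, and mAP50 all dropped relative to the YOLOv8n baseline (mAP50 fell from 0.284 to 0.267). At the same time the training losses fell while the validation losses rose, which suggests the larger model is overfitting. We will go back to YOLOv8n and add data augmentation.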

In [14]:
!yolo task=detect mode=train model=yolov8n.pt data=/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8/data.yaml epochs=50 imgsz=640 batch=16 hsv_h=0.015 hsv_s=0.7 hsv_v=0.4 flipud=0.5 mosaic=1.0 mixup=0.2 degrees=10

Ultralytics 8.3.156 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/kaggle/input/bone-fracture-detection-computer-vision-project/BoneFractureYolo8/data.yaml, degrees=10, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=50, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.5, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.2, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train3, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_m

In [15]:
results_path = "runs/detect/train3/results.csv"

# Load into pandas
results_df = pd.read_csv(results_path)

# Show final epoch's metrics
results_df.tail(1)

| | epoch | time | train/box_loss | train/cls_loss | train/dfl_loss | metrics/precision(B) | metrics/recall(B) | metrics/mAP50(B) | metrics/mAP50-95(B) | val/box_loss | val/cls_loss | val/dfl_loss | lr/pg0 | lr/pg1 | lr/pg2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 49 | 50 | 2302.54 | 1.63288 | 1.50366 | 1.61176 | 0.5005 | 0.29735 | 0.33914 | 0.11416 | 2.3789 | 2.58811 | 2.22771 | 2.7e-05 | 2.7e-05 | 2.7e-05 |


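Compared with the 25-epoch YOLOv8n baseline, precision rose from 0.35 to 0.50, mAP50 from 0.28 to 0.34, and mAP50-95 from 0.09 to 0.11, while recall slipped from 0.32 to 0.30. The results improved slightly overall, so we will use this model for predicting on the test set.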
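Since precision and recall moved in opposite directions, a single combined number can make the three runs easier to compare. The sketch below computes the F1 score (the harmonic mean of precision and recall) from the final-epoch values reported above. F1 is introduced here only as a convenience; it is not a metric the notebook itself reports.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Final-epoch validation precision and recall copied from the three results.csv rows
runs = {
    "yolov8n, 25 epochs": (0.35364, 0.32408),
    "yolov8s, 50 epochs": (0.28985, 0.30012),
    "yolov8n + augmentation, 50 epochs": (0.5005, 0.29735),
}
for name, (p, r) in runs.items():
    print(f"{name}: F1 = {f1_score(p, r):.3f}")
```

This gives roughly 0.34, 0.29, and 0.37 for the three runs, the same ordering as mAP50.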

## Predict on Test Set

In [19]:
!yolo task=detect mode=predict \
  model=runs/detect/train3/weights/best.pt \
  source=preprocessed/test/images \
  save=True

Ultralytics 8.3.156 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 72 layers, 3,007,013 parameters, 0 gradients, 8.1 GFLOPs

Traceback (most recent call last):
  File "/usr/local/bin/yolo", line 8, in <module>
    sys.exit(entrypoint())
             ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/ultralytics/cfg/__init__.py", line 985, in entrypoint
    getattr(model, mode)(**overrides)  # default args from model
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/ultralytics/engine/model.py", line 555, in predict
    return self.predictor.predict_cli(source=source) if is_cli else self.predictor(source=source, stream=stream)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/ultralytics/engine/predictor.py", line 247, in predict_cli
    for _ in gen:  # sourcery skip: remove-empty-nested-block, noqa
  File "/usr/local/lib/python3.11/dist-packages/torch

In [18]:
print(os.listdir("."))

['yolov8n.pt', 'yolo11n.pt', 'runs', 'preprocessed', '.virtual_documents', 'yolov8s.pt']


In [17]:
pred_dir = Path("runs/detect/predict")
pred_images = list(pred_dir.glob("*.jpg"))

# display 5 predicted images
for img_path in pred_images[:5]:
    display(Image(filename=str(img_path)))

## Conclusion

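This project involved building a deep learning object detection system to identify bone fractures in X-ray images using YOLOv8. The final model was trained using:

- YOLOv8n (nano) pretrained weights, after YOLOv8s (small) scored lower on validation
- Data augmentation (Mosaic, MixUp, HSV shifts, flips, rotations)
- 50 training epochs

A CLAHE preprocessing step was also written to enhance contrast, but the training commands point at the original `data.yaml`, so the models were trained on the unprocessed images.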

Despite these enhancements, the final-epoch validation performance of the selected model remained modest:

- **Precision:** 50.1%
- **Recall:** 29.7%
- **mAP@0.5:** 33.9%
- **mAP@0.5:0.95:** 11.4%

This suggests that either the model was unable to extract meaningful patterns from the dataset, or the data itself lacked the volume, quality, or diversity necessary for better generalization.

---

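### What Helped:
- A complete YOLO training pipeline with correct formatting and augmentations
- Adding flips, rotations, and MixUp to YOLOv8n with 50 epochs, which raised precision and mAP over the baseline

### What Didn’t Help:
- Upgrading to YOLOv8s did not improve results over the YOLOv8n baseline
- Augmentation did not improve recall, which slipped from 32.4% to 29.7%
- CLAHE preprocessing had no effect on the metrics, because training used the original images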

---

### Future Work:
- Use higher-resolution input (`imgsz=768` or 1024)
- Train longer (`epochs=100`)
- Try `yolov8m.pt` or `yolov8l.pt` for better capacity
- Apply stricter filtering of annotation quality
- Add more data or try semi-supervised learning

Although final performance was limited, this project demonstrates the full object detection workflow and provides a foundation for further experimentation with fracture detection models.
