# Bone Fracture Detection using Deep Learning

This Jupyter Notebook explores deep learning for bone fracture detection using an X-ray image dataset. The dataset is designed for computer vision projects and supports the development and evaluation of automated bone fracture detection algorithms.

## About the Dataset

The dataset encompasses X-ray images categorized into several classes, each representing a specific type of bone fracture within the upper extremities. These classes include:

*   Elbow Positive
*   Fingers Positive
*   Forearm Fracture
*   Humerus Fracture
*   Shoulder Fracture
*   Wrist Positive

Each image is annotated with either bounding boxes or pixel-level segmentation masks, precisely indicating the location and extent of the detected fracture. These annotations are crucial for training and evaluating bone fracture detection algorithms, particularly object detection models.
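
The label files loaded later in this notebook are parsed as YOLO-style rows of the form `class_id x_center y_center width height`, with coordinates normalized to the image size. As a minimal sketch of what such a row means in pixels, the hypothetical helper `yolo_to_pixel_box` below (not part of the dataset or any library) converts one row to corner coordinates; the row and the 640x480 image size are made up for illustration.

```python
def yolo_to_pixel_box(x_center, y_center, width, height, img_w, img_h):
    """Convert a normalized YOLO box to pixel corner coordinates."""
    x_min = (x_center - width / 2) * img_w
    y_min = (y_center - height / 2) * img_h
    x_max = (x_center + width / 2) * img_w
    y_max = (y_center + height / 2) * img_h
    return x_min, y_min, x_max, y_max


# Made-up example row: "class_id x_center y_center width height"
row = "2 0.5 0.5 0.25 0.5"
class_id, *coords = row.split()

# Assumed image size of 640x480 pixels, for illustration only
box = yolo_to_pixel_box(*map(float, coords), img_w=640, img_h=480)
print(int(class_id), box)
```

Because the coordinates are relative to the image size, the same row stays valid after the image is resized.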

The dataset is a resource for researchers and developers working on automated fracture detection. Its range of fracture classes supports training models that identify fractures in different regions of the upper extremities, with the broader goal of advancing computer vision tools for medical diagnostics and patient care.

**When using this dataset for your research, please cite it using the following DOI:** 10.13140/RG.2.2.14400.34569

**You can also find the dataset on ResearchGate:** [https://www.researchgate.net/publication/382268240_Bone_Fracture_Detection_Computer_Vision_Project](https://www.researchgate.net/publication/382268240_Bone_Fracture_Detection_Computer_Vision_Project)

## Imports

In [24]:

import tensorflow as tf
from tensorflow import keras
import cv2
import numpy as np
import matplotlib.pyplot as plt
tf.config.list_physical_devices('GPU'), tf.__version__

([PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')], '2.18.0')

## Download datasets

In [25]:
# Loading kaggle keys
from dotenv import load_dotenv
load_dotenv()

True

In [26]:
%load_ext dotenv
%dotenv

The dotenv extension is already loaded. To reload it, use:
  %reload_ext dotenv


In [27]:
from hydra import initialize, compose
# Reference for using Hydra's compose API inside a notebook:
# https://gist.github.com/bdsaglam/586704a98336a0cf0a65a6e7c247d248

with initialize(version_base=None, config_path="conf"):
    cfg = compose(config_name="config")
    print(cfg.DATASET_DIRS.TRAIN_DIR)

datasets/BoneFractureYolo8/train/


In [28]:
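import opendatasets as od
import os

# Download only when the dataset directory is missing or empty.
# (os.listdir raises FileNotFoundError if the directory does not exist.)
dataset_dir = cfg.DATASET.DATASET_DIR
if not os.path.isdir(dataset_dir) or len(os.listdir(dataset_dir)) == 0:
    od.download(dataset_id_or_url=cfg.DATASET.BONE_FRACTURE_DETECTION_DATASET_URL,
                data_dir=dataset_dir)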

## Loading Images

In [29]:
TRAIN_DIR = cfg.DATASET_DIRS.TRAIN_DIR
VALIDATION_DIR = cfg.DATASET_DIRS.VALIDATION_DIR
TEST_DIR = cfg.DATASET_DIRS.TEST_DIR

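# os.path.join avoids a doubled slash when the configured directory
# already ends with '/'
TRAIN_IMAGE = os.path.join(TRAIN_DIR, 'images')
TRAIN_LABELS = os.path.join(TRAIN_DIR, 'labels')

VALID_IMAGE = os.path.join(VALIDATION_DIR, 'images')
VALID_LABELS = os.path.join(VALIDATION_DIR, 'labels')

TEST_IMAGE = os.path.join(TEST_DIR, 'images')
TEST_LABELS = os.path.join(TEST_DIR, 'labels')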

IMG_SIZE = cfg.TRAIN.IMG_SIZE

In [30]:
def load_image(image_path):
    """Loads an image as a float32 RGB array scaled to [0, 1]."""
    img = cv2.imread(image_path)  # Use cv2 for more image format support
    if img is None:
        raise ValueError(f"Could not read image: {image_path}")
    # OpenCV reads images as BGR; convert to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Resize to a fixed square; normalized YOLO coordinates remain valid
    img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
    # Scale to [0, 1] and cast to float32 to match the dataset signature
    img = img.astype(np.float32) / 255.0
    return img

In [31]:
def load_yolo_labels(label_path):
    """Loads YOLO labels as a float32 array of shape (num_boxes, 5)."""
    try:
        with open(label_path, 'r') as f:
            lines = f.readlines()
    except FileNotFoundError:
        # No label file: return an empty (0, 5) array so the shape stays consistent
        return np.zeros((0, 5), dtype=np.float32)

    labels = []
    for line in lines:
        parts = line.strip().split()
        if not parts:
            continue  # ignore blank lines
        if len(parts) >= 5:  # check if the line has enough elements
            # Row layout: class_id x_center y_center width height,
            # with coordinates already normalized to [0, 1].
            # Only the first five fields are read; polygon (segmentation)
            # rows would need converting to a box first.
            class_id = int(parts[0])
            x_center = float(parts[1])
            y_center = float(parts[2])
            width = float(parts[3])
            height = float(parts[4])
            labels.append([class_id, x_center, y_center, width, height])
        else:
            print(f"Skipping malformed line in {label_path}: {line}")

    # reshape keeps the (num_boxes, 5) layout even when there are no boxes
    return np.array(labels, dtype=np.float32).reshape(-1, 5)
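
The dataset description mentions segmentation masks as well as bounding boxes. In the YOLO segmentation layout, a row is `class_id x1 y1 x2 y2 ...` with normalized polygon vertices, so reading fields 1 to 4 of such a row as a box would be wrong. A minimal sketch of one way to handle this, assuming the polygon rows follow that layout, is to take the polygon's bounding box. `polygon_to_yolo_box` is a hypothetical helper, and the example row is made up.

```python
def polygon_to_yolo_box(parts):
    """Convert one split YOLO segmentation row to [class_id, xc, yc, w, h]."""
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # A polygon needs at least three (x, y) pairs
    if len(coords) < 6 or len(coords) % 2 != 0:
        raise ValueError("expected at least three x y pairs")
    xs, ys = coords[0::2], coords[1::2]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Axis-aligned box that encloses the polygon, in normalized coordinates
    return [class_id,
            (x_min + x_max) / 2, (y_min + y_max) / 2,
            x_max - x_min, y_max - y_min]


# Made-up rectangle-shaped polygon with four vertices
row = "1 0.25 0.25 0.75 0.25 0.75 0.5 0.25 0.5".split()
box = polygon_to_yolo_box(row)
print(box)
```

A loader could call a helper like this for rows with more than five fields and keep the plain five-field path for box rows.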

In [None]:
def create_tf_dataset(image_files, label_files):
    """Creates a TensorFlow Dataset."""

    def generator():
        for image_path, label_path in zip(image_files, label_files):
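            try:
                image = load_image(image_path)
                labels = load_yolo_labels(label_path)

                # Convert labels to a (num_boxes, 5) float32 tensor; the reshape
                # keeps that layout even when an image has no boxes.
                labels_tf = tf.reshape(
                    tf.convert_to_tensor(labels, dtype=tf.float32), (-1, 5))

                yield image, labels_tf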
            except Exception as e:
                print(f"Error processing {image_path} or {label_path}: {e}")
                continue  # Skip to the next file

    dataset = tf.data.Dataset.from_generator(
        generator,
        output_signature=(
            tf.TensorSpec(shape=(IMG_SIZE, IMG_SIZE, 3),
                          dtype=tf.float32),  # Image shape
            # Labels shape (variable number of boxes)
            tf.TensorSpec(shape=(None, 5), dtype=tf.float32)
        )
    )

    return dataset

### Training Dataset setup

In [59]:

import pathlib

image_files = [str(f) for f in sorted(pathlib.Path(TRAIN_IMAGE).glob('*.jpg'))]
label_files = [str(f)
               for f in sorted(pathlib.Path(TRAIN_LABELS).glob('*.txt'))]

# Images and labels are paired by position in the sorted lists,
# so both lists must have the same length.
if len(image_files) != len(label_files):
    raise ValueError("Number of image files and label files do not match.")

# Create the dataset
dataset = create_tf_dataset(image_files, label_files)
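
Matching counts do not guarantee that each image lines up with its own label file, since a missing label for one image and a stray label for another would still pass the check. A minimal sketch of a stricter pairing, assuming each label file shares its file stem with its image (the usual YOLO convention), is shown below. `pair_by_stem` is a hypothetical helper, and the demo uses empty placeholder files in a temporary directory.

```python
import pathlib
import tempfile


def pair_by_stem(image_dir, label_dir, image_glob="*.jpg"):
    """Return (image_path, label_path) pairs whose file stems match."""
    labels = {p.stem: p for p in pathlib.Path(label_dir).glob("*.txt")}
    pairs = []
    for img in sorted(pathlib.Path(image_dir).glob(image_glob)):
        label = labels.get(img.stem)
        if label is not None:  # images without a label file are skipped
            pairs.append((str(img), str(label)))
    return pairs


# Tiny self-contained demo with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / "images").mkdir()
    (root / "labels").mkdir()
    for name in ("a.jpg", "b.jpg", "c.jpg"):
        (root / "images" / name).touch()
    for name in ("a.txt", "c.txt"):
        (root / "labels" / name).touch()

    pairs = pair_by_stem(root / "images", root / "labels")
    stems = [pathlib.Path(image).stem for image, _ in pairs]
    print(stems)
```

Whether unlabeled images should be skipped, as here, or kept as negative examples with no boxes is a design choice that depends on the training setup.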


In [58]:
# Batching, shuffling, prefetching (essential for training)
BATCH_SIZE = 32
dataset = dataset.shuffle(buffer_size=len(image_files))  # Shuffle the dataset
# Images carry different numbers of boxes, so plain batch() cannot stack the
# label tensors; padded_batch() zero-pads them to the longest in each batch.
dataset = dataset.padded_batch(BATCH_SIZE)
dataset = dataset.prefetch(
    buffer_size=tf.data.AUTOTUNE)  # Optimize data loading

print(dataset.take(1))
# for images, labels in dataset.take(1):  # take only one batch
#     print("Image shape:", images.shape)
#     print("Labels shape:", labels.shape)
#     print("Example labels:", labels.numpy())  # Access the label data

<_TakeDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, None, 5), dtype=tf.float32, name=None))>
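
The label spec `(None, None, 5)` reads as batch size, number of boxes per image, and five values per box. Since the number of boxes varies between images, the label arrays in a batch must be padded to a common length before they can be stacked. The NumPy sketch below illustrates that idea outside TensorFlow, assuming zero padding and a boolean mask to mark real boxes. `pad_labels` is a hypothetical helper, and the label values are made up.

```python
import numpy as np


def pad_labels(label_arrays, pad_value=0.0):
    """Stack (n_i, 5) label arrays into (batch, max_n, 5) plus a validity mask."""
    max_boxes = max((len(a) for a in label_arrays), default=0)
    batch = np.full((len(label_arrays), max_boxes, 5), pad_value,
                    dtype=np.float32)
    mask = np.zeros((len(label_arrays), max_boxes), dtype=bool)
    for i, arr in enumerate(label_arrays):
        arr = np.asarray(arr, dtype=np.float32).reshape(-1, 5)
        batch[i, :len(arr)] = arr
        mask[i, :len(arr)] = True  # True marks real boxes, False marks padding
    return batch, mask


# Made-up labels: two boxes, no boxes, and one box
labels_a = np.array([[0, 0.5, 0.5, 0.2, 0.2], [3, 0.1, 0.1, 0.05, 0.05]])
labels_b = np.zeros((0, 5))
labels_c = np.array([[5, 0.3, 0.7, 0.1, 0.4]])

batch, mask = pad_labels([labels_a, labels_b, labels_c])
print(batch.shape)
print(mask.sum(axis=1))
```

With zero padding, a padded row looks like a class 0 box of zero width and height, so a loss function would need a mask like this one (or a check for zero-sized boxes) to ignore it.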
