kneeseg is a Python reimplementation of the paper "Semantic Context Forests for Learning-Based Knee Cartilage Segmentation in 3D MR Images". [paper] [slides] [poster]
| | Original MICCAI Workshop Paper | This Implementation |
|---|---|---|
| Bone Segmentation | Active Shape Model (Siemens proprietary) | Dense Auto-Context Random Forest (Python) |
| Cartilage Segmentation | Semantic Context Forest (C++) | Semantic Context Forest (Python) |
| Dataset | Osteoarthritis Initiative (OAI) | SKI10 |
| Bone DSC | ~95% | ~92% |
| Cartilage DSC | ~83% | ~66% |
- Installation
- Dataset Support
- Pretrained Models
- Usage
- Configuration
- Experiments
- Algorithm Details
- Citation
You can install the package via pip:
```shell
pip install kneeseg
```

This package supports two major dataset structures for knee segmentation:
SKI10: The standard dataset used for the MICCAI 2010 challenge.
- Target structures: Femur, Tibia, Femoral Cartilage, Tibial Cartilage.
- Label Mapping:
  - 0: Background
  - 1: Femur
  - 2: Femoral Cartilage
  - 3: Tibia
  - 4: Tibial Cartilage
The Osteoarthritis Initiative dataset, refactored to extend the SKI10 schema.
- Target structures: Adds Patella and Patellar Cartilage.
- Label Mapping:
  - 0: Background
  - 1: Femur
  - 2: Femoral Cartilage
  - 3: Tibia
  - 4: Tibial Cartilage
  - 5: Patella
  - 6: Patellar Cartilage
Note: The pipeline adapts its behavior (number of classes, target structures) based on the `target_bones` configuration.
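The two label mappings and the `target_bones` note can be summarized as a small lookup. This is a minimal sketch for illustration, not kneeseg's internal code: the names `LABELS`, `BONE_TO_CARTILAGE`, and `active_labels` are hypothetical, and pairing each bone with its cartilage is an assumption based on the coupling described in the Configuration section.

```python
# Hypothetical lookup tables mirroring the label mappings listed above.
LABELS = {
    0: "background",
    1: "femur",
    2: "femoral cartilage",
    3: "tibia",
    4: "tibial cartilage",
    5: "patella",
    6: "patellar cartilage",
}

# Assumption: selecting a bone also selects its cartilage.
BONE_TO_CARTILAGE = {
    "femur": "femoral cartilage",
    "tibia": "tibial cartilage",
    "patella": "patellar cartilage",
}


def active_labels(target_bones):
    """Return {label_id: name} for background plus each target bone and its cartilage."""
    wanted = {"background"}
    for bone in target_bones:
        wanted.add(bone)
        wanted.add(BONE_TO_CARTILAGE[bone])
    return {idx: name for idx, name in LABELS.items() if name in wanted}


ski10 = active_labels(["femur", "tibia"])
oai = active_labels(["femur", "tibia", "patella"])
print(sorted(ski10))  # label ids used with the SKI10 schema
print(sorted(oai))    # label ids used with the OAI schema
```

With `["femur", "tibia"]` the sketch selects labels 0 to 4; adding `"patella"` extends it to all seven labels.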
We provide pretrained models for both the SKI10 and the downsampled OAI datasets on Hugging Face.
Models are trained on all 100 SKI10 training images, supporting segmentation of femur, tibia, femoral cartilage, and tibial cartilage:
Models are trained on 128 OAI training images, supporting segmentation of femur, tibia, patella, femoral cartilage, tibial cartilage, and patellar cartilage:
You can use kneeseg modules directly in your Python scripts to load data, train models, or run inference.
```python
import os
from kneeseg.io import load_volume, save_volume
from kneeseg.bone_rf import BoneClassifier

# 1. Load Data
# Data should be .mhd/.raw or .hdr/.img
image_path = 'data/image-001.mhd'
image, spacing = load_volume(image_path, return_spacing=True)

# 2. Initialize Model
# Example: Initialize the first-pass Bone Classifier
bone_rf = BoneClassifier(n_estimators=100, max_depth=25)

# 3. Predict (assuming pre-trained model loaded or trained)
# bone_rf.load('models/bone_rf_p1.joblib')
pred_mask, prob_map = bone_rf.predict(image)

# 4. Save Result
os.makedirs('output', exist_ok=True)  # make sure the output directory exists
save_volume(pred_mask, 'output/prediction.mhd', metadata={'spacing': spacing})
```

To use models you have trained or downloaded (e.g., from the Hugging Face release), use the `load()` method:
```python
from kneeseg.bone_rf import BoneClassifier
from kneeseg.rf_seg import CartilageClassifier

# 1. Initialize empty classifiers
# (Parameters must match training, or just use defaults if standard)
bone_p1 = BoneClassifier()
bone_p2 = BoneClassifier()
cart_p1 = CartilageClassifier()
cart_p2 = CartilageClassifier()

# 2. Load the weights
bone_p1.load("path/to/bone_rf_p1.joblib")
bone_p2.load("path/to/bone_rf_p2.joblib")
cart_p1.load("path/to/cartilage_rf_p1.joblib")
cart_p2.load("path/to/cartilage_rf_p2.joblib")

# 3. Predict (Example: Bone Pass 1)
# `image` is a volume loaded with load_volume(), as in the previous example
pred_p1, prob_p1 = bone_p1.predict(image)
```

The package provides a command-line interface, `kneeseg-pipeline`, to orchestrate the full training and inference workflow using the SKI10 dataset split.
Prerequisites:
- Data Structure: The pipeline expects two directories:
  - `.../images`: Contains `.mhd` image files.
  - `.../images_labels`: Contains `.mhd` label files (folder name must be `image_dir` + `_labels`).
- File Naming: Files must match the SKI10 naming convention (e.g., `image-001.mhd`, `labels-001.mhd`) as defined in `kneeseg/data/ski10_full_split.json`.
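The directory and naming rules above imply a simple mapping from an image path to its label path. The helper below is a hypothetical sketch (`label_path_for` is not part of kneeseg), and it assumes the `image-NNN.mhd` to `labels-NNN.mhd` correspondence suggested by the example file names.

```python
from pathlib import Path


def label_path_for(image_path):
    """Map .../images/image-001.mhd to .../images_labels/labels-001.mhd."""
    image_path = Path(image_path)
    # The label directory is a sibling of the image directory with a "_labels" suffix.
    label_dir = image_path.parent.with_name(image_path.parent.name + "_labels")
    # Assumption: label files reuse the case number with a "labels-" prefix.
    label_name = image_path.name.replace("image-", "labels-", 1)
    return label_dir / label_name


example = label_path_for("/path/to/SKI10/data/images/image-001.mhd")
print(example.as_posix())
```

A check like this can be run over a directory before starting the pipeline to catch missing or misnamed label files early.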
Command:
```shell
# Point to your image directory. Expects sibling directory with "_labels" suffix.
kneeseg-pipeline --data-dir /path/to/SKI10/data/images
```

Workflow:
- Training: Checks if models exist in `experiments/models`. If not, trains using the 80 training cases.
- Inference: Checks if `evaluation_report.json` exists in `experiments/predictions`. If not, runs inference on the 20 evaluation cases.
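The skip-if-present behavior of the workflow can be pictured as follows. This is only a sketch of the idea, not the actual `kneeseg-pipeline` logic: the function `pending_steps` is hypothetical, and the assumption that the pipeline looks for exactly these four model files (the names used in the loading example above) is mine.

```python
import tempfile
from pathlib import Path

# Assumption: these are the files the training step produces.
MODEL_FILES = (
    "bone_rf_p1.joblib",
    "bone_rf_p2.joblib",
    "cartilage_rf_p1.joblib",
    "cartilage_rf_p2.joblib",
)


def pending_steps(model_dir, prediction_dir):
    """Return the steps that still need to run, in order."""
    steps = []
    if not all((Path(model_dir) / name).exists() for name in MODEL_FILES):
        steps.append("train")
    if not (Path(prediction_dir) / "evaluation_report.json").exists():
        steps.append("infer")
    return steps


with tempfile.TemporaryDirectory() as root:
    models = Path(root) / "models"
    preds = Path(root) / "predictions"
    models.mkdir()
    preds.mkdir()
    print(pending_steps(models, preds))  # nothing cached yet
    for name in MODEL_FILES:
        (models / name).touch()
    print(pending_steps(models, preds))  # models present, report still missing
```

Deleting the cached models or the evaluation report is therefore the way to force the corresponding step to run again.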
For advanced usage and reproduction scripts (e.g., training models from scratch), please refer to the Experiments Documentation.
The pipeline relies on a JSON configuration file to define data paths and model parameters. You can create your own config file for custom experiments.
A valid configuration file has four main sections:
- `data_config`: Paths to your data and split files.
- `training_config`: Parameters for Random Forest training (e.g., number of trees).
- `model_config`: Configuration for model architecture (target bones) and storage directory.
- `output_config`: Directory for saving predictions.
```json
{
  "data_config": {
    "image_directory": "/path/to/images",
    "label_directory": "/path/to/labels",
    "split_file": "/path/to/split.json"
  },
  "training_config": {
    "augmentation": true,
    "bone_parameters": {
      "n_estimators": 100,
      "max_depth": 25,
      "n_jobs": -1
    },
    "cartilage_parameters": {
      "n_estimators": 100,
      "max_depth": 20,
      "training_proximity_mm": 15.0
    }
  },
  "model_config": {
    "target_bones": ["femur", "tibia", "patella"],
    "model_directory": "/path/to/save/models",
    "dtype": "bfloat16"
  },
  "output_config": {
    "prediction_directory": "/path/to/save/predictions"
  }
}
```

- `target_bones`: List of bones to segment.
  - Default for SKI10: `["femur", "tibia"]`
  - Default for OAI: `["femur", "tibia", "patella"]`
  - Cartilage is closely coupled: "femur" includes "femoral cartilage".
- `dtype`: (Optional) Data type for feature extraction matrices.
  - Options: `"float32"` (default), `"bfloat16"`.
  - Recommendation: Use `"bfloat16"` to reduce memory usage by ~50%. Requires `ml_dtypes`.

Note: The `split_file` should be a JSON file containing `{"train": ["file1.mhd", ...], "eval": ["file2.mhd", ...]}`.
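Before launching a long training run, it can help to sanity-check a custom config and split file. The sketch below uses only the standard library; `check_config` and `check_split` are hypothetical helpers, not kneeseg functions, and the checks cover only what this section documents (the four sections, the `target_bones` values, the `dtype` options, and the `train`/`eval` keys).

```python
import json

REQUIRED_SECTIONS = ("data_config", "training_config", "model_config", "output_config")
KNOWN_BONES = {"femur", "tibia", "patella"}
KNOWN_DTYPES = {"float32", "bfloat16"}


def check_config(config):
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in config]
    model = config.get("model_config", {})
    unknown = set(model.get("target_bones", [])) - KNOWN_BONES
    if unknown:
        problems.append(f"unknown target_bones: {sorted(unknown)}")
    if model.get("dtype", "float32") not in KNOWN_DTYPES:
        problems.append(f"unsupported dtype: {model['dtype']}")
    return problems


def check_split(split):
    """Check that a split has 'train' and 'eval' lists with no shared files."""
    problems = [f"missing key: {k}" for k in ("train", "eval") if k not in split]
    shared = set(split.get("train", [])) & set(split.get("eval", []))
    if shared:
        problems.append(f"files in both train and eval: {sorted(shared)}")
    return problems


# A deliberately incomplete config: no output_config and an unsupported dtype.
config = json.loads(
    '{"data_config": {}, "training_config": {}, '
    '"model_config": {"target_bones": ["femur", "tibia"], "dtype": "float16"}}'
)
print(check_config(config))
print(check_split({"train": ["image-001.mhd"], "eval": ["image-004.mhd"]}))
```

Returning a list of problems rather than raising on the first one makes it easier to fix a hand-written config in a single pass.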
The `experiments/` directory contains reproducible scripts for the SKI10 experiments, and will store the output models (`models/`) and predictions (`predictions/`) if you run the scripts provided there. See `experiments/README.md` for details.
Since the SKI10 dataset does not provide ground truth labels for its default testing set, we evaluated the pipeline on a 20% hold-out set (20 cases) from the SKI10 training data (Total 100 cases: 80 Train, 20 Eval).
| Structure | Dice Similarity Coefficient (DSC) |
|---|---|
| Femur | 0.9046 ± 0.0361 |
| Tibia | 0.9292 ± 0.0260 |
| Femoral Cartilage | 0.6767 ± 0.0481 |
| Tibial Cartilage | 0.6411 ± 0.0540 |
The following 20 cases were held out for evaluation:
image-004, image-005, image-012, image-014, image-015, image-018, image-028, image-029, image-030, image-032, image-036, image-055, image-065, image-070, image-076, image-082, image-087, image-089, image-095, image-098.
After downsampling and filtering the OAI dataset, we performed an 80%-20% split, using 128 images for training and 31 images for evaluation.
| Structure | Dice Similarity Coefficient (DSC) |
|---|---|
| Femur | 0.7130 ± 0.0673 |
| Tibia | 0.7545 ± 0.0598 |
| Patella | 0.5209 ± 0.0831 |
| Femoral Cartilage | 0.5171 ± 0.0716 |
| Tibial Cartilage | 0.4134 ± 0.0888 |
| Patellar Cartilage | 0.3633 ± 0.1406 |
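Both results tables report the Dice Similarity Coefficient, DSC = 2|A ∩ B| / (|A| + |B|), where A is the predicted mask and B the ground truth mask for one structure. The snippet below is a minimal NumPy sketch of that formula on a toy volume; the function name `dice` is illustrative and is not necessarily how the evaluation script in this repository is written.

```python
import numpy as np


def dice(pred, truth, label):
    """Dice Similarity Coefficient for one label in two integer label volumes."""
    p = pred == label
    t = truth == label
    denom = p.sum() + t.sum()
    if denom == 0:
        return float("nan")  # structure absent from both volumes
    return 2.0 * np.logical_and(p, t).sum() / denom


truth = np.zeros((4, 4, 4), dtype=np.uint8)
truth[1:3, 1:3, 1:3] = 1  # 8 voxels of label 1
pred = np.zeros_like(truth)
pred[1:3, 1:3, 1:4] = 1   # 12 voxels, 8 of which overlap the ground truth

print(dice(pred, truth, label=1))  # 2 * 8 / (8 + 12) = 0.8
```

A "mean ± standard deviation" entry such as those in the tables would then correspond to aggregating this per-case score over the evaluation cases.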
Bone Segmentation:
- Pass 1: Dense Random Forest voxel classification.
  - Features: Normalized Intensity, Gaussian Smoothed Intensity ($\sigma=2.0, 4.0$), Spatial Coordinates, RSID (20 offsets).
  - Target: 3-class classification (Background, Femur, Tibia).
- Pass 2 (Refinement): Auto-Context Random Forest.
  - Features: All Pass 1 features + Probabilities from Pass 1.
  - Performance: Achieves >0.90 DSC on Bones.
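The feature construction for the two bone passes can be sketched with NumPy alone. This is an illustration under stated assumptions, not the package's feature extractor: RSID is taken to mean a random-shift intensity difference, I(x + u) - I(x), as in the referenced paper; borders wrap around via `np.roll` for brevity; the Gaussian-smoothed channels are omitted; and the Pass 1 probabilities are a uniform placeholder rather than real classifier output.

```python
import numpy as np


def rsid_features(image, offsets):
    """One channel per offset u: I(x + u) - I(x), with wrap-around at the borders."""
    channels = []
    for u in offsets:
        # Rolling by -u places the value at x + u onto position x.
        shifted = np.roll(image, shift=tuple(-int(o) for o in u), axis=(0, 1, 2))
        channels.append(shifted - image)
    return np.stack(channels, axis=-1)


def auto_context_features(pass1_features, pass1_probs):
    """Pass 2 input: Pass 1 features concatenated with Pass 1 class probabilities."""
    return np.concatenate([pass1_features, pass1_probs], axis=-1)


rng = np.random.default_rng(0)
image = rng.random((8, 8, 8)).astype(np.float32)  # stand-in for a normalized MR volume
offsets = rng.integers(-3, 4, size=(20, 3))       # 20 random 3D offsets

coords = np.stack(
    np.meshgrid(*[np.arange(s) for s in image.shape], indexing="ij"), axis=-1
)
pass1 = np.concatenate([image[..., None], coords, rsid_features(image, offsets)], axis=-1)

# Placeholder for the Pass 1 output: 3 classes (Background, Femur, Tibia).
probs = np.full(image.shape + (3,), 1.0 / 3.0, dtype=np.float32)
pass2 = auto_context_features(pass1, probs)

print(pass1.shape)  # 1 intensity + 3 coordinates + 20 RSID channels per voxel
print(pass2.shape)  # Pass 1 channels + 3 probability channels per voxel
```

The point of the second pass is that each voxel's classifier input now includes what the first pass believed about that voxel, which lets the refinement forest correct systematic first-pass errors.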
Cartilage Segmentation:
- Pass 1: Initial Semantic Context Forest.
  - Features: Signed Distance Transforms (SDT) from Bones, RSID, Texture, Gaussian.
  - Performance: ~0.60 DSC.
- Pass 2 (Refinement): Auto-Context Random Forest.
  - Features: All Pass 1 features + Probabilities from Cartilage Pass 1.
  - Performance: Achieves ~0.68 DSC (Femoral) and ~0.64 DSC (Tibial) on the SKI10 hold-out set.
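The signed distance transform used as a cartilage feature can be sketched with SciPy's Euclidean distance transform. The sign convention here (negative inside the bone, positive outside) and the use of the result to restrict candidates to voxels within `training_proximity_mm` of the bone are assumptions for illustration; the function `signed_distance` is hypothetical and the package may compute this differently.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def signed_distance(mask, spacing):
    """Signed distance to the mask in mm: negative inside, positive outside."""
    mask = mask.astype(bool)
    # Distance from each outside voxel to the nearest mask voxel.
    outside = distance_transform_edt(~mask, sampling=spacing)
    # Distance from each inside voxel to the nearest non-mask voxel.
    inside = distance_transform_edt(mask, sampling=spacing)
    return outside - inside


bone = np.zeros((20, 20, 20), dtype=bool)
bone[8:12, 8:12, 8:12] = True  # toy "bone" cube
sdt = signed_distance(bone, spacing=(1.0, 1.0, 1.0))

print(sdt[0, 10, 10])  # 8 voxels away from the cube along the first axis
print(sdt[9, 9, 9])    # inside the cube, 2 voxels from the nearest outside voxel

# Assumption: cartilage candidates are voxels outside the bone, within 15 mm of it.
candidates = (sdt > 0) & (sdt <= 15.0)
print(int(candidates.sum()))
```

Passing the voxel spacing as `sampling` keeps the distances in millimetres, which matters because knee MR volumes are often anisotropic.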
Plain Text:
Quan Wang, Dijia Wu, Le Lu, Meizhu Liu, Kim L. Boyer, and Shaohua Kevin Zhou. "Semantic Context Forests for Learning-Based Knee Cartilage Segmentation in 3D MR Images." MICCAI 2013: Workshop on Medical Computer Vision.
Quan Wang. Exploiting Geometric and Spatial Constraints for Vision and Lighting Applications. Ph.D. dissertation, Rensselaer Polytechnic Institute, 2014.
BibTeX:
```bibtex
@inproceedings{wang2013semantic,
  title={Semantic context forests for learning-based knee cartilage segmentation in 3D MR images},
  author={Wang, Quan and Wu, Dijia and Lu, Le and Liu, Meizhu and Boyer, Kim L and Zhou, Shaohua Kevin},
  booktitle={International MICCAI Workshop on Medical Computer Vision},
  pages={105--115},
  year={2013},
  organization={Springer}
}

@phdthesis{wang2014exploiting,
  title={Exploiting Geometric and Spatial Constraints for Vision and Lighting Applications},
  author={Quan Wang},
  year={2014},
  school={Rensselaer Polytechnic Institute},
}
```