Excited to introduce CourtKeyNet, an open-source deep learning architecture for sports video analysis. Built for state-of-the-art court detection, CourtKeyNet offers the following features.
- High-Fidelity Feature Extraction: It captures both fine court details and global structural context using a novel Octave Feature Extractor.
- Robust Boundary Attention: It enables precise boundary localization by mapping spatial relationships in polar coordinates via our Polar Transform Attention.
- Geometric Consistency & Open Access: It produces structurally valid outputs, enforcing proper quadrilateral properties through a dedicated Constraint Module and Geometric Consistency Loss. We provide public access to the code and pre-trained models, and we believe this release will empower the community with practical applications such as sports video analysis, match statistics generation, and automated broadcasting systems.
The datasets utilized for CourtKeyNet are located in the datasets folder, which is linked as a submodule to the primary dataset repository:
Note: The dataset contains badminton court images for keypoint detection, and the main repository contains the custom annotation tool for geometric keypoints labeling.
Set up a conda environment and install dependencies:

```bash
# 1. Clone the repository
git clone https://github.com/adithyanraj03/CourtKeyNet.git
cd CourtKeyNet

# 2. Create and activate a Conda environment
conda create -n courtkeynet python=3.10 -y
conda activate courtkeynet

# 3. Install requirements
pip install -r requirements.txt

# 4. Log in to Weights & Biases (optional, for experiment tracking)
wandb login
```

How Confidence Detection Works (Visual Explanation)
CourtKeyNet works like this:

Problem: the model has no built-in "court detector"; it assumes every input image IS a court! Any confidence score therefore has to be derived from the model's own outputs.
When you run model(image), it returns a dictionary with these components:

```python
outputs = {
    'heatmaps':     Tensor[B, 4, 160, 160],  # 4 Gaussian peaks (one per corner)
    'kpts_init':    Tensor[B, 4, 2],         # Initial keypoints from heatmaps
    'kpts_refined': Tensor[B, 4, 2],         # Final refined keypoints
    'features':     Tensor[B, 256, 20, 20],  # Feature maps (optional)
}
```

For a real court image, each corner heatmap (e.g., Corner 0, top-left) shows a single sharp peak. For a non-court image (e.g., a random person), the heatmaps are diffuse, with no clear peak.
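The initial keypoints can be recovered from the heatmaps by taking the per-channel peak. Here is a minimal numpy sketch of that decoding step; `decode_keypoints` and the toy heatmap below are illustrative, not code from the repo:

```python
import numpy as np

def decode_keypoints(heatmaps):
    """Decode one (x, y) peak per corner channel via argmax.

    heatmaps: array of shape [4, H, W], one channel per corner.
    Returns:  array of shape [4, 2] with (x, y) pixel coordinates.
    """
    n, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(n, -1).argmax(axis=1)  # peak index per channel
    ys, xs = np.unravel_index(flat_idx, (h, w))        # back to 2-D coordinates
    return np.stack([xs, ys], axis=1).astype(np.float32)

# Toy example: one synthetic peak per corner channel
hm = np.zeros((4, 160, 160), dtype=np.float32)
peaks = [(40, 30), (120, 30), (120, 130), (40, 130)]   # (x, y) per corner
for c, (x, y) in enumerate(peaks):
    hm[c, y, x] = 1.0
kpts_init = decode_keypoints(hm)                       # shape [4, 2]
```

In the real model the batch dimension is carried along and the peaks are Gaussian blobs rather than single pixels, but the argmax idea is the same.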
What it measures: how "peaky" each heatmap is.

```python
max_values = heatmaps.amax(dim=(2, 3))  # highest value in each corner heatmap
conf_heatmap = max_values.mean()        # average across the 4 corners
```
What it measures: how "spread out" the heatmap's probability mass is (its entropy).

```python
# Entropy = -Σ(p * log(p)) over the normalized heatmap
# Low entropy  = focused peak (good)
# High entropy = random noise (bad)
```
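As a rough illustration, the entropy check could be computed like this. This is a hedged numpy sketch; `heatmap_entropy` is a hypothetical helper, not part of the repo's API:

```python
import numpy as np

def heatmap_entropy(heatmap, eps=1e-8):
    """Shannon entropy of a heatmap treated as a probability map."""
    p = heatmap.ravel().astype(float)
    p = p / (p.sum() + eps)              # normalize to sum to ~1
    return float(-(p * np.log(p + eps)).sum())

focused = np.zeros((160, 160))
focused[80, 80] = 1.0                    # one sharp peak -> entropy near 0
noisy = np.random.default_rng(0).random((160, 160))
# spread-out values -> entropy approaches the uniform maximum log(160*160) ~ 10.15
```

Thresholding this value (low entropy = confident localization) is one simple way to flag non-court inputs.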
What it checks: Does the quad look like a real court?
Checklist:
✓ Are corners in correct positions? (TL upper-left, BR lower-right)
✓ Is the quad convex? (no crossed lines)
✓ Is the area reasonable? (not too tiny, not entire image)
✓ Is aspect ratio court-like? (not a thin line)
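The checklist above can be sketched as a single plausibility test. Everything here (the function name, the thresholds such as `min_area_frac`) is illustrative, not the repo's actual implementation:

```python
import numpy as np

def is_plausible_quad(kpts, img_size=640, min_area_frac=0.05, max_area_frac=0.95):
    """Sanity-check a corner quad, ordered TL, TR, BR, BL, in pixel coords.
    (The TL-upper-left / BR-lower-right ordering check is omitted for brevity.)"""
    # 1. Convexity: z-components of consecutive edge cross products share a sign
    z = []
    for i in range(4):
        a, b, c = kpts[i], kpts[(i + 1) % 4], kpts[(i + 2) % 4]
        z.append((b[0] - a[0]) * (c[1] - b[1]) - (b[1] - a[1]) * (c[0] - b[0]))
    convex = all(v > 0 for v in z) or all(v < 0 for v in z)

    # 2. Area via the shoelace formula, as a fraction of the image area
    x, y = kpts[:, 0], kpts[:, 1]
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    area_ok = min_area_frac <= area / img_size ** 2 <= max_area_frac

    # 3. Bounding-box aspect ratio: reject near-degenerate (line-like) quads
    w, h = x.max() - x.min(), y.max() - y.min()
    aspect_ok = 0.2 < w / max(h, 1e-6) < 5.0

    return bool(convex and area_ok and aspect_ok)

court = np.array([[100, 100], [540, 100], [540, 500], [100, 500]], float)
bowtie = np.array([[100, 100], [540, 500], [540, 100], [100, 500]], float)  # crossed
tiny = np.array([[300, 300], [310, 300], [310, 310], [300, 310]], float)    # too small
```

A bowtie (crossed edges) fails the convexity test, and a ten-pixel quad fails the area test, even though both have four corners.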
```
CourtKeyNet/
├── courtkeynet/
│   ├── configs/
│   │   ├── courtkeynet.yaml   # Model hyperparameters
│   │   └── dataset.yaml       # Dataset configuration
│   ├── models/
│   │   ├── __init__.py
│   │   ├── courtkeynet.py     # Main architecture
│   │   ├── octave.py          # Octave Feature Extractor
│   │   ├── polar.py           # Polar Transform Attention
│   │   └── qcm.py             # Quadrilateral Constraint Module
│   ├── losses/
│   │   ├── __init__.py
│   │   └── geometric_loss.py  # Geometric Consistency Loss
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── dataloader.py      # Dataset loader
│   │   └── metrics.py         # Evaluation metrics
│   ├── train.py               # Training script with wandb
│   ├── finetune.py            # Finetuning script
│   └── inference.py           # Inference/visualization
├── datasets/                  # Datasets (submodule)
├── requirements.txt
└── README.md
```
| Model | Details | Resolution | Download Links |
|---|---|---|---|
| CourtKeyNet-Base | Full Architecture | Native | 🤗 HuggingFace |
| CourtKeyNet-Fast | Light Architecture | Native | To be released |
Download models using huggingface-cli:

```bash
pip install "huggingface_hub[cli]"
huggingface-cli download Cracked-ANJ/CourtKeyNet --local-dir ./courtkeynet-base
```

The training scripts are located within the courtkeynet/ directory.
If you downloaded the HuggingFace weights, use finetune.py to adapt the model to your specific dataset:

```bash
cd courtkeynet
python finetune.py
```

(Note: The pre-trained weights are trained specifically on badminton court and tennis court images.)
To train a completely new model from scratch, use train.py:

```bash
cd courtkeynet
python train.py --data_root path/to/dataset
```

(Note: Training from scratch requires a minimum of 140,000 images for the model to learn features effectively. Fine-tuning to any court or similar geometry requires only 5,000 to 7,000 images.)
Depending on your workflow, you can specify custom configurations. There are two provided configurations:
- configs/courtkeynet.yaml: the primary configuration for training from scratch.
- configs/finetune.yaml: tailored for fine-tuning pre-trained weights (loaded automatically by finetune.py).
```bash
cd courtkeynet
python train.py \
    --data_root path/to/dataset \
    --cfg configs/courtkeynet.yaml \
    --data_cfg configs/dataset.yaml
```

To resume training from a checkpoint:

```bash
cd courtkeynet
python train.py \
    --data_root path/to/dataset \
    --resume runs/courtkeynet/exp/epoch_50.pt
```

To open the CourtKeyNet Inference Studio (GUI), run:
```bash
cd courtkeynet
python inference.py
```

(Note: The application will open a graphical interface allowing you to select both your model weights and target media files directly through the GUI menus.)
The Octave Feature Extractor processes visual information at multiple frequency bands:
- High-frequency path: Fine court details using Court-Specific Shape Kernels. (Note: To optimize learning, explicit L-shaped boundary kernels are turned off when training from scratch, as enforcing them too early hinders the model's ability to grasp broader contextual features and textures. They are meant to be utilized primarily during fine-tuning).
- Mid-frequency path: Structural patterns with Non-Local Self-Similarity
- Low-frequency path: Global context via Fourier Feature Encoder
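The actual module lives in courtkeynet/models/octave.py and uses learned kernels, non-local similarity, and Fourier features. As a toy intuition for the three-band split only, the hedged numpy sketch below decomposes an image into high/mid/low frequency bands with simple box blurs; none of these helpers come from the repo:

```python
import numpy as np

def box_blur(img, k):
    """Separable box blur with kernel width 2k+1 and edge padding,
    implemented with running sums along each axis."""
    out = np.asarray(img, dtype=float)
    n = 2 * k + 1
    for axis in (0, 1):
        pad = [(0, 0), (0, 0)]
        pad[axis] = (k, k)
        p = np.pad(out, pad, mode='edge')
        c = np.cumsum(p, axis=axis)
        zero = np.zeros_like(np.take(c, [0], axis=axis))
        c = np.concatenate([zero, c], axis=axis)      # prefix sums with leading 0
        hi = np.take(c, range(n, c.shape[axis]), axis=axis)
        lo = np.take(c, range(0, c.shape[axis] - n), axis=axis)
        out = (hi - lo) / n                           # windowed mean
    return out

rng = np.random.default_rng(0)
img = rng.random((64, 64))
low = box_blur(img, 8)            # global context   (low-frequency band)
mid = box_blur(img, 2) - low      # structural scale (mid-frequency band)
high = img - box_blur(img, 2)     # fine details     (high-frequency band)
# The three bands sum back to the original image exactly.
```

Each path of the real extractor then applies its own specialized operators (shape kernels, non-local self-similarity, Fourier encoding) to the band it is responsible for.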
Polar Transform Attention enhances boundary detection by processing features in polar coordinates, which are naturally suited to court boundaries radiating from the center.
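To illustrate why polar coordinates help, the sketch below resamples a feature map onto an (r, theta) grid, so boundaries at a fixed distance from the center become roughly horizontal bands. This nearest-neighbor version is purely illustrative; the actual Polar Transform Attention is defined in courtkeynet/models/polar.py:

```python
import numpy as np

def to_polar(feat, n_r=32, n_theta=64):
    """Nearest-neighbor resample of a 2-D map onto an (r, theta) grid
    centered on the map center."""
    h, w = feat.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    rs = np.linspace(0.0, min(cy, cx), n_r)                  # radii: 0 .. edge
    ts = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    R, T = np.meshgrid(rs, ts, indexing='ij')
    ys = np.clip(np.round(cy + R * np.sin(T)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + R * np.cos(T)).astype(int), 0, w - 1)
    return feat[ys, xs]                                      # shape [n_r, n_theta]

# A square boundary around the center becomes a near-horizontal band in
# the polar view, which is easier for attention to pick out.
feat = np.zeros((41, 41))
feat[10, 10:31] = feat[30, 10:31] = 1.0   # top/bottom edges of a square
feat[10:31, 10] = feat[10:31, 30] = 1.0   # left/right edges
polar = to_polar(feat)
```

In the model this transform is applied to feature channels (with interpolation rather than rounding), and attention weights are computed over the polar grid.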
Combined loss function:
L_total = λ_kpt·L_kpt + λ_hm·L_hm + λ_edge·L_edge + λ_diag·L_diag + λ_angle·L_angle
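A hedged numpy sketch of how the geometric terms could be computed on a 4-corner quad. The exact definitions and λ weights are given in the paper; the forms below (L1 distances on edge lengths, diagonals, and interior angles) and the example weights are illustrative:

```python
import numpy as np

def quad_geometry_losses(pred, gt):
    """Geometric terms on 4 ordered corners (TL, TR, BR, BL), shape [4, 2].
    Returns (l_kpt, l_edge, l_diag, l_angle); illustrative definitions only."""
    def edges(k):                                  # 4 side lengths
        return np.linalg.norm(np.roll(k, -1, axis=0) - k, axis=1)
    def diags(k):                                  # 2 diagonal lengths
        return np.array([np.linalg.norm(k[2] - k[0]),
                         np.linalg.norm(k[3] - k[1])])
    def angles(k):                                 # 4 interior angles (radians)
        out = []
        for i in range(4):
            u = k[(i - 1) % 4] - k[i]
            v = k[(i + 1) % 4] - k[i]
            cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
            out.append(np.arccos(np.clip(cos, -1.0, 1.0)))
        return np.array(out)
    l_kpt = np.abs(pred - gt).mean()               # stand-in for the keypoint loss
    l_edge = np.abs(edges(pred) - edges(gt)).mean()
    l_diag = np.abs(diags(pred) - diags(gt)).mean()
    l_angle = np.abs(angles(pred) - angles(gt)).mean()
    return l_kpt, l_edge, l_diag, l_angle

# Hypothetical weights; L_hm (the heatmap term) is omitted here for brevity.
lam = dict(kpt=1.0, edge=0.5, diag=0.5, angle=0.25)
quad = np.array([[0, 0], [4, 0], [4, 3], [0, 3]], float)
zero_losses = quad_geometry_losses(quad, quad)          # identical quads -> all 0
shift_losses = quad_geometry_losses(quad + 1.0, quad)   # translation: only l_kpt > 0
```

Note that the edge, diagonal, and angle terms are invariant to translation, which is exactly why they act as shape regularizers rather than duplicating the keypoint loss.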
| GPU | Batch Size | Total Duration (120 epochs) |
|---|---|---|
| RTX 5090 | 32 | ~48 hours (2 Days) |
| RTX 4090 | 24 | ~72 hours (3 Days) |
| RTX 3090 | 16 | ~108 hours (4.5 Days) |
| GPU | Batch Size | Total Duration |
|---|---|---|
| RTX 5090 | 32 | ~35 hours (~1.4 Days) |
| RTX 4090 | 24 | ~52 hours (~2.2 Days) |
| RTX 3090 | 16 | ~79 hours (~3.3 Days) |
If you use CourtKeyNet in your research, please cite:
Paper Link: CourtKeyNet: A novel octave-based architecture for precision badminton court detection with geometric constraints
DOI: https://doi.org/10.1016/j.mlwa.2026.100884
```bibtex
@article{NRAJ2026100884,
  title   = {CourtKeyNet: A novel octave-based architecture for precision badminton court detection with geometric constraints},
  journal = {Machine Learning with Applications},
  volume  = {24},
  pages   = {100884},
  year    = {2026},
  issn    = {2666-8270},
  doi     = {10.1016/j.mlwa.2026.100884},
  url     = {https://www.sciencedirect.com/science/article/pii/S2666827026000496},
  author  = {Adithya N Raj and Prethija G.}
}
```

This project is released under the MIT License, suitable for both academic and commercial use.
For questions or collaboration opportunities, please contact adithyanraj03@gmail.com.







