Zero-annotation and supervised rally segmentation from untrimmed table tennis match video. Companion code for the paper "Annotation-Free Rally Detection in Table Tennis Match Video: From Signal Fusion to Supervised Refinement."
Paper: *Annotation-Free Rally Detection in Table Tennis Match Video: From Signal Fusion to Supervised Refinement* — Yuwei Ba (2026). The LaTeX source (arXiv version) is in `paper/`; arXiv link coming soon.
| Model | Seg-F1 | bMAE (s) |
|---|---|---|
| Zero-annotation fusion | 0.476 | 3.42 |
| Sklearn (93-dim features) | 0.607 | 2.15 |
| Single-stage TCN | 0.636 | 1.93 |
| BiGRU | 0.662 | 1.65 |
| Temporal Transformer | 0.648 | 1.78 |
| MS-TCN (5-run avg) | 0.695 | 0.52 |
| MS-TCN + temperature scaling + label smoothing (5-run avg) | 0.712 | 0.55 |
All results: leave-one-video-out (LOVO) cross-validation on 12 untrimmed match videos (Extended OpenTTGames, 644K frames at 120 fps).
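The repo's evaluation code lives in `src/evaluation/metrics.py`. As a rough sketch of what Seg-F1 measures — the 0.5 IoU threshold, greedy matching, and function names here are illustrative assumptions, not the repo's actual API — predicted rally segments can be matched to ground truth by temporal IoU:

```python
def iou(a, b):
    """Temporal IoU between two (start, end) segments, in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def seg_f1(pred, gt, thr=0.5):
    """Segment-level F1 with greedy one-to-one matching: a prediction
    counts as a true positive if it overlaps a still-unmatched
    ground-truth segment with IoU >= thr."""
    matched, tp = set(), 0
    for p in pred:
        best = max(range(len(gt)), key=lambda i: iou(p, gt[i]), default=None)
        if best is not None and best not in matched and iou(p, gt[best]) >= thr:
            matched.add(best)
            tp += 1
    prec = tp / len(pred) if pred else 0.0
    rec = tp / len(gt) if gt else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```

Boundary MAE (bMAE) is then the mean absolute offset, in seconds, between the boundaries of matched segment pairs.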
```bibtex
@article{ba2026rally,
  title  = {Annotation-Free Rally Detection in Table Tennis Match Video:
            From Signal Fusion to Supervised Refinement},
  author = {Ba, Yuwei},
  year   = {2026}
}
```

Requires Python ≥ 3.11 and uv.
```
git clone --recurse-submodules https://github.com/ibigbug/spin-detector.git
cd spin-detector
uv sync
uv run python download_dataset.py
```

This downloads the 12 match recordings into `data/train/` and `data/test/`.
```
uv run python run_experiment.py signals --split all
```

Extracts ball trajectory, player motion, and court activity signals, caches them in `.cache/scores/`, and evaluates the fused zero-annotation baseline.
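To illustrate what these extractors compute, here is a minimal pure-NumPy sketch of a frame-differencing activity score — a toy stand-in for the repo's actual extractors in `src/signals/`, with the smoothing window chosen arbitrarily:

```python
import numpy as np

def activity_signal(frames, smooth=5):
    """Per-frame activity score: mean absolute intensity change between
    consecutive frames, box-smoothed. `frames` is a (T, H, W) grayscale
    array; a simplified stand-in for bilateral frame-differencing."""
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0)).mean(axis=(1, 2))
    diffs = np.concatenate([[0.0], diffs])       # keep length T
    kernel = np.ones(smooth, dtype=np.float32) / smooth
    return np.convolve(diffs, kernel, mode="same")
```

Rallies show up as sustained high-activity runs in this signal; the fusion step combines it with the ball and player signals before thresholding.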
All models are available as subcommands of run_experiment.py (or as standalone modules):
```
# Sklearn baseline
uv run python run_experiment.py sklearn --seed 42 --sweep

# Single-stage TCN
uv run python run_experiment.py tcn --epochs 80 --seed 42 --sweep

# BiGRU
uv run python run_experiment.py bigru --seed 42 --sweep

# Temporal Transformer
uv run python run_experiment.py transformer --seed 42 --sweep

# MS-TCN best model (deterministic baseline)
uv run python run_experiment.py mstcn \
    --stages 4 --hidden 128 --levels 10 --epochs 120 --runs 5 --seed 42 --sweep

# MS-TCN + temperature scaling + label smoothing (best result: 0.712 Seg-F1)
uv run python run_experiment.py mstcn \
    --stages 4 --hidden 128 --levels 10 --epochs 120 --runs 5 --seed 42 \
    --temp-scale --label-smooth 0.2 --sweep

# MS-TCN + signal attention
uv run python run_experiment.py mstcn \
    --stages 4 --hidden 128 --levels 10 --epochs 120 --attention --sweep

# Self-training from pseudo-labels
uv run python run_experiment.py self-train --rounds 2 --sweep
```

All supervised models use LOVO cross-validation and require the pre-cached signals from step 1.
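The LOVO protocol itself is simple; a sketch of the split logic (function name assumed here — the repo's actual data loading lives in `src/supervised/common.py`):

```python
def lovo_splits(video_ids):
    """Leave-one-video-out cross-validation: each video is held out as
    the test set exactly once, with all others forming the train set."""
    for held_out in video_ids:
        train = [v for v in video_ids if v != held_out]
        yield train, held_out
```

With 12 match videos this yields 12 folds; the table's Seg-F1 values are averaged over all folds (and, for the MS-TCN rows, over 5 runs).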
```
# Save per-video probabilities (best config: temperature scaling + label smoothing)
uv run python run_experiment.py mstcn \
    --stages 4 --hidden 128 --levels 10 --epochs 120 --runs 5 --seed 42 \
    --temp-scale --label-smooth 0.2 \
    --save-proba .cache/proba/ts_ls --sweep

# Generate the prediction timeline figure
uv run python -m src.visualization.plot_timeline \
    --proba-dir .cache/proba/ts_ls \
    --output paper/figures/timeline.pdf
```

```
src/
  signals/                 # Zero-annotation signal extraction
    ball_kinematics.py     #   MOG2 background subtraction + blob tracking
    player_motion.py       #   YOLOv8n detection + Farneback optical flow
    court_activity.py      #   Bilateral frame-differencing
  supervised/              # Supervised sequence models (LOVO)
    common.py              #   Shared constants, data loading, evaluation
    model.py               #   Sklearn GBT+LogReg (93-dim features)
    tcn_model.py           #   Single-stage dilated TCN
    mstcn_model.py         #   Multi-stage TCN (best model)
    bigru_model.py         #   Bidirectional GRU + local attention
    transformer_model.py   #   Temporal Transformer
    self_train.py          #   Self-training from pseudo-labels
  fusion/
    combine.py             #   Weighted fusion + adaptive thresholding
  evaluation/
    metrics.py             #   Seg-F1, boundary MAE, frame-level metrics
  visualization/
    plot_timeline.py       #   Prediction-vs-GT timeline figures
  benchmark/
    derive_rallies.py      #   Ground-truth parsing from annotations
run_experiment.py          # Experiment runner (all subcommands above)
download_dataset.py        # Video downloader
paper/                     # LaTeX source for the paper
vendor/                    # Extended OpenTTGames annotations (submodule)
```
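The fusion step in `src/fusion/combine.py` combines the three per-frame signals. A minimal sketch of weighted fusion with an adaptive (mean + k·std) threshold — the weights, `k`, and function name are illustrative assumptions, not the tuned values from the paper:

```python
import numpy as np

def fuse_and_threshold(signals, weights, k=0.5):
    """Weighted average of per-frame signals (each a (T,) array scaled
    to [0, 1]), thresholded adaptively at mean + k * std of the fused
    score. Returns a boolean rally mask over frames."""
    fused = sum(w * s for w, s in zip(weights, signals)) / sum(weights)
    thr = fused.mean() + k * fused.std()
    return fused > thr
```

An adaptive threshold of this kind tracks per-video differences in lighting and camera placement, which is what lets the zero-annotation baseline run without any tuning on labeled data.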
We use the Extended OpenTTGames dataset:
12 match recordings at 120 fps, 278 ground-truth rallies.
The dataset annotations are included as a git submodule in vendor/table_tennis_data/.
Video files must be downloaded separately (see above).
Code: MIT. Dataset annotations: CC BY-NC-SA 4.0 (see `vendor/table_tennis_data/LICENSE`).