Rally Segmentation in Table Tennis Match Video

Zero-annotation and supervised rally segmentation from untrimmed table tennis match video. Companion code for the paper "Annotation-Free Rally Detection in Table Tennis Match Video: From Signal Fusion to Supervised Refinement."

Paper

Paper: Annotation-Free Rally Detection in Table Tennis Match Video: From Signal Fusion to Supervised Refinement — Yuwei Ba (2026). The LaTeX source lives in paper/ (arXiv link coming soon).

Key Results

| Model                            | Seg-F1 | bMAE (s) |
| -------------------------------- | ------ | -------- |
| Zero-annotation fusion           | 0.476  | 3.42     |
| Sklearn (93-dim features)        | 0.607  | 2.15     |
| Single-stage TCN                 | 0.636  | 1.93     |
| BiGRU                            | 0.662  | 1.65     |
| Temporal Transformer             | 0.648  | 1.78     |
| MS-TCN (5-run avg)               | 0.695  | 0.52     |
| MS-TCN + TS + LS (5-run avg)     | 0.712  | 0.55     |

All results: leave-one-video-out cross-validation on 12 untrimmed match videos (Extended OpenTTGames, 644K frames at 120 fps).
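Leave-one-video-out (LOVO) means each of the 12 videos serves as the test fold exactly once while the other 11 are used for training. A minimal sketch of the split logic (the video IDs are placeholders, not the dataset's actual file names):

```python
def lovo_splits(video_ids):
    """Leave-one-video-out: every video is held out as the test fold once."""
    for held_out in video_ids:
        train = [v for v in video_ids if v != held_out]
        yield train, held_out

videos = [f"match_{i}" for i in range(1, 13)]  # 12 untrimmed match videos
splits = list(lovo_splits(videos))             # 12 folds, 11 train videos each
```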

Citation

@article{ba2026rally,
  title   = {Annotation-Free Rally Detection in Table Tennis Match Video:
             From Signal Fusion to Supervised Refinement},
  author  = {Ba, Yuwei},
  year    = {2026}
}

Setup

Requires Python ≥ 3.11 and uv.

git clone --recurse-submodules https://github.com/ibigbug/spin-detector.git
cd spin-detector
uv sync

Download videos

uv run python download_dataset.py

This downloads the 12 match recordings into data/train/ and data/test/.

Usage

1. Extract signals (zero-annotation)

uv run python run_experiment.py signals --split all

Extracts ball trajectory, player motion, and court activity signals, caches them in .cache/scores/, and evaluates the fused zero-annotation baseline.
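The fusion stage (src/fusion/combine.py) weights the three per-frame signal tracks and applies a per-video adaptive threshold. A minimal sketch, assuming equal-length score tracks; the weights and the percentile rule here are illustrative, not the repository's tuned values:

```python
import numpy as np

def fuse_signals(ball, player, court, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of per-frame activity scores, each min-max normalised to [0, 1]."""
    signals = [np.asarray(s, dtype=float) for s in (ball, player, court)]
    norm = [(s - s.min()) / (s.max() - s.min() + 1e-8) for s in signals]
    return sum(w * s for w, s in zip(weights, norm))

def adaptive_threshold(score, pct=60):
    """Per-video threshold: a percentile of this video's fused-score distribution."""
    return score >= np.percentile(score, pct)

rng = np.random.default_rng(0)
fused = fuse_signals(rng.random(100), rng.random(100), rng.random(100))
rally_mask = adaptive_threshold(fused)  # boolean per-frame rally/no-rally decision
```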

2. Supervised models

All models are available as subcommands of run_experiment.py (or as standalone modules):

# Sklearn baseline
uv run python run_experiment.py sklearn --seed 42 --sweep

# Single-stage TCN
uv run python run_experiment.py tcn --epochs 80 --seed 42 --sweep

# BiGRU
uv run python run_experiment.py bigru --seed 42 --sweep

# Temporal Transformer
uv run python run_experiment.py transformer --seed 42 --sweep

# MS-TCN best model (deterministic baseline)
uv run python run_experiment.py mstcn \
    --stages 4 --hidden 128 --levels 10 --epochs 120 --runs 5 --seed 42 --sweep

# MS-TCN + temperature scaling + label smoothing (best result: 0.712 Seg-F1)
uv run python run_experiment.py mstcn \
    --stages 4 --hidden 128 --levels 10 --epochs 120 --runs 5 --seed 42 \
    --temp-scale --label-smooth 0.2 --sweep

# MS-TCN + signal attention
uv run python run_experiment.py mstcn \
    --stages 4 --hidden 128 --levels 10 --epochs 120 --attention --sweep

# Self-training from pseudo-labels
uv run python run_experiment.py self-train --rounds 2 --sweep

All supervised models use LOVO cross-validation and require the cached signals from step 1.
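The --temp-scale and --label-smooth flags correspond to two standard techniques: temperature scaling for probability calibration and label smoothing for regularising the training targets. A minimal NumPy sketch (the 0.2 smoothing factor matches the flag above; the temperature value and logits are illustrative):

```python
import numpy as np

def smooth_labels(onehot, eps=0.2):
    """Label smoothing: mix one-hot targets with the uniform distribution."""
    k = onehot.shape[-1]
    return onehot * (1.0 - eps) + eps / k

def temperature_softmax(logits, T=1.5):
    """Temperature scaling: divide logits by T before the softmax (T > 1 softens)."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Two classes per frame: no-rally / rally
targets = smooth_labels(np.eye(2)[[0, 1, 1]], eps=0.2)  # [1, 0] becomes [0.9, 0.1]
probs = temperature_softmax(np.array([[2.0, -1.0], [0.5, 0.4], [3.0, 0.0]]))
```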

3. Visualization

# Save per-video probabilities (best config: TS + LS)
uv run python run_experiment.py mstcn \
    --stages 4 --hidden 128 --levels 10 --epochs 120 --runs 5 --seed 42 \
    --temp-scale --label-smooth 0.2 \
    --save-proba .cache/proba/ts_ls --sweep

# Generate prediction timeline figure
uv run python -m src.visualization.plot_timeline \
    --proba-dir .cache/proba/ts_ls \
    --output paper/figures/timeline.pdf
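The timeline figure plots per-frame rally probability against ground truth. Turning a saved probability track into rally intervals for plotting can be sketched as follows (the 0.5 cutoff is an assumption, not necessarily the repository's setting):

```python
def proba_to_intervals(proba, thresh=0.5):
    """Convert a per-frame probability track into (start, end) frame intervals."""
    intervals, start = [], None
    for i, p in enumerate(proba):
        if p >= thresh and start is None:
            start = i                      # rally onset
        elif p < thresh and start is not None:
            intervals.append((start, i))   # rally offset (end-exclusive)
            start = None
    if start is not None:
        intervals.append((start, len(proba)))  # rally runs to the last frame
    return intervals

proba_to_intervals([0.1, 0.8, 0.9, 0.2, 0.7])  # → [(1, 3), (4, 5)]
```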

Project Structure

src/
  signals/           # Zero-annotation signal extraction
    ball_kinematics.py   MOG2 background subtraction + blob tracking
    player_motion.py     YOLOv8n detection + Farneback optical flow
    court_activity.py     Bilateral frame-differencing
  supervised/        # Supervised sequence models (LOVO)
    common.py            Shared constants, data loading, evaluation
    model.py             Sklearn GBT+LogReg (93-dim features)
    tcn_model.py         Single-stage dilated TCN
    mstcn_model.py       Multi-stage TCN (best model)
    bigru_model.py       Bidirectional GRU + local attention
    transformer_model.py Temporal Transformer
    self_train.py        Self-training from pseudo-labels
  fusion/
    combine.py           Weighted fusion + adaptive thresholding
  evaluation/
    metrics.py           Seg-F1, boundary MAE, frame-level metrics
  visualization/
    plot_timeline.py     Prediction-vs-GT timeline figures
  benchmark/
    derive_rallies.py    Ground-truth parsing from annotations
run_experiment.py        Experiment runner (signals, supervised models, self-training)
download_dataset.py      Video downloader
paper/                   LaTeX source for the paper
vendor/                  Extended OpenTTGames annotations (submodule)
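Seg-F1 in metrics.py scores predicted rally segments against ground truth by temporal-IoU matching. A simplified sketch of that idea; the greedy matching and the 0.5 IoU cutoff are assumptions, not necessarily the repository's exact implementation:

```python
def iou(a, b):
    """Temporal IoU of two (start, end) segments."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0

def seg_f1(pred, gt, thresh=0.5):
    """F1 over greedily one-to-one matched segments at a temporal-IoU threshold."""
    if not pred or not gt:
        return 0.0
    matched, used = 0, set()
    for p in pred:
        for j, g in enumerate(gt):
            if j not in used and iou(p, g) >= thresh:
                matched += 1
                used.add(j)
                break
    prec, rec = matched / len(pred), matched / len(gt)
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

seg_f1([(0, 10), (20, 30)], [(0, 9), (21, 30)])  # both pairs overlap at IoU 0.9
```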

Dataset

We use the Extended OpenTTGames dataset: 12 match recordings at 120 fps, 278 ground-truth rallies. The dataset annotations are included as a git submodule in vendor/table_tennis_data/. Video files must be downloaded separately (see above).

License

Code: MIT. Dataset annotations: CC BY-NC-SA 4.0 (see vendor/table_tennis_data/LICENSE).
