# Coverage, Overlap, Trespass and Excess (COTe) score for Document Layout Analysis
Document Layout Analysis (DLA) is the process of parsing a page into meaningful elements, typically using machine learning models. Traditional evaluation metrics such as IoU, F1, and mAP were designed for 3D-to-2D image projections (e.g. photographs) and can give misleading results for natively 2D printed documents.
cotescore introduces two complementary ideas:
- Structural Semantic Units (SSUs) — a relational labelling approach that shifts focus from the physical bounding boxes of regions to the semantic structure of the content.
- COTe Score — a decomposable metric that breaks page-parsing quality into four interpretable components:
- Coverage — how well predictions cover ground-truth regions
- Overlap — redundant predictions within the same ground-truth region
- Trespass — predictions that cross semantic boundaries into a different SSU
- Excess — a support metric for predictions that fall on background or white-space
COTe is more informative than traditional metrics, reveals distinct model failure modes, and remains useful even when explicit SSU labels are unavailable.
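To build intuition for how the four components partition a page, here is a toy pixel-state sketch. This is not cotescore's implementation, and the exact definitions in the paper differ in detail; `toy_cote_states` and its per-prediction `pred_ssu_ids` argument (which SSU each prediction targets) are hypothetical names invented for illustration.

```python
import numpy as np

# Toy sketch only: NOT the cotescore implementation. It illustrates the idea
# of partitioning pixels into coverage / overlap / trespass / excess states.
# Assumptions (hypothetical): gt_ssu_map is a 2-D int array with 0 for
# background and k > 0 for SSU id k; preds is a list of boolean masks; each
# prediction comes with a pred_ssu_ids entry saying which SSU it targets.

def toy_cote_states(gt_ssu_map, preds, pred_ssu_ids):
    hits = np.zeros(gt_ssu_map.shape, dtype=int)      # predictions touching each pixel
    correct = np.zeros(gt_ssu_map.shape, dtype=bool)  # touched by a right-SSU prediction
    wrong = np.zeros(gt_ssu_map.shape, dtype=bool)    # touched by a wrong-SSU prediction
    for mask, ssu in zip(preds, pred_ssu_ids):
        hits += mask
        correct |= mask & (gt_ssu_map == ssu)
        wrong |= mask & (gt_ssu_map != ssu) & (gt_ssu_map > 0)
    gt = gt_ssu_map > 0
    covered = correct                  # GT pixels correctly covered
    overlapped = gt & (hits > 1)       # redundantly predicted GT pixels
    trespassed = wrong                 # GT pixels claimed across an SSU boundary
    excess = ~gt & (hits > 0)          # predictions spilling onto background
    return covered, overlapped, trespassed, excess

# Two-SSU page: SSU 1 on rows 0-1, SSU 2 on row 2, background on row 3.
gt = np.zeros((4, 4), dtype=int)
gt[0:2, :] = 1
gt[2, :] = 2
big = np.zeros((4, 4), dtype=bool); big[:, 0:3] = True    # spills into SSU 2 and background
dup = np.zeros((4, 4), dtype=bool); dup[0:2, 0:2] = True  # redundant second prediction
cov, ov, tr, ex = toy_cote_states(gt, [big, dup], [1, 1])
print(cov.sum(), ov.sum(), tr.sum(), ex.sum())  # → 6 4 3 3
```

The same pixels can contribute to more than one state (e.g. a redundantly predicted pixel is both covered and overlapped), which is why COTe reports the components separately rather than collapsing them into a single count.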
```shell
pip install cotescore
```

The benchmarking extras include either PyTorch or PaddlePaddle — do not install both in the same environment. These frameworks require different CUDA versions and will conflict:
| Extra | Use case | GPU framework |
|---|---|---|
| `cotescore[benchmarks]` | Torch-based DLA models (DocLayout-YOLO, Heron) | PyTorch |
| `cotescore[paddle-benchmark]` | PaddleOCR / PP-DocLayout | PaddlePaddle |
```shell
# Torch-based benchmarks
pip install "cotescore[benchmarks]"

# PaddlePaddle benchmarks (install PaddlePaddle first, separately)
# See: https://www.paddlepaddle.org.cn/install/quick
pip install "cotescore[paddle-benchmark]"
```

Note: PaddlePaddle must be installed before `cotescore[paddle-benchmark]` — it is not listed as a pip dependency because the correct wheel depends on your CUDA version. Follow the PaddlePaddle installation guide to get the right version.
The library ships with a bundled limerick case study that you can use to try it out immediately.
```python
from cotescore import cote_score, load_limerick_example, extract_ssu_boxes
from cotescore.adapters import boxes_to_gt_ssu_map, boxes_to_pred_masks, eval_shape

# Load the bundled example: ground-truth dict, document image, and example predictions
ground_truth, image, pred_boxes = load_limerick_example()
h, w = image.shape[:2]
_, _, scale = eval_shape(w, h)

# Build tagged SSU-level GT boxes and rasterize to a 2-D SSU id map
gt_boxes = extract_ssu_boxes(ground_truth)
gt_ssu_map = boxes_to_gt_ssu_map(gt_boxes, w, h, scale=scale)
preds = boxes_to_pred_masks(pred_boxes, w, h, scale=scale)

# Compute the COTe score
cote, C, O, T, E = cote_score(gt_ssu_map, preds)
print(f"COTe={cote:.3f} C={C:.3f} O={O:.3f} T={T:.3f} E={E:.3f}")
```

*COTe pixel-state visualisation: each pixel in the document is classified as Coverage (green), Overlap (yellow), Trespass (red), Trespass AND Overlap (purple), or Excess (blue). Produced by notebooks/limerick_analysis.py.*
See notebooks/limerick_analysis.py for a full worked example comparing COTe against F1 and mean IoU at different granularity levels.
| Function | Description |
|---|---|
| `cote_score(gt_ssu_map, preds)` | Returns `(cote, C, O, T, E)` — the full decomposition |
| `coverage(gt_ssu_map, preds)` | Fraction of GT area correctly covered, in [0, 1] |
| `overlap(gt_ssu_map, preds)` | Redundant prediction area within GT, in [0, ∞) |
| `trespass(gt_ssu_map, preds)` | GT area covered by wrong-SSU predictions, in [0, ∞) |
| `cote_class(gt_ssu_map, ssu_to_class, preds)` | Per-class interaction matrices (`ClassCOTeResult`) |
| `iou(box1, box2)` | Intersection over Union for two boxes |
| `mean_iou(preds, gt)` | Mean IoU across all GT boxes |
| `f1(preds, gt, threshold)` | F1 score at a given IoU threshold |
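For reference, the textbook IoU computation that box-level metrics like these build on can be sketched as follows. The `(x1, y1, x2, y2)` corner format and the `box_iou` name are assumptions made here for illustration; check cotescore's docstrings for the box format its `iou()` actually expects.

```python
def box_iou(box1, box2):
    """Textbook IoU for two axis-aligned boxes in (x1, y1, x2, y2) form."""
    # Intersection rectangle: the overlap of the two boxes (may be empty)
    ix1, iy1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix2, iy2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    # Union = sum of areas minus the double-counted intersection
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union else 0.0

print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 px² intersection / 7 px² union ≈ 0.143
```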
All functions are importable directly from cotescore:

```python
from cotescore import cote_score, coverage, overlap, iou, mean_iou, cote_class
```

For an alternative visual overview of what the different elements of the COTe score mean, see notebooks/metrics_exploration.py.
```python
from cotescore import compute_cote_masks, visualize_cote_states
import matplotlib.pyplot as plt

masks = compute_cote_masks(gt_ssu_map, preds)
fig, ax = plt.subplots()
visualize_cote_states(image, masks, ax=ax)
plt.show()
```

The library includes loaders for three DLA datasets used in the paper:
```python
from cotescore.dataset import NCSEDataset, DocLayNetDataset, HNLA2013Dataset

# Bundled toy example (no download required)
from cotescore import load_limerick_example
ground_truth, image, pred_boxes = load_limerick_example()
```

The primary interactive example is the Marimo notebook at notebooks/limerick_analysis.py. It demonstrates:
- Visualising SSU, line, and character-level bounding boxes
- The granularity-mismatch problem and how COTe handles it
- Side-by-side comparison of COTe, F1, and mean IoU
- COTe pixel-state visualisation
To run the notebook (requires Marimo):

```shell
pip install marimo
marimo edit notebooks/limerick_analysis.py
```

If you have questions, find a bug, or want to request a feature, please open an issue on GitHub.
If you use cotescore in your research, please cite:
Bourne, Jonathan, Mwiza Simbeye, and Ishtar Govia. “The COTe Score: A Decomposable Framework for Evaluating Document Layout Analysis Models.” arXiv:2603.12718. Preprint, arXiv, March 13, 2026. https://doi.org/10.48550/arXiv.2603.12718.
```bibtex
@misc{bourne2026cote,
  title = {The {COTe} Score: A Decomposable Framework for Evaluating {Document Layout Analysis} Models},
  author = {Bourne, Jonathan and Simbeye, Mwiza and Govia, Ishtar},
  year = {2026},
  month = mar,
  publisher = {arXiv},
  doi = {10.48550/arXiv.2603.12718},
}
```