feat: per-item HM IoU and quality_percentile in evaluation #23
ziv-lazarov-nagish merged 1 commit into nagish
Conversation
AmitMY left a comment
i'm confused by quality_percentile
| help="drop predicted segments shorter than this many frames (0=off)") | ||
| parser.add_argument("--merge_gap", type=int, default=0, | ||
| help="merge predicted segments separated by ≤ this many frames (0=off)") | ||
| parser.add_argument("--quality_percentile", type=float, default=1.0, |
used in build_datasets when passing eval_args (acts like the parser variable).
can you link it? i can't see it used - it is added in this PR, but not used in this PR?
on second thought, i can read the quality_percentile that was used in training from the splits_manifest.json file (it's being written when manifest is created) in the checkpoint's directory and remove the argument, but that prevents us from using a different quality_percentile in evaluation. what do you think?
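A rough sketch of that fallback, assuming splits_manifest.json sits next to the checkpoint and stores the value under a quality_percentile key (both are assumptions, not verified against the repo):

```python
# Sketch only: prefer an explicit CLI value, otherwise fall back to the
# quality_percentile recorded in splits_manifest.json at training time.
# The manifest location and key name are assumptions.
import json
from pathlib import Path
from typing import Optional


def resolve_quality_percentile(checkpoint: str, cli_value: Optional[float]) -> float:
    if cli_value is not None:
        return cli_value
    manifest_path = Path(checkpoint).parent / "splits_manifest.json"
    with manifest_path.open() as f:
        manifest = json.load(f)
    return float(manifest["quality_percentile"])
```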
| help="drop predicted segments shorter than this many frames (0=off)") | ||
| parser.add_argument("--merge_gap", type=int, default=0, | ||
| help="merge predicted segments separated by ≤ this many frames (0=off)") | ||
| parser.add_argument("--quality_percentile", type=float, default=1.0, |
There was a problem hiding this comment.
can you link it? i can't see it used - it is added in this PR, but not used in this PR?
- compute HM IoU per batch item (matching training validation_step) instead of HM of averaged IoUs
- add --quality_percentile arg for platform dataset filtering
fcec4f8 to 8e2a358
| help="drop predicted segments shorter than this many frames (0=off)") | ||
| parser.add_argument("--merge_gap", type=int, default=0, | ||
| help="merge predicted segments separated by ≤ this many frames (0=off)") | ||
| parser.add_argument("--quality_percentile", type=float, default=1.0, |
…_model evaluate_model now returns hm_IoU computed per item (nagish PR #23). Wrapping it with _add_hm_iou would overwrite that with the less accurate average-of-averages metric.
Summary
- Compute HM IoU per batch item, matching the training validation_step calculation. Previously HM was computed as the harmonic mean of the already-averaged sign/sentence IoUs (see the sketch below).
- Add --quality_percentile CLI arg to filter platform dataset videos by quality score during evaluation.
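A minimal sketch of the difference the first bullet describes, using made-up IoU values; the actual aggregation in evaluate.py may differ in detail:

```python
# Illustrative contrast between the two aggregation orders (not the PR's exact code).
from statistics import harmonic_mean, mean

# Hypothetical per-item (sign IoU, sentence IoU) pairs for one evaluation run.
ious = [(0.9, 0.2), (0.8, 0.7), (0.6, 0.9)]

# New behaviour (matches validation_step): harmonic mean per item, then average.
per_item_hm = mean(harmonic_mean(pair) for pair in ious)

# Old behaviour: average each IoU type over all items first, then take one harmonic mean.
hm_of_averages = harmonic_mean([mean(s for s, _ in ious), mean(t for _, t in ious)])

print(f"per-item HM: {per_item_hm:.3f}, HM of averages: {hm_of_averages:.3f}")
# The two generally differ; the per-item version penalises items where one of the
# two IoUs is low, even when the dataset-level averages look balanced.
```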
Files changed
- sign_language_segmentation/evaluate.py — per-item HM IoU, quality_percentile arg

Test plan
- ruff check . — all checks passed
- pytest — 61 passed
- python -m sign_language_segmentation.evaluate --checkpoint ... --datasets dgs --split test --device cuda — HM IoU prints correctly
- python -m sign_language_segmentation.evaluate --checkpoint ... --datasets platform --split test --device cuda --quality_percentile 0.8 — quality filtering works