fix spbleu ram issue; loosen check for TestConfidenceFile by mshannon-sil · Pull Request #960 · sillsdev/silnlp

mshannon-sil · 2026-03-06T22:17:42Z

This PR fixes two bugs.

The first is #940, where the spbleu metric was reinitializing the Flores200 tokenizer on every iteration in the loop and consequently eating up all the RAM to the point of failure. I have moved intialization outside the loop to prevent that. The memory issue has gone away, and it has also had the side effect of drastically increasing the speed of the test step when spbleu is included.

The second is #947, where multilingual drafting was not compatible with saving confidence scores. This was only due to the fact that the code to check for a valid filename for a TestConfidenceFile was too restrictive and did not account for filenames where the lang id is inserted toward the beginning. I have loosened the check just to look for the presence of "trg-predictions" in the trg draft file name and confirmed multilingual drafting experiments now complete successfully.

This change is

TaperChipmunk32 · 2026-03-06T22:22:24Z

silnlp/nmt/test.py line 267 at r1 (raw file):

    scorers = scorers.intersection(SUPPORTED_SENTENCE_SCORERS)
    other_scores = {k: v for k, v in other_scores.items() if k.lower() in scorers}
    other_verse_scores: Dict[str, float] = {}

Why was this removed? It seems to still be needed later in this function.

mshannon-sil · 2026-03-06T22:25:06Z

silnlp/nmt/test.py line 267 at r1 (raw file):

Previously, TaperChipmunk32 (Matthew Beech) wrote…

Why was this removed? It seems to still be needed later in this function.

That's because for some reason there was a duplicate definition. I removed this definition, and kept the one inside the loop some lines later in line 298.

TaperChipmunk32

@TaperChipmunk32 made 1 comment.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on mshannon-sil).

TaperChipmunk32 · 2026-03-06T22:29:01Z

silnlp/nmt/test.py line 300 at r1 (raw file):

            other_verse_scores: Dict[str, float] = {}
            if "chrf3" in scorers:
                chrf3_verse_score = sacrebleu.sentence_chrf(

Why was this changed?

mshannon-sil · 2026-03-06T22:32:21Z

silnlp/nmt/test.py line 300 at r1 (raw file):

Previously, TaperChipmunk32 (Matthew Beech) wrote…

Why was this changed?

When I was making the fixes, I noticed that a couple scorers were still using corpus level metrics, even though these are sentence level scores. So I adjusted those scorers to use their sentence-level rather than corpus-level scorers.

TaperChipmunk32

@TaperChipmunk32 made 1 comment and resolved 2 discussions.
Reviewable status: 0 of 2 files reviewed, all discussions resolved (waiting on mshannon-sil).

fix spbleu ram issue, loosen check for TestConfidenceFile

1579d2f

mshannon-sil requested a review from TaperChipmunk32 March 6, 2026 22:17

mshannon-sil self-assigned this Mar 6, 2026

This was linked to issues Mar 6, 2026

High RAM usage during test step #940

Closed

experiment.py: --save-confidences does not support multilingual drafting configurations #947

Closed

TaperChipmunk32 approved these changes Mar 6, 2026

View reviewed changes

mshannon-sil merged commit 6503ead into master Mar 6, 2026
1 check was pending

mshannon-sil deleted the bug_fixes branch March 6, 2026 22:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix spbleu ram issue; loosen check for TestConfidenceFile#960

fix spbleu ram issue; loosen check for TestConfidenceFile#960
mshannon-sil merged 1 commit intomasterfrom
bug_fixes

mshannon-sil commented Mar 6, 2026 •

edited by ddaspit

Loading

Uh oh!

TaperChipmunk32 commented Mar 6, 2026

Uh oh!

mshannon-sil commented Mar 6, 2026

Uh oh!

TaperChipmunk32 left a comment

Uh oh!

TaperChipmunk32 commented Mar 6, 2026

Uh oh!

mshannon-sil commented Mar 6, 2026

Uh oh!

TaperChipmunk32 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

mshannon-sil commented Mar 6, 2026 • edited by ddaspit Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TaperChipmunk32 commented Mar 6, 2026

Uh oh!

mshannon-sil commented Mar 6, 2026

Uh oh!

TaperChipmunk32 left a comment

Choose a reason for hiding this comment

Uh oh!

TaperChipmunk32 commented Mar 6, 2026

Uh oh!

mshannon-sil commented Mar 6, 2026

Uh oh!

TaperChipmunk32 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mshannon-sil commented Mar 6, 2026 •

edited by ddaspit

Loading