BYTF: How Good Are Byte-Level n-Gram F-Scores for Automatic Machine Translation Evaluation?

⚠️ This repo is archived. Please visit https://github.com/prajdabre/sacrebleu

1. Overview

This repository accompanies our BYTF paper, which was accepted to MT Summit 2025.

It is a fork of SacreBLEU that adds implementations of the following byte-level metrics:

| Metric | Granularity | Flags |
|--------|-------------|-------|
| `bytf` | byte n-grams | `--bytf-byte-order N` |
| `bytf+` | byte + character n-grams | `--bytf-byte-order N --bytf-char-order M` |
| `bytf++` | byte + character + word n-grams | `--bytf-byte-order N --bytf-char-order M --bytf-word-order L` |
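To give a feel for what "byte n-grams" means here, the following is a minimal Python sketch of a chrF-style F-score computed over UTF-8 bytes. It is not the implementation in this fork: the function names `byte_ngrams` and `byte_fscore`, the choice of `beta = 2`, and the plain averaging of precision and recall over n-gram orders are all assumptions made for illustration.

```python
from collections import Counter


def byte_ngrams(text: str, n: int) -> Counter:
    """Count the n-grams of the UTF-8 byte encoding of `text`."""
    data = text.encode("utf-8")
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def byte_fscore(hyp: str, ref: str, max_order: int = 6, beta: float = 2.0) -> float:
    """Illustrative chrF-style F-beta over byte n-grams (assumed formulation)."""
    precisions, recalls = [], []
    for n in range(1, max_order + 1):
        hyp_counts = byte_ngrams(hyp, n)
        ref_counts = byte_ngrams(ref, n)
        if not hyp_counts or not ref_counts:
            continue  # this order is longer than one of the byte strings
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((hyp_counts & ref_counts).values())
        precisions.append(overlap / sum(hyp_counts.values()))
        recalls.append(overlap / sum(ref_counts.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Because the n-grams are taken over bytes, a single non-ASCII character such as "é" contributes two byte unigrams, so scripts with multi-byte characters are compared at a finer granularity than with character n-grams.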

2. Installation

git clone https://github.com/prajdabre/sacrebleu
cd sacrebleu
pip install -e .
# Optional: for Indic tokenization support
pip install indic-nlp-library

3. Usage Examples

# Character-level baseline (chrF)
sacrebleu ref.txt -i hyp.txt -m chrf --sentence-level \
                  > hyp.chrf

# Byte-level metric (default orders: byte=6, char=0, word=0)
sacrebleu ref.txt -i hyp.txt -m bytf --sentence-level \
                  > hyp.bytf

# chrF with char-order=6, word-order=2 (chrF++)
sacrebleu ref.txt -i hyp.txt -m chrf --chrf-char-order 6 --chrf-word-order 2 \
                  --sentence-level > hyp.chrf.o6_w2

# bytf++ with byte=10, char=6, word=2
sacrebleu ref.txt -i hyp.txt -m bytf \
                  --bytf-byte-order 10 \
                  --bytf-char-order 6 \
                  --bytf-word-order 2 \
                  --sentence-level > hyp.bytf.b10_c6_w2
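If you want to compare several byte orders, the commands above can be scripted. The sketch below builds the same command line from Python and runs it once per byte order. It assumes the fork is installed and `sacrebleu` is on your `PATH`. The helper names `build_bytf_command` and `run_sweep` and the output file naming are hypothetical. Only the flags shown in the examples above are used.

```python
import subprocess
from typing import Iterable, List


def build_bytf_command(ref_path: str, hyp_path: str, byte_order: int = 6,
                       char_order: int = 0, word_order: int = 0) -> List[str]:
    """Build the sacrebleu argument list for one bytf configuration."""
    return [
        "sacrebleu", ref_path, "-i", hyp_path, "-m", "bytf",
        "--bytf-byte-order", str(byte_order),
        "--bytf-char-order", str(char_order),
        "--bytf-word-order", str(word_order),
        "--sentence-level",
    ]


def run_sweep(ref_path: str, hyp_path: str, byte_orders: Iterable[int]) -> None:
    """Run one sentence-level evaluation per byte order (hypothetical helper)."""
    for n in byte_orders:
        out_path = f"{hyp_path}.bytf.b{n}"  # hypothetical naming scheme
        with open(out_path, "w", encoding="utf-8") as out:
            subprocess.run(
                build_bytf_command(ref_path, hyp_path, byte_order=n),
                stdout=out,
                check=True,  # raise if sacrebleu exits with an error
            )
```

For example, `run_sweep("ref.txt", "hyp.txt", [4, 6, 8, 10])` would write one output file per byte order next to `hyp.txt`.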

4. Citation

If you use this work, please cite:

@inproceedings{bytf2025,
  title     = "BYTF: How Good Are Byte-Level n-Gram F-Scores for Automatic Machine Translation Evaluation?",
  author    = "Dabre, Raj and Kaing, Hour and Song, Haiyue",
  booktitle = "Proceedings of Machine Translation Summit XX, Vol. 1: Research Track",
  month     = "jun",
  year      = "2025",
  address   = "Geneva, Switzerland",
  publisher = "European Association for Machine Translation"
}
