Skip to content

tilde-nlp/speech-mt-eval

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Speech MT Evaluation

This repository contains scripts for multi-layered evaluation of cascade speech translation systems. The goal of the multi-layered evaluation approach is to determine the error contribution for each component of the system.

Methodology

The evaluation is split into three steps:

  • Automatic speech recognition (ASR)
  • Segmentation
  • Machine translation (MT)

For ASR evaluation we use word error rate (WER) and character error rate (CER) metrics from JiWER. For segmentation evaluation we use segmentation error rate (SER), which measures the proportion between the length of overlapping reference/hypothesis segments and the total length of reference segments. Segment lengths are calculated based on their timestamps in the speech. See the full implementation in metrics.py. Part of the implementation is taken from SimpleDER. For MT evaluation we use translation error rate (TER), ChrF++, and BLEU metrics from sacreBLEU and the COMET metric from COMET.

Given that the hypothesis segmentation can differ from the reference segmentation, we use mwerSegmenter to align the hypothesis to the reference.

Setup

The required dependencies can be installed with pip:

pip install -r requirements.txt

Download mwerSegmenter

Usage

The evaluation script can be run as follows:

python evaluate.py /path/to/ref_dir /path/to/hyp_dir

The directories should have the following structure:

.
├── asr
│   ├── 1.txt
│   ├── 2.txt
│   └── ...
├── mt
│   ├── 1.txt
│   ├── 2.txt
│   └── ...
└── segmentation
    ├── 1.json
    ├── 2.json
    └── ...

Each file represents a separate audio file. The TXT files in asr and mt folders should contain transcriptions/translations separated by newlines according to the segmentation:

This is the transcription of the first segment.
This is the transcription of the second segment.

The JSON files in segmentation folder should contain an array of the segment start/end timestamps:

[
  [5.64, 7.39],
  [9.21, 11.42]
]

Acknowledgements

This is the prototype created in activity 3.2 of the project "AI Assistant for Multilingual Meeting Management" (No. of the Contract/Agreement: 1.1.1.1/19/A/082).

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages