A Weakly Supervised Forced Alignment for disfluent speech

zelaki/DisfluentFA

Forced Alignment for Disfluent Speech

Presented at ISCA Interspeech 2023.

Project Status

This repository is currently under development and is a work in progress. Contributions and feedback are welcome!

Intro

The study of speech disorders can benefit greatly from time-aligned data. However, audio-text mismatches in disfluent speech cause rapid performance degradation for modern speech aligners, hindering the use of automatic approaches. In this work, we propose a simple and effective modification of alignment graph construction of CTC-based models using Weighted Finite State Transducers. The proposed weakly-supervised approach alleviates the need for verbatim transcription of speech disfluencies for forced alignment. During the graph construction, we allow the modeling of common speech disfluencies, i.e. repetitions and omissions.
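To give an intuition for how extra arcs in the alignment graph can absorb repetitions and omissions, here is a toy sketch. It is not the WFST-based implementation from this repository: it is a plain Viterbi dynamic program over token positions, it omits the CTC blank symbol, and the names `align`, `one_hot_frames`, `skip_penalty` and `back_penalty` as well as the penalty values are assumptions made purely for illustration.

```python
import math

def align(frame_log_probs, tokens, allow_disfluency=True,
          skip_penalty=-2.0, back_penalty=-2.0):
    """Return, for each frame, the index of the transcript token it aligns to.

    frame_log_probs: list (one entry per frame) of dicts token -> log-probability.
    tokens: the reference transcript, possibly non-verbatim.
    With allow_disfluency=True two extra arc types are added:
      - skip arcs (j-2 -> j): the speaker omitted token j-1
      - back arcs (i -> j, j < i): the speaker went back and repeated from token j
    """
    T, N = len(frame_log_probs), len(tokens)
    NEG = -math.inf
    score = [[NEG] * N for _ in range(T)]
    prev = [[-1] * N for _ in range(T)]
    # Simplification: the path must start on the first token.
    score[0][0] = frame_log_probs[0][tokens[0]]
    for t in range(1, T):
        for j in range(N):
            # (previous position, arc weight) pairs that may lead to position j
            cands = [(j, 0.0)]                      # stay on the same token
            if j >= 1:
                cands.append((j - 1, 0.0))          # advance to the next token
            if allow_disfluency:
                if j >= 2:
                    cands.append((j - 2, skip_penalty))   # omission
                cands.extend((i, back_penalty) for i in range(j + 1, N))  # repetition
            best_i, best = max(((i, score[t - 1][i] + w) for i, w in cands),
                               key=lambda c: c[1])
            if best > NEG:
                score[t][j] = best + frame_log_probs[t][tokens[j]]
                prev[t][j] = best_i
    # Simplification: the path must end on the last token.
    if score[T - 1][N - 1] == NEG:
        raise ValueError("no valid alignment under the given constraints")
    path = [N - 1]
    for t in range(T - 1, 0, -1):
        path.append(prev[t][path[-1]])
    return path[::-1]

def one_hot_frames(spoken, vocab, p=0.9):
    """Build synthetic frame posteriors: probability p on the spoken token."""
    rest = (1.0 - p) / (len(vocab) - 1)
    return [{v: math.log(p if v == s else rest) for v in vocab} for s in spoken]

transcript = ["a", "b", "c"]
repeated = one_hot_frames(["a", "b", "a", "b", "c"], transcript)  # "a b" is said twice
omitted = one_hot_frames(["a", "c"], transcript)                  # "b" is never said
print(align(repeated, transcript))
print(align(omitted, transcript))
```

With the extra arcs, the repeated "a b" is mapped back onto the first two transcript tokens and the missing "b" is skipped, even though the transcript is not verbatim. With `allow_disfluency=False` the same function only permits monotonic paths, so the repetition has to be forced onto mismatching tokens and the omission case raises an error.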

Contents

  • Code for weakly-supervised forced alignment of disfluent speech. The models and code for the frame classification model are based on Charsiu
  • Code for the construction of DisfluenTIMIT, a corrupted version of the TIMIT test set with synthesized disfluencies

Usage

Weakly-Supervised Forced Alignment

optional arguments:
  -a AUDIO, --audio AUDIO   Path to the audio file.
  -t TEXT, --text TEXT      Path to the text file.
  -w, --write_textgrid      Specify whether to write a TextGrid file.
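For readers who want to script around these options, the following is a minimal sketch of an `argparse` parser that would produce a help listing like the one above. The flag names and help strings are taken from that listing; the function name `build_parser`, the description text, and the choice of `store_true` for `-w` are assumptions, and the actual entry-point script of this repository is not shown here.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the options listed above."""
    parser = argparse.ArgumentParser(
        description="Weakly-supervised forced alignment (illustrative parser)."
    )
    parser.add_argument("-a", "--audio", help="Path to the audio file.")
    parser.add_argument("-t", "--text", help="Path to the text file.")
    # Assumed to be a boolean switch, since the listing shows no value for it.
    parser.add_argument("-w", "--write_textgrid", action="store_true",
                        help="Specify whether to write a TextGrid file.")
    return parser

# Example invocation with placeholder file names.
args = build_parser().parse_args(["-a", "sample.wav", "-t", "sample.txt", "-w"])
print(args.audio, args.text, args.write_textgrid)
```

The placeholder paths `sample.wav` and `sample.txt` stand in for your own audio file and its (possibly non-verbatim) transcript.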
