
This program trains a model on labelled training data to remove speech disfluencies such as "um," "like," and "you know," keeping only the core content of the text.


kmeranda/disfluency_remover


Kelsey Meranda

Instructions to run: disfluency_remover.py trains a bigram model on data/train.txt, tests it on data/test.txt, and writes its output to output.log by default. Use the "-h" or "--help" flag to see the other available options.

Note: the trigram model takes a long time to run, so it prints its percent complete as it runs.
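For readers unfamiliar with the approach, here is a minimal sketch of how a bigram model over labelled tokens might decide which words to drop. It is not the code in disfluency_remover.py: the data format, the "F"/"D" labels, and every name below (TRAIN, train, predict, remove_disfluencies) are assumptions made for illustration, and the real format of data/train.txt may differ.

```python
from collections import Counter, defaultdict

# Hypothetical labelled data: each token is paired with "F" (fluent) or
# "D" (disfluent). This is an assumed format, not the repository's.
TRAIN = [
    [("i", "F"), ("um", "D"), ("want", "F"), ("a", "F"), ("flight", "F")],
    [("you", "D"), ("know", "D"), ("i", "F"), ("like", "F"), ("boston", "F")],
    [("it", "F"), ("was", "F"), ("like", "D"), ("really", "F"), ("late", "F")],
    [("do", "F"), ("you", "F"), ("have", "F"), ("the", "F"), ("time", "F")],
]

START = "<s>"  # sentence-start marker used as the first "previous word"


def train(sentences):
    """Count labels per (previous word, word) bigram and per single word."""
    bigram = defaultdict(Counter)
    unigram = defaultdict(Counter)
    for sent in sentences:
        prev = START
        for word, label in sent:
            bigram[(prev, word)][label] += 1
            unigram[word][label] += 1
            prev = word
    return bigram, unigram


def predict(model, prev, word):
    """Label a word from its bigram counts, backing off to unigram counts."""
    bigram, unigram = model
    for counts in (bigram.get((prev, word)), unigram.get(word)):
        if counts:
            # Ties keep the word, so only a clear "D" majority removes it.
            return "D" if counts["D"] > counts["F"] else "F"
    return "F"  # unseen words are kept


def remove_disfluencies(model, text):
    """Return the text with every word labelled "D" removed."""
    kept = []
    prev = START
    for word in text.lower().split():
        if predict(model, prev, word) == "F":
            kept.append(word)
        prev = word  # context is the original previous word, kept or not
    return " ".join(kept)


MODEL = train(TRAIN)
print(remove_disfluencies(MODEL, "it was like really late"))
print(remove_disfluencies(MODEL, "i like boston"))
```

In this toy data, "like" is dropped after "was" but kept after "i", which is the kind of context a bigram model can use and a single-word model cannot. A trigram variant would condition on the two preceding words instead of one.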
