Place to add Automatic Speech Recognition (ASR) and Speech-to-Text (STT) papers.

will-rice/asr-papers

ASR Papers

2021

Notes


Notes


2020

Notes




2019

Abstract

We present SpecAugment, a simple data augmentation method for speech recognition. SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients). The augmentation policy consists of warping the features, masking blocks of frequency channels, and masking blocks of time steps. We apply SpecAugment on Listen, Attend and Spell networks for end-to-end speech recognition tasks. We achieve state-of-the-art performance on the LibriSpeech 960h and Switchboard 300h tasks, outperforming all prior work. On LibriSpeech, we achieve 6.8% WER on test-other without the use of a language model, and 5.8% WER with shallow fusion with a language model. This compares to the previous state-of-the-art hybrid system of 7.5% WER. For Switchboard, we achieve 7.2%/14.6% on the Switchboard/CallHome portion of the Hub5'00 test set without the use of a language model, and 6.8%/14.1% with shallow fusion, which compares to the previous state-of-the-art hybrid system at 8.3%/17.3% WER.
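The two masking steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `spec_augment`, its parameter names and default widths, the `(time, freq)` layout, and the choice to fill masked regions with zeros are all assumptions made for illustration, and the time-warping step is omitted entirely.

```python
import numpy as np


def spec_augment(features, num_freq_masks=1, max_freq_width=8,
                 num_time_masks=1, max_time_width=20, rng=None):
    """Mask random blocks of frequency channels and time steps.

    Hypothetical helper, not taken from the paper's code.
    `features` is assumed to be a 2-D array shaped (time, freq).
    Masked regions are set to 0.0 (an assumption).
    Time warping is intentionally not implemented here.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(features, dtype=float, copy=True)  # never modify the input
    n_time, n_freq = out.shape

    # Frequency masking: zero a contiguous band of channels across all frames.
    for _ in range(num_freq_masks):
        width = min(int(rng.integers(0, max_freq_width + 1)), n_freq)
        start = int(rng.integers(0, n_freq - width + 1))
        out[:, start:start + width] = 0.0

    # Time masking: zero a contiguous run of frames across all channels.
    for _ in range(num_time_masks):
        width = min(int(rng.integers(0, max_time_width + 1)), n_time)
        start = int(rng.integers(0, n_time - width + 1))
        out[start:start + width, :] = 0.0

    return out


# Example: augment a dummy 100-frame, 80-channel filter bank matrix.
dummy = np.ones((100, 80))
augmented = spec_augment(dummy, rng=np.random.default_rng(0))
```

Each call draws fresh mask positions, so the same utterance is seen with different masked regions across training epochs.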

Notes


2018


2017


2016 and before


2012

Sequence Transduction with Recurrent Neural Networks

