Add audio data provider and preprocessor for speech recognition datasets. #2226

xinghai-sun · 2017-05-22T07:08:50Z

Prepare one or more public English speech recognition datasets (e.g. LibriSpeech) and their respective baselines.
Convert all audio files to .wav format.
Add a file manifest generator for each dataset, and add a merger if there is more than one dataset. Keep this interface unified across datasets.
Add a spectrogram feature extractor, power normalizer, etc.
Add a transcription text parser (tokenization, dictionary generation, etc.).
Add a batch data reader with SortaGrad.
Refer to the DS2 design doc and update it when necessary.
Please submit your code and docs to PaddlePaddle/models.
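
A rough sketch of how the manifest, dictionary, and SortaGrad pieces above could fit together, using only the Python standard library. This is not the code merged in PaddlePaddle/models#55. The function names and the manifest fields (`audio_filepath`, `duration`, `text`) are hypothetical, and a character-level dictionary is assumed. SortaGrad is simplified here to "shortest utterances first in the first epoch, random order afterwards".

```python
import json
import random
import wave
from collections import Counter


def wav_duration(path):
    """Return the duration of a .wav file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_manifest(utterances):
    """Turn (wav_path, transcript) pairs into manifest entries.

    The field names are an assumption made for this sketch.
    """
    return [
        {"audio_filepath": path, "duration": wav_duration(path), "text": text.lower()}
        for path, text in utterances
    ]


def merge_manifests(*manifests):
    """Concatenate the manifests of several datasets into one list."""
    merged = []
    for manifest in manifests:
        merged.extend(manifest)
    return merged


def save_manifest(manifest, path):
    """Write one JSON object per line so every dataset shares one format."""
    with open(path, "w", encoding="utf-8") as f:
        for entry in manifest:
            f.write(json.dumps(entry) + "\n")


def build_vocab(manifest, min_count=1):
    """Map each character seen in the transcripts to an integer id."""
    counts = Counter(ch for entry in manifest for ch in entry["text"])
    chars = sorted(ch for ch, n in counts.items() if n >= min_count)
    return {ch: idx for idx, ch in enumerate(chars)}


def sortagrad_batches(manifest, batch_size, epoch, seed=0):
    """Yield batches of manifest entries.

    Epoch 0 visits utterances from shortest to longest; later epochs
    shuffle the utterances before batching.
    """
    if epoch == 0:
        ordered = sorted(manifest, key=lambda entry: entry["duration"])
    else:
        ordered = list(manifest)
        random.Random(seed + epoch).shuffle(ordered)
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]
```

A training loop would call `sortagrad_batches(manifest, batch_size, epoch)` once per epoch and pass each batch to the feature extractor. Keeping the manifest as plain JSON lines means a merged manifest needs no dataset-specific handling downstream.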

xinghai-sun · 2017-05-24T17:36:13Z

* add gn head fpn, test=dygraph * add gn for cascade * update gn readme, test=dygraph

xinghai-sun self-assigned this May 22, 2017

xinghai-sun mentioned this issue May 22, 2017

Add audio data augmentation process to audio data provider. #2227

Closed

xinghai-sun added this to Feature Requests in Deep Speech 2 May 22, 2017

xinghai-sun moved this from Feature Requests to Developing in Deep Speech 2 May 22, 2017

This was referenced May 22, 2017

Deep Speech 2 on PaddlePaddle: Plan & Task Breakdown PaddlePaddle/models#44

Closed

Add audio data provider and a simplified DeepSpeech2 model configuration. PaddlePaddle/models#55

Merged

xinghai-sun moved this from Developing to Under Reviews in Deep Speech 2 May 24, 2017

lcy-seso closed this as completed in PaddlePaddle/models#55 Jun 2, 2017

xinghai-sun moved this from Under Reviews to Done in Deep Speech 2 Jun 2, 2017

heavengate pushed a commit to heavengate/Paddle that referenced this issue Aug 16, 2021

[dygraph] Add gnfpn and gnhead (PaddlePaddle#2226)

4cd1291

* add gn head fpn, test=dygraph * add gn for cascade * update gn readme, test=dygraph
