This repository was archived by the owner on Nov 17, 2023. It is now read-only.

ADD speech_recognition example#5954

Merged
piiswrong merged 1 commit into apache:master from ai-adv-lab:master
Apr 24, 2017

Conversation

@Soonhwan-Kwon
Contributor

Removed the implicit LICENSE file in example #5923 and added more details and guides for the example.

This example, based on Baidu's DeepSpeech2, helps you build Speech-To-Text (STT) models at scale using

  • CNNs, fully connected networks, (Bi-) RNNs, (Bi-) LSTMs, and (Bi-) GRUs for network layers,
  • batch-normalization for training efficiency,
  • and Baidu's WarpCTC for loss calculations.

Moreover, to build your own STT models, all you need to do is edit a configuration file, not the actual code.
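
To illustrate the configuration-driven idea, here is a minimal sketch of how a config file could select the layer stack. The section name `arch`, every key in it, and the `build_layer_plan` helper are hypothetical and made up for this illustration; they are not the example's actual config schema or code.

```python
import configparser

# Hypothetical config text: section and key names are illustrative only,
# not the schema used by the speech_recognition example.
EXAMPLE_CFG = """
[arch]
conv_layers = 2
rnn_type = bigru
num_rnn_layers = 3
num_hidden = 256
batch_norm = true
"""


def build_layer_plan(cfg_text):
    """Turn config text into an ordered list of layer descriptions."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    arch = parser["arch"]

    # Convolutional front end.
    plan = ["conv"] * arch.getint("conv_layers")

    # Recurrent stack, optionally followed by batch normalization per layer.
    for _ in range(arch.getint("num_rnn_layers")):
        plan.append("%s(%d)" % (arch.get("rnn_type"), arch.getint("num_hidden")))
        if arch.getboolean("batch_norm"):
            plan.append("batch_norm")

    # Fully connected output layer feeding the CTC loss.
    plan.append("fc")
    plan.append("ctc_loss")
    return plan


if __name__ == "__main__":
    for layer in build_layer_plan(EXAMPLE_CFG):
        print(layer)
```

With a layout like this, switching from GRUs to LSTMs would mean changing only `rnn_type` in the config text, which is the kind of workflow the description above refers to.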

@minsoo-jade-kim

We will add performance results in the coming weeks.

@piiswrong
Contributor

Thanks
I'll merge this first

@piiswrong piiswrong merged commit ff7589c into apache:master Apr 24, 2017
@minsoo-jade-kim

Many thanks.

@sbodenstein
Contributor

@Soonhwan-Kwon: MXNet now has a native CTC loss op: #5834

It uses the WarpCTC implementation, so it shouldn't be any slower than using the plugin. It also returns the actual CTC loss, so you don't need to calculate this with custom Python code.
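
For readers unfamiliar with what "the actual CTC loss" means here, the following is a minimal pure-Python sketch of the CTC negative log-likelihood computed with the standard forward (alpha) recursion. It is not MXNet's op and not WarpCTC; the function name `ctc_neg_log_likelihood` and its arguments are assumptions made for illustration. It also works with plain probabilities, so it would underflow on long sequences, whereas production implementations work in log space.

```python
import math


def ctc_neg_log_likelihood(probs, labels, blank=0):
    """CTC loss for one sequence.

    probs:  list of T rows; each row holds per-class probabilities
            (already softmax-normalized).
    labels: target class ids, without blanks.
    """
    # Extended target: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    num_states, num_steps = len(ext), len(probs)

    # Initialization: a path may start with a blank or the first label.
    alpha = [0.0] * num_states
    alpha[0] = probs[0][ext[0]]
    if num_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, num_steps):
        prev = alpha
        alpha = [0.0] * num_states
        for s in range(num_states):
            total = prev[s]                      # stay in the same state
            if s >= 1:
                total += prev[s - 1]             # advance by one state
            # Skip over a blank only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][ext[s]]

    # Valid paths end in the last label or the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if num_states > 1 else 0.0)
    if likelihood == 0.0:
        return float("inf")                      # no valid alignment exists
    return -math.log(likelihood)


if __name__ == "__main__":
    uniform = [[0.5, 0.5], [0.5, 0.5]]           # T=2, classes: blank and 1
    print(ctc_neg_log_likelihood(uniform, [1]))
```

In the two-step uniform case shown, three alignments produce the label `1` (`1 1`, `1 blank`, `blank 1`), each with probability 0.25, so the loss equals `-log(0.75)`.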

@Soonhwan-Kwon
Contributor Author

@sbodenstein Thank you for the great contribution! I have started using your merge, and I will revise the example.
