Commit

Merge pull request #470 from PaddlePaddle/wwhu-patch-1
fix typo
wwhu committed Nov 16, 2017
2 parents 368d16c + 029c773 commit 950f451
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions deep_speech_2/README.md
@@ -189,7 +189,7 @@ Six optional augmentation components are provided to be selected, configured and
- Noise Perturbation (need background noise audio files)
- Impulse Response (need impulse audio files)

- In order to inform the trainer of what augmentation components are needed and what their processing orders are, it is required to prepare in advance a *augmentation configuration file* in [JSON](http://www.json.org/) format. For example:
+ In order to inform the trainer of what augmentation components are needed and what their processing orders are, it is required to prepare in advance an *augmentation configuration file* in [JSON](http://www.json.org/) format. For example:

```
[{
@@ -228,7 +228,7 @@ If you wish to train your own better language model, please refer to [KenLM](htt

#### English LM

- The English corpus is from the [Common Crawl Repository](http://commoncrawl.org) and you can download it from [statmt](http://data.statmt.org/ngrams/deduped_en). We use part en.00 to train our English languge model. There are some preprocessing steps before training:
+ The English corpus is from the [Common Crawl Repository](http://commoncrawl.org) and you can download it from [statmt](http://data.statmt.org/ngrams/deduped_en). We use part en.00 to train our English language model. There are some preprocessing steps before training:

* Characters not in \[A-Za-z0-9\s'\] (\s represents whitespace characters) are removed and Arabic numbers are converted to English numbers like 1000 to one thousand.
* Repeated whitespace characters are squeezed to one and the beginning whitespace characters are removed. Notice that all transcriptions are lowercase, so all characters are converted to lowercase.
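
The preprocessing steps listed above can be sketched in Python roughly as follows. This is an illustration only, not the script used in the repository: the function names `number_to_words` and `normalize_line` are hypothetical, the number-to-words conversion is a simple hand-written routine for non-negative integers, and the order in which the steps are applied is an assumption.

```python
import re

# Hypothetical helper tables; not taken from the repository.
_ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
         "eighty", "ninety"]
_SCALES = [(10**9, "billion"), (10**6, "million"), (10**3, "thousand")]


def number_to_words(n: int) -> str:
    """Spell out a non-negative integer in English, e.g. 1000 -> 'one thousand'."""
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] + (" " + _ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return _ONES[hundreds] + " hundred" + tail
    for value, name in _SCALES:
        if n >= value:
            head, rest = divmod(n, value)
            tail = " " + number_to_words(rest) if rest else ""
            return number_to_words(head) + " " + name + tail


def normalize_line(line: str) -> str:
    """Apply the listed cleanup steps to one line of corpus text."""
    # Remove every character outside [A-Za-z0-9\s'].
    line = re.sub(r"[^A-Za-z0-9\s']", "", line)
    # Replace each run of digits with English words, padded with spaces
    # so the words do not fuse with neighbouring letters.
    line = re.sub(r"\d+",
                  lambda m: " " + number_to_words(int(m.group())) + " ",
                  line)
    # Squeeze repeated whitespace to a single space. strip() also drops
    # trailing whitespace (which the padding above can introduce); the
    # README text mentions only leading whitespace.
    line = re.sub(r"\s+", " ", line).strip()
    # Transcriptions are lowercase, so lowercase everything.
    return line.lower()


if __name__ == "__main__":
    print(normalize_line("  Hello,   World! It costs 1000 dollars."))
```

Digits belong to the allowed character set, so in this sketch they survive the first substitution and are spelled out in the second; a real pipeline may order or implement these steps differently.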
