Deepspeech2 validation WER not decreasing

# Prerequisites

Please answer the following questions for yourself before submitting an issue.

- [x] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- [x] I am reporting the issue to the correct repository. (Model Garden official or research directory)
- [x] I checked to make sure that this issue has not already been filed.

## 1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/research/deep_speech

## 2. Describe the bug

The WER does not decrease; it hovers around 1 (between 0.91 and 1.04).
I first tried training on _train-clean-100_ and evaluating on _dev-clean_. Then,
to verify that the model is learning at all, I created a very small subset of _dev-clean_,
which can be found [here](https://gist.github.com/HarshalRohit/8fc195df549f59e3ca0978777666d602), and used it for both training and evaluation.

All other settings are left at their default values.
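
For context on why the reported values can exceed 1: WER is the word-level edit distance divided by the number of reference words, so a hypothesis with many insertions can score above 1. The sketch below is only an illustration of that definition; `edit_distance` and `word_error_rate` are names made up here and are not the repository's implementation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # Two reference words, three wrong hypothesis words: WER is above 1.
    print(word_error_rate("the cat", "a b c"))
```

A WER that stays near 1 therefore suggests that almost none of the predicted words match the reference, even on the tiny subset.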

## 3. Steps to reproduce

1. Prepare the _dev-clean_ dataset.
2. Update the paths in _eval_toy_dataset.csv_.
3. In `official/utils/model_helpers.py`, change the line
`if eval_metric >= stop_threshold:` to `if eval_metric <= stop_threshold:`.
**Note:** Without this change, training stops after the first step, since the WER (around 1) is already above the threshold.
4. Run the training script:
      ```
      # Toy subset of dev-clean, used for both training and evaluation
      toy_dataset="some-prefix/outputs/librispeech_data/eval_dataset_toy.csv"

      # Timestamped log file name
      log_file="same_log_$(date +%Y-%m-%d_%H:%M)"

      # Run in the background and send stdout and stderr to the log file
      nohup python deep_speech.py --train_data_dir="$toy_dataset" --eval_data_dir="$toy_dataset" --num_gpus=1 \
           --wer_threshold=0.23 --seed=1 --batch_size=16 --train_epochs=30 \
           --model_dir=some-prefix/outputs/same_train_eval \
           --export_dir=some-prefix/outputs/same_train_eval \
           >"$log_file" 2>&1 &
      ```
**Note:** The `num_gpus` flag is set only to keep the other GPUs free. A `batch_size` greater than 16 causes an out-of-memory (OOM) error.
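
To make the reasoning behind step 3 concrete, here is a minimal sketch of a direction-aware stopping check. `should_stop` and its `lower_is_better` parameter are hypothetical and are not part of the Model Garden code; the sketch only shows why a `>=` comparison ends training immediately for a metric such as WER, where lower is better.

```python
def should_stop(eval_metric, stop_threshold, lower_is_better):
    """Return True once the metric has reached the threshold.

    Hypothetical helper for illustration only.
    """
    if stop_threshold is None:
        return False  # no threshold configured, never stop early
    if lower_is_better:
        # Error-style metrics such as WER: stop when at or below the target.
        return eval_metric <= stop_threshold
    # Accuracy-style metrics: stop when at or above the target.
    return eval_metric >= stop_threshold


if __name__ == "__main__":
    # With the unmodified ">=" comparison, a WER near 1 already "passes" 0.23.
    print(should_stop(0.95, 0.23, lower_is_better=False))
    # With "<=", training continues until WER actually reaches 0.23.
    print(should_stop(0.95, 0.23, lower_is_better=True))
```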

## 4. Expected behavior

WER should steadily decrease.

## 5. Additional context

Log for one of the runs - [link](https://gist.github.com/HarshalRohit/ca284ecc89660408522ea2a32fe982c4)

## 6. System information

- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04.7 LTS
- Mobile device name if the issue happens on a mobile device: N/A
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 2.2
- Python version: 3.8.10
- Bazel version (if compiling from source): N/A
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: 10.2
- GPU model and memory: GeForce GTX 1080 Ti 12GB
