Open
Labels
models:research (models that come under research directory), type:bug (Bug in the code)
Description
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- I am reporting the issue to the correct repository. (Model Garden official or research directory)
- I checked to make sure that this issue has not already been filed.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/research/deep_speech
2. Describe the bug
The WER does not decrease; it stays around 1 (between 0.91 and 1.04).
Initially I tried with train-clean-100 and dev-clean. Then,
to verify that the model is learning, I created a very small subset of dev-clean,
which can be found here
Everything else is kept at the default values.
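For context on why the value can exceed 1, here is a minimal sketch of WER, assuming the standard definition (word-level edit distance divided by the number of reference words). The function name `word_error_rate` is hypothetical and is not taken from the deep_speech code, which may compute the metric differently in detail.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count.

    Hypothetical helper for illustration; not the deep_speech implementation.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref_words)
```

Under this definition an empty transcript scores exactly 1.0, and a transcript with wrong words plus extra words scores above 1.0. A WER hovering between 0.91 and 1.04 is therefore consistent with a model whose output is almost entirely wrong, although this alone does not identify the cause.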
3. Steps to reproduce
- Prepare the dev-clean dataset
- Change the paths in eval_toy_dataset.csv
- Change the following line in official/utils/model_helpers.py from
  if eval_metric >= stop_threshold:
  to
  if eval_metric <= stop_threshold:
  Note: Training will stop after the first step if this is not modified.
- Execute the training bash file:
toy_dataset="some-prefix/outputs/librispeech_data/eval_dataset_toy.csv"
log_file=same_log_`date +%Y-%m-%d_%H:%M`
nohup python deep_speech.py --train_data_dir=$toy_dataset --eval_data_dir=$toy_dataset --num_gpus=1 \
  --wer_threshold=0.23 --seed=1 --batch_size=16 --train_epochs=30 \
  --model_dir=some-prefix/outputs/same_train_eval \
  --export_dir=some-prefix/outputs/same_train_eval \
  >$log_file 2>&1&
**Note:** The num_gpus flag is set only to keep the other GPUs free. A batch_size greater than 16 gives an OOM error.
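The reason for the comparison change in the steps above can be sketched as follows. This is an illustration only: `reached_stop_threshold` and its `lower_is_better` parameter are hypothetical names, not the actual helper in the Model Garden. For an error metric such as WER, lower is better, so a `>=` check is already satisfied by any untrained model.

```python
def reached_stop_threshold(stop_threshold, eval_metric, lower_is_better):
    """Return True when training should stop early.

    Hypothetical helper: for accuracy-like metrics stop once the metric is
    at or above the threshold; for error-like metrics (such as WER) stop
    once the metric is at or below the threshold.
    """
    if stop_threshold is None:
        return False  # early stopping disabled
    if lower_is_better:
        return eval_metric <= stop_threshold
    return eval_metric >= stop_threshold


# Values from this report: wer_threshold=0.23, observed WER around 0.95.
stops_with_original_check = reached_stop_threshold(0.23, 0.95, lower_is_better=False)
stops_with_modified_check = reached_stop_threshold(0.23, 0.95, lower_is_better=True)
```

With the original `>=` comparison, a WER of 0.95 already meets the 0.23 threshold, which matches the observation that training stops after the first step. With `<=`, training continues until the WER actually drops to 0.23 or below.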
4. Expected behavior
WER should steadily decrease.
5. Additional context
Log for one of the runs - link
6. System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04.7 LTS
- Mobile device name if the issue happens on a mobile device: N/A
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): 2.2
- Python version: 3.8.10
- Bazel version (if compiling from source): N/A
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: 10.2
- GPU model and memory: GeForce GTX 1080 Ti 12GB