infer_s2s.py: Load dataset (possibly sharded) ??? #69

david-gimeno opened this issue Oct 9, 2022 · 2 comments
@david-gimeno

I realized that only part of the test dataset is evaluated when running infer_s2s.py. After inspecting the code, I found the comment "Load dataset (possibly sharded)" here. Specifically, the test set of my database has around 400 samples, but only 150 are decoded. Why, and how can I solve this? I tried setting different dataset/task parameters, but with no success. I would like to get a %WER result on the whole test set for comparison and benchmarking purposes.

@chevalierNoir
Contributor

Hi,

How many GPUs did you use for decoding? The current script doesn't support multi-GPU decoding; if you use more than one GPU, only part of the dataset will be decoded. In a multi-GPU environment, you can run CUDA_VISIBLE_DEVICES=0 python infer_s2s.py ... to use only one GPU (index 0).
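As a minimal sketch of the same idea from inside Python (assuming a standard PyTorch/fairseq setup, and not part of the repo itself), the restriction can also be applied by setting the environment variable before any CUDA-aware library is imported:

```python
import os

# Must run before torch/fairseq are imported: CUDA_VISIBLE_DEVICES is read
# once at CUDA initialization time, so setting it later has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# From this point on, the process only sees GPU index 0.
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

The shell form CUDA_VISIBLE_DEVICES=0 python infer_s2s.py ... quoted above is equivalent and usually preferable, since it cannot accidentally run after CUDA initialization.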

Besides, if your test set contains long utterances, longer utterances will also be ignored (depending on max_sample_size in the fine-tuning config). You can check how many are ignored by looking for a line like [INFO] - max_keep=500, min_keep=0, loaded 1200, skipped 0 short / 0 long in the decoding log. If some utterances are ignored, you can add a line like task.cfg.max_sample_size=1000000 here in infer_s2s.py to decode all utterances.
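For reference, a log line of the form shown above can be checked mechanically. This is a hypothetical helper sketched around the example line from this thread, not code from the repo:

```python
import re

# Example decoding-log line (copied from the comment above).
line = "[INFO] - max_keep=500, min_keep=0, loaded 1200, skipped 0 short / 0 long"

# Pull out the loaded/skipped counts.
m = re.search(r"loaded (\d+), skipped (\d+) short / (\d+) long", line)
loaded, skipped_short, skipped_long = map(int, m.groups())

dropped = skipped_short + skipped_long
print(f"decoded: {loaded}, dropped: {dropped}")
```

If `dropped` is nonzero, raising max_sample_size as described above should bring those utterances back into the evaluation.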

@PussyCat0700


Yes, I just found out that decoding cannot be run on multiple GPUs (or even multiple CPUs, as long as multiprocessing is involved), but it still took me quite some time to discover this when I dug into the code.
I would therefore suggest adding a warning to README.md for users like me who may not know that infer_s2s.py can only be run in a single-GPU setting. Would you consider briefly mentioning this tip in the README in an upcoming update?
