infer_s2s.py: Load dataset (possibly sharded) ??? #69

david-gimeno opened this issue Oct 9, 2022 · 2 comments
@david-gimeno

I realized that only part of the test dataset is evaluated when running infer_s2s.py. After inspecting the code, I found the comment "Load dataset (possibly sharded)" here. Specifically, the test set of my database has around 400 samples, but only 150 are decoded. Why, and how can I solve this? I tried setting different dataset/task parameters, but with no success. I would like to get a %WER result on the whole test set for comparison and benchmarking purposes.

@chevalierNoir
Contributor

Hi,

How many GPUs did you use for decoding? The current script doesn't support multi-GPU decoding; if you use more than one GPU, only part of the dataset will be decoded. In a multi-GPU environment, you can run CUDA_VISIBLE_DEVICES=0 python infer_s2s.py ... to use only one GPU (index 0).
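As a minimal sketch of the same idea from inside Python (assuming a standard PyTorch/fairseq setup, and not part of the repo itself), the restriction can also be applied by setting the environment variable before any CUDA-aware library is imported:

```python
import os

# Must run before torch/fairseq are imported: CUDA_VISIBLE_DEVICES is read
# once at CUDA initialization time, so setting it later has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# From this point on, the process only sees GPU index 0.
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

The shell form CUDA_VISIBLE_DEVICES=0 python infer_s2s.py ... quoted above is equivalent and usually preferable, since it cannot accidentally run after CUDA initialization.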

Besides, if your test set contains long utterances, longer utterances will also be ignored (depending on max_sample_size in the fine-tuning config). You can check how many are ignored by looking for a line like [INFO] - max_keep=500, min_keep=0, loaded 1200, skipped 0 short / 0 long in the decoding log. If some utterances are ignored, you can add a line like task.cfg.max_sample_size=1000000 here in infer_s2s.py to decode all utterances.
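For reference, a log line of the form shown above can be checked mechanically. This is a hypothetical helper sketched around the example line from this thread, not code from the repo:

```python
import re

# Example decoding-log line (copied from the comment above).
line = "[INFO] - max_keep=500, min_keep=0, loaded 1200, skipped 0 short / 0 long"

# Pull out the loaded/skipped counts.
m = re.search(r"loaded (\d+), skipped (\d+) short / (\d+) long", line)
loaded, skipped_short, skipped_long = map(int, m.groups())

dropped = skipped_short + skipped_long
print(f"decoded: {loaded}, dropped: {dropped}")
```

If `dropped` is nonzero, raising max_sample_size as described above should bring those utterances back into the evaluation.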

@PussyCat0700


Yes, I just found out that decoding cannot be run on multiple GPUs (or even multiple CPUs, as long as multiprocessing is involved), but it still took me quite some time to discover this when I dug into the code.
I would therefore suggest adding a warning to README.md for users like me who may not know that infer_s2s.py can only be run in a single-GPU setting. Would you consider briefly mentioning this tip in the README in an upcoming update?
