Switchboard Recipe #1460
Conversation
Hi! Thanks for this huge contribution. Please start by making all the tests pass, and then we will proceed with a proper review :-) You can install the pre-commit and pre-push hooks quite easily by following the docs! Also, do you know how your numbers compare to ESPnet's, for instance? We can easily discuss ways of improving the performance.
Thank you for this contribution. One tip: you can also run the tests locally with `pytest`, `tests/.run-linters.sh` (for style issues), and `.run-unittests.sh` (for unit tests). You can fix most of the style issues in Python files with `black` (see https://speechbrain.readthedocs.io/en/latest/contributing.html).
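In practice, the checks mentioned above boil down to a few commands run from the repository root. This is only a sketch: the script names are taken from the comment above, and the `tests/` prefix on the unit-test script is an assumption.

```shell
# Run from the SpeechBrain repository root.
pytest                     # run the test suite
tests/.run-linters.sh      # style checks
tests/.run-unittests.sh    # unit tests (path assumed to parallel the linter script)
black --check .            # list style issues without changing files
black .                    # auto-format Python files in place
```

The pre-commit hooks referenced earlier run essentially the same linters automatically on each commit.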
I would also suggest merging the latest version of the development branch here. We just merged some additional checks and tests that are needed in this case as well.
Hi, I merged with the latest development branch, ran the linters, and made the appropriate changes, so the formatting should be ok now. Unfortunately, my local unit tests kept crashing with an ImportError while importing test module '/nfs/scratch/staff/wagnerdo/speechbrain-fork-2/tests/unittests/test_ctc_segmentation.py'.

```
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/unittests/test_ctc_segmentation.py:1: in <module>
    from speechbrain.pretrained import EncoderDecoderASR
speechbrain/__init__.py:4: in <module>
    from .core import Stage, Brain, create_experiment_directory, parse_arguments
speechbrain/core.py:35: in <module>
    from speechbrain.utils.distributed import run_on_main
speechbrain/utils/__init__.py:11: in <module>
    from . import *  # noqa
speechbrain/utils/profiling.py:11: in <module>
    from torch.autograd.profiler_util import (  # pytorch v1.10.1
E   ModuleNotFoundError: No module named 'torch.autograd.profiler_util'
```

I'm not sure whether it's a local issue with my environment or a broader problem. @TParcollet: ESPnet reports WERs of 15.6 (Callhome), 8.4 (Swbd), and 12.0 (eval2000) for their latest Transformer model (see https://github.com/espnet/espnet/blob/master/egs2/swbd/asr1/RESULTS.md), so there is definitely room for improvement.
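The `ModuleNotFoundError` above comes from an unconditional import of a submodule that only exists in newer PyTorch releases. A common way to make such a module degrade gracefully (a sketch, not SpeechBrain's actual fix; `HAVE_PROFILER_UTIL` is a made-up name) is to guard the import:

```python
# Sketch of a version-guarded import: torch.autograd.profiler_util only
# exists in newer PyTorch releases, so older installs raise
# ModuleNotFoundError on a plain import.
try:
    from torch.autograd.profiler_util import EventList  # newer PyTorch
    HAVE_PROFILER_UTIL = True
except ImportError:  # also catches ModuleNotFoundError (its subclass)
    EventList = None  # profiling helpers disabled on older PyTorch
    HAVE_PROFILER_UTIL = False
```

Callers can then check `HAVE_PROFILER_UTIL` instead of crashing at import time; the alternative is simply pinning a minimum torch version in the requirements.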
Thank you @dwgnr! It looks like the performance is not too good at the moment. As far as I remember, Switchboard recordings are sampled at 8 kHz (and not 16 kHz like the other recipes). Did you consider this? Also, I think a significant improvement could be achieved by adding a language model, right?
Hi @mravanelli, Yes, Switchboard is sampled at 8 kHz and I did indeed consider this in various ways. The seq2seq model is trained on 8 kHz audio. I also tried upsampling to 16 kHz, but it didn't make much difference. For the transformer model, the audio data is currently upsampled to 16 kHz, but I haven't tried training it on 8 kHz yet. I fully agree, adding a decent language model should definitely lead to improvements. I tried a lot of different things with RNNLMs on the seq2seq model (e.g. using a pretrained LibriSpeech LM, training a simple LSTM LM from scratch, finetuning the pretrained LibriSpeech LM), and I was careful to use the correct tokenizer for each LM. Are there any known issues with the

When looking through the detailed results, I noticed that with the LM in place, the decoder seems to be very eager to output more than just a single word, which leads to a lot of insertions. For example, with LM:
This phenomenon is less pronounced for the decoder without the LM:
There are actually quite a lot of very short utterances in the corpus, so these errors quickly accumulate. The problem also seems to invert for long utterances. For example, with LM:
And without LM:
Do you have any idea what could cause these weird insertions?
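On the 8 kHz vs. 16 kHz point: upsampling does not add spectral information above 4 kHz, which may be why it made little difference for the seq2seq model. For illustration, here is a minimal 2x upsampler based on linear interpolation. This is only a sketch: a real recipe would use a proper polyphase resampler such as torchaudio's `Resample` transform, and the function name here is invented.

```python
import numpy as np

def upsample_2x(wav):
    """Naive 8 kHz -> 16 kHz upsampling via linear interpolation.

    Illustrative only: it doubles the sample rate but cannot recreate
    the 4-8 kHz band that was never recorded.
    """
    wav = np.asarray(wav, dtype=float)
    n = len(wav)
    x_old = np.arange(n, dtype=float)
    x_new = np.arange(2 * n - 1) / 2.0  # original samples plus midpoints
    return np.interp(x_new, x_old, wav)
```

For example, `upsample_2x([0.0, 1.0, 0.0])` inserts midpoints and returns five samples.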
As for the RNNLM and TransformerLM recipes, these are the same ones we use for LibriSpeech (where we noticed quite a big improvement when using the LM). As for the decoding part, one thing that might be important is to tune the hyperparameters of the beam searcher a bit. In my experience, this is really important. Fortunately, to tune these hyperparameters you don't have to run the training again, only the final decoding (ideally on the validation set, not the test set).
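For concreteness, the decoding-only tuning described above usually amounts to overriding a handful of beam-search hyperparameters in the recipe's YAML (or on the command line). The key names and values below are hypothetical; they follow common SpeechBrain conventions, so check the recipe's actual hparams file for the real names.

```yaml
# Hypothetical beam-search block in the recipe's hparams YAML.
# Tune on the validation set; no retraining needed.
test_beam_size: 66      # larger beams trade decoding speed for WER
lm_weight: 0.40         # scales LM scores; too high can cause insertions
ctc_weight_decode: 0.40 # CTC/attention interpolation during decoding
eos_threshold: 1.5      # lower values end hypotheses earlier
coverage_penalty: 1.5   # discourages looping and over-generation
```

In particular, the LM weight and the eos/coverage settings directly affect the kind of insertion behavior reported earlier in this thread.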
@dwgnr thank you for this contribution!
As mentioned above, before the final commit, please use
Btw - do you still get
@Adel-Moumen @TParcollet what's your take - should we merge without having been able to debug/test the recipes, since Switchboard has restricted access? (From my perspective, we should merge, and if someone faces issues, we would welcome their PR. There are few other pragmatic means to wrap this PR up; there are plenty of contributions in here already.)
In case it helps your decision: |
Hi @dwgnr, getting back to this. As there have been no responses so far, I'll proceed my way. I consider your line of work completed; however, we face some technical issues here, so let's tackle them. Please let me resolve some side issues first (i.e. please do not push in the meantime):
For the latter, I still have to make up my mind about what a viable step forward would be. Perhaps I should work this out in another, new PR and then come back here. (Meanwhile, I'm working on your PR's integration in the then-new recipe integration testing PR.) For the former, I looked into different options and will push one solution in a bit:
So that I do not need to go over that again, please do not reformat and/or push to this branch ;-)
Hi @anautsch, I won't do anything to the branch, I promise ;-) Let me know if it gets too messy with all the reformatting issues. |
@dwgnr 🤣 look at the stats of your PR ... from my side the PR is good as it stands; its environment needs to be brought up to speed.
Hello, The three models have been merged in the HF SB hub. We are now waiting for this PR to be merged so that we can then switch the visibility of the models to 'public' on the HF hub :-) Many thanks, @dwgnr for your massive work! 😎 |
Hi @dwgnr, |
Hi @mravanelli, fixed it. |
Thank you! @anautsch do you think everything is ready to be merged? |
Hi @mravanelli - the PR lgtm, as said before. @dwgnr I'll first attend to this error - it's coming up across all PRs right now (I suspect some dependency updated and broke things). Then, let's get this done already; I didn't think about simply working out the test debug flags before - sorry - that would have already satisfied what I need.
@dwgnr, could you please merge the latest version of the develop branch here?
Hi @mravanelli, I merged the latest version and it auto-closed the PR. Was it supposed to do that? |
Hi @dwgnr the same happened with me in another PR; don't worry, re-opening is just fine.
(happens.)
Hi @dwgnr thank you for resolving the latest merge conflict. Please take a look at the
Is there a 'minimal' set of test flags for your scripts, so we can test them without having the actual data available? The testing uses the minimal data available at: https://github.com/speechbrain/speechbrain/tree/develop/tests/samples
If you could provide the
The PR #1600 demands more time from me than I expected. In there, the testing is made database-specific and available for all recipes, in the sense that we also know the workflows are still functional (think: dependencies may update and their interfaces could break over time).
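For reference, the kind of "minimal test flags" run being asked about typically looks like the following. The paths and flag names are assumptions: SpeechBrain recipes generally accept a `--data_folder` override, and the `--debug` flag runs only a few batches, which together make a quick smoke test possible on the bundled sample data.

```shell
# Hypothetical smoke test of the recipe on the bundled sample data
# (exact paths and hparams filename depend on the recipe layout).
python recipes/Switchboard/ASR/transformer/train.py \
    recipes/Switchboard/ASR/transformer/hparams/transformer.yaml \
    --data_folder=tests/samples \
    --debug
```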
Hi @anautsch, no problem. I'll see if I can find a minimal set of |
Hope this command helps you to just run the checks for your recipes
Hi @anautsch, I managed to get the tests running under the new recipe testing environment 🎉 In #1600, I attached the files below: I pushed the changes related to the recipes here. Hope that's fine.
Hi @dwgnr - that's a great help, thank you! Your recipe should not change too much; adequate dummy data needs to be there, which I think is what you are referring to. Right now, there is no merge conflict with develop. As for the other PR, I'll need to update it to include recipes that were released in the meantime, just like this one. The changes you made look like what I needed to do with other recipes, too. I'll need to get back to this when wrapping up the other PR; meanwhile: lgtm
Hi @anautsch, thanks for merging! This was a lot of fun :-) |
I turned public the models on the HuggingFace hub. Thanks a lot again for your work!! :-) |
Hey everybody,
I made a recipe for the Switchboard corpus.
The data preparation steps mostly follow Kaldi's s5c recipe.
The recipe includes the following models:
- ASR
- LM: `hparams` files for finetuning existing LibriSpeech LMs on Switchboard and Fisher data, one for an RNNLM and the other for a Transformer LM
- Tokenizer
Performance
The model performance is as follows:
As you can see, the performance is currently comparable to that of Kaldi's chain systems without i-vectors.
However, the models need some refinement to be on par with the best Kaldi systems available (the WER should be around 18 on the full eval2000 test set).
If you have any suggestions for improvements, I'd be happy to implement them.
I can also provide the trained models in case you are interested (I might need some help with this whole Huggingface thing though).
Best,
Dominik
PS: Thanks for all the great work you've done here! :)