Training 8 kHz language identification model: Same language during inference #2049

kirillkoncha · 2023-06-23T13:52:16Z

Hello!

I am training a language identification model on 8 kHz audio. During training, the EER falls to 0.10. However, when I test the model on the same validation set I used during training, it predicts the same language for every audio file.
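To put a number on the symptom described above, here is a minimal sketch that measures how strongly the predictions collapse onto one label. The function name `prediction_summary` and the label strings are hypothetical, chosen only for illustration; the input is assumed to be a plain list of predicted language labels collected during inference.

```python
from collections import Counter


def prediction_summary(predicted_labels):
    """Return (most_common_label, share_of_all_predictions)."""
    if not predicted_labels:
        raise ValueError("no predictions given")
    counts = Counter(predicted_labels)
    # most_common(1) yields a one-element list of (label, count) pairs.
    label, count = counts.most_common(1)[0]
    return label, count / len(predicted_labels)


# Hypothetical predictions: every file gets the same label.
preds = ["en", "en", "en", "en"]
label, share = prediction_summary(preds)
print(f"most common prediction: {label} ({share:.0%} of files)")
```

A share close to 100% on a validation set that contains several languages would confirm that the outputs are collapsed, as opposed to merely being biased toward one class.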

I ran into a similar problem when fine-tuning a 16 kHz model; it was solved by shuffling the batches. I have made sure that batches are shuffled in the current training run.

It seems to me that the problem could be in inference. Why does the documentation of EncoderClassifier state that the audio must be sampled at 16000 Hz?

wavs : torch.Tensor
Batch of waveforms [batch, time, channels] or [batch, time]
depending on the model. Make sure the sample rate is fs=16000 Hz.
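Given the docstring above, one thing worth ruling out is a sample-rate mismatch between the training data and the audio fed to the model at inference. The sketch below is a minimal check, not part of any library: the helper names `wav_sample_rate` and `check_sample_rate` are hypothetical, and it assumes uncompressed PCM WAV files, which is all the standard-library `wave` module can read.

```python
import wave


def wav_sample_rate(path):
    """Read the sample rate (Hz) from a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def check_sample_rate(path, expected_hz=8000):
    """Raise ValueError if the file's sample rate differs from expected_hz."""
    actual_hz = wav_sample_rate(path)
    if actual_hz != expected_hz:
        raise ValueError(
            f"{path}: sample rate is {actual_hz} Hz, expected {expected_hz} Hz"
        )
    return actual_hz


# Usage (the path is hypothetical):
# check_sample_rate("validation/sample_0001.wav", expected_hz=8000)
```

This only verifies the files on disk. If the inference code resamples the audio after loading it (for example to 16000 Hz), the tensor that reaches the model can still differ from what the model saw during training, so the rate of the loaded waveform should be checked as well.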

kirillkoncha · 2023-06-28T11:53:14Z

@TParcollet
