Cannot reproduce reported SDR & retrain the speaker embedding #30

nnbtam99 · 2022-01-27T03:38:19Z

Hello, I have two questions about the implementation.

I cannot reproduce the results reported in the README.
I have trained for more than 400k steps on the LibriSpeech 360h + 100h clean datasets, using the embedder provided in this repo.
However, the best SDR I can obtain is 5.5.
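
For context when comparing numbers, here is a minimal sketch of SDR as a plain energy ratio, 10 * log10(||s||^2 / ||s - s_hat||^2). This is an assumption about the metric, not the repo's evaluation code: a BSS-eval style SDR can give different values for the same audio, so scores are only comparable when both sides use the same definition. The function name `simple_sdr` is hypothetical.

```python
import math


def simple_sdr(reference, estimate):
    """Return the energy-ratio SDR in dB for two equal-length sample sequences.

    Assumption: plain 10*log10(signal energy / error energy), with no
    BSS-eval style projection or filtering.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")

    # Energy of the clean target signal.
    signal_energy = sum(r * r for r in reference)
    # Energy of the residual between the target and the estimate.
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))

    if signal_energy == 0.0:
        raise ValueError("reference signal is silent; SDR is undefined")
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction

    return 10.0 * math.log10(signal_energy / error_energy)
```

An estimate that is the target scaled by 0.9 scores about 20 dB under this definition, which shows how sensitive the plain ratio is to gain differences alone.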

To build the training data from LibriSpeech 360h + 100h, I generate the mixed audio for the 360h and 100h subsets separately, then combine the outputs in a single folder. Is this the right approach when I want to use more data to train the voice filter module?
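
The merge step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `merge_mixture_dirs`, the tags, and the directory layout are hypothetical, and it assumes the two generation runs may produce identically named files, so each file is given a per-subset prefix rather than being overwritten silently.

```python
import shutil
from pathlib import Path


def merge_mixture_dirs(sources, dest):
    """Copy every file from each source directory into dest.

    sources maps a short tag (hypothetical, e.g. "train360") to a directory.
    Each copied file is renamed to "<tag>_<original name>" so that files with
    the same name in different subsets do not collide.
    Returns the number of files copied.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    copied = 0
    for tag, src in sources.items():
        for path in sorted(Path(src).iterdir()):
            if not path.is_file():
                continue  # skip nested directories
            target = dest / f"{tag}_{path.name}"
            if target.exists():
                # Refuse to overwrite data from an earlier merge.
                raise FileExistsError(f"refusing to overwrite {target}")
            shutil.copy2(path, target)
            copied += 1
    return copied
```

Prefixing keeps the files that belong to one example together only if the loader groups files by their shared name stem, so the training code's file-matching pattern would need to tolerate the added prefix.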

I got worse results when retraining the speaker embedding
I retrained the embedder on three datasets (LibriSpeech, VoxCeleb1, and VoxCeleb2) using the following repo: Speaker verification.

In theory, I would expect the voice filter module to benefit from an embedder trained on more data, but the results got even worse. Could you share how you trained the embedder provided in this repo?

Thank you in advance!
