[Bug]: Custom SpeakerID model troubles #1996
Unanswered · andresgvargas asked this question in Q&A
Replies: 2 comments
-
Hey @andresgvargas, I converted your issue to a discussion as it does not seem to be related to a bug in SpeechBrain. Could you please share some of your training logs regarding your model? What is your training data? Thanks.
-
Hi @Adel-Moumen!
-
Describe the bug
Hi everyone!
I'm having some trouble using a custom model that I trained on my own audio. After training, I followed some of the Colab tutorials to load and use the custom model.
If I use `classify_batch`, it correctly identifies the speaker, as shown here:
But if I try to compare audios with `verify_files`, it always returns a tensor of 0, as shown here:
This behaviour only happens with my own model; the one trained on VoxCeleb that's on Hugging Face works just fine. I'd like some guidance on where I'm going wrong, or whether it's something about my use of SpeechBrain.
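For context on why a constant 0 is suspicious: as far as I understand, `verify_files` ultimately reduces to a cosine similarity between the two utterance embeddings, compared against a threshold. A minimal stdlib-only sketch of that comparison (the function name and toy values here are illustrative, not SpeechBrain's API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Degenerate (all-zero) embedding: similarity is undefined;
        # returning 0 here mirrors a "no match" score.
        return 0.0
    return dot / (norm_a * norm_b)

# Two toy "embeddings" of the same speaker should score near 1.0 ...
emb1 = [0.2, 0.5, -0.1]
emb2 = [0.21, 0.48, -0.12]
print(cosine_similarity(emb1, emb2))

# ... while an all-zero embedding always scores 0, for every pair:
print(cosine_similarity(emb1, [0.0, 0.0, 0.0]))  # → 0.0
```

So a score of exactly 0 for every file pair usually points at the embeddings themselves, not at the scoring step.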
Expected behaviour
I expected values different from 0 when using `verify_files`.
To Reproduce
Inference YAML:

```yaml
# Pretrain folders:
pretrained_path: best_model/

# Model parameters
n_mels: 40
sample_rate: 48000
n_classes: 33 # In this case, we have 28 speakers
emb_dim: 512 # dimensionality of the embeddings

# Feature extraction
compute_features: !new:speechbrain.lobes.features.Fbank
    n_mels: !ref <n_mels>

# Mean and std normalization of the input features
mean_var_norm: !new:speechbrain.processing.features.InputNormalization
    norm_type: sentence
    std_norm: False

# Mean and std normalization of the embeddings
mean_var_norm_emb: !new:speechbrain.processing.features.InputNormalization
    norm_type: sentence
    std_norm: False

embedding_model: !new:custom_model.Xvector
    in_channels: !ref <n_mels>
    activation: !name:torch.nn.LeakyReLU
    tdnn_blocks: 5
    tdnn_channels: [512, 512, 512, 512, 1500]
    tdnn_kernel_sizes: [5, 3, 3, 1, 1]
    tdnn_dilations: [1, 2, 3, 1, 1]
    lin_neurons: !ref <emb_dim>

classifier: !new:custom_model.Classifier
    input_shape: [null, null, !ref <emb_dim>]
    activation: !name:torch.nn.LeakyReLU
    lin_blocks: 1
    lin_neurons: !ref <emb_dim>
    out_neurons: !ref <n_classes>

label_encoder: !new:speechbrain.dataio.encoder.CategoricalEncoder

modules:
    compute_features: !ref <compute_features>
    embedding_model: !ref <embedding_model>
    classifier: !ref <classifier>
    mean_var_norm: !ref <mean_var_norm>
    # mean_var_norm_emb: !ref <mean_var_norm_emb>

pretrainer: !new:speechbrain.utils.parameter_transfer.Pretrainer
    loadables:
        embedding_model: !ref <embedding_model>
        classifier: !ref <classifier>
        label_encoder: !ref <label_encoder>
        mean_var_norm: !ref <mean_var_norm>
        # mean_var_norm_emb: !ref <mean_var_norm_emb>
    paths:
        embedding_model: !ref <pretrained_path>/embedding_model.ckpt
        classifier: !ref <pretrained_path>/classifier.ckpt
        label_encoder: !ref <pretrained_path>/label_encoder.txt
        mean_var_norm: !ref <pretrained_path>/normalizer.ckpt
        # mean_var_norm_emb: !ref <pretrained_path>/normalizer.ckpt
```
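Two things worth checking here (these are guesses on my part, not a confirmed diagnosis): `verify_files` comes from the `SpeakerRecognition` interface, whose reference VoxCeleb hparams also use an embedding normalizer (`mean_var_norm_emb`), which is commented out in this YAML; and a score of exactly 0 for every pair usually means the embeddings are degenerate, e.g. because the `Pretrainer` quietly failed to load `embedding_model.ckpt`. A stdlib-only sanity check one could adapt, where the embedding values are placeholders standing in for whatever `encode_batch` actually returns:

```python
import statistics

def looks_degenerate(embedding, tol=1e-6):
    """Heuristic: an embedding whose values are (near-)constant carries no
    speaker information, so every pairwise score collapses to one value."""
    return statistics.pstdev(embedding) < tol

# Placeholder embeddings standing in for encode_batch(...) output:
loaded_ok = [0.3, -1.2, 0.7, 0.05]   # healthy spread of values
never_loaded = [0.0, 0.0, 0.0, 0.0]  # e.g. weights never loaded

print(looks_degenerate(loaded_ok))    # → False
print(looks_degenerate(never_loaded)) # → True
```

If the real embeddings from the custom model fail this check while the VoxCeleb ones pass it, the problem is in checkpoint loading rather than in the verification call itself.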
Versions
No response
Relevant log output
No response
Additional context
No response