
How to train SEA model #4

Open
cyxomo opened this issue Aug 10, 2021 · 14 comments

Comments


cyxomo commented Aug 10, 2021

The pretrained model sea.ckpt fits a dataset with 82 speakers. However, I have a huge dataset with at least 300 speakers. How could I train a corresponding SAE model?

@auspicious3000 (Owner)

Do you mean SEA?

You can refer to the SEA paper for training details.


cyxomo commented Aug 10, 2021

I seem to have made a mistake. Actually, when preparing the data, only the Encoder part of the SEA model is used. But I'm not sure whether changing the speakers will make a difference.


cyxomo commented Aug 10, 2021

Does it matter if I take my own data and extract the features with the 82-speaker SEA model that you pre-trained?


cyxomo commented Aug 10, 2021

> Do you mean SEA?
>
> You can refer to the SEA paper for training details.

Yeah, sorry for the spelling mistake.

@cyxomo cyxomo changed the title How to train SAE model How to train SEA model Aug 10, 2021
@auspicious3000 (Owner)

The performance might degrade, but feel free to try.


cyxomo commented Aug 10, 2021

> The performance might degrade, but feel free to try.

So the right thing to do is to train an SEA model on my own data and then extract the features. Could the SEA training code be provided?

@auspicious3000 (Owner)

The majority of the code for SEA is here. You just need a data loader and an optimizer.
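To illustrate what "a data loader and an optimizer" could look like around the existing SEA model code, here is a minimal training-loop sketch. It assumes a model whose forward takes (MFCC features, speaker one-hot, mask) and returns two reconstructions, and a dataset yielding matching triples; the function name `train_sea` and all hyperparameters are illustrative, not the repo's actual API.

```python
# Hedged sketch of an SEA training loop (assumed interface, not the repo's).
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

def train_sea(model, dataset, epochs=10, lr=1e-4, batch_size=16, device='cpu'):
    """Train an SEA-style autoencoder with a plain Adam optimizer."""
    model = model.to(device).train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    for _ in range(epochs):
        for cep, spk_onehot, mask in loader:
            cep = cep.to(device)
            # Two decoder outputs: direct and via the self-expressing path
            out_a, out_b = model(cep, spk_onehot.to(device), mask.to(device))
            loss = F.mse_loss(out_a, cep) + F.mse_loss(out_b, cep)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```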


cyxomo commented Aug 10, 2021

> The majority of the code for SEA is here. You just need a data loader and an optimizer.

OK, do you use a loss function like this?
[image: loss function]

@auspicious3000 (Owner)

Yes


vasyarv commented Sep 5, 2021

@auspicious3000 what is c_trg in model_sea.Generator.forward? It is an input to the Decoder's LSTM, and its dimension is the same as hparams.dim_spk, which is 82, but I still have no idea how to construct it ...

@auspicious3000 (Owner)

It is the one-hot speaker embedding.
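For concreteness, a one-hot speaker embedding of dimension hparams.dim_spk can be built as below. The speaker indices here are made-up examples; only the 82-speaker dimension comes from the thread.

```python
# Sketch: c_trg as a one-hot speaker embedding (batch, num_spk).
import torch
import torch.nn.functional as F

num_spk = 82                      # hparams.dim_spk in this repo
spk_id = torch.tensor([5, 17])    # example per-utterance speaker indices
c_trg = F.one_hot(spk_id, num_classes=num_spk).float()
# Each row has a single 1.0 at the speaker's index, zeros elsewhere.
```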


stalevna commented Nov 8, 2021

> Do you mean SEA?
>
> You can refer to the SEA paper for training details.

Hi! Could you point me to the SEA paper? I want to make sure I am reading the right one

@auspicious3000 (Owner)

Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery


wang1612 commented Nov 16, 2021

@auspicious3000
Could you check my code for the SEA training loss below:

```python
# cep_real0 is the full MFCC, not truncated by [:, 0:20]
mask_sp_real = ~sequence_mask(len_real, cep_real0.size(1))
mask = (~mask_sp_real).float()
self.P = self.P.train()
# mel_outputs_B is the decoder output for the self-expressing autoencoded Z
mel_outputs, mel_outputs_B = self.P(cep_real, spk_emb, mask)
loss_A = F.mse_loss(mel_outputs, cep_real0, reduction='mean')
loss_B = F.mse_loss(mel_outputs_B, cep_real0, reduction='mean')
p_loss = loss_A + loss_B
```
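The snippet above calls a sequence_mask helper that isn't shown. A common implementation of this idiom, offered here as an assumption rather than the repo's actual code, returns True for valid (non-padded) time steps:

```python
# Hedged sketch of a typical sequence_mask helper (assumed, not from the repo).
import torch

def sequence_mask(lengths, max_len=None):
    """Return a (batch, max_len) bool mask; True marks valid time steps."""
    if max_len is None:
        max_len = int(lengths.max())
    ids = torch.arange(max_len, device=lengths.device)
    return ids.unsqueeze(0) < lengths.unsqueeze(1)
```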
