
How to apply this repo for another emotion task? #22

Closed
tsly123 opened this issue Nov 29, 2018 · 4 comments
tsly123 commented Nov 29, 2018

Hi,
Thank you for your work.
Thank you for your work.
I've read the instructions and the SincNet paper. I wonder how I can use pytorch-kaldi and, especially, SincNet for an emotion recognition task, since the repo instructions and the SincNet paper are all about speaker identification, which differs from emotion recognition in terms of labels.
For example, is all I need to do to modify the label file TIMIT_labels.npy to hold the labels of my emotion dataset (0-7, for 8 emotions), of course along with the other instruction steps?

Thank you for your time.
tsly

mravanelli (Owner)

Hi,
this repository is mainly intended for speech recognition. You are probably talking about the other repository, where we used SincNet for speaker id (https://github.com/mravanelli/SincNet). To address another task you have to change the dataset and labels. To assign each sentence the right label, you have to modify the dictionary "TIMIT_labels.npy", as you pointed out. When you change the task, it can be very important to properly tune the hyperparameters of the model (e.g., cw_len, cnn_N_filt, cnn_len_filt, fc_lay, lr) to make them more suitable for the new task.
Please let me know if you manage to make it work!

Thank you!
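For illustration, a minimal sketch of how such a label dictionary could be built for an 8-class emotion task, assuming the same np.save/np.load convention as TIMIT_labels.npy; the file paths and class assignments below are placeholders, not from this thread:

```python
import numpy as np

# Hypothetical mapping from utterance wav files to emotion classes 0-7;
# every file listed in the train/test lists needs an entry here.
emotion_labels = {
    'train/spk01_angry_0001.wav': 0,
    'train/spk01_happy_0002.wav': 1,
    'test/spk12_sad_0003.wav': 2,
    # ... one entry per utterance
}

# SincNet reads the dictionary back with np.load(...).item()
# (newer NumPy versions also need allow_pickle=True), so saving it
# the same way TIMIT_labels.npy was produced should work.
np.save('EMO_labels.npy', emotion_labels)
```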

tsly123 (Author) commented Nov 30, 2018

Hi,
Thank you for your reply. The repo instructions are very informative. I will get back to you when I am able to run my fusion models.

Again, thank you for your time.
tsly

tsly123 (Author) commented Dec 1, 2018

Hi,
I apologize about this, but after struggling with Kaldi ASR (I'm new to Kaldi), I realized that my EmotiW dataset, which contains *.avi files only, can't be prepared as instructed in the TIMIT tutorial, which requires other files to be created first (as stated in Kaldi for Dummies), such as text, lexicon, or spk2utt, etc.

Is there another way to do the data preparation and alignment by myself, like preparing pre-extracted features and labels in a format compatible with pytorch-kaldi?
I've tried running the Librispeech s5 recipe and other free datasets with Kaldi to understand the structure of the prepared data, but I always got some errors. I've also looked at the kaldi-io-for-python repo and thought that the features could be converted to an ark file with it (roughly as in the sketch below), but for the labels and alignments I don't know how to do it.
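A rough sketch of that feature-conversion step, assuming the kaldi-io-for-python package is importable as kaldi_io; the utterance IDs and feature matrices are placeholders:

```python
import numpy as np
import kaldi_io  # from the kaldi-io-for-python repo

# Placeholder features: one (frames x feature_dim) matrix per utterance.
feats = {
    'utt_0001': np.random.randn(500, 40).astype(np.float32),
    'utt_0002': np.random.randn(420, 40).astype(np.float32),
}

# Write the matrices to a Kaldi ark file, keyed by utterance id.
with open('feats.ark', 'wb') as f:
    for utt_id, mat in feats.items():
        kaldi_io.write_mat(f, mat, key=utt_id)
```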

Thank you for your time.
tsly

mravanelli (Owner) commented Dec 1, 2018 via email
