Hi,
I see you have several models for Speech Emotion Recognition (SER).
Would you say Vesper is the best?
I have also noticed that you use acted databases for training. In your experience, does learning transfer well between acted and so-called natural databases?
I was considering training a (unimodal audio) model on CMU-MOSEI to check whether training on a natural database would produce a better-performing model in real-life scenarios.
What do you think?
Of course, one could argue that a significant percentage of the utterances from YouTube are also acted and do not reflect real emotions, in which case it would be better to go with professional actors than with amateur YouTubers...
Best,
Ed