Question about the settings in speech_data_simulator #9288
Comments
Just in case, I've been using the latest version of NeMo with:
@tango4j, could you check the above issue with
We need a little more time to figure out why
You're right that the
@stevehuang52 Thank you for figuring out this issue. You can check the dataset I used and the simulated meetings I've generated, in case it helps:
Thanks a lot. Then I'll set it to an integer in my case.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
(Just a bump)
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been inactive for 7 days since being marked as stale.
Hi, I'm currently using `NeMo/tools/speech_data_simulator` to fine-tune the MSDD model and have some questions about the data_simulator.

1. How can I ensure that every session has exactly as many speakers as `num_speakers`?

Currently in my case, sessions are occasionally created that contain fewer speakers than `num_speakers`. This seemed to become more frequent as `num_speakers` became larger than 4. For example, I've created 32 sessions with `num_speakers` as 4, but 9 sessions include only 3 speakers.

I used a custom dataset as an input to this simulator, and the total number of speakers in the dataset was around 50.
The minimum number of utterances from speakers was 300, and the average length of an utterance was about 5 seconds.
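To quantify how many sessions fall short of `num_speakers`, one option is to count the distinct speaker labels in each session's RTTM annotation. The following is a minimal sketch, assuming the simulator writes one standard RTTM file per session and that the speaker label is the eighth whitespace-separated field of each `SPEAKER` line. The helper names `count_speakers` and `sessions_missing_speakers` are hypothetical and not part of NeMo.

```python
from typing import Dict, List


def count_speakers(rttm_text: str) -> int:
    """Count distinct speaker labels in the contents of one RTTM file."""
    speakers = set()
    for line in rttm_text.splitlines():
        fields = line.split()
        # Standard RTTM: SPEAKER <file> <chan> <onset> <dur> <NA> <NA> <speaker> ...
        if len(fields) >= 8 and fields[0] == "SPEAKER":
            speakers.add(fields[7])
    return len(speakers)


def sessions_missing_speakers(sessions: Dict[str, str], num_speakers: int) -> List[str]:
    """Return names of sessions with fewer distinct speakers than expected."""
    return sorted(
        name for name, text in sessions.items()
        if count_speakers(text) < num_speakers
    )


# Illustrative, hand-written RTTM content (not real simulator output).
example = (
    "SPEAKER sess_0 1 0.00 1.50 <NA> <NA> spk_a <NA> <NA>\n"
    "SPEAKER sess_0 1 1.50 2.00 <NA> <NA> spk_b <NA> <NA>\n"
    "SPEAKER sess_0 1 3.50 1.00 <NA> <NA> spk_a <NA> <NA>\n"
)
short_sessions = sessions_missing_speakers({"sess_0": example}, num_speakers=4)
```

In practice, `sessions` would be built by reading each generated RTTM file into a string, keyed by session name.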
As far as I can tell, the following parameters are related to the above question:
NeMo/tools/speech_data_simulator/conf/data_simulator.yaml
Lines 8 to 9 in 0e744c9
NeMo/tools/speech_data_simulator/conf/data_simulator.yaml
Lines 79 to 83 in 0e744c9
NeMo/tools/speech_data_simulator/conf/data_simulator.yaml
Lines 18 to 21 in 0e744c9
I tried tweaking the settings to fix this, but nothing worked.
My current setup is as follows:
2. Why is the default value of `sentence_length_params` not an integer?

According to the comments, the value of `sentence_length_params` must be a positive integer, but it is set to `0.4`. Sessions are created fine with this setting, but I'd like to ask why this is the default.
NeMo/tools/speech_data_simulator/conf/data_simulator.yaml
Lines 15 to 17 in 0f2874b
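One possible reason a non-integer value still works: if the first entry of `sentence_length_params` is the `k` parameter of a negative binomial distribution (an assumption here, not confirmed in this thread), then the distribution is mathematically well defined for any real `k > 0`, because the factorials in its mass function generalize through the gamma function. The sketch below checks this numerically with the standard library only. The value `p = 0.05` is an illustrative assumption, and `neg_binomial_pmf` is a hypothetical helper, not a NeMo function.

```python
import math


def neg_binomial_pmf(count: int, k: float, p: float) -> float:
    """P(X = count) for a negative binomial with real k > 0 and 0 < p < 1."""
    log_pmf = (
        math.lgamma(count + k) - math.lgamma(k) - math.lgamma(count + 1)
        + k * math.log(p) + count * math.log1p(-p)
    )
    return math.exp(log_pmf)


k, p = 0.4, 0.05          # k from the issue; p is an assumed, illustrative value
support = range(5000)     # far enough into the tail for these parameters

# The probabilities should sum to ~1 even though k is not an integer.
total = sum(neg_binomial_pmf(n, k, p) for n in support)

# The numerical mean should match the closed form k * (1 - p) / p.
mean = sum(n * neg_binomial_pmf(n, k, p) for n in support)
expected_mean = k * (1 - p) / p
```

This only shows that a fractional `k` yields a valid distribution. Whether the simulator intends it, or the config comment is simply stricter than necessary, is a question for the maintainers.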
Thank you in advance.