This repository has been archived by the owner on Oct 10, 2022. It is now read-only.

Several questions #13

Closed
JBloodless opened this issue Oct 15, 2019 · 5 comments
Labels
question Further information is requested

Comments

@JBloodless

JBloodless commented Oct 15, 2019

Hi.

  1. Is there any gender separation, or did you not track this info?
  2. Are any of these files free of noise and, if so, can I easily identify them?
  3. Which dataset has the largest number of different speakers?
@snakers4 snakers4 added the question Further information is requested label Oct 15, 2019
@snakers4
Owner

Hi,

(1)
We did not track this.
We do not have proper labels for speakers in this dataset.

(2)
There is an issue where I posted a benchmark on the dataset with CER (character error rate).
You can take a subset with low CER, e.g. CER < 5%.
It will most likely be free of noise.
Also note that some parts of the dataset, e.g. prank calls, are very noisy by definition.

(3)
Hard to say, but probably radio_2 or youtube.
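
For point (2), a minimal sketch of the CER filter might look like the following. It assumes a CSV manifest with a `wav_path` column and a `cer` column stored as a fraction (0.05 = 5%). Both column names, the fraction convention, and the helper `select_low_cer` are hypothetical and are not taken from the dataset's actual files, so adjust them to whatever the benchmark issue provides.

```python
import csv
import io


def select_low_cer(rows, threshold=0.05):
    """Keep rows whose CER is strictly below `threshold`.

    Assumes each row is a dict with a hypothetical "cer" field holding
    a fraction (0.05 means 5%). Rows with a missing or non-numeric CER
    are skipped rather than treated as clean.
    """
    kept = []
    for row in rows:
        try:
            cer = float(row["cer"])
        except (KeyError, TypeError, ValueError):
            continue
        if cer < threshold:
            kept.append(row)
    return kept


# Toy manifest standing in for the real benchmark file (made-up values).
SAMPLE_MANIFEST = """wav_path,cer
a.wav,0.01
b.wav,0.20
c.wav,0.049
d.wav,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_MANIFEST)))
clean = select_low_cer(rows)
print([r["wav_path"] for r in clean])
```

Low CER is only a proxy for clean audio: it reflects how well one model transcribed the file, so a spot check by ear on the selected subset is still worthwhile.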

@snakers4
Owner

> We do not have proper labels for speakers in this dataset

Though we are planning to apply our speaker encoders retroactively, as soon as we have trained them

@JBloodless
Author

OK, what does speaker_set in the public meta mean? Can I assume that the speakers within one set are the same (same pitch, spectral coloration, etc.)?

@snakers4
Owner

> OK, what does speaker_set in the public meta mean? Can I assume that the speakers within one set are the same (same pitch, spectral coloration, etc.)?

Audio files that share a speaker_set value come from the same set of speakers.
You can assume that if speaker_set1 != speaker_set2, then for any two audio files, one from set1 and one from set2, the speakers are guaranteed to be different.

@snakers4
Owner

I.e. speaker_set1 may contain N speakers, and
speaker_set2 may contain M speakers, none of whom are among those N.
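
One practical use of this guarantee is a speaker-disjoint train/validation split. The sketch below assumes each metadata row is a dict with a `speaker_set` field and a `wav_path` field. The field layout, the example set names, and the helper `split_by_speaker_set` are hypothetical illustrations, not the dataset's actual schema.

```python
def split_by_speaker_set(rows, val_speaker_sets):
    """Split rows so that no speaker_set appears on both sides.

    Rows whose (hypothetical) "speaker_set" field is in
    `val_speaker_sets` go to validation; all others go to training.
    """
    val_speaker_sets = set(val_speaker_sets)
    train = [r for r in rows if r["speaker_set"] not in val_speaker_sets]
    val = [r for r in rows if r["speaker_set"] in val_speaker_sets]
    return train, val


# Made-up metadata rows for illustration only.
rows = [
    {"wav_path": "a.wav", "speaker_set": "set_1"},
    {"wav_path": "b.wav", "speaker_set": "set_1"},
    {"wav_path": "c.wav", "speaker_set": "set_2"},
    {"wav_path": "d.wav", "speaker_set": "set_3"},
]

train, val = split_by_speaker_set(rows, {"set_2"})

# Different speaker_set values imply different speakers, so disjoint
# speaker_set values give a speaker-disjoint split.
train_sets = {r["speaker_set"] for r in train}
val_sets = {r["speaker_set"] for r in val}
assert train_sets.isdisjoint(val_sets)
```

Note that this only separates speakers across sets. Since one speaker_set may contain several speakers, two files from the same set are not guaranteed to share a speaker.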

@snakers4 snakers4 closed this as completed Nov 4, 2019