Problem when running the sre10/v1 run.sh #1014
Comments
Have you modified the sre10/v1 example? Are you using the same datasets? Could you copy the first few lines of exp/scores_gmm_${num_components}_${x}_${y}/plda_scores and $trials here?
First, thanks for your reply. I used the TIMIT dataset. Here is the scores_gmm_1024_ind_female/plda_scores info: Thanks.
And the data/test/trials info is: Could you copy the first few lines of trials for me?
Mine looks like this:
etc If you look at local/prepare_for_eer.py you'll see that it's very simple. All it does is prepare the input to the binary compute-eer. The expected input is of the form
etc Could you try running python local/prepare_for_eer.py by itself, and copy some of the output here?
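For readers who do not have the script open, the step described above can be sketched roughly as follows. This is not the actual local/prepare_for_eer.py. It is a minimal sketch that assumes each trials line has the form `<speaker> <utterance> <target|nontarget>`, each plda_scores line has the form `<speaker> <utterance> <score>`, and compute-eer reads lines of the form `<score> <target|nontarget>`. The function name `pair_scores_with_labels` is hypothetical.

```python
def pair_scores_with_labels(trial_lines, score_lines):
    """Return '<score> <label>' lines by matching (speaker, utterance) keys.

    Assumed formats (not taken from the original script):
      trials: '<speaker> <utterance> <target|nontarget>'
      scores: '<speaker> <utterance> <score>'
    """
    labels = {}
    for line in trial_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        spk, utt, label = fields
        labels[(spk, utt)] = label

    paired = []
    for line in score_lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        spk, utt, score = fields
        # Only keep scores whose trial appears in the trials file.
        if (spk, utt) in labels:
            paired.append("%s %s" % (score, labels[(spk, utt)]))
    return paired


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    trials = ["spkA a1 target", "spkA b1 nontarget"]
    scores = ["spkA a1 12.5", "spkA b1 -3.0"]
    for out_line in pair_scores_with_labels(trials, scores):
        print(out_line)
```

Running the pairing step on its own like this makes it easy to see whether both labels actually occur in the output before it is handed to compute-eer.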
First, thank you very much! As you can see, the "istarget" column value is "target" for all lines; there is no "nontarget". The reason is that I don't have the "$db_base/keys/coreext-coreext.trialkey.csv" file, so I don't know whether the value of "istarget" should be "target" or "nontarget". Thank you very much!
A "target" trial is where the utterance is spoken by the speaker. A "nontarget" trial is where the utterance is spoken by a different speaker. If your verification system has no errors, target trials should be accepted, and nontarget trials should be rejected by your verification system. What I suggest is that you look at the spk2utt file, and write a script that generates a trials file for you. For each speaker, you can take the speaker id and all the utterances that belong to it and pair them up to create the "target" trials. For example:
To create the nontarget trials, you can randomly pair a speaker with utterances belonging to another speaker. For example:
Since you'll need to write a script to do this, you can create an option that controls the probability of forming nontarget trials, and you can generate several trials files with different percentages of nontargets and see how the results vary between them. Since the scripts evaluate on the equal error rate (EER), most likely it won't make a big difference. E.g., if your EER is 10% with 50% nontarget trials, it will likely continue to be 10% with 80% nontarget trials.
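The script suggested above might look something like the following. It is only a sketch under stated assumptions: spk2utt lines have the usual Kaldi form `<speaker> <utt1> <utt2> ...`, the trials file uses lines of the form `<speaker> <utterance> <target|nontarget>`, and the speaker ids in spk2utt match the enrolled speaker ids. The names `read_spk2utt`, `make_trials`, and `nontarget_prob` are hypothetical. Note that `nontarget_prob` is the probability of keeping each candidate cross-speaker pair, not the final fraction of nontarget trials.

```python
import random


def read_spk2utt(lines):
    """Parse spk2utt lines of the form '<speaker> <utt1> <utt2> ...'."""
    spk2utt = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            spk2utt[fields[0]] = fields[1:]
    return spk2utt


def make_trials(spk2utt, nontarget_prob=0.5, seed=0):
    """Return trial lines '<speaker> <utterance> <target|nontarget>'.

    Each speaker paired with one of its own utterances is a target trial.
    Each speaker paired with another speaker's utterance is kept as a
    nontarget trial with probability nontarget_prob.
    """
    rng = random.Random(seed)  # fixed seed so the trials file is reproducible
    trials = []
    for spk in sorted(spk2utt):
        for utt in spk2utt[spk]:
            trials.append("%s %s target" % (spk, utt))
        for other in sorted(spk2utt):
            if other == spk:
                continue
            for utt in spk2utt[other]:
                if rng.random() < nontarget_prob:
                    trials.append("%s %s nontarget" % (spk, utt))
    return trials


if __name__ == "__main__":
    # Made-up speakers and utterances, for illustration only.
    demo = read_spk2utt(["spkA a1 a2", "spkB b1"])
    for trial in make_trials(demo, nontarget_prob=0.5):
        print(trial)
```

Generating a few files with different `nontarget_prob` values is one way to check how much the EER moves with the target/nontarget mix.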
In your case you would just have three datasets, train, enroll (called sre10_train in this example), and test (called sre10_test in this example). The SRE dataset is used to train the PLDA model, since the bulk of the training data is out of domain. In your situation, you would train the PLDA model directly on the training data, since it is in domain. I would do the following:
In my opinion, if you have more questions about this, it is best to move this to the Kaldi help page at http://kaldi-asr.org/forums.html.
Thanks very much for your detailed answers. Now I know how to modify my script.
The problem happens when running this part:
dep pooled: 2.16
echo "GMM-$num_components EER"
for x in ind dep; do
  for y in female male pooled; do
    #eer=
    compute-eer <(python local/prepare_for_eer.py $trials exp/scores_gmm_${num_components}_${x}_${y}/plda_scores) 2> /dev/null
    echo "${x} ${y}: $eer"
  done
done
and the error output is:
GMM-1024 EER
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
Environment:
Python 2.7, Ubuntu 16.04, the latest version of Kaldi
I don't know what caused it. Thanks for your reply.