[Feature Request]: How are the score and prediction calculated? #2148
Replies: 6 comments
-
The model consumes audio and is trained to produce voice embeddings that are as close as possible for the same speaker. Because of the model's properties and the way it is trained, you can compare two embeddings and compute a similarity score from them. To decide whether the two voices should be considered a match, you compare that similarity score against the threshold (0.25 by default).
However, you can adjust that threshold. The right value depends on your data and on the requirements of your application, as it directly influences the false positive and false negative rates. See the plot in discussion #1767 (comment).
A higher threshold reduces the number of false positives but increases the number of false negatives. A lower threshold does the opposite. Ultimately, you should test thresholds on the data you are actually going to use, as the curve may look different.
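To make the trade-off concrete, here is a minimal sketch that counts false positives and false negatives at a few thresholds. The scores and labels are made up for illustration, `error_rates` is a hypothetical helper and not part of SpeechBrain, and the sketch assumes a pair is accepted when its score is strictly greater than the threshold.

```python
def error_rates(scores, same_speaker, threshold):
    """Return (false_positive_rate, false_negative_rate) at one threshold.

    A pair is accepted as "same speaker" when score > threshold
    (assumed decision rule for this sketch).
    """
    false_pos = sum(
        1 for s, same in zip(scores, same_speaker) if s > threshold and not same
    )
    false_neg = sum(
        1 for s, same in zip(scores, same_speaker) if s <= threshold and same
    )
    negatives = sum(1 for same in same_speaker if not same)
    positives = sum(1 for same in same_speaker if same)
    return false_pos / negatives, false_neg / positives


# Made-up similarity scores for 8 trial pairs (NOT real model output).
scores = [0.82, 0.61, 0.30, 0.20, 0.45, 0.28, 0.10, -0.05]
# Ground truth: True means both recordings come from the same speaker.
same_speaker = [True, True, True, True, False, False, False, False]

for threshold in (0.15, 0.25, 0.50):
    fpr, fnr = error_rates(scores, same_speaker, threshold)
    print(f"threshold={threshold:.2f}  FPR={fpr:.2f}  FNR={fnr:.2f}")
```

On this toy data, raising the threshold from 0.15 to 0.50 lowers the false positive rate and raises the false negative rate. With your own recordings the numbers will differ, which is why it is worth running the same kind of sweep on your data.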
-
Then I would like to ask: which file do I need to modify to change the threshold?
-
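You will have to use verify_batch directly, which takes a threshold argument. Fairly straightforward:
# spkrec is the SpeakerRecognition model loaded with from_hparams
waveform_x = spkrec.load_audio(path_x)
waveform_y = spkrec.load_audio(path_y)
# Fake batches (add a batch dimension of size 1):
batch_x = waveform_x.unsqueeze(0)
batch_y = waveform_y.unsqueeze(0)
# Verify:
score, decision = spkrec.verify_batch(batch_x, batch_y, threshold=0.25) # <-- set your threshold here
# Unpack the single result from the batch:
score, decision = score[0], decision[0]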
-
OK, thank you.
-
The score is the cosine similarity between the two embeddings.
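For reference, a minimal sketch of cosine similarity in plain Python. The vectors are toy values, not real speaker embeddings, and `cosine_similarity` here is a hypothetical helper written for illustration, not the function the library calls internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (result lies in [-1, 1])."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (illustrative values only).
emb_a = [1.0, 2.0, 3.0]
emb_b = [2.0, 4.0, 6.0]    # same direction as emb_a
emb_c = [3.0, 0.0, -1.0]   # orthogonal to emb_a
emb_d = [-1.0, -2.0, -3.0] # opposite direction to emb_a

print(cosine_similarity(emb_a, emb_b))  # close to 1: same direction
print(cosine_similarity(emb_a, emb_c))  # 0: unrelated directions
print(cosine_similarity(emb_a, emb_d))  # close to -1: opposite direction
```

Since cosine similarity ranges from -1 to 1 and is not a probability, 0.5 has no special meaning as a cut-off. The threshold is simply a point on that scale chosen to balance false positives against false negatives.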
-
Thank you so much, if I still encounter any problems in the future,
-
🚀 The feature
I'm using this page: https://huggingface.co/speechbrain/spkrec-ecapa-voxceleb/blob/main/README.md,
and I'm using its code for calculating the score and the prediction.
But I want to know how this code calculates the score, and why a score below 0.25 gives a false prediction rather than a score below 0.5.
Solution outline
Please help me understand this.
Thank you.
Additional context
I adapted this code for my own file:
from speechbrain.pretrained import SpeakerRecognition
verification = SpeakerRecognition.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb", savedir="pretrained_models/spkrec-ecapa-voxceleb")
score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk2_snt1.wav") # Different Speakers
score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk1_snt2.wav") # Same Speaker