[Feature Request]:I want to ask how score and prediction are calculated and operated? #2148

kevin00616 · 2023-08-25T06:34:31Z

kevin00616
Aug 25, 2023

🚀 The feature

I'm following this page: https://huggingface.co/speechbrain/spkrec-ecapa-voxceleb/blob/main/README.md,
and I'm using its code for calculating the score and the prediction.
I would like to know how this code calculates the score, and why the prediction is False when the score is below 0.25 rather than below 0.5.

Solution outline

Please help me to understand these.
Thank you.

Additional context

I adapted this code for use in my own file.

from speechbrain.pretrained import SpeakerRecognition
verification = SpeakerRecognition.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb", savedir="pretrained_models/spkrec-ecapa-voxceleb")
score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk2_snt1.wav") # Different Speakers
score, prediction = verification.verify_files("tests/samples/ASR/spk1_snt1.wav", "tests/samples/ASR/spk1_snt2.wav") # Same Speaker

asumagic · 2023-08-25T12:42:38Z

asumagic
Aug 25, 2023
Maintainer

The model consumes audio and is trained to produce voice embeddings that should be as close as possible for identical speakers. The properties of the model and training method mean that you can compare two embeddings and essentially compute a similarity score.

To make a decision as to whether the voice should be considered to match, you simply compare that similarity score against the threshold (0.25 by default).

However, you can adjust that threshold. This choice depends on your data and your requirements for your application, as it directly influences the false positive and false negative rates.

See this plot from discussion #1767 (comment):

A higher threshold reduces the number of false positives but increases the number of false negatives. A lower threshold does the opposite.

Ultimately, you should test these with the data you're going to be using, as the curve may look different.
The choice of threshold is then yours depending on your error rate requirements. For example, based on the above graph, if I wanted a < 0.5% false accept rate, I would pick a threshold of ~0.31.
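
As a rough sketch of how such a threshold could be picked from your own data: given similarity scores for trial pairs that you have labeled as same-speaker or different-speaker, compute the false accept and false reject rates at each candidate threshold and keep the lowest threshold that meets your target false accept rate. The function names and the toy scores below are invented for illustration and are not part of SpeechBrain, and the sketch assumes a pair is accepted when its score is strictly above the threshold.

```python
def error_rates(scores, labels, threshold):
    """Return (false_accept_rate, false_reject_rate) at the given threshold.

    scores: similarity scores, one per trial pair
    labels: True if the pair is the same speaker, False otherwise
    """
    impostor = [s for s, same in zip(scores, labels) if not same]
    genuine = [s for s, same in zip(scores, labels) if same]
    # Assumption: a pair is accepted when score > threshold.
    false_accepts = sum(1 for s in impostor if s > threshold)
    false_rejects = sum(1 for s in genuine if not s > threshold)
    return false_accepts / len(impostor), false_rejects / len(genuine)


def pick_threshold(scores, labels, max_far):
    """Lowest observed score usable as a threshold with FAR <= max_far."""
    # The false accept rate never increases as the threshold grows,
    # so the first candidate that satisfies the target is the lowest one.
    for candidate in sorted(set(scores)):
        far, _ = error_rates(scores, labels, candidate)
        if far <= max_far:
            return candidate
    return max(scores)


# Toy data, invented for illustration only.
scores = [0.72, 0.55, 0.41, 0.18, 0.33, 0.12, 0.05, 0.27]
labels = [True, True, True, True, False, False, False, False]

threshold = pick_threshold(scores, labels, max_far=0.0)
far, frr = error_rates(scores, labels, threshold)
print(f"threshold={threshold}, FAR={far}, FRR={frr}")
```

With real data you would want many more trial pairs than this, since the error rates at a given threshold are only as reliable as the number of pairs behind them.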

kevin00616 · 2023-08-29T06:29:58Z

kevin00616
Aug 29, 2023
Author

Then I would like to ask: which file should I modify to change the threshold?

asumagic · 2023-08-29T07:49:44Z

asumagic
Aug 29, 2023
Maintainer

You will have to use SpeakerRecognition's verify_batch method which provides the threshold parameter. verify_files does not provide it.

Fairly straightforward:

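from speechbrain.pretrained import SpeakerRecognition

# spkrec is the SpeakerRecognition instance (named `verification` in the original post).
spkrec = SpeakerRecognition.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb", savedir="pretrained_models/spkrec-ecapa-voxceleb")
# path_x and path_y are the paths of the two audio files to compare.
waveform_x = spkrec.load_audio(path_x)
waveform_y = spkrec.load_audio(path_y)
# Fake batches: add a leading batch dimension of size 1.
batch_x = waveform_x.unsqueeze(0)
batch_y = waveform_y.unsqueeze(0)
# Verify, passing the threshold you want:
score, decision = spkrec.verify_batch(batch_x, batch_y, threshold=0.25) # <--
# Take the single result out of the batch.
score, decision = score[0], decision[0]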

kevin00616 · 2023-08-29T08:05:24Z

kevin00616
Aug 29, 2023
Author

OK, thank you.
In addition, I would like to ask whether there is a calculation formula for score, or a definition that can explain how it is calculated.

asumagic · 2023-08-29T08:08:11Z

asumagic
Aug 29, 2023
Maintainer

It is the cosine similarity between the two embeddings.
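
For reference, cosine similarity is the dot product of the two vectors divided by the product of their lengths, so it ranges from -1 to 1 rather than from 0 to 1. That is also why 0.5 is not a natural cutoff: the score is not a probability. A minimal sketch in plain Python, using short made-up vectors in place of real embeddings and assuming the decision is simply `score > threshold`:

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors; real speaker embeddings are much longer.
emb_x = [1.0, 2.0, 2.0]
emb_y = [2.0, 4.0, 4.0]   # same direction as emb_x
emb_z = [2.0, -1.0, 0.0]  # orthogonal to emb_x

threshold = 0.25  # the default mentioned earlier in this thread

score_same_direction = cosine_similarity(emb_x, emb_y)
score_orthogonal = cosine_similarity(emb_x, emb_z)

# Assumption: a pair is accepted when its score is above the threshold.
decision_same_direction = score_same_direction > threshold
decision_orthogonal = score_orthogonal > threshold
```

Vectors pointing the same way score close to 1 regardless of their length, and orthogonal vectors score 0, so only the direction of the embedding matters for the comparison.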

kevin00616 · 2023-08-29T08:11:55Z

kevin00616
Aug 29, 2023
Author

Thank you so much. If I run into any problems in the future,
I will ask again in this thread. I look forward to learning more from you.
