# ECAPA-TDNN demo

This notebook demonstrates how to use a pretrained ECAPA-TDNN model for speaker embedding extraction and speaker verification.

## Step 0: SpeechBrain Installation and Repository Cloning
Before starting, install SpeechBrain and clone the repository:

In [3]:
%%capture
# For pip installation
!pip install speechbrain

!git clone https://github.com/Sharif-DAL-INEF-1400/Verification-and-Identification-Speechbrain.git SpeakerVerification

## Step 1: Pretrained model download

Next, download the pretrained model from Google Drive using gdown:

In [5]:
!gdown -q -O best_model --folder 1R_gvC_St56Atxfu8MLRb1PIlBnBahta2

## Step 2: Inference

In [6]:
%cd SpeakerVerification

/content/SpeakerVerification


In [7]:
from verification import SpeakerVerification

# Load the pretrained model from the folder downloaded in Step 1
verification = SpeakerVerification.from_hparams(
    source="/content/best_model/",
    hparams_file="hparams_inference.yaml",
)

**Single file encoding**

In [None]:
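import torchaudio

# torchaudio.load returns the waveform as a (channels, time) tensor and its sample rate
signal, sample_rate = torchaudio.load('/path/to/sample1.mp3')

# Compute speaker embeddings
embeddings1 = verification.encode_batch(signal)

# Display the embedding tensor
embeddings1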

tensor([[[ -3.6837,  -0.6558,  -0.6454,   2.6533,   3.3016,  -9.2773,   3.9467,
            0.4680,  -5.6970,   7.1742,  -5.2316,  -2.9714,   2.4388,  -5.3347,
            3.1162,  -2.4510,  -0.2553,   0.4154,   4.9278,  -4.5087,  -0.9209,
            4.8101,   2.3582,  -3.8135,  -7.6600,  -0.4009,  -7.4168,  -0.2756,
            3.2330,   4.8542,   1.0191,   3.2338,   2.6052,   5.7562,  -1.6302,
            0.0497,  -4.4003,   0.6605,  10.7517,   4.6959,  -0.1784,   1.9866,
            5.8134,  -3.2151,   2.0901,   0.6933,   4.1360,  -0.1310,   4.8007,
            3.6745,  -2.3712,  -4.0766,  -0.9199,   3.4865,   4.6883,  -3.6697,
            1.2828,   0.9975,  -5.1622,   4.1725,  -5.4362,  -5.0500,  -5.6218,
            4.7661,  -2.4573,  -2.4288,  -3.4452,   2.7631,   2.7647,   5.4494,
           -1.8735,   4.4892,   3.8760,  -0.1997,  -7.3866,   2.1623,   0.5600,
            2.9146,   3.0708,  -2.2795,  -1.6158,   2.3083,   2.2174,  -2.9761,
            4.4273,   7.2554,  -4.2679, 
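
Embeddings like the one above are usually compared with a similarity measure. The following is a minimal sketch of cosine similarity in plain Python, shown only to illustrate the idea. The toy vectors and the `cosine_similarity` helper are hypothetical and not part of the repository, and the verification score used later in this notebook may be computed differently (its values exceed 1, so it is not a raw cosine).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors standing in for real embeddings (made-up values).
emb_a = [1.0, 2.0, 0.0, -1.0]
emb_b = [2.0, 4.0, 0.0, -2.0]    # same direction as emb_a
emb_c = [-1.0, -2.0, 0.0, 1.0]   # opposite direction to emb_a

sim_ab = cosine_similarity(emb_a, emb_b)  # close to 1: same direction
sim_ac = cosine_similarity(emb_a, emb_c)  # close to -1: opposite direction
```

To try this on a real embedding, one option is to flatten the tensor first, for example with `embeddings1.squeeze().tolist()`.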

**Two files verification**

Note: use the `mean_norm=False`, `snorm=False`, and/or `a_norm=False` (amplitude norm) flags to disable the corresponding normalization.
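
How the repository implements these normalizations is not shown in this notebook. As a general illustration of what amplitude normalization can mean, here is a minimal sketch that assumes peak normalization (scaling a waveform so its largest absolute sample is 1). The `peak_normalize` helper and the sample values are hypothetical, and the actual `a_norm` behavior may differ.

```python
def peak_normalize(samples, eps=1e-12):
    """Scale samples so the largest absolute value is 1 (assumed form of amplitude norm)."""
    peak = max(abs(s) for s in samples)
    if peak < eps:
        # Silent input: return a copy unchanged to avoid dividing by zero.
        return list(samples)
    return [s / peak for s in samples]


# The same made-up waveform recorded at two different volumes.
quiet = [0.01, -0.02, 0.005]
loud = [0.5, -1.0, 0.25]

quiet_norm = peak_normalize(quiet)
loud_norm = peak_normalize(loud)
# After normalization both recordings have the same scale,
# so loudness differences no longer affect later processing.
```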

In [None]:
file1 = '/path/to/sample1.mp3'  # Speaker 1
file2 = '/path/to/sample2.mp3'  # Speaker 2
file3 = '/path/to/sample3.mp3'  # Speaker 2

# file1 and file2 come from different speakers
score, prediction = verification.verify_files(file1, file2, threshold=8)
print(f"Different speakers: score: {score.item()}, prediction: {prediction.item()}")  # True = same speaker, False = different speakers

# file2 and file3 come from the same speaker
score, prediction = verification.verify_files(file2, file3, threshold=8)
print(f"Same speaker: score: {score.item()}, prediction: {prediction.item()}")  # True = same speaker, False = different speakers

Different speakers: score: 7.726524353027344, prediction: False
Same speaker: score: 14.730301856994629, prediction: True
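
The two results above are consistent with a simple rule: the prediction is `True` when the score exceeds the threshold. A minimal sketch of that rule follows, assuming a strict `score > threshold` comparison. The `decide` helper is hypothetical, and the repository's `verify_files` may handle ties or normalization differently.

```python
def decide(score, threshold=8.0):
    """Return True (same speaker) when the score exceeds the threshold (assumed rule)."""
    return score > threshold


# Scores copied from the output above.
scores = {
    "file1 vs file2": 7.726524353027344,
    "file2 vs file3": 14.730301856994629,
}

decisions = {pair: decide(score) for pair, score in scores.items()}
for pair, same_speaker in decisions.items():
    print(f"{pair}: same speaker = {same_speaker}")
```

Raising the threshold makes the system stricter (fewer false accepts, more false rejects), and lowering it does the opposite, so the value is normally tuned on held-out trials.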
