The steps below load a pretrained ECAPA-TDNN model that is available for commercial use, download sample audio files, and score the model on pairs of recordings. The example uses the SpeechBrain library, which provides pretrained models and tools for speaker verification.

### Step 1: Install Dependencies
If you haven’t already installed the necessary libraries, do so:

In [None]:
pip install speechbrain
pip install torchaudio

### Step 2: Load the Pretrained Model
SpeechBrain provides a pretrained ECAPA-TDNN model that you can use directly. Here’s how to load it:

In [7]:
import torchaudio
from speechbrain.inference.speaker import SpeakerRecognition

model_id = "speechbrain/spkrec-ecapa-voxceleb"

# Load the pretrained ECAPA-TDNN model
verification = SpeakerRecognition.from_hparams(source=model_id, savedir="pretrained_models/spkrec-ecapa-voxceleb")

# A separate EncoderClassifier is used to extract raw speaker embeddings:
# https://huggingface.co/speechbrain/spkrec-ecapa-voxceleb#compute-your-speaker-embeddings
from speechbrain.inference.speaker import EncoderClassifier
classifier = EncoderClassifier.from_hparams(source=model_id)

### Step 3: Download Sample Audio Files
For demonstration purposes, you can download some sample audio files from online sources. Here, we use URLs to download a couple of sample audio files:

In [None]:
import requests

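def download_audio(url, filename):
    """Download url to filename, raising on HTTP errors."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # Fail loudly instead of saving an error page
    with open(filename, 'wb') as f:
        f.write(response.content)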

# Sample audio URLs
audio_urls = [
    "https://example.com/audio1.wav",  # Replace with actual URLs
    "https://example.com/audio2.wav"   # Replace with actual URLs
]

# Download audio samples
audio_files = ["audio1.wav", "audio2.wav"]
for url, filename in zip(audio_urls, audio_files):
    download_audio(url, filename)
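
Because the URLs above are placeholders, a request can succeed and still save something that is not audio (for example an HTML page). The following is a minimal sketch of a sanity check using only the standard library `wave` module. The helper names `make_test_tone` and `describe_wav` are hypothetical, and the check only understands PCM WAV data, so it would reject FLAC files such as the LibriSpeech samples used later.

```python
import io
import math
import struct
import wave


def make_test_tone(sample_rate=16000, seconds=0.1, freq=440.0):
    """Build a small mono 16-bit PCM WAV in memory (a stand-in for a download)."""
    n_frames = round(sample_rate * seconds)
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / sample_rate))
        for i in range(n_frames)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
    return buffer.getvalue()


def describe_wav(data):
    """Return basic properties of WAV bytes, or None if they are not valid WAV."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            return {
                "sample_rate": wav.getframerate(),
                "channels": wav.getnchannels(),
                "seconds": wav.getnframes() / wav.getframerate(),
            }
    except (wave.Error, EOFError):
        return None


wav_info = describe_wav(make_test_tone())
html_info = describe_wav(b"<!doctype html><html>Not audio</html>")
print(wav_info)   # Properties of the synthetic tone
print(html_info)  # None, because the bytes are not a WAV file
```

In the notebook, the same check could be applied to a downloaded file by reading its bytes with `open(filename, 'rb').read()` before passing the file to the model.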

### Step 4: Evaluate the Model
Evaluate the model by computing verification scores between pairs of audio files:

In [None]:
# Load audio samples (use audio_files[0] and audio_files[1] instead to test the downloads from Step 3)
waveform1, sample_rate1 = torchaudio.load('samples/LibriSpeech/dev-other/116/288045/116-288045-0000.flac')
waveform2, sample_rate2 = torchaudio.load('samples/LibriSpeech/dev-other/116/288045/116-288045-0001.flac')

# Ensure the sample rates match
assert sample_rate1 == sample_rate2, "Sample rates do not match!"

# Perform speaker verification
# verify_batch takes waveform tensors and does not resample them, so both files
# should already be 16 kHz mono; verify_files takes file paths instead
score, prediction = verification.verify_batch(waveform1, waveform2)

# Output the results
print(f"Verification score: {score.item()}")
print(f"Prediction: {'Same speaker' if prediction else 'Different speakers'}")

In [10]:
signal, fs = torchaudio.load('samples/LibriSpeech/dev-other/116/288045/116-288045-0000.flac')
embeddings = classifier.encode_batch(signal)

# Both files come from LibriSpeech speaker 116
score, prediction = verification.verify_files('samples/LibriSpeech/dev-other/116/288045/116-288045-0000.flac', 'samples/LibriSpeech/dev-other/116/288045/116-288045-0001.flac') # Same speaker

# Output the results
print(f"Verification score: {score.item()}")
print(f"Prediction: {'Same speaker' if prediction else 'Different speakers'}")

# Speaker 116 compared against speaker 700
score, prediction = verification.verify_files('samples/LibriSpeech/dev-other/116/288045/116-288045-0000.flac', 'samples/LibriSpeech/dev-other/700/122866/700-122866-0000.flac') # Different speakers

# Output the results
print(f"Verification score: {score.item()}")
print(f"Prediction: {'Same speaker' if prediction else 'Different speakers'}")

Verification score: 0.7892520427703857
Prediction: Same speaker
Verification score: 0.08970315754413605
Prediction: Different speakers
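
To make the two scores above easier to interpret: as far as I understand SpeechBrain's `SpeakerRecognition`, the score is the cosine similarity between the two speaker embeddings, and the prediction is that score compared against a threshold. The sketch below reproduces that logic in plain Python. The default threshold of 0.25 is an assumption based on my reading of `verify_batch` and should be checked against the installed version, and the function names are illustrative rather than part of the library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_speaker(score, threshold=0.25):
    """Decision rule: accept the pair when the score exceeds the threshold.

    The 0.25 default is an assumption about SpeechBrain's behaviour.
    """
    return score > threshold


# Toy vectors only; they stand in for the model's real embeddings
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(parallel, orthogonal)

# Apply the decision rule to the two scores printed above
decisions = [same_speaker(s) for s in (0.7892520427703857, 0.08970315754413605)]
print(decisions)
```

Under that assumed threshold, the first score is accepted and the second is rejected, which matches the two predictions printed by the notebook.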


### Complete Script
Combining all the steps into a complete script:

In [None]:
import requests
import torchaudio
from speechbrain.inference.speaker import SpeakerRecognition

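def download_audio(url, filename):
    """Download url to filename, raising on HTTP errors."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # Fail loudly instead of saving an error page
    with open(filename, 'wb') as f:
        f.write(response.content)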

# Load the pretrained ECAPA-TDNN model
model = SpeakerRecognition.from_hparams(source="speechbrain/spkrec-ecapa-voxceleb", savedir="pretrained_models/spkrec-ecapa-voxceleb")

# Sample audio URLs (replace with actual URLs)
audio_urls = [
    "https://example.com/audio1.wav",
    "https://example.com/audio2.wav"
]

# Download audio samples
audio_files = ["audio1.wav", "audio2.wav"]
for url, filename in zip(audio_urls, audio_files):
    download_audio(url, filename)

# Load audio samples
waveform1, sample_rate1 = torchaudio.load(audio_files[0])
waveform2, sample_rate2 = torchaudio.load(audio_files[1])

# Ensure the sample rates match
assert sample_rate1 == sample_rate2, "Sample rates do not match!"

# Perform speaker verification
score, prediction = model.verify_batch(waveform1, waveform2)

# Output the results
print(f"Verification score: {score.item()}")
print(f"Prediction: {'Same speaker' if prediction else 'Different speakers'}")

Replace the example audio URLs with the URLs of the audio files you want to test. The script downloads the files, loads the pretrained ECAPA-TDNN model, and prints the verification score and the same/different decision for the two audio samples.
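
A single pair of files gives one score, not a measure of overall performance. Speaker verification systems are commonly summarized over many labelled trials, for example with the equal error rate (EER), the operating point where the false-acceptance and false-rejection rates are equal. The following is a rough sketch in plain Python, assuming you have already collected one score per trial from `verify_files`. The scores and labels here are made up for illustration, and `equal_error_rate` is a hypothetical helper, not a SpeechBrain function.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by scanning every observed score as a threshold.

    labels[i] is True when trial i compares two recordings of the same speaker.
    Returns (eer, threshold) at the point where the two error rates are closest.
    """
    targets = [s for s, same in zip(scores, labels) if same]
    nontargets = [s for s, same in zip(scores, labels) if not same]
    if not targets or not nontargets:
        raise ValueError("Need both same-speaker and different-speaker trials")

    best = None
    for threshold in sorted(set(scores)):
        # False acceptance: a different-speaker pair scoring at or above the threshold
        far = sum(s >= threshold for s in nontargets) / len(nontargets)
        # False rejection: a same-speaker pair scoring below the threshold
        frr = sum(s < threshold for s in targets) / len(targets)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Made-up trial scores: four same-speaker pairs, then four different-speaker pairs
scores = [0.79, 0.65, 0.40, 0.20, 0.31, 0.09, 0.15, 0.45]
labels = [True, True, True, True, False, False, False, False]

eer, threshold = equal_error_rate(scores, labels)
print(f"EER: {eer:.2%} at threshold {threshold:.2f}")
```

With only a handful of trials the estimate is coarse, so a meaningful figure needs a much larger trial list.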