## NPPE Speaker Classification Instructions
1. Install Required Packages
Ensure Python is installed and run the following command to install the necessary libraries:


`pip install transformers==4.44.2 datasets==3.1.0 torch==2.5.0`

2. Set Up Environment
Check whether a GPU is available. If CUDA is available, move the PyTorch models and the data used for feature extraction and CNN training to that device; otherwise run on the CPU.
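
A minimal sketch of the device selection described in this step, assuming only that PyTorch is installed. The tensor `x` is a hypothetical stand-in for a batch of features; a model is moved the same way with `model.to(device)`.

```python
import torch

# Pick the GPU when CUDA is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for a batch of features: [batch, channels, frames]
x = torch.zeros(2, 768, 200).to(device)
print(device, x.device)
```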

3. Loading and Processing Data
Dataset Loading: Load the VoxPopuli dataset using the `datasets` library with `streaming=True` for memory efficiency. Load the dataset with language "slovene" and split "train".
Random Shuffling: Shuffle the dataset using a random seed of `42`. Also, fix the same seed for `torch.manual_seed`.
4. Feature Extraction Using Wav2Vec2
Loading Model and Processor: Load the Wav2Vec2 model and processor from Hugging Face’s transformers library to extract audio features. Use the pretrained model `facebook/wav2vec2-base-960h`.
Feature Extraction Function: Define a function to process each audio sample and extract its feature representation. Apply this function to obtain Wav2Vec2 embeddings for the dataset.
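
The feature representation is a sequence of frames, one 768-dimensional vector per time step. As a rough sketch, the number of frames can be predicted from the number of audio samples, assuming the convolutional feature encoder of the base architecture (kernel sizes 10, 3, 3, 3, 3, 2, 2 and strides 5, 2, 2, 2, 2, 2, 2, no padding). The helper `wav2vec2_num_frames` is a hypothetical name used for illustration and is not part of transformers.

```python
# Kernel sizes and strides of the Wav2Vec2 base feature encoder (assumed, see above)
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def wav2vec2_num_frames(num_samples):
    """Return the number of output frames for a waveform of `num_samples` samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Unpadded 1-D convolution: floor((length - kernel) / stride) + 1
        length = (length - kernel) // stride + 1
    return length


one_second = wav2vec2_num_frames(16000)   # 1 s of 16 kHz audio
threshold_samples = 64080                 # shortest waveform that yields 200 frames
print(one_second, wav2vec2_num_frames(threshold_samples), threshold_samples / 16000)
```

Under these assumptions, 16 kHz audio yields about 49 frames per second, so a 200-frame sample corresponds to roughly 4 seconds of speech.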
5. Data Filtering and Label Mapping
Filtering Short Samples: Exclude samples with fewer than 200 frames.
Label Mapping: Assign each unique speaker an integer label. Sort the speaker list using the `sorted` function before mapping and store mappings in a dictionary.
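
The filtering and label mapping can be sketched on plain Python data as follows. The `records` list is hypothetical (the speaker ID `"124"` and all frame counts are made up for illustration); only the 200-frame threshold and the use of `sorted` come from the instructions.

```python
MIN_FRAMES = 200

# Hypothetical records: (speaker_id, number of frames)
records = [("97019", 324), ("23693", 150), ("124", 410), ("97019", 200)]

# Keep only samples with at least 200 frames
kept = [(spk, frames) for spk, frames in records if frames >= MIN_FRAMES]

# Sort the unique speaker IDs, then number them 0..N-1.
# Speaker IDs are strings, so the order is lexicographic.
speakers = sorted({spk for spk, _ in kept})
label_mapping = {spk: idx for idx, spk in enumerate(speakers)}
labels = [label_mapping[spk] for spk, _ in kept]
print(label_mapping, labels)
```

Sorting before numbering makes the mapping independent of the order in which samples arrive from the shuffled stream.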
6. Model Definition
Define a custom CNN model with the following architecture:

Conv1: In channels: 768, Out channels: 256, Kernel size: 3
BatchNorm1: Features: 256
ReLU1: Applied after BatchNorm1
MaxPool1: Kernel size: 2
Conv2: In channels: 256, Out channels: 128, Kernel size: 3
BatchNorm2: Features: 128
ReLU2: Applied after BatchNorm2
MaxPool2: Kernel size: 2
Conv3: In channels: 128, Out channels: 32, Kernel size: 3
BatchNorm3: Features: 32
ReLU3: Applied after BatchNorm3
MaxPool3: Kernel size: 2
Global Average Pooling: Averaged across the temporal dimension
FC1: Input features: 32, Output features: 128
ReLU4: Applied after FC1
FC2: Input features: 128, Output features: Number of classes
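
To see what the temporal dimension looks like when it reaches the global average pooling layer, the sequence length can be traced through the three stages. This is a sketch assuming PyTorch defaults (stride 1 and no padding for the convolutions, pooling stride equal to the kernel size) and an input of 200 frames; the helper names are illustrative.

```python
def conv1d_out(length, kernel_size=3):
    """Output length of an unpadded, stride-1 Conv1d."""
    return length - kernel_size + 1


def maxpool1d_out(length, kernel_size=2):
    """Output length of MaxPool1d with stride equal to the kernel size."""
    return length // kernel_size


length = 200            # frames per sample after truncation
trace = [length]
for _ in range(3):      # three Conv -> BatchNorm -> ReLU -> MaxPool stages
    length = conv1d_out(length)
    trace.append(length)
    length = maxpool1d_out(length)
    trace.append(length)
print(trace)
```

The last stage therefore hands a tensor of shape [batch, 32, 23] to the global average pooling, which reduces it to [batch, 32] for FC1.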
7. Data Splitting and DataLoader
Data Splitting: Use sklearn to split the data into training, validation, and test sets. Take 80% of the dataset for training and 20% for testing, then hold out 10% of the training portion as the validation set. Set the random state to 42.
DataLoader: Use PyTorch’s DataLoader with a custom collate function to convert features and labels to tensors, truncating features to the first 200 frames. Batch sizes:
Train: 100
Validation: 10
Test: 1
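
A minimal sketch of a collate function that truncates to the first 200 frames, assuming each sample is a dict with a `features` array of shape [frames, dim] and an integer `label`. The synthetic samples, the small feature dimension of 8, and the batch size of 4 are chosen only to keep the example fast.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader

MAX_FRAMES = 200


def collate_fn(batch):
    """Truncate each sample to MAX_FRAMES and stack features and labels into tensors."""
    feats = torch.stack([
        torch.as_tensor(np.asarray(item["features"])[:MAX_FRAMES], dtype=torch.float32)
        for item in batch
    ])
    labels = torch.tensor([item["label"] for item in batch], dtype=torch.long)
    return feats, labels


# Synthetic samples of varying length (all at least 200 frames)
rng = np.random.default_rng(42)
samples = [{"features": rng.standard_normal((200 + 10 * i, 8)), "label": i % 3}
           for i in range(6)]

loader = DataLoader(samples, batch_size=4, shuffle=False, collate_fn=collate_fn)
batch_features, batch_labels = next(iter(loader))
print(batch_features.shape, batch_labels.shape)
```

Truncating inside the collate function lets samples of different lengths be stacked into one batch tensor.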
8. Training the Model
Loss and Optimizer: Use cross-entropy loss and Adam optimizer with a learning rate of `0.001`.
Training Loop: Train for 100 epochs. For each epoch:
Set the model to training mode.
Zero the optimizer’s gradient.
For each batch, forward propagate, calculate loss, and backpropagate.
Track average loss for the epoch.
Validation: Evaluate on the validation set after each epoch by calculating loss without weight updates. Track validation loss to monitor performance.
9. Evaluating on the Test Set
Set the model to evaluation mode. Compute class predictions for each test sample, calculate accuracy using `sklearn`'s `accuracy_score`, and display the test accuracy.
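
A minimal sketch of this evaluation step, assuming the loader yields features shaped [batch, frames, dim] and integer labels. `evaluate` and `TinyStandIn` are hypothetical names: the stand-in classifier and the random data exist only so the sketch runs on its own, and in practice the trained CNN and the test DataLoader would be passed in instead.

```python
import torch
import torch.nn as nn
from sklearn.metrics import accuracy_score
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, device):
    """Return the accuracy of `model` over all batches in `loader`."""
    model.eval()  # BatchNorm uses its running statistics in evaluation mode
    y_true, y_pred = [], []
    with torch.no_grad():
        for features, labels in loader:
            # Conv1d-style models expect [batch, channels, frames], hence the transpose
            logits = model(features.to(device).transpose(1, 2).float())
            y_pred.extend(logits.argmax(dim=1).cpu().tolist())
            y_true.extend(labels.tolist())
    return accuracy_score(y_true, y_pred)


class TinyStandIn(nn.Module):
    """Hypothetical stand-in classifier: mean over time, then one linear layer."""

    def __init__(self, dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(dim, num_classes)

    def forward(self, x):  # x: [batch, dim, frames]
        return self.fc(x.mean(dim=-1))


torch.manual_seed(42)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
features = torch.randn(8, 200, 16)   # synthetic: 8 samples, 200 frames, 16 dims
labels = torch.randint(0, 3, (8,))
loader = DataLoader(TensorDataset(features, labels), batch_size=1, shuffle=False)
test_accuracy = evaluate(TinyStandIn(16, 3).to(device), loader, device)
print(f"Test accuracy: {test_accuracy:.4f}")
```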

In [None]:
!pip install transformers==4.44.2 datasets==3.1.0 torch==2.5.0

Collecting transformers==4.44.2
  Downloading transformers-4.44.2-py3-none-any.whl.metadata (43 kB)
Collecting datasets==3.1.0
  Downloading datasets-3.1.0-py3-none-any.whl.metadata (20 kB)
Collecting torch==2.5.0
  Downloading torch-2.5.0-cp310-cp310-manylinux1_x86_64.whl.metadata (28 kB)
Collecting tokenizers<0.20,>=0.19 (from transformers==4.44.2)
  Downloading tokenizers-0.19.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Collecting dill<0.3.9,>=0.3.0 (from datasets==3.1.0)
  Downloading dill-0.3.8-py3-none-any.whl.metadata (10 kB)
Collecting xxhash (from datasets==3.1.0)
  Downloading xxhash-3.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting multiprocess<0.70.17 (from datasets==3.1.0)
  Downloading multiprocess-0.70.16-py310-none-any.whl.metadata (7.2 kB)
Collecting fsspec<=2024.9.0

In [None]:
import torch
from datasets import load_dataset, get_dataset_config_names

In [None]:
get_dataset_config_names("facebook/voxpopuli")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


The repository for facebook/voxpopuli contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/facebook/voxpopuli.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y


['en',
 'de',
 'fr',
 'es',
 'pl',
 'it',
 'ro',
 'hu',
 'cs',
 'nl',
 'fi',
 'hr',
 'sk',
 'sl',
 'et',
 'lt',
 'en_accented',
 'multilang']

In [None]:
dataset = load_dataset("facebook/voxpopuli", "sl", split="train", streaming=True)

In [None]:
for i in dataset:
  print(i)
  break

{'audio_id': '20140417-0900-PLENARY-7-sl_20140417-10:42:28_17', 'language': 13, 'audio': {'path': 'train_part_0/20140417-0900-PLENARY-7-sl_20140417-10:42:28_17.wav', 'array': array([ 2.44140625e-04, -1.22070312e-04, -6.10351562e-05, ...,
       -7.01904297e-04,  0.00000000e+00,  2.44140625e-04]), 'sampling_rate': 16000}, 'raw_text': 'V tem duhu se zahvaljujem poročevalcem v senci in vodji delegacije za Japonsko gospodu van Baalenu, kot tudi njenim posameznim članom za sodelovanje in koristne konzultacije.', 'normalized_text': 'v tem duhu se zahvaljujem poročevalcem v senci in vodji delegacije za japonsko gospodu van baalenu kot tudi njenim posameznim članom za sodelovanje in koristne konzultacije.', 'gender': 'male', 'speaker_id': '23693', 'is_gold_transcript': True, 'accent': 'None'}


In [None]:
# c = 0
# for i in dataset:
#   c += 1
# c

In [None]:

dataset = dataset.shuffle(seed=42)

In [None]:
from transformers import Wav2Vec2Processor, Wav2Vec2Model
from tqdm import tqdm
# Load the Wav2Vec2 processor and model
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")

Some weights of Wav2Vec2Model were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [None]:
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Running statistics over every sample passed through process_audio
min_frame = float('inf')
max_frame = 0
count = 0

def process_audio(data):
    global count, min_frame, max_frame
    count += 1

    # Convert the raw waveform to model inputs and move them to the device
    inputs = processor(data["audio"]["array"], sampling_rate=16000, return_tensors="pt", padding=True)
    inputs = {key: val.to(device) for key, val in inputs.items()}

    with torch.no_grad():
        # Forward pass without gradient tracking
        outputs = model(**inputs)

    # Drop the batch dimension and move the features back to the CPU: [frames, 768]
    features = outputs.last_hidden_state.squeeze(0).cpu()
    frame = features.shape[0]
    min_frame = min(min_frame, frame)
    max_frame = max(max_frame, frame)
    # 'frame_rate' holds the number of Wav2Vec2 frames (a count, not a rate);
    # the key name is kept because later cells filter on it
    return {'features': features, 'label': data['speaker_id'], 'frame_rate': frame}



In [None]:
x = process_audio(next(iter(dataset)))

In [None]:
x['features'].shape, x['label'], x['frame_rate']

(torch.Size([324, 768]), '97019', 324)

In [None]:
# Map the function over the dataset
processed_dataset = dataset.map(process_audio).filter(lambda x: x['frame_rate']>=200)

In [None]:
unique_spk = dict()   # speaker_id -> number of samples
data_list = list()

c = 0
for data in processed_dataset:
  c += 1
  if c == 1:
    print(data)
  unique_spk[data['label']] = unique_spk.get(data['label'], 0) + 1
  # Keep only the first 200 frames and store them as a NumPy array
  data['features'] = data['features'][:200].cpu().numpy()
  data_list.append(data)

# Assign integer labels in sorted speaker order, as the instructions require
spk_to_id = {spk: idx for idx, spk in enumerate(sorted(unique_spk))}
for data in data_list:
  data['label'] = spk_to_id[data['label']]

c

{'audio_id': '20160706-0900-PLENARY-16-sl_20160706-17:46:12_5', 'language': 13, 'audio': {'path': 'train_part_0/20160706-0900-PLENARY-16-sl_20160706-17:46:12_5.wav', 'array': array([-0.00079346,  0.00195312,  0.00448608, ..., -0.13134766,
       -0.14324951, -0.10733032]), 'sampling_rate': 16000}, 'raw_text': 'Zato bom to svojo minuto posvetil razstavi o oljki, simbolu miru, preživetja in sožitja.', 'normalized_text': 'zato bom to svojo minuto posvetil razstavi o oljki simbolu miru preživetja in sožitja.', 'gender': 'male', 'speaker_id': '97019', 'is_gold_transcript': True, 'accent': 'None', 'features': tensor([[-0.2237, -0.0537,  0.2011,  ..., -0.0044,  0.0124, -0.0896],
        [-0.1868, -0.0021, -0.0329,  ...,  0.0359, -0.1378, -0.0867],
        [-0.2168,  0.1737, -0.2272,  ..., -0.0951,  0.0240,  0.1956],
        ...,
        [-0.0827, -0.1002,  0.4105,  ..., -0.0068,  0.0306, -0.1201],
        [-0.0823, -0.0750,  0.3703,  ..., -0.1898,  0.0454, -0.1828],
        [-0.0546, -0.0293,

1937

In [None]:
print(f"  min_frame: {min_frame}")
print(f"  max_frame: {max_frame}")

  min_frame: 27
  max_frame: 1948


In [None]:
len(data_list)

1937

In [None]:
len(unique_spk)

12

In [None]:
# Sort the list by the 'label' key
data_list = sorted(data_list, key=lambda x: x["label"])

In [None]:
torch.manual_seed(42)

<torch._C.Generator at 0x7d2b9c51b050>

In [None]:
import torch.nn as nn
import torch.nn.functional as F
# Set the random seed for reproducibility
torch.manual_seed(42)
class CustomCNN(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # Conv1 -> BatchNorm1 -> ReLU1 -> MaxPool1
        self.conv1 = nn.Conv1d(in_channels=768, out_channels=256, kernel_size=3)
        self.bn1 = nn.BatchNorm1d(num_features=256)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool1d(kernel_size=2)

        # Conv2 -> BatchNorm2 -> ReLU2 -> MaxPool2
        self.conv2 = nn.Conv1d(in_channels=256, out_channels=128, kernel_size=3)
        self.bn2 = nn.BatchNorm1d(num_features=128)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool1d(kernel_size=2)

        # Conv3 -> BatchNorm3 -> ReLU3 -> MaxPool3
        self.conv3 = nn.Conv1d(in_channels=128, out_channels=32, kernel_size=3)
        self.bn3 = nn.BatchNorm1d(num_features=32)
        self.relu3 = nn.ReLU()
        self.pool3 = nn.MaxPool1d(kernel_size=2)

        # Global Average Pooling
        self.global_avg_pool = nn.AdaptiveAvgPool1d(1)  # Outputs a tensor of shape [batch_size, channels, 1]

        # Fully Connected Layers
        self.fc1 = nn.Linear(in_features=32, out_features=128)
        self.relu4 = nn.ReLU()
        self.fc2 = nn.Linear(in_features=128, out_features=num_classes)

    def forward(self, x):
        # Conv1 -> BatchNorm1 -> ReLU1 -> MaxPool1
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu1(x)
        x = self.pool1(x)

        # Conv2 -> BatchNorm2 -> ReLU2 -> MaxPool2
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu2(x)
        x = self.pool2(x)

        # Conv3 -> BatchNorm3 -> ReLU3 -> MaxPool3
        x = self.conv3(x)
        x = self.bn3(x)
        x = self.relu3(x)
        x = self.pool3(x)

        # Global Average Pooling
        x = self.global_avg_pool(x)  # Shape: [batch_size, channels, 1]
        x = x.squeeze(-1)  # Remove the last dimension, Shape: [batch_size, channels]

        # Fully Connected Layers
        x = self.fc1(x)
        x = self.relu4(x)
        x = self.fc2(x)

        return x

In [None]:
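# `samples_frame_200` and `label_mapping` are not defined in the cells above.
# The labels in `data_list` were already mapped to integers while it was built,
# so this cell only applies if those two names are created separately.
for sample in samples_frame_200:
    sample['label'] = label_mapping.get(sample['label'], -1)  # -1 marks an unknown speaker

# Check the updated samples
print(samples_frame_200)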

In [None]:
from sklearn.model_selection import train_test_split

train_data, test_data = train_test_split(data_list, test_size=0.2, random_state=42)

# Further split train data into training (90%) and validation (10%)
train_data, val_data = train_test_split(train_data, test_size=0.1, random_state=42)


In [None]:
len(train_data)

1394
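
The split sizes can be checked by hand. This is a sketch assuming scikit-learn rounds the test portion up when `test_size` is a float, which is consistent with the 1394 training samples reported above; `split_sizes` is a hypothetical helper.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a float `test_size`, rounding the test part up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test


n_total = 1937                                     # samples kept after filtering
n_train_full, n_test = split_sizes(n_total, 0.2)   # 80/20 train/test split
n_train, n_val = split_sizes(n_train_full, 0.1)    # 10% of the training part
batches = {name: math.ceil(n / bs) for name, n, bs in
           [("train", n_train, 100), ("validation", n_val, 10), ("test", n_test, 1)]}
print(n_train, n_val, n_test, batches)
```

Because the validation set is taken from the training portion, the effective proportions are roughly 72% train, 8% validation, and 20% test.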

In [None]:
from torch.utils.data import DataLoader, Dataset
# Custom Dataset wrapping the list of sample dicts
class CustomDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        item = self.data[idx]
        features = item["features"]  # Assuming features are stored as numpy arrays
        label = item["label"]
        return torch.tensor(features, dtype=torch.float32), torch.tensor(label, dtype=torch.long)

# Create the DataLoaders with the batch sizes given in the instructions
batch_sizes = {"train": 100, "validation": 10, "test": 1}

train_dataset = CustomDataset(train_data)
val_dataset = CustomDataset(val_data)
test_dataset = CustomDataset(test_data)

train_loader = DataLoader(train_dataset, batch_size=batch_sizes["train"], shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_sizes["validation"], shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_sizes["test"], shuffle=False)

# Example: Iterate through the train DataLoader
for batch_features, batch_labels in train_loader:
    print(f"Batch features shape: {batch_features.shape}, Batch labels shape: {batch_labels.shape}")
    break

Batch features shape: torch.Size([100, 200, 768]), Batch labels shape: torch.Size([100])


In [None]:
len(train_loader)

14

In [None]:
for batch_features, batch_labels in tqdm(train_loader, desc="Training"):
  print(batch_features.shape, batch_labels.shape)
  break

Training:   0%|          | 0/14 [00:00<?, ?it/s]

torch.Size([100, 200, 768]) torch.Size([100])





In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm

# Build the CNN with one output class per speaker.
# Note: this rebinds `model`, which previously held the Wav2Vec2 feature extractor.
model = CustomCNN(num_classes=len(unique_spk))

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training configuration and device placement
num_epochs = 100
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

In [None]:

train_loss_each = list()
val_loss_each = list()
for epoch in range(1, num_epochs + 1):
    # Set the model to training mode
    model.train()
    running_train_loss = 0.0

    print(f"Epoch {epoch}/{num_epochs}")
    print("-" * 30)

    # Training Loop
    for batch_features, batch_labels in tqdm(train_loader, desc="Training"):
        batch_features, batch_labels = batch_features.to(device), batch_labels.to(device)

        # Zero the optimizer gradients
        optimizer.zero_grad()

        # Forward pass: Conv1d expects [batch, channels, frames], so swap the last two axes
        outputs = model(batch_features.transpose(1, 2).float())

        # Compute the loss
        loss = criterion(outputs, batch_labels)

        # Backward pass and parameter update
        loss.backward()
        optimizer.step()

        # Accumulate the loss
        running_train_loss += loss.item()

    # Average training loss for the epoch
    avg_train_loss = running_train_loss / len(train_loader)
    train_loss_each.append(avg_train_loss)
    print(f"Training Loss: {avg_train_loss:.4f}")

    # Validation Loop
    model.eval()
    running_val_loss = 0.0

    with torch.no_grad():
        for val_features, val_labels in tqdm(val_loader, desc="Validation"):
            val_features, val_labels = val_features.to(device), val_labels.to(device)

            # Forward pass
            val_outputs = model(val_features.transpose(1, 2).float())

            # Compute the validation loss
            val_loss = criterion(val_outputs, val_labels)

            # Accumulate validation loss
            running_val_loss += val_loss.item()

    # Average validation loss for the epoch
    avg_val_loss = running_val_loss / len(val_loader)
    val_loss_each.append(avg_val_loss)
    print(f"Validation Loss: {avg_val_loss:.4f}\n")

# Save the trained model
torch.save(model.state_dict(), "cnn_model.pth")
print("Training complete. Model saved to cnn_model.pth")


Epoch 1/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.28it/s]


Training Loss: 2.2512


Validation: 100%|██████████| 16/16 [00:00<00:00, 134.59it/s]


Validation Loss: 2.2606

Epoch 2/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.01it/s]


Training Loss: 1.8966


Validation: 100%|██████████| 16/16 [00:00<00:00, 150.17it/s]


Validation Loss: 1.9658

Epoch 3/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.13it/s]


Training Loss: 1.6754


Validation: 100%|██████████| 16/16 [00:00<00:00, 151.19it/s]


Validation Loss: 1.7989

Epoch 4/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.19it/s]


Training Loss: 1.5005


Validation: 100%|██████████| 16/16 [00:00<00:00, 145.95it/s]


Validation Loss: 1.6179

Epoch 5/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.22it/s]


Training Loss: 1.3042


Validation: 100%|██████████| 16/16 [00:00<00:00, 123.04it/s]


Validation Loss: 1.5014

Epoch 6/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.33it/s]


Training Loss: 1.1419


Validation: 100%|██████████| 16/16 [00:00<00:00, 132.31it/s]


Validation Loss: 1.4659

Epoch 7/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.98it/s]


Training Loss: 1.0105


Validation: 100%|██████████| 16/16 [00:00<00:00, 141.13it/s]


Validation Loss: 1.4358

Epoch 8/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.05it/s]


Training Loss: 0.8344


Validation: 100%|██████████| 16/16 [00:00<00:00, 145.69it/s]


Validation Loss: 1.3218

Epoch 9/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.12it/s]


Training Loss: 0.6543


Validation: 100%|██████████| 16/16 [00:00<00:00, 121.28it/s]


Validation Loss: 1.1945

Epoch 10/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.49it/s]


Training Loss: 0.5144


Validation: 100%|██████████| 16/16 [00:00<00:00, 131.26it/s]


Validation Loss: 1.0537

Epoch 11/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.67it/s]


Training Loss: 0.4294


Validation: 100%|██████████| 16/16 [00:00<00:00, 141.72it/s]


Validation Loss: 1.5617

Epoch 12/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.41it/s]


Training Loss: 0.3122


Validation: 100%|██████████| 16/16 [00:00<00:00, 116.28it/s]


Validation Loss: 1.1268

Epoch 13/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.19it/s]


Training Loss: 0.2216


Validation: 100%|██████████| 16/16 [00:00<00:00, 126.03it/s]


Validation Loss: 0.9507

Epoch 14/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.74it/s]


Training Loss: 0.1510


Validation: 100%|██████████| 16/16 [00:00<00:00, 139.61it/s]


Validation Loss: 0.9714

Epoch 15/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.13it/s]


Training Loss: 0.1255


Validation: 100%|██████████| 16/16 [00:00<00:00, 148.44it/s]


Validation Loss: 1.0094

Epoch 16/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.89it/s]


Training Loss: 0.0859


Validation: 100%|██████████| 16/16 [00:00<00:00, 145.24it/s]


Validation Loss: 0.9835

Epoch 17/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.98it/s]


Training Loss: 0.0511


Validation: 100%|██████████| 16/16 [00:00<00:00, 145.33it/s]


Validation Loss: 0.9134

Epoch 18/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.58it/s]


Training Loss: 0.0387


Validation: 100%|██████████| 16/16 [00:00<00:00, 133.68it/s]


Validation Loss: 0.8637

Epoch 19/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.24it/s]


Training Loss: 0.0277


Validation: 100%|██████████| 16/16 [00:00<00:00, 131.40it/s]


Validation Loss: 0.8913

Epoch 20/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.76it/s]


Training Loss: 0.0202


Validation: 100%|██████████| 16/16 [00:00<00:00, 123.47it/s]


Validation Loss: 0.8698

Epoch 21/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  7.91it/s]


Training Loss: 0.0151


Validation: 100%|██████████| 16/16 [00:00<00:00, 121.35it/s]


Validation Loss: 0.9016

Epoch 22/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.93it/s]


Training Loss: 0.0124


Validation: 100%|██████████| 16/16 [00:00<00:00, 137.28it/s]


Validation Loss: 0.9550

Epoch 23/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.06it/s]


Training Loss: 0.0100


Validation: 100%|██████████| 16/16 [00:00<00:00, 154.35it/s]


Validation Loss: 0.8589

Epoch 24/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.83it/s]


Training Loss: 0.0079


Validation: 100%|██████████| 16/16 [00:00<00:00, 120.65it/s]


Validation Loss: 0.9213

Epoch 25/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.14it/s]


Training Loss: 0.0072


Validation: 100%|██████████| 16/16 [00:00<00:00, 137.26it/s]


Validation Loss: 0.8484

Epoch 26/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  7.86it/s]


Training Loss: 0.0066


Validation: 100%|██████████| 16/16 [00:00<00:00, 126.33it/s]


Validation Loss: 0.8819

Epoch 27/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  7.76it/s]


Training Loss: 0.0053


Validation: 100%|██████████| 16/16 [00:00<00:00, 87.08it/s]

Validation Loss: 0.8821

Epoch 28/100
------------------------------







Training: 100%|██████████| 14/14 [00:02<00:00,  5.46it/s]


Training Loss: 0.0049


Validation: 100%|██████████| 16/16 [00:00<00:00, 122.59it/s]


Validation Loss: 0.9109

Epoch 29/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.23it/s]


Training Loss: 0.0041


Validation: 100%|██████████| 16/16 [00:00<00:00, 133.44it/s]


Validation Loss: 0.8585

Epoch 30/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.82it/s]


Training Loss: 0.0043


Validation: 100%|██████████| 16/16 [00:00<00:00, 146.79it/s]


Validation Loss: 0.8934

Epoch 31/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.54it/s]


Training Loss: 0.0036


Validation: 100%|██████████| 16/16 [00:00<00:00, 139.22it/s]


Validation Loss: 0.8788

Epoch 32/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.28it/s]


Training Loss: 0.0033


Validation: 100%|██████████| 16/16 [00:00<00:00, 134.01it/s]


Validation Loss: 0.8764

Epoch 33/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.92it/s]


Training Loss: 0.0035


Validation: 100%|██████████| 16/16 [00:00<00:00, 139.84it/s]


Validation Loss: 0.8678

Epoch 34/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.36it/s]


Training Loss: 0.0029


Validation: 100%|██████████| 16/16 [00:00<00:00, 131.45it/s]


Validation Loss: 0.9231

Epoch 35/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.33it/s]


Training Loss: 0.0027


Validation: 100%|██████████| 16/16 [00:00<00:00, 123.19it/s]


Validation Loss: 0.8809

Epoch 36/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  7.87it/s]


Training Loss: 0.0025


Validation: 100%|██████████| 16/16 [00:00<00:00, 136.61it/s]


Validation Loss: 0.8953

Epoch 37/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.61it/s]


Training Loss: 0.0024


Validation: 100%|██████████| 16/16 [00:00<00:00, 146.33it/s]


Validation Loss: 0.8830

Epoch 38/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.88it/s]


Training Loss: 0.0021


Validation: 100%|██████████| 16/16 [00:00<00:00, 149.45it/s]


Validation Loss: 0.9056

Epoch 39/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.99it/s]


Training Loss: 0.0020


Validation: 100%|██████████| 16/16 [00:00<00:00, 141.04it/s]


Validation Loss: 0.8922

Epoch 40/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.68it/s]


Training Loss: 0.0019


Validation: 100%|██████████| 16/16 [00:00<00:00, 136.13it/s]


Validation Loss: 0.8989

Epoch 41/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.29it/s]


Training Loss: 0.0023


Validation: 100%|██████████| 16/16 [00:00<00:00, 136.40it/s]


Validation Loss: 0.8820

Epoch 42/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.76it/s]


Training Loss: 0.0020


Validation: 100%|██████████| 16/16 [00:00<00:00, 141.52it/s]


Validation Loss: 0.9113

Epoch 43/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.32it/s]


Training Loss: 0.0019


Validation: 100%|██████████| 16/16 [00:00<00:00, 110.31it/s]


Validation Loss: 0.8971

Epoch 44/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  7.88it/s]


Training Loss: 0.0016


Validation: 100%|██████████| 16/16 [00:00<00:00, 134.07it/s]


Validation Loss: 0.9040

Epoch 45/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.67it/s]


Training Loss: 0.0015


Validation: 100%|██████████| 16/16 [00:00<00:00, 146.99it/s]


Validation Loss: 0.8832

Epoch 46/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.89it/s]


Training Loss: 0.0015


Validation: 100%|██████████| 16/16 [00:00<00:00, 149.63it/s]


Validation Loss: 0.9075

Epoch 47/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.93it/s]


Training Loss: 0.0013


Validation: 100%|██████████| 16/16 [00:00<00:00, 156.55it/s]


Validation Loss: 0.9133

Epoch 48/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.79it/s]


Training Loss: 0.0014


Validation: 100%|██████████| 16/16 [00:00<00:00, 135.24it/s]


Validation Loss: 0.9180

Epoch 49/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.66it/s]


Training Loss: 0.0012


Validation: 100%|██████████| 16/16 [00:00<00:00, 146.56it/s]


Validation Loss: 0.8999

Epoch 50/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.85it/s]


Training Loss: 0.0011


Validation: 100%|██████████| 16/16 [00:00<00:00, 138.27it/s]


Validation Loss: 0.9277

Epoch 51/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.19it/s]


Training Loss: 0.0011


Validation: 100%|██████████| 16/16 [00:00<00:00, 126.88it/s]


Validation Loss: 0.9024

Epoch 52/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  7.95it/s]


Training Loss: 0.0012


Validation: 100%|██████████| 16/16 [00:00<00:00, 130.80it/s]


Validation Loss: 0.9144

Epoch 53/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  7.37it/s]


Training Loss: 0.0010


Validation: 100%|██████████| 16/16 [00:00<00:00, 133.20it/s]


Validation Loss: 0.8954

Epoch 54/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.83it/s]


Training Loss: 0.0010


Validation: 100%|██████████| 16/16 [00:00<00:00, 142.54it/s]


Validation Loss: 0.9004

Epoch 55/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.17it/s]


Training Loss: 0.0010


Validation: 100%|██████████| 16/16 [00:00<00:00, 151.27it/s]


Validation Loss: 0.9302

Epoch 56/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.91it/s]


Training Loss: 0.0009


Validation: 100%|██████████| 16/16 [00:00<00:00, 139.81it/s]


Validation Loss: 0.9203

Epoch 57/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.03it/s]


Training Loss: 0.0009


Validation: 100%|██████████| 16/16 [00:00<00:00, 150.57it/s]


Validation Loss: 0.9389

Epoch 58/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.33it/s]


Training Loss: 0.0009


Validation: 100%|██████████| 16/16 [00:00<00:00, 121.39it/s]


Validation Loss: 0.9191

Epoch 59/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.36it/s]


Training Loss: 0.0008


Validation: 100%|██████████| 16/16 [00:00<00:00, 119.61it/s]


Validation Loss: 0.9189

Epoch 60/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.98it/s]


Training Loss: 0.0010


Validation: 100%|██████████| 16/16 [00:00<00:00, 149.91it/s]


Validation Loss: 0.9145

Epoch 61/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.86it/s]


Training Loss: 0.0008


Validation: 100%|██████████| 16/16 [00:00<00:00, 132.75it/s]


Validation Loss: 0.9160

Epoch 62/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.73it/s]


Training Loss: 0.0008


Validation: 100%|██████████| 16/16 [00:00<00:00, 142.22it/s]


Validation Loss: 0.9336

Epoch 63/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.86it/s]


Training Loss: 0.0007


Validation: 100%|██████████| 16/16 [00:00<00:00, 142.73it/s]


Validation Loss: 0.9364

Epoch 64/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.23it/s]


Training Loss: 0.0007


Validation: 100%|██████████| 16/16 [00:00<00:00, 150.36it/s]


Validation Loss: 0.9167

Epoch 65/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.44it/s]


Training Loss: 0.0007


Validation: 100%|██████████| 16/16 [00:00<00:00, 160.03it/s]


Validation Loss: 0.9237

Epoch 66/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.87it/s]


Training Loss: 0.0007


Validation: 100%|██████████| 16/16 [00:00<00:00, 130.68it/s]


Validation Loss: 0.9235

Epoch 67/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.42it/s]


Training Loss: 0.0006


Validation: 100%|██████████| 16/16 [00:00<00:00, 120.67it/s]


Validation Loss: 0.9327

Epoch 68/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.73it/s]


Training Loss: 0.0006


Validation: 100%|██████████| 16/16 [00:00<00:00, 133.02it/s]


Validation Loss: 0.9337

Epoch 69/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.82it/s]


Training Loss: 0.0006


Validation: 100%|██████████| 16/16 [00:00<00:00, 143.26it/s]


Validation Loss: 0.9265

Epoch 70/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.97it/s]


Training Loss: 0.0005


Validation: 100%|██████████| 16/16 [00:00<00:00, 146.63it/s]


Validation Loss: 0.9107

Epoch 71/100
------------------------------


Training: 100%|██████████| 14/14 [00:02<00:00,  6.24it/s]


Training Loss: 0.0007


Validation: 100%|██████████| 16/16 [00:00<00:00, 112.84it/s]


Validation Loss: 0.9142

Epoch 72/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.24it/s]


Training Loss: 0.0006


Validation: 100%|██████████| 16/16 [00:00<00:00, 131.57it/s]


Validation Loss: 0.9538

Epoch 73/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.00it/s]


Training Loss: 0.0005


Validation: 100%|██████████| 16/16 [00:00<00:00, 145.01it/s]


Validation Loss: 0.9429

Epoch 74/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.16it/s]


Training Loss: 0.0005


Validation: 100%|██████████| 16/16 [00:00<00:00, 115.40it/s]


Validation Loss: 0.9375

Epoch 75/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.52it/s]


Training Loss: 0.0005


Validation: 100%|██████████| 16/16 [00:00<00:00, 144.62it/s]


Validation Loss: 0.9353

Epoch 76/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.13it/s]


Training Loss: 0.0005


Validation: 100%|██████████| 16/16 [00:00<00:00, 153.50it/s]


Validation Loss: 0.9328

Epoch 77/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.80it/s]


Training Loss: 0.0006


Validation: 100%|██████████| 16/16 [00:00<00:00, 142.68it/s]


Validation Loss: 0.9290

Epoch 78/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.83it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 149.48it/s]


Validation Loss: 0.9498

Epoch 79/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.94it/s]


Training Loss: 0.0005


Validation: 100%|██████████| 16/16 [00:00<00:00, 147.15it/s]


Validation Loss: 0.9433

Epoch 80/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.51it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 128.70it/s]


Validation Loss: 0.9252

Epoch 81/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.16it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 136.60it/s]


Validation Loss: 0.9474

Epoch 82/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.34it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 133.67it/s]


Validation Loss: 0.9352

Epoch 83/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.14it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 139.47it/s]


Validation Loss: 0.9334

Epoch 84/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.70it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 143.89it/s]


Validation Loss: 0.9287

Epoch 85/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00, 10.11it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 147.04it/s]


Validation Loss: 0.9430

Epoch 86/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.77it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 137.10it/s]


Validation Loss: 0.9515

Epoch 87/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.40it/s]


Training Loss: 0.0004


Validation: 100%|██████████| 16/16 [00:00<00:00, 138.37it/s]


Validation Loss: 0.9423

Epoch 88/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.26it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 131.87it/s]


Validation Loss: 0.9516

Epoch 89/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.82it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 146.84it/s]


Validation Loss: 0.9437

Epoch 90/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.01it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 125.49it/s]


Validation Loss: 0.9466

Epoch 91/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.18it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 143.48it/s]


Validation Loss: 0.9427

Epoch 92/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.91it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 147.63it/s]


Validation Loss: 0.9662

Epoch 93/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.53it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 139.65it/s]


Validation Loss: 0.9507

Epoch 94/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.58it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 143.49it/s]


Validation Loss: 0.9471

Epoch 95/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.63it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 140.43it/s]


Validation Loss: 0.9558

Epoch 96/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.90it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 145.58it/s]


Validation Loss: 0.9251

Epoch 97/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.38it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 122.19it/s]


Validation Loss: 0.9335

Epoch 98/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.13it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 116.92it/s]


Validation Loss: 0.9502

Epoch 99/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  8.51it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 123.70it/s]


Validation Loss: 0.9531

Epoch 100/100
------------------------------


Training: 100%|██████████| 14/14 [00:01<00:00,  9.28it/s]


Training Loss: 0.0003


Validation: 100%|██████████| 16/16 [00:00<00:00, 134.63it/s]

Validation Loss: 0.9509

Training complete. Model saved to cnn_model.pth
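
In the later epochs of the log above, the training loss falls to about 0.0003 while the validation loss stays between roughly 0.91 and 0.97, so the checkpoint saved after the last epoch is not necessarily the one with the lowest validation loss. As a minimal sketch (the helper name `best_validation_epoch` and the `sample_log` excerpt are illustrative and not part of the original notebook), the following parses log text in the format printed above and reports the epoch with the lowest validation loss:

```python
import re

def best_validation_epoch(log_text):
    """Return (epoch, loss) for the lowest validation loss found in log_text."""
    best = None
    epoch = None
    for line in log_text.splitlines():
        # Lines such as "Epoch 70/100" start a new epoch block.
        m = re.match(r"\s*Epoch (\d+)/\d+", line)
        if m:
            epoch = int(m.group(1))
            continue
        # Lines such as "Validation Loss: 0.9107" close the block.
        m = re.match(r"\s*Validation Loss: ([0-9.]+)", line)
        if m and epoch is not None:
            loss = float(m.group(1))
            if best is None or loss < best[1]:
                best = (epoch, loss)
    return best

# Excerpt copied from the log above (progress bars omitted).
sample_log = """
Epoch 70/100
------------------------------
Training Loss: 0.0005
Validation Loss: 0.9107

Epoch 71/100
------------------------------
Training Loss: 0.0007
Validation Loss: 0.9142

Epoch 72/100
------------------------------
Training Loss: 0.0006
Validation Loss: 0.9538
"""

print(best_validation_epoch(sample_log))  # (70, 0.9107)
```

Run against a full log, the same helper could indicate at which epoch a checkpoint would have been worth keeping; it only reads text and does not change the training loop.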





In [None]:
import numpy as np

model.eval()  # use running BatchNorm statistics during evaluation
val_preds, val_targets = [], []
with torch.no_grad():
  for val_features, val_labels in tqdm(val_loader, desc="Validation"):
    val_features = val_features.to(device)
    # Forward pass: Conv1d expects (batch, channels, time)
    val_outputs = model(val_features.transpose(1, 2).float())
    val_preds.append(torch.argmax(val_outputs, dim=1).cpu().numpy())
    val_targets.append(val_labels.cpu().numpy())

# Concatenate once instead of calling np.append on every batch
val_out = np.concatenate(val_preds)
val_label = np.concatenate(val_targets)

Validation: 100%|██████████| 16/16 [00:00<00:00, 126.45it/s]


In [None]:
val_out.shape, val_label.shape

((155,), (155,))

In [None]:
from sklearn.metrics import accuracy_score
# accuracy_score takes (y_true, y_pred); the value is the same in either order
accuracy_score(val_label, val_out)

0.7161290322580646

# Test

In [None]:
import numpy as np

model.eval()  # use running BatchNorm statistics during evaluation
test_preds, test_targets = [], []
with torch.no_grad():
  for test_features, test_labels in tqdm(test_loader, desc="test"):
    test_features = test_features.to(device)
    # Forward pass: Conv1d expects (batch, channels, time)
    test_outputs = model(test_features.transpose(1, 2).float())
    test_preds.append(torch.argmax(test_outputs, dim=1).cpu().numpy())
    test_targets.append(test_labels.cpu().numpy())

# Concatenate once instead of calling np.append on every batch
test_out = np.concatenate(test_preds)
test_label = np.concatenate(test_targets)




test: 100%|██████████| 388/388 [00:00<00:00, 588.07it/s]


In [None]:
from sklearn.metrics import accuracy_score
# accuracy_score takes (y_true, y_pred); the value is the same in either order
accuracy_score(test_label, test_out)

0.7474226804123711
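
Overall accuracy (about 0.72 on validation and 0.75 on test) can hide speakers that the model rarely identifies correctly. The following is a minimal sketch of a per-speaker breakdown, assuming predictions and labels are sequences of integer speaker IDs like `test_out` and `test_label` above. The helper name `per_class_accuracy` and the toy inputs are illustrative only and do not come from the original notebook.

```python
from collections import defaultdict

def per_class_accuracy(y_true, y_pred):
    """Map each true label to the fraction of its samples predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        t, p = int(t), int(p)  # also accepts NumPy floats such as 3.0
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {label: correct[label] / total[label] for label in sorted(total)}

# Toy data for illustration, not results from the notebook.
toy_true = [0, 0, 1, 1, 2]
toy_pred = [0, 1, 1, 1, 0]
print(per_class_accuracy(toy_true, toy_pred))  # {0: 0.5, 1: 1.0, 2: 0.0}
```

In the notebook this could be called as `per_class_accuracy(test_label, test_out)` to see which speaker labels account for most of the errors.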