# Editing history

1. The model could not fit the training set well (underfitting).
    - Add more neurons to the feed-forward (FF) layer
    - Use 4 attention heads
    - Use 2 transformer layers


2. Replace the original Transformer encoder with a Conformer. Because the goal is to classify the speaker, attending to short local spans of the sequence may work better than attending to the entire context.
    - Use Conformer with default hyperparameters


3. Try to reproduce Self-Attention Pooling[[1](https://arxiv.org/pdf/2008.01077v1.pdf)]
    - This approach encodes short-term speaker spectral features into speaker embeddings to be used in text-independent speaker verification.


4. Replace the conventional Softmax with Additive Margin Softmax (AM-Softmax)
    - AM-Softmax enlarges the margin between classes and reduces intra-class variation, so it may be a good fit for speaker classification.
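
Item 4 can be sketched as follows. This is a minimal AM-Softmax head under stated assumptions, not necessarily the `AMSoftmaxLoss` referenced later in this notebook: the class name `AMSoftmaxSketch`, its constructor arguments, and the choice to keep the class weight matrix inside the loss module are all hypothetical. It L2-normalizes both the embeddings and the class weights so that the logits are cosines, subtracts the margin `m` from the target-class cosine, scales by `s`, and applies cross-entropy.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AMSoftmaxSketch(nn.Module):
    """Cross-entropy over s * (cos(theta) - m * one_hot(label))."""

    def __init__(self, embed_dim, n_classes, m=0.4, s=30.0):
        super().__init__()
        self.m = m
        self.s = s
        # One learnable weight vector per speaker (hypothetical placement).
        self.weight = nn.Parameter(torch.randn(n_classes, embed_dim) * 0.01)

    def forward(self, embeddings, labels):
        # Cosine similarity between unit-length embeddings and class weights.
        cos = F.linear(F.normalize(embeddings, dim=1), F.normalize(self.weight, dim=1))
        # Subtract the margin only at the ground-truth class position.
        margin = F.one_hot(labels, num_classes=cos.size(1)).to(cos.dtype) * self.m
        return F.cross_entropy(self.s * (cos - margin), labels)


torch.manual_seed(0)
criterion = AMSoftmaxSketch(embed_dim=16, n_classes=5)
embeddings = torch.randn(8, 16)
labels = torch.randint(0, 5, (8,))
loss = criterion(embeddings, labels)
print(loss.item())
```

With a head like this, the model would output embeddings rather than class logits during training, and predictions at inference time would come from the cosine logits without the margin.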

# Task description
- Classify the speakers of given features.
- Main goal: Learn how to use transformer.
- Baselines:
  - Easy: Run sample code and know how to use transformer.
  - Medium: Know how to adjust parameters of transformer.
  - Strong: Construct a [Conformer](https://arxiv.org/abs/2005.08100), which is a variant of the Transformer.
  - Boss: Implement [Self-Attention Pooling](https://arxiv.org/pdf/2008.01077v1.pdf) & [Additive Margin Softmax](https://arxiv.org/pdf/1801.05599.pdf) to further boost the performance.

- Other links
  - Competition: [link](https://www.kaggle.com/t/49ea0c385a974db5919ec67299ba2e6b)
  - Slide: [link](https://docs.google.com/presentation/d/1LDAW0GGrC9B6D7dlNdYzQL6D60-iKgFr/edit?usp=sharing&ouid=104280564485377739218&rtpof=true&sd=true)
  - Data: [link](https://github.com/googly-mingto/ML2023HW4/releases)
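
For the Strong baseline, the part of a Conformer block that differs most from a plain Transformer layer is its convolution module, which mixes information among neighboring frames. Below is a minimal sketch, assuming the layout described in the Conformer paper (layer norm, pointwise convolution with GLU, depthwise convolution, batch norm, Swish, pointwise convolution, dropout, residual connection). The class name `ConformerConvModule`, the default kernel size, and the tensor sizes are illustrative choices, not this notebook's implementation.

```python
import torch
import torch.nn as nn


class ConformerConvModule(nn.Module):
    def __init__(self, d_model, kernel_size=31, dropout=0.1):
        super().__init__()
        assert kernel_size % 2 == 1, "an odd kernel keeps the sequence length unchanged"
        self.norm = nn.LayerNorm(d_model)
        # Pointwise conv doubles the channels; GLU halves them again.
        self.pointwise_in = nn.Conv1d(d_model, 2 * d_model, kernel_size=1)
        self.glu = nn.GLU(dim=1)
        # Depthwise conv: one filter per channel, looking at neighboring frames.
        self.depthwise = nn.Conv1d(
            d_model, d_model, kernel_size, padding=kernel_size // 2, groups=d_model
        )
        self.batch_norm = nn.BatchNorm1d(d_model)
        self.swish = nn.SiLU()
        self.pointwise_out = nn.Conv1d(d_model, d_model, kernel_size=1)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # x: (batch size, length, d_model)
        out = self.norm(x).transpose(1, 2)  # (batch size, d_model, length)
        out = self.glu(self.pointwise_in(out))
        out = self.swish(self.batch_norm(self.depthwise(out)))
        out = self.dropout(self.pointwise_out(out))
        # Residual connection back in (batch size, length, d_model) layout.
        return x + out.transpose(1, 2)


module = ConformerConvModule(d_model=80)
frames = torch.randn(4, 128, 80)
out = module(frames)
print(out.shape)  # torch.Size([4, 128, 80])
```

A full Conformer block would wrap this module together with self-attention and two half-step feed-forward modules.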



In [11]:
import numpy as np
import torch
import random

def set_seed(seed):
    np.random.seed(seed)
    random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

set_seed(87)

# Data

## Dataset
- The original dataset is [VoxCeleb2](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/vox2.html).
- See the [license](https://creativecommons.org/licenses/by/4.0/) and its [complete version](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/files/license.txt) for VoxCeleb2.
- 600 speakers are randomly selected from VoxCeleb2.
- The raw waveforms are then preprocessed into mel-spectrograms.

- Args:
  - data_dir: The path to the data directory.
  - metadata_path: The path to the metadata.
  - segment_len: The length of audio segment for training. 
- The architecture of the data directory
  - data directory
    - metadata.json
    - testdata.json
    - mapping.json
    - uttr-{random string}.pt

- The information in metadata
  - "n_mels": The dimension of the mel-spectrogram.
  - "speakers": A dictionary.
    - Key: speaker ids.
    - Value: "feature_path" and "mel_len"
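
To make the layout concrete, here is a small, made-up metadata dictionary with the structure listed above, flattened into `(feature_path, speaker_id)` pairs the way a dataset class would consume it. The speaker names, file names, lengths, and the `speaker2id` mapping are invented for illustration.

```python
# Hypothetical contents of metadata.json (values are made up).
metadata = {
    "n_mels": 40,
    "speakers": {
        "id00001": [
            {"feature_path": "uttr-aaa.pt", "mel_len": 435},
            {"feature_path": "uttr-bbb.pt", "mel_len": 212},
        ],
        "id00002": [
            {"feature_path": "uttr-ccc.pt", "mel_len": 390},
        ],
    },
}
# Hypothetical "speaker2id" entry of mapping.json.
speaker2id = {"id00001": 0, "id00002": 1}

# One (feature file, integer label) pair per utterance.
pairs = [
    (utterance["feature_path"], speaker2id[speaker])
    for speaker, utterances in metadata["speakers"].items()
    for utterance in utterances
]
print(pairs)
# [('uttr-aaa.pt', 0), ('uttr-bbb.pt', 0), ('uttr-ccc.pt', 1)]
```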


For efficiency, the mel-spectrograms are cut into fixed-length segments during training.

In [12]:
import os
import json
import torch
import random
from pathlib import Path
from torch.utils.data import Dataset
from torch.nn.utils.rnn import pad_sequence


class myDataset(Dataset):
    def __init__(self, data_dir, segment_len=128):
        self.data_dir = data_dir
        self.segment_len = segment_len

        # Load the mapping from speaker name to the corresponding id.
        mapping_path = Path(data_dir) / "mapping.json"
        mapping = json.load(mapping_path.open())
        self.speaker2id = mapping["speaker2id"]

        # Load metadata of training data.
        metadata_path = Path(data_dir) / "metadata.json"
        metadata = json.load(open(metadata_path))["speakers"]

        # Get the total number of speaker.
        self.speaker_num = len(metadata.keys())
        self.data = []
        for speaker in metadata.keys():
            for utterances in metadata[speaker]:
                self.data.append([utterances["feature_path"], self.speaker2id[speaker]])
 
    def __len__(self):
        return len(self.data)
 
    def __getitem__(self, index):
        feat_path, speaker = self.data[index]
        # Load preprocessed mel-spectrogram.
        mel = torch.load(os.path.join(self.data_dir, feat_path))

        # Segment mel-spectrogram into "segment_len" frames.
        if len(mel) > self.segment_len:
            # Randomly get the starting point of the segment.
            start = random.randint(0, len(mel) - self.segment_len)
            # Get a segment with "segment_len" frames.
            mel = torch.FloatTensor(mel[start:start+self.segment_len])
        else:
            mel = torch.FloatTensor(mel)
        # Turn the speaker id into long for computing loss later.
        speaker = torch.FloatTensor([speaker]).long()
        return mel, speaker
 
    def get_speaker_number(self):
        return self.speaker_num

## Dataloader
- Split the dataset into a training set (90%) and a validation set (10%).
- Create dataloaders to iterate over the data.
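
Because utterances shorter than `segment_len` are kept at their original length, a batch can contain sequences of different lengths, so they have to be padded before they can be stacked. A minimal sketch of that step using `pad_sequence`; the tensor sizes are arbitrary, and the padding value of -20 follows the collate function in this notebook.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Two dummy mel-spectrograms with 3 and 5 frames of 40 mel bins each.
short = torch.zeros(3, 40)
long = torch.zeros(5, 40)

# Pad the shorter one up to the longest length in the batch.
batch = pad_sequence([short, long], batch_first=True, padding_value=-20)
print(batch.shape)  # torch.Size([2, 5, 40])
print(batch[0, 3:].unique())  # tensor([-20.])
```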

In [13]:
import torch
from torch.utils.data import DataLoader, random_split
from torch.nn.utils.rnn import pad_sequence


def collate_batch(batch):
    # Process features within a batch.
    """Collate a batch of data."""
    mel, speaker = zip(*batch)
    # Because we train the model batch by batch, we need to pad the features in the same batch to make their lengths the same.
    mel = pad_sequence(mel, batch_first=True, padding_value=-20)    # Pad with -20 (log 10^(-20)), a very small value.
    # mel: (batch size, length, 40)
    return mel, torch.FloatTensor(speaker).long()


def get_dataloader(data_dir, batch_size, n_workers):
    """Generate dataloader"""
    dataset = myDataset(data_dir)
    speaker_num = dataset.get_speaker_number()
    # Split dataset into training dataset and validation dataset
    trainlen = int(0.9 * len(dataset))
    lengths = [trainlen, len(dataset) - trainlen]
    trainset, validset = random_split(dataset, lengths)

    train_loader = DataLoader(
        trainset,
        batch_size=batch_size,
        shuffle=True,
        drop_last=True,
        num_workers=n_workers,
        pin_memory=True,
        collate_fn=collate_batch,
    )
    valid_loader = DataLoader(
        validset,
        batch_size=batch_size,
        num_workers=n_workers,
        drop_last=True,
        pin_memory=True,
        collate_fn=collate_batch,
    )

    return train_loader, valid_loader, speaker_num

# Model
- TransformerEncoderLayer:
  - Base transformer encoder layer in [Attention Is All You Need](https://arxiv.org/abs/1706.03762)
  - Parameters:
    - d_model: the number of expected features of the input (required).

    - nhead: the number of heads of the multiheadattention models (required).

    - dim_feedforward: the dimension of the feedforward network model (default=2048).

    - dropout: the dropout value (default=0.1).

    - activation: the activation function of intermediate layer, relu or gelu (default=relu).

- TransformerEncoder:
  - TransformerEncoder is a stack of N transformer encoder layers
  - Parameters:
    - encoder_layer: an instance of the TransformerEncoderLayer() class (required).

    - num_layers: the number of sub-encoder-layers in the encoder (required).

    - norm: the layer normalization component (optional).
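
The parameters above fit together as in the sketch below, which stacks two encoder layers with four attention heads (the configuration mentioned in the editing history). The sizes `d_model=80` and `dim_feedforward=256` are placeholder values, and `batch_first=True` (available in recent PyTorch versions) is an assumption made here so that the input does not need to be permuted.

```python
import torch
import torch.nn as nn

# A single encoder layer: self-attention followed by a feed-forward network.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=80,
    nhead=4,
    dim_feedforward=256,
    dropout=0.1,
    activation="gelu",
    batch_first=True,
)
# Stack two copies of that layer.
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

features = torch.randn(32, 128, 80)  # (batch size, length, d_model)
encoded = encoder(features)
print(encoded.shape)  # torch.Size([32, 128, 80])
```

Note that `d_model` must be divisible by `nhead`, since each head works on `d_model // nhead` features.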

In [34]:
import torch
import torch.nn as nn
import torch.nn.functional as F


class Classifier(nn.Module):
	def __init__(self, d_model=512, n_spks=600, nhead=16,dropout=0.1):
		super().__init__()
		# Project the dimension of features from that of input into d_model.
		self.prenet = nn.Linear(40, d_model)
		self.dropout = nn.Dropout(dropout)
		# TODO:
		#   Change Transformer to Conformer.
		#   https://arxiv.org/abs/2005.08100
		self.encoder_layer = nn.TransformerEncoderLayer(
			d_model=d_model, dim_feedforward=256, nhead=nhead
		)
		# self.encoder = nn.TransformerEncoder(self.encoder_layer, num_layers=2)

		# Project the dimension of features from d_model into the number of speakers.
		self.pred_layer = nn.Sequential(
			nn.Linear(d_model, d_model),
			# nn.Sigmoid(),
			nn.GELU(),
			nn.Dropout(dropout),
			nn.Linear(d_model, n_spks),
		)

	def forward(self, mels):
		"""
		args:
			mels: (batch size, length, 40)
		return:
			out: (batch size, n_spks)
		"""
		# out: (batch size, length, d_model)
		out = self.prenet(mels)
		# out: (length, batch size, d_model)
		out = out.permute(1, 0, 2)
		# The encoder layer expects features in the shape of (length, batch size, d_model).
		out = self.encoder_layer(out)
		# out = self.encoder(out)
		# out: (batch size, length, d_model)
		out = out.transpose(0, 1)
		# mean pooling
		stats = out.mean(dim=1)

		# out: (batch, n_spks)
		out = self.pred_layer(stats)
		return out
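
The mean pooling in `forward` gives every frame the same weight. Self-Attention Pooling, mentioned in the editing history, instead learns one scalar score per frame and averages the frames with softmax weights. A minimal sketch, assuming input of shape (batch size, length, d_model); the class name `SelfAttentionPooling` and its single linear scoring layer are an illustrative reading of the paper, not code from this notebook.

```python
import torch
import torch.nn as nn


class SelfAttentionPooling(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        # Maps each frame to one unnormalized attention score.
        self.score = nn.Linear(d_model, 1)

    def forward(self, x):
        # x: (batch size, length, d_model)
        # weights: (batch size, length), summing to 1 over the length axis.
        weights = torch.softmax(self.score(x).squeeze(-1), dim=1)
        # Weighted sum over frames: (batch size, d_model)
        return torch.sum(x * weights.unsqueeze(-1), dim=1)


pooling = SelfAttentionPooling(d_model=80)
frames = torch.randn(4, 128, 80)
stats = pooling(frames)
print(stats.shape)  # torch.Size([4, 80])
```

If all scores are equal, the softmax weights are uniform and the result reduces to mean pooling, so this layer can be seen as a learnable generalization of the `out.mean(dim=1)` step.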

# Learning rate schedule
- For transformer architectures, the learning rate schedule is designed differently from that of CNNs.
- Previous work shows that learning rate warmup is useful for training models with transformer architectures.
- The warmup schedule
  - Set the learning rate to 0 at the beginning.
  - The learning rate increases linearly from 0 to the initial learning rate during the warmup period.
  - After warmup, the learning rate decays to 0 following a cosine curve.
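
As a worked illustration of the schedule, the sketch below computes the factor that multiplies the initial learning rate at a few steps. The helper name `lr_multiplier` is hypothetical; the default step counts are taken from the training configuration in this notebook (1000 warmup steps, 70000 total steps), and the decay after warmup is a half cosine.

```python
import math


def lr_multiplier(step, warmup_steps=1000, total_steps=70000, num_cycles=0.5):
    """Factor applied to the initial learning rate at a given step."""
    # Linear warmup from 0 to 1.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    # Cosine decay from 1 to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))


for step in (0, 500, 1000, 35500, 70000):
    print(step, round(lr_multiplier(step), 3))
```

The factor is 0 at step 0, 0.5 halfway through warmup, 1 at the end of warmup, 0.5 halfway through the decay (step 35500), and 0 at the final step.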

In [28]:
import math

import torch
from torch.optim import Optimizer
from torch.optim.lr_scheduler import LambdaLR


def get_cosine_schedule_with_warmup(
    optimizer: Optimizer,
    num_warmup_steps: int,
    num_training_steps: int,
    num_cycles: float = 0.5,
    last_epoch: int = -1,
):
    """
    Create a schedule with a learning rate that decreases following the values of the cosine function between the
    initial lr set in the optimizer to 0, after a warmup period during which it increases linearly between 0 and the
    initial lr set in the optimizer.

    Args:
        optimizer (:class:`~torch.optim.Optimizer`):
        The optimizer for which to schedule the learning rate.
        num_warmup_steps (:obj:`int`):
        The number of steps for the warmup phase.
        num_training_steps (:obj:`int`):
        The total number of training steps.
        num_cycles (:obj:`float`, `optional`, defaults to 0.5):
        The number of waves in the cosine schedule (the default is to just decrease from the max value to 0
        following a half-cosine).
        last_epoch (:obj:`int`, `optional`, defaults to -1):
        The index of the last epoch when resuming training.

    Return:
        :obj:`torch.optim.lr_scheduler.LambdaLR` with the appropriate schedule.
    """
    def lr_lambda(current_step):
        # Warmup
        if current_step < num_warmup_steps:
            return float(current_step) / float(max(1, num_warmup_steps))
        # Cosine decay
        progress = float(current_step - num_warmup_steps) / float(
            max(1, num_training_steps - num_warmup_steps)
        )
        return max(
            0.0, 0.5 * (1.0 + math.cos(math.pi * float(num_cycles) * 2.0 * progress))
        )

    return LambdaLR(optimizer, lr_lambda, last_epoch)

# Model Function

- Model forward function.

In [29]:
import torch


def model_fn(batch, model, criterion, device):
	"""Forward a batch through the model."""

	mels, labels = batch
	mels = mels.to(device)
	labels = labels.to(device)

	outs = model(mels)

	loss = criterion(outs, labels)

	# Get the speaker id with highest probability.
	preds = outs.argmax(1)
	# Compute accuracy.
	accuracy = torch.mean((preds == labels).float())

	return loss, accuracy

# Validate
- Calculate accuracy of the validation set.

In [30]:
from tqdm import tqdm
import torch


def valid(dataloader, model, criterion, device): 
    """Validate on validation set."""

    model.eval()
    running_loss = 0.0
    running_accuracy = 0.0
    pbar = tqdm(total=len(dataloader.dataset), ncols=0, desc="Valid", unit=" uttr")

    for i, batch in enumerate(dataloader):
        with torch.no_grad():
            loss, accuracy = model_fn(batch, model, criterion, device)
            running_loss += loss.item()
            running_accuracy += accuracy.item()

        pbar.update(dataloader.batch_size)
        pbar.set_postfix(
            loss=f"{running_loss / (i+1):.2f}",
            accuracy=f"{running_accuracy / (i+1):.2f}",
        )

    pbar.close()
    model.train()

    return running_accuracy / len(dataloader)

# Main function

In [35]:
from tqdm import tqdm

import torch
import torch.nn as nn
from torch.optim import AdamW
from torch.utils.data import DataLoader, random_split


def parse_args():
    """arguments"""
    config = {
        "data_dir": "./Dataset",
        "save_path": "model.ckpt",
        "batch_size": 32,
        "n_workers": 8,
        "valid_steps": 2000,
        "warmup_steps": 1000,
        "save_steps": 10000,
        "total_steps": 70000,
    }

    return config


def main(
    data_dir,
    save_path,
    batch_size,
    n_workers,
    valid_steps,
    warmup_steps,
    total_steps,
    save_steps,
):
    """Main function."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"[Info]: Use {device} now!")

    train_loader, valid_loader, speaker_num = get_dataloader(data_dir, batch_size, n_workers)
    train_iterator = iter(train_loader)
    print(f"[Info]: Finish loading data!",flush = True)

    model = Classifier(n_spks=speaker_num).to(device)
    criterion = nn.CrossEntropyLoss()
    #criterion = AMSoftmaxLoss(m=0.4, s=30)
    optimizer = AdamW(model.parameters(), lr=1e-3)
    scheduler = get_cosine_schedule_with_warmup(optimizer, warmup_steps, total_steps)
    print(f"[Info]: Finish creating model!",flush = True)

    best_accuracy = -1.0
    best_state_dict = None

    pbar = tqdm(total=valid_steps, ncols=0, desc="Train", unit=" step")

    for step in range(total_steps):
        # Get data
        try:
            batch = next(train_iterator) # (32, 128, 40)
        except StopIteration:
            train_iterator = iter(train_loader)
            batch = next(train_iterator)

        loss, accuracy = model_fn(batch, model, criterion, device)
        batch_loss = loss.item()
        batch_accuracy = accuracy.item()

        # Update model
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()

        # Log
        pbar.update()
        pbar.set_postfix(
            loss=f"{batch_loss:.2f}",
            accuracy=f"{batch_accuracy:.2f}",
            step=step + 1,
        )

        # Do validation
        if (step + 1) % valid_steps == 0:
            pbar.close()

            valid_accuracy = valid(valid_loader, model, criterion, device)

            # keep the best model
            if valid_accuracy > best_accuracy:
                best_accuracy = valid_accuracy
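                # Clone the tensors: state_dict() returns references that later training steps would overwrite.
                best_state_dict = {k: v.detach().clone() for k, v in model.state_dict().items()}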

            pbar = tqdm(total=valid_steps, ncols=0, desc="Train", unit=" step")

        # Save the best model so far.
        if (step + 1) % save_steps == 0 and best_state_dict is not None:
            torch.save(best_state_dict, save_path)
            pbar.write(f"Step {step + 1}, best model saved. (accuracy={best_accuracy:.4f})")

    pbar.close()


if __name__ == "__main__":
    main(**parse_args())

[Info]: Use cuda now!
[Info]: Finish loading data!
[Info]: Finish creating model!
Train: 100% 2000/2000 [00:13<00:00, 147.52 step/s, accuracy=0.34, loss=3.29, step=2000]

Step 10000, best model saved. (accuracy=0.5713)

Train: 100% 2000/2000 [00:13<00:00, 151.57 step/s, accuracy=0.38, loss=2.01, step=12000]

Step 20000, best model saved. (accuracy=0.6414)

Train: 100% 2000/2000 [00:13<00:00, 148.54 step/s, accuracy=0.69, loss=1.18, step=22000]

Step 30000, best model saved. (accuracy=0.6803)

Train: 100% 2000/2000 [00:14<00:00, 140.09 step/s, accur

Step 40000, best model saved. (accuracy=0.7299)

Train: 100% 2000/2000 [00:13<00:00, 149.47 step/s, accuracy=0.78, loss=0.79, step=42000]

Step 50000, best model saved. (accuracy=0.7472)

Train: 100% 2000/2000 [00:14<00:00, 142.79 step/s, accuracy=0.81, loss=0.59, step=52000]

Step 60000, best model saved. (accuracy=0.7671)

Train: 100% 2000/2000 [00:13<00:00, 148.58 step/s, accuracy=0.88, loss=0.46, step=62000]

Step 70000, best model saved. (accuracy=0.7726)


# Inference

In [36]:
import os
import json
import torch
from pathlib import Path
from torch.utils.data import Dataset


class InferenceDataset(Dataset):
    def __init__(self, data_dir):
        testdata_path = Path(data_dir) / "testdata.json"
        metadata = json.load(testdata_path.open())
        self.data_dir = data_dir
        self.data = metadata["utterances"]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        utterance = self.data[index]
        feat_path = utterance["feature_path"]
        mel = torch.load(os.path.join(self.data_dir, feat_path))

        return feat_path, mel


def inference_collate_batch(batch):
    """Collate a batch of data."""
    feat_paths, mels = zip(*batch)

    return feat_paths, torch.stack(mels)

## Main function of Inference

In [37]:
import json
import csv
from pathlib import Path
# from tqdm.notebook import tqdm
from tqdm import tqdm


import torch
from torch.utils.data import DataLoader

def parse_args():
    """arguments"""
    config = {
        "data_dir": "./Dataset",
        "model_path": "./model.ckpt",
        "output_path": "./output.csv",
    }

    return config


def main(
    data_dir,
    model_path,
    output_path,
):
    """Main function."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"[Info]: Use {device} now!")

    mapping_path = Path(data_dir) / "mapping.json"
    mapping = json.load(mapping_path.open())

    dataset = InferenceDataset(data_dir)
    dataloader = DataLoader(
        dataset,
        batch_size=1,
        shuffle=False,
        drop_last=False,
        num_workers=8,
        collate_fn=inference_collate_batch,
    )
    print(f"[Info]: Finish loading data!",flush = True)

    speaker_num = len(mapping["id2speaker"])
    model = Classifier(n_spks=speaker_num).to(device)
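    # map_location lets a checkpoint saved on GPU be loaded on a CPU-only machine.
    model.load_state_dict(torch.load(model_path, map_location=device))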
    model.eval()
    print(f"[Info]: Finish creating model!",flush = True)

    results = [["Id", "Category"]]
    for feat_paths, mels in tqdm(dataloader):
        with torch.no_grad():
            mels = mels.to(device)
            outs = model(mels)
            #outs = model.pred_layer(outs)  # AMSoftmax
            preds = outs.argmax(1).cpu().numpy()
            for feat_path, pred in zip(feat_paths, preds):
                results.append([feat_path, mapping["id2speaker"][str(pred)]])

    with open(output_path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows(results)


if __name__ == "__main__":
    main(**parse_args())

[Info]: Use cuda now!
[Info]: Finish loading data!
[Info]: Finish creating model!


100%|██████████| 8000/8000 [00:11<00:00, 694.76it/s]


### Check which GPU is in use

In [10]:
# import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if device.type == "cuda":
    current_gpu = torch.cuda.current_device()
    gpu_name = torch.cuda.get_device_name(current_gpu)
    print(f"[Info]: Use GPU {current_gpu}: {gpu_name}")
else:
    print("[Info]: Use CPU now!")

# # Manually specify the GPU
# device = torch.device("cuda:1")
# gpu_index = device.index  # Get the index of the GPU you specified
# gpu_name = torch.cuda.get_device_name(gpu_index)

# print(f"[Info]: Use GPU {gpu_index}: {gpu_name}")

[Info]: Use GPU 0: NVIDIA RTX A6000


In [10]:
import optuna
from tqdm import tqdm
import torch
import torch.nn as nn
from torch.optim import AdamW
from torch.utils.data import DataLoader, random_split

def parse_args():
    """arguments"""
    config = {
        "data_dir": "./Dataset",
        "save_path": "model.ckpt",
        "batch_size": 32,
        "n_workers": 8,
        "valid_steps": 2000,
        "warmup_steps": 1000,
        "save_steps": 10000,
        "total_steps": 20000,
    }
    return config

def objective(trial, data_dir, batch_size, n_workers, valid_steps, warmup_steps, total_steps, save_steps):
    d_model = trial.suggest_categorical("d_model", [256, 512])
    nhead = trial.suggest_categorical("nhead", [16, 32])


    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"[Info]: Use {device} now!")
    train_loader, valid_loader, speaker_num = get_dataloader(data_dir, batch_size, n_workers)
    train_iterator = iter(train_loader)
    print(f"[Info]: Finish loading data!", flush=True)


    # Define the model
    model = Classifier(d_model=d_model, nhead=nhead, n_spks=speaker_num).to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = AdamW(model.parameters(), lr=1e-3)
    scheduler = get_cosine_schedule_with_warmup(optimizer, warmup_steps, total_steps)
    print(f"[Info]: Finish creating model!", flush=True)


    best_accuracy = -1.0
    best_state_dict = None

    pbar = tqdm(total=valid_steps, ncols=0, desc="Train", unit=" step")

    for step in range(total_steps):
        try:
            batch = next(train_iterator)
        except StopIteration:
            train_iterator = iter(train_loader)
            batch = next(train_iterator)

        loss, accuracy = model_fn(batch, model, criterion, device)
        batch_loss = loss.item()
        batch_accuracy = accuracy.item()

        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()

        pbar.update()
        pbar.set_postfix(
            loss=f"{batch_loss:.2f}",
            accuracy=f"{batch_accuracy:.2f}",
            step=step + 1,
        )

        if (step + 1) % valid_steps == 0:
            pbar.close()
            valid_accuracy = valid(valid_loader, model, criterion, device)

            if valid_accuracy > best_accuracy:
                best_accuracy = valid_accuracy
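                # Clone the tensors: state_dict() returns references that later training steps would overwrite.
                best_state_dict = {k: v.detach().clone() for k, v in model.state_dict().items()}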

            pbar = tqdm(total=valid_steps, ncols=0, desc="Train", unit=" step")

            # Early stopping: terminate the trial early if its validation accuracy is too low.
            if best_accuracy < 0.1 and step > 10000:  # Accuracy threshold and minimum step count
                break

    pbar.close()
    if best_state_dict is not None:
        torch.save(best_state_dict, f"model_trial_{trial.number}.ckpt")

    return best_accuracy

def main(data_dir, save_path, batch_size, n_workers, valid_steps, warmup_steps, total_steps, save_steps):
    """Main function with Optuna optimization"""
    # Create an Optuna study whose goal is to maximize validation accuracy.
    study = optuna.create_study(direction="maximize")
    study.optimize(
        lambda trial: objective(trial, data_dir, batch_size, n_workers, valid_steps, warmup_steps, total_steps, save_steps),
        n_trials=20  # Number of trials
    )

    # Print the best trial results.
    # print("Best trial:")
    # trial = study.best_trial
    # print(f"  Value: {trial.value}")
    # print("  Params: ")
    # for key, value in trial.params.items():
    #     print(f"    {key}: {value}")
    print("Best hyperparameters: {}".format(study.best_params))
    print("Best accuracy: {:.4f}".format(study.best_value))



if __name__ == "__main__":
    main(**parse_args())


[I 2025-04-22 00:48:36,098] A new study created in memory with name: no-name-908ca96b-cd8a-4441-99d8-18e2d36f1389


[Info]: Use cuda now!


  mel = torch.load(os.path.join(self.data_dir, feat_path))


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 142.88 step/s, accuracy=0.12, loss=4.32, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 149.28 step/s, accuracy=0.25, loss=3.80, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 145.13 step/s, accuracy=0.06, loss=4.49, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 152.92 step/s, accuracy=0.09, loss=4.43, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 146.55 step/s, accuracy=0.22, loss=4.19, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:12<00:00, 155.84 step/s, accuracy=0.16, loss=4.20, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 148.65 step/s, accuracy=0.06, loss=4.49, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 143.11 step/s, accuracy=0.19, loss=4.53, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 150.23 step/s, accuracy=0.22, loss=4.18, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 147.48 step/s, accuracy=0.16, loss=4.33, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 143.90 step/s, accuracy=0.16, loss=4.32, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:14<00:00, 142.68 step/s, accuracy=0.19, loss=3.92, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 144.83 step/s, accuracy=0.09, loss=4.28, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 144.92 step/s, accuracy=0.16, loss=4.20, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:14<00:00, 140.53 step/s, accuracy=0.31, loss=4.20, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:14<00:00, 142.36 step/s, accuracy=0.03, loss=4.58, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:14<00:00, 142.03 step/s, accuracy=0.16, loss=3.61, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 143.09 step/s, accuracy=0.12, loss=4.44, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:13<00:00, 147.99 step/s, accuracy=0.22, loss=3.94, step=2000]

[Info]: Use cuda now!


[Info]: Finish loading data!


[Info]: Finish creating model!


Train: 100% 2000/2000 [00:14<00:00, 142.02 step/s, accuracy=0.09, loss=4.39, step=2000]

Best hyperparameters: {'d_model': 512, 'nhead': 32}
Best accuracy: 0.5320
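
The best trial pairs `d_model=512` with `nhead=32`. Multi-head attention splits the model width evenly across heads, so this setting leaves each head with 512 / 32 = 16 dimensions, and `torch.nn.MultiheadAttention` rejects combinations where the width is not divisible by the head count. A minimal sketch of that arithmetic is shown here; the helper name `head_dim` is hypothetical and not part of this notebook, and only the `best_params` values are taken from the output above.

```python
def head_dim(d_model: int, nhead: int) -> int:
    """Per-head width when d_model is split evenly across nhead heads."""
    if d_model % nhead != 0:
        # Mirrors the divisibility requirement of multi-head attention.
        raise ValueError(f"d_model={d_model} is not divisible by nhead={nhead}")
    return d_model // nhead


# Values copied from the "Best hyperparameters" line above.
best_params = {"d_model": 512, "nhead": 32}
print(f"per-head dimension: {head_dim(**best_params)}")  # prints "per-head dimension: 16"
```

A check like this could run at the start of each trial so that invalid `d_model`/`nhead` pairs are skipped before any training steps are spent on them.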
