<a href="https://colab.research.google.com/github/358Xin/DL/blob/main/Phoneme_Classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [5]:
!gdown --id '1HPkcmQmFGu-3OknddKIa5dNDsR05lIQR' --output data.zip
!unzip data.zip
!ls 

Downloading...
From: https://drive.google.com/uc?id=1HPkcmQmFGu-3OknddKIa5dNDsR05lIQR
To: /content/data.zip
100% 372M/372M [00:01<00:00, 234MB/s]
Archive:  data.zip
   creating: timit_11/
  inflating: timit_11/train_11.npy   
  inflating: timit_11/test_11.npy    
  inflating: timit_11/train_label_11.npy  
data.zip  sample_data  timit_11


In [21]:
import numpy as np

print("Loading data...")
data_root = './timit_11/'
train_X = np.load(data_root + 'train_11.npy')
train_Y = np.load(data_root + 'train_label_11.npy')
test = np.load(data_root + 'test_11.npy')

print("Size of training data: {}".format(train_X.shape))
print("Size of testing data: {}".format(test.shape))

Loading data...
Size of training data: (1229932, 429)
Size of testing data: (451552, 429)
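
The feature dimension of 429 factors as 11 × 39, and the `_11` suffix in the file names suggests 11 stacked frames per sample. The notebook does not state this, so the figure of 39 features per frame and the frame-major layout are both assumptions. Under them, each row can be viewed as an (11, 39) context window. A minimal sketch on a synthetic array standing in for `train_X`:

```python
import numpy as np

# Assumption: each 429-dim vector is 11 consecutive frames x 39 features,
# stored frame by frame. Neither number is confirmed by the notebook.
n_frames, n_feats = 11, 39

# Synthetic stand-in for a few rows of train_X (the real file is not loaded here).
fake_X = np.arange(3 * n_frames * n_feats, dtype=np.float32).reshape(3, n_frames * n_feats)

# View every sample as a (frames, features) window.
windows = fake_X.reshape(-1, n_frames, n_feats)
center = windows[:, n_frames // 2, :]  # the middle frame of each window

print(windows.shape, center.shape)
```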


In [22]:
import torch
from torch.utils.data import Dataset

class TIMITDataset(Dataset):
  def __init__(self, X, y=None):
    self.data = torch.from_numpy(X).float()
    if y is not None:
      y = y.astype(np.int64)  # np.int was deprecated in NumPy 1.20 and removed in 1.24
      self.label = torch.LongTensor(y)
    else:
      self.label = None
  
  def __getitem__(self, index):
    if self.label is not None:
      return self.data[index], self.label[index]
    else:
      return self.data[index]
  
  def __len__(self):
    return len(self.data)

Split the training data into a training set and a validation set; `valid_ratio` sets the fraction held out for validation.

In [23]:
valid_ratio = 0.2
percent = int(train_X.shape[0] * (1-valid_ratio))   # row index at which to split
train_x, train_y, valid_x, valid_y = train_X[:percent], train_Y[:percent], train_X[percent:], train_Y[percent:]
print("Size of training set: {}".format(train_x.shape))
print("Size of validation set: {}".format(valid_x.shape))

Size of training set: (983945, 429)
Size of validation set: (245987, 429)
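
The split sizes above follow directly from truncating `n_samples * (1 - valid_ratio)` to an integer. A small sketch of that arithmetic (the helper name `split_sizes` is hypothetical, not part of the notebook):

```python
def split_sizes(n_samples, valid_ratio):
    """Return (n_train, n_valid) for a sequential hold-out split."""
    n_train = int(n_samples * (1 - valid_ratio))  # truncates toward zero
    return n_train, n_samples - n_train

n_train, n_valid = split_sizes(1229932, 0.2)
print(n_train, n_valid)  # 983945 245987
```

Because the slice is sequential, the validation set is simply the last 20% of the rows; nothing is shuffled before splitting.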


With the Dataset defined, create the DataLoaders.

In [24]:
batch_size = 64
from torch.utils.data import DataLoader

train_set = TIMITDataset(train_x, train_y)
valid_set = TIMITDataset(valid_x, valid_y)

train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=False)

Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
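
To see what a `DataLoader` yields, here is a minimal sketch on synthetic data. It uses `TensorDataset` as a stand-in for `TIMITDataset`, and the sample count and batch size are made up for illustration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten synthetic samples with 429 features and integer labels in [0, 39).
X = torch.randn(10, 429)
y = torch.randint(0, 39, (10,))
loader = DataLoader(TensorDataset(X, y), batch_size=4, shuffle=False)

batch_shapes = [(xb.shape, yb.shape) for xb, yb in loader]
for xs, ys in batch_shapes:
    print(tuple(xs), tuple(ys))
# The last batch holds the 2 leftover samples because drop_last defaults to False.
```

`len(loader)` counts batches, not samples, which matters later when the training loop averages the loss.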
  


Run garbage collection here, since RAM and disk usage are fairly high by this point.

In [25]:
import gc
del train_X, train_Y, train_x, train_y, valid_x, valid_y
gc.collect()

2236

Build the model

In [26]:
import torch
import torch.nn as nn

class Classifier(nn.Module):
  def __init__(self):
    super(Classifier, self).__init__()
    self.layer1 = nn.Linear(429, 1024)
    self.layer2 = nn.Linear(1024, 512)
    self.layer3 = nn.Linear(512, 128)
    self.out = nn.Linear(128, 39)
    self.dropout = nn.Dropout(0.15)
    self.activation = nn.ReLU()
  
  def forward(self, x):
    x = self.layer1(x)
    x = self.dropout(x)
    x = self.activation(x)

    x = self.layer2(x)
    x = self.dropout(x)
    x = self.activation(x)

    x = self.layer3(x)
    x = self.dropout(x)
    x = self.activation(x)

    x = self.out(x)

    return x
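
As a rough size check, the number of trainable parameters can be worked out from the four `nn.Linear` shapes alone, since `nn.Dropout` and `nn.ReLU` have none. A sketch of the arithmetic:

```python
# (in_features, out_features) for layer1, layer2, layer3, out
layer_shapes = [(429, 1024), (1024, 512), (512, 128), (128, 39)]

# Each nn.Linear holds a weight matrix (in * out) plus a bias vector (out).
per_layer = [n_in * n_out + n_out for n_in, n_out in layer_shapes]
total = sum(per_layer)

print(per_layer)  # [440320, 524800, 65664, 5031]
print(total)      # 1035815
```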

Start training

In [27]:
def get_device():
  return 'cuda' if torch.cuda.is_available() else 'cpu'

In [28]:
def same_seeds(seed):
  torch.backends.cudnn.deterministic = True
  torch.backends.cudnn.benchmark = False
  np.random.seed(seed)
  torch.manual_seed(seed)
  if torch.cuda.is_available():
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
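
The effect of seeding can be checked on the CPU generator: resetting the seed replays the same random draws. A minimal sketch (it covers only `torch.manual_seed`, not the cuDNN flags):

```python
import torch

torch.manual_seed(0)
first = torch.rand(3)

torch.manual_seed(0)   # reset the generator to the same state
second = torch.rand(3)

print(torch.equal(first, second))  # True
```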

In [29]:
same_seeds(0)
device = get_device()
print(f"Device: {device}")
num_epochs = 20
learning_rate = 0.0001
weight_decay = 0.001
model_path = './model.ckpt'

model = Classifier().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

Device: cuda
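
`nn.CrossEntropyLoss` takes raw logits and integer class indices, which is why `Classifier.forward` ends with a plain linear layer and no softmax. For a single sample the loss is log-softmax followed by negative log-likelihood. A pure-Python sketch of that computation, with made-up logits:

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of one sample from raw logits (log-softmax + NLL)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

loss = cross_entropy([2.0, 1.0, 0.1], label=0)
print(round(loss, 3))  # 0.417
```

With its default `reduction='mean'`, the PyTorch criterion averages this value over the batch.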


In [31]:
best_acc = 0.0

for epoch in range(num_epochs):
  train_acc = 0.0
  train_loss = 0.0
  valid_acc = 0.0
  valid_loss = 0.0

  #train
  model.train()
  for inputs, labels in train_loader:
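    inputs, labels = inputs.to(device), labels.to(device)   # move the batch to the device
    optimizer.zero_grad()                     # reset gradients from the previous step
    outputs = model(inputs)
    loss = criterion(outputs, labels)             # compute the loss
    _, train_pred = torch.max(outputs, 1)           # index of the highest-scoring class
    loss.backward()                        # backpropagate the loss
    optimizer.step()                       # update the parameters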

    train_acc += (train_pred.cpu() == labels.cpu()).sum().item()
    train_loss += loss.item()
  
  #validation
  if len(valid_set) > 0:
    model.eval()
    with torch.no_grad():
      for inputs, labels in valid_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        _, valid_pred = torch.max(outputs, 1)

        valid_acc += (valid_pred.cpu() == labels.cpu()).sum().item()
        valid_loss += loss.item()
      print('[{:2d}/{:2d}] Train Acc: {:3.6f} Loss: {:3.6f} | Val Acc: {:3.6f} loss: {:3.6f}'.format(epoch + 1, num_epochs, train_acc/len(train_set), train_loss/len(train_loader), valid_acc/len(valid_set), valid_loss/len(valid_loader)))

      # Save a checkpoint whenever validation accuracy improves
      if valid_acc > best_acc:
        best_acc = valid_acc
        torch.save(model.state_dict(), model_path)
        print("Saving the model with acc {:.3f}".format(best_acc/len(valid_set)))
  else:
    # No validation set: report training metrics only
    print('[{:2d}/{:2d}] Train Acc: {:3.6f} Loss: {:3.6f}'.format(epoch + 1, num_epochs, train_acc/len(train_set), train_loss/len(train_loader)))
# Without a validation set, save the weights from the last epoch
if len(valid_set) == 0:
  torch.save(model.state_dict(), model_path)
  print("Saving model at last epoch")

[ 1/20] Train Acc: 0.576637 Loss: 1.389694 | Val Acc: 0.658356 loss: 1.097302
Saving the model with acc 0.658
[ 2/20] Train Acc: 0.639295 Loss: 1.154364 | Val Acc: 0.682260 loss: 1.008956
Saving the model with acc 0.682
[ 3/20] Train Acc: 0.656807 Loss: 1.089702 | Val Acc: 0.687874 loss: 0.979401
Saving the model with acc 0.688
[ 4/20] Train Acc: 0.667187 Loss: 1.051945 | Val Acc: 0.695561 loss: 0.947114
Saving the model with acc 0.696
[ 5/20] Train Acc: 0.674732 Loss: 1.026505 | Val Acc: 0.700317 loss: 0.933694
Saving the model with acc 0.700
[ 6/20] Train Acc: 0.679900 Loss: 1.007311 | Val Acc: 0.702749 loss: 0.924192
Saving the model with acc 0.703
[ 7/20] Train Acc: 0.682953 Loss: 0.994167 | Val Acc: 0.707911 loss: 0.913820
Saving the model with acc 0.708
[ 8/20] Train Acc: 0.686179 Loss: 0.983410 | Val Acc: 0.707017 loss: 0.908492
[ 9/20] Train Acc: 0.688892 Loss: 0.974500 | Val Acc: 0.710159 loss: 0.902964
Saving the model with acc 0.710
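
The accuracy figures in the log come from `torch.max(outputs, 1)`, which returns both the maximum values and their indices along the class dimension; the indices are the predicted classes. A minimal sketch with made-up logits and labels:

```python
import torch

outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [3.0, 0.2, 0.5]])   # logits for 2 samples, 3 classes
labels = torch.tensor([1, 2])

values, pred = torch.max(outputs, 1)   # max over the class dimension
correct = (pred == labels).sum().item()

print(pred.tolist(), correct)  # [1, 0] 1
```

In the loop, the running count of correct predictions is divided by the number of samples (`len(train_set)`), while the summed per-batch loss is divided by the number of batches (`len(train_loader)`).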

Testing

In [32]:
test_set = TIMITDataset(test, None)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
model = Classifier().to(device)
model.load_state_dict(torch.load(model_path))

<All keys matched successfully>

In [33]:
predict = []
model.eval()
with torch.no_grad():
  for inputs in test_loader:
    inputs = inputs.to(device)
    outputs = model(inputs)
    _, test_pred = torch.max(outputs, 1)

    for y in test_pred.cpu().numpy():
      predict.append(y)
with open('prediction.csv',"w") as f:
  f.write("id,class\n")  # no space after the comma, matching the data rows
  for i, y in enumerate(predict):
    f.write("{},{}\n".format(i, y))
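
As an alternative sketch, the same file layout can be produced with the standard `csv` module, which handles delimiters consistently. The values in `predict` below are hypothetical, and an in-memory buffer replaces `prediction.csv` so nothing is written to disk:

```python
import csv
import io

predict = [3, 17, 0]   # hypothetical class indices standing in for model output

buffer = io.StringIO()  # in-memory file so the sketch writes nothing to disk
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["id", "class"])
writer.writerows(enumerate(predict))  # each row is (index, predicted class)

print(buffer.getvalue())
```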