# Multiclass classification

After the first neural network has classified a seam as poor quality, the type of defect has to be determined. I use a second neural network for that.

## STEP 1. Import data and libraries

In [36]:
import numpy as np
import pandas as pd
import zipfile
from matplotlib import pyplot as plt
import shutil 
from tqdm import tqdm
import torch
import torchvision
import time
import copy
from torchvision import transforms, models
import os
import torch.nn as nn
data_root = r"data\data_root\al5083\train"
print('Success!')

Success!


## STEP 2. Data markup

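I have a JSON file with the markup, which I need to parse. I replace the numeric labels with class names for multiclass classification and drop the good welds, since this network only deals with seams that have a defect.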

In [6]:
js = os.path.join(data_root, r"train.json")
# Read the JSON as a Series: the index holds image paths, the values hold integer labels
labels = pd.read_json(js, typ='series')
labels = labels.to_frame().reset_index().rename(columns={'index':'path',0:'class'})
# Replace numeric labels with readable class names
labels['class'] = labels['class'].replace({0:'good_weld',
                                           1:'burn_through',
                                           2:'contamination',
                                           3:'lack_of_fusion',
                                           4:'misalignment',
                                           5:'lack_of_penetration'})
labels = labels.sort_values(by='class').reset_index(drop=True)
# Keep only defective welds
labels = labels.loc[labels['class'] != 'good_weld']
classes = labels['class'].unique()
labels

| | path | class |
|---|---|---|
| 0 | 170906-113317-Al 2mm-part3/frame_00647.png | burn_through |
| 1 | 170906-144958-Al 2mm/frame_01521.png | burn_through |
| 2 | 170906-144958-Al 2mm/frame_01128.png | burn_through |
| 3 | 170906-144958-Al 2mm/frame_01144.png | burn_through |
| 4 | 170906-144958-Al 2mm/frame_01239.png | burn_through |
| ... | ... | ... |
| 26661 | 170904-113012-Al 2mm-part2/frame_00829.png | misalignment |
| 26662 | 170904-113012-Al 2mm-part2/frame_00686.png | misalignment |
| 26663 | 170904-113012-Al 2mm-part2/frame_00550.png | misalignment |
| 26664 | 170904-113012-Al 2mm-part2/frame_00660.png | misalignment |
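
To make the parsing step concrete, here is a minimal sketch of the same logic in plain Python. It assumes `train.json` is a flat object that maps each image path to an integer label, which is what reading it with `typ='series'` implies. The sample paths and the names `CLASS_NAMES` and `parse_defects` are made up for illustration.

```python
import json

# Label mapping taken from the cell above
CLASS_NAMES = {0: 'good_weld', 1: 'burn_through', 2: 'contamination',
               3: 'lack_of_fusion', 4: 'misalignment', 5: 'lack_of_penetration'}

def parse_defects(json_text):
    """Return (path, class_name) pairs for defective welds only, sorted by class."""
    raw = json.loads(json_text)  # assumed shape: {"path": int_label, ...}
    rows = [(path, CLASS_NAMES[label]) for path, label in raw.items()]
    rows = [row for row in rows if row[1] != 'good_weld']
    return sorted(rows, key=lambda row: row[1])

# Hypothetical sample, not real entries from the dataset
sample = '{"run_a/frame_00001.png": 0, "run_b/frame_00002.png": 4, "run_c/frame_00003.png": 1}'
defects = parse_defects(sample)
print(defects)
```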


## STEP 3. Split the data into training and validation data

As in the binary classification, I split the data 5 to 1 between training and validation.

In [7]:
many_classes_root = r"data\multiclass_train"
train_dir = os.path.join(many_classes_root,'train')
val_dir = os.path.join(many_classes_root,'val')
for dir_name in [train_dir, val_dir]:  # iterate over the two split directories
    for class_name in classes:  # for each class name in the list of classes
        os.makedirs(os.path.join(dir_name, class_name), exist_ok=True)
print('Success!')

Success!


One problem I ran into: when files are copied into a shared directory, files with the same name overwrite each other, so they have to be renamed.

In [8]:
for class_name in classes:
    # Paths of all frames that belong to the current class
    class_paths = labels['path'].loc[labels['class'] == class_name].tolist()
    for i, file_name in enumerate(tqdm(class_paths)):
        # Unique name per class; the bytes are copied unchanged, so the files stay PNG-encoded despite the .jpg suffix
        pic_name = str(class_name) + "_" + str(i) + '.jpg'
        # Every sixth image goes to validation, the rest to training
        dest_dir = val_dir if i % 6 == 0 else train_dir
        shutil.copy(os.path.join(data_root, file_name), os.path.join(dest_dir, class_name, pic_name))
print('Images sorted into train and val!')

100%|██████████| 1783/1783 [00:08<00:00, 204.76it/s]
100%|██████████| 6325/6325 [00:32<00:00, 191.92it/s]
100%|██████████| 4028/4028 [00:21<00:00, 188.42it/s]
100%|██████████| 2819/2819 [00:14<00:00, 196.06it/s]
100%|██████████| 2953/2953 [00:13<00:00, 213.78it/s]

Images sorted into train and val!
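
An alternative to a running counter, sketched here as an assumption rather than what the notebook does, is to fold the source folder name into the file name. Two frames with the same name from different recordings then end up with different names, and the original `.png` extension is kept. The helper `unique_name` is hypothetical, and the second path below is invented to show the collision case.

```python
from pathlib import PurePosixPath

def unique_name(relative_path):
    """Flatten 'folder/frame.png' into a single file name that includes the folder."""
    path = PurePosixPath(relative_path)
    # Replace spaces so the result is one shell-friendly name
    parts = [part.replace(' ', '_') for part in path.parts]
    return '__'.join(parts)

name_a = unique_name('170906-144958-Al 2mm/frame_01521.png')
name_b = unique_name('170904-113012-Al 2mm-part2/frame_01521.png')  # invented same-named frame
print(name_a)
print(name_b)
```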





Check that the files were split correctly:

In [15]:
lst_train = []
lst_val = []
for class_name in classes:  # renamed from `type`, which shadows a Python builtin
    items_train = os.listdir(os.path.join(train_dir, class_name))
    items_val = os.listdir(os.path.join(val_dir, class_name))
    lst_train.append(len(items_train))
    lst_val.append(len(items_val))
print('Total images in use: ', sum(lst_train) + sum(lst_val))
print('Total for training: ', sum(lst_train))
print('Total for validation: ', sum(lst_val))
print('Ratio of training to validation data: ', sum(lst_train) / sum(lst_val))
for cls in classes:
    print(cls, len(os.listdir(os.path.join(many_classes_root,'train',cls))))

Total images in use:  17908
Total for training:  14920
Total for validation:  2988
Ratio of training to validation data:  4.99330655957162
burn_through 1485
contamination 5270
lack_of_fusion 3356
lack_of_penetration 2349
misalignment 2460
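
The ratio comes out slightly below 5 because index 0 of every class goes to validation, so each class contributes `ceil(n / 6)` validation images. A short sketch of that arithmetic, using the per-class totals shown in the progress bars; the function name `split_counts` is mine:

```python
import math

# Images per class, read from the tqdm progress bars of the copy step
class_totals = {'burn_through': 1783, 'contamination': 6325, 'lack_of_fusion': 4028,
                'lack_of_penetration': 2819, 'misalignment': 2953}

def split_counts(n_images, every=6):
    """Indices 0, 6, 12, ... go to validation; everything else goes to training."""
    n_val = math.ceil(n_images / every)
    return n_images - n_val, n_val

train_total = sum(split_counts(n)[0] for n in class_totals.values())
val_total = sum(split_counts(n)[1] for n in class_totals.values())
print(train_total, val_total, train_total / val_total)
```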


## STEP 4. Transform

Everything here is similar to the binary classification. This time, however, I balance the data with a weighted loss: each class gets its own weight in the loss function. This is another way to deal with class imbalance, and it is better suited to cases with many classes.
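
As a rough illustration of the idea, inverse-frequency weights can be computed in plain Python: each class weight is proportional to one over its image count, then the weights are normalized to sum to 1. The counts are the per-class training totals printed at the end of STEP 3; the function name `inverse_frequency_weights` is an assumption of this sketch.

```python
# Training images per class (counts from the check at the end of STEP 3)
train_counts = {'burn_through': 1485, 'contamination': 5270, 'lack_of_fusion': 3356,
                'lack_of_penetration': 2349, 'misalignment': 2460}

def inverse_frequency_weights(counts):
    """Weight each class by 1 / count, then normalize so the weights sum to 1."""
    inverse = {name: 1.0 / n for name, n in counts.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}

weights = inverse_frequency_weights(train_counts)
for name, weight in weights.items():
    print(f'{name}: {weight:.3f}')
```

The rarest class (`burn_through`) gets the largest weight, so its errors count for more in the loss.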

In [16]:
val_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(), 
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
val_dataset = torchvision.datasets.ImageFolder(val_dir, val_transforms)

In [17]:
train_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(), 
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 
])
train_dataset = torchvision.datasets.ImageFolder(train_dir, train_transforms)

In [18]:
print('Training images: ', len(train_dataset))
print('Validation images: ', len(val_dataset))

Training images:  14920
Validation images:  2988


In [19]:
len(train_dataset) / len(val_dataset)

4.99330655957162

## STEP 5. Define the architecture of the model

Based on the results of the earlier study, I use a pretrained DenseNet121 for this task.

In [37]:
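model = models.densenet121(pretrained=True)
num_features = model.classifier.in_features
# One output logit per defect class. No Sigmoid is added: nn.CrossEntropyLoss expects raw logits
model.classifier = nn.Linear(num_features, len(classes))
# Fall back to the CPU when no GPU is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = model.to(device)
print("Success")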

Success
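
For a five-class problem trained with `nn.CrossEntropyLoss`, the classifier head is expected to produce one raw score (logit) per class with no activation on top, because the loss applies log-softmax internally. Below is a minimal CPU-only sketch with random inputs. The feature size 1024 matches the input of DenseNet121's classifier; the batch of 4 and the target values are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_features, num_classes = 1024, 5  # 1024 = in_features of DenseNet121's classifier
head = nn.Linear(num_features, num_classes)  # stand-in for model.classifier

features = torch.randn(4, num_features)  # fake batch of 4 feature vectors
targets = torch.tensor([0, 2, 4, 1])     # integer class indices, as ImageFolder yields
logits = head(features)                  # shape (4, 5): one raw score per class
loss = nn.CrossEntropyLoss()(logits, targets)
print(logits.shape, loss.item())
```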


## STEP 6. Define hyperparameters

In [40]:
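batch_size = 16
# Training images per class, in ImageFolder's alphabetical class order
class_weights = torch.tensor([1485, 5270, 3356, 2349, 2460], dtype=torch.float)
class_weights = 1 / class_weights  # rarer classes get larger weights
class_weights = class_weights / torch.sum(class_weights)  # normalize so the weights sum to 1
# The weight tensor must be on the same device as the model outputs
criterion = nn.CrossEntropyLoss(weight=class_weights.to(device))
learning_rate = 0.001
optimizer = torch.optim.Adagrad(model.parameters(), lr=learning_rate)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)
num_epochs = 50
print("Success!")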

Success!


Now I can create the dataloaders with the chosen batch_size, shuffle the training data at every epoch, and start training the model.

In [39]:
train_dataloader = torch.utils.data.DataLoader(
    train_dataset, batch_size=batch_size, shuffle=True, num_workers=2)
val_dataloader = torch.utils.data.DataLoader(
    val_dataset, batch_size=batch_size, shuffle=False, num_workers=2) 
print("Success!")
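
To show what a dataloader yields, here is a small sketch with a synthetic dataset in place of `ImageFolder`; the tensor shapes and sizes are made up. With 10 samples and a batch size of 4, the loader produces three batches and the last one is smaller. By the same arithmetic, 14920 training images with a batch size of 16 give 933 batches per epoch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten fake "images" of shape 3x8x8 with integer labels, standing in for ImageFolder
dataset = TensorDataset(torch.randn(10, 3, 8, 8), torch.arange(10) % 5)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Each iteration yields one (inputs, targets) pair; the order changes every epoch
batch_sizes = [inputs.shape[0] for inputs, targets in loader]
print(len(loader), batch_sizes)
```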

## STEP 7. Fit the model

I assume that if the validation accuracy has not improved for 5 epochs in a row, it is unlikely to increase significantly, so training stops.

In [33]:
best_accuracy = 0
no_improvement_count = 0
patience = 5
for epoch in range(num_epochs):
    # Train on the training data
    model.train()
    # `targets` is used instead of `labels` so the labels DataFrame is not overwritten
    for inputs, targets in tqdm(train_dataloader):
        inputs = inputs.to(device)
        targets = targets.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
    scheduler.step()  # apply the StepLR schedule once per epoch

    # Evaluate accuracy on the validation data
    model.eval()
    with torch.no_grad():
        total = 0
        correct = 0
        for inputs, targets in tqdm(val_dataloader):
            inputs = inputs.to(device)
            targets = targets.to(device)
            outputs = model(inputs)
            predicted = outputs.argmax(dim=1)
            total += targets.size(0)
            correct += (predicted == targets).sum().item()
        accuracy = correct / total

    # Check whether accuracy improved in this epoch
    if accuracy > best_accuracy:
        best_accuracy = accuracy
        # Take a snapshot; a plain reference would keep changing as training continues
        best_model = copy.deepcopy(model)
        no_improvement_count = 0
        print('Model improved!')
    else:
        no_improvement_count += 1
        print('Model did not improve, training continues')

    # Stop training if accuracy has not improved for `patience` epochs
    if no_improvement_count == patience:
        print('No improvement for {} epochs. Stopping training.'.format(patience))
        break

    # Print information about the current epoch
    print('Epoch {}/{}:'.format(epoch + 1, num_epochs), flush=True)
    print('Training Loss: {:.4f}'.format(loss.item()), flush=True)
    print('Validation Accuracy: {:.4f}'.format(accuracy), flush=True)
print('Success!')

Success!
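
The stopping rule can be checked in isolation. The sketch below replays the same patience logic over a list of made-up validation accuracies and reports the epoch at which training would stop; the function name `stopping_epoch` and the numbers are illustrative only.

```python
def stopping_epoch(accuracies, patience=5):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best_accuracy = 0
    no_improvement_count = 0
    for epoch, accuracy in enumerate(accuracies, start=1):
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            no_improvement_count = 0
        else:
            no_improvement_count += 1
        if no_improvement_count == patience:
            return epoch
    return None

# Made-up history: the best value is reached at epoch 3 and only matched, never beaten
history = [0.70, 0.80, 0.85, 0.84, 0.85, 0.83, 0.82, 0.81, 0.90]
print(stopping_epoch(history))
```

Note that an epoch that only matches the best accuracy counts as no improvement, so training here stops at epoch 8 and never sees the better value at epoch 9.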


## STEP 8. Save the model

In [34]:
# torch.save(best_model, 'desnet169_adagrad_5classes.pth')
print('Success!')

Success!
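
The commented-out call stores the whole model object. A common alternative, shown here only as a sketch, is to save the `state_dict` and load it into a freshly built model of the same architecture. A small linear layer stands in for the trained network, and the file name is hypothetical.

```python
import os
import tempfile
import torch
import torch.nn as nn

model = nn.Linear(8, 5)  # small stand-in for the trained network
path = os.path.join(tempfile.mkdtemp(), 'example_weights.pth')  # hypothetical file name

torch.save(model.state_dict(), path)        # store only the parameters
restored = nn.Linear(8, 5)                  # the architecture must be rebuilt first
restored.load_state_dict(torch.load(path))  # then the saved weights are loaded into it
print(torch.equal(model.weight, restored.weight))
```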
