# **Analyze the Problem**

The earlier paper "Facial Expression Recognition Methods in the Wild Based on Fusion Feature of Attention Mechanism and LBP" (MDPI Sensors, 2023, Q2) uses a model that combines ResNet-18, CBAM, and LBP.

The reported accuracy is:
- 99.66% on CK+
- 74.23% on FER-2013
- 89.50% on FER-PLUS
- 88.20% on RAF-DB

**The problem: if I change the LBP variant used in this method, does accuracy improve, specifically on the FER-2013 dataset?**

# **About Dataset**

- The paper uses four datasets: CK+ (59 images), FER-2013 (35,887 images), FER-PLUS (31,412 images), and RAF-DB (29,672 images).
- CK+ is a controlled dataset; the other three are uncontrolled (in-the-wild) datasets.
- A controlled dataset has good lighting and consistent poses, whereas in uncontrolled datasets both vary randomly.

# **Method**

## Paper's Method

- **ResNet-18 + CBAM**

    1. Architecture of ResNet-18:
    
        <img src="../src/Structure-of-a-ResNet-18-architecture.png" alt="Structure of the ResNet-18 architecture" width="350" height="250">

    2. CBAM architecture:

        <img src="../src/cbam.png" alt="CBAM architecture" width="600" height="100">

    3. The authors insert the CBAM module into each block of the ResNet-18 architecture.
    
        Residual block before and after adding the CBAM module:

        <img src="../src/oriblock.png" alt="Original residual block" width="150" height="200">
        <img src="../src/blocknCbam.png" alt="Residual block with CBAM" width="150" height="200">

- **Local Binary Patterns (LBP)**

    LBP is one of the most widely used texture descriptors in image processing. It captures local grain (texture) features by comparing each pixel with its neighbors, and it is regarded as one of the most effective methods for texture analysis.

- **RCL-Net Model**

    Combining the LBP branch with the ResNet-CBAM branch gives the architecture that the authors call the **RCL-Net Model**:

    <img src="../src/rcl-net.png" alt="RCL-Net architecture" width="700" height="200">



## Proposed Method

- Plan: replace the LBP method in the architecture above with a newer or different LBP variant.
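
For reference when comparing variants, the basic 3x3 LBP operator can be sketched in plain NumPy. This is a minimal illustration and not the paper's implementation: the function name `lbp_3x3`, the clockwise bit ordering, and the `>=` comparison are assumptions made here, and border pixels are simply dropped.

```python
import numpy as np

def lbp_3x3(image):
    """Basic 8-neighbor LBP codes for the interior pixels of a 2-D array."""
    image = np.asarray(image)
    h, w = image.shape
    center = image[1:-1, 1:-1]
    # Neighbor offsets (dy, dx), clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbor = image[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # Set this bit wherever the neighbor is at least as bright as the center.
        codes += (neighbor >= center).astype(np.int64) << bit
    return codes

patch = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
print(lbp_3x3(patch))
```

Variants such as circular, rotation-invariant, or uniform LBP change how the neighbors are sampled or how the resulting bit pattern is mapped to a code.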

# **IMPORT LIBRARIES**

In [1]:
import pandas as pd
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from tqdm import tqdm
import seaborn as sns

# for augmentation:
import albumentations as A
from albumentations.pytorch import ToTensorV2
import torchvision.transforms as T

# for mixup augmentation:
import torch
import random

# mixup from train.py file
import argparse
import os
import sys

# LBP
import cv2
import torch.nn as nn
import math

from torch.utils.data import Dataset, DataLoader

import torch.optim as optim
import mlflow
import mlflow.pytorch

from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix
from skimage.feature import local_binary_pattern

from torchvision.models import resnet18
import torch.nn.functional as F



In [2]:
print(torch.cuda.is_available())

True


# **PREPROCESSING**

In [3]:
df_combined_train = pd.read_csv('afterAugTrain.csv')
df_public_test = pd.read_csv('afterAugTest.csv')

- Preview the dataframes:

In [4]:
# Preview of the original + augmented training dataframe after normalization
df_combined_train.head()

Unnamed: 0,emotion,usage,resized_pixels
0,0,Training,0.53333336 0.5058824 0.4509804 0.39215687 0.34...
1,0,Training,0.09803922 0.09019608 0.07058824 0.06666667 0....
2,0,Training,0.8980392 0.8862745 0.87058824 0.84705883 0.83...
3,0,Training,0.8627451 0.8745098 0.88235295 0.8980392 0.878...
4,0,Training,0.2784314 0.30980393 0.3647059 0.42352942 0.42...


In [5]:
# Preview of the df_public_test dataframe after normalization
df_public_test.head()

Unnamed: 0,emotion,usage,resized_pixels
0,0,PublicTest,0.00390619 0.00390619 0.00390619 0.00390619 0....
1,1,PublicTest,0.00236832 0.002475971 0.002675894 0.002906574...
2,4,PublicTest,0.00092272204 0.0012302961 0.0016916571 0.0017...
3,6,PublicTest,0.0032602844 0.0031372549 0.0029680892 0.00316...
4,3,PublicTest,0.0013225683 0.0013071896 0.0012610535 0.00121...


Column `emotion`: this is the class label. If it is initially of type object, it can be converted to a numeric type with scikit-learn's LabelEncoder:

In [6]:
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
df_combined_train['emotion'] = le.fit_transform(df_combined_train['emotion'])

Column `resized_pixels`: this column holds strings that represent arrays (its current dtype is object), so it needs to be converted to NumPy arrays:

In [7]:
# Convert the pixel strings of the training data to NumPy arrays
df_combined_train['resized_pixels'] = df_combined_train['resized_pixels'].apply(
    lambda x: np.fromstring(x, sep=' ').astype(np.float32).reshape(100, 100)
)

In [8]:
df_combined_train.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6104 entries, 0 to 6103
Data columns (total 3 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   emotion         6104 non-null   int64 
 1   usage           3052 non-null   object
 2   resized_pixels  6104 non-null   object
dtypes: int64(1), object(2)
memory usage: 143.2+ KB


In [9]:
# Convert the pixel strings of the test data to NumPy arrays
df_public_test.loc[:, 'resized_pixels'] = df_public_test['resized_pixels'].apply(
    lambda x: np.fromstring(x, sep=' ').astype(np.float32).reshape(100, 100)
)

In [10]:
# Make sure every entry has size 100x100
print(df_combined_train['resized_pixels'].apply(lambda x: x.shape == (100, 100)).all())
print(df_public_test['resized_pixels'].apply(lambda x: x.shape == (100, 100)).all())

True
True


In [11]:
# Show the training dataframe after normalization
df_combined_train.head()

Unnamed: 0,emotion,usage,resized_pixels
0,0,Training,"[[0.53333336, 0.5058824, 0.4509804, 0.39215687..."
1,0,Training,"[[0.09803922, 0.09019608, 0.07058824, 0.066666..."
2,0,Training,"[[0.8980392, 0.8862745, 0.87058824, 0.84705883..."
3,0,Training,"[[0.8627451, 0.8745098, 0.88235295, 0.8980392,..."
4,0,Training,"[[0.2784314, 0.30980393, 0.3647059, 0.42352942..."


In [12]:
# Show the test dataframe after normalization
df_public_test.head()

Unnamed: 0,emotion,usage,resized_pixels
0,0,PublicTest,"[[0.00390619, 0.00390619, 0.00390619, 0.003906..."
1,1,PublicTest,"[[0.00236832, 0.002475971, 0.002675894, 0.0029..."
2,4,PublicTest,"[[0.00092272204, 0.0012302961, 0.0016916571, 0..."
3,6,PublicTest,"[[0.0032602844, 0.0031372549, 0.0029680892, 0...."
4,3,PublicTest,"[[0.0013225683, 0.0013071896, 0.0012610535, 0...."
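
The two previews suggest different value ranges: the training pixels lie in roughly [0, 1], while the test pixels shown are all at or below 1/255 ≈ 0.0039, as if they had been divided by 255 twice. If that is the case, the `(image_np * 255).astype(np.uint8)` step used for LBP later in this notebook would map nearly every test pixel to 0. A minimal sketch of a range check follows; `looks_double_normalized` and `restore_unit_range` are hypothetical helper names, and the 1/255 threshold is an assumption that would misfire on a genuinely near-black image.

```python
import numpy as np

def looks_double_normalized(pixels, tol=1e-6):
    """Return True when every value is at most 1/255 (plus a small tolerance)."""
    return float(np.max(pixels)) <= 1.0 / 255.0 + tol

def restore_unit_range(pixels):
    """Rescale suspected double-normalized pixels back to [0, 1]; leave others unchanged."""
    pixels = np.asarray(pixels, dtype=np.float32)
    if looks_double_normalized(pixels):
        return np.clip(pixels * 255.0, 0.0, 1.0)
    return pixels

# Values copied from the first rows of the two previews above.
train_like = np.array([0.53333336, 0.09803922, 0.8980392], dtype=np.float32)
test_like = np.array([0.00390619, 0.00236832, 0.00092272204], dtype=np.float32)

print(looks_double_normalized(train_like))
print(looks_double_normalized(test_like))
print(restore_unit_range(test_like))
```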


In [13]:
df_combined_train.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6104 entries, 0 to 6103
Data columns (total 3 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   emotion         6104 non-null   int64 
 1   usage           3052 non-null   object
 2   resized_pixels  6104 non-null   object
dtypes: int64(1), object(2)
memory usage: 143.2+ KB


# EXPERIMENT: MODEL II WITH SAVED LBP FEATURES

- Preprocess the LBP data

In [14]:
from skimage.feature import local_binary_pattern
import torch
import numpy as np

def lbp_transform(image, P=24, R=3):
    """
    Apply circular LBP to the input image after converting to integer type.

    :param image: A PyTorch tensor of shape [1, H, W], where H and W are height and width.
    :param P: Number of circularly symmetric neighbor set points.
    :param R: Radius of circle.
    :return: LBP-transformed image as a PyTorch tensor of shape [1, H, W].
    """
    # Convert the PyTorch tensor to a NumPy array
    image_np = image.squeeze(0).cpu().numpy()  # Remove channel dimension and convert to NumPy

    # Scale the floating-point image to the [0, 255] range and convert to uint8
    image_np_int = (image_np * 255).astype(np.uint8)
    
    # Apply circular LBP using scikit-image
    lbp_image = local_binary_pattern(image_np_int, P=P, R=R, method='uniform')
    
    # Normalize the LBP output (handle cases where min equals max)
    lbp_min = lbp_image.min()
    lbp_max = lbp_image.max()
    
    if lbp_min != lbp_max:
        lbp_image = (lbp_image - lbp_min) / (lbp_max - lbp_min)  # Normalize to [0, 1]
    else:
        lbp_image = lbp_image - lbp_min  # Set all values to 0 if min equals max (no variation)
    
    # Convert back to PyTorch tensor
    lbp_tensor = torch.tensor(lbp_image, dtype=torch.float32, device=image.device).unsqueeze(0)  # Add channel dimension back
    
    return lbp_tensor
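
Since the proposed method is to swap the LBP variant, it may help to see how the `method` argument of scikit-image's `local_binary_pattern` changes the output. The sketch below is only an illustration under stated assumptions: `lbp_histogram` is a hypothetical helper, the image is random noise rather than a face, and `P=8, R=1` is used instead of the notebook's `P=24, R=3` to keep the `default` and `ror` histograms small.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(image_uint8, P=8, R=1, method="uniform"):
    """Normalized histogram of LBP codes for one grayscale uint8 image."""
    codes = local_binary_pattern(image_uint8, P=P, R=R, method=method)
    # 'uniform' yields codes 0..P+1; 'default' and 'ror' yield codes 0..2**P - 1.
    n_bins = P + 2 if method == "uniform" else 2 ** P
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins))
    return hist / hist.sum()

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(100, 100), dtype=np.uint8)

for method in ("default", "ror", "uniform"):
    hist = lbp_histogram(image, method=method)
    # Number of bins and how many of them are actually populated.
    print(method, hist.shape, int(np.count_nonzero(hist)))
```

With a helper like this, each candidate variant can be compared on the same images before it is wired into the LBP branch.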


In [15]:
from tqdm import tqdm  # Import tqdm for the progress bar

def lbp_preprocess_and_save(df, save_path, P=24, R=3):
    """
    Apply LBP transformation to images in a DataFrame and save the processed data.

    :param df: A DataFrame containing 'resized_pixels' column with image data.
    :param save_path: Path to save the processed LBP features.
    :param P: Number of circularly symmetric neighbor set points for LBP.
    :param R: Radius of circle for LBP.
    """
    processed_data = []
    
    # Initialize tqdm progress bar
    for idx, row in tqdm(df.iterrows(), total=len(df), desc="Processing LBP", ncols=100):
        # Extract image from 'resized_pixels' and convert to PyTorch tensor
        image = torch.tensor(row['resized_pixels'], dtype=torch.float32).unsqueeze(0)  # Add channel dimension

        # Apply the LBP transform
        lbp_image = lbp_transform(image, P=P, R=R)

        # Convert the LBP image back to NumPy array and store it
        processed_data.append(lbp_image.squeeze(0).cpu().numpy())  # Remove channel dimension
    
    # Save the processed data as a NumPy file
    np.save(save_path, np.array(processed_data))
    print(f"LBP features saved to {save_path}")


In [16]:
lbp_preprocess_and_save(df_public_test, "df_public_test")

Processing LBP: 100%|██████████████████████████████████████████| 3589/3589 [00:20<00:00, 172.13it/s]


LBP features saved to df_public_test
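
`np.save` appends `.npy` when the given path lacks that extension, so the call above writes `df_public_test.npy` even though the message prints `df_public_test`. This is why the dataset class later loads `'df_public_test.npy'`. A short sketch that demonstrates the behavior in a temporary directory (the array contents are placeholders):

```python
import os
import tempfile
import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    stem = os.path.join(tmp_dir, "df_public_test")   # no extension, as in the call above
    np.save(stem, np.zeros((2, 100, 100), dtype=np.float32))
    saved_files = sorted(os.listdir(tmp_dir))        # what actually landed on disk
    loaded = np.load(stem + ".npy")                  # load with the appended extension

print(saved_files)
print(loaded.shape)
```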


In [14]:
import torch
import numpy as np

class DualInputDataset(torch.utils.data.Dataset):
    def __init__(self, original_images_df, lbp_data_path):
        self.original_images = original_images_df['resized_pixels'].values  # Assuming 'resized_pixels' contains the images
        self.labels = original_images_df['emotion'].values  # Assuming 'emotion' contains the labels
        self.lbp_data = np.load(lbp_data_path, allow_pickle=True)  # Load the LBP features from npy
        
    def __len__(self):
        return len(self.original_images)
    
    def __getitem__(self, idx):
        # Load original image for ResNet-CBAM branch
        original_image = torch.tensor(self.original_images[idx]).float()  # Convert image to tensor
        original_image = original_image.unsqueeze(0)  # Add channel dimension if needed (e.g., for grayscale)
        
        # Load preprocessed LBP features for LBP branch
        lbp_image = torch.tensor(self.lbp_data[idx]).float()  # Convert LBP features to tensor
        
        # Load label
        label = torch.tensor(self.labels[idx]).long()
        
        return original_image, lbp_image, label

In [15]:
train_dataset = DualInputDataset(df_combined_train, 'df_combined_train.npy')
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)

In [16]:
val_dataset = DualInputDataset(df_public_test, 'df_public_test.npy')
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=64, shuffle=False)  # No shuffling needed for validation

In [17]:
# Take one batch from train_loader
for original_image, lbp_image, label in train_loader:
    lbp_image = lbp_image.unsqueeze(1)  # Add a channel axis so the shape becomes [64, 1, 100, 100]

    print(f"Original image shape: {original_image.shape}")  # Shape of the original images (for ResNet-CBAM)
    print(f"LBP image shape: {lbp_image.shape}")  # Shape of the extracted LBP maps
    print(f"Label shape: {label.shape}")  # Shape of the labels
    
    # Stop after one batch
    break

Original image shape: torch.Size([64, 1, 100, 100])
LBP image shape: torch.Size([64, 1, 100, 100])
Label shape: torch.Size([64])
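
The check above adds the channel axis with `unsqueeze(1)`, but the dataset itself yields LBP batches of shape `[batch, 100, 100]`. `nn.AdaptiveAvgPool2d` also accepts unbatched `(C, H, W)` input, so a 3-D batch is pooled per sample as if the batch axis were the channel axis, and the flattened result is the same either way. A small sketch, assuming random data in place of real LBP maps:

```python
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((16, 16))
lbp_batch = torch.rand(4, 100, 100)            # shape as produced by the dataset
with_channel = lbp_batch.unsqueeze(1)          # shape [4, 1, 100, 100]

flat_3d = pool(lbp_batch).reshape(4, -1)       # batch axis treated as channels
flat_4d = pool(with_channel).reshape(4, -1)    # explicit single channel

print(flat_3d.shape, flat_4d.shape)
print(torch.allclose(flat_3d, flat_4d))
```

Adding the channel axis explicitly in `__getitem__` would make the intent clearer and keep the tensor layout consistent with the image branch.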


In [18]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# Channel Attention Module
class ChannelAttention(nn.Module):
    def __init__(self, in_planes, ratio=16):
        super(ChannelAttention, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)

        self.fc1 = nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False)

        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x))))
        max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x))))
        out = avg_out + max_out
        return self.sigmoid(out)


# Spatial Attention Module
class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super(SpatialAttention, self).__init__()
        assert kernel_size in (3, 7), "kernel size must be 3 or 7"
        padding = 3 if kernel_size == 7 else 1

        self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)
        max_out, _ = torch.max(x, dim=1, keepdim=True)
        x = torch.cat([avg_out, max_out], dim=1)
        x = self.conv1(x)
        return self.sigmoid(x)


# CBAM Block
class CBAM(nn.Module):
    def __init__(self, in_planes, ratio=16, kernel_size=7):
        super(CBAM, self).__init__()
        self.ca = ChannelAttention(in_planes, ratio)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        out = self.ca(x) * x
        out = self.sa(out) * out
        return out

In [19]:
# Load the pre-trained ResNet18 model
model = resnet18(pretrained=True)

# Modify ResNet18 by appending a CBAM module after each stage (conv2_x to conv5_x)
class ResNet18_CBAM_7Classes(nn.Module):
    def __init__(self, original_model, num_classes=7):
        super(ResNet18_CBAM_7Classes, self).__init__()
        
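        # Stem: conv1 is a new, randomly initialized 1-channel 3x3 conv for grayscale input;
        # bn1, relu, and maxpool are reused from the original model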
        self.conv1 = nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1, bias=False) 
        self.bn1 = original_model.bn1
        self.relu = original_model.relu
        self.maxpool = original_model.maxpool
        
        # Append CBAM after each stage (each stage holds two BasicBlocks)
        self.layer1 = nn.Sequential(
            original_model.layer1,
            CBAM(64)  # After conv2
        )
        self.layer2 = nn.Sequential(
            original_model.layer2,
            CBAM(128)  # After conv3
        )
        self.layer3 = nn.Sequential(
            original_model.layer3,
            CBAM(256)  # After conv4
        )
        self.layer4 = nn.Sequential(
            original_model.layer4,
            CBAM(512)  # After conv5
        )
        
        # Original avgpool layer
        self.avgpool = original_model.avgpool
        
        # FC layer keeps a 512-d feature vector; the 7-class head (fc_resnet) lives in RCL_Net,
        # so num_classes is unused here
        self.fc = nn.Linear(512, 512)
    
    def forward(self, x):
        # Forward pass through the modified network
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        
        return x

# Create the modified ResNet18 with CBAM for 7 classes
model = ResNet18_CBAM_7Classes(model, num_classes=7)


print(model.eval())  # Prints the module structure (eval() returns the model itself)




ResNet18_CBAM_7Classes(
  (conv1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, trac
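
The class above appends one CBAM module after each ResNet stage, whereas the method section describes CBAM inside each residual block. A minimal sketch of the in-block variant follows. It assumes attention is applied to the residual branch before the skip addition, and it covers only the stride-1, same-channel case (no downsampling); `CBAMBasicBlock` and the compact attention classes are illustrative names, not the paper's code.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, ratio=16):
        super().__init__()
        # Shared MLP implemented with 1x1 convolutions.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // ratio, 1, bias=False),
            nn.ReLU(),
            nn.Conv2d(channels // ratio, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx = torch.amax(x, dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAMBasicBlock(nn.Module):
    """Residual block with CBAM applied to the residual branch (assumed placement)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.ca(out) * out      # channel attention first
        out = self.sa(out) * out      # then spatial attention
        return self.relu(out + x)     # skip connection after attention

block = CBAMBasicBlock(64).eval()
features = torch.randn(2, 64, 25, 25)
with torch.no_grad():
    refined = block(features)
print(refined.shape)
```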

In [20]:
import torch 
import torch.nn as nn 
import torch.nn.functional as F 
from torchvision.models import resnet18 

class RCL_Net(nn.Module): 
    def __init__(self, original_model, num_classes=7): 
        super(RCL_Net, self).__init__() 
         
        # ResNet-CBAM branch (global feature extraction) 
        self.resnet_cbam = ResNet18_CBAM_7Classes(original_model, num_classes) 
         
        # FC1 for the ResNet-CBAM branch (input 512 → output 7)
        self.fc_resnet = nn.Linear(512, 7)  
         
        # LBP branch: add a pooling layer to reduce dimensionality
        self.pool_lbp = nn.AdaptiveAvgPool2d((16, 16))  # Reduce spatial size from [100, 100] to [16, 16]
        self.fc_lbp = nn.Linear(16 * 16, 7)  # FC2: flattened [batch_size, 16*16] after pooling, mapped to 7
         
        # FC3 after concatenation (input 14 → output 7)
        self.fc_fusion = nn.Linear(14, num_classes)  
         
    def forward(self, x_resnet, lbp_x): 
        # print(f"LBP input shape: {lbp_x.shape}")
        
        # ResNet-CBAM branch
        x_resnet = self.resnet_cbam(x_resnet)
        # print(f"ResNet output shape: {x_resnet.shape}")
        x_resnet = F.sigmoid(self.fc_resnet(x_resnet))
        
        # LBP branch
        lbp_x = self.pool_lbp(lbp_x)  # Pool to reduce dimensionality
        lbp_x = lbp_x.view(lbp_x.size(0), -1)  # Flatten
        # print(f"LBP flattened shape: {lbp_x.shape}")
        lbp_x = F.sigmoid(self.fc_lbp(lbp_x))
        
        # print(f"Final shapes - ResNet: {x_resnet.shape}, LBP: {lbp_x.shape}")

        # Concatenate the outputs of FC1 and FC2
        fused = torch.cat((x_resnet, lbp_x), dim=1)  # Shape: [batch_size, 14] (7+7)
 
        # FC3 with sigmoid activation, followed by softmax for classification.
        # NOTE: nn.CrossEntropyLoss expects raw logits (it applies log-softmax itself).
        fused = F.sigmoid(self.fc_fusion(fused))  # 14 → 7
        output = F.softmax(fused, dim=1)  # Final softmax over the 7 classes
         
        return output 

# Instantiate model
original_model = resnet18(pretrained=True) 
model = RCL_Net(original_model, num_classes=7) 
 
print(model.eval())


RCL_Net(
  (resnet_cbam): ResNet18_CBAM_7Classes(
    (conv1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (1): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): 
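
The forward pass above returns `softmax(sigmoid(...))`, while the training function later feeds that output to `nn.CrossEntropyLoss`, which applies log-softmax again. Because the values entering the loss are then confined to a narrow range, the loss can barely move away from ln(7) ≈ 1.946, which matches the plateau in the training log. The NumPy sketch below reproduces the arithmetic under the assumption that cross-entropy is the mean of `-log_softmax(input)[target]`; the helper names are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy_from_logits(logits, targets):
    """Mean of -log_softmax(logits)[target] over the batch."""
    log_probs = np.log(softmax(logits))
    return float(-log_probs[np.arange(len(targets)), targets].mean())

# A perfectly confident, correct prediction for class 0 out of 7.
logits = np.full((1, 7), -20.0)
logits[0, 0] = 20.0
targets = np.array([0])

loss_logits = cross_entropy_from_logits(logits, targets)
# Same prediction, but passed through sigmoid and softmax before the loss.
loss_double = cross_entropy_from_logits(softmax(sigmoid(logits)), targets)

print(f"ln(7)                    = {np.log(7):.4f}")
print(f"loss on raw logits       = {loss_logits:.4f}")
print(f"loss on softmax(sigmoid) = {loss_double:.4f}")
```

Even for this ideal prediction the doubly normalized loss stays far from zero, so one option to try is returning the FC3 output without the final sigmoid and softmax and applying softmax only at inference time.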

In [21]:
def mixup_data(x, y, alpha=1.0):
    """Mixup data."""
    if alpha > 0:
        lam = np.random.beta(alpha, alpha)
    else:
        lam = 1
    batch_size = x.size()[0]
    index = torch.randperm(batch_size).to(x.device)

    mixed_x = lam * x + (1 - lam) * x[index, :]
    y_a, y_b = y, y[index]
    return mixed_x, y_a, y_b, lam

def mixup_criterion(criterion, pred, y_a, y_b, lam):
    return lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)
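
`mixup_data` above mixes a single input tensor, while RCL-Net takes two inputs. If mixup were enabled for this model, both branches would need the same `lam` and the same permutation so that each mixed pair still describes the same two faces. The sketch below is one way to do that; `mixup_dual` is a hypothetical helper, and linearly mixing precomputed LBP maps is a simplification (it is not the LBP of the mixed image).

```python
import numpy as np
import torch

def mixup_dual(x_img, x_lbp, y, alpha=1.0):
    """Mix both input branches with one shared lambda and one shared permutation."""
    lam = float(np.random.beta(alpha, alpha)) if alpha > 0 else 1.0
    index = torch.randperm(x_img.size(0), device=x_img.device)
    mixed_img = lam * x_img + (1 - lam) * x_img[index]
    mixed_lbp = lam * x_lbp + (1 - lam) * x_lbp[index]
    return mixed_img, mixed_lbp, y, y[index], lam

# Placeholder batch with the shapes used in this notebook.
images = torch.rand(8, 1, 100, 100)
lbp_maps = torch.rand(8, 100, 100)
labels = torch.randint(0, 7, (8,))

mixed_img, mixed_lbp, y_a, y_b, lam = mixup_dual(images, lbp_maps, labels, alpha=1.0)
print(mixed_img.shape, mixed_lbp.shape, round(lam, 3))
```

The loss for such a batch would then be `mixup_criterion(criterion, model(mixed_img, mixed_lbp), y_a, y_b, lam)`.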


In [22]:
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.metrics import precision_score, recall_score, f1_score
from tqdm import tqdm
import numpy as np
import os

def train_and_evaluate(model, train_loader, val_loader, 
                      num_epochs=300,
                      lr=0.001,  # Smaller learning rate
                      momentum=0.1,  # SGD momentum
                      weight_decay=0.0001,
                      max_early_stop=30,
                      save_path='./checkpoints',
                      resume=False):
    
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # device = torch.device("cpu")
    model = model.to(device)
    
    optimizer = optim.SGD(model.parameters(), lr=lr, momentum=momentum, weight_decay=weight_decay)
    criterion = nn.CrossEntropyLoss()
    
    # Add a learning rate scheduler
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', 
                                                    factor=0.1, patience=5, 
                                                    verbose=True)
    
    best_val_acc = 0.0
    start_epoch = 0
    early_stop_count = 0

    # Resume from a checkpoint if one exists
    if resume and os.path.exists(os.path.join(save_path, 'best_model.pth')):
        checkpoint = torch.load(os.path.join(save_path, 'best_model.pth'))
        model.load_state_dict(checkpoint['model_state_dict'])
        optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
        scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
        start_epoch = checkpoint['epoch'] + 1  # 'epoch' stores the last completed (0-indexed) epoch
        best_val_acc = checkpoint['best_val_acc']
        print(f"Resuming training from epoch {start_epoch + 1} with best validation accuracy {best_val_acc:.4f}")

    # Training loop
    for epoch in range(start_epoch, num_epochs):
        print(f"Epoch {epoch + 1}/{num_epochs}")
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        
        # Training progress bar
        train_progress_bar = tqdm(train_loader, desc=f"Epoch {epoch + 1}/{num_epochs} (Training)", leave=True)
        
        for original_image, lbp_image, labels in train_progress_bar:
            original_image, lbp_image, labels = original_image.to(device), lbp_image.to(device), labels.to(device)
            outputs = model(original_image, lbp_image)

            loss = criterion(outputs, labels)
            
            # Backward pass with gradient clipping
            optimizer.zero_grad()
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
            optimizer.step()
            
            # Compute training metrics
            running_loss += loss.item()
            _, predicted = outputs.max(1)
            total += labels.size(0)
            correct += predicted.eq(labels).sum().item()
            
            # Update progress bar
            train_acc = correct / total
            train_progress_bar.set_postfix(acc=f"{train_acc:.3f}", loss=f"{loss.item():.3f}")
        
        train_loss = running_loss / len(train_loader)
        train_acc = correct / total
        
        # Validation phase
        model.eval()
        val_loss = 0.0
        val_correct = 0
        val_total = 0
        all_predictions = []
        all_labels = []
        
        val_progress_bar = tqdm(val_loader, desc=f"Epoch {epoch + 1}/{num_epochs} (Validation)", leave=False)
        
        with torch.no_grad():
            for original_image, lbp_image, labels in val_progress_bar:
                original_image, lbp_image, labels = original_image.to(device), lbp_image.to(device), labels.to(device)
                
                outputs = model(original_image, lbp_image)
                loss = criterion(outputs, labels)
                
                val_loss += loss.item()
                _, predicted = outputs.max(1)
                val_total += labels.size(0)
                val_correct += predicted.eq(labels).sum().item()
                
                # Store predictions and labels for the metrics
                all_predictions.extend(predicted.cpu().numpy())
                all_labels.extend(labels.cpu().numpy())
        
        # Compute validation metrics
        val_loss = val_loss / len(val_loader)
        val_acc = val_correct / val_total
        
        # Compute precision, recall, and F1 score
        precision = precision_score(all_labels, all_predictions, average='macro', zero_division=0)
        recall = recall_score(all_labels, all_predictions, average='macro', zero_division=0)
        f1 = f1_score(all_labels, all_predictions, average='macro', zero_division=0)
        
        # Update learning rate scheduler
        scheduler.step(val_acc)
        
        # Print metrics
        print(f"\nEpoch [{epoch + 1}/{num_epochs}]")
        print(f"Train Loss: {train_loss:.4f}, Train Accuracy: {train_acc:.4f}, hitung : {correct}/{total}")
        print(f"Val Loss: {val_loss:.4f}, Val Accuracy: {val_acc:.4f}")
        print(f"Precision: {precision:.4f}, Recall: {recall:.4f}, F1: {f1:.4f}")
        print(f"Current LR: {optimizer.param_groups[0]['lr']:.6f}\n")
        
        # Save best model
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            best_checkpoint_path = os.path.join(save_path, 'best_model.pth')
            torch.save({
                'epoch': epoch,
                'model_state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'scheduler_state_dict': scheduler.state_dict(),
                'best_val_acc': best_val_acc,
                'train_loss': train_loss,
                'val_loss': val_loss,
                'precision': precision,
                'recall': recall,
                'f1': f1
            }, best_checkpoint_path)
            print(f"Best model saved at epoch {epoch + 1} with validation accuracy {val_acc:.4f}")
            early_stop_count = 0
        else:
            early_stop_count += 1
            print(f"Early stop count: {early_stop_count}/{max_early_stop}")
        
        # Save periodic checkpoint
        if (epoch + 1) % 10 == 0:
            checkpoint_path = os.path.join(save_path, f'checkpoint_epoch_{epoch + 1}.pth')
            torch.save({
                'epoch': epoch,
                'model_state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'scheduler_state_dict': scheduler.state_dict(),
                'best_val_acc': best_val_acc
            }, checkpoint_path)
            print(f"Checkpoint saved at epoch {epoch + 1}")
        
        # Early stopping check
        if early_stop_count >= max_early_stop:
            print(f"Training stopped early at epoch {epoch + 1}")
            break

    return model
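
The `ReduceLROnPlateau` scheduler in the function above lowers the learning rate once validation accuracy has failed to improve for more than `patience` epochs. The pure-Python sketch below replays that rule on the validation accuracies printed in the training log, assuming the default relative threshold of 1e-4 and ignoring cooldown and the minimum learning rate; `simulate_plateau_lr` is an illustrative helper, not part of PyTorch.

```python
def simulate_plateau_lr(val_accs, lr=0.9, factor=0.1, patience=5, threshold=1e-4):
    """Replay the mode='max' plateau rule; returns the learning rate after each epoch."""
    best = float("-inf")
    bad_epochs = 0
    history = []
    for acc in val_accs:
        if acc > best * (1 + threshold):   # counts as an improvement
            best = acc
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:          # plateau exceeded: decay and reset
            lr *= factor
            bad_epochs = 0
        history.append(lr)
    return history

# Validation accuracy for epochs 1 to 13, as printed in the log.
val_accs = [0.1819] * 6 + [0.0156] * 7
lrs = simulate_plateau_lr(val_accs)
for epoch, lr in enumerate(lrs, start=1):
    print(epoch, f"{lr:.6f}")
```

The replay drops the rate at epochs 7 and 13, in line with the `Current LR` values in the log.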

In [23]:

# Create the directory for checkpoints
os.makedirs('./checkpoints', exist_ok=True)

# Training
model = train_and_evaluate(
    model=model,
    train_loader=train_loader,
    val_loader=val_loader,
    num_epochs=300,
    lr=0.9,
    momentum=0.1,
    weight_decay=0.0001,
    max_early_stop=30
)



Epoch 1/300


Epoch 1/300 (Training): 100%|██████████| 96/96 [01:38<00:00,  1.03s/it, acc=0.143, loss=1.947]
                                                                         


Epoch [1/300]
Train Loss: 1.9460, Train Accuracy: 0.1429, hitung : 872/6104
Val Loss: 1.9462, Val Accuracy: 0.1819
Precision: 0.0260, Recall: 0.1429, F1: 0.0440
Current LR: 0.900000

Best model saved at epoch 1 with validation accuracy 0.1819
Epoch 2/300


Epoch 2/300 (Training): 100%|██████████| 96/96 [01:17<00:00,  1.23it/s, acc=0.143, loss=1.946]
                                                                         


Epoch [2/300]
Train Loss: 1.9459, Train Accuracy: 0.1429, hitung : 872/6104
Val Loss: 1.9462, Val Accuracy: 0.1819
Precision: 0.0260, Recall: 0.1429, F1: 0.0440
Current LR: 0.900000

Early stop count: 1/30
Epoch 3/300


Epoch 3/300 (Training): 100%|██████████| 96/96 [01:35<00:00,  1.00it/s, acc=0.143, loss=1.949]
                                                                         


Epoch [3/300]
Train Loss: 1.9458, Train Accuracy: 0.1429, hitung : 872/6104
Val Loss: 1.9462, Val Accuracy: 0.1819
Precision: 0.0260, Recall: 0.1429, F1: 0.0440
Current LR: 0.900000

Early stop count: 2/30
Epoch 4/300


Epoch 4/300 (Training): 100%|██████████| 96/96 [01:27<00:00,  1.10it/s, acc=0.143, loss=1.943]
                                                                         


Epoch [4/300]
Train Loss: 1.9453, Train Accuracy: 0.1432, hitung : 874/6104
Val Loss: 1.9462, Val Accuracy: 0.1819
Precision: 0.0260, Recall: 0.1429, F1: 0.0440
Current LR: 0.900000

Early stop count: 3/30
Epoch 5/300


Epoch 5/300 (Training): 100%|██████████| 96/96 [01:24<00:00,  1.14it/s, acc=0.209, loss=1.941]
                                                                         


Epoch [5/300]
Train Loss: 1.9435, Train Accuracy: 0.2089, hitung : 1275/6104
Val Loss: 1.9462, Val Accuracy: 0.1819
Precision: 0.0260, Recall: 0.1429, F1: 0.0440
Current LR: 0.900000

Early stop count: 4/30
Epoch 6/300


Epoch 6/300 (Training): 100%|██████████| 96/96 [01:29<00:00,  1.07it/s, acc=0.287, loss=1.931]
                                                                         


Epoch [6/300]
Train Loss: 1.9392, Train Accuracy: 0.2870, hitung : 1752/6104
Val Loss: 1.9468, Val Accuracy: 0.1819
Precision: 0.0260, Recall: 0.1429, F1: 0.0440
Current LR: 0.900000

Early stop count: 5/30
Epoch 7/300


Epoch 7/300 (Training): 100%|██████████| 96/96 [01:14<00:00,  1.28it/s, acc=0.261, loss=1.930]
                                                                         


Epoch [7/300]
Train Loss: 1.9340, Train Accuracy: 0.2608, hitung : 1592/6104
Val Loss: 1.9475, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.090000

Early stop count: 6/30
Epoch 8/300


Epoch 8/300 (Training): 100%|██████████| 96/96 [01:11<00:00,  1.35it/s, acc=0.268, loss=1.923]
                                                                         


Epoch [8/300]
Train Loss: 1.9306, Train Accuracy: 0.2683, hitung : 1638/6104
Val Loss: 1.9480, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.090000

Early stop count: 7/30
Epoch 9/300


Epoch 9/300 (Training): 100%|██████████| 96/96 [01:10<00:00,  1.37it/s, acc=0.270, loss=1.934]
                                                                         


Epoch [9/300]
Train Loss: 1.9299, Train Accuracy: 0.2697, hitung : 1646/6104
Val Loss: 1.9481, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.090000

Early stop count: 8/30
Epoch 10/300


Epoch 10/300 (Training): 100%|██████████| 96/96 [01:12<00:00,  1.32it/s, acc=0.271, loss=1.922]
                                                                          


Epoch [10/300]
Train Loss: 1.9292, Train Accuracy: 0.2713, hitung : 1656/6104
Val Loss: 1.9482, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.090000

Early stop count: 9/30
Checkpoint saved at epoch 10
Epoch 11/300


Epoch 11/300 (Training): 100%|██████████| 96/96 [01:10<00:00,  1.36it/s, acc=0.272, loss=1.922]
                                                                          


Epoch [11/300]
Train Loss: 1.9285, Train Accuracy: 0.2718, hitung : 1659/6104
Val Loss: 1.9482, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.090000

Early stop count: 10/30
Epoch 12/300


Epoch 12/300 (Training): 100%|██████████| 96/96 [01:10<00:00,  1.36it/s, acc=0.273, loss=1.928]
                                                                          


Epoch [12/300]
Train Loss: 1.9280, Train Accuracy: 0.2726, hitung : 1664/6104
Val Loss: 1.9482, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.090000

Early stop count: 11/30
Epoch 13/300


Epoch 13/300 (Training): 100%|██████████| 96/96 [01:09<00:00,  1.39it/s, acc=0.274, loss=1.931]
                                                                          


Epoch [13/300]
Train Loss: 1.9275, Train Accuracy: 0.2739, hitung : 1672/6104
Val Loss: 1.9484, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.009000

Early stop count: 12/30
Epoch 14/300


Epoch 14/300 (Training): 100%|██████████| 96/96 [01:09<00:00,  1.39it/s, acc=0.274, loss=1.928]
                                                                          


Epoch [14/300]
Train Loss: 1.9270, Train Accuracy: 0.2741, hitung : 1673/6104
Val Loss: 1.9483, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.009000

Early stop count: 13/30
Epoch 15/300


Epoch 15/300 (Training): 100%|██████████| 96/96 [01:12<00:00,  1.33it/s, acc=0.274, loss=1.916]
                                                                          


Epoch [15/300]
Train Loss: 1.9269, Train Accuracy: 0.2741, hitung : 1673/6104
Val Loss: 1.9483, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.009000

Early stop count: 14/30
Epoch 16/300


Epoch 16/300 (Training): 100%|██████████| 96/96 [01:09<00:00,  1.38it/s, acc=0.274, loss=1.934]
                                                                          


Epoch [16/300]
Train Loss: 1.9269, Train Accuracy: 0.2739, hitung : 1672/6104
Val Loss: 1.9484, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.009000

Early stop count: 15/30
Epoch 17/300


Epoch 17/300 (Training): 100%|██████████| 96/96 [01:12<00:00,  1.33it/s, acc=0.274, loss=1.931]
                                                                          


Epoch [17/300]
Train Loss: 1.9268, Train Accuracy: 0.2736, hitung : 1670/6104
Val Loss: 1.9483, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.009000

Early stop count: 16/30
Epoch 18/300


Epoch 18/300 (Training): 100%|██████████| 96/96 [01:10<00:00,  1.35it/s, acc=0.275, loss=1.916]
                                                                          


Epoch [18/300]
Train Loss: 1.9266, Train Accuracy: 0.2747, hitung : 1677/6104
Val Loss: 1.9485, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.009000

Early stop count: 17/30
Epoch 19/300


Epoch 19/300 (Training): 100%|██████████| 96/96 [01:11<00:00,  1.34it/s, acc=0.275, loss=1.930]
                                                                          


Epoch [19/300]
Train Loss: 1.9266, Train Accuracy: 0.2746, hitung : 1676/6104
Val Loss: 1.9484, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000900

Early stop count: 18/30
Epoch 20/300


Epoch 20/300 (Training): 100%|██████████| 96/96 [01:11<00:00,  1.34it/s, acc=0.274, loss=1.924]
                                                                          


Epoch [20/300]
Train Loss: 1.9266, Train Accuracy: 0.2744, hitung : 1675/6104
Val Loss: 1.9483, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000900

Early stop count: 19/30
Checkpoint saved at epoch 20
Epoch 21/300


Epoch 21/300 (Training): 100%|██████████| 96/96 [01:14<00:00,  1.29it/s, acc=0.274, loss=1.928]
                                                                          


Epoch [21/300]
Train Loss: 1.9266, Train Accuracy: 0.2744, hitung : 1675/6104
Val Loss: 1.9484, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000900

Early stop count: 20/30
Epoch 22/300


Epoch 22/300 (Training): 100%|██████████| 96/96 [01:13<00:00,  1.31it/s, acc=0.275, loss=1.935]
                                                                          


Epoch [22/300]
Train Loss: 1.9266, Train Accuracy: 0.2747, hitung : 1677/6104
Val Loss: 1.9484, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000900

Early stop count: 21/30
Epoch 23/300


Epoch 23/300 (Training): 100%|██████████| 96/96 [01:12<00:00,  1.33it/s, acc=0.274, loss=1.931]
                                                                          


Epoch [23/300]
Train Loss: 1.9267, Train Accuracy: 0.2738, hitung : 1671/6104
Val Loss: 1.9485, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000900

Early stop count: 22/30
Epoch 24/300


Epoch 24/300 (Training): 100%|██████████| 96/96 [01:10<00:00,  1.37it/s, acc=0.274, loss=1.939]
                                                                          


Epoch [24/300]
Train Loss: 1.9267, Train Accuracy: 0.2742, hitung : 1674/6104
Val Loss: 1.9486, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000900

Early stop count: 23/30
Epoch 25/300


Epoch 25/300 (Training): 100%|██████████| 96/96 [01:15<00:00,  1.27it/s, acc=0.274, loss=1.941]
                                                                          


Epoch [25/300]
Train Loss: 1.9266, Train Accuracy: 0.2738, hitung : 1671/6104
Val Loss: 1.9485, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000090

Early stop count: 24/30
Epoch 26/300


Epoch 26/300 (Training): 100%|██████████| 96/96 [01:11<00:00,  1.34it/s, acc=0.274, loss=1.922]
                                                                          


Epoch [26/300]
Train Loss: 1.9266, Train Accuracy: 0.2744, hitung : 1675/6104
Val Loss: 1.9482, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000090

Early stop count: 25/30
Epoch 27/300


Epoch 27/300 (Training): 100%|██████████| 96/96 [01:11<00:00,  1.35it/s, acc=0.274, loss=1.924]
                                                                          


Epoch [27/300]
Train Loss: 1.9266, Train Accuracy: 0.2741, hitung : 1673/6104
Val Loss: 1.9483, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000090

Early stop count: 26/30
Epoch 28/300


Epoch 28/300 (Training): 100%|██████████| 96/96 [01:10<00:00,  1.36it/s, acc=0.274, loss=1.924]
                                                                          


Epoch [28/300]
Train Loss: 1.9265, Train Accuracy: 0.2744, hitung : 1675/6104
Val Loss: 1.9486, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000090

Early stop count: 27/30
Epoch 29/300


Epoch 29/300 (Training): 100%|██████████| 96/96 [01:11<00:00,  1.35it/s, acc=0.274, loss=1.922]
                                                                          


Epoch [29/300]
Train Loss: 1.9266, Train Accuracy: 0.2742, hitung : 1674/6104
Val Loss: 1.9485, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000090

Early stop count: 28/30
Epoch 30/300


Epoch 30/300 (Training): 100%|██████████| 96/96 [01:10<00:00,  1.36it/s, acc=0.275, loss=1.927]
                                                                          


Epoch [30/300]
Train Loss: 1.9265, Train Accuracy: 0.2747, hitung : 1677/6104
Val Loss: 1.9486, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000090

Early stop count: 29/30
Checkpoint saved at epoch 30
Epoch 31/300


Epoch 31/300 (Training): 100%|██████████| 96/96 [01:09<00:00,  1.38it/s, acc=0.275, loss=1.927]
                                                                          


Epoch [31/300]
Train Loss: 1.9266, Train Accuracy: 0.2747, hitung : 1677/6104
Val Loss: 1.9486, Val Accuracy: 0.0156
Precision: 0.0022, Recall: 0.1429, F1: 0.0044
Current LR: 0.000009

Early stop count: 30/30
Training stopped early at epoch 31
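
The validation metrics in this log never change, which is what one would expect if the model predicted the same class for every validation image. This is an inference from the numbers, not something the log states. The sketch below reproduces the arithmetic under these assumptions: 7 classes, macro averaging with empty classes scored as 0, and a predicted class that makes up a fraction `p` of the validation set. The function name `single_class_macro_metrics` is hypothetical.

```python
def single_class_macro_metrics(p, num_classes=7):
    """Metrics of a classifier that always predicts one class of prevalence p."""
    # For the predicted class: every sample of that class is recovered (recall 1),
    # and the fraction of predictions that are correct equals its prevalence p.
    precision_c = p
    recall_c = 1.0
    f1_c = 2 * precision_c * recall_c / (precision_c + recall_c)
    # All other classes are never predicted, so they contribute 0 to each macro average.
    accuracy = p
    return accuracy, precision_c / num_classes, recall_c / num_classes, f1_c / num_classes


acc, prec, rec, f1 = single_class_macro_metrics(0.0156)
print(f"{acc:.4f} {prec:.4f} {rec:.4f} {f1:.4f}")  # → 0.0156 0.0022 0.1429 0.0044
```

These values match the logged `Val Accuracy`, `Precision`, `Recall` and `F1`, so a collapsed single-class prediction is a plausible reading of the run.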




# FIRST MODEL WITHOUT SAVING LBP

Convert the DataFrame to tensors:

In [13]:
import torch
from torch.utils.data import Dataset, DataLoader

class EmotionDataset(Dataset):
    def __init__(self, dataframe):
        self.dataframe = dataframe

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        # image = torch.tensor(self.dataframe['resized_pixels'].iloc[idx], dtype=torch.float32).view(1, 100, 100)  # If the image is 100x100
        # label = torch.tensor(self.dataframe['emotion'].iloc[idx], dtype=torch.long)
        image = torch.tensor(self.dataframe['resized_pixels'].iloc[idx], dtype=torch.float32).unsqueeze(0)  # Already 100x100; only add the channel dimension
        label = torch.tensor(self.dataframe['emotion'].iloc[idx], dtype=torch.long)
        return image, label


Create the DataLoaders:

In [14]:
# Initialize the datasets
train_dataset = EmotionDataset(df_combined_train)
test_dataset = EmotionDataset(df_public_test)

In [15]:


# Initialize the DataLoaders
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)  # Shuffle the training samples every epoch
val_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
# train_loader = DataLoader(train_dataset, batch_size=64, shuffle=False, num_workers=4, pin_memory=True)
# test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False, num_workers=4, pin_memory=True)

In [16]:
torch.cuda.empty_cache()  # Clear the CUDA cache

In [17]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# Channel Attention Module
class ChannelAttention(nn.Module):
    def __init__(self, in_planes, ratio=16):
        super(ChannelAttention, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)

        self.fc1 = nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False)

        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x))))
        max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x))))
        out = avg_out + max_out
        return self.sigmoid(out)


# Spatial Attention Module
class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super(SpatialAttention, self).__init__()
        assert kernel_size in (3, 7), "kernel size must be 3 or 7"
        padding = 3 if kernel_size == 7 else 1

        self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)
        max_out, _ = torch.max(x, dim=1, keepdim=True)
        x = torch.cat([avg_out, max_out], dim=1)
        x = self.conv1(x)
        return self.sigmoid(x)


# CBAM Block
class CBAM(nn.Module):
    def __init__(self, in_planes, ratio=16, kernel_size=7):
        super(CBAM, self).__init__()
        self.ca = ChannelAttention(in_planes, ratio)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        out = self.ca(x) * x
        out = self.sa(out) * out
        return out
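
As a rough check on how much the attention modules add, the parameter count of one `CBAM` block follows directly from the layer definitions above: two bias-free 1x1 convolutions in the channel branch, and one bias-free convolution from 2 channels to 1 in the spatial branch. The sketch below is plain arithmetic; the helper name `cbam_param_count` is hypothetical.

```python
def cbam_param_count(in_planes, ratio=16, kernel_size=7):
    """Number of learnable weights in one CBAM block as defined above (no biases)."""
    hidden = in_planes // ratio
    # Channel attention: fc1 (in_planes -> hidden) and fc2 (hidden -> in_planes), both 1x1
    channel = in_planes * hidden + hidden * in_planes
    # Spatial attention: one conv from 2 channels (avg map, max map) to 1 channel
    spatial = 2 * 1 * kernel_size * kernel_size
    return channel + spatial


for channels in (64, 128, 256, 512):
    print(channels, cbam_param_count(channels))
# → 64 610
# → 128 2146
# → 256 8290
# → 512 32866
```

With one block at each of those four widths, the attention modules add 43,912 weights in total, which is small next to the ResNet-18 backbone.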

In [46]:
from torchvision.models import resnet18

# Load the pre-trained ResNet18 model
model = resnet18(pretrained=True)

# Modify the ResNet18 model by adding CBAM after conv2 to conv5
class ResNet18_CBAM_7Classes(nn.Module):
    def __init__(self, original_model, num_classes=7):
        super(ResNet18_CBAM_7Classes, self).__init__()
        
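        # New 3x3 stem convolution for 1-channel (grayscale) input, randomly initialized;
        # the remaining layers below are taken from the original model
        self.conv1 = nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1, bias=False)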
        self.bn1 = original_model.bn1
        self.relu = original_model.relu
        self.maxpool = original_model.maxpool
        
        # Add CBAM after each convolutional block
        self.layer1 = nn.Sequential(
            original_model.layer1,
            CBAM(64)  # After conv2
        )
        self.layer2 = nn.Sequential(
            original_model.layer2,
            CBAM(128)  # After conv3
        )
        self.layer3 = nn.Sequential(
            original_model.layer3,
            CBAM(256)  # After conv4
        )
        self.layer4 = nn.Sequential(
            original_model.layer4,
            CBAM(512)  # After conv5
        )
        
        # Original avgpool layer
        self.avgpool = original_model.avgpool
        
        # FC layer returns a 512-dimensional feature vector, not class scores
        # (num_classes is accepted by __init__ but not used in this class)
        self.fc = nn.Linear(512, 512)
    
    def forward(self, x):
        # Forward pass through the modified network
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        
        return x

# Create the modified ResNet18 with CBAM for 7 classes
model = ResNet18_CBAM_7Classes(model, num_classes=7)


print(model.eval())  # Prints the architecture; a forward pass returns shape [batch_size, 512]


ResNet18_CBAM_7Classes(
  (conv1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, trac
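
To see what spatial resolution reaches each CBAM block, the feature-map sizes can be traced with the usual convolution output formula. This is a sketch under stated assumptions: the 3x3 stride-1 stem and the 3x3 stride-2 max pool come from the model above, `layer2` to `layer4` each halve the resolution with a stride-2 3x3 convolution as in torchvision's ResNet-18, and CBAM keeps the spatial size. The helper names are hypothetical, and the 100x100 input follows the comment in `EmotionDataset`.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet18_cbam_spatial_sizes(size):
    """Trace the spatial size through the modified ResNet-18 stem and stages."""
    sizes = {"input": size}
    size = conv_out(size, kernel=3, stride=1, padding=1)  # modified conv1
    sizes["conv1"] = size
    size = conv_out(size, kernel=3, stride=2, padding=1)  # maxpool
    sizes["maxpool"] = size
    sizes["layer1"] = size  # layer1 uses stride 1 throughout
    for name in ("layer2", "layer3", "layer4"):
        size = conv_out(size, kernel=3, stride=2, padding=1)  # first block of each stage
        sizes[name] = size
    return sizes


print(resnet18_cbam_spatial_sizes(100))
# → {'input': 100, 'conv1': 100, 'maxpool': 50, 'layer1': 50, 'layer2': 25, 'layer3': 13, 'layer4': 7}
```

For a 48x48 input the same trace ends at 3x3 before the adaptive average pool.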

In [47]:
import torch
import torch.nn.functional as F

def lbp_transform(image):
    """
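    Compute an LBP-style binary texture map of a grayscale image.
    This is a convolutional approximation, not the standard 8-neighbour LBP code:
    a fixed 3x3 difference kernel is applied and the response is thresholded at zero.

    :param image: Input tensor of shape [batch_size, 1, height, width] (CPU or GPU)
    :return: Binary tensor of shape [batch_size, 2, height, width] (two identical channels)
    """
    # Fixed 3x3 difference kernel, repeated once per output channel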
    lbp_kernel = torch.tensor([[[[ 1,  1,  1],
                                 [ 1,  0, -1],
                                 [-1, -1, -1]]],
                               [[[ 1,  1,  1],
                                 [ 1,  0, -1],
                                 [-1, -1, -1]]]], dtype=torch.float32, device=image.device)

    # Apply the kernel using convolution (with padding to keep the same size)
    lbp_image = F.conv2d(image, lbp_kernel, padding=1)

    # Binarize the response: 1 where it is positive, 0 elsewhere
    lbp_image = (lbp_image > 0).float()

    return lbp_image

# Example usage (runs on the GPU when one is available, otherwise on the CPU)
# Assume `input_image` is a grayscale image tensor of shape [batch_size, 1, height, width]
device = 'cuda' if torch.cuda.is_available() else 'cpu'
input_image = torch.randn((8, 1, 48, 48), device=device)  # Example tensor
lbp_output = lbp_transform(input_image)

print(lbp_output.shape)  # [batch_size, 2, height, width]: same spatial size, two channels


torch.Size([8, 2, 48, 48])
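
For comparison, the standard 3x3 LBP operator thresholds each of the eight neighbours against the centre pixel and packs the results into one 8-bit code per pixel. The sketch below shows that computation for a single patch in plain Python. It is an illustration, not the code used in this notebook: the function name `lbp_code_3x3` is hypothetical, and the bit order (clockwise from the top-left neighbour) is an assumed convention.

```python
def lbp_code_3x3(patch):
    """Return the 8-bit LBP code of the centre pixel of a 3x3 patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner (assumed ordering)
    neighbours = [patch[0][0], patch[0][1], patch[0][2],
                  patch[1][2], patch[2][2], patch[2][1],
                  patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:  # neighbour at least as bright as the centre -> bit set
            code |= 1 << bit
    return code


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code_3x3(patch))  # → 241
```

Unlike `lbp_transform` above, this yields one of 256 codes per pixel rather than a binary map, so a variant of LBP changes what the local branch receives as input.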


In [48]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

# LBP feature extraction function (you can implement this with OpenCV or any other method)
# def lbp_transform(image):
#     # Placeholder: Implement LBP transformation here
#     # For now, assume it returns a tensor of the same size as the input image
#     return image  # This should be the LBP transformed image

# Define the modified model class with the LBP branch
class RCL_Net(nn.Module):
    def __init__(self, original_model, num_classes=7):
        super(RCL_Net, self).__init__()
        
        # ResNet-CBAM branch (global feature extraction)
        self.resnet_cbam = ResNet18_CBAM_7Classes(original_model, num_classes)
        
        # LBP branch (local feature extraction)
        self.conv1_lbp = nn.Conv2d(2, 64, kernel_size=3, stride=1, padding=1)  # Input: 2 channels
        self.conv2_lbp = nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1)
        self.conv3_lbp = nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1)
        self.pool_lbp = nn.AdaptiveAvgPool2d((1, 1))  # Pooling to [batch_size, 256, 1, 1]
        
        # Fully connected layer for LBP branch (FC2)
        self.fc_lbp = nn.Linear(256, 256)
        
        # Fully connected layers for ResNet-CBAM branch (FC1)
        self.fc_resnet = nn.Linear(512, 256)  # FC1 for ResNet-CBAM
        
        # Fully connected layers after feature fusion (FC3)
        self.fc_fusion = nn.Linear(256 + 256, 14)  # FC3
        self.fc_output = nn.Linear(14, num_classes)  # Final classification
        
    def forward(self, x):
        # Pass through ResNet-CBAM branch (global features)
        x_resnet = self.resnet_cbam(x)  # Output: [batch_size, 512]
        x_resnet = F.relu(self.fc_resnet(x_resnet))  # Shape: [batch_size, 256] (FC1)

        # LBP branch (local features)
        lbp_image = lbp_transform(x)  # Apply LBP transformation
        lbp_x = F.relu(self.conv1_lbp(lbp_image))
        lbp_x = F.relu(self.conv2_lbp(lbp_x))
        lbp_x = F.relu(self.conv3_lbp(lbp_x))
        lbp_x = self.pool_lbp(lbp_x)  # Pool to [batch_size, 256, 1, 1]
        lbp_x = lbp_x.view(lbp_x.size(0), -1)  # Flatten to [batch_size, 256]
        lbp_x = F.relu(self.fc_lbp(lbp_x))  # Shape: [batch_size, 256] (FC2)

        # Concatenate ResNet-CBAM and LBP feature vectors
        fused = torch.cat((x_resnet, lbp_x), dim=1)  # Shape: [batch_size, 256 + 256]

        # Pass through fusion fully connected layers (FC3)
        fused = torch.sigmoid(self.fc_fusion(fused))  # Apply sigmoid activation (σ)
        
        # Final classification output
        output = self.fc_output(fused)  # Output: [batch_size, num_classes]
        
        return output

# Instantiate the model with ResNet18 backbone
original_model = resnet18(pretrained=True)
model = RCL_Net(original_model, num_classes=7)

print(model.eval())  # To check the model architecture


RCL_Net(
  (resnet_cbam): ResNet18_CBAM_7Classes(
    (conv1): Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (1): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): 

In [49]:
import numpy as np
import torch

def mixup_data(x, y, alpha=1.0):
    """Return mixed inputs, both label sets, and the mixing coefficient lambda."""
    if alpha > 0:
        lam = np.random.beta(alpha, alpha)
    else:
        lam = 1
    batch_size = x.size()[0]
    index = torch.randperm(batch_size).to(x.device)

    mixed_x = lam * x + (1 - lam) * x[index, :]
    y_a, y_b = y, y[index]
    return mixed_x, y_a, y_b, lam

def mixup_criterion(criterion, pred, y_a, y_b, lam):
    return lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)
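
To make the mixed loss concrete, here is a small single-sample sketch in plain Python. The `cross_entropy` helper is a hypothetical stand-in for `nn.CrossEntropyLoss` on one sample, and the logits and labels are made-up illustration values.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample from raw logits, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def mixup_loss(logits, y_a, y_b, lam):
    """Weighted sum of the losses against both labels, as in mixup_criterion above."""
    return lam * cross_entropy(logits, y_a) + (1 - lam) * cross_entropy(logits, y_b)


logits = [2.0, 0.5, -1.0]           # illustrative model output for a mixed input
loss_a = cross_entropy(logits, 0)   # loss against the first image's label
loss_b = cross_entropy(logits, 1)   # loss against the second image's label
mixed = mixup_loss(logits, 0, 1, lam=0.7)
print(f"loss_a={loss_a:.4f} loss_b={loss_b:.4f} mixed={mixed:.4f}")
```

The mixed loss always lies between the two single-label losses, and it reduces to the ordinary loss when `lam` is 1.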


In [50]:
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.metrics import precision_score, recall_score, f1_score
from tqdm import tqdm
import numpy as np
import os

def train_and_evaluate(model, train_loader, val_loader, 
                      num_epochs=300,
                      lr=0.001,  # Smaller learning rate
                      momentum=0.1,  # SGD momentum
                      weight_decay=0.0001,
                      max_early_stop=30,
                      save_path='./checkpoints',
                      resume=False):
    
    # device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    device = torch.device("cpu")
    model = model.to(device)
    
    optimizer = optim.SGD(model.parameters(), lr=lr, momentum=momentum, weight_decay=weight_decay)
    criterion = nn.CrossEntropyLoss()
    
    # Add a learning-rate scheduler: cut the LR by 10x when validation accuracy plateaus
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max',
                                                    factor=0.1, patience=5)
    
    best_val_acc = 0.0
    start_epoch = 0
    early_stop_count = 0

    # Resume from the best checkpoint if one exists
    if resume and os.path.exists(os.path.join(save_path, 'best_model.pth')):
        checkpoint = torch.load(os.path.join(save_path, 'best_model.pth'), map_location=device)
        model.load_state_dict(checkpoint['model_state_dict'])
        optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
        scheduler.load_state_dict(checkpoint['scheduler_state_dict'])
        start_epoch = checkpoint['epoch'] + 1  # The saved epoch has already finished
        best_val_acc = checkpoint['best_val_acc']
        print(f"Resuming training from epoch {start_epoch + 1} with best validation accuracy {best_val_acc:.4f}")

    # Training loop
    for epoch in range(start_epoch, num_epochs):
        print(f"Epoch {epoch + 1}/{num_epochs}")
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        
        # Training progress bar
        train_progress_bar = tqdm(train_loader, desc=f"Epoch {epoch + 1}/{num_epochs} (Training)", leave=True)
        
        for inputs, labels in train_progress_bar:
            inputs, labels = inputs.to(device), labels.to(device)
            
            # Forward pass
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            
            # Backward pass with gradient clipping
            optimizer.zero_grad()
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
            optimizer.step()
            
            # Compute training metrics
            running_loss += loss.item()
            _, predicted = outputs.max(1)
            total += labels.size(0)
            correct += predicted.eq(labels).sum().item()
            
            # Update progress bar
            train_acc = correct / total
            train_progress_bar.set_postfix(acc=f"{train_acc:.3f}", loss=f"{loss.item():.3f}")
        
        train_loss = running_loss / len(train_loader)
        train_acc = correct / total
        
        # Validation phase
        model.eval()
        val_loss = 0.0
        val_correct = 0
        val_total = 0
        all_predictions = []
        all_labels = []
        
        val_progress_bar = tqdm(val_loader, desc=f"Epoch {epoch + 1}/{num_epochs} (Validation)", leave=False)
        
        with torch.no_grad():
            for inputs, labels in val_progress_bar:
                inputs, labels = inputs.to(device), labels.to(device)
                
                outputs = model(inputs)
                loss = criterion(outputs, labels)
                
                val_loss += loss.item()
                _, predicted = outputs.max(1)
                val_total += labels.size(0)
                val_correct += predicted.eq(labels).sum().item()
                
                # Store predictions and labels for the metrics
                all_predictions.extend(predicted.cpu().numpy())
                all_labels.extend(labels.cpu().numpy())
        
        # Compute validation metrics
        val_loss = val_loss / len(val_loader)
        val_acc = val_correct / val_total
        
        # Compute precision, recall, and F1 score
        precision = precision_score(all_labels, all_predictions, average='macro', zero_division=0)
        recall = recall_score(all_labels, all_predictions, average='macro', zero_division=0)
        f1 = f1_score(all_labels, all_predictions, average='macro', zero_division=0)
        
        # Update learning rate scheduler
        scheduler.step(val_acc)
        
        # Print metrics
        print(f"\nEpoch [{epoch + 1}/{num_epochs}]")
        # "hitung" is Indonesian for "count": correctly classified / total training samples
        print(f"Train Loss: {train_loss:.4f}, Train Accuracy: {train_acc:.4f}, hitung : {correct}/{total}")
        print(f"Val Loss: {val_loss:.4f}, Val Accuracy: {val_acc:.4f}")
        print(f"Precision: {precision:.4f}, Recall: {recall:.4f}, F1: {f1:.4f}")
        print(f"Current LR: {optimizer.param_groups[0]['lr']:.6f}\n")
        
        # Save best model
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            best_checkpoint_path = os.path.join(save_path, 'best_model.pth')
            torch.save({
                'epoch': epoch,
                'model_state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'scheduler_state_dict': scheduler.state_dict(),
                'best_val_acc': best_val_acc,
                'train_loss': train_loss,
                'val_loss': val_loss,
                'precision': precision,
                'recall': recall,
                'f1': f1
            }, best_checkpoint_path)
            print(f"Best model saved at epoch {epoch + 1} with validation accuracy {val_acc:.4f}")
            early_stop_count = 0
        else:
            early_stop_count += 1
            print(f"Early stop count: {early_stop_count}/{max_early_stop}")
        
        # Save periodic checkpoint
        if (epoch + 1) % 10 == 0:
            checkpoint_path = os.path.join(save_path, f'checkpoint_epoch_{epoch + 1}.pth')
            torch.save({
                'epoch': epoch,
                'model_state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'scheduler_state_dict': scheduler.state_dict(),
                'best_val_acc': best_val_acc
            }, checkpoint_path)
            print(f"Checkpoint saved at epoch {epoch + 1}")
        
        # Early stopping check
        if early_stop_count >= max_early_stop:
            print(f"Training stopped early at epoch {epoch + 1}")
            break

    return model
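
The scheduler configured above (`mode='max'`, `factor=0.1`, `patience=5`) can be reasoned about without running a model. The sketch below is a simplified re-implementation of the plateau rule in plain Python. It is not the library code: it treats any strict increase as an improvement, where PyTorch applies a relative threshold, and it has no cooldown. The function name `simulate_plateau_schedule` is hypothetical.

```python
def simulate_plateau_schedule(metrics, lr, factor=0.1, patience=5):
    """Return the learning rate in effect after each epoch's scheduler step."""
    best = float("-inf")
    bad_epochs = 0
    lrs = []
    for metric in metrics:
        if metric > best:
            best = metric
            bad_epochs = 0
        else:
            bad_epochs += 1
        # Reduce only once the metric has failed to improve for more than `patience` epochs
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
        lrs.append(lr)
    return lrs


# Validation accuracy stuck at one value for 31 epochs, starting LR of 0.9
lrs = simulate_plateau_schedule([0.0156] * 31, lr=0.9)
drop_epochs = [i + 1 for i in range(1, len(lrs)) if lrs[i] < lrs[i - 1]]
print(drop_epochs)  # → [7, 13, 19, 25, 31]
```

With a flat validation accuracy the rule cuts the learning rate every six epochs, so after 31 epochs it has fallen from 0.9 to about 0.000009. Lowering the rate does not help in that situation, because the metric being monitored never moves.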

In [51]:

# Create the checkpoint directory
os.makedirs('./checkpoints', exist_ok=True)

# Training
model = train_and_evaluate(
    model=model,
    train_loader=train_loader,
    val_loader=val_loader,
    num_epochs=300,
    lr=0.9,
    momentum=0.1,
    weight_decay=0.0001,
    max_early_stop=30
)

Epoch 1/300


Epoch 1/300 (Training):   0%|          | 0/96 [00:00<?, ?it/s]


ValueError: too many values to unpack (expected 2)