# Kaggle Project

## Describe Your Dataset

**URL:** https://www.kaggle.com/datasets/gpiosenka/cards-image-datasetclassification/data

**Task:**

Identify the suits and points of playing cards, divided into 53 categories.


**Datasets**

* Train dataset: 7624 card Image, 224*224 3 channel

* Validation dataset: 265 Image, 224*224 3 channel

* Test dataset: 265 Image, 224*224 3 channel

**Features(x):**

suits: clubs, diamonds, hearts, spades  
points: ace, two, three, four, five, six, seven, eight, nine, ten, jack, queen, king, jocker
| suits   | 
| ------ | 
| ♠️     | 
| ♥️     | 
| ♦️     | 
| ♣️     |   

---

| points    |
| ------- |
| A, 2-10, J, Q, K |
| A, 2-10, J, Q, K |
| A, 2-10, J, Q, K |
| A, 2-10, J, Q, K |


**Target(y):**  

playing card classification

suits & points:  

| suits   | points    |
| ------ | ------- |
| ♠️     | A, 2-10, J, Q, K |
| ♥️     | A, 2-10, J, Q, K |
| ♦️     | A, 2-10, J, Q, K |
| ♣️     | A, 2-10, J, Q, K |


---

## Build Your Model

### Data preprocessing

In [31]:
## codes
import os.path
import torch
import torch.nn as nn
import torch.nn.functional as F
from tqdm import tqdm
from torch import optim
from torch.utils.data import DataLoader as D
from torchvision.datasets import ImageFolder
from torchvision.transforms import transforms


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # use GPU or CPU
print('Using {} device'.format(device))

# dataset path
image_path = ''
batch_size = 128  # batch size
torch.manual_seed(42)  #  set random seed to ensure the same split everytime

# data preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # resize image
    transforms.CenterCrop(224),  # clip image from center
    transforms.RandomHorizontalFlip(),  # random horizontal flip
    transforms.RandomRotation(10),  # random rotation
    transforms.ToTensor(),  # transform to tensor
])

transform_test = transforms.Compose([
    transforms.Resize((224, 224)),  # resize image
    transforms.CenterCrop(224),  # clip image from center
    transforms.ToTensor(),  # transform to tensor
])

dataset = ImageFolder(image_path)

# read dataset
train_dataset = ImageFolder(image_path + '/train', transform= transform)
val_dataset = ImageFolder(image_path + '/valid', transform = transform_test)
test_dataset = ImageFolder(image_path + '/test', transform = transform_test)

# data loader
train_loader = D(train_dataset, batch_size=batch_size, shuffle=True)  # random shuffle
val_loader = D(val_dataset, batch_size=batch_size, shuffle=False)  # not shuffle
test_loader = D(test_dataset, batch_size=batch_size, shuffle=False)  # not shuffle


Using cuda device


### Model Construction

In [32]:
## codes
from typing import Union, Type, List

import torch
import torch.nn as nn

# 3x3 convolution
def conv3x3(in_channels,out_channels,stride=1):
    return nn.Conv2d(in_channels,out_channels,3,stride,1,bias=False)

def conv1x1(in_channels,out_channels,stride=1):
    return nn.Conv2d(in_channels,out_channels,1,stride,0,bias=False)

class BasicBlock(nn.Module):
    # basic block
    expansion = 1 # channel expansion
    def __init__(self,in_channels,out_channels,stride=1,
                    downsample=None,
                    groups=1,
                    base_width=64,
                    dilation=1,
                    norm_layer=None
                 ): 
        super(BasicBlock,self).__init__()
        self.conv1 = conv3x3(in_channels,out_channels,stride)
        self.bn1 = nn.BatchNorm2d(out_channels) # batch normalization
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(out_channels,out_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # if the input and output channel is not the same, then downsample
        self.downsample = downsample
        self.stride = stride

    def forward(self,x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        # if the input and output channel is not the same, then downsample
        if self.downsample is not None:
            identity = self.downsample(x)

        # residual connection
        out += identity
        out = self.relu(out)

        return out

class Bottleneck(nn.Module):
    # bottleneck block

    expansion = 4 # channel expansion

    def __init__(self, in_channels,out_channels,stride=1,
                    downsample=None,
                    groups=1,
                    base_width=64,
                    dilation=1,
                    norm_layer=None):
        super().__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d # batch normalization

        width = int(out_channels * (base_width / 64.)) # calculate the width of each layer

        self.conv1 = conv1x1(in_channels,width) # 1x1 convolution
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width,width,stride,dilation) # 3x3 convolution
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width,out_channels * self.expansion ) # 1x1 convolution
        self.bn3 = norm_layer(out_channels * self.expansion )
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self,x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out
    
class ResNet18(nn.Module):
    def __init__(self, num_classes,
                 block:Type[Union[BasicBlock,Bottleneck]],
                 layers:List[int],):
        super(ResNet18, self).__init__()

        self._norm_layer = nn.BatchNorm2d
        self.in_channels = 64
        self.dilation = 1

        self.groups = 1
        self.base_width = 64

        self.conv1 = nn.Conv2d(3, self.in_channels, 7, 2, 3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(3, 2, 1)

        # ResNet-18 has 4 stages with [2, 2, 2, 2] BasicBlocks
        self.layer1 = self.make_layer(block, 64, layers[0])
        self.layer2 = self.make_layer(block, 128, layers[0], stride=2)
        self.layer3 = self.make_layer(block, 256, layers[0], stride=2)
        self.layer4 = self.make_layer(block, 512, layers[0], stride=2)

        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        x = self.conv1(x) # 7x7 convolution
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x) # 3x3 pooling

        x = self.layer1(x) # residual block
        x = self.layer2(x) # residual block
        x = self.layer3(x) # residual block
        x = self.layer4(x) # residual block

        x = self.avgpool(x) # average pooling
        x = torch.flatten(x,1) # flatten
        x = self.fc(x) # fully connected layer

        return x

    def make_layer(self, block, out_channels, num_blocks, stride=1):
        layers = []
        downsample = None
        previous_dilation = self.dilation

        if stride != 1 or self.in_channels != out_channels * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.in_channels, out_channels * block.expansion, stride),
                self._norm_layer(out_channels * block.expansion)
            )

        layers.append(block(
            self.in_channels,
            out_channels,
            stride,
            downsample,
            self.groups,
            self.base_width,
            previous_dilation,
            self._norm_layer)
        )
        self.in_channels = out_channels * block.expansion

        for _ in range(1, num_blocks):
            layers.append(block(
                self.in_channels,
                out_channels,
                groups=self.groups,
                base_width=self.base_width,
                dilation=self.dilation,
                norm_layer=self._norm_layer
            ))

        return nn.Sequential(*layers)


### Train Model & Select Model

In [33]:
## codes
def validate(model, val_loader):
    # validate the model, return the accuracy
    correct = 0 # predict correct number
    total = 0   # total number
    with torch.no_grad():
        for data in val_loader:
            images, labels = data
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    return correct / total

model = ResNet18(53, BasicBlock, [2, 2, 2, 2])
model.to(device)

# define loss function
criterion = nn.CrossEntropyLoss()
# define optimizer,Adam is a kind of gradient descent algorithm,not sensitive to hyperparameters
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 30
correct = 0

# train the model
for epoch in range(num_epochs):
    total_loss = 0.0  # every epoch loss

    # use tqdm to show the progress bar
    progress_bar = tqdm(train_loader,desc='Epoch {:1d}'.format(epoch),leave=False,disable=False)

    for inputs, labels in progress_bar:
        # 
        inputs = inputs.to(device)
        labels = labels.to(device)
        # gradient clear
        optimizer.zero_grad()
        outputs = model(inputs)
        # calculate loss
        loss = criterion(outputs, labels)
        # backpropagation
        loss.backward()
        optimizer.step()  # update parameters

        total_loss += loss.item()

        progress_bar.set_postfix({'training_loss': '{:.3f}'.format(loss.item() / len(inputs))})

    # use validate dataset to validate the model, each 5 epoch
    if (epoch + 1) % 5 == 0:
        acc = validate(model, val_loader)
    #   save the best model
        if(acc > correct):
            correct = acc
            torch.save(model.state_dict(), 'model.ckpt')  # save model


                                                                              

---

## Performance

In [34]:
## codes

if(os.path.exists('model.ckpt')):
    model.load_state_dict(torch.load('model.ckpt'))
model.eval()  # set model to evaluation mode
correct = 0
total = 0

# test the model
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)
        # get the predicted label
        _, predicted = torch.max(outputs.data, dim=1)
        # calculate the accuracy
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Accuracy of the network on the test images: {} %'.format(100 * correct / total))


Accuracy of the network on the test images: 92.45283018867924 %


The results explains:

Accuracy of the network on the test images: **92.45283018867924 %**

- **Data Set**
  - We used a dataset containing a large number of playing card images during training and testing of ResNet-18.
- **Neural Network**
  - ResNet-18 is a relatively deep neural network containing 18 layers. This depth enables the network to learn more features and patterns to better distinguish images of playing cards.
- **Residual connections**
  - ResNet-18 introduced residual connections, allowing information to pass more freely between layers in the network. This helps prevent vanishing gradient problems and makes the network easier to train.
- **Data augmentation**
  - When training the data, we used the data augmentation technique of random flipping. This increases the diversity of the training data, helping the network generalize better to new images.
- **Optimization algorithm**
  - We use the efficient Adam optimization algorithm. Adam can automatically adjust the learning rate of each parameter, which helps speed up the convergence process and reduce training time and resource costs.

**Due to the above factors, the model achieved an accuracy of 92% in the test set.**