## Deep Learning Coding Project 2: Image Classification

Before we start, please put your **Chinese** name and student ID in the following format:

Name, 0000000000 // e.g., 傅炜, 2021123123

YOUR ANSWER HERE

## Introduction

We will use Python 3, [NumPy](https://numpy.org/), and [PyTorch](https://pytorch.org/) for this coding project. The example code has been tested under the latest stable release version.

### Task

In this notebook, you need to train a model to classify images. Given an image, you need to identify its category,
e.g., whether it is a horse or an automobile. There are 10 classes in total:
airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. We
release 40,000 images for training and 10,000 images for validation. Each image has
a shape of (3, 128, 128). We will evaluate your model on the 10,000 images of the test set.

Download the dataset from [here](https://cloud.tsinghua.edu.cn/d/00e0704738e04d32978b/) and place it in a folder named "cifar_10_4x".

### Coding

We provide a code template. You can add new cells and modify our example to train your own model. To run this code, you should:

+ implement your model (named `Net`) in `model.py`
+ implement your training loop in this notebook

Your final submitted model should not be larger than **20M**. **Using any pretrained model is NOT permitted**.
In addition, before you submit your result, **make sure you can test your model using our evaluation cell.** Name your best model "cifar10_4x_best.pth".

### Report & Submission

Your report should include:

1. the details of your model
2. all the hyper-parameters
3. all the tricks or training techniques you use
4. the training curve of your submitted model

Additional ablation studies and a description of how you improved your model are also encouraged.
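
For the training curve, one option is to record the metrics once per epoch and plot them after training. The following is a minimal sketch and not part of the template: `new_history`, `log_epoch`, and `plot_history` are hypothetical helper names, and the output file name is an assumption. The two example entries are the validation results that appear in this notebook's training log.

```python
def new_history():
    """Create an empty container for per-epoch validation metrics."""
    return {"epoch": [], "valid_loss": [], "valid_acc": []}


def log_epoch(history, epoch, valid_loss, valid_acc):
    """Append one epoch's validation loss and accuracy to the history."""
    history["epoch"].append(epoch)
    history["valid_loss"].append(valid_loss)
    history["valid_acc"].append(valid_acc)


def plot_history(history, path="training_curve.png"):
    """Plot validation loss and accuracy against the epoch and save the figure."""
    # Imported here so that logging still works where matplotlib is missing.
    import matplotlib.pyplot as plt

    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    ax_loss.plot(history["epoch"], history["valid_loss"], marker="o")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("validation loss")
    ax_acc.plot(history["epoch"], history["valid_acc"], marker="o")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("validation accuracy (%)")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


# Example: the two validation results from this notebook's training log
history = new_history()
log_epoch(history, 1, 1.256, 55.75)
log_epoch(history, 2, 1.093, 61.50)
```

Calling `plot_history(history)` after training writes the figure to disk. A per-epoch training loss could be tracked the same way by adding one more list to the history.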

You should submit:

+ all of your code
+ the model checkpoint (only "cifar10_4x_best.pth")
+ your report (a separate PDF file)

to Web Learning. We will use the evaluation code in this notebook to evaluate your model on the test set.

### Grading

We will grade this coding project based on the performance of your model (70%) and your report (30%). For the performance part, if your test accuracy is $X$, your score is

$\frac{\min(X, H) - 0.6}{H - 0.6} \times 7$

where $H = 0.9$ is the accuracy of the model trained by the TAs, i.e., you will get the full score if your test accuracy is at least 90%.
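
As a worked example of the formula, the sketch below evaluates it for a few accuracies. The function name `performance_score` and its parameter names are hypothetical. The text does not say whether scores below 60% accuracy are clamped at zero, so the sketch applies the formula as written.

```python
def performance_score(test_accuracy, ta_accuracy=0.9, floor=0.6, full_marks=7.0):
    """Apply the grading formula (min(X, H) - 0.6) / (H - 0.6) * 7."""
    capped = min(test_accuracy, ta_accuracy)  # accuracy above H earns nothing extra
    return (capped - floor) / (ta_accuracy - floor) * full_marks


for acc in (0.75, 0.90, 0.95):
    print(f"accuracy {acc:.2f} -> score {performance_score(acc):.2f}")
```

A test accuracy of 75% lies halfway between 60% and 90% and therefore earns about 3.5 of the 7 points, while 90% and 95% both earn the full 7.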

**Bonus**: The submission with the highest test accuracy will get 1 bonus point toward the final course grade.

**Avoid plagiarism! Any student who violates academic integrity will be dealt with seriously and will receive an F for the course.**

## Code Template

We have masked the training loop in this notebook for you to complete. You should also overwrite "model.py" and implement your own model.

In [3]:
%load_ext autoreload
%autoreload 2

### Setup Code

If you use Colab for this coding project, please uncomment the code, fill in `GOOGLE_DRIVE_PATH_AFTER_MYDRIVE`, and run the following cells to mount your Google Drive so that the notebook can find the required files. If you run the notebook locally, you can skip these cells.

In [2]:
# from google.colab import drive
# drive.mount('/content/drive')

In [3]:
# import os

# # TODO: Fill in the Google Drive path where you uploaded the assignment
# # Example: If you create a 2022SP folder and put all the files under CP1 folder, then '2022SP/CP1'
# # GOOGLE_DRIVE_PATH_AFTER_MYDRIVE = '2022SP/CP1'
# GOOGLE_DRIVE_PATH_AFTER_MYDRIVE = None 
# GOOGLE_DRIVE_PATH = os.path.join('drive', 'MyDrive', GOOGLE_DRIVE_PATH_AFTER_MYDRIVE)
# print(os.listdir(GOOGLE_DRIVE_PATH))

In [4]:
# import sys
# sys.path.append(GOOGLE_DRIVE_PATH)

In [4]:
from dataset import CIFAR10_4x
from evaluation import evaluation

from model import Net  # this should be implemented by yourself

### Enjoy Your Coding Time!

In [5]:
import math
import os
import random
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

from torchvision import transforms
from PIL import Image


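def set_seed(seed):
    """Seed the Python, NumPy, and PyTorch random number generators."""
    seed = int(seed)
    # NumPy only accepts seeds in the range [0, 2**32 - 1]
    if seed < 0 or seed > (2**32 - 1):
        raise ValueError("Seed must be between 0 and 2**32 - 1")
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    # Ask cuDNN to use deterministic algorithms
    torch.backends.cudnn.deterministic = True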


device = 'cuda' if torch.cuda.is_available() else 'cpu'
set_seed(16)

In [6]:
data_root_dir = '.'

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([125 / 255, 124 / 255, 115 / 255],
                         [60 / 255, 59 / 255, 64 / 255])
])

trainset = CIFAR10_4x(root=data_root_dir,
                      split="train", transform=transform)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=32, shuffle=True, num_workers=8, pin_memory=True)

validset = CIFAR10_4x(root=data_root_dir,
                      split='valid', transform=transform)
validloader = torch.utils.data.DataLoader(
    validset, batch_size=128, shuffle=False, num_workers=8)

net = Net()
print("number of trained parameters: %d" % (
    sum([param.nelement() for param in net.parameters() if param.requires_grad])))
print("number of total parameters: %d" %
      (sum([param.nelement() for param in net.parameters()])))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

net.to(device)

number of trained parameters: 48938
number of total parameters: 48938


Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 12, kernel_size=(5, 5), stride=(1, 1))
  (conv3): Conv2d(12, 16, kernel_size=(5, 5), stride=(3, 3))
  (fc1): Linear(in_features=256, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
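
The reported parameter count can be reproduced by hand from the printed architecture. The sketch below is only an illustration: the helper names are hypothetical, and the shape trace assumes that each convolution is followed by the 2x2 max pooling layer, which the printout does not show because `forward` is defined in `model.py`.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus one bias per output channel for a square-kernel Conv2d."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def linear_params(in_features, out_features):
    """Weights plus one bias per output feature for a Linear layer."""
    return in_features * out_features + out_features


layer_params = {
    "conv1": conv2d_params(3, 6, 5),
    "conv2": conv2d_params(6, 12, 5),
    "conv3": conv2d_params(12, 16, 5),
    "fc1": linear_params(256, 120),
    "fc2": linear_params(120, 84),
    "fc3": linear_params(84, 10),
}
total_params = sum(layer_params.values())
print(layer_params)
print("total parameters:", total_params)
print("approx. float32 bytes:", total_params * 4)  # 4 bytes per parameter


def conv_out(size, kernel, stride=1):
    """Output width of an unpadded convolution."""
    return (size - kernel) // stride + 1


# Assumed order: conv -> pool, three times, starting from 128x128 inputs
size = 128
for kernel, stride in ((5, 1), (5, 1), (5, 3)):
    size = conv_out(size, kernel, stride)  # convolution without padding
    size = size // 2                       # 2x2 max pooling with stride 2
flattened = 16 * size * size               # 16 channels come out of conv3
print("flattened features:", flattened)
```

The total comes to 48938, matching the printed count, and under the stated assumption the flattened feature size is 256, which agrees with `in_features=256` of `fc1`.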

In [7]:
model_dir = '.'
if not os.path.exists(model_dir):
    os.makedirs(model_dir)
torch.save(net, os.path.join(model_dir, 'cifar10_4x_0.pth'))

# check the model size
os.system(' '.join(['du', '-h', os.path.join(model_dir, 'cifar10_4x_0.pth')]))

1
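
The `1` above is the return value of `os.system`, not a file size. A nonzero value generally means the command did not succeed, which is likely here because `du` is not available by default on Windows. A portable alternative is to ask Python for the file size. This is a minimal sketch: `file_size_mb` is a hypothetical helper, and reading the 20M limit as 20 MB of file size is an assumption.

```python
import os
import tempfile


def file_size_mb(path):
    """Return the size of a file in megabytes (1 MB = 1024 * 1024 bytes here)."""
    return os.path.getsize(path) / (1024 * 1024)


# Demonstration on a throwaway 2 MB file; in the notebook, pass the checkpoint path.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.bin")
    with open(demo_path, "wb") as f:
        f.write(b"\0" * (2 * 1024 * 1024))
    demo_size = file_size_mb(demo_path)
    print(f"{demo_size:.2f} MB, within limit: {demo_size <= 20}")
```

In this notebook the call would be `file_size_mb(os.path.join(model_dir, 'cifar10_4x_0.pth'))`.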

In [8]:
##############################################################################
#                  TODO: You need to complete the code here                  #
##############################################################################
# YOUR CODE HERE
# --- Preparation ---
# Make sure the directory for saved models exists
model_save_dir = "./models"
os.makedirs(model_save_dir, exist_ok=True)
model_path = os.path.join(model_save_dir, "cifar10_4x_best.pth")

# --- Training parameters ---
num_epochs = 20  # adjust the number of epochs as needed
best_valid_accuracy = 0.0  # best validation accuracy seen so far

print("Starting training...")

# --- Training loop ---
for epoch in range(num_epochs):
    # --- Training phase ---
    net.train()  # put the model in training mode
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # Get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)
        # Zero the parameter gradients
        optimizer.zero_grad()
        # Forward pass + backward pass + optimizer step
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        # Print statistics
        running_loss += loss.item()
        if i % 100 == 99:    # print every 100 mini-batches
            print(f'[Epoch {epoch + 1}, Batch {i + 1:5d}] training loss: {running_loss / 100:.3f}')
            running_loss = 0.0

    # --- Validation phase ---
    net.eval()  # put the model in evaluation mode
    correct = 0
    total = 0
    valid_loss = 0.0
    with torch.no_grad():
        for data in validloader:
            images, labels = data
            images, labels = images.to(device), labels.to(device)
            outputs = net(images)
            loss = criterion(outputs, labels)
            valid_loss += loss.item()
            # The predicted class is the index of the largest logit
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    current_accuracy = 100 * correct / total
    avg_valid_loss = valid_loss / len(validloader)
    print(f'Epoch {epoch + 1} finished | '
          f'validation loss: {avg_valid_loss:.3f} | '
          f'validation accuracy: {current_accuracy:.2f} %')

    # --- Save the best model ---
    if current_accuracy > best_valid_accuracy:
        best_valid_accuracy = current_accuracy
        torch.save(net, model_path)
        print(f'New best model found! Saved to {model_path}')

print('Training finished!')
print(f'Best validation accuracy: {best_valid_accuracy:.2f} %')
##############################################################################
#                              END OF YOUR CODE                              #
##############################################################################

Starting training...
[Epoch 1, Batch   100] training loss: 2.049
[Epoch 1, Batch   200] training loss: 1.813
[Epoch 1, Batch   300] training loss: 1.693
[Epoch 1, Batch   400] training loss: 1.612
[Epoch 1, Batch   500] training loss: 1.518
[Epoch 1, Batch   600] training loss: 1.506
[Epoch 1, Batch   700] training loss: 1.458
[Epoch 1, Batch   800] training loss: 1.422
[Epoch 1, Batch   900] training loss: 1.390
[Epoch 1, Batch  1000] training loss: 1.346
[Epoch 1, Batch  1100] training loss: 1.333
[Epoch 1, Batch  1200] training loss: 1.324
Epoch 1 finished | validation loss: 1.256 | validation accuracy: 55.75 %
New best model found! Saved to ./models\cifar10_4x_best.pth
[Epoch 2, Batch   100] training loss: 1.218
[Epoch 2, Batch   200] training loss: 1.225
[Epoch 2, Batch   300] training loss: 1.196
[Epoch 2, Batch   400] training loss: 1.188
[Epoch 2, Batch   500] training loss: 1.160
[Epoch 2, Batch   600] training loss: 1.166
[Epoch 2, Batch   700] training loss: 1.140
[Epoch 2, Batch   800] training loss: 1.131
[Epoch 2, Batch   900] training loss: 1.127
[Epoch 2, Batch  1000] training loss: 1.155
[Epoch 2, Batch  1100] training loss: 1.107
[Epoch 2, Batch  1200] training loss: 1.073
Epoch 2 finished | validation loss: 1.093 | validation accuracy: 61.50 %
New best model found! Saved to ./models\

KeyboardInterrupt: 

## Evaluation

Before submission, please run the following cell to make sure your model can be correctly graded.

In [11]:
!python evaluation.py
# net = torch.load(os.path.join(base_dir, "models/cifar10_4x_best.pth"))
# The line above loads the entire model object directly.
# torch.save(net.state_dict(), model_path) saves only the parameters.
# torch.save(net, model_path) saves the complete model, architecture included.

number of trained parameters: 48938
number of total parameters: 48938
can't load test set because [Errno 2] No such file or directory: 'F:\\desktop\\CodingProject2\\CodingProject2\\cifar_10_4x\\test', load valid set now
Accuracy of the network on the valid images: 66 %
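
The comments in the evaluation cell contrast two ways of saving a model. The sketch below illustrates the parameters-only variant on a tiny stand-in network. `TinyNet` and the file name are hypothetical, and this is only an illustration: since the evaluation code appears to load the whole model object, the submitted checkpoint should presumably still be written with `torch.save(net, model_path)`.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for `Net`, kept tiny for the demonstration."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


net = TinyNet()
x = torch.randn(3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    params_path = os.path.join(tmp_dir, "params_only.pth")

    # Save only the parameters (a dict mapping parameter names to tensors)
    torch.save(net.state_dict(), params_path)

    # Loading parameters requires constructing the architecture first
    restored = TinyNet()
    restored.load_state_dict(torch.load(params_path, map_location="cpu"))
    restored.eval()

# Both networks now hold identical weights, so their outputs agree
with torch.no_grad():
    same_output = torch.allclose(net(x), restored(x))
print("outputs match:", same_output)
```

A parameters-only file can be loaded only where the class definition is available, whereas a fully saved model pickles the object itself. Depending on the PyTorch version, loading a fully saved model with `torch.load` may require passing `weights_only=False`.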
