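## Scene Recognition
Note: running this code requires a specific dataset, and the dataset is fairly large. The code is therefore only listed below as Jupyter notebook cells, and the results of running it are shown as screenshots.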


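Given a batch of images, use a deep learning algorithm to determine which building each image shows a part of. The test images are not known in advance. A few points about the test set:
* All test images are taken from a front-view perspective.
* A test image may contain part of one of the designated buildings, part of a non-designated building, or no building to be recognized at all.
* The final classifier must output the values listed in the table below.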

<img src="pic/分类类别.png" style="zoom:50%;" />

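#### 1. Requirements
1. Classify samples of 20 different buildings. The dataset has to be collected by the participants themselves.
2. The test set contains no night-time photos.
3. Images are collected mainly with ordinary mobile phones, and the collected images are then cropped and scaled. Photos are taken from a front-view perspective; no top-down drone shots of buildings are used. Front view means how a building looks to a person walking normally along a road. It does not mean that the camera is level and pointed straight at the building, and the test set also contains photos of building sides. Shooting positions are generally near main roads; nobody deliberately goes to obscure corners around a building to take photos.
4. The test set includes a certain proportion of negative samples, covering both buildings and non-building scenes.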

#### 2. Requirements Analysis
1. Collect the data samples. The collected dataset has the following folder hierarchy:
  
<img src="pic/数据集.png" style="zoom:50%;" />

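2. After collection, the data needs to be processed:
- Data cleaning: remove images that do not meet the requirements, such as images in which trees or other obstructions cover more than 80% of the frame, and accidental shots.
- Data preprocessing: unify the data format and, to keep the data volume manageable, downscale the original images to about 1,000,000 pixels (1 megapixel).
- Data augmentation: random RGB shift (±10), random brightness and contrast, random affine transform, and random crop. To balance the sample counts across classes, each class is augmented to about 500 images.
  The processing result is shown below: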
<img src="pic/数据增强结果.png" style="zoom:50%;" />

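3. The 20 building classes plus the negative class give 21 classes in total. Negative samples can come from self-collected photos, building datasets found online, and related building photos from the web.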
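
The two numeric rules above (the cap of about 1,000,000 pixels and the target of roughly 500 images per class) reduce to simple arithmetic. The sketch below is illustrative only: `scaled_size` and `copies_per_image` are hypothetical helper names that do not appear in the notebook, and the example inputs are made up.

```python
import math


def scaled_size(width, height, target_pixels=1_000_000):
    """Return (width, height) shrunk so the area is at most target_pixels,
    keeping the aspect ratio. Images already under the budget are unchanged."""
    area = width * height
    if area <= target_pixels:
        return width, height
    scale = math.sqrt(target_pixels / area)
    return int(width * scale), int(height * scale)


def copies_per_image(num_originals, target_per_class=500):
    """Number of augmented copies to generate per original image so that the
    class ends up with at least target_per_class images in total."""
    if num_originals <= 0:
        raise ValueError("class has no images")
    missing = max(0, target_per_class - num_originals)
    return math.ceil(missing / num_originals)


print(scaled_size(4000, 3000))   # a 12-megapixel phone photo
print(copies_per_image(125))     # a class with 125 original photos
```

Scaling both sides by the square root of the area ratio is what keeps the aspect ratio fixed while hitting the pixel budget. A class with far fewer originals would need more copies per image than a class that is already close to the target.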

In [None]:
import albumentations as A
import cv2
import os
from PIL import Image
import numpy as np

# Dataset root directory
dataset_root = "/home/yjq/dataset_augmentation"
target_total_pixels = 1000000
max_num = 500
labels = [
    "00_负样本",
    "01_天河大楼",
    "02_体育馆",
    "03_航院主楼",
    "04_01教学楼",
    "05_02教学楼",
    "06_03教学楼",
    "07_图书馆",
    "08_东跨线桥",
    "09_西跨线桥",
    "10_游泳馆",
    "11_博士生楼",
    "12_俱乐部",
    "13_银河大楼",
    "14_老图书馆",
    "15_三院1号楼（主楼）",
    "16_三院2号楼（老楼）",
    "17_海天楼",
    "18_四院主楼",
    "19_北斗",
    "20_校主楼",
]

# Data augmentation pipeline
transform = A.Compose(
    [
        A.RGBShift(r_shift_limit=5, g_shift_limit=5, b_shift_limit=5, p=0.5),
        A.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.2, p=0.5),
        A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.05, rotate_limit=30, p=1),
    ]
)

# Walk the dataset, downscale each large image to about 1,000,000 pixels, and save
# the downscaled image plus 3 augmented copies into the same class folder
for category_folder in os.listdir(dataset_root):
    category_folder_path = os.path.join(dataset_root, category_folder)
    if not os.path.isdir(category_folder_path):
        continue
    print(f"Processing category: {category_folder}")
    output_folder_path = category_folder_path
    # Running count of images in this class folder
    num_cnt = len(os.listdir(category_folder_path))
    for filename in os.listdir(category_folder_path):
        file_path = os.path.join(category_folder_path, filename)
        if not os.path.isfile(file_path):
            continue
        stem = os.path.splitext(filename)[0]
        # Convert to RGB so the JPEG save and the RGB->BGR conversion below always work
        image = Image.open(file_path).convert("RGB")
        width, height = image.size
        if (width * height) < target_total_pixels:
            # Images already under the pixel budget (including output of an
            # earlier run) are skipped entirely
            continue
        scale_factor = (target_total_pixels / (width * height)) ** 0.5
        target_width = int(width * scale_factor)
        target_height = int(height * scale_factor)
        image = image.resize((target_width, target_height), resample=Image.BILINEAR)
        image.save(os.path.join(output_folder_path, stem + ".jpg"))
        image = np.array(image)
        for i in range(3):
            output_filename = os.path.join(output_folder_path, f"{stem}_{i}.jpg")
            transformed = transform(image=image)["image"]
            transformed = cv2.cvtColor(transformed, cv2.COLOR_RGB2BGR)
            # imencode + tofile also works for paths with non-ASCII characters
            cv2.imencode(".jpg", transformed)[1].tofile(output_filename)
            num_cnt = num_cnt + 1
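
After augmentation it is worth checking that the classes really are close to the intended size. The following is a minimal sketch using only the standard library; `count_images_per_class`, `report_imbalance`, the extension set and the 20% tolerance are all assumptions made for illustration, not part of the notebook.

```python
import os
from collections import Counter

# File extensions treated as images (an assumption; adjust as needed)
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root):
    """Map each class sub-folder of root to its number of image files."""
    counts = Counter()
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue
        counts[class_name] = sum(
            1
            for name in os.listdir(class_dir)
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )
    return dict(counts)


def report_imbalance(counts, target=500, tolerance=0.2):
    """Return the classes whose size deviates from target by more than
    tolerance (given as a fraction of target)."""
    return {
        name: n
        for name, n in counts.items()
        if abs(n - target) > tolerance * target
    }
```

Calling `report_imbalance(count_images_per_class(dataset_root))` would list the classes that still need more (or fewer) images.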

#### 3. Training Environment
* Hardware: 13th Gen Intel® Core™ i7-13700KF × 24 CPU + RTX 4070Ti GPU
* Ubuntu 20.04 + CUDA 12.2
* VS Code as the development environment

<img src="pic/环境.png" style="zoom:50%;" />

#### 4. Results

1. Fine-tune a ResNet50 model.
2. The fine-tuned ResNet50 model reaches about 89% accuracy on the test set.
   
<img src="pic/restnet.png" style="zoom:50%;" />

3. Loss curve

<img src="pic/Loss.png" style="zoom:15%;" />
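
Each point on a curve like this is a per-epoch average. The training code in this notebook accumulates `loss.item() * inputs.size(0)` and divides by the dataset size, which weights every sample equally even when the last batch is smaller than the others. The plain-Python sketch below uses made-up batch losses to show how that differs from a simple mean over batches; `epoch_mean_loss` is a hypothetical name.

```python
def epoch_mean_loss(batch_losses, batch_sizes):
    """Per-sample mean loss over an epoch.

    batch_losses holds the mean loss of each batch (what a loss function
    with mean reduction returns); batch_sizes holds the sample counts.
    """
    if len(batch_losses) != len(batch_sizes):
        raise ValueError("one size per batch is required")
    total = sum(loss * size for loss, size in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


# Illustrative numbers: two full batches of 32 and a final batch of 6.
losses = [0.5, 0.25, 1.0]
sizes = [32, 32, 6]

weighted = epoch_mean_loss(losses, sizes)
naive = sum(losses) / len(losses)
print(f"weighted={weighted:.4f} naive={naive:.4f}")
```

The naive mean lets the small final batch count as much as a full one, so a noisy last batch can distort the curve.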

In [None]:
# Load the data
import torch
import torchvision
import torch.nn as nn
import torch.optim as optim
from torchvision import models, transforms
import torch.utils.data as tud
import numpy as np
import matplotlib.pyplot as plt
from model.inception_resnet_v2 import Inception_ResNetv2
from torch.utils.tensorboard import SummaryWriter

dataset_root = "/home/yjq/dataset_augmentation"
global_model_name = "resnet50"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
writer = SummaryWriter("logs")

batch_size = 32
input_size = 256
num_class = 21

f = open("result_" + global_model_name + ".txt", "w")

dataset = torchvision.datasets.ImageFolder(
    root=dataset_root,
    transform=torchvision.transforms.Compose(
        [
            torchvision.transforms.Resize(280),  # resize the shorter image side to 280
            torchvision.transforms.CenterCrop(input_size),
            torchvision.transforms.ToTensor(),
        ]
    ),
)
print(dataset, "\n")
print("classes:\n", dataset.classes, "\n")

# Split dataset into train and test (7:3)
train_dataset, test_dataset = torch.utils.data.random_split(
    dataset, [int(len(dataset) * 0.7), len(dataset) - int(len(dataset) * 0.7)]
)
train_dataloader = tud.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_dataloader = tud.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

def initialize_model(model_name, num_class, use_pretrained, feature_extract):
    if model_name == "resnet50":
        model_ft = models.resnet50(pretrained=use_pretrained)
        if feature_extract:  # do not update the parameters
            for param in model_ft.parameters():
                param.requires_grad = False
        num_ftrs = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_ftrs, num_class)
    else:
        print("model not implemented")
        return None
    return model_ft

def train_model(model, train_dataloader, loss_fn, optimizer, epoch):
    model = model.to(device)
    model.train()
    total_loss = 0.0
    total_corrects = 0.0
    for idx, (inputs, labels) in enumerate(train_dataloader):
        inputs = inputs.to(device)
        labels = labels.to(device)
        outputs = model(inputs)
        loss = loss_fn(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        preds = outputs.argmax(dim=1)
        total_loss += loss.item() * inputs.size(0)
        total_corrects += torch.sum(preds.eq(labels))
    epoch_loss = total_loss / len(train_dataloader.dataset)
    epoch_accuracy = total_corrects / len(train_dataloader.dataset)
    f.write(
        "Epoch:{}, Training Loss:{}, Training Acc:{}\n".format(
            epoch, epoch_loss, epoch_accuracy
        )
    )
    print(
        "Epoch:{}, Training Loss:{}, Training Acc:{}\n".format(
            epoch, epoch_loss, epoch_accuracy
        )
    )
    # add_scalar takes (tag, scalar_value, global_step)
    writer.add_scalar("Loss/train", epoch_loss, epoch)
    writer.add_scalar("Accuracy/train", epoch_accuracy, epoch)


def test_model(model, test_dataloader, loss_fn):
    model.eval()
    total_loss = 0.0
    total_corrects = 0.0
    with torch.no_grad():
        for idx, (inputs, labels) in enumerate(test_dataloader):
            inputs = inputs.to(device)
            labels = labels.to(device)
            outputs = model(inputs)
            loss = loss_fn(outputs, labels)
            preds = outputs.argmax(dim=1)
            total_loss += loss.item() * inputs.size(0)
            total_corrects += torch.sum(preds.eq(labels))
    epoch_loss = total_loss / len(test_dataloader.dataset)
    epoch_accuracy = total_corrects / len(test_dataloader.dataset)
    f.write("Test Loss:{}, Test Acc:{}\n".format(epoch_loss, epoch_accuracy))
    print("Test Loss:{}, Test Acc:{}\n".format(epoch_loss, epoch_accuracy))
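    # `epoch` is the global loop variable set by the training loop;
    # add_scalar takes (tag, scalar_value, global_step)
    writer.add_scalar("Loss/test", epoch_loss, epoch)
    writer.add_scalar("Accuracy/test", epoch_accuracy, epoch)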
    return epoch_accuracy

The dataset output is:

<img src="pic/datasets.png" style="zoom:50%;" />
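
`ImageFolder` assigns class indices by sorting the sub-folder names, which is why the two-digit prefixes in `labels` keep index `i` aligned with the i-th entry of that list. The sketch below mimics the sorting rule in plain Python, without torchvision; the folder names are a shortened, hypothetical subset.

```python
def class_to_index(folder_names):
    """Assign indices the way a sorted-folder dataset does: alphabetical order."""
    return {name: idx for idx, name in enumerate(sorted(folder_names))}


# With zero-padded prefixes the sorted order equals the intended label order.
padded = ["02_gym", "00_negative", "10_pool", "01_tower"]
print(class_to_index(padded))

# Without padding, "10_pool" sorts before "1_tower" and "2_gym",
# so the indices no longer follow the numeric prefixes.
unpadded = ["2_gym", "0_negative", "10_pool", "1_tower"]
print(class_to_index(unpadded))
```

Keeping the negative class as `00_...` also guarantees that it maps to index 0, which the prediction cell later relies on.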

In [None]:
# Run the training
model = initialize_model(global_model_name, num_class, use_pretrained=True, feature_extract=True)
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
num_epochs = 80
best_epoch = 0
best_acc = 0.95
test_accuracy_hist = []
for epoch in range(num_epochs):
    train_model(model, train_dataloader, loss_fn, optimizer, epoch)
    acc = test_model(model, test_dataloader, loss_fn)
    test_accuracy_hist.append(acc.item())
    for name, param in model.named_parameters():
        writer.add_histogram(name, param, epoch)
        # Frozen layers (feature_extract=True) have no gradient, so skip them
        if param.grad is not None:
            writer.add_histogram(f"{name}.grad", param.grad, epoch)
    if acc > best_acc:
        best_acc = acc
        best_epoch = epoch
        torch.save(model.state_dict(), "Resnet50_best.pth")
    if (epoch + 1) % 10 == 0:
        torch.save(
            model.state_dict(), global_model_name + "_" + str(epoch + 1) + ".pth"
        )
f.close()
writer.close()
torch.save(model.state_dict(), global_model_name + "_" + str(epoch + 1) + ".pth")
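
The loop above writes a "best" checkpoint only when the test accuracy exceeds the initial `best_acc` of 0.95. The sketch below replays that rule on made-up accuracy histories to show the consequence: a run that plateaus below the threshold never writes a best checkpoint and is covered only by the periodic saves. `best_checkpoint_epochs` is a hypothetical helper, and the numbers are not measured results.

```python
def best_checkpoint_epochs(accuracy_history, initial_best=0.95):
    """Return the epochs at which a 'best' checkpoint would be saved.

    A checkpoint is saved whenever the accuracy strictly exceeds the best
    value seen so far, starting from initial_best.
    """
    best = initial_best
    saved_at = []
    for epoch, acc in enumerate(accuracy_history):
        if acc > best:
            best = acc
            saved_at.append(epoch)
    return saved_at


# Illustrative histories only.
plateau_run = [0.70, 0.85, 0.89, 0.88]
strong_run = [0.90, 0.96, 0.955, 0.97]

print(best_checkpoint_epochs(plateau_run))
print(best_checkpoint_epochs(strong_run))
print(best_checkpoint_epochs(plateau_run, initial_best=0.0))
```

Starting from `initial_best=0.0` would always keep the best epoch seen so far, at the cost of more frequent writes early in training.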

1. Train with the Inception_Resnet_v2 network.
2. The training results show that Inception_Resnet_v2 reaches 97% accuracy on the test set, which meets the needs of the task.

<img src="pic/inception_resnet2.png" style="zoom:50%;" />

3. During training, nvitop was used to monitor GPU utilization and System Monitor to monitor CPU usage.

<img src="pic/nvitop.png" style="zoom:50%;" />

<img src="pic/cpu.png" style="zoom:50%;" />

In [None]:
# Build the Inception_Resnet_v2 network
# (the top-level class defined in this cell is named Inceptionv4)
class Conv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, padding, stride=1, bias=True):
        super(Conv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride=stride, padding=padding, bias=bias)
        self.bn = nn.BatchNorm2d(out_channels, eps=0.001, momentum=0.1)
        self.relu = nn.ReLU(inplace=True)
    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        x = self.relu(x)
        return x


class Reduction_A(nn.Module):
    # 35 -> 17
    def __init__(self, in_channels, k, l, m, n):
        super(Reduction_A, self).__init__()
        self.branch_0 = Conv2d(in_channels, n, 3, stride=2, padding=0, bias=False)
        self.branch_1 = nn.Sequential(
            Conv2d(in_channels, k, 1, stride=1, padding=0, bias=False),
            Conv2d(k, l, 3, stride=1, padding=1, bias=False),
            Conv2d(l, m, 3, stride=2, padding=0, bias=False),
        )
        self.branch_2 = nn.MaxPool2d(3, stride=2, padding=0)

    def forward(self, x):
        x0 = self.branch_0(x)
        x1 = self.branch_1(x)
        x2 = self.branch_2(x)
        return torch.cat((x0, x1, x2), dim=1) # 17 x 17 x 1024

class Stem(nn.Module):
    def __init__(self, in_channels):
        super(Stem, self).__init__()
        self.conv2d_1a_3x3 = Conv2d(in_channels, 32, 3, stride=2, padding=0, bias=False)

        self.conv2d_2a_3x3 = Conv2d(32, 32, 3, stride=1, padding=0, bias=False)
        self.conv2d_2b_3x3 = Conv2d(32, 64, 3, stride=1, padding=1, bias=False)

        self.mixed_3a_branch_0 = nn.MaxPool2d(3, stride=2, padding=0)
        self.mixed_3a_branch_1 = Conv2d(64, 96, 3, stride=2, padding=0, bias=False)

        self.mixed_4a_branch_0 = nn.Sequential(
            Conv2d(160, 64, 1, stride=1, padding=0, bias=False),
            Conv2d(64, 96, 3, stride=1, padding=0, bias=False),
        )
        self.mixed_4a_branch_1 = nn.Sequential(
            Conv2d(160, 64, 1, stride=1, padding=0, bias=False),
            Conv2d(64, 64, (1, 7), stride=1, padding=(0, 3), bias=False),
            Conv2d(64, 64, (7, 1), stride=1, padding=(3, 0), bias=False),
            Conv2d(64, 96, 3, stride=1, padding=0, bias=False)
        )

        self.mixed_5a_branch_0 = Conv2d(192, 192, 3, stride=2, padding=0, bias=False)
        self.mixed_5a_branch_1 = nn.MaxPool2d(3, stride=2, padding=0)

    def forward(self, x):
        x = self.conv2d_1a_3x3(x) # 149 x 149 x 32
        x = self.conv2d_2a_3x3(x) # 147 x 147 x 32
        x = self.conv2d_2b_3x3(x) # 147 x 147 x 64
        x0 = self.mixed_3a_branch_0(x)
        x1 = self.mixed_3a_branch_1(x)
        x = torch.cat((x0, x1), dim=1) # 73 x 73 x 160
        x0 = self.mixed_4a_branch_0(x)
        x1 = self.mixed_4a_branch_1(x)
        x = torch.cat((x0, x1), dim=1) # 71 x 71 x 192
        x0 = self.mixed_5a_branch_0(x)
        x1 = self.mixed_5a_branch_1(x)
        x = torch.cat((x0, x1), dim=1) # 35 x 35 x 384
        return x


class Inception_A(nn.Module):
    def __init__(self, in_channels):
        super(Inception_A, self).__init__()
        self.branch_0 = Conv2d(in_channels, 96, 1, stride=1, padding=0, bias=False)
        self.branch_1 = nn.Sequential(
            Conv2d(in_channels, 64, 1, stride=1, padding=0, bias=False),
            Conv2d(64, 96, 3, stride=1, padding=1, bias=False),
        )
        self.branch_2 = nn.Sequential(
            Conv2d(in_channels, 64, 1, stride=1, padding=0, bias=False),
            Conv2d(64, 96, 3, stride=1, padding=1, bias=False),
            Conv2d(96, 96, 3, stride=1, padding=1, bias=False),
        )
        self.branch_3 = nn.Sequential(
            nn.AvgPool2d(3, 1, padding=1, count_include_pad=False),
            Conv2d(in_channels, 96, 1, stride=1, padding=0, bias=False)
        )

    def forward(self, x):
        x0 = self.branch_0(x)
        x1 = self.branch_1(x)
        x2 = self.branch_2(x)
        x3 = self.branch_3(x)
        return torch.cat((x0, x1, x2, x3), dim=1)


class Inception_B(nn.Module):
    def __init__(self, in_channels):
        super(Inception_B, self).__init__()
        self.branch_0 = Conv2d(in_channels, 384, 1, stride=1, padding=0, bias=False)
        self.branch_1 = nn.Sequential(
            Conv2d(in_channels, 192, 1, stride=1, padding=0, bias=False),
            Conv2d(192, 224, (1, 7), stride=1, padding=(0, 3), bias=False),
            Conv2d(224, 256, (7, 1), stride=1, padding=(3, 0), bias=False),
        )
        self.branch_2 = nn.Sequential(
            Conv2d(in_channels, 192, 1, stride=1, padding=0, bias=False),
            Conv2d(192, 192, (7, 1), stride=1, padding=(3, 0), bias=False),
            Conv2d(192, 224, (1, 7), stride=1, padding=(0, 3), bias=False),
            Conv2d(224, 224, (7, 1), stride=1, padding=(3, 0), bias=False),
            Conv2d(224, 256, (1, 7), stride=1, padding=(0, 3), bias=False)
        )
        self.branch_3 = nn.Sequential(
            nn.AvgPool2d(3, stride=1, padding=1, count_include_pad=False),
            Conv2d(in_channels, 128, 1, stride=1, padding=0, bias=False)
        )
    def forward(self, x):
        x0 = self.branch_0(x)
        x1 = self.branch_1(x)
        x2 = self.branch_2(x)
        x3 = self.branch_3(x)
        return torch.cat((x0, x1, x2, x3), dim=1)


class Reduction_B(nn.Module):
    # 17 -> 8
    def __init__(self, in_channels):
        super(Reduction_B, self).__init__()
        self.branch_0 = nn.Sequential(
            Conv2d(in_channels, 192, 1, stride=1, padding=0, bias=False),
            Conv2d(192, 192, 3, stride=2, padding=0, bias=False),
        )
        self.branch_1 = nn.Sequential(
            Conv2d(in_channels, 256, 1, stride=1, padding=0, bias=False),
            Conv2d(256, 256, (1, 7), stride=1, padding=(0, 3), bias=False),
            Conv2d(256, 320, (7, 1), stride=1, padding=(3, 0), bias=False),
            Conv2d(320, 320, 3, stride=2, padding=0, bias=False)
        )
        self.branch_2 = nn.MaxPool2d(3, stride=2, padding=0)

    def forward(self, x):
        x0 = self.branch_0(x)
        x1 = self.branch_1(x)
        x2 = self.branch_2(x)
        return torch.cat((x0, x1, x2), dim=1)  # 8 x 8 x 1536


class Inception_C(nn.Module):
    def __init__(self, in_channels):
        super(Inception_C, self).__init__()
        self.branch_0 = Conv2d(in_channels, 256, 1, stride=1, padding=0, bias=False)

        self.branch_1 = Conv2d(in_channels, 384, 1, stride=1, padding=0, bias=False)
        self.branch_1_1 = Conv2d(384, 256, (1, 3), stride=1, padding=(0, 1), bias=False)
        self.branch_1_2 = Conv2d(384, 256, (3, 1), stride=1, padding=(1, 0), bias=False)

        self.branch_2 = nn.Sequential(
            Conv2d(in_channels, 384, 1, stride=1, padding=0, bias=False),
            Conv2d(384, 448, (3, 1), stride=1, padding=(1, 0), bias=False),
            Conv2d(448, 512, (1, 3), stride=1, padding=(0, 1), bias=False),
        )
        self.branch_2_1 = Conv2d(512, 256, (1, 3), stride=1, padding=(0, 1), bias=False)
        self.branch_2_2 = Conv2d(512, 256, (3, 1), stride=1, padding=(1, 0), bias=False)

        self.branch_3 = nn.Sequential(
            nn.AvgPool2d(3, stride=1, padding=1, count_include_pad=False),
            Conv2d(in_channels, 256, 1, stride=1, padding=0, bias=False)
        )

    def forward(self, x):
        x0 = self.branch_0(x)
        x1 = self.branch_1(x)
        x1_1 = self.branch_1_1(x1)
        x1_2 = self.branch_1_2(x1)
        x1 = torch.cat((x1_1, x1_2), 1)
        x2 = self.branch_2(x)
        x2_1 = self.branch_2_1(x2)
        x2_2 = self.branch_2_2(x2)
        x2 = torch.cat((x2_1, x2_2), dim=1)
        x3 = self.branch_3(x)
        return torch.cat((x0, x1, x2, x3), dim=1) # 8 x 8 x 1536


class Inceptionv4(nn.Module):
    def __init__(self, in_channels=3, classes=1000, k=192, l=224, m=256, n=384):
        super(Inceptionv4, self).__init__()
        blocks = []
        blocks.append(Stem(in_channels))
        for i in range(4):
            blocks.append(Inception_A(384))
        blocks.append(Reduction_A(384, k, l, m, n))
        for i in range(7):
            blocks.append(Inception_B(1024))
        blocks.append(Reduction_B(1024))
        for i in range(3):
            blocks.append(Inception_C(1536))
        self.features = nn.Sequential(*blocks)
        self.global_average_pooling = nn.AdaptiveAvgPool2d((1, 1))
        self.linear = nn.Linear(1536, classes)

    def forward(self, x):
        x = self.features(x)
        x = self.global_average_pooling(x)
        x = x.view(x.size(0), -1)
        x = self.linear(x)
        return x
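
The shape comments in the cell above assume a 299 x 299 input. The sketch below recomputes them with the usual output-size formula, floor((size + 2 * padding - kernel) / stride) + 1, using the kernel, stride and padding values from the cell. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv or pooling layer (no dilation, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) of one representative layer per stage.
# Within a stage, all parallel branches produce the same side length.
STAGES = [
    ("conv2d_1a_3x3", 3, 2, 0),
    ("conv2d_2a_3x3", 3, 1, 0),
    ("conv2d_2b_3x3", 3, 1, 1),
    ("mixed_3a", 3, 2, 0),
    ("mixed_4a", 3, 1, 0),
    ("mixed_5a", 3, 2, 0),
    ("Reduction_A", 3, 2, 0),
    ("Reduction_B", 3, 2, 0),
]


def trace_sizes(input_size=299):
    """Return the side length after each listed stage."""
    sizes = []
    size = input_size
    for _name, kernel, stride, padding in STAGES:
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


print(trace_sizes(299))

# Channel counts after concatenation in the two reduction blocks,
# using m=256 and n=384 from the constructor defaults.
reduction_a_channels = 384 + 256 + 384    # branch_0 (n) + branch_1 (m) + max-pool passthrough
reduction_b_channels = 192 + 320 + 1024   # branch_0 + branch_1 + max-pool passthrough
print(reduction_a_channels, reduction_b_channels)
```

The traced sizes and channel counts agree with the comments in the cell (149, 147, 147, 73, 71, 35, 17, 8 and 1024, 1536), which is a quick way to confirm that the blocks chain together for this input size.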

In [None]:
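# Load the data and train
global_model_name = "Inception_ResNetv2"
input_size = 299
# The previous training cell closed the log file, so reopen it under the new model name
f = open("result_" + global_model_name + ".txt", "w")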

dataset = torchvision.datasets.ImageFolder(
    root=dataset_root,
    transform=torchvision.transforms.Compose(
        [
            torchvision.transforms.Resize(300),  # resize the shorter image side to 300
            torchvision.transforms.CenterCrop(input_size),
            torchvision.transforms.ToTensor(),
        ]
    ),
)

train_dataset, test_dataset = torch.utils.data.random_split(
    dataset, [int(len(dataset) * 0.7), len(dataset) - int(len(dataset) * 0.7)]
)
train_dataloader = tud.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_dataloader = tud.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

model = Inception_ResNetv2()
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
num_epochs = 80
best_epoch = 0
best_acc = 0.95
test_accuracy_hist = []
for epoch in range(num_epochs):
    train_model(model, train_dataloader, loss_fn, optimizer, epoch)
    acc = test_model(model, test_dataloader, loss_fn)
    test_accuracy_hist.append(acc.item())
    for name, param in model.named_parameters():
        writer.add_histogram(name, param, epoch)
        writer.add_histogram(f"{name}.grad", param.grad, epoch)
    if acc > best_acc:
        best_acc = acc
        best_epoch = epoch
        torch.save(model.state_dict(), "Inception_Resnet_v2_best.pth")
    if (epoch + 1) % 10 == 0:
        torch.save(
            model.state_dict(), global_model_name + "_" + str(epoch + 1) + ".pth"
        )
f.close()
writer.close()

torch.save(model.state_dict(), global_model_name + "_" + str(epoch + 1) + ".pth")

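#### 5. Testing
Evaluate on the previously unseen validation set provided with the task, and compute the classification accuracy from the results, to see how well the Inception_Resnet_V2 network performs in the end.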

<img src="pic/测试结果.png" style="zoom:50%;" />

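The validation accuracy was 76.81%. Possible reasons:
* The negative samples hardly cover buildings that look similar to those in the No. 3 campus area (三号院).
* Some test samples have high contrast and exposure, while the thresholds used for data augmentation were set rather low.
* Some classes look very similar, so with only a limited viewing angle there is some chance of misclassification.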


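Improvements:
* Search the web for images related to 国防科技大学 (National University of Defense Technology), crawl them, manually remove the ones that are positive samples for this scene recognition task, augment the remaining negative samples, and add them to the negative class of the dataset.
* Raise the random exposure threshold used for image augmentation.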

<img src="pic/负样本.png" style="zoom:30%;" />

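After retraining and fine-tuning the network, the accuracy reached 98.4% on the training set and 96.7% on the test set. Evaluating again on the unseen validation set provided with the task, the validation accuracy rose to 85.5%.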

<img src="pic/测试结果2.png" style="zoom:50%;" />
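
The validation accuracy quoted here is the fraction of images whose predicted class equals the true class. The small sketch below uses made-up class indices (not the actual validation data) to show that computation together with a tally of which classes get confused; the function names are hypothetical.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Fraction of positions where the predicted class equals the true class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def confusion_pairs(predicted, actual):
    """Count (actual, predicted) pairs for the misclassified samples only."""
    return Counter((a, p) for p, a in zip(predicted, actual) if p != a)


# Illustrative class indices: 0 = negative sample, 8 and 9 = two similar classes.
actual = [8, 8, 9, 9, 0, 0, 7, 7]
predicted = [8, 9, 9, 9, 0, 3, 7, 7]

print(f"accuracy = {accuracy(predicted, actual):.2%}")
print(confusion_pairs(predicted, actual))
```

Tallying the error pairs makes it easy to see whether mistakes concentrate on negative samples or on a few look-alike classes, which is the kind of breakdown discussed in this section.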

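The earlier misclassifications of negative samples are mostly gone. Most of the remaining errors come from validation images that are somewhat dated, with poor exposure and color. The image below is predicted correctly, but its predicted probability is only 0.28, which shows that the model is very uncertain about its class.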

<img src="pic/概率值低.png" style="zoom:30%;" />

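Similar images all have fairly low predicted probabilities, below 0.7, whereas correctly classified samples have predicted probabilities close to 0.99. In practice, images with low predicted probabilities can be handed to a human for assistance. For example, the sample below is predicted as 西跨线桥 (west overpass), but its true class is 东跨线桥 (east overpass).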

<img src="pic/验证集样本.png" style="zoom:30%;" />
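
Handing low-confidence images to a human can be sketched as a simple threshold split. The 0.7 cut-off follows the observation above; the function name, file names and scores are made up for illustration.

```python
def route_predictions(predictions, review_threshold=0.7):
    """Split (image_name, label, score) triples into two lists:
    accepted automatically, and needing human review."""
    accepted, needs_review = [], []
    for name, label, score in predictions:
        if score >= review_threshold:
            accepted.append((name, label, score))
        else:
            needs_review.append((name, label, score))
    return accepted, needs_review


# Hypothetical predictions: (file name, predicted label, predicted score)
sample = [
    ("img_001.jpg", "07_图书馆", 0.99),
    ("img_002.jpg", "09_西跨线桥", 0.28),
    ("img_003.jpg", "08_东跨线桥", 0.65),
]

accepted, needs_review = route_predictions(sample)
print("accepted:", [name for name, _, _ in accepted])
print("needs review:", [name for name, _, _ in needs_review])
```

Raising the threshold sends more images to review and lowers the risk of silent errors; the right value depends on how much manual effort is available.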

In [None]:
model_path = "Inception_ResNetv2_40.pth"
datas_path = "test_data"

# Same preprocessing as in the Inception training cell. The albumentations
# `transform` defined earlier is for augmentation and does not produce tensors.
preprocess = transforms.Compose(
    [
        transforms.Resize(300),
        transforms.CenterCrop(299),
        transforms.ToTensor(),
    ]
)

# Load the model
model = Inception_ResNetv2()
model.load_state_dict(torch.load(model_path, map_location=device))
model = model.to(device)
model.eval()
test_pic = sorted(os.listdir(datas_path))
for filename in test_pic:
    file_path = os.path.join(datas_path, filename)
    if os.path.isfile(file_path):
        # Load the image and preprocess it
        image = Image.open(file_path).convert("RGB")
        if (image.width * image.height) > target_total_pixels:
            scale_factor = (target_total_pixels / (image.width * image.height)) ** 0.5
            target_width = int(image.width * scale_factor)
            target_height = int(image.height * scale_factor)
            image = image.resize((target_width, target_height), resample=Image.BILINEAR)
        image_tensor = preprocess(image)
        image_tensor = image_tensor.unsqueeze(0).to(device)  # add the batch dimension
        # Run the prediction
        with torch.no_grad():
            outputs = model(image_tensor)
            # Per-class sigmoid scores: same ranking as softmax, but they do not sum to 1
            outputs = torch.sigmoid(outputs)
            predicted_value, predicted = torch.max(outputs, 1)
            if predicted_value.item() < 0.4:
                # Low-confidence predictions fall back to class 0 (negative sample)
                predicted = torch.zeros_like(predicted)
        # Print the prediction
        print(
            "Image: {}, predicted class: {}, predicted score: {:.4f}, predicted label: {}".format(
                filename,
                predicted.item(),
                predicted_value.item(),
                labels[predicted.item()],
            )
        )
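
The prediction cell above turns logits into scores with `torch.sigmoid`, which treats each class independently. For a network trained with `CrossEntropyLoss`, softmax is the transformation that yields a distribution over the classes. The plain-Python sketch below compares the two on made-up logits (no torch required); the helper names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function applied to a single logit."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three classes
logits = [2.0, 1.0, -1.0]

sigmoid_scores = [sigmoid(v) for v in logits]
softmax_scores = softmax(logits)

print("sigmoid:", [round(s, 4) for s in sigmoid_scores], "sum =", round(sum(sigmoid_scores), 4))
print("softmax:", [round(s, 4) for s in softmax_scores], "sum =", round(sum(softmax_scores), 4))
```

Both transformations rank the classes identically, so the predicted class is the same either way. Only the softmax scores sum to 1, however, so a fixed cut-off such as 0.4 means different things on the two scales and would need to be re-tuned if the scoring function were changed.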