prompt:
我想用别人训练好的模型去分辨我这些图片里是什么动物，从而辅助我这个模型更理解我给不同动物图片打分的逻辑
有没有一个能识别动物的model我能用的

ResNet50 作为预训练的卷积基模型被用于提取图片的特征。通过将 include_top=False，我们去掉了原本用于分类的顶部全连接层，只保留了卷积层和池化层，提取出来的特征会传给后续的全连接层进行分类。

GlobalAveragePooling2D：这一层用于池化 ResNet50 输出的特征图，它会将每个通道的特征图缩减为一个单一的数值。这个操作帮助我们减少模型的参数量和计算量。

训练部分：数据被划分为训练集、验证集和测试集。然后用 fit() 训练模型，训练过程中可以通过 TensorBoard 可视化训练过程。

比较两个图片：通过调整图片大小并输入到模型中，比较模型对两张图片的预测结果，得出你更喜欢的图片。

1. 准备环境

In [24]:
import os
import cv2
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Input
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
from tensorflow.keras.preprocessing import image
import matplotlib.pyplot as plt

2. 加载图片和评分

In [15]:
# 加载CSV文件
df = pd.read_csv('dogs_manual_score.csv')

# 获取图片路径和对应的评分
image_paths = df['Photo'].apply(lambda x: os.path.join('dog_pics', x)).tolist()
scores = df['Score'].tolist()


3. 加载和预处理图片

In [16]:
# 加载图片并进行预处理
def load_image(image_path):
    img = image.load_img(image_path, target_size=(256, 256))  # 读取并调整大小
    img_array = image.img_to_array(img)  # 转换为数组
    img_array = img_array / 255.0  # 归一化处理
    return img_array

# 处理所有图片
images = np.array([load_image(path) for path in image_paths])

# 拆分训练集和测试集（80% 训练，20% 测试）
X_train, X_test, y_train, y_test = train_test_split(images, scores, test_size=0.2, random_state=42)

4. 使用ResNet50提取图片特征

In [17]:
# 使用ResNet50提取图片特征
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(256, 256, 3))

# 冻结ResNet50的层（不进行训练）
for layer in base_model.layers:
    layer.trainable = False

In [18]:
# 定义特征提取模型
feature_extractor = Model(inputs=base_model.input, outputs=GlobalAveragePooling2D()(base_model.output))

# 提取训练和测试集的特征
X_train_features = feature_extractor.predict(X_train)
X_test_features = feature_extractor.predict(X_test)


[1m8/8[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m27s[0m 3s/step
[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 2s/step


5. 训练模型

In [28]:
# 确保数据是float32
X_train_features = X_train_features.astype(np.float32)
X_test_features = X_test_features.astype(np.float32)

y_train = np.array(y_train, dtype=np.float32)
y_test = np.array(y_test, dtype=np.float32)

# 确保形状正确
print("X_train_features shape:", X_train_features.shape)  # (样本数, 特征维度)
print("X_test_features shape:", X_test_features.shape)
print("y_train shape:", y_train.shape)  # (样本数,)
print("y_test shape:", y_test.shape)

# 假设 `image_paths` 包含所有图像路径
# 假设 `load_image` 是加载图像的函数，将图像加载为numpy数组
X_train_images = [load_image(path) for path in image_paths]
X_train_images = np.array(X_train_images)

# 检查图像和标签数量是否一致
print("Number of images:", X_train_images.shape[0])
print("Number of labels:", y_train.shape[0])

# 如果数量不一致，修正为相同数量
assert X_train_images.shape[0] == y_train.shape[0], "图像和标签数量不一致，请检查数据。"

# 创建数据增强器
datagen = ImageDataGenerator(
    rotation_range=40, 
    width_shift_range=0.2, 
    height_shift_range=0.2, 
    shear_range=0.2, 
    zoom_range=0.2, 
    horizontal_flip=True, 
    fill_mode='nearest'
)

# 在增强图像并提取特征后，确保样本数与标签数一致
augmented_features = []
augmented_labels = []

# 定义一个变量来记录处理的批次数量
processed_samples = 0

# 遍历生成增强图像
for batch in datagen.flow(X_train_images, y_train, batch_size=32, shuffle=False):
    # 取出每个批次的特征
    features_batch = model.predict(batch)  # 使用ResNet50提取增强后的图像特征
    augmented_features.append(features_batch)
    augmented_labels.append(y_train)  # 记录标签

    # 增加处理的样本数量
    processed_samples += batch.shape[0]

    # 如果处理的样本数量达到了图像的总数，停止增强
    if processed_samples >= len(X_train_images):
        break

# 将所有增强后的特征和标签合并
X_train_features = np.vstack(augmented_features)
y_train = np.concatenate(augmented_labels)

# 检查增强后的数据数量是否一致
print("Augmented X_train_features shape:", X_train_features.shape)
print("Augmented y_train shape:", y_train.shape)

# 训练模型
history = score_model.fit(
    X_train_features, y_train,  # 使用增强后的特征数据进行训练
    epochs=20,
    batch_size=32,
    validation_data=(X_test_features, y_test)
)


X_train_features shape: (307, 2048)
X_test_features shape: (62, 2048)
y_train shape: (245,)
y_test shape: (62,)
Number of images: 307
Number of labels: 245


AssertionError: 图像和标签数量不一致，请检查数据。

6. 模型评估

In [22]:
# 评估模型
mse, mae = score_model.evaluate(X_test_features, y_test)
print(f"Model MSE: {mse}, MAE: {mae}")

[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 47ms/step - loss: 3452.3867 - mae: 49.3787
Model MSE: 3311.433837890625, MAE: 47.7454948425293


In [23]:
# 计算皮尔逊相关系数
y_pred = score_model.predict(X_test_features)
correlation = np.corrcoef(y_test, y_pred.flatten())[0, 1]
print(f'Pearson correlation: {correlation}')


[1m2/2[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 122ms/step
Pearson correlation: -0.11869934803175151


7. 比较两张图片，预测哪个更符合你的喜好

In [13]:
img1 = cv2.imread('animal_pics/google-advancedsearch-2.jpg')  # 将路径替换为实际图片路径
img2 = cv2.imread('animal_pics/google-advancedsearch-7.jpg')

# 将图片转换为数组
img1 = cv2.resize(img1, (256, 256))  # 如果使用OpenCV，需要先调整图像大小
img1 = img1 / 255.0  # 归一化到0-1范围
img1 = np.expand_dims(img1, axis=0)  # 增加批次维度

img2 = cv2.resize(img2, (256, 256))  # 如果使用OpenCV，需要先调整图像大小
img2 = img2 / 255.0  # 归一化到0-1范围
img2 = np.expand_dims(img2, axis=0)  # 增加批次维度

def compare_images(img1, img2):
    # 提取图片特征
    features1 = model.predict(img1)  # 直接传入已调整的图片
    features2 = model.predict(img2)
    
    # 使用评分模型来预测评分
    score1 = score_model.predict(features1)
    score2 = score_model.predict(features2)
    
    if score1 > score2:
        return "You prefer the first image."
    else:
        return "You prefer the second image."

# 假设img1和img2是你要比较的两张图片

print(compare_images(img1, img2))

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 260ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 274ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 154ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 84ms/step
You prefer the second image.
