prompt:
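I want to use someone else's pretrained model to identify which animal appears in each of my images, so that my own model can better understand the logic behind the scores I give to different animal pictures.
Is there an animal-recognition model I can use?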

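ResNet50 serves as the pretrained convolutional base for extracting image features. Setting include_top=False drops the fully connected classification head and keeps only the convolutional and pooling layers; the extracted features are then passed to new fully connected layers that predict a score.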

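GlobalAveragePooling2D: this layer pools the feature maps output by ResNet50, averaging each channel's feature map down to a single value. This reduces the number of parameters and the amount of computation in the layers that follow.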
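The pooling step can be illustrated in plain NumPy. This is only a sketch of the arithmetic, not the Keras layer itself: the helper name `global_average_pool` is hypothetical, and the 8x8 spatial grid is an assumption about what ResNet50 produces for a 256x256 input (the 2048 channels match the feature shape printed later in this notebook).

```python
import numpy as np

def global_average_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Average each channel over its spatial grid: (N, H, W, C) -> (N, C)."""
    return feature_maps.mean(axis=(1, 2))

# Random stand-in for a convolutional output: batch of 2, 8x8 grid, 2048 channels
rng = np.random.default_rng(0)
feature_maps = rng.random((2, 8, 8, 2048), dtype=np.float32)

pooled = global_average_pool(feature_maps)
print(pooled.shape)  # (2, 2048)
```

Each image ends up as a single 2048-value vector, which is why the dense layers that follow need far fewer weights than they would on the flattened 8x8x2048 grid.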

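Training: the data is split into training and test sets (the code below reuses the test set as validation data rather than holding out a separate validation set). The model is then trained with fit(); the run can be visualized with TensorBoard if a TensorBoard callback is added.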
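A separate validation set keeps the test set untouched until the final evaluation. The following is a minimal sketch of a three-way split using shuffled indices; the helper name `three_way_split`, the 15% fractions, and the seed are assumptions chosen for illustration, and 307 is simply the total number of samples implied by the shapes printed in section 5.

```python
import numpy as np

def three_way_split(n_samples, val_frac=0.15, test_frac=0.15, seed=42):
    """Return shuffled, disjoint index arrays for train, validation, and test."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    n_val = int(round(n_samples * val_frac))
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = three_way_split(307)
print(len(train_idx), len(val_idx), len(test_idx))  # 215 46 46
```

With index arrays like these, the validation rows would go to `validation_data` in fit(), and the test rows would only be used once, in the final evaluate() call.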

Comparing two images: resize each image, feed it through the model, and compare the predicted scores to decide which image you would prefer.

1. Set up the environment

In [15]:
import os
import cv2
import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Model, Sequential, load_model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Input
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

2. Load the images and scores

In [5]:
# Load the CSV file
df = pd.read_csv('dogs_manual_score.csv')

# Get the image paths and their corresponding scores
image_paths = df['Photo'].apply(lambda x: os.path.join('dog_pics', x)).tolist()
scores = df['Score'].tolist()


3. Load and preprocess the images

In [7]:
# Load one image and preprocess it
def load_image(image_path):
    img = image.load_img(image_path, target_size=(256, 256))  # read and resize
    img_array = image.img_to_array(img)  # convert to a NumPy array
    img_array = img_array / 255.0  # scale pixel values to [0, 1]
    return img_array

# Process all images
images = np.array([load_image(path) for path in image_paths])

# Split into training and test sets (80% training, 20% test)
X_train, X_test, y_train, y_test = train_test_split(images, scores, test_size=0.2, random_state=42)
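One thing worth checking here: the ImageNet weights of Keras's ResNet50 were trained on inputs prepared by `tensorflow.keras.applications.resnet50.preprocess_input`, which converts RGB to BGR and subtracts per-channel ImageNet means without scaling to [0, 1]. Dividing by 255 gives the frozen base a different input distribution, which may weaken the extracted features. The NumPy function below is an illustrative reimplementation of that arithmetic (the helper name `caffe_style_preprocess` is hypothetical); in the notebook itself, calling `preprocess_input` on the unscaled array would be the simpler route.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by Keras "caffe"-style preprocessing
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb_pixels: np.ndarray) -> np.ndarray:
    """Take RGB pixels in [0, 255] with shape (..., 3); return zero-centered BGR."""
    bgr = rgb_pixels[..., ::-1].astype(np.float32)  # reverse the channel axis: RGB -> BGR
    return bgr - IMAGENET_BGR_MEAN                  # zero-center each channel, no rescaling

# A pure-red 1x1 image (R=255, G=0, B=0)
red = np.array([[[255.0, 0.0, 0.0]]], dtype=np.float32)
out = caffe_style_preprocess(red)
print(out)  # channels are now ordered B, G, R
```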

4. Extract image features with ResNet50

In [8]:
# Build the ResNet50 convolutional base with ImageNet weights
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(256, 256, 3))

# Freeze the ResNet50 layers (they are not trained)
for layer in base_model.layers:
    layer.trainable = False

In [9]:
# Define the feature-extraction model
feature_extractor = Model(inputs=base_model.input, outputs=GlobalAveragePooling2D()(base_model.output))

# Extract features for the training and test sets
X_train_features = feature_extractor.predict(X_train)
X_test_features = feature_extractor.predict(X_test)


8/8 ━━━━━━━━━━━━━━━━━━━━ 20s 2s/step
2/2 ━━━━━━━━━━━━━━━━━━━━ 3s 1s/step


5. Train the model

In [10]:
# Make sure the data type is float32
X_train_features = X_train_features.astype(np.float32)
X_test_features = X_test_features.astype(np.float32)
y_train = np.array(y_train, dtype=np.float32)
y_test = np.array(y_test, dtype=np.float32)

# Print the array shapes to confirm they are as expected
print("X_train_features shape:", X_train_features.shape)  # e.g. (num_samples, feature_dim)
print("X_test_features shape:", X_test_features.shape)
print("y_train shape:", y_train.shape)  # e.g. (num_samples,)
print("y_test shape:", y_test.shape)

# Build the score-prediction model (a regression model)
score_model = Sequential([
    Input(shape=(X_train_features.shape[1],)),  # input: features extracted by ResNet50
    Dense(256, activation='relu'),  # hidden layer 1
    Dense(128, activation='relu'),  # hidden layer 2
    Dense(1)  # output layer: the predicted score
])

# Compile the model
score_model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

# Train the model
history = score_model.fit(
    X_train_features, y_train,  # training data
    epochs=20,  # number of epochs
    batch_size=32,  # batch size
    validation_data=(X_test_features, y_test)  # validation data (this reuses the test set)
)

# Optional: print the per-epoch loss and MAE recorded during training
print("Training history:", history.history)


X_train_features shape: (245, 2048)
X_test_features shape: (62, 2048)
y_train shape: (245,)
y_test shape: (62,)
Epoch 1/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 3s 94ms/step - loss: 953979.6250 - mae: 975.9830 - val_loss: 916458.4375 - val_mae: 956.4792
Epoch 2/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step - loss: 907236.0000 - mae: 951.7520 - val_loss: 845894.9375 - val_mae: 918.8423
Epoch 3/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 32ms/step - loss: 825136.1875 - mae: 907.4749 - val_loss: 721895.6875 - val_mae: 848.6623
Epoch 4/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 30ms/step - loss: 681153.6250 - mae: 824.1739 - val_loss: 537553.7500 - val_mae: 731.9694
Epoch 5/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s 37ms/step - loss: 489372.0000 - mae: 697.3726 - val_loss: 310358.9688 - val_mae: 555.2874
Epoch 6/20
8/8 ━━━━━━━━━━━━━━━━━━━━ 0s

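The first-epoch loss near 9.5e5 and MAE near 976 in the training log suggest that the scores are on the order of 1000 and are fed to the network unscaled, so the early epochs are mostly spent learning the offset. One common remedy, offered here as a suggestion rather than something this notebook does, is to standardize the targets with training-set statistics and invert the transform after prediction. The helper names and the sample scores below are made up for illustration.

```python
import numpy as np

def fit_target_scaler(y_train):
    """Return (mean, std) computed on the training targets only."""
    y_train = np.asarray(y_train, dtype=np.float32)
    return float(y_train.mean()), float(y_train.std())

def scale_targets(y, mean, std):
    """Map raw scores to zero mean and unit variance."""
    return (np.asarray(y, dtype=np.float32) - mean) / std

def unscale_predictions(y_scaled, mean, std):
    """Map model outputs back to the original score scale."""
    return np.asarray(y_scaled, dtype=np.float32) * std + mean

# Made-up scores on roughly the scale the log implies
example_scores = np.array([900.0, 1000.0, 1100.0, 950.0, 1050.0])
mean, std = fit_target_scaler(example_scores)
scaled = scale_targets(example_scores, mean, std)
restored = unscale_predictions(scaled, mean, std)
print(scaled)
print(restored)
```

In the notebook, the model would be fit on the scaled `y_train`, and `unscale_predictions` would be applied to its outputs before computing MAE or comparing scores.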
6. Model evaluation

In [19]:
score_model = load_model('score_model.h5')



In [16]:
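# NOTE: 'path/to/your/score_model.h5' is a placeholder; point it at the actual saved model file
score_model = load_model('path/to/your/score_model.h5')
# Evaluate the model
mse, mae = score_model.evaluate(X_test_features, y_test)
print(f"Model MSE: {mse}, MAE: {mae}")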

FileNotFoundError: [Errno 2] Unable to synchronously open file (unable to open file: name = 'path/to/your/score_model.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

In [17]:
# Compute the Pearson correlation coefficient
y_pred = score_model.predict(X_test_features)
correlation = np.corrcoef(y_test, y_pred.flatten())[0, 1]
print(f'Pearson correlation: {correlation}')


2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step
Pearson correlation: -0.11761570693719317


7. Compare two images and predict which one better matches your preferences

In [25]:
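img1 = cv2.imread('animal_pics/google-advancedsearch-2.jpg')  # replace with the path to your actual image
print(score_model.input_shape)
# Convert the image into a model-ready array
# NOTE: cv2.imread returns BGR channel order, whereas load_img (used in section 3) returns RGB
img1 = cv2.resize(img1, (256, 256))  # OpenCV images must be resized explicitly
img1 = img1 / 255.0  # scale to the 0-1 range
img1 = np.expand_dims(img1, axis=0)  # add the batch dimension


features1 = feature_extractor.predict(img1)  # pass the resized image straight in
print(f"Extracted feature shape: {features1.shape}")

# Predict with the score model
# NOTE: this call passes the raw image; the score model built in section 5 takes the 2048-d features (features1) instead
score1 = score_model.predict(img1)


print(score1)

(None, 256, 256, 3)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 325ms/step
Extracted feature shape: (1, 2048)
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 311ms/step
[[0.       0.       0.       ... 2.160856 0.       0.      ]]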
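The section title promises a comparison of two images, while the cell above processes only one and prints a feature-like vector rather than a single score. A minimal sketch of the comparison step follows. The function `prefer` and the two stand-in callables are hypothetical and exist only so the sketch runs without TensorFlow; in the notebook they would correspond to `feature_extractor.predict` and the `predict` method of a score model that takes the 2048-d features.

```python
import numpy as np

def prefer(img_a, img_b, extract_features, predict_score):
    """Score two preprocessed image batches of shape (1, H, W, 3); return the winner and both scores."""
    score_a = float(np.ravel(predict_score(extract_features(img_a)))[0])
    score_b = float(np.ravel(predict_score(extract_features(img_b)))[0])
    winner = "first" if score_a >= score_b else "second"
    return winner, score_a, score_b

# Stand-ins for the real models (assumptions for illustration only)
def fake_extract(img):
    return img.mean(axis=(1, 2))                 # (1, H, W, 3) -> (1, 3)

def fake_score(features):
    return features.sum(axis=1, keepdims=True)   # (1, 3) -> (1, 1)

bright = np.full((1, 256, 256, 3), 0.8)
dark = np.full((1, 256, 256, 3), 0.2)
winner, score_bright, score_dark = prefer(bright, dark, fake_extract, fake_score)
print(winner, score_bright, score_dark)
```

Both images should go through the same loading and preprocessing path that was used for the training images, otherwise the two scores are not comparable with what the model learned.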
