# Image features exercise

*Complete and hand in this worksheet (including its outputs and any supporting code outside of the worksheet) with your assignment submission. For more details see the [assignments page](http://vision.stanford.edu/teaching/cs231n/assignments.html) on the course website.*

We have seen that we can achieve reasonable performance on an image classification task by training a linear classifier on the pixels of the input image. In this exercise we will show that we can improve classification performance by training linear classifiers not on raw pixels but on features computed from the raw pixels.

All of your work for this exercise will be done in this notebook.

In [None]:
import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt


%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest' # set image interpolation mode
plt.rcParams['image.cmap'] = 'gray' # set default colormap for images

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

## Load data

Similar to previous exercises, we will load CIFAR-10 data from disk.

In [None]:
from cs231n.features import color_histogram_hsv, hog_feature


def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000):
    # Load the raw CIFAR-10 data
    cifar10_dir = "cs231n/datasets/cifar-10-batches-py"

    # Clean up variables to prevent loading data multiple times (which may cause memory issues)
    try:
        del X_train, y_train
        del X_test, y_test
        print("Clear previously loaded data.")
    except NameError:
        pass

    X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

    # Subsample the data
    mask = list(range(num_training, num_training + num_validation))
    X_val = X_train[mask]
    y_val = y_train[mask]
    mask = list(range(num_training))
    X_train = X_train[mask]
    y_train = y_train[mask]
    mask = list(range(num_test))
    X_test = X_test[mask]
    y_test = y_test[mask]

    return X_train, y_train, X_val, y_val, X_test, y_test


X_train, y_train, X_val, y_val, X_test, y_test = get_CIFAR10_data()

## Extract features

For each image we will compute a Histogram of Oriented Gradients (HOG) as well as a color histogram using the hue channel in HSV color space. We form the final feature vector for each image by concatenating the HOG and color histogram feature vectors.

Roughly speaking, HOG should capture the texture of the image while ignoring color information, and the color histogram represents the color of the input image while ignoring texture. As a result, we expect that using both together ought to work better than using either alone. Verifying this assumption would be a good thing to try for your own interest.

The `hog_feature` and `color_histogram_hsv` functions both operate on a single image and return a feature vector for that image. The `extract_features` function takes a set of images and a list of feature functions and evaluates each feature function on each image, storing the results in a matrix where each row is the concatenation of all feature vectors for a single image.
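
The concatenation step described above can be sketched as follows. This is only an illustration, not the course's `extract_features` implementation: `mean_color`, `intensity_histogram`, and `concat_features` are hypothetical names, and the two toy feature functions stand in for HOG and the HSV color histogram.

```python
import numpy as np


def mean_color(img):
    # Hypothetical toy feature: average value of each color channel -> shape (3,)
    return img.mean(axis=(0, 1))


def intensity_histogram(img, nbin=4):
    # Hypothetical toy feature: normalized histogram of pixel intensities -> shape (nbin,)
    hist, _ = np.histogram(img, bins=nbin, range=(0, 255))
    return hist / hist.sum()


def concat_features(imgs, feature_fns):
    # Apply every feature function to every image and concatenate the results,
    # producing one row per image.
    rows = [np.concatenate([fn(img) for fn in feature_fns]) for img in imgs]
    return np.stack(rows)


rng = np.random.default_rng(0)
imgs = rng.integers(0, 256, size=(5, 32, 32, 3)).astype(np.float64)  # 5 random "images"
feats = concat_features(imgs, [mean_color, intensity_histogram])
print(feats.shape)  # (5, 7): 3 mean-color values + 4 histogram bins per image
```

Because each feature function returns a 1-D vector, adding or removing a feature only changes the number of columns, and the rest of the pipeline (mean subtraction, scaling, bias column) works unchanged.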

In [None]:
from cs231n.features import color_histogram_hsv, extract_features, hog_feature

# num_color_bins = 10 # Number of bins in the color histogram
num_color_bins = 25  # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)

# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat

# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat

# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])
X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])

## Train Softmax classifier on features

Using the Softmax code developed earlier in the assignment, train Softmax classifiers on top of the features extracted above; this should achieve better results than training them directly on top of raw pixels.
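
Tuning on the validation set usually amounts to a small grid search. The sketch below shows only the bookkeeping, under stated assumptions: `grid_search` and `train_and_eval` are hypothetical names, and `fake_train_and_eval` returns made-up numbers in place of real training so that the example runs on its own.

```python
import itertools


def grid_search(train_and_eval, learning_rates, regularization_strengths):
    # train_and_eval is a hypothetical callable: (lr, reg) -> (train_acc, val_acc)
    results = {}
    best_val, best_params = -1.0, None
    for lr, reg in itertools.product(learning_rates, regularization_strengths):
        train_acc, val_acc = train_and_eval(lr, reg)
        results[(lr, reg)] = (train_acc, val_acc)
        # Select on validation accuracy only; the test set stays untouched.
        if val_acc > best_val:
            best_val, best_params = val_acc, (lr, reg)
    return results, best_val, best_params


def fake_train_and_eval(lr, reg):
    # Stand-in for real training: synthetic scores that peak at lr=1e-3, reg=1e-2
    val_acc = 0.5 - abs(lr - 1e-3) * 10 - abs(reg - 1e-2)
    return val_acc + 0.05, val_acc


results, best_val, best_params = grid_search(
    fake_train_and_eval, [1e-4, 1e-3, 1e-2], [1e-3, 1e-2, 1e-1]
)
print(best_params, best_val)
```

In the notebook, the callable would train a classifier on `X_train_feats` and score it on `X_val_feats`; keeping the best trained model alongside `best_params` avoids retraining afterwards.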

In [None]:
# Use the validation set to tune the learning rate and regularization strength

from cs231n.classifiers.linear_classifier import Softmax

learning_rates = [1e-7, 1e-6]
regularization_strengths = [5e5, 5e6]

results = {}
best_val = -1
best_softmax = None

################################################################################
# TODO:                                                                        #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the Softmax;     #
# save the best trained classifier in best_softmax. If you carefully tune the  #
# model, you should be able to get accuracy of above 0.42 on the validation    #
# set.                                                                         #
################################################################################


# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print("lr %e reg %e train accuracy: %f val accuracy: %f" % (lr, reg, train_accuracy, val_accuracy))

print("best validation accuracy achieved: %f" % best_val)

In [None]:
# Evaluate your trained Softmax on the test set: you should be able to get at least 0.42
y_test_pred = best_softmax.predict(X_test_feats)
test_accuracy = np.mean(y_test == y_test_pred)
print(test_accuracy)

In [None]:
# Save best softmax model
best_softmax.save("best_softmax_features.npy")

In [None]:
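# Visualizing the mistakes an algorithm makes is a useful way to build
# intuition about how it works. Here we show images that are misclassified
# by our current system. The first column shows images that the system
# labeled as "plane" but whose true label is something other than "plane".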

examples_per_class = 8
classes = ["plane", "car", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]
for cls, cls_name in enumerate(classes):
    idxs = np.where((y_test != cls) & (y_test_pred == cls))[0]
    # Sample at most as many images as are actually misclassified for this class
    idxs = np.random.choice(idxs, min(examples_per_class, len(idxs)), replace=False)
    for i, idx in enumerate(idxs):
        plt.subplot(examples_per_class, len(classes), i * len(classes) + cls + 1)
        plt.imshow(X_test[idx].astype("uint8"))
        plt.axis("off")
        if i == 0:
            plt.title(cls_name)
plt.show()

### Inline question 1:

Describe the misclassification results that you see. Do they make sense?

$\color{blue}{\textit Your Answer:}$





## Neural Network on image features

Earlier in this assignment we saw that training a two-layer neural network on raw pixels achieved better classification performance than linear classifiers on raw pixels. In this notebook we have seen that linear classifiers on image features outperform linear classifiers on raw pixels.

For completeness, we should also try training a neural network on image features. This approach should outperform all previous approaches: you should easily be able to achieve over 55% classification accuracy on the test set; our best model achieves about 60% classification accuracy.
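
As a reminder of what such a network computes, here is a minimal forward-pass sketch. It assumes the two-layer network is an affine, ReLU, affine stack whose output scores feed a softmax loss; this notebook does not spell out the architecture, and the function name, sizes, and random weights below are purely illustrative.

```python
import numpy as np


def two_layer_scores(X, W1, b1, W2, b2):
    # affine -> ReLU -> affine; returns class scores of shape (N, num_classes)
    hidden = np.maximum(0, X @ W1 + b1)
    return hidden @ W2 + b2


rng = np.random.default_rng(0)
N, D, H, C = 4, 6, 5, 3  # toy sizes: samples, feature dim, hidden dim, classes
X = rng.standard_normal((N, D))
W1, b1 = 1e-1 * rng.standard_normal((D, H)), np.zeros(H)
W2, b2 = 1e-1 * rng.standard_normal((H, C)), np.zeros(C)

scores = two_layer_scores(X, W1, b1, W2, b2)
y_pred = np.argmax(scores, axis=1)  # predicted class = highest score per row
print(scores.shape, y_pred.shape)
```

The input here is the feature matrix rather than raw pixels, so `D` equals the length of the concatenated feature vector. No bias column is needed because the layers carry their own bias terms.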

In [None]:
# Preprocessing: Remove the bias dimension
# Make sure to run this cell only ONCE
print(X_train_feats.shape)
X_train_feats = X_train_feats[:, :-1]
X_val_feats = X_val_feats[:, :-1]
X_test_feats = X_test_feats[:, :-1]

print(X_train_feats.shape)

In [None]:
from cs231n.classifiers.fc_net import TwoLayerNet
from cs231n.solver import Solver

input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10

data = {
    "X_train": X_train_feats,
    "y_train": y_train,
    "X_val": X_val_feats,
    "y_val": y_val,
    "X_test": X_test_feats,
    "y_test": y_test,
}

net = TwoLayerNet(input_dim, hidden_dim, num_classes)
best_net = None

################################################################################
# TODO:                                                                        #
# Train a two-layer neural network on image features. You may want to          #
# cross-validate various parameters as in previous sections. Store your best   #
# model in the best_net variable.                                              #
################################################################################

In [None]:
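# Run your best neural net classifier on the test set. You should be able
# to get more than 58% accuracy. With careful tuning, it is also possible
# to exceed 60%.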

y_test_pred = np.argmax(best_net.loss(data["X_test"]), axis=1)
test_acc = (y_test_pred == data["y_test"]).mean()
print(test_acc)

In [None]:
# Save best model
best_net.save("best_two_layer_net_features.npy")