# Extracting features from the training and test data

This notebook extracts features for all of the training and test data using the following networks:

- Inception V3
- Xception
- ResNet50
- VGG16

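All features are saved in batches under a single directory (you can train and test without knowing how that directory is organized).

Note that the training features and the test features are stored separately.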
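To get a feel for what "saved in batches" means in practice, the small sketch below counts how many batches a run would produce. It assumes one save per `batch_size` images and that a final partial batch is still saved; the notebook does not confirm how `extract_features_v4` handles this, and `num_batches` is a hypothetical helper name.

```python
import math


def num_batches(n_images, batch_size):
    """Number of batches needed to cover n_images, counting a final partial batch."""
    return math.ceil(n_images / batch_size)


# 25000 training and 12500 test images are the counts reported in this
# notebook's logs; 1000 is the batch_size set in the configuration cell.
train_batches = num_batches(25000, 1000)
test_batches = num_batches(12500, 1000)
print(train_batches, test_batches)
```

A larger `batch_size` means fewer, larger saves and more memory held at once, which is why the value should be tuned to the machine.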

In [1]:
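# Paths used for feature extraction. Change these to the paths on your own machine!

# Where the features will be saved. These directories must be created beforehand
# and must be empty; otherwise the later training step will fail.
# Location of the training features
train_features_path = r'/usr/local/data/cat_dog/features/train'
# Location of the test features
test_features_path = r'/usr/local/data/cat_dog/features/test'

# Location of the Dogs vs. Cats data
# Location of the training data
train_data_path = r'/usr/local/data/cat_dog/raw/train'
# Location of the test data
test_data_path = r'/usr/local/data/cat_dog/raw/test'

# How much data to save per batch; adjust to suit your machine
batch_size = 1000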

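Once the paths above point to locations on your own machine, the code below needs no further changes; just run the cells for the networks whose features you want to extract.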
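Because the feature directories must exist and be empty before extraction starts, it can help to check this up front instead of discovering the problem during training. The following is a minimal sketch of such a check; `prepare_feature_dir` is a hypothetical helper that is not part of `extract_features`, and the demonstration runs against a temporary directory rather than the real paths.

```python
import os
import tempfile


def prepare_feature_dir(path):
    """Create `path` if it is missing and raise if it already contains entries."""
    os.makedirs(path, exist_ok=True)
    leftovers = os.listdir(path)
    if leftovers:
        raise RuntimeError(
            "%s is not empty (%d entries); clear it before extracting features"
            % (path, len(leftovers))
        )
    return path


# Demonstration on a throwaway directory rather than the real feature paths.
demo_root = tempfile.mkdtemp()
demo_train = prepare_feature_dir(os.path.join(demo_root, "features", "train"))
demo_test = prepare_feature_dir(os.path.join(demo_root, "features", "test"))
print(demo_train, demo_test)
```

In the notebook itself, the equivalent calls would be `prepare_feature_dir(train_features_path)` and `prepare_feature_dir(test_features_path)`. Judging by the `mkdir` lines in the output further down, the per-network sub-directories appear to be created by the extractor, so only the two top-level directories need preparing.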

In [2]:
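# Use version 4 of the feature extractor. It is built on Keras, so there is no need to download the network weight files yourself (Keras downloads them automatically).
# Earlier versions were built on TensorFlow, and downloading the weight files for them was cumbersome, so the code for those versions is no longer provided.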
from extract_features import extract_features_v4
import os

Using TensorFlow backend.


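## Extracting features from the training data

Note that the following cells may take a long time to run.

Running them on a GPU is recommended.

### Extracting Inception V3 features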

In [3]:
extract_features_v4(train_data_path,
                    os.path.join(train_features_path, 'inception'),
                    'inception',
                    test=False,
                    batch_size=batch_size)

mkdir /usr/local/data/cat_dog/features/train/inception
Begin to extracting 25000 images features...
saved at /usr/local/data/cat_dog/features/train/inception
Took 676.61 seconds.


### Extracting Xception features

In [4]:
extract_features_v4(train_data_path,
                    os.path.join(train_features_path, 'xception'),
                    'xception',
                    test=False,
                    batch_size=batch_size)

mkdir /usr/local/data/cat_dog/features/train/xception
Begin to extracting 25000 images features...
saved at /usr/local/data/cat_dog/features/train/xception
Took 610.11 seconds.


### Extracting ResNet50 features

In [5]:
extract_features_v4(train_data_path,
                    os.path.join(train_features_path, 'resnet50'),
                    'resnet50',
                    test=False,
                    batch_size=batch_size)

mkdir /usr/local/data/cat_dog/features/train/resnet50
Begin to extracting 25000 images features...
saved at /usr/local/data/cat_dog/features/train/resnet50
Took 663.52 seconds.


### Extracting VGG16 features

In [6]:
extract_features_v4(train_data_path,
                    os.path.join(train_features_path, 'vgg16'),
                    'vgg16',
                    test=False,
                    batch_size=batch_size)

mkdir /usr/local/data/cat_dog/features/train/vgg16
Begin to extracting 25000 images features...
saved at /usr/local/data/cat_dog/features/train/vgg16
Took 464.28 seconds.
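
The timings logged for the four training runs can be turned into a rough throughput comparison. The sketch below uses only the image counts and `Took ... seconds` values printed in the outputs above; `images_per_second` is a hypothetical helper, and the resulting rates depend entirely on the hardware used for this particular run.

```python
# Image counts and wall-clock times copied from the training-run logs above.
train_runs = {
    "inception": (25000, 676.61),
    "xception": (25000, 610.11),
    "resnet50": (25000, 663.52),
    "vgg16": (25000, 464.28),
}


def images_per_second(n_images, seconds):
    """Average throughput of one extraction run."""
    return n_images / seconds


throughput = {name: images_per_second(n, t) for name, (n, t) in train_runs.items()}

# Print the networks from fastest to slowest.
for name, rate in sorted(throughput.items(), key=lambda item: item[1], reverse=True):
    print("%-10s %6.1f images/s" % (name, rate))
```

On this run VGG16 was the fastest of the four and Inception V3 the slowest, but the gap is small enough that other machines may rank them differently.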


## Extracting features from the test data

These cells extract features from the test set provided by Kaggle.

### Extracting Inception V3 features

In [7]:
extract_features_v4(test_data_path,
                    os.path.join(test_features_path, 'inception'),
                    'inception',
                    test=True,
                    batch_size=batch_size)

mkdir /usr/local/data/cat_dog/features/test/inception
Begin to extracting 12500 images features...
saved at /usr/local/data/cat_dog/features/test/inception
Took 353.38 seconds.


### Extracting Xception features

In [8]:
extract_features_v4(test_data_path,
                    os.path.join(test_features_path, 'xception'),
                    'xception',
                    test=True,
                    batch_size=batch_size)

mkdir /usr/local/data/cat_dog/features/test/xception
Begin to extracting 12500 images features...
saved at /usr/local/data/cat_dog/features/test/xception
Took 325.26 seconds.


### Extracting ResNet50 features

In [9]:
extract_features_v4(test_data_path,
                    os.path.join(test_features_path, 'resnet50'),
                    'resnet50',
                    test=True,
                    batch_size=batch_size)

mkdir /usr/local/data/cat_dog/features/test/resnet50
Begin to extracting 12500 images features...
saved at /usr/local/data/cat_dog/features/test/resnet50
Took 351.72 seconds.


### Extracting VGG16 features

In [10]:
extract_features_v4(test_data_path,
                    os.path.join(test_features_path, 'vgg16'),
                    'vgg16',
                    test=True,
                    batch_size=batch_size)

mkdir /usr/local/data/cat_dog/features/test/vgg16
Begin to extracting 12500 images features...
saved at /usr/local/data/cat_dog/features/test/vgg16
Took 247.75 seconds.
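
The eight cells in this notebook differ only in the network name, the data path, and the `test` flag, so they could also be driven by a loop. The sketch below shows one way to do that. `extract_all` is a hypothetical wrapper, and it takes the extractor as an argument so that it can be demonstrated here with a stand-in function; it assumes the extractor accepts the same arguments as the calls to `extract_features_v4` shown above.

```python
import os


def extract_all(extract_fn, data_path, features_root, networks, test, batch_size):
    """Call `extract_fn` once per network, each with its own output sub-directory."""
    output_dirs = []
    for name in networks:
        out_dir = os.path.join(features_root, name)
        extract_fn(data_path, out_dir, name, test=test, batch_size=batch_size)
        output_dirs.append(out_dir)
    return output_dirs


# Stand-in extractor so the sketch runs without Keras or the image data;
# in the notebook, pass extract_features_v4 instead.
calls = []


def fake_extract(data_path, out_dir, name, test, batch_size):
    calls.append((data_path, out_dir, name, test, batch_size))


NETWORKS = ["inception", "xception", "resnet50", "vgg16"]
dirs = extract_all(fake_extract, "raw/test", "features/test", NETWORKS,
                   test=True, batch_size=1000)
print(dirs)
```

In the notebook, the training features would then come from `extract_all(extract_features_v4, train_data_path, train_features_path, NETWORKS, test=False, batch_size=batch_size)`, with a matching call using `test_data_path`, `test_features_path`, and `test=True` for the test set. Keeping separate cells, as this notebook does, has the advantage that a single network can be re-run without repeating the others.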
