# Setup

Unless you have a powerful GPU, running feature extraction on these models takes a significant amount of time. To make things easier, we precomputed bottleneck features for each network, so you can experiment with feature extraction even on a modest CPU. You can think of bottleneck features as feature extraction with caching: because the base network's weights are frozen during feature extraction, the output for a given image is always the same. Once an image has been passed through the network, we can therefore cache its output and reuse it.
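
The caching idea can be sketched in a few lines. This is only an illustration, not the code used to build the provided files: `cached_features`, `fake_extractor`, and the `features` key are hypothetical names, and a simple averaging function stands in for a frozen network so the sketch runs without Keras.

```python
import os
import tempfile

import numpy as np


def cached_features(cache_path, extract_fn, images):
    """Return features for `images`, computing them only on a cache miss."""
    if os.path.exists(cache_path):
        # Cache hit: with frozen weights the stored output is still valid.
        with np.load(cache_path) as archive:
            return archive["features"]
    # Cache miss: run the (expensive) extractor once and store the result.
    features = extract_fn(images)
    np.savez(cache_path, features=features)
    return features


calls = []  # records how often the extractor actually runs


def fake_extractor(images):
    # Stand-in for a frozen network: any deterministic function of the input.
    calls.append(1)
    return images.mean(axis=(1, 2))


images = np.random.rand(4, 8, 8, 3)
cache_path = os.path.join(tempfile.mkdtemp(), "demo_features.npz")

first = cached_features(cache_path, fake_extractor, images)   # computes and saves
second = cached_features(cache_path, fake_extractor, images)  # loads from disk
print(len(calls), np.array_equal(first, second))  # 1 True
```

The second call never touches the extractor, which is the saving the precomputed files give you.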

The files are named as follows:

    {network}_features.npz

where `network`, in the above filename, can be one of `vgg16`, `vgg19`, `inception`, `xception`, or `resnet50`.

# Extraction

To load the bottleneck features for the training, validation, and test sets, run:

    bottleneck_features = np.load('bottleneck_features/{network}_features.npz')
    train_{network} = bottleneck_features['train_{network}']
    valid_{network} = bottleneck_features['valid_{network}']
    test_{network} = bottleneck_features['test_{network}']
    
Here `{network}` is a placeholder rather than valid Python: replace every occurrence with one of `vgg16`, `vgg19`, `inception`, `xception`, or `resnet50`.

# Example Code

To load the VGG-16 bottleneck features for the training, validation, and test sets, run:
    
    import numpy as np

    bottleneck_features = np.load('bottleneck_features/vgg16_features.npz')
    train_vgg16 = bottleneck_features['train_vgg16']
    valid_vgg16 = bottleneck_features['valid_vgg16']
    test_vgg16 = bottleneck_features['test_vgg16']
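
The same pattern works for every network, so it can be wrapped in a small helper. The sketch below is one possible way to do that; `load_bottleneck_features` and `NETWORKS` are hypothetical names, and the demo writes a tiny stand-in archive to a temporary directory so it runs without the real files.

```python
import os
import tempfile

import numpy as np

NETWORKS = ("vgg16", "vgg19", "inception", "xception", "resnet50")


def load_bottleneck_features(network, directory="bottleneck_features"):
    """Return (train, valid, test) arrays for one of the precomputed networks."""
    if network not in NETWORKS:
        raise ValueError(f"unknown network {network!r}; expected one of {NETWORKS}")
    path = os.path.join(directory, f"{network}_features.npz")
    with np.load(path) as archive:
        # Keys follow the naming used above, e.g. 'train_vgg16'.
        return tuple(archive[f"{split}_{network}"] for split in ("train", "valid", "test"))


# Demo with a small stand-in archive (zeros, not real features).
demo_dir = tempfile.mkdtemp()
np.savez(os.path.join(demo_dir, "vgg16_features.npz"),
         train_vgg16=np.zeros((6, 7, 7, 512)),
         valid_vgg16=np.zeros((2, 7, 7, 512)),
         test_vgg16=np.zeros((2, 7, 7, 512)))

train_vgg16, valid_vgg16, test_vgg16 = load_bottleneck_features("vgg16", demo_dir)
print(train_vgg16.shape, valid_vgg16.shape, test_vgg16.shape)
```

With the real files in place, `load_bottleneck_features('resnet50')` would read `bottleneck_features/resnet50_features.npz` in the same way.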

# Appendix

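If you'd like to see the exact code we used to compute the bottleneck features, please look below. Each block of code is annotated with the size of the extracted bottleneck features. The blocks assume that `train_tensors`, `valid_tensors`, and `test_tensors` have already been built, as in the cells at the end of this section.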
    
### Xception

```
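import numpy as np
from keras.models import Model
# use the preprocess_input that matches this network
from keras.applications.xception import Xception, preprocess_input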

base_model = Xception(weights='imagenet', include_top=False)
model = Model(inputs=base_model.input, outputs=base_model.layers[-1].output)
# (nb_batches, 7, 7, 2048)
train_xception = model.predict(preprocess_input(train_tensors))
valid_xception = model.predict(preprocess_input(valid_tensors))
test_xception = model.predict(preprocess_input(test_tensors))

np.savez('xception_features', train_xception=train_xception, 
         valid_xception=valid_xception, test_xception=test_xception)
```

### Inception

```
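import numpy as np
from keras.models import Model
# use the preprocess_input that matches this network
from keras.applications.inception_v3 import InceptionV3, preprocess_input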

base_model = InceptionV3(weights='imagenet', include_top=False)
model = Model(inputs=base_model.input, outputs=base_model.layers[-1].output)
# (nb_batches, 5, 5, 2048)
train_inception = model.predict(preprocess_input(train_tensors))
valid_inception = model.predict(preprocess_input(valid_tensors))
test_inception = model.predict(preprocess_input(test_tensors))

np.savez('inception_features', train_inception=train_inception, 
         valid_inception=valid_inception, test_inception=test_inception)
```

### ResNet50

```
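import numpy as np
from keras.models import Model
# use the preprocess_input that matches this network
from keras.applications.resnet50 import ResNet50, preprocess_input
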
base_model = ResNet50(weights='imagenet', include_top=False)
model = Model(inputs=base_model.input, outputs=base_model.layers[-1].output)
# (nb_batches, 1, 1, 2048)
train_resnet50 = model.predict(preprocess_input(train_tensors))
valid_resnet50 = model.predict(preprocess_input(valid_tensors))
test_resnet50 = model.predict(preprocess_input(test_tensors))

np.savez('resnet50_features', train_resnet50=train_resnet50, 
         valid_resnet50=valid_resnet50, test_resnet50=test_resnet50)
```

### VGG-19

```
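import numpy as np
from keras.models import Model
# use the preprocess_input that matches this network
from keras.applications.vgg19 import VGG19, preprocess_input
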
base_model = VGG19(weights='imagenet', include_top=False)
model = Model(inputs=base_model.input, outputs=base_model.layers[-1].output)
# (nb_batches, 7, 7, 512)
train_vgg19 = model.predict(preprocess_input(train_tensors))
valid_vgg19 = model.predict(preprocess_input(valid_tensors))
test_vgg19 = model.predict(preprocess_input(test_tensors))

np.savez('vgg19_features', train_vgg19=train_vgg19, 
         valid_vgg19=valid_vgg19, test_vgg19=test_vgg19)
```

### VGG-16

```
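import numpy as np
from keras.models import Model
# use the preprocess_input that matches this network
from keras.applications.vgg16 import VGG16, preprocess_input

base_model = VGG16(weights='imagenet', include_top=False)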
model = Model(inputs=base_model.input, outputs=base_model.layers[-1].output)
# (nb_batches, 7, 7, 512)
train_vgg16 = model.predict(preprocess_input(train_tensors))
valid_vgg16 = model.predict(preprocess_input(valid_tensors))
test_vgg16 = model.predict(preprocess_input(test_tensors))

np.savez('vgg16_features', train_vgg16=train_vgg16, valid_vgg16=valid_vgg16, test_vgg16=test_vgg16)
```
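
The five blocks above differ only in the network name, which suggests a generic helper. The following is a sketch under stated assumptions, not the code we ran: `save_bottleneck_features` and `fake_extract` are hypothetical, and the stand-in extractor replaces `model.predict(preprocess_input(...))` so the example runs without Keras. It mainly shows how the `train_{network}`-style keys can be produced with keyword unpacking.

```python
import os
import tempfile

import numpy as np


def save_bottleneck_features(network, extract_fn, splits, out_dir="."):
    """Run `extract_fn` on each split and save the results as {network}_features.npz."""
    # Build keys such as 'train_vgg16' so the archive matches the naming above.
    arrays = {f"{name}_{network}": extract_fn(tensors) for name, tensors in splits.items()}
    path = os.path.join(out_dir, f"{network}_features.npz")
    np.savez(path, **arrays)
    return path


def fake_extract(tensors):
    # Stand-in for: lambda x: model.predict(preprocess_input(x))
    return tensors.mean(axis=(1, 2), keepdims=True)


splits = {
    "train": np.ones((3, 4, 4, 2)),
    "valid": np.ones((2, 4, 4, 2)),
    "test": np.ones((1, 4, 4, 2)),
}
saved_path = save_bottleneck_features("demo", fake_extract, splits, tempfile.mkdtemp())

with np.load(saved_path) as archive:
    saved_keys = sorted(archive.files)
    train_shape = archive["train_demo"].shape
print(saved_keys, train_shape)
```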

In [None]:
from sklearn.datasets import load_files       
from keras.utils import np_utils
import numpy as np
from glob import glob

# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets

# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/valid')
test_files, test_targets = load_dataset('dogImages/test')

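# load list of dog names, sorted to match the alphabetical label order used by load_files
# item[20:-1] drops the 20-character 'dogImages/train/NNN.' prefix and the trailing slash
# (assumes class folders are named like '001.Affenpinscher')
dog_names = [item[20:-1] for item in sorted(glob("dogImages/train/*/"))]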

# print statistics about the dataset
print('There are %d total dog categories.' % len(dog_names))
print('There are %s total dog images.\n' % str(len(train_files) + len(valid_files) + len(test_files)))
print('There are %d training dog images.' % len(train_files))
print('There are %d validation dog images.' % len(valid_files))
print('There are %d test dog images.'% len(test_files))
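
To make the target encoding in `load_dataset` concrete, here is a NumPy-only sketch of what one-hot encoding does. The `one_hot` function is a hypothetical stand-in written for illustration; it is not Keras's own implementation of `to_categorical`.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 at the label index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# Three images with labels 0, 2, and 1, out of 4 classes.
targets = one_hot([0, 2, 1], num_classes=4)
print(targets)
```

In the cell above, the number of classes is 133, so each row of `train_targets` has 133 entries with a single 1.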

In [None]:
from tqdm import tqdm
from keras.preprocessing import image   
from PIL import ImageFile                          
ImageFile.LOAD_TRUNCATED_IMAGES = True                 

def path_to_tensor(img_path):
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    return np.expand_dims(x, axis=0)

def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]
    return np.vstack(list_of_tensors)

# get tensors suitable for supplying to keras
train_tensors = paths_to_tensor(train_files)
valid_tensors = paths_to_tensor(valid_files)
test_tensors = paths_to_tensor(test_files)
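
The shape bookkeeping in `paths_to_tensor` can be checked without loading any images. In this sketch, `fake_path_to_tensor` and the file names are hypothetical; it returns a blank array of the same shape that `path_to_tensor` produces and assumes 32-bit floats for the memory estimate.

```python
import numpy as np


def fake_path_to_tensor(_img_path):
    # Stand-in for path_to_tensor: a blank 224x224 RGB image with a leading batch axis.
    return np.zeros((1, 224, 224, 3), dtype="float32")


paths = ["a.jpg", "b.jpg", "c.jpg"]  # hypothetical file names
stacked = np.vstack([fake_path_to_tensor(p) for p in paths])

# vstack concatenates along the first axis, giving one row per image.
print(stacked.shape)  # (3, 224, 224, 3)

# Rough memory cost per image at 4 bytes per value.
bytes_per_image = stacked.nbytes // len(paths)
print(bytes_per_image)  # 602112
```

At roughly 0.6 MB per image, the full tensors for a dataset of several thousand images take a few gigabytes of RAM.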