# Chapter 5

### FEATURE EXTRACTION WITH DATA AUGMENTATION

Now, let’s review the second technique we mentioned for feature extraction.
It is much slower and more expensive, but it allows us to use data augmentation during training:
* extending the **conv_base model** and running it end to end on the inputs.

**NOTE** This technique is so expensive that we should only attempt it if we
have access to a GPU—it’s absolutely intractable on CPU. If we can’t run our
code on GPU, then the previous technique is the way to go.

Because models behave just like layers, we can add a model (like conv_base) to a
Sequential model just like we would add a layer.

In [1]:
from tensorflow.keras import models
from tensorflow.keras import layers
from tensorflow.keras import optimizers

In [2]:
from tensorflow.keras.applications import VGG16

conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))

In [3]:
# Adding a densely connected classifier on top of the convolutional base
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

In [4]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
vgg16 (Functional)           (None, 4, 4, 512)         14714688  
_________________________________________________________________
flatten (Flatten)            (None, 8192)              0         
_________________________________________________________________
dense (Dense)                (None, 256)               2097408   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 257       
Total params: 16,812,353
Trainable params: 16,812,353
Non-trainable params: 0
_________________________________________________________________


As we can see, the convolutional base of VGG16 has `14,714,688` parameters, which is
very large. The classifier we’re adding on top has about 2.1 million parameters (`2,097,665`).
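
The classifier's parameter count can be checked by hand from the shapes in the model summary above. The following is a small sketch of that arithmetic; the helper name `dense_params` is introduced here for illustration and is not part of Keras.

```python
# Shapes taken from the model summary above.
conv_output_shape = (4, 4, 512)   # output of the VGG16 convolutional base
hidden_units = 256                # first Dense layer
output_units = 1                  # final sigmoid unit
conv_base_params = 14714688       # reported for the vgg16 layer in the summary


def dense_params(n_inputs, n_units):
    """Hypothetical helper: one weight per (input, unit) pair plus one bias per unit."""
    return n_inputs * n_units + n_units


# Flatten turns the (4, 4, 512) feature map into a single vector.
flattened = 1
for dim in conv_output_shape:
    flattened *= dim

dense_1 = dense_params(flattened, hidden_units)
dense_2 = dense_params(hidden_units, output_units)

print(flattened)                             # 8192
print(dense_1)                               # 2097408
print(dense_2)                               # 257
print(conv_base_params + dense_1 + dense_2)  # 16812353
```

The last figure matches the `Total params` line of the summary, which suggests that nearly all of the classifier's parameters sit in the first Dense layer.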
 
Before we compile and train the model, it’s very important to freeze the convolutional base.
* Freezing a layer or set of layers means preventing their weights from being updated during training. 
* If we don’t do this, then the representations that were previously learned by the convolutional base will be modified during training.

Because the Dense layers on top are randomly initialized, very large weight updates would be
propagated through the network, effectively destroying the representations previously
learned.
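
To make the idea of freezing concrete, here is a purely conceptual sketch of a gradient-descent step that skips frozen parameters. It is not how Keras implements the `trainable` attribute internally; the function `sgd_step`, the parameter names, and the scalar "weights" are all made up for illustration.

```python
def sgd_step(params, grads, frozen, learning_rate=0.1):
    """Return updated parameters, leaving frozen entries untouched."""
    updated = {}
    for name, value in params.items():
        if name in frozen:
            updated[name] = value  # frozen: the gradient is ignored
        else:
            updated[name] = value - learning_rate * grads[name]
    return updated


# Hypothetical scalar stand-ins for real weight tensors.
params = {"conv_kernel": 0.5, "dense_kernel": 0.0}
# Early in training, the randomly initialized classifier produces large gradients.
grads = {"conv_kernel": 4.0, "dense_kernel": 4.0}

after = sgd_step(params, grads, frozen={"conv_kernel"})
print(after["conv_kernel"])   # unchanged pretrained value
print(after["dense_kernel"])  # moved by the update
```

In this toy setting the pretrained value survives the step unchanged, while the new classifier weight is free to move.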
 
In **TensorFlow/Keras**, we freeze a network by setting its `trainable` attribute to `False`:

In [5]:
print('This is the number of trainable weights before freezing the conv base:',
      len(model.trainable_weights))

This is the number of trainable weights before freezing the conv base: 30


In [6]:
conv_base.trainable = False

In [7]:
print('This is the number of trainable weights after freezing the conv base:',
      len(model.trainable_weights))

This is the number of trainable weights after freezing the conv base: 4


With this setup, only the weights from the two Dense layers that we added will be
trained. That’s a total of four weight tensors: two per layer (the main weight matrix
and the bias vector). 
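
The counts printed above (30 before freezing, 4 after) can be reproduced with a little arithmetic. This sketch assumes the standard VGG16 layout of 13 convolutional layers arranged in five blocks, each layer holding a kernel and a bias; pooling and Flatten layers hold no weights.

```python
# Convolutional layers per block in the standard VGG16 architecture.
vgg16_conv_layers_per_block = [2, 2, 3, 3, 3]
tensors_per_layer = 2   # main weight matrix (kernel) + bias vector
dense_layers = 2        # the two Dense layers added on top

conv_tensors = sum(vgg16_conv_layers_per_block) * tensors_per_layer
dense_tensors = dense_layers * tensors_per_layer

before_freezing = conv_tensors + dense_tensors
after_freezing = dense_tensors  # conv_base contributes nothing once frozen

print(before_freezing)  # 30
print(after_freezing)   # 4
```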

Note that in order for these changes to take effect, we must first
compile the model. If we ever modify weight trainability after compilation, we
should then recompile the model, or these changes will be ignored.

Now we can start training our model, with the same data-augmentation configuration that we used in the previous example.

In [8]:
# Training the model end to end with a frozen convolutional base

from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [9]:
train_datagen = ImageDataGenerator(rescale=1./255,rotation_range=40,width_shift_range=0.2,
                                   height_shift_range=0.2,shear_range=0.2,zoom_range=0.2,
                                   horizontal_flip=True,fill_mode='nearest')

In [10]:
test_datagen = ImageDataGenerator(rescale=1./255)

In [11]:
import os

base_dir = 'cats_and_dogs_small'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')

In [12]:
train_generator = train_datagen.flow_from_directory(train_dir,target_size=(150, 150),
                                                    batch_size=20,class_mode='binary')

Found 2000 images belonging to 2 classes.


In [13]:
validation_generator = test_datagen.flow_from_directory(validation_dir,target_size=(150, 150),
                                                        batch_size=20,class_mode='binary')

Found 1000 images belonging to 2 classes.


In [14]:
# `learning_rate` replaces the deprecated `lr` argument in recent Keras versions
model.compile(loss='binary_crossentropy',optimizer=optimizers.RMSprop(learning_rate=2e-5),metrics=['acc'])

A GPU is needed to run the code below.

In [17]:
# `model.fit` accepts generators directly; `fit_generator` is deprecated in TensorFlow 2.x
# history = model.fit(train_generator,steps_per_epoch=100,epochs=30,
#                     validation_data=validation_generator,validation_steps=50)
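
The values `steps_per_epoch=100` and `validation_steps=50` follow from the dataset sizes and the batch size reported by the generators above. A short sketch of that calculation, using variable names chosen here for illustration:

```python
import math

train_images = 2000        # "Found 2000 images belonging to 2 classes."
validation_images = 1000   # "Found 1000 images belonging to 2 classes."
batch_size = 20            # batch_size passed to flow_from_directory

# One epoch should draw every image once, so divide and round up.
steps_per_epoch = math.ceil(train_images / batch_size)
validation_steps = math.ceil(validation_images / batch_size)

print(steps_per_epoch)    # 100
print(validation_steps)   # 50
```

Rounding up is a design choice for the case where the dataset size is not a multiple of the batch size; here both divisions are exact.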

Let’s plot the results again:

In [18]:
# import matplotlib.pyplot as plt

# acc = history.history['acc']
# val_acc = history.history['val_acc']
# loss = history.history['loss']
# val_loss = history.history['val_loss']

# epochs = range(1, len(acc) + 1)

# plt.plot(epochs, acc, 'bo', label='Training acc')
# plt.plot(epochs, val_acc, 'b', label='Validation acc')
# plt.title('Training and validation accuracy')
# plt.legend()
# plt.figure()

# plt.plot(epochs, loss, 'bo', label='Training loss')
# plt.plot(epochs, val_loss, 'b', label='Validation loss')
# plt.title('Training and validation loss')
# plt.legend()
# plt.show()

Trained this way, the model reaches a validation accuracy of about 96%. This is much better than what we achieved with the small convnet trained from scratch.