# Reusing Pretrained Layers

Training a large DNN from scratch is rarely a good idea. Instead, find an existing network that was trained on a similar task and reuse its lower layers. This technique is called *transfer learning*. It requires less training data and less training time.

The input shape must be the same as the input shape of the original model. If yours is different, you can add a preprocessing step that resizes your inputs to the size the original model expects.
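As a rough illustration of such a resizing step for image inputs, here is a minimal Keras sketch. The `pretrained` model is a hypothetical stand-in for a real pretrained network, the 64x64 and 32x32 sizes are arbitrary assumptions, and the sketch assumes a TensorFlow version that provides `keras.layers.Resizing`.

```python
import numpy as np
from tensorflow import keras

# Hypothetical stand-in for a pretrained model that expects 32x32 RGB inputs.
pretrained = keras.models.Sequential([
    keras.Input(shape=(32, 32, 3)),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),
])

# Put a resizing layer in front of it so that 64x64 images can be fed in.
model = keras.models.Sequential([
    keras.Input(shape=(64, 64, 3)),
    keras.layers.Resizing(32, 32),
    pretrained,
])

# Random placeholder images, only used to check that the shapes line up.
batch = np.random.rand(4, 64, 64, 3).astype("float32")
predictions = model.predict(batch, verbose=0)
```

Resizing inside the model keeps the preprocessing attached to the network, so the same step is applied at training and at inference time.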

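The output layer of the original model should usually be replaced, since it is most likely not useful for the new task. Similarly, the upper hidden layers are less likely to be useful than the lower ones, since the high-level features that matter most for the new task may differ significantly from those that mattered for the original task. You need to find the right number of layers to reuse.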

> The more similar the tasks are, the more layers from the original model you should reuse.

First, replace the output layer and freeze all the reused layers (make their weights non-trainable), then train the model and see how it performs. Next, try unfreezing one or two of the upper hidden layers and check whether performance improves. The more training data you have, the more layers you can unfreeze. It is also useful to reduce the learning rate when you unfreeze reused layers: this avoids wrecking their fine-tuned weights.

If you have enough training data, you can also replace the top hidden layers instead of reusing them, or add new hidden layers of your own.
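Replacing the top hidden layer might look like the following Keras sketch. `model_A` here is a small hypothetical stand-in for a pretrained network, and the layer sizes are arbitrary assumptions.

```python
from tensorflow import keras

# Hypothetical stand-in for a pretrained model with two hidden layers.
model_A = keras.models.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])

# Keep only the first hidden layer: drop the top hidden layer and the output layer.
reused = model_A.layers[:-2]

# Note: the reused layer is shared with model_A, not copied.
model_B = keras.models.Sequential(
    [keras.Input(shape=(8,))] + reused + [
        keras.layers.Dense(16, activation="relu"),    # replacement hidden layer
        keras.layers.Dense(1, activation="sigmoid"),  # new output layer
    ]
)
```

How many layers to slice off is something to tune: the closer the two tasks, the fewer layers you would typically drop.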

## Transfer Learning with Keras

Let's say we have a pretrained model A that was trained to classify images into 10 classes, and we want to build a binary image classification model B.

First, load the saved original model and create a new model that reuses all of its layers except the output layer:

```python
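from tensorflow import keras

model_A = keras.models.load_model('my_model_A.h5')
# Reuse all of model A's layers except its output layer, then add a binary output.
model_B_on_A = keras.models.Sequential(model_A.layers[:-1])
model_B_on_A.add(keras.layers.Dense(1, activation='sigmoid'))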
```

Note that `model_A` and `model_B_on_A` now share layers, so training `model_B_on_A` also modifies `model_A`. If you want to avoid that, clone `model_A` before reusing its layers, then copy the pretrained weights across, since `clone_model` only clones the architecture:

```python
model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())
```
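To check that the clone really protects the original, here is a small self-contained sketch. It builds a hypothetical stand-in for `model_A` in code instead of loading it from disk, uses random placeholder data, and builds `model_B_on_A` on the layers of the clone.

```python
import numpy as np
from tensorflow import keras

# Hypothetical stand-in for the pretrained model A (normally loaded from disk).
model_A = keras.models.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])

# Clone the architecture, then copy the weights across.
model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())

# Build model B on the clone's layers, not on model_A's.
model_B_on_A = keras.models.Sequential(model_A_clone.layers[:-1])
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))
model_B_on_A.compile(loss="binary_crossentropy", optimizer="sgd")

# Random placeholder data for a binary task.
X = np.random.rand(64, 8).astype("float32")
y = np.random.randint(0, 2, size=(64, 1)).astype("float32")

weights_before = [w.copy() for w in model_A.get_weights()]
model_B_on_A.fit(X, y, epochs=2, verbose=0)

# model_A is untouched because training only updated the cloned layers.
unchanged = all(
    np.array_equal(before, after)
    for before, after in zip(weights_before, model_A.get_weights())
)
```

Had `model_B_on_A` been built from `model_A.layers[:-1]` directly, the hidden layer would be shared and its weights in `model_A` would change during training.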

Next, freeze the reused layers and compile the model:

```python
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = False

model_B_on_A.compile(loss='binary_crossentropy', optimizer='sgd')
```

> You must always compile your model after you freeze or unfreeze layers.

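Keep in mind that even if you reuse all the layers with their pretrained weights, the new output layer's weights are randomly initialized. At the start of training the error is therefore large, and the resulting large gradients may wreck the pretrained weights. To avoid this, freeze the pretrained layers for the first few epochs so the output layer can learn reasonable weights, then unfreeze them and continue training.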
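The two-phase schedule can be sketched as follows. The model is a hypothetical stand-in whose hidden layers play the role of the reused layers, the data is random, and the epoch counts and learning rates are arbitrary assumptions, not recommended values.

```python
import numpy as np
from tensorflow import keras

# Hypothetical stand-in: two "reused" hidden layers plus a new output layer.
model_B_on_A = keras.models.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # new, randomly initialized
])

# Random placeholder data for a binary task.
X = np.random.rand(128, 8).astype("float32")
y = np.random.randint(0, 2, size=(128, 1)).astype("float32")

# Phase 1: freeze the reused layers and train only the output layer.
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = False
model_B_on_A.compile(loss="binary_crossentropy",
                     optimizer=keras.optimizers.SGD(learning_rate=1e-2))
frozen_before = model_B_on_A.layers[0].get_weights()[0].copy()
model_B_on_A.fit(X, y, epochs=4, verbose=0)
frozen_after = model_B_on_A.layers[0].get_weights()[0]

# Phase 2: unfreeze, recompile with a smaller learning rate, keep training.
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = True
model_B_on_A.compile(loss="binary_crossentropy",
                     optimizer=keras.optimizers.SGD(learning_rate=1e-4))
history = model_B_on_A.fit(X, y, epochs=4, verbose=0)
```

The model is compiled again after each change to `trainable`, and the second phase uses a lower learning rate so the reused weights are only nudged.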

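Transfer learning does not work well with small dense networks, presumably because small networks learn few patterns and dense networks learn very specific ones that are unlikely to be useful for other tasks. It is most useful with deep CNNs, which learn general image features in their lower hidden layers.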