
# Saving and Loading

## Setup

In [1]:
from keras.datasets import mnist
from keras import backend as keras_backend
from keras.utils import to_categorical
import matplotlib.pyplot as plt
import numpy as np

# store images as (height, width, channels)
keras_backend.set_image_data_format('channels_last')

Using TensorFlow backend.


In [2]:
from keras.datasets import mnist
from keras import backend as keras_backend
from keras.models import Sequential
from keras.layers import Dense

# load MNIST data and save sizes
(X_train, y_train), (X_test, y_test) = mnist.load_data()

image_height = X_train.shape[1]
image_width = X_train.shape[2]
number_of_pixels = image_height * image_width


# convert to floating-point
X_train = keras_backend.cast_to_floatx(X_train)
X_test = keras_backend.cast_to_floatx(X_test)


# scale data to range [0, 1]
X_train /= 255.0
X_test /= 255.0


# save the original y_train and y_test
original_y_train = y_train
original_y_test = y_test

# replace label data with one-hot encoded versions
number_of_classes = 1 + max(np.append(y_train, y_test)).astype(np.int32)

# encode each list into one-hot arrays of the size we just found
y_train = to_categorical(y_train, num_classes=number_of_classes)
y_test = to_categorical(y_test, num_classes=number_of_classes)

# reshape samples to 2D grid, one line per image
X_train = X_train.reshape([X_train.shape[0], number_of_pixels])
X_test = X_test.reshape([X_test.shape[0], number_of_pixels])

def make_one_hidden_layer_model():

  # create an empty model
  model = Sequential()

  # add a fully-connected hidden layer with #nodes = #pixels
  model.add(Dense(number_of_pixels, activation='relu', input_shape=[number_of_pixels]))

  # add an output layer with softmax activation
  model.add(Dense(number_of_classes, activation='softmax'))

  # compile the model to turn it from specification to code
  model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

  return model

# make the model
one_hidden_layer_model = make_one_hidden_layer_model()  
one_hidden_layer_model.summary()

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz





Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 784)               615440
_________________________________________________________________
dense_2 (Dense)              (None, 10)                7850
=================================================================
Total params: 623,290
Trainable params: 623,290
Non-trainable params: 0
_________________________________________________________________


## Saving Everything in One File

The easiest way to save our model and weights is to call a built-in
method belonging to our object that tells it to write itself to a file. The
method is, sensibly enough, called save(). When we call this method,
the model will write a file that contains both its architecture and
weights.

The model is saved in a format called HDF5, which conventionally
uses the extensions .h5 or .hdf5.

In [3]:
one_hidden_layer_model.save('One_layer_model.h5')









Later, we can read this file back in with the load_model() function.
Unlike save(), which is a method of the model object, load_model() is a
standalone function that we import from keras.models. That’s because
when we load a model, we don’t yet have an object whose methods we can call.

In [0]:
from keras.models import load_model

# to load the model and weights
model_and_weights = load_model('One_layer_model.h5')

Just like that, the variable model_and_weights now contains a complete
version of the model we saved, with all the weights we’d learned as of
the time the file was written.
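
One way to convince ourselves that the round trip works is to compare the predictions of a model before saving and after loading. The sketch below uses a tiny stand-in model and random inputs rather than the MNIST model above; the names small_model, samples, and save_path are made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

import numpy as np
from keras.models import Sequential, load_model
from keras.layers import Dense

# a tiny stand-in model (hypothetical sizes, not the MNIST model above);
# it is left uncompiled because we only use it for prediction
small_model = Sequential()
small_model.add(Dense(8, activation='relu', input_shape=(4,)))
small_model.add(Dense(3, activation='softmax'))

# some random inputs to run through the model
samples = np.random.rand(5, 4).astype('float32')
predictions_before = small_model.predict(samples, verbose=0)

# save architecture and weights together, then read them back
save_path = os.path.join(tempfile.mkdtemp(), 'small_model.h5')
small_model.save(save_path)
restored_model = load_model(save_path)

# the restored model should reproduce the original predictions
predictions_after = restored_model.predict(samples, verbose=0)
print(np.allclose(predictions_before, predictions_after, atol=1e-6))
```

If the two sets of predictions agree, the file captured both the architecture and the weights.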

## Saving Just the Weights

If we only want to save the weights (probably to save a little space on
our hard drive), the method save_weights() will do the job.

In [0]:
one_hidden_layer_model.save_weights('one_layer_model_weights_only.h5')

If we want to use these weights later, then we have to first build a
model to receive them. The most common case is when our model has
the same architecture as the model we used to save the weights. Then
the weights just pour right back in to where they had been.

In [0]:
# create a model with the same architecture as the one we saved the weights from
model_with_weights_only = make_one_hidden_layer_model()

# now read the weights back from a file and fill up the model
model_with_weights_only.load_weights('one_layer_model_weights_only.h5')
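
To see the weights "pour back in," we can build two models with the same architecture, save the weights of one, load them into the other, and check that every weight array matches. This is a minimal sketch with a small stand-in model; make_small_model() is a hypothetical helper, not part of Keras. The file name ends in .weights.h5 because some newer Keras releases expect that suffix for weight files, while older releases accept any name ending in .h5.

```python
import os
import tempfile

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

def make_small_model():
    # hypothetical stand-in for our model-building function
    model = Sequential()
    model.add(Dense(8, activation='relu', input_shape=(4,)))
    model.add(Dense(3, activation='softmax'))
    return model

source_model = make_small_model()
target_model = make_small_model()   # same architecture, different random weights

# write the weights of one model, then read them into the other
weights_path = os.path.join(tempfile.mkdtemp(), 'small_model.weights.h5')
source_model.save_weights(weights_path)
target_model.load_weights(weights_path)

# compare every weight array in the two models
weights_match = all(
    np.array_equal(saved, loaded)
    for saved, loaded in zip(source_model.get_weights(),
                             target_model.get_weights())
)
print(weights_match)
```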

## Saving Just the Architecture

Saving both the model and its weights is the most convenient way to
save our work, since we have everything we need in one place. Saving
just the weights is useful if we want to share our trained model with
people using different libraries that aren’t set up to read the Keras
architecture information.

If we need to save just the architecture of the model, Keras supports
two different formats: JSON and YAML. These
formats are both designed to save data structures to text-only files.

The technique for saving an architecture in both cases is to use Keras
to convert the model into a big character string, and then write that
string to a file.

To get the architecture back, we read the string from the file, and then
use Keras to turn the string into a model.

To turn a model into a YAML string, we use the to_yaml() method
that is part of the model.

In [0]:
# How to save just the architecture, without weights, using YAML
filename = 'one_layer_model_architecture_only.yaml'

# to_yaml() already returns a YAML string, so we write it out as-is
yaml_string = one_hidden_layer_model.to_yaml()
with open(filename, 'w') as outfile:
  outfile.write(yaml_string)

# How to load just the architecture, without weights, using YAML
from keras.models import model_from_yaml

with open(filename) as yaml_data:
  yaml_string = yaml_data.read()

# the rebuilt model has fresh, untrained weights
model = model_from_yaml(yaml_string)
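
The JSON route follows the same pattern, using the model's to_json() method and the model_from_json() function from keras.models. The sketch below uses a small stand-in model with made-up names (small_model, json_path) and writes to a temporary directory. Some later Keras releases no longer provide the YAML methods, so JSON may be the more portable of the two.

```python
import os
import tempfile

from keras.models import Sequential, model_from_json
from keras.layers import Dense

# a tiny stand-in model (hypothetical sizes, not the MNIST model above)
small_model = Sequential()
small_model.add(Dense(8, activation='relu', input_shape=(4,)))
small_model.add(Dense(3, activation='softmax'))

# save just the architecture, without weights, using JSON
json_path = os.path.join(tempfile.mkdtemp(), 'small_model_architecture_only.json')
json_string = small_model.to_json()
with open(json_path, 'w') as outfile:
    outfile.write(json_string)

# load just the architecture, without weights, using JSON
with open(json_path) as json_data:
    json_string_from_file = json_data.read()

rebuilt_model = model_from_json(json_string_from_file)

# the rebuilt model has the same layer sizes, but freshly initialized weights
rebuilt_sizes = [layer.units for layer in rebuilt_model.layers
                 if isinstance(layer, Dense)]
print(rebuilt_sizes)
```

Because only the architecture is stored, the rebuilt model still needs to be compiled, and then either trained or given weights with load_weights(), before it is useful.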

## Using Pre-Trained Models

Some deep learning models can have dozens of layers, and may have
been trained for days or weeks on mountains of data that we don’t
have access to. But if the authors of the model have released the structure
and weights, then we can instantly use their model and all the
hard work that went into it.

We often fine-tune these pre-trained models by training them on
our own data, helping them specialize on the tasks we need to do. This
is sometimes called transfer learning.

We might even modify the architecture, such as by adding a few layers
of our own to the end of the pre-trained model. We “protect” the existing
layers by telling Keras not to change their weights during training.
We say that such layers are frozen. This means that only our new layers
get updated weights as we train.

To freeze a layer, we set the layer’s trainable attribute to
False. We can later “thaw” a frozen layer by setting this attribute back to
True. In either case, we compile the model again so the change takes effect.
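
As a rough illustration of freezing, the sketch below stands in for a pre-trained model with a single Dense layer, adds one new layer after it, freezes the first layer, and trains briefly on random data. The layer sizes, the random data, and the names base_layer and new_layer are all made up for this example. After training, we check that the frozen layer's weights are untouched while the new layer's weights have moved.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# base_layer plays the role of a pre-trained layer; new_layer is our addition
base_layer = Dense(8, activation='relu', input_shape=(4,))
new_layer = Dense(3, activation='softmax')

model = Sequential()
model.add(base_layer)
model.add(new_layer)

# freeze the "pre-trained" layer, then compile so the setting takes effect
base_layer.trainable = False
model.compile(loss='categorical_crossentropy', optimizer='adam')

# remember the weights before training
frozen_before = [w.copy() for w in base_layer.get_weights()]
new_before = [w.copy() for w in new_layer.get_weights()]

# random inputs and random one-hot labels, just to drive a few updates
X = np.random.rand(32, 4).astype('float32')
y = np.eye(3)[np.random.randint(0, 3, size=32)].astype('float32')
model.fit(X, y, epochs=2, batch_size=8, verbose=0)

# the frozen layer should be unchanged; the new layer should have learned
frozen_unchanged = all(np.array_equal(before, after)
                       for before, after in zip(frozen_before,
                                                base_layer.get_weights()))
new_changed = any(not np.array_equal(before, after)
                  for before, after in zip(new_before,
                                           new_layer.get_weights()))
print(frozen_unchanged, new_changed)
```

To thaw the layer later, we would set base_layer.trainable = True and call model.compile() again before more training.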