# Saving and Loading a Keras Neural Network

Complex neural networks can take a long time to fit (train). Saving a trained network lets you reload it later without retraining. This notebook uses two formats for saving a Keras neural network.

* **JSON** - Stores the neural network structure (no weights) in the [JSON file format](https://en.wikipedia.org/wiki/JSON).
* **HDF5** - Stores the complete neural network (with weights) in the [HDF5 file format](https://en.wikipedia.org/wiki/Hierarchical_Data_Format). Do not confuse HDF5 with [HDFS](https://en.wikipedia.org/wiki/Apache_Hadoop).  They are different.  We do not use HDFS in this class.

Usually, you will want to save in HDF5.

In [1]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
import pandas as pd
import io
import os
import requests
import numpy as np
from sklearn import metrics

In [2]:
save_path = "."

In [3]:
df = pd.read_csv('https://raw.githubusercontent.com/glopez21/Deep-Learning-Intro/main/data/auto-mpg.csv', na_values=['NA', '?'])

In [4]:
cars = df['name']

In [5]:
# Fill missing horsepower values with the column median
df['horsepower'] = df['horsepower'].fillna(df['horsepower'].median())

In [6]:
# Pandas to Numpy
x = df[['cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'year', 'origin']].values
y = df['mpg'].values

In [7]:
# Build the neural network
model = Sequential()
model.add(Dense(25, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(10, activation='relu')) # Hidden 2
model.add(Dense(1)) # Output
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x,y,verbose=2,epochs=100)

Epoch 1/100
13/13 - 1s - loss: 1371.1456 - 902ms/epoch - 69ms/step
Epoch 2/100
13/13 - 0s - loss: 692.0790 - 42ms/epoch - 3ms/step
Epoch 3/100
13/13 - 0s - loss: 485.6282 - 47ms/epoch - 4ms/step
Epoch 4/100
13/13 - 0s - loss: 396.6917 - 40ms/epoch - 3ms/step
Epoch 5/100
13/13 - 0s - loss: 354.2178 - 51ms/epoch - 4ms/step
Epoch 6/100
13/13 - 0s - loss: 328.2730 - 57ms/epoch - 4ms/step
Epoch 7/100
13/13 - 0s - loss: 284.9070 - 54ms/epoch - 4ms/step
Epoch 8/100
13/13 - 0s - loss: 243.7017 - 49ms/epoch - 4ms/step
Epoch 9/100
13/13 - 0s - loss: 204.8975 - 50ms/epoch - 4ms/step
Epoch 10/100
13/13 - 0s - loss: 171.7448 - 42ms/epoch - 3ms/step
Epoch 11/100
13/13 - 0s - loss: 148.3961 - 43ms/epoch - 3ms/step
Epoch 12/100
13/13 - 0s - loss: 124.7580 - 58ms/epoch - 4ms/step
Epoch 13/100
13/13 - 0s - loss: 105.6859 - 48ms/epoch - 4ms/step
Epoch 14/100
13/13 - 0s - loss: 100.4163 - 47ms/epoch - 4ms/step
Epoch 15/100
13/13 - 0s - loss: 82.1816 - 44ms/epoch - 3ms/step
...

<keras.callbacks.History at 0x7f13fa857690>

In [8]:
# Predict
pred = model.predict(x)



In [9]:
# Measure the RMSE, a common error metric for regression.
# scikit-learn expects the true values first, then the predictions.
score = np.sqrt(metrics.mean_squared_error(y, pred))
print(f"Before save score (RMSE): {score}")

Before save score (RMSE): 3.644297737478241
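
For reference, RMSE is the square root of the mean of the squared differences between predictions and targets. The sketch below computes it directly with NumPy on a few made-up numbers (the values and the `rmse` helper are illustrative assumptions, not part of the notebook). It flattens both inputs first, because `model.predict` returns a column of shape `(n, 1)` while the targets have shape `(n,)`, and subtracting those shapes directly would broadcast to an `(n, n)` matrix.

```python
import numpy as np

# Hypothetical targets and predictions, made up for illustration only.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 9.0, 9.0])


def rmse(y_true, y_pred):
    """Root mean squared error between two arrays of equal length."""
    # Flatten so an (n, 1) prediction column lines up with (n,) targets.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


# Squared errors are 1, 0, 4, 0, so the mean is 1.25 and the RMSE is sqrt(1.25).
print(rmse(y_true, y_pred))
# The same result with predictions shaped like Keras output, (n, 1).
print(rmse(y_true, y_pred.reshape(-1, 1)))
```

Both calls print the same value, about 1.118.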


In [10]:
# save neural network structure to JSON (no weights)
model_json = model.to_json()
with open(os.path.join(save_path,"network.json"), "w") as json_file:
    json_file.write(model_json)
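
Because the JSON file holds only the structure, a network rebuilt from it starts with freshly initialized weights. The following is a minimal sketch of that behavior, assuming a small stand-in network rather than the one trained above; the layer sizes, the temporary directory, and the `network.weights.h5` file name are illustrative choices. It rebuilds the network with `model_from_json` and then restores the weights separately with `save_weights` and `load_weights`.

```python
import os
import tempfile

import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential, model_from_json

# Small stand-in network; the layer sizes are arbitrary.
original = Sequential([Input(shape=(7,)), Dense(4, activation="relu"), Dense(1)])

# Hypothetical input batch, used only to compare predictions.
sample = np.random.rand(5, 7).astype("float32")

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "network.json")
    weights_path = os.path.join(tmp_dir, "network.weights.h5")

    # Save the structure (JSON) and the weights (HDF5) as two separate files.
    with open(json_path, "w") as json_file:
        json_file.write(original.to_json())
    original.save_weights(weights_path)

    # Rebuild from JSON: same layers, but newly initialized weights.
    with open(json_path) as json_file:
        rebuilt = model_from_json(json_file.read())
    same_before = np.allclose(
        original.predict(sample, verbose=0), rebuilt.predict(sample, verbose=0)
    )

    # Load the saved weights into the rebuilt network.
    rebuilt.load_weights(weights_path)
    same_after = np.allclose(
        original.predict(sample, verbose=0), rebuilt.predict(sample, verbose=0)
    )

print(f"Predictions match before loading weights: {same_before}")
print(f"Predictions match after loading weights: {same_after}")
```

The predictions should agree only after the weights are loaded, which is why a single HDF5 file containing both structure and weights is usually more convenient.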

In [11]:
# save entire network to HDF5 (save everything, suggested)
model.save(os.path.join(save_path,"network.h5"))

The code below reloads the saved network from the HDF5 file and makes another prediction on the same data. It does not refit the network; the weights come from the earlier fit. If the network was saved and reloaded correctly, the RMSE should match the earlier value exactly.

In [12]:
from tensorflow.keras.models import load_model
model2 = load_model(os.path.join(save_path,"network.h5"))
pred = model2.predict(x)
# Measure the RMSE, a common error metric for regression.
# scikit-learn expects the true values first, then the predictions.
score = np.sqrt(metrics.mean_squared_error(y, pred))
print(f"After load score (RMSE): {score}")

After load score (RMSE): 3.644297737478241
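
Matching RMSE values are good evidence that the round trip worked. A more direct check is to compare the weight arrays themselves. The sketch below does this under a few assumptions: it uses an untrained, uncompiled stand-in network and a temporary directory instead of the notebook's `model` and `save_path`, and it passes `compile=False` to `load_model` because the stand-in has no training configuration to restore.

```python
import os
import tempfile

import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential, load_model

# Untrained stand-in network; its random initial weights are enough for this check.
model = Sequential([Input(shape=(7,)), Dense(4, activation="relu"), Dense(1)])

# Hypothetical input batch, used only to compare predictions.
sample = np.random.rand(5, 7).astype("float32")

with tempfile.TemporaryDirectory() as tmp_dir:
    h5_path = os.path.join(tmp_dir, "network.h5")

    # Save structure and weights together, then read them back.
    model.save(h5_path)
    reloaded = load_model(h5_path, compile=False)

# Every weight array should be identical after the round trip.
weights_match = all(
    np.array_equal(saved, loaded)
    for saved, loaded in zip(model.get_weights(), reloaded.get_weights())
)

# Identical weights imply identical predictions on the same input.
preds_match = np.allclose(
    model.predict(sample, verbose=0), reloaded.predict(sample, verbose=0)
)

print(f"Weights match: {weights_match}")
print(f"Predictions match: {preds_match}")
```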
