In [1]:
import tensorflow as tf

In [2]:
print(tf.__version__)

2.0.0


# Feature Vector

The input to a neural network is called the feature vector. 

The size of this vector is always a fixed length.
Changing the size of the feature vector means recreating the entire neural network. 

There are usually five types of neurons in a neural network:

Input Neurons - Each input neuron is mapped to one element in the feature vector.
Hidden Neurons - Hidden neurons allow the neural network to abstract and process the input into the output.
Output Neurons - Each output neuron calculates one part of the output.
Context Neurons - Hold state between calls to the neural network to predict.
Bias Neurons - Work similarly to the y-intercept of a linear equation.

These neurons are grouped into layers:

Input Layer - The input layer accepts feature vectors from the dataset. Input layers usually have a bias neuron.
Output Layer - The output from the neural network. The output layer does not have a bias neuron.
Hidden Layers - Layers that occur between the input and output layers. Each hidden layer will usually have a bias neuron.
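
As a rough illustration of how a fixed-length feature vector, weights, and bias values fit together, the sketch below computes a single layer by hand in plain Python. The helper name `dense_forward` and all numeric values are made up for this example and do not come from the notebook.

```python
# Minimal sketch with illustrative values (not taken from the notebook).

def dense_forward(features, weights, biases):
    """Compute one layer: each output is a weighted sum of the inputs plus a bias."""
    if len(features) != len(weights):
        # The feature vector length is fixed by the number of input neurons.
        raise ValueError("feature vector length must match the number of input neurons")
    outputs = []
    for j, bias in enumerate(biases):
        total = bias  # the bias plays the role of a y-intercept
        for i, value in enumerate(features):
            total += value * weights[i][j]
        outputs.append(total)
    return outputs

# Three input neurons feeding two neurons in the next layer.
weights = [[0.5, -1.0],
           [0.25, 0.0],
           [1.0, 2.0]]
biases = [0.5, -0.5]

layer_output = dense_forward([2.0, 4.0, 1.0], weights, biases)
print(layer_output)  # [3.5, -0.5]
```

Passing a feature vector of any other length raises an error here, which mirrors why changing the feature vector size means rebuilding the network.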

# Activation Functions
An activation function determines the output of a neuron: the neuron computes the weighted sum of its inputs, adds a bias, and passes the result through the activation function. The purpose of the activation function is to introduce non-linearity into the output of a neuron.

Activation functions, also known as transfer functions, are used to calculate the output of each layer of a neural network. Historically neural networks have used a hyperbolic tangent, sigmoid/logistic, or linear activation function. However, modern deep neural networks primarily make use of the following activation functions:

Rectified Linear Unit (ReLU) - Used for the output of hidden layers.
Softmax - Used for the output of classification neural networks.
Linear - Used for the output of regression neural networks (or 2-class classification).
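
The three functions listed above can be sketched in a few lines of plain Python. This is only an illustration of the math, not how TensorFlow implements them; the function names and input values are chosen for this example.

```python
import math

def relu(x):
    """Rectified Linear Unit: negative inputs become 0, positive inputs pass through."""
    return max(0.0, x)

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(values)
    # Subtracting the maximum does not change the result but avoids overflow.
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def linear(x):
    """Linear (identity) activation: the output equals the input."""
    return x

hidden = [relu(v) for v in [-2.0, 0.0, 3.0]]
probs = softmax([1.0, 2.0, 3.0])
print(hidden)  # [0.0, 0.0, 3.0]
print(probs, sum(probs))
```

The softmax outputs keep the ordering of the raw scores, so the largest score still receives the largest probability.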

ReLU is the most common activation function. In practice, networks with ReLU tend to converge better than networks with sigmoid.

ReLU is far less prone to the vanishing gradient problem than sigmoid or hyperbolic tangent, because its gradient is 1 for every positive input instead of shrinking toward 0.

Leaky ReLU, Parametric ReLU, and the exponential variants (ELU, SELU) can work better when many inputs are negative, because ReLU outputs 0 for every negative value.

For negative x (positive x passes through unchanged as y=x):
Leaky ReLU: y=0.01x
Parametric ReLU: y=alpha*x, where alpha is learned during training
ELU: y=a(e^x-1)
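
The variants above differ only in how they treat negative inputs; positive inputs pass through unchanged. A minimal plain-Python sketch follows. The function names and default parameter values are assumptions for illustration, and in a real Parametric ReLU the `alpha` value would be learned during training rather than passed in.

```python
import math

def leaky_relu(x, slope=0.01):
    """Leaky ReLU: a small fixed slope for negative inputs."""
    return x if x > 0 else slope * x

def parametric_relu(x, alpha):
    """Parametric ReLU: same shape as Leaky ReLU, but alpha is a trainable value."""
    return x if x > 0 else alpha * x

def elu(x, a=1.0):
    """Exponential Linear Unit: smooth curve that approaches -a for very negative x."""
    return x if x > 0 else a * (math.exp(x) - 1.0)

for value in [-2.0, 3.0]:
    print(value, leaky_relu(value), parametric_relu(value, alpha=0.25), elu(value))
```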

Concatenated ReLU (CReLU)
Concatenated ReLU has two outputs, ReLU(x) and ReLU(-x), concatenated together. In other words, for positive x it produces [x, 0], and for negative x it produces [0, -x]. Because it has two outputs, CReLU doubles the output dimension.
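
A small sketch of CReLU on a plain Python list, following the common definition (the one `tf.nn.crelu` uses) in which the second half is ReLU applied to the negated input, so a negative input contributes its magnitude. The helper name `crelu` here is a stand-in written for this example.

```python
def crelu(values):
    """Concatenate ReLU(x) with ReLU(-x), doubling the output length."""
    positive_part = [max(0.0, v) for v in values]
    negative_part = [max(0.0, -v) for v in values]
    return positive_part + negative_part

result = crelu([3.0, -2.0])
print(result)  # [3.0, 0.0, 0.0, 2.0]
```

The two-element input becomes a four-element output: the first half keeps the positive value, the second half keeps the magnitude of the negative value.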

In [3]:
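import torch
from torch import nn

relu = nn.ReLU()
# Since PyTorch 0.4, plain tensors support autograd directly,
# so wrapping the tensor in autograd.Variable is no longer needed.
var = torch.randn(2)  # two samples from a standard normal distribution
relu(var)  # negative entries are replaced by 0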

tensor([0., 0.])

In [4]:
import tensorflow as tf
print("Tensor Flow Version: {}".format(tf.__version__))

Tensor Flow Version: 2.0.0


# Basic Matmul operation in Tensorflow

In [5]:
import tensorflow as tf

# Create a constant tensor holding a 1x2 matrix. TensorFlow 2.x runs
# eagerly by default, so the value is available immediately instead of
# being added as a node to a graph that is run later.
matrix1 = tf.constant([[3., 3.]])

# Create another Constant that produces a 2x1 matrix.
matrix2 = tf.constant([[2.],[2.]])

# Create a Matmul op that takes 'matrix1' and 'matrix2' as inputs.
# The returned value, 'product', represents the result of the matrix
# multiplication.
product = tf.matmul(matrix1, matrix2)

print(product)
print(float(product))

tf.Tensor([[12.]], shape=(1, 1), dtype=float32)
12.0
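
To make the arithmetic behind `tf.matmul` explicit, here is the same 1x2 by 2x1 product written in plain Python. The helper `matmul` is hypothetical and written only for this illustration; it is not part of TensorFlow.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of the first matrix must equal rows of the second")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]

# Same values as matrix1 and matrix2 in the cell above: 3*2 + 3*2 = 12.
product_check = matmul([[3.0, 3.0]], [[2.0], [2.0]])
print(product_check)  # [[12.0]]
```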


In [6]:
import tensorflow as tf

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# Add an op to subtract 'a' from 'x'.  Run it and print the result
sub = tf.subtract(x, a)
print(sub)
print(sub.numpy())

tf.Tensor([-2. -1.], shape=(2,), dtype=float32)
[-2. -1.]


# To modify values of a variable in tensorflow

In [7]:
x

<tf.Variable 'Variable:0' shape=(2,) dtype=float32, numpy=array([1., 2.], dtype=float32)>

In [8]:
x.assign([4.0,6.0])

<tf.Variable 'UnreadVariable' shape=(2,) dtype=float32, numpy=array([4., 6.], dtype=float32)>

In [9]:
print(sub.numpy()) # earlier value

[-2. -1.]


In [10]:
sub = tf.subtract(x, a)
print(sub)
print(sub.numpy()) # after changing

tf.Tensor([1. 3.], shape=(2,), dtype=float32)
[1. 3.]


# Keras

In [12]:
import keras

Using TensorFlow backend.


# Regression Keras

In [16]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
import pandas as pd
import io
import os
import requests
import numpy as np
from sklearn import metrics

df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/auto-mpg.csv", 
    na_values=['NA', '?'])

cars = df['name']

# Handle missing value
df['horsepower'] = df['horsepower'].fillna(df['horsepower'].median())

# Pandas to Numpy
x = df[['cylinders', 'displacement', 'horsepower', 'weight',
       'acceleration', 'year', 'origin']].values
y = df['mpg'].values # regression

# Build the neural network
model = Sequential()
model.add(Dense(25, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(10, activation='relu')) # Hidden 2
model.add(Dense(1)) # Output
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x,y,verbose=1,epochs=100)

Train on 398 samples
Epoch 1/100
...
Epoch 100/100


<tensorflow.python.keras.callbacks.History at 0x1d7114e1688>

In [17]:
x.shape

(398, 7)

In [18]:
pred = model.predict(x)
print("Shape: {}".format(pred.shape))
print(pred)

Shape: (398, 1)
[[12.582692 ]
 [12.753652 ]
 [13.100087 ]
 [15.532216 ]
 [14.760854 ]
 [11.94103  ]
 [11.183431 ]
 [12.570088 ]
 [12.554433 ]
 [11.909871 ]
 [ 5.8867693]
 [13.412953 ]
 [ 2.0308733]
 [-4.9724226]
 [28.703907 ]
 [18.055378 ]
 [17.42342  ]
 [12.703359 ]
 [27.522907 ]
 [15.332234 ]
 [31.177052 ]
 [29.660908 ]
 [29.755054 ]
 [28.62238  ]
 [14.713857 ]
 [30.699848 ]
 [34.989033 ]
 [34.91866  ]
 [38.381184 ]
 [27.601215 ]
 [20.872166 ]
 [27.09206  ]
 [25.914143 ]
 [10.367909 ]
 [22.578144 ]
 [15.550557 ]
 [13.212697 ]
 [18.441578 ]
 [19.48888  ]
 [15.056142 ]
 [16.301868 ]
 [21.42815  ]
 [25.547382 ]
 [17.823261 ]
 [23.807453 ]
 [11.2664385]
 [19.098866 ]
 [15.032704 ]
 [11.273763 ]
 [23.354588 ]
 [24.104252 ]
 [26.130451 ]
 [26.264484 ]
 [23.19965  ]
 [21.771107 ]
 [18.00475  ]
 [21.877628 ]
 [27.532795 ]
 [25.291676 ]
 [21.712315 ]
 [22.056599 ]
 [23.024662 ]
 [20.417896 ]
 [14.003041 ]
 [21.919666 ]
 [16.15038  ]
 [18.91457  ]
 [17.378925 ]
 [21.456562 ]
 [21.681324 ]
 [16

In [19]:
# Measure RMSE error.  RMSE is common for regression.
score = np.sqrt(metrics.mean_squared_error(pred,y))
print(f"Final score (RMSE): {score}")

Final score (RMSE): 8.622374518865493
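
RMSE is the square root of the mean squared difference between expected and predicted values, so it is expressed in the same units as the target (MPG here). The sketch below spells that out in plain Python; the helper name `rmse` and the sample values are made up for illustration and are not the notebook's predictions.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("both sequences must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Every prediction is off by exactly 2, so the RMSE is 2.
example_score = rmse([18.0, 15.0, 18.0, 16.0], [16.0, 17.0, 16.0, 18.0])
print(example_score)  # 2.0
```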


In [20]:
# sample predictions
for i in range(10):
    print(f"{i+1}. Car name: {cars[i]}, MPG: {y[i]}, predicted MPG: {pred[i]}")

1. Car name: chevrolet chevelle malibu, MPG: 18.0, predicted MPG: [12.582692]
2. Car name: buick skylark 320, MPG: 15.0, predicted MPG: [12.753652]
3. Car name: plymouth satellite, MPG: 18.0, predicted MPG: [13.100087]
4. Car name: amc rebel sst, MPG: 16.0, predicted MPG: [15.532216]
5. Car name: ford torino, MPG: 17.0, predicted MPG: [14.760854]
6. Car name: ford galaxie 500, MPG: 15.0, predicted MPG: [11.94103]
7. Car name: chevrolet impala, MPG: 14.0, predicted MPG: [11.183431]
8. Car name: plymouth fury iii, MPG: 14.0, predicted MPG: [12.570088]
9. Car name: pontiac catalina, MPG: 14.0, predicted MPG: [12.554433]
10. Car name: amc ambassador dpl, MPG: 15.0, predicted MPG: [11.909871]


# Keras Classification

In [21]:
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.callbacks import EarlyStopping

df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/iris.csv", 
    na_values=['NA', '?'])

# Convert to numpy - Classification
x = df[['sepal_l', 'sepal_w', 'petal_l', 'petal_w']].values
dummies = pd.get_dummies(df['species']) # Classification
species = dummies.columns # species names, used later to map argmax() predictions back to column names
y = dummies.values


# Build neural network
model = Sequential()
model.add(Dense(50, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(25, activation='relu')) # Hidden 2
model.add(Dense(y.shape[1],activation='softmax')) # Output

model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(x,y,verbose=2,epochs=100)


Train on 150 samples
Epoch 1/100
150/150 - 1s - loss: 1.2509
Epoch 2/100
150/150 - 0s - loss: 1.0476
Epoch 3/100
150/150 - 0s - loss: 0.9299
Epoch 4/100
150/150 - 0s - loss: 0.8507
Epoch 5/100
150/150 - 0s - loss: 0.7959
Epoch 6/100
150/150 - 0s - loss: 0.7469
Epoch 7/100
150/150 - 0s - loss: 0.6987
Epoch 8/100
150/150 - 0s - loss: 0.6591
Epoch 9/100
150/150 - 0s - loss: 0.6208
Epoch 10/100
150/150 - 0s - loss: 0.5874
Epoch 11/100
150/150 - 0s - loss: 0.5592
Epoch 12/100
150/150 - 0s - loss: 0.5352
Epoch 13/100
150/150 - 0s - loss: 0.5139
Epoch 14/100
150/150 - 0s - loss: 0.4916
Epoch 15/100
150/150 - 0s - loss: 0.4726
Epoch 16/100
150/150 - 0s - loss: 0.4552
Epoch 17/100
150/150 - 0s - loss: 0.4396
Epoch 18/100
150/150 - 0s - loss: 0.4247
Epoch 19/100
150/150 - 0s - loss: 0.4114
Epoch 20/100
150/150 - 0s - loss: 0.3957
Epoch 21/100
150/150 - 0s - loss: 0.3835
Epoch 22/100
150/150 - 0s - loss: 0.3711
Epoch 23/100
150/150 - 0s - loss: 0.3570
Epoch 24/100
150/150 - 0s - loss: 0.3412
Epoc

<tensorflow.python.keras.callbacks.History at 0x1d71626f488>

In [22]:
pred = model.predict(x)
print(f"Shape: {pred.shape}")
print(pred)

Shape: (150, 3)
[[9.99269783e-01 7.30180764e-04 2.41532710e-10]
 [9.97944772e-01 2.05522962e-03 1.95855088e-09]
 [9.98688519e-01 1.31148042e-03 1.28891220e-09]
 [9.97155309e-01 2.84463097e-03 5.41009637e-09]
 [9.99335825e-01 6.64129504e-04 2.41512282e-10]
 [9.98961449e-01 1.03854388e-03 2.53099097e-10]
 [9.98321950e-01 1.67801522e-03 2.25971153e-09]
 [9.98777807e-01 1.22219021e-03 6.67583988e-10]
 [9.96348321e-01 3.65167460e-03 1.17659420e-08]
 [9.98268127e-01 1.73180713e-03 1.41881595e-09]
 [9.99475420e-01 5.24624309e-04 8.23007842e-11]
 [9.98139739e-01 1.86027377e-03 1.84461946e-09]
 [9.98276472e-01 1.72348635e-03 1.69771286e-09]
 [9.98820007e-01 1.18000596e-03 2.15685469e-09]
 [9.99896884e-01 1.03136124e-04 2.88713324e-12]
 [9.99797761e-01 2.02260329e-04 1.00097135e-11]
 [9.99632955e-01 3.67011962e-04 4.74201615e-11]
 [9.99056399e-01 9.43601306e-04 3.75824594e-10]
 [9.99247313e-01 7.52659282e-04 8.94548879e-11]
 [9.99276102e-01 7.23958132e-04 2.32878772e-10]
 [9.98461127e-01 1.5

Usually, the column of pred with the highest value is taken as the prediction of the neural network. To convert the predictions to the expected iris species, use the argmax function, which finds the index of the maximum prediction for each row.
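
As a small illustration of that step, the sketch below applies an argmax to a few rows of class probabilities and maps each index to a species name. The probabilities are invented for the example, and the species list is assumed to match the alphabetical column order that `pd.get_dummies` produces for the iris data.

```python
def argmax(row):
    """Return the index of the largest value in a row."""
    return max(range(len(row)), key=lambda i: row[i])

# Assumed column order; the probabilities below are illustrative only.
species_names = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
probabilities = [
    [0.90, 0.09, 0.01],
    [0.10, 0.70, 0.20],
    [0.05, 0.25, 0.70],
]

class_indexes = [argmax(row) for row in probabilities]
predicted_names = [species_names[i] for i in class_indexes]
print(class_indexes)    # [0, 1, 2]
print(predicted_names)
```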

In [23]:
predict_classes = np.argmax(pred,axis=1)
expected_classes = np.argmax(y,axis=1)
print(f"Predictions: {predict_classes}")
print(f"Expected: {expected_classes}")

Predictions: [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1
 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
Expected: [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]


In [30]:
np.argmax(y,axis=1)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2], dtype=int64)

In [33]:
# Of course it is very easy to turn these indexes back into iris species.  We just use the species list that we created earlier.
print(species[predict_classes[1:10]])

Index(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa'],
      dtype='object')


In [34]:
from sklearn.metrics import accuracy_score
correct = accuracy_score(expected_classes,predict_classes)
print(f"Accuracy: {correct}")

Accuracy: 0.9866666666666667
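
Accuracy is the fraction of rows where the predicted class index equals the expected one. The sketch below reproduces that calculation in plain Python, assuming (from the printed arrays above) that exactly two of the 150 flowers are misclassified; the helper name `accuracy` is made up for this example.

```python
def accuracy(expected, predicted):
    """Fraction of positions where the two label sequences agree."""
    if len(expected) != len(predicted):
        raise ValueError("both sequences must have the same length")
    matches = sum(1 for e, p in zip(expected, predicted) if e == p)
    return matches / len(expected)

# 50 flowers per class; two class-1 flowers are predicted as class 2.
expected = [0] * 50 + [1] * 50 + [2] * 50
predicted = list(expected)
predicted[70] = 2
predicted[83] = 2

acc = accuracy(expected, predicted)
print(f"Accuracy: {acc}")  # 148 correct out of 150
```

Note that this score is measured on the same rows the model was trained on, so it says little about performance on unseen flowers.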


In [35]:
# ad hoc prediction
sample_flower = np.array( [[5.0,3.0,4.0,2.0]], dtype=float)
pred = model.predict(sample_flower)
print(pred)
pred = np.argmax(pred)
print(f"Predict that {sample_flower} is: {species[pred]}")

[[2.0889814e-04 2.2672406e-01 7.7306706e-01]]
Predict that [[5. 3. 4. 2.]] is: Iris-virginica


In [38]:
# prediction for 2 flowers
sample_flower = np.array( [[5.0,3.0,4.0,2.0],[5.2,3.5,1.5,0.8]], dtype=float)
pred = model.predict(sample_flower)
print(pred)
pred = np.argmax(pred,axis=1)
print(f"Predict that these two flowers {sample_flower} are: {species[pred]}")

[[2.0889814e-04 2.2672401e-01 7.7306706e-01]
 [9.9575365e-01 4.2463201e-03 4.1305803e-09]]
Predict that these two flowers [[5.  3.  4.  2. ]
 [5.2 3.5 1.5 0.8]] are: Index(['Iris-virginica', 'Iris-setosa'], dtype='object')


# Save Keras Model

In [39]:
# keras regression model

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
import pandas as pd
import io
import os
import requests
import numpy as np
from sklearn import metrics

save_path = "."

df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/auto-mpg.csv", 
    na_values=['NA', '?'])

cars = df['name']

# Handle missing value
df['horsepower'] = df['horsepower'].fillna(df['horsepower'].median())

# Pandas to Numpy
x = df[['cylinders', 'displacement', 'horsepower', 'weight',
       'acceleration', 'year', 'origin']].values
y = df['mpg'].values # regression

# Build the neural network
model = Sequential()
model.add(Dense(25, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(10, activation='relu')) # Hidden 2
model.add(Dense(1)) # Output
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x,y,verbose=2,epochs=100)

Train on 398 samples
Epoch 1/100
398/398 - 1s - loss: 256847.5967
Epoch 2/100
398/398 - 0s - loss: 17761.0831
Epoch 3/100
398/398 - 0s - loss: 6322.2476
Epoch 4/100
398/398 - 0s - loss: 3279.4785
Epoch 5/100
398/398 - 0s - loss: 514.7990
Epoch 6/100
398/398 - 0s - loss: 563.9487
Epoch 7/100
398/398 - 0s - loss: 344.3071
Epoch 8/100
398/398 - 0s - loss: 341.5476
Epoch 9/100
398/398 - 0s - loss: 334.5883
Epoch 10/100
398/398 - 0s - loss: 324.1652
Epoch 11/100
398/398 - 0s - loss: 320.5975
Epoch 12/100
398/398 - 0s - loss: 315.3066
Epoch 13/100
398/398 - 0s - loss: 310.1297
Epoch 14/100
398/398 - 0s - loss: 306.7384
Epoch 15/100
398/398 - 0s - loss: 300.6337
Epoch 16/100
398/398 - 0s - loss: 295.2295
Epoch 17/100
398/398 - 0s - loss: 290.2338
Epoch 18/100
398/398 - 0s - loss: 286.0338
Epoch 19/100
398/398 - 0s - loss: 279.4256
Epoch 20/100
398/398 - 0s - loss: 275.7622
Epoch 21/100
398/398 - 0s - loss: 271.3305
Epoch 22/100
398/398 - 0s - loss: 265.3928
Epoch 23/100
398/398 - 0s - loss: 2

<tensorflow.python.keras.callbacks.History at 0x1d717da8108>

In [40]:
# keras model Prediction
pred = model.predict(x)

# Measure RMSE error.  RMSE is common for regression.
score = np.sqrt(metrics.mean_squared_error(pred,y))
print(f"Before save score (RMSE): {score}")

Before save score (RMSE): 5.502188727680335


In [41]:
# save neural network structure to JSON (no weights)
model_json = model.to_json()
with open(os.path.join(save_path,"network.json"), "w") as json_file:
    json_file.write(model_json)

In [42]:
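# save neural network structure to YAML (no weights)
# Note: to_yaml() works in the TensorFlow 2.0 used here, but later
# releases (2.6 and up) raise an error instead; prefer to_json() there.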
model_yaml = model.to_yaml()
with open(os.path.join(save_path,"network.yaml"), "w") as yaml_file:
    yaml_file.write(model_yaml)

In [43]:
# save entire network to HDF5 (save everything, suggested)
model.save(os.path.join(save_path,"network.h5"))

# Load saved model

In [44]:
from tensorflow.keras.models import load_model
model2 = load_model(os.path.join(save_path,"network.h5"))
pred = model2.predict(x)
# Measure RMSE error.  RMSE is common for regression.
score = np.sqrt(metrics.mean_squared_error(pred,y))
print(f"After load score (RMSE): {score}")

After load score (RMSE): 5.502188727680335


# Early stopping to prevent overfitting in keras for classification

In [45]:
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.callbacks import EarlyStopping

df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/iris.csv", 
    na_values=['NA', '?'])

# Convert to numpy - Classification
x = df[['sepal_l', 'sepal_w', 'petal_l', 'petal_w']].values
dummies = pd.get_dummies(df['species']) # Classification
species = dummies.columns
y = dummies.values

# Split into validation and training sets
x_train, x_test, y_train, y_test = train_test_split(    
    x, y, test_size=0.25, random_state=42)

# Build neural network
model = Sequential()
model.add(Dense(50, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(25, activation='relu')) # Hidden 2
model.add(Dense(y.shape[1],activation='softmax')) # Output
model.compile(loss='categorical_crossentropy', optimizer='adam')

monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, verbose=1, mode='auto',
        restore_best_weights=True)
model.fit(x_train,y_train,validation_data=(x_test,y_test),callbacks=[monitor],verbose=2,epochs=1000)


Train on 112 samples, validate on 38 samples
Epoch 1/1000
112/112 - 2s - loss: 1.7182 - val_loss: 1.4496
Epoch 2/1000
112/112 - 0s - loss: 1.2676 - val_loss: 1.1585
Epoch 3/1000
112/112 - 0s - loss: 1.0793 - val_loss: 1.0878
Epoch 4/1000
112/112 - 0s - loss: 1.0457 - val_loss: 1.0583
Epoch 5/1000
112/112 - 0s - loss: 1.0122 - val_loss: 0.9846
Epoch 6/1000
112/112 - 0s - loss: 0.9649 - val_loss: 0.9163
Epoch 7/1000
112/112 - 0s - loss: 0.9077 - val_loss: 0.8816
Epoch 8/1000
112/112 - 0s - loss: 0.8826 - val_loss: 0.8545
Epoch 9/1000
112/112 - 0s - loss: 0.8562 - val_loss: 0.8262
Epoch 10/1000
112/112 - 0s - loss: 0.8316 - val_loss: 0.7979
Epoch 11/1000
112/112 - 0s - loss: 0.8041 - val_loss: 0.7732
Epoch 12/1000
112/112 - 0s - loss: 0.7809 - val_loss: 0.7475
Epoch 13/1000
112/112 - 0s - loss: 0.7576 - val_loss: 0.7248
Epoch 14/1000
112/112 - 0s - loss: 0.7365 - val_loss: 0.7011
Epoch 15/1000
112/112 - 0s - loss: 0.7131 - val_loss: 0.6791
Epoch 16/1000
112/112 - 0s - loss: 0.6912 - val_l

<tensorflow.python.keras.callbacks.History at 0x1d718bca208>

In [46]:
from sklearn.metrics import accuracy_score

pred = model.predict(x_test)
predict_classes = np.argmax(pred,axis=1)
expected_classes = np.argmax(y_test,axis=1)
correct = accuracy_score(expected_classes,predict_classes)
print(f"Accuracy: {correct}")

Accuracy: 1.0
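
The stopping rule behind `EarlyStopping` can be sketched in plain Python. This is a simplified reading of how `min_delta` and `patience` interact when monitoring `val_loss`, not the Keras source; the function name `early_stopping_epoch` and the loss values are invented for illustration.

```python
def early_stopping_epoch(val_losses, min_delta=1e-3, patience=5):
    """Return (stop_epoch, best_epoch), both 0-based.

    An epoch counts as an improvement only if the loss drops by more than
    min_delta below the best value so far. Training stops once `patience`
    consecutive epochs pass without an improvement.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # The losses ran out before patience was exhausted.
    return len(val_losses) - 1, best_epoch

losses = [1.0, 0.8, 0.7, 0.7005, 0.71, 0.72, 0.73, 0.74, 0.75]
stop_epoch, best_epoch = early_stopping_epoch(losses)
print(stop_epoch, best_epoch)  # 7 2
```

With `restore_best_weights=True`, the model would end up with the weights from `best_epoch` rather than those from the epoch where training stopped.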


# Early stopping for keras regression

In [47]:
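from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.model_selection import train_test_split
import pandas as pd
import io
import os
import requests
import numpy as np
from sklearn import metrics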

df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/auto-mpg.csv", 
    na_values=['NA', '?'])

cars = df['name']

# Handle missing value
df['horsepower'] = df['horsepower'].fillna(df['horsepower'].median())

# Pandas to Numpy
x = df[['cylinders', 'displacement', 'horsepower', 'weight',
       'acceleration', 'year', 'origin']].values
y = df['mpg'].values # regression

# Split into validation and training sets
x_train, x_test, y_train, y_test = train_test_split(    
    x, y, test_size=0.25, random_state=42)

# Build the neural network
model = Sequential()
model.add(Dense(25, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(10, activation='relu')) # Hidden 2
model.add(Dense(1)) # Output
model.compile(loss='mean_squared_error', optimizer='adam')

monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, verbose=1, mode='auto',
        restore_best_weights=True)
model.fit(x_train,y_train,validation_data=(x_test,y_test),callbacks=[monitor],verbose=2,epochs=1000)


Train on 298 samples, validate on 100 samples
Epoch 1/1000
298/298 - 1s - loss: 601531.3654 - val_loss: 426596.4500
Epoch 2/1000
298/298 - 0s - loss: 335496.6437 - val_loss: 218203.5463
Epoch 3/1000
298/298 - 0s - loss: 161332.0834 - val_loss: 94762.2663
Epoch 4/1000
298/298 - 0s - loss: 64166.6844 - val_loss: 31370.8041
Epoch 5/1000
298/298 - 0s - loss: 18394.3187 - val_loss: 7087.2570
Epoch 6/1000
298/298 - 0s - loss: 4648.8001 - val_loss: 1836.5724
Epoch 7/1000
298/298 - 0s - loss: 1040.8432 - val_loss: 377.3330
Epoch 8/1000
298/298 - 0s - loss: 241.1401 - val_loss: 156.6080
Epoch 9/1000
298/298 - 0s - loss: 162.8911 - val_loss: 179.3234
Epoch 10/1000
298/298 - 0s - loss: 184.2984 - val_loss: 184.6009
Epoch 11/1000
298/298 - 0s - loss: 178.7641 - val_loss: 169.5049
Epoch 12/1000
298/298 - 0s - loss: 164.4743 - val_loss: 158.5031
Epoch 13/1000
298/298 - 0s - loss: 158.8873 - val_loss: 154.5775
Epoch 14/1000
298/298 - 0s - loss: 155.4453 - val_loss: 153.8033
Epoch 15/1000
298/298 - 0s

Epoch 127/1000
298/298 - 0s - loss: 85.7200 - val_loss: 76.1153
Epoch 128/1000
298/298 - 0s - loss: 85.2002 - val_loss: 77.3899
Epoch 129/1000
298/298 - 0s - loss: 84.0575 - val_loss: 75.0421
Epoch 130/1000
298/298 - 0s - loss: 83.8872 - val_loss: 74.4798
Epoch 131/1000
298/298 - 0s - loss: 83.1597 - val_loss: 74.2514
Epoch 132/1000
298/298 - 0s - loss: 82.5767 - val_loss: 74.0298
Epoch 133/1000
298/298 - 0s - loss: 82.5184 - val_loss: 73.0249
Epoch 134/1000
298/298 - 0s - loss: 81.8958 - val_loss: 72.4504
Epoch 135/1000
298/298 - 0s - loss: 81.2926 - val_loss: 73.0990
Epoch 136/1000
298/298 - 0s - loss: 80.8175 - val_loss: 72.0204
Epoch 137/1000
298/298 - 0s - loss: 80.2946 - val_loss: 70.9528
Epoch 138/1000
298/298 - 0s - loss: 79.9049 - val_loss: 70.6063
Epoch 139/1000
298/298 - 0s - loss: 79.3853 - val_loss: 70.5067
Epoch 140/1000
298/298 - 0s - loss: 78.8545 - val_loss: 70.2605
Epoch 141/1000
298/298 - 0s - loss: 78.4770 - val_loss: 69.7306
Epoch 142/1000
298/298 - 0s - loss: 77.9

Epoch 255/1000
298/298 - 0s - loss: 46.1570 - val_loss: 38.8723
Epoch 256/1000
298/298 - 0s - loss: 45.7284 - val_loss: 38.6123
Epoch 257/1000
298/298 - 0s - loss: 45.5288 - val_loss: 38.9301
Epoch 258/1000
298/298 - 0s - loss: 45.5816 - val_loss: 38.2246
Epoch 259/1000
298/298 - 0s - loss: 45.2526 - val_loss: 39.4379
Epoch 260/1000
298/298 - 0s - loss: 45.0499 - val_loss: 37.8295
Epoch 261/1000
298/298 - 0s - loss: 44.8966 - val_loss: 38.3148
Epoch 262/1000
298/298 - 0s - loss: 45.0612 - val_loss: 38.7192
Epoch 263/1000
298/298 - 0s - loss: 44.2664 - val_loss: 37.2729
Epoch 264/1000
298/298 - 0s - loss: 44.2439 - val_loss: 37.7881
Epoch 265/1000
298/298 - 0s - loss: 44.0675 - val_loss: 37.4651
Epoch 266/1000
298/298 - 0s - loss: 43.7330 - val_loss: 36.9620
Epoch 267/1000
298/298 - 0s - loss: 44.4038 - val_loss: 36.9793
Epoch 268/1000
298/298 - 0s - loss: 44.9219 - val_loss: 37.7913
Epoch 269/1000
298/298 - 0s - loss: 44.5373 - val_loss: 36.3512
Epoch 270/1000
298/298 - 0s - loss: 43.1

Epoch 383/1000
298/298 - 0s - loss: 25.5954 - val_loss: 21.4623
Epoch 384/1000
298/298 - 0s - loss: 25.2212 - val_loss: 21.1101
Epoch 385/1000
298/298 - 0s - loss: 25.2521 - val_loss: 21.2401
Epoch 386/1000
298/298 - 0s - loss: 25.0695 - val_loss: 20.9491
Epoch 387/1000
298/298 - 0s - loss: 25.5824 - val_loss: 21.7602
Epoch 388/1000
298/298 - 0s - loss: 25.4777 - val_loss: 20.7244
Epoch 389/1000
298/298 - 0s - loss: 24.7155 - val_loss: 20.7606
Epoch 390/1000
298/298 - 0s - loss: 24.8363 - val_loss: 20.8932
Epoch 391/1000
298/298 - 0s - loss: 24.9593 - val_loss: 20.4661
Epoch 392/1000
298/298 - 0s - loss: 24.9351 - val_loss: 20.5834
Epoch 393/1000
298/298 - 0s - loss: 25.3079 - val_loss: 20.9935
Epoch 394/1000
298/298 - 0s - loss: 24.7151 - val_loss: 20.3323
Epoch 395/1000
298/298 - 0s - loss: 24.4771 - val_loss: 20.4650
Epoch 396/1000
298/298 - 0s - loss: 24.8977 - val_loss: 20.3785
Epoch 397/1000
298/298 - 0s - loss: 24.4047 - val_loss: 19.8287
Epoch 398/1000
298/298 - 0s - loss: 24.1

<tensorflow.python.keras.callbacks.History at 0x1d71a7aec48>

In [48]:
# Measure RMSE error.  RMSE is common for regression.
pred = model.predict(x_test)
score = np.sqrt(metrics.mean_squared_error(pred,y_test))
print(f"Final score (RMSE): {score}")

Final score (RMSE): 3.5470303870999276


# Keras model weights

In [49]:
for layerNum, layer in enumerate(model.layers):
    # A Dense layer returns [kernel, bias]; the kernel has shape (inputs, outputs).
    weights, biases = layer.get_weights()

    # One bias value per neuron in the next layer
    for toNeuronNum, bias in enumerate(biases):
        print(f'{layerNum}B -> L{layerNum+1}N{toNeuronNum}: {bias}')

    # One weight per (from-neuron, to-neuron) connection
    for fromNeuronNum, wgt in enumerate(weights):
        for toNeuronNum, wgt2 in enumerate(wgt):
            print(f'L{layerNum}N{fromNeuronNum} -> L{layerNum+1}N{toNeuronNum} = {wgt2}')

0B -> L1N0: 0.0
0B -> L1N1: -0.08754128962755203
0B -> L1N2: 0.0
0B -> L1N3: 0.0
0B -> L1N4: 0.0
0B -> L1N5: 0.025169340893626213
0B -> L1N6: 0.0
0B -> L1N7: 0.0
0B -> L1N8: -0.04608583822846413
0B -> L1N9: -0.00929209403693676
0B -> L1N10: -0.27740487456321716
0B -> L1N11: -0.02351556532084942
0B -> L1N12: 0.205696702003479
0B -> L1N13: 0.04827749729156494
0B -> L1N14: 0.0
0B -> L1N15: 0.5714868903160095
0B -> L1N16: 0.0033672847785055637
0B -> L1N17: -0.08883439004421234
0B -> L1N18: 0.0
0B -> L1N19: -0.45093655586242676
0B -> L1N20: 0.02861807309091091
0B -> L1N21: 0.0
0B -> L1N22: 0.0
0B -> L1N23: -0.03594549745321274
0B -> L1N24: 0.018059687688946724
L0N0 -> L1N0 = 0.30129334330558777
L0N0 -> L1N1 = 0.40546080470085144
L0N0 -> L1N2 = -0.41278594732284546
L0N0 -> L1N3 = 0.41416993737220764
L0N0 -> L1N4 = 0.12436291575431824
L0N0 -> L1N5 = 0.4080078899860382
L0N0 -> L1N6 = 0.1982124149799347
L0N0 -> L1N7 = 0.2423376739025116
L0N0 -> L1N8 = 0.10700766742229462
L0N0 -> L1N9 = -0.00456

L1N2 -> L2N9 = 0.10450932383537292
L1N3 -> L2N0 = 0.19307032227516174
L1N3 -> L2N1 = -0.04405874013900757
L1N3 -> L2N2 = -0.29962894320487976
L1N3 -> L2N3 = -0.009037524461746216
L1N3 -> L2N4 = 0.31300172209739685
L1N3 -> L2N5 = -0.158022940158844
L1N3 -> L2N6 = -0.3525236248970032
L1N3 -> L2N7 = -0.05689665675163269
L1N3 -> L2N8 = 0.1786433756351471
L1N3 -> L2N9 = -0.12635409832000732
L1N4 -> L2N0 = 0.1318085491657257
L1N4 -> L2N1 = 0.40179988741874695
L1N4 -> L2N2 = 0.3947342336177826
L1N4 -> L2N3 = 0.37295064330101013
L1N4 -> L2N4 = -0.2439098358154297
L1N4 -> L2N5 = -0.005406707525253296
L1N4 -> L2N6 = -0.2684944272041321
L1N4 -> L2N7 = -0.1223786473274231
L1N4 -> L2N8 = -0.16419044137001038
L1N4 -> L2N9 = -0.24121423065662384
L1N5 -> L2N0 = 0.2730081081390381
L1N5 -> L2N1 = -0.18867699801921844
L1N5 -> L2N2 = -0.20332853496074677
L1N5 -> L2N3 = 0.18829718232154846
L1N5 -> L2N4 = -0.31571701169013977
L1N5 -> L2N5 = 0.02237209677696228
L1N5 -> L2N6 = -0.1459578275680542
L1N5 -> L2N7

L2N6 -> L3N0 = 0.40386635065078735
L2N7 -> L3N0 = 0.2705513834953308
L2N8 -> L3N0 = -0.5875697731971741
L2N9 -> L3N0 = -0.38869836926460266
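
The number of values printed above follows directly from the layer sizes: each dense layer has one weight per input-output pair plus one bias per output neuron. The sketch below counts them for the 7-25-10-1 regression network; the helper name `dense_param_count` is made up for this example.

```python
def dense_param_count(layer_sizes):
    """Return the parameter count of each dense layer in a fully connected stack."""
    counts = []
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        counts.append(inputs * outputs + outputs)  # weights + biases
    return counts

# 7 input features, hidden layers of 25 and 10 neurons, 1 output neuron.
counts = dense_param_count([7, 25, 10, 1])
print(counts, sum(counts))  # [200, 260, 11] 471
```

Calling `model.summary()` on the Keras model is the usual way to see the same per-layer counts.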
