# Defining neural networks with Keras

## Sequential API

![image-14](image-14.png)

- The sequential API is a good way to construct this model in Keras.
- It is simpler than the functional API because it makes strong assumptions about how you will construct your model.
- It assumes that you have an input layer, some number of hidden layers, and an output layer.
- All of these layers are ordered one after the other in a sequence.

In [1]:
from tensorflow import keras

# Define a Keras sequential model
model = keras.Sequential()

# Define the first dense layer
model.add(keras.layers.Dense(16, activation='relu', input_shape=(784,)))

# Define the second dense layer
model.add(keras.layers.Dense(8, activation='relu'))

# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))

# Print the model architecture
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 16)                12560     
                                                                 
 dense_1 (Dense)             (None, 8)                 136       
                                                                 
 dense_2 (Dense)             (None, 4)                 36        
                                                                 
Total params: 12,732
Trainable params: 12,732
Non-trainable params: 0
_________________________________________________________________
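The `Param #` column in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. A minimal sketch in plain Python (the `dense_params` helper is a hypothetical name used only for illustration):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer widths of the model above: 784 inputs -> 16 -> 8 -> 4
layer_sizes = [784, 16, 8, 4]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [12560, 136, 36]
print(sum(per_layer))  # 12732
```

These values match the per-layer and total parameter counts reported by the summary.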


In [2]:
# Define a Keras sequential model
model = keras.Sequential()

# Define the first dense layer
model.add(keras.layers.Dense(16, activation = 'sigmoid', input_shape=(784,)))

# Apply dropout to the first layer's output
model.add(keras.layers.Dropout(0.25))

# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))

# Compile the model
model.compile('adam', loss='categorical_crossentropy')

# Print a model summary
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 16)                12560     
                                                                 
 dropout (Dropout)           (None, 16)                0         
                                                                 
 dense_4 (Dense)             (None, 4)                 68        
                                                                 
Total params: 12,628
Trainable params: 12,628
Non-trainable params: 0
_________________________________________________________________
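The dropout layer adds no parameters because it only masks activations. As a rough sketch of what `Dropout(0.25)` does during training (an illustration of inverted dropout in plain Python, not the Keras implementation; the `dropout` helper and its `seed` argument are hypothetical):

```python
import random

def dropout(values, rate, training=True, seed=None):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not training:
        # At inference time the layer passes its input through unchanged
        return list(values)
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

# Sixteen activations, matching the width of the first dense layer
activations = [1.0] * 16
dropped = dropout(activations, rate=0.25, seed=0)
print(dropped)
```

Scaling the surviving activations keeps their expected sum the same with and without dropout, which is why nothing needs to change at inference time.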


## Functional API

![image-15](image-15.png)

- The functional API covers cases the sequential API cannot express, such as training two models jointly to predict the same target.

In [3]:
import pandas as pd
import tensorflow as tf

slang_func = pd.read_csv('datasets/slmnist.csv', header=None)
slang_func.shape

(2000, 785)

In [4]:
import numpy as np

# Create two inputs (both use the same pixel features here)
inputs_1_features = np.array(slang_func.drop(0,axis=1), dtype=np.float32)
inputs_2_features = np.array(slang_func.drop(0,axis=1), dtype=np.float32)

targets = np.array(pd.get_dummies(slang_func[0]), dtype=np.float32)
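`pd.get_dummies` turns the label column into one-hot rows, one column per class in sorted order. A small plain-Python sketch of the same idea, using made-up labels (the `one_hot` helper is hypothetical):

```python
def one_hot(labels):
    """Return the sorted class list and one one-hot row per label."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    rows = []
    for label in labels:
        row = [0.0] * len(classes)
        row[index[label]] = 1.0
        rows.append(row)
    return classes, rows

# Illustrative labels only, not read from slmnist.csv
classes, encoded = one_hot([1, 0, 3, 2, 1])
print(classes)     # [0, 1, 2, 3]
print(encoded[0])  # [0.0, 1.0, 0.0, 0.0]
```

With four classes, each target row has four entries, which is why the output layers below use 4 units.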

In [5]:
# Creating architecture for each input
inputs1 = tf.keras.Input(shape=(784,))
hidden_1_1 = tf.keras.layers.Dense(16, activation='relu')(inputs1)
hidden_2_1 = tf.keras.layers.Dense(8, activation='relu')(hidden_1_1)
outputs_1 = tf.keras.layers.Dense(4, activation='softmax')(hidden_2_1)

inputs2 = tf.keras.Input(shape=(784,))
hidden_1_2 = tf.keras.layers.Dense(16, activation='sigmoid')(inputs2)
hidden_2_2 = tf.keras.layers.Dense(8, activation='relu')(hidden_1_2)
outputs_2 = tf.keras.layers.Dense(4, activation='softmax')(hidden_2_2)

In [6]:
# Merge the outputs 
merged = tf.keras.layers.add([outputs_1,outputs_2])
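`tf.keras.layers.add` sums its inputs element by element. A plain-Python sketch with made-up probability vectors (the values and the `add_merge` and `average_merge` helpers are hypothetical, not taken from the model above) shows that the sum of two softmax outputs totals 2 rather than 1:

```python
def add_merge(a, b):
    """Element-wise sum of two equally sized vectors."""
    return [x + y for x, y in zip(a, b)]

def average_merge(a, b):
    """Element-wise mean of two equally sized vectors."""
    return [(x + y) / 2 for x, y in zip(a, b)]

# Two illustrative softmax outputs for one example
out_1 = [0.7, 0.1, 0.1, 0.1]
out_2 = [0.4, 0.3, 0.2, 0.1]

merged = add_merge(out_1, out_2)
averaged = average_merge(out_1, out_2)
print(sum(merged))    # close to 2.0
print(sum(averaged))  # close to 1.0
```

If the merged output should itself be a probability distribution, averaging (for example with `tf.keras.layers.average`) is one alternative to consider.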

In [7]:
# Create functional model
model = tf.keras.models.Model(inputs=[inputs1, inputs2], outputs=merged)

# Summary of the model
model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 784)]        0           []                               
                                                                                                  
 input_2 (InputLayer)           [(None, 784)]        0           []                               
                                                                                                  
 dense_5 (Dense)                (None, 16)           12560       ['input_1[0][0]']                
                                                                                                  
 dense_8 (Dense)                (None, 16)           12560       ['input_2[0][0]']                
                                                                                              

In [8]:
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In [9]:
# Fit the model
model.fit([inputs_1_features, inputs_2_features], targets, epochs=10, validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f664c240c10>

# Training and validation with Keras

1. Build the model with layers and activation functions
2. Compile the model with an optimizer and a loss function
3. Fit the model to the training data (train the model)
4. Evaluate the model on the training and test sets


## The fit() operation
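- Required arguments:
    - `features`: the input data, passed as the first argument (`x`).
    - `labels`: the target data, passed as the second argument (`y`).

- Many optional arguments:
    - `batch_size`: the number of examples in each batch, 32 by default.
    - `epochs`: the number of complete passes over the full set of batches.
    - `validation_split`: the fraction of the data (between 0 and 1) held out as the validation set. The model trains on the remaining part and reports loss and metrics on the held-out part at the end of each epoch. Keras takes the held-out part from the last samples of the arrays, before any shuffling.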
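To see how these arguments interact, here is a small arithmetic sketch for the 2000-row dataset used below. It assumes the training part is rounded down to a whole number of rows; the `fit_sizes` helper is a hypothetical name for illustration:

```python
import math

def fit_sizes(n_samples, batch_size=32, validation_split=0.0):
    """Return (training rows, validation rows, batches per epoch)."""
    n_train = math.floor(n_samples * (1 - validation_split))
    n_val = n_samples - n_train
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, steps_per_epoch

n_train, n_val, steps = fit_sizes(2000, batch_size=32, validation_split=0.1)
print(n_train, n_val, steps)  # 1800 200 57
```

With `epochs=10`, the model would therefore take 570 gradient steps in total, since the last batch of each epoch is smaller than 32 but still counts as a step.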

In [10]:
import pandas as pd
slang = pd.read_csv('datasets/slmnist.csv', header=None)
slang.shape

(2000, 785)

In [11]:
import numpy as np

# Define features and labels
sign_language_features = np.array(slang.drop(0,axis=1), dtype=np.float32)
sign_language_labels = np.array(pd.get_dummies(slang[0]), dtype=np.float32)

## Training with Keras

In [12]:
# Define a sequential model
model = keras.Sequential()

# Define a hidden layer
model.add(keras.layers.Dense(16, activation='relu', input_shape=(784,)))

# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))

# Compile the model
model.compile('SGD', loss='categorical_crossentropy')

# Complete the fitting operation
model.fit(sign_language_features, sign_language_labels, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f66241bc730>
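The model above pairs a softmax output layer with the categorical cross-entropy loss. A minimal plain-Python sketch of both for a single example (the helper functions and the `eps` clipping value are illustrative assumptions, not the Keras internals):

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# Equal scores for all four classes give a uniform prediction
probs = softmax([0.0, 0.0, 0.0, 0.0])
loss = categorical_crossentropy([0, 1, 0, 0], probs)
print(probs)  # [0.25, 0.25, 0.25, 0.25]
print(loss)   # ln(4), about 1.386
```

A loss near ln(4) is therefore what a four-class model that has learned nothing would score, which gives a baseline for reading the per-epoch loss values.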

## Metrics and validation with Keras

In [13]:
# Define features and labels
sign_language_features = np.array(slang.drop(0,axis=1), dtype=np.float32)
sign_language_labels = np.array(pd.get_dummies(slang[0]), dtype=np.float32)

In [14]:
# Define sequential model
model = keras.Sequential()

# Define the first layer
model.add(keras.layers.Dense(32, activation='sigmoid', input_shape=(784,)))

# Add activation function to classifier
model.add(keras.layers.Dense(4, activation='softmax'))

# Set the optimizer, loss function, and metrics
model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['accuracy'])

# Add the number of epochs and the validation split
model.fit(sign_language_features, sign_language_labels, epochs=10, validation_split=0.1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f6624096310>
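The `accuracy` metric requested in `compile` compares the most probable predicted class with the true class. A plain-Python sketch on made-up predictions (the helpers and values are hypothetical, for illustration only):

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

def categorical_accuracy(y_true, y_pred):
    """Fraction of rows where the predicted class matches the true class."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# One-hot labels and illustrative softmax outputs for four examples
y_true = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
y_pred = [[0.7, 0.1, 0.1, 0.1],
          [0.2, 0.5, 0.2, 0.1],
          [0.4, 0.3, 0.2, 0.1],   # wrong: predicts class 0, true class is 2
          [0.1, 0.1, 0.2, 0.6]]

print(categorical_accuracy(y_true, y_pred))  # 0.75
```

With `validation_split` set, Keras reports this metric separately for the training part (`accuracy`) and the held-out part (`val_accuracy`).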

## Overfitting detection

In [15]:
# Define features and labels
sign_language_features = np.array(slang.drop(0,axis=1), dtype=np.float32)
sign_language_labels = np.array(pd.get_dummies(slang[0]), dtype=np.float32)

In [16]:
# Define sequential model
model = keras.Sequential()

# Define the first layer
model.add(keras.layers.Dense(1024, activation='relu', input_shape=(784,)))

# Add activation function to classifier
model.add(keras.layers.Dense(4, activation='softmax'))

# Finish the model compilation
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy', metrics=['accuracy'])

# Complete the model fit operation
model.fit(sign_language_features, sign_language_labels, epochs=50, validation_split=0.5)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


<keras.callbacks.History at 0x7f6608351070>

**Notice** that the validation loss, `val_loss`, was substantially higher than the training loss, `loss`. Furthermore, if `val_loss` started to increase before training ended, the model may have overfitted. When this happens, try decreasing the number of epochs.
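`model.fit` returns a `History` object whose `history` dictionary holds one list per tracked quantity, such as `loss` and `val_loss`. The sketch below shows one way to read those lists; the numbers are hypothetical stand-ins, not values from the run above, and the helper names are illustrative:

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

def final_gap(train_losses, val_losses):
    """Gap between validation and training loss after the last epoch."""
    return val_losses[-1] - train_losses[-1]

# Hypothetical curves standing in for history.history['loss'] and ['val_loss']
train_loss = [1.30, 0.90, 0.60, 0.40, 0.25, 0.15]
val_loss = [1.20, 0.95, 0.80, 0.78, 0.83, 0.91]

print(best_epoch(val_loss))             # 4
print(final_gap(train_loss, val_loss))  # about 0.76
```

Here the validation loss bottoms out at epoch 4 while the training loss keeps falling, which is the pattern described above. Keras also provides `keras.callbacks.EarlyStopping` to stop training automatically when a monitored quantity such as `val_loss` stops improving.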

## Evaluating models

In [28]:
# Define features and labels
sign_language_features = np.array(slang.drop(0,axis=1), dtype=np.float32)
sign_language_labels = np.array(pd.get_dummies(slang[0]), dtype=np.float32)

In [29]:
# Split for evaluation
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(sign_language_features, sign_language_labels,
                                                   test_size=0.3,
                                                   random_state=42)
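For intuition, a shuffled hold-out split can be sketched with the standard library alone. This is not scikit-learn's algorithm and will not reproduce its exact indices; `split_indices` is a hypothetical helper:

```python
import random

def split_indices(n_samples, test_size, seed):
    """Shuffle row indices and hold out a `test_size` fraction for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_size)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(2000, test_size=0.3, seed=42)
print(len(train_idx), len(test_idx))  # 1400 600
```

The two index lists are disjoint, so no row used for the final test evaluation is ever seen during training.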

In [30]:
import tensorflow as tf

# Create sequential model 
model = tf.keras.Sequential()

# Design first layer
model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))

# Design second layer
model.add(tf.keras.layers.Dense(8, activation='relu'))

# Design output layer
model.add(tf.keras.layers.Dense(4, activation='softmax'))

# Summary of the model
model.summary()

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_20 (Dense)            (None, 16)                12560     
                                                                 
 dense_21 (Dense)            (None, 8)                 136       
                                                                 
 dense_22 (Dense)            (None, 4)                 36        
                                                                 
Total params: 12,732
Trainable params: 12,732
Non-trainable params: 0
_________________________________________________________________


In [31]:
# Compiling with optimizer and loss function
model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
             loss='categorical_crossentropy',
             metrics=['accuracy'])

# Fit the model
model.fit(X_train, y_train, epochs=10, validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f65b4165f40>

In [32]:
# Evaluate on the training set
train = model.evaluate(X_train, y_train)

# Evaluate on the test set
test = model.evaluate(X_test, y_test)




In [33]:
print("Train: Loss = ",train[0], ", Accuracy = ", train[1])
print("Test: Loss = ",test[0], ", Accuracy = ", test[1])

Train: Loss =  0.7640196681022644 , Accuracy =  0.6485714316368103
Test: Loss =  0.7368065714836121 , Accuracy =  0.6850000023841858


# Training models with Estimator API

![image-16](image-16.png)

- High level submodule
- Less flexible
- Enforces best practices
- Faster deployment
- Many premade models

## Model specification and training

1. **Define feature columns** : specify the shape and type of your data.
2. **Load and transform data** : load and transform your data within a function. The output of this function will be a dictionary object of features and your labels.
3. **Define an estimator** : use premade estimators or define custom estimators with different architectures.
4. **Apply train operation** : train the model you defined.

In [23]:
housing = pd.read_csv('datasets/kc_house_data.csv')
housing.head()

| Unnamed: 0 | id | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7129300520 | 20141013T000000 | 221900.0 | 3 | 1.0 | 1180 | 5650 | 1.0 | 0 | 0 | 3 | 7 | 1180 | 0 | 1955 | 0 | 98178 | 47.5112 | -122.257 | 1340 | 5650 |
| 1 | 6414100192 | 20141209T000000 | 538000.0 | 3 | 2.25 | 2570 | 7242 | 2.0 | 0 | 0 | 3 | 7 | 2170 | 400 | 1951 | 1991 | 98125 | 47.721 | -122.319 | 1690 | 7639 |
| 2 | 5631500400 | 20150225T000000 | 180000.0 | 2 | 1.0 | 770 | 10000 | 1.0 | 0 | 0 | 3 | 6 | 770 | 0 | 1933 | 0 | 98028 | 47.7379 | -122.233 | 2720 | 8062 |
| 3 | 2487200875 | 20141209T000000 | 604000.0 | 4 | 3.0 | 1960 | 5000 | 1.0 | 0 | 0 | 5 | 7 | 1050 | 910 | 1965 | 0 | 98136 | 47.5208 | -122.393 | 1360 | 5000 |
| 4 | 1954400510 | 20150218T000000 | 510000.0 | 3 | 2.0 | 1680 | 8080 | 1.0 | 0 | 0 | 3 | 8 | 1680 | 0 | 1987 | 0 | 98074 | 47.6168 | -122.045 | 1800 | 7503 |


In [24]:
# Define feature columns for bedrooms and bathrooms
bedrooms = tf.feature_column.numeric_column("bedrooms")
bathrooms = tf.feature_column.numeric_column('bathrooms')

# Define the list of feature columns
feature_list = [bedrooms, bathrooms]

# Load and transform data
def input_fn():
    # Define the labels
    labels = np.array(housing['price'])
    # Define the features
    features = {'bedrooms': np.array(housing['bedrooms']),
                'bathrooms': np.array(housing['bathrooms'])}
    return features, labels

In [25]:
# Define the deep neural network regressor model with 2 nodes in both the first and second hidden layers
model = tf.estimator.DNNRegressor(feature_columns=feature_list, hidden_units=[2,2])
model.train(input_fn, steps=10)  # train for 10 steps


<tensorflow_estimator.python.estimator.canned.dnn.DNNRegressorV2 at 0x7f65b52e0580>

In [26]:
model.evaluate(input_fn, steps=10)

{'average_loss': 426474570000.0,
 'label/mean': 540088.25,
 'loss': 426474540000.0,
 'prediction/mean': -2.7017937,
 'global_step': 10}
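To put the evaluation numbers in context, the sketch below converts `average_loss` into a root-mean-squared error in dollars. It assumes that `average_loss` is a mean squared error, which is my understanding of the default loss for `DNNRegressor`; treat that as an assumption rather than a confirmed detail:

```python
import math

# Values copied from the evaluate() output above
average_loss = 426474570000.0
label_mean = 540088.25

# Assumption: average_loss is a mean squared error, so its root is in dollars
rmse = math.sqrt(average_loss)
print(round(rmse))        # 653050
print(rmse / label_mean)  # ratio of the typical error to the mean price
```

Under that assumption, the typical error is larger than the mean house price. This is consistent with `prediction/mean` being close to zero: after only 10 training steps the network has barely moved from its initial state.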