# Introduction

A Wide & Deep neural network connects all or part of the inputs directly to the output layer. This lets the network learn both deep patterns (through the deep path) and simple rules (through the short path). A regular MLP, in contrast, forces all the data to flow through the full stack of layers, so simple patterns in the data may end up distorted by that sequence of transformations.

In [1]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [2]:
housing = fetch_california_housing()

X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)

In [3]:
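scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # fit the scaler on the training set only
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)  # the test set must be scaled with the same statistics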
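As a rough illustration of what the scaling cell does, the sketch below reproduces standardization with plain NumPy: the mean and standard deviation are learned from the training split only and then reused on any other split. The arrays `train` and `test` are made-up toy data, not the housing dataset.

```python
import numpy as np

# Hypothetical toy data: 4 training rows and 2 test rows, 2 features each.
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
test = np.array([[2.5, 25.0], [5.0, 50.0]])

# "fit": learn per-feature mean and standard deviation from the training set only.
mean = train.mean(axis=0)
std = train.std(axis=0)  # population std (ddof=0), which is what StandardScaler uses

# "transform": apply the training statistics to every split.
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0))  # approximately 0 for each feature
print(train_scaled.std(axis=0))   # approximately 1 for each feature
print(test_scaled[0])             # first test row sits exactly at the training mean
```

Reusing the training statistics keeps every split on the scale the model was trained on.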

# Building Model with Functional API

In [None]:
input_ = keras.layers.Input(shape=X_train.shape[1:])
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.Concatenate()([input_, hidden2])
output = keras.layers.Dense(1)(concat)
model = keras.Model(inputs = [input_], outputs=[output])

The Input layer specifies the kind of input the model will receive, including its shape and dtype. Two Dense layers follow, each with 30 neurons and the ReLU activation function. A Concatenate layer then joins the raw input with the output of the second hidden layer; this is where the wide and deep paths are combined. The output layer has a single neuron because we are predicting a single value. Finally, we create the Keras Model, specifying which inputs and outputs to use.
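To make the data flow concrete, here is a minimal NumPy sketch of the same forward pass. It is not the Keras implementation: the helper `dense` is hypothetical and uses random weights, so only the shapes are meaningful. It assumes 8 input features, as in the California housing data.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, n_out, activation=None):
    """Toy dense layer with random weights (for shape illustration only)."""
    w = rng.normal(size=(x.shape[1], n_out))
    b = np.zeros(n_out)
    z = x @ w + b
    return np.maximum(z, 0.0) if activation == "relu" else z

x = rng.normal(size=(4, 8))                    # batch of 4 rows, 8 features
hidden1 = dense(x, 30, "relu")                 # deep path, first hidden layer
hidden2 = dense(hidden1, 30, "relu")           # deep path, second hidden layer
concat = np.concatenate([x, hidden2], axis=1)  # wide path (raw x) joined with deep path
output = dense(concat, 1)                      # single linear output neuron

print(concat.shape)  # (4, 38)
print(output.shape)  # (4, 1)
```

The output neuron sees 38 values per row: the 8 untouched input features plus the 30 activations from the deep path.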

# Making Predictions

Steps: compile, train, evaluate, predict.

In [None]:
# 1. Compile
model.compile(loss="mean_squared_error", optimizer="sgd")

# 2. Train
history = model.fit(X_train, y_train, epochs = 15,
                   validation_data = (X_valid, y_valid))

# 3. Evaluate
mse_test = model.evaluate(X_test, y_test)

In [None]:
# take square root of loss
np.sqrt(mse_test)
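Taking the square root turns the MSE into the RMSE, which is expressed in the same unit as the target (for this dataset, median house value in units of $100,000). A minimal sketch with made-up numbers; `y_true` and `y_pred` are hypothetical values, not model output:

```python
import numpy as np

y_true = np.array([2.0, 1.5, 3.0, 0.5])  # hypothetical targets
y_pred = np.array([2.5, 1.0, 2.0, 0.5])  # hypothetical predictions

mse = np.mean((y_true - y_pred) ** 2)    # the quantity the mean_squared_error loss measures
rmse = np.sqrt(mse)                      # back in the unit of the target

print(mse)   # 0.375
print(rmse)
```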

## Same Model with Different Subset of Features

In [4]:
input_A = keras.layers.Input(shape = [5], name="wide input")
input_B = keras.layers.Input(shape = [6], name="deep input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
concat = keras.layers.concatenate([input_A, hidden2])
output = keras.layers.Dense(1, name = "output")(concat)
model_2 = keras.Model(inputs=[input_A, input_B], outputs=[output])

Here input_A is the wide input, which takes 5 features (columns 0 to 4), and input_B is the deep input, which takes 6 features (columns 2 to 7) and is passed through the deep path of the network. The two subsets overlap on columns 2 to 4.

In [5]:
model_2.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 deep input (InputLayer)        [(None, 6)]          0           []                               
                                                                                                  
 dense (Dense)                  (None, 30)           210         ['deep input[0][0]']             
                                                                                                  
 wide input (InputLayer)        [(None, 5)]          0           []                               
                                                                                                  
 dense_1 (Dense)                (None, 30)           930         ['dense[0][0]']                  
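
The parameter counts in the summary can be checked by hand: a Dense layer has (inputs × units) weights plus one bias per unit. The sketch below uses a hypothetical helper, `dense_params`. The figure for the final output layer, whose row is cut off in the summary, is derived from the model definition (5 wide features concatenated with 30 deep outputs) rather than read from the output.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden1 = dense_params(6, 30)     # deep input (6 features) -> 30 units
hidden2 = dense_params(30, 30)    # 30 units -> 30 units
output = dense_params(5 + 30, 1)  # concat of wide input (5) and hidden2 (30) -> 1 unit

print(hidden1)                     # 210
print(hidden2)                     # 930
print(output)                      # 36
print(hidden1 + hidden2 + output)  # 1176
```

Input and concatenation layers contribute no parameters, so the three Dense layers account for the whole model.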
                                                                                              

In [6]:
# Now when calling the fit method we need to pass a pair of matrices, one per input
X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:]
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:]
X_test_A, X_test_B = X_test[:, :5], X_test[:, 2:]
X_new_A, X_new_B = X_test_A[:3], X_test_B[:3]
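The slices above can be checked on a small stand-in array. This sketch assumes 8 feature columns, as in the housing data, and uses a hypothetical array `X` whose first row holds its own column indices, so the selected columns are easy to read off.

```python
import numpy as np

# Hypothetical data: 2 rows, 8 columns; row 0 is [0, 1, ..., 7].
X = np.arange(16).reshape(2, 8)
X_A, X_B = X[:, :5], X[:, 2:]

print(X_A[0])  # [0 1 2 3 4]
print(X_B[0])  # [2 3 4 5 6 7]
print(np.intersect1d(X_A[0], X_B[0]))  # [2 3 4]
```

Columns 2 to 4 are sent through both the wide and the deep path.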

In [7]:
# 1. Compile
model_2.compile(loss="mse", optimizer=tf.keras.optimizers.SGD(learning_rate=1e-3))

# 2. Train
history = model_2.fit((X_train_A, X_train_B), y_train, epochs = 15,
                   validation_data = ((X_valid_A, X_valid_B), y_valid))

# 3. Evaluate
mse_test = model_2.evaluate((X_test_A, X_test_B), y_test)

# 4. Predict
y_preds = model_2.predict((X_new_A, X_new_B))

Epoch 1/15
...
Epoch 15/15


In [8]:
np.sqrt(mse_test)

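100.32227366716725

An RMSE of about 100 is far too large for targets expressed in units of $100,000. This output was recorded with X_test left unscaled while the model was trained on standardized features, which most likely explains the value. The test set needs to go through scaler.transform before it is split into X_test_A and X_test_B.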