In [0]:
from sklearn.datasets import fetch_california_housing

In [0]:
from sklearn.model_selection import train_test_split


In [0]:
from sklearn.preprocessing import StandardScaler

In [4]:
housing = fetch_california_housing()

Downloading Cal. housing from https://ndownloader.figshare.com/files/5976036 to /root/scikit_learn_data


In [0]:
X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target)

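**WHY SPLIT THE TRAINING SET FURTHER**

The test set is held back for the final evaluation, so a separate validation set is carved out of the training data to monitor the model during training.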

In [0]:
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)
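
As a rough illustration of what the two splits do to the row counts, the sketch below runs the same two-stage split on a random array with the dataset's shape (20,640 rows, 8 features). The synthetic array, the `rng` seed and the `random_state` values are assumptions added for reproducibility; the notebook itself uses the real data and no fixed seed.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for housing.data / housing.target (assumed shape: 20640 x 8).
rng = np.random.default_rng(0)
X = rng.normal(size=(20640, 8))
y = rng.normal(size=20640)

# First split: hold out a test set (default test_size=0.25).
X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, random_state=42)

# Second split: carve a validation set out of the remaining training rows.
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full, random_state=42)

print(X_train.shape, X_valid.shape, X_test.shape)
```

With the default `test_size` of 0.25 at both stages, roughly 56% of the rows end up in the training set, 19% in the validation set and 25% in the test set.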

In [0]:
scaler = StandardScaler()

In [0]:
X_train = scaler.fit_transform(X_train)

In [0]:
X_valid = scaler.transform(X_valid)

In [0]:
X_test = scaler.transform(X_test)
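
The scaler is fitted on the training set only and then reused, unchanged, on the validation and test sets. Below is a minimal sketch of that pattern on made-up data; the arrays and the seed are illustrative assumptions, not the housing data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))
valid = rng.normal(loc=5.0, scale=2.0, size=(200, 3))

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train
valid_scaled = scaler.transform(valid)      # reuses the training statistics

# Training columns are centered exactly; validation columns only approximately.
print(train_scaled.mean(axis=0))
print(valid_scaled.mean(axis=0))
```

Calling `fit_transform` on the validation or test set instead would let statistics from held-out data leak into preprocessing.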

In [12]:
X_train.shape[1:]

(8,)

In [13]:
X_train.shape

(11610, 8)

In [0]:
import tensorflow as tf
from tensorflow import keras

Using the Sequential API to build, train, evaluate, and use a regression MLP is much like what we did for classification. The main differences are that the output layer has a single neuron (since we only want to predict a single value) with no activation function, and that the loss function is the mean squared error. This notebook keeps that output layer and loss but builds its model with the Functional API, so that the input can be concatenated with the output of the second hidden layer.
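
Before the notebook's Functional API model, it may help to see what a Sequential regression MLP of the kind described above could look like. This is only a sketch: the single hidden layer of 30 units, the hard-coded 8 input features and the name `seq_model` are assumptions chosen to match this dataset, not something the notebook runs.

```python
from tensorflow import keras

# Assumed: 8 input features and one hidden layer of 30 ReLU units.
seq_model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),  # single output neuron, no activation
])
seq_model.compile(loss="mean_squared_error", optimizer="sgd")

print(seq_model.count_params())
```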

In [0]:
input_ = keras.layers.Input(shape=X_train.shape[1:]) 
hidden1 = keras.layers.Dense(30, activation="relu")(input_) 
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1) 
concat = keras.layers.Concatenate()([input_, hidden2]) 
output = keras.layers.Dense(1)(concat) 
model = keras.Model(inputs=[input_], outputs=[output])

**NOTE ON NONE**

In Keras, a `None` dimension means the size along that axis is not fixed. Here it is the batch dimension, so the model accepts any number of samples per call. This dimension does not affect the size of the network (the number of weights); it only means you are free to choose how many samples to pass at training or inference time.

In [17]:
input_.shape

TensorShape([None, 8])

In [18]:
hidden2.shape

TensorShape([None, 30])

In [20]:
concat.shape

TensorShape([None, 38])
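
The `None` entries and the width of 38 can be checked without Keras. The NumPy sketch below mimics the forward pass of the model above with random weights. The `dense` and `forward` helpers and the weight values are illustrative assumptions, not Keras internals. The weight matrices have the same shapes whatever the batch size, and concatenating the 8 input features with the 30 hidden activations gives 38 columns.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b, relu=True):
    """Hypothetical stand-in for a Dense layer: x @ w + b, then optional ReLU."""
    z = x @ w + b
    return np.maximum(z, 0.0) if relu else z

# Parameter shapes depend only on the feature counts, never on the batch size.
w1, b1 = rng.normal(size=(8, 30)), np.zeros(30)
w2, b2 = rng.normal(size=(30, 30)), np.zeros(30)
w_out, b_out = rng.normal(size=(38, 1)), np.zeros(1)

def forward(x):
    hidden2 = dense(dense(x, w1, b1), w2, b2)
    concat = np.concatenate([x, hidden2], axis=1)  # 8 + 30 = 38 columns
    output = dense(concat, w_out, b_out, relu=False)
    return concat, output

for batch_size in (1, 3, 100):
    concat, output = forward(rng.normal(size=(batch_size, 8)))
    print(batch_size, concat.shape, output.shape)
```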

In [0]:
model.compile(loss="mean_squared_error", optimizer="sgd")

In [22]:
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_valid, y_valid))

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [23]:
mse_test = model.evaluate(X_test, y_test)



In [0]:
X_new = X_test[:3]

In [0]:
y_pred = model.predict(X_new)

**MULTIPLE INPUTS**

But what if you want to send a subset of the features through the wide path and a different subset (possibly overlapping) through the deep path (see Figure 10-15)? In this case, one solution is to use multiple inputs.

In [0]:
input_A = keras.layers.Input(shape=[5], name="wide_input") 
input_B = keras.layers.Input(shape=[6], name="deep_input") 
hidden1 = keras.layers.Dense(30, activation="relu")(input_B) 
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1) 
concat = keras.layers.concatenate([input_A, hidden2]) 
output = keras.layers.Dense(1, name="output")(concat) 
model = keras.Model(inputs=[input_A, input_B], outputs=[output])

In [0]:
# `lr` is a deprecated alias; `learning_rate` is the supported argument name.
model.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=1e-3))

In [28]:
X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:] 
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:] 
X_test_A, X_test_B = X_test[:, :5], X_test[:, 2:] 
X_new_A, X_new_B = X_test_A[:3], X_test_B[:3]
history = model.fit((X_train_A, X_train_B), y_train, epochs=20, validation_data=((X_valid_A, X_valid_B), y_valid))

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [29]:
mse_test = model.evaluate((X_test_A, X_test_B), y_test) 
y_pred = model.predict((X_new_A, X_new_B))
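
The slices used above, `[:, :5]` for the wide input and `[:, 2:]` for the deep input, overlap on three columns. The sketch below spells that out with the dataset's feature names. The hard-coded list is an assumption meant to mirror `housing.feature_names`, so check it against your copy of the data.

```python
# Assumed to match housing.feature_names for the California housing dataset.
feature_names = ["MedInc", "HouseAge", "AveRooms", "AveBedrms",
                 "Population", "AveOccup", "Latitude", "Longitude"]

wide_features = feature_names[:5]   # columns 0-4 feed input_A
deep_features = feature_names[2:]   # columns 2-7 feed input_B
shared = [name for name in wide_features if name in deep_features]

print("wide:", wide_features)
print("deep:", deep_features)
print("shared:", shared)
```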



**AUXILIARY OUTPUT**

In [0]:
output = keras.layers.Dense(1, name="main_output")(concat) 
aux_output = keras.layers.Dense(1, name="aux_output")(hidden2) 
model = keras.Model(inputs=[input_A, input_B], outputs=[output, aux_output])

Each output needs its own loss function. We also want to give the main output's loss a much greater weight than the auxiliary output's:

In [0]:
model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1], optimizer="sgd")
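
As a sketch of what `loss_weights=[0.9, 0.1]` means, the total loss being minimized is the weighted sum of the per-output losses. The toy targets and predictions below are made up for illustration, and the `mse` helper is a plain NumPy stand-in for the Keras loss.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])
main_pred = np.array([1.1, 1.9, 3.2])  # hypothetical main-output predictions
aux_pred = np.array([0.5, 2.5, 2.0])   # hypothetical auxiliary-output predictions

def mse(y, y_hat):
    """Mean squared error between two 1-D arrays."""
    return float(np.mean((y - y_hat) ** 2))

main_loss = mse(y_true, main_pred)
aux_loss = mse(y_true, aux_pred)

# Same weights as in the compile() call: 0.9 for main, 0.1 for auxiliary.
total_loss = 0.9 * main_loss + 0.1 * aux_loss

print(main_loss, aux_loss, total_loss)
```

Because the auxiliary loss is scaled by 0.1, a large error on the auxiliary output moves the total far less than the same error on the main output.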

Now when we train the model, we need to provide labels for each output. In this example, the main output and the auxiliary output should try to predict the same thing, so they should use the same labels. So instead of passing y_train, we need to pass (y_train, y_train) (and the same goes for y_valid and y_test):

In [32]:
history = model.fit([X_train_A, X_train_B], [y_train, y_train], epochs=20, validation_data=([X_valid_A, X_valid_B], [y_valid, y_valid]))

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [33]:
total_loss, main_loss, aux_loss = model.evaluate([X_test_A, X_test_B], [y_test, y_test])



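When we evaluate the model, Keras returns the total loss as well as all the individual losses, which is why the call above unpacks three values. Similarly, `predict()` returns one array of predictions per output: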

In [0]:
y_pred_main, y_pred_aux = model.predict([X_new_A, X_new_B])