# Dataset

California Housing
- This version of the dataset differs from the one used in Chapter 2, which has categorical features and missing values. Here every feature is numerical and nothing is missing, so we can focus on the mechanics of building a regression MLP with the Sequential API.

In [1]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

import tensorflow as tf
from tensorflow import keras

# Loading the dataset

In [2]:
housing = fetch_california_housing()

Downloading Cal. housing from https://ndownloader.figshare.com/files/5976036 to /root/scikit_learn_data


In [3]:
X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target)

# Preprocessing the dataset

In [4]:
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)
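
When neither `test_size` nor `train_size` is passed, `train_test_split` holds out 25% of the rows, so the two calls above produce roughly a 56/19/25 train/validation/test split. The sketch below reproduces that arithmetic in plain Python. The helper `split_sizes` is hypothetical, and the row count of 20,640 is an assumption taken from the scikit-learn description of the dataset.

```python
import math

def split_sizes(n_samples, test_fraction=0.25):
    """Return (n_train, n_test) for one hold-out split (hypothetical helper)."""
    # The held-out part is rounded up; the rest stays in the training part.
    n_test = math.ceil(n_samples * test_fraction)
    return n_samples - n_test, n_test

n_total = 20640  # assumed number of rows in the California housing dataset

# First split: full training set vs. test set.
n_train_full, n_test = split_sizes(n_total)
# Second split: training set vs. validation set.
n_train, n_valid = split_sizes(n_train_full)

print(n_train, n_valid, n_test)  # 11610 3870 5160
```

Passing `random_state` to `train_test_split` would make both splits reproducible across runs.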

In [5]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)
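
The scaler is fitted on the training set only and then reused on the validation and test sets, so no statistics from held-out data leak into training. As a minimal sketch of what that means for a single feature column, assuming made-up values and two hypothetical helpers (`fit_standardizer`, `standardize`):

```python
import math

def fit_standardizer(column):
    """Compute the mean and population standard deviation of one feature."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    return mean, math.sqrt(variance)

def standardize(column, mean, std):
    """Apply (x - mean) / std using statistics computed elsewhere."""
    return [(x - mean) / std for x in column]

train_col = [1.0, 2.0, 3.0, 4.0]  # made-up training values
valid_col = [2.5, 5.0]            # made-up validation values

# Statistics come from the training column only...
mean, std = fit_standardizer(train_col)
# ...and are reused unchanged on the validation column.
train_scaled = standardize(train_col, mean, std)
valid_scaled = standardize(valid_col, mean, std)
```

The scaled training column has zero mean by construction, while the validation column generally does not. This sketch does not guard against a zero standard deviation for constant features.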

**Remark:** Main differences between building a regression MLP and building a classification MLP
* the output layer has a single neuron and no activation function
* the loss function is mean squared error (MSE)
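
The model in this notebook is compiled with `loss="mean_squared_error"`. To make that loss concrete, here is a minimal plain-Python sketch of the computation. The function name mirrors the Keras loss string but is a hypothetical stand-in, and the target and prediction values are made up.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 1.5, 2.0]  # made-up targets
y_pred = [2.5, 1.5, 3.0]  # made-up predictions

# Squared errors are 0.25, 0.0 and 1.0, so the mean is 1.25 / 3.
mse = mean_squared_error(y_true, y_pred)
```

Because the errors are squared, the single large miss (1.0) dominates the result.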

# Creating the model using the Sequential API

In [6]:
model = keras.models.Sequential([
  keras.layers.Dense(30, activation="relu", input_shape=X_train.shape[1:]),
  keras.layers.Dense(1)
])
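
As a sanity check on the size of this network, the number of trainable parameters in a dense layer is one weight per input-unit pair plus one bias per unit. The sketch below applies that formula, assuming the 8 input features of the California housing dataset. `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 8  # assumed number of input features

hidden_params = dense_param_count(n_features, 30)  # hidden Dense(30) layer
output_params = dense_param_count(30, 1)           # output Dense(1) layer
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 270 31 301
```

Calling `model.summary()` on the model above should report matching counts.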

# Compiling the model

In [7]:
model.compile(loss="mean_squared_error", optimizer="sgd")

# Training the model

In [8]:
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_valid, y_valid))

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


# Evaluating the model

In [9]:
mse_test = model.evaluate(X_test, y_test)
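
Since the model was compiled with MSE as its only loss and no extra metrics, `evaluate` returns a single number in squared target units. Taking the square root gives an error in the units of the target, which for this dataset is expressed in hundreds of thousands of dollars. A minimal sketch, using a made-up MSE value rather than a result from this notebook:

```python
import math

example_mse = 0.42  # made-up value, NOT an actual result from this model

# Root mean squared error, in the same units as the target.
rmse = math.sqrt(example_mse)
# The target is in units of $100,000, so rescale to dollars.
rmse_dollars = rmse * 100_000

print(f"RMSE of about ${rmse_dollars:,.0f}")
```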



# Making Predictions

In [11]:
X_new = X_test[:3]
y_pred = model.predict(X_new)
y_pred

array([[3.26354  ],
       [1.3323394],
       [1.8759527]], dtype=float32)