### Codio Activity 23.5: Neural Networks for Regression

**Expected Time = 90 minutes** 

**Total Points = 40** 

This activity focuses on using a neural network to build a regression model that predicts housing prices.  Most of the work mirrors your earlier classification models; the differences are the loss function and the shape of the output layer.

#### Index

- [Problem 1](#-Problem-1)
- [Problem 2](#-Problem-2)
- [Problem 3](#-Problem-3)
- [Problem 4](#-Problem-4)

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

In [2]:
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

#### The Data

The dataset contains basic information on houses in neighborhoods across California, USA. The cells below load the data and preview it; uncomment `print(houses.DESCR)` to print the full description of the data. Your goal is to predict the median house value (`MedHouseVal`) for each neighborhood.

In [4]:
houses = fetch_california_housing(as_frame=True)
houses.frame.head()

| | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude | MedHouseVal |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 8.3252 | 41.0 | 6.984127 | 1.02381 | 322.0 | 2.555556 | 37.88 | -122.23 | 4.526 |
| 1 | 8.3014 | 21.0 | 6.238137 | 0.97188 | 2401.0 | 2.109842 | 37.86 | -122.22 | 3.585 |
| 2 | 7.2574 | 52.0 | 8.288136 | 1.073446 | 496.0 | 2.80226 | 37.85 | -122.24 | 3.521 |
| 3 | 5.6431 | 52.0 | 5.817352 | 1.073059 | 558.0 | 2.547945 | 37.85 | -122.25 | 3.413 |
| 4 | 3.8462 | 52.0 | 6.281853 | 1.081081 | 565.0 | 2.181467 | 37.85 | -122.25 | 3.422 |


In [6]:
# print(houses.DESCR)

In [8]:
X, y = houses.data, houses.target
display(X)
X = StandardScaler().fit_transform(X)  # scale the data

| | MedInc | HouseAge | AveRooms | AveBedrms | Population | AveOccup | Latitude | Longitude |
|---|---|---|---|---|---|---|---|---|
| 0 | 8.3252 | 41.0 | 6.984127 | 1.023810 | 322.0 | 2.555556 | 37.88 | -122.23 |
| 1 | 8.3014 | 21.0 | 6.238137 | 0.971880 | 2401.0 | 2.109842 | 37.86 | -122.22 |
| 2 | 7.2574 | 52.0 | 8.288136 | 1.073446 | 496.0 | 2.802260 | 37.85 | -122.24 |
| 3 | 5.6431 | 52.0 | 5.817352 | 1.073059 | 558.0 | 2.547945 | 37.85 | -122.25 |
| 4 | 3.8462 | 52.0 | 6.281853 | 1.081081 | 565.0 | 2.181467 | 37.85 | -122.25 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 20635 | 1.5603 | 25.0 | 5.045455 | 1.133333 | 845.0 | 2.560606 | 39.48 | -121.09 |
| 20636 | 2.5568 | 18.0 | 6.114035 | 1.315789 | 356.0 | 3.122807 | 39.49 | -121.21 |
| 20637 | 1.7000 | 17.0 | 5.205543 | 1.120092 | 1007.0 | 2.325635 | 39.43 | -121.22 |
| 20638 | 1.8672 | 18.0 | 5.329513 | 1.171920 | 741.0 | 2.123209 | 39.43 | -121.32 |
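
`StandardScaler` centers each column to zero mean and rescales it to unit variance. As a rough illustration of that arithmetic, the sketch below applies it by hand to `MedInc` and `HouseAge` for the first three rows displayed above. The names `X_toy`, `mu`, `sigma`, and `X_scaled` are illustrative and not part of the activity.

```python
import numpy as np

# Toy input: MedInc and HouseAge for the first three rows displayed above.
X_toy = np.array([
    [8.3252, 41.0],
    [8.3014, 21.0],
    [7.2574, 52.0],
])

# Column-wise mean and population standard deviation (ddof=0),
# the convention StandardScaler follows.
mu = X_toy.mean(axis=0)
sigma = X_toy.std(axis=0)

X_scaled = (X_toy - mu) / sigma

print(X_scaled.mean(axis=0))  # approximately 0 for each column
print(X_scaled.std(axis=0))   # approximately 1 for each column
```

Scaling matters here because the features span very different ranges (compare `Population` with `AveBedrms`), and gradient-based training tends to behave better when inputs share a common scale.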


[Back to top](#-Index)

### Problem 1

#### The Network Architecture

Use the function `Sequential()` to create a neural network `model` with the following architecture:


- A single hidden `Dense` layer with 100 hidden nodes and with `activation` equal to `relu` 
- A single output `Dense` layer with 1 unit and with `activation` equal to `linear`
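
To make the layer shapes concrete, the architecture above can be sketched as a forward pass in plain NumPy. This is only an illustration under stated assumptions: the weights are random stand-ins for trained parameters, and the names `W1`, `b1`, `W2`, `b2`, and `X_toy` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

n_samples, n_features, n_hidden = 5, 8, 100  # 8 features, as in X

# Randomly initialized parameters stand in for trained weights.
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
b2 = np.zeros(1)

X_toy = rng.normal(size=(n_samples, n_features))

hidden = np.maximum(0.0, X_toy @ W1 + b1)  # ReLU hidden layer
y_hat = hidden @ W2 + b2                   # linear output: values are unbounded

n_params = W1.size + b1.size + W2.size + b2.size

print(hidden.shape)  # (5, 100)
print(y_hat.shape)   # (5, 1)
print(n_params)      # 1001
```

The single linear unit is what makes this a regression network: each sample gets one real-valued prediction rather than a probability.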

In [10]:
### GRADED
tf.random.set_seed(42)
model = Sequential(
    [
        Dense(100, activation="relu"),
        Dense(1, activation="linear"),
    ]
)

### ANSWER CHECK
model.layers[0].units

100

[Back to top](#-Index)

### Problem 2

#### Compiling the Network


Use the `compile` method to compile `model` with `mse` as the `loss` and `mse` as the only entry in `metrics`.
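
For reference, mean squared error is the average of the squared differences between targets and predictions. The sketch below computes it by hand; the targets are the first three `MedHouseVal` values from the data preview, while the predictions are made up for illustration and `mse` is a hypothetical helper, not a Keras function.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

# First three MedHouseVal targets from the data preview; predictions are made up.
y_true = [4.526, 3.585, 3.521]
y_pred = [4.0, 3.5, 4.0]

print(mse(y_true, y_pred))  # about 0.1711
```

Because the loss and the metric are the same quantity here, the `loss` and `mse` entries in the training history of Problem 3 are identical.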


In [11]:
### GRADED
tf.random.set_seed(42)
# compile model here
model.compile(
    optimizer="rmsprop",
    loss="mse",
    metrics=["mse"],
)

### ANSWER CHECK
print(model.loss)

mse


[Back to top](#-Index)

### Problem 3

#### Training the model

Use the function `fit()` to fit your `model` to the `X` and `y` data. Set the argument `validation_split` equal to 0.2, the argument `epochs` equal to `20`, and the argument `verbose` to `0`. Assign your result to the variable `history`.
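
With `validation_split`, Keras holds out the last fraction of the rows of `X` and `y`, taken before any shuffling. The following sketch mimics that positional split on toy arrays. It is an approximation of the behavior, not the library's implementation, and the names `X_toy`, `y_toy`, `n_val`, and `split_at` are illustrative.

```python
import numpy as np

X_toy = np.arange(10).reshape(10, 1)   # 10 samples, labeled 0..9
y_toy = np.arange(10, dtype=float)

validation_split = 0.2
n_val = int(round(len(X_toy) * validation_split))
split_at = len(X_toy) - n_val

# Hold out the tail of the arrays, without shuffling first.
X_train, X_val = X_toy[:split_at], X_toy[split_at:]
y_train, y_val = y_toy[:split_at], y_toy[split_at:]

print(X_train.ravel())  # [0 1 2 3 4 5 6 7]
print(X_val.ravel())    # [8 9]
```

Since the split is positional, any ordering in the rows carries over into the validation set, so training and validation losses may describe somewhat different slices of the data.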


In [12]:
### GRADED
tf.random.set_seed(42)
history = model.fit(
    X,
    y,
    epochs=20,
    verbose=0,
    validation_split=0.2,
)

### ANSWER CHECK
print(history.history)

{'loss': [0.8567278385162354, 0.5092543363571167, 0.43461868166923523, 0.3748684525489807, 0.3519285321235657, 0.34590446949005127, 0.34400296211242676, 0.33761346340179443, 0.3439137041568756, 0.3407077491283417, 0.336196631193161, 0.3346494138240814, 0.3415217399597168, 0.33588358759880066, 0.339256227016449, 0.3744313716888428, 0.36362424492836, 0.35506248474121094, 0.35232454538345337, 0.3409075140953064], 'mse': [0.8567278385162354, 0.5092543363571167, 0.43461868166923523, 0.3748684525489807, 0.3519285321235657, 0.34590446949005127, 0.34400296211242676, 0.33761346340179443, 0.3439137041568756, 0.3407077491283417, 0.336196631193161, 0.3346494138240814, 0.3415217399597168, 0.33588358759880066, 0.339256227016449, 0.3744313716888428, 0.36362424492836, 0.35506248474121094, 0.35232454538345337, 0.3409075140953064], 'val_loss': [2.6782796382904053, 1.8503670692443848, 1.1291145086288452, 0.5751430988311768, 0.43720027804374695, 0.57865309715271, 0.5880486965179443, 0.7523414492607117, 0.
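
One way to read a history like this is to find the epoch with the lowest validation loss. The sketch below does so for the first eight `val_loss` values visible above, rounded to four decimals. The printed output is truncated, so later epochs are not covered and the result applies only to these eight values.

```python
import numpy as np

# First eight validation losses visible in the output above (rounded).
# The printed history is truncated, so later epochs are not included here.
val_loss = [2.6783, 1.8504, 1.1291, 0.5751, 0.4372, 0.5787, 0.5880, 0.7523]

best_epoch = int(np.argmin(val_loss)) + 1  # epochs are counted from 1
print(best_epoch, val_loss[best_epoch - 1])  # 5 0.4372
```

In the notebook itself, the same call could be applied to `history.history["val_loss"]` to cover all 20 epochs.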

[Back to top](#-Index)

### Problem 4

#### Comparing to `LinearRegression`

Compare the neural network's mean squared error with that of a `LinearRegression` model. Fit a `LinearRegression` model on the full dataset `X`, `y` and assign the fitted model to the variable `lr`.

Finally, use the function `mean_squared_error` with arguments equal to `y` and `lr.predict(X)` to compute your error. Assign the result to the variable `lr_mse`.
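
As a rough picture of what this baseline computes, the sketch below fits ordinary least squares with an intercept using `np.linalg.lstsq` and reports the in-sample mean squared error. It runs on synthetic stand-in data, not the housing data, and every name in it (`X_toy`, `true_coef`, `toy_mse`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 samples, 3 features, linear signal plus noise.
X_toy = rng.normal(size=(200, 3))
true_coef = np.array([1.5, -2.0, 0.5])
y_toy = X_toy @ true_coef + 0.7 + rng.normal(scale=0.1, size=200)

# Append a column of ones so the least-squares solution includes an intercept.
A = np.column_stack([X_toy, np.ones(len(X_toy))])
coef, *_ = np.linalg.lstsq(A, y_toy, rcond=None)

y_hat = A @ coef
toy_mse = np.mean((y_toy - y_hat) ** 2)

print(coef)     # close to [1.5, -2.0, 0.5, 0.7]
print(toy_mse)  # close to the noise variance, 0.01
```

Note that this error, like `lr_mse`, is measured on the same rows used for fitting, so it is most directly comparable to the network's training `mse` and less so to its `val_mse`.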

In [13]:
from sklearn.metrics import mean_squared_error

In [None]:
### GRADED
lr = LinearRegression().fit(X, y)
yhat = lr.predict(X)
lr_mse = mean_squared_error(y, yhat)

### ANSWER CHECK
print(lr_mse)