<a href="https://colab.research.google.com/github/georgiastuart/python_data_science_for_teachers/blob/main/Python_For_Data_Science_Lesson_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python for Data Science Lesson 2: Introduction to Regression Neural Networks

This lesson is inspired by [this notebook](https://www.kaggle.com/arunkumarramanan/tensorflow-tutorial-and-housing-price-prediction) on Kaggle. 

We will use the [TensorFlow](https://www.tensorflow.org/api_docs/python/tf) Python library to build a neural network that predicts housing prices (in 1970s Boston).

## What is a Neural Network?

A *neural network* is a mathematical structure composed of layers of artificial neurons, loosely inspired by how our brains work.

[This 3Blue1Brown video](https://www.youtube.com/watch?v=aircAruvnKk) is a great introduction to the structure of neural networks.
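
To make the "layers of neurons" idea concrete, here is a minimal NumPy sketch of one pass through a tiny network with a single hidden layer. The layer sizes, the weights, and the helper names `relu` and `forward` are invented for illustration; they do not come from the video or from the model built later in this lesson.

```python
import numpy as np

def relu(z):
    # ReLU activation: negative values become 0, positive values pass through.
    return np.maximum(z, 0.0)

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    # Each hidden neuron computes a weighted sum of the inputs plus a bias,
    # then applies the activation function.
    hidden = relu(inputs @ w_hidden + b_hidden)
    # The output neuron is a weighted sum of the hidden activations plus a bias.
    return hidden @ w_out + b_out

# Illustrative values only: 2 inputs -> 3 hidden neurons -> 1 output.
toy_inputs = np.array([1.0, 2.0])
toy_w_hidden = np.array([[1.0, -1.0, 0.5],
                         [0.5, 1.0, -1.0]])
toy_b_hidden = np.array([0.0, 0.0, 1.0])
toy_w_out = np.array([1.0, 2.0, 3.0])
toy_b_out = 0.5

toy_prediction = forward(toy_inputs, toy_w_hidden, toy_b_hidden, toy_w_out, toy_b_out)
print(toy_prediction)
```

Working through it by hand, the hidden layer produces `[2, 1, 0]` after ReLU, so the output is `2*1 + 1*2 + 0*3 + 0.5 = 4.5`.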

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

## Acquiring Data

For this tutorial, we'll use data that's provided by the `keras` module itself. We'll load a *training set* and its associated *labels* (output) and a *test set* with labels. 

The data is the [Boston Housing Price dataset](https://keras.io/api/datasets/boston_housing/), which contains 13 features that may help predict the price of a house in Boston. The Keras dataset is provided as plain NumPy arrays with no column names, but we can see what each column refers to [here](http://lib.stat.cmu.edu/datasets/boston).
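
Because the arrays carry no column names, it can help to keep the names from the linked description in a list. The sketch below assumes the Keras arrays keep the same column order as that description; `FEATURE_NAMES` and `label_row` are hypothetical helpers, not part of Keras.

```python
# Column names as listed in the CMU StatLib description of the dataset.
# Assumption: the Keras arrays use the same column order.
FEATURE_NAMES = [
    "CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
    "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT",
]

def label_row(row):
    # Pair each value in one observation with its column name.
    if len(row) != len(FEATURE_NAMES):
        raise ValueError("expected one value per feature name")
    return dict(zip(FEATURE_NAMES, row))

# Placeholder values 0..12 stand in for a real observation.
labeled_placeholder = label_row(list(range(13)))
print(labeled_placeholder["RM"])
```

Once the data is loaded, `label_row(train_features[0])` would label the first training observation in the same way.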

In [None]:
(train_features, train_labels), (test_features, test_labels) = keras.datasets.boston_housing.load_data()

Let's look at the data:

In [None]:
print(train_features.shape)
print(train_labels.shape)

As you can see, the training set has 404 observations, each with 13 features, and one label per observation.

Now we need to set up the data for training the network. If we look at a single observation, we see that its features have very different magnitudes:

In [None]:
print(train_features[0, :])

Neural networks train more effectively when the data is *normalized*, so we're going to rescale each column so that all features are on the same scale.

We do that by calculating the *z-score* of each datapoint (the number of standard deviations away from the mean):

$$\frac{x - \bar{x}}{\sigma}$$

where $x$ is a data point, $\bar{x}$ is the mean of the feature, and $\sigma$ is the standard deviation of the feature.
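
As a quick worked example of the formula, here is a sketch with invented numbers chosen so the arithmetic is easy to check by hand (mean 5, standard deviation 2). The variable names are for illustration only.

```python
import numpy as np

# Made-up feature column with mean 5 and standard deviation 2.
toy_column = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

toy_mean = toy_column.mean()
toy_std = toy_column.std()  # population standard deviation (NumPy's default, ddof=0)

# z-score: how many standard deviations each point lies from the mean.
toy_z = (toy_column - toy_mean) / toy_std
print(toy_mean, toy_std)
print(toy_z)
```

The value 9 is two standard deviations above the mean, so its z-score is 2; the value 2 is one and a half standard deviations below, so its z-score is -1.5.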

In [None]:
train_mean = np.mean(train_features, axis=0)
print(train_mean.shape, train_mean)
train_std = np.std(train_features, axis=0)

normalized_train_features = (train_features - train_mean) / train_std

In [None]:
label_mean = np.mean(train_labels)
label_std = np.std(train_labels)
normalized_train_labels = (train_labels - label_mean) / label_std

Now, if we look at the mean of the normalized train features, it will be zero for all features (within floating point error) and the standard deviation will be 1 for all features:

In [None]:
print(np.mean(normalized_train_features, axis=0))
print(np.std(normalized_train_features, axis=0))

Now that our data is normalized, we need to build the structure of our neural network:

In [None]:
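def build_model():
  model = keras.Sequential([
    # Hidden layer: 20 neurons with ReLU activation, one input per feature.
    Dense(20, activation=tf.nn.relu, input_shape=[normalized_train_features.shape[1]]),
    # Output layer: a single neuron that produces the predicted (normalized) price.
    Dense(1)
  ])

  # Minimize mean squared error with the Adam optimizer; also track mean absolute error.
  model.compile(optimizer=keras.optimizers.Adam(), loss='mse', metrics=['mae', 'mse'])
  return model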
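
To get a feel for the size of this network, we can count its trainable parameters by hand. This is a small sketch in plain Python; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    # Each unit in a dense layer has one weight per input plus one bias.
    return n_inputs * n_units + n_units

n_features = 13                                       # columns in the feature array
hidden_params = dense_param_count(n_features, 20)     # Dense(20)
output_params = dense_param_count(20, 1)              # Dense(1)
total_params = hidden_params + output_params
print(hidden_params, output_params, total_params)
```

Calling `build_model().summary()` should report the same totals: 280 parameters in the hidden layer and 21 in the output layer, 301 overall.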

Now we need to train our neural network. 

Here's the next video in the [3Blue1Brown Neural Network series](https://www.youtube.com/watch?v=IHZwWFHWa-w).
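
Training means repeatedly nudging the weights in the direction that lowers the loss. The sketch below shows plain gradient descent on a one-weight linear model with mean squared error, using made-up data. It is a simplification: the model in this lesson uses the Adam optimizer, which builds on the same idea but adapts the step sizes.

```python
import numpy as np

# Made-up data that follows y = 3x + 1 exactly.
toy_x = np.array([-1.0, 0.0, 1.0, 2.0])
toy_y = 3.0 * toy_x + 1.0

w, b = 0.0, 0.0          # start from arbitrary parameters
learning_rate = 0.1

for epoch in range(500):
    error = (w * toy_x + b) - toy_y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(error * toy_x)
    grad_b = 2.0 * np.mean(error)
    # Step against the gradient to reduce the loss.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_mse = np.mean((w * toy_x + b - toy_y) ** 2)
print(w, b, final_mse)
```

After enough steps, `w` and `b` settle close to the values 3 and 1 that generated the data, and the loss approaches zero.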

In [None]:
model = build_model()
history = model.fit(normalized_train_features, normalized_train_labels, epochs=1000, validation_split=0.1, verbose=0)
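
The `validation_split=0.1` argument holds out part of the training data to monitor the model on examples it is not trained on. According to the Keras documentation, the held-out samples are taken from the end of the arrays, before any shuffling. The sketch below mimics that behavior with NumPy slicing; `split_for_validation` is a hypothetical helper, and the exact rounding is an assumption.

```python
import math
import numpy as np

def split_for_validation(features, labels, validation_split):
    # Keep the first part for training and hold out the last part.
    split_at = math.floor(len(features) * (1.0 - validation_split))
    training = (features[:split_at], labels[:split_at])
    validation = (features[split_at:], labels[split_at:])
    return training, validation

# Placeholder arrays with the same shapes as the training data in this lesson.
toy_features = np.zeros((404, 13))
toy_labels = np.zeros(404)

(fit_x, fit_y), (val_x, val_y) = split_for_validation(toy_features, toy_labels, 0.1)
print(fit_x.shape, val_x.shape)
```

Under these assumptions, 363 observations are used to fit the weights and 41 are held out, which is where the `val_` columns in `history.history` come from.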

In [None]:
hist = pd.DataFrame(history.history)
hist

In [None]:
plt.plot(hist['mse'])
plt.xlabel('Epoch')
plt.ylabel('MSE');

Now let's look at the test data and see what our network predicts!

In [None]:
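# Normalize the test set with the *training* mean and standard deviation,
# so the network sees test data on the same scale it was trained on.
normalized_test_features = (test_features - train_mean) / train_std
normalized_test_features.shape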

In [None]:
result = model.predict(normalized_test_features)
print(result.shape)

plt.scatter(test_labels, result[:, 0] * label_std + label_mean)
plt.xlabel('True Value (1000s of $)')
plt.ylabel('Predicted Value (1000s of $)');
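
The expression `result[:, 0] * label_std + label_mean` undoes the z-score so that predictions are back in thousands of dollars. Here is a small sketch of that step, followed by a mean absolute error in the original units. All numbers are invented, and `denormalize` and `mean_absolute_error` are hypothetical helpers rather than part of the lesson's code.

```python
import numpy as np

def denormalize(z, mean, std):
    # Invert the z-score: x = z * std + mean.
    return z * std + mean

def mean_absolute_error(y_true, y_pred):
    # Average size of the prediction error, ignoring its sign.
    return np.mean(np.abs(y_true - y_pred))

# Made-up label statistics and values, in thousands of dollars.
toy_label_mean, toy_label_std = 20.0, 10.0
toy_normalized_predictions = np.array([-1.0, 0.0, 0.5])
toy_true_prices = np.array([12.0, 20.0, 22.0])

toy_predicted_prices = denormalize(toy_normalized_predictions, toy_label_mean, toy_label_std)
toy_mae = mean_absolute_error(toy_true_prices, toy_predicted_prices)
print(toy_predicted_prices, toy_mae)
```

The same two helpers could be applied to `result[:, 0]` and `test_labels` to summarize the scatter plot with a single number.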
