# 01 - Neural Network Regression with TensorFlow

**Contents of this notebook:**
- Architecture of a regression model
- Input shapes and output shapes
    - `X`: features/data (inputs)
    - `y`: labels (outputs)
- Creating custom data to view and fit
- Steps in modelling
    - Creating a model
    - Compiling a model
        - Defining a loss function
        - Setting up an optimizer
        - Creating evaluation metrics
    - Fitting a model (getting it to find patterns in our data)
- Evaluating a model
    - Visualizing the model 
    - Looking at training curves
    - Comparing predictions to ground truth
- Saving a model
- Loading a model

In [1]:
# Import TensorFlow and print its version
import tensorflow as tf
print(tf.__version__)

2.8.0


## What is a regression problem?

There are many definitions of a regression problem, but in our case we're going to simplify it to: predicting a number.

For example, we might want to:
- Predict the selling price of houses given information about them (such as number of rooms, size, number of bathrooms...).
- Predict coordinates of a bounding box of an item in an image.
- Predict the cost of medical insurance for an individual given their demographics (age, sex, gender, race).

## Typical architecture of a regression neural network

The word *typical* is deliberate. Why?

Because there are many different ways to write neural networks. But the following is a generic setup for ingesting a collection of numbers, finding patterns in them and then outputting some kind of target number.

| **Hyperparameter** | **Typical Value** |
| --- | --- |
| Input layer shape | Same shape as number of features (e.g. 3 for #bedrooms, #bathrooms, #car spaces in housing price prediction) |
| Hidden layer(s) | Problem specific, minimum = 1, maximum = unlimited |
| Neurons per hidden layer | Problem specific, generally 10 to 100 |
| Output layer shape | Same shape as desired prediction shape (e.g. 1 for house price) |
| Hidden activation | Usually ReLU (rectified linear unit) |
| Output activation | None, ReLU, logistic/tanh |
| Loss function | MSE (mean squared error) or MAE (mean absolute error) ... |
| Optimizer | SGD (stochastic gradient descent), Adam ... |

***Table 1:*** *Typical architecture of a regression network.* ***Source:*** *Adapted from page 293 of [Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow Book by Aurélien Géron](https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/)*
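
To make Table 1 more concrete, here is a minimal sketch of how its rows might map onto a Keras model. The specific values (3 input features, one hidden layer of 50 neurons, MAE loss, Adam optimizer) are illustrative assumptions picked from the ranges in the table, not a recommended configuration.

```python
import tensorflow as tf

# Assumption for illustration: 3 features (e.g. bedrooms, bathrooms, car spaces)
NUM_FEATURES = 3

model = tf.keras.Sequential([
    # Input layer shape: same as the number of features
    tf.keras.Input(shape=(NUM_FEATURES,)),
    # Hidden layer: 50 neurons (within the "10 to 100" range), ReLU activation
    tf.keras.layers.Dense(50, activation="relu"),
    # Output layer: 1 neuron for a single predicted number, no activation
    tf.keras.layers.Dense(1),
])

model.compile(
    loss=tf.keras.losses.MeanAbsoluteError(),  # MAE; MSE is the other common choice
    optimizer=tf.keras.optimizers.Adam(),      # SGD is the other common choice
    metrics=["mae"],
)

model.summary()
```

Changing the number of hidden layers, the neurons per layer, or the loss and optimizer only requires editing the corresponding line, which is why these are listed as hyperparameters.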

> **Note**: A **hyperparameter** in machine learning is something a data analyst or developer can set themselves, whereas a **parameter** usually describes something a model learns on its own.
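
As a rough illustration of the difference, the sketch below counts the learnable parameters (weights and biases) of a small fully connected network from the hyperparameters we choose. The helper `dense_param_count` and the values used are hypothetical and exist only for this example.

```python
def dense_param_count(n_inputs: int, n_units: int) -> int:
    """Learnable parameters in one fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units

# Hyperparameters: values we set ourselves (illustrative choices)
n_features = 3     # input layer shape
hidden_units = 10  # neurons in the single hidden layer

# Parameters: values the model learns during training
hidden_params = dense_param_count(n_features, hidden_units)  # 3*10 + 10 = 40
output_params = dense_param_count(hidden_units, 1)           # 10*1 + 1 = 11

print(hidden_params + output_params)  # 51
```

We picked two numbers (the hyperparameters), and the model ends up with 51 numbers to learn (the parameters).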

## Creating data to view and fit

Since we're working on a **regression problem** (predicting a number) let's create some linear data to model.