# 02 - Neural Network Classification with TensorFlow

**Contents of this notebook:**
- Architecture of a classification model
- Input shapes and output shapes
    - `X` : features/data (inputs)
    - `y` : labels (outputs)
        - "What class do the inputs belong to?"
- Creating custom data to view and fit
- Steps in modelling for binary and multiclass classification
    - Creating a model
    - Compiling a model
        - Defining a loss function
        - Setting up an optimizer
            - Finding the best learning rate
        - Creating evaluation metrics
    - Fitting a model (getting it to find patterns in our data)
    - Improving a model
- The power of non-linearity
- Evaluating classification models
    - Visualizing the model
    - Looking at training curves
    - Compare predictions to ground truth

## What is a Classification problem?

A **classification problem** involves predicting which class (category) an input belongs to, out of a fixed set of options.

For example, we might want to:
- Predict whether or not someone has heart disease based on their health parameters. This is called **binary classification** since there are only two options.
- Decide whether a photo is of food, a person or a dog. This is called **multi-class classification** since there are more than two options.
- Predict what categories should be assigned to a Wikipedia article. This is called **multi-label classification** since a single article could have more than one category assigned.
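
To make the difference between the three problem types concrete, the sketch below shows how the labels `y` might be encoded in each case. All values, class names and the `one_hot` helper are hypothetical and chosen only for illustration; they are not taken from a real dataset.

```python
# Hypothetical labels, for illustration only.

# Binary classification: one 0/1 value per sample (e.g. heart disease or not).
binary_labels = [0, 1, 1, 0]

# Multi-class classification: exactly one class per sample.
class_names = ["food", "person", "dog"]
multiclass_labels = [2, 0, 1]  # integer index into class_names


def one_hot(index, num_classes):
    """Return a list with a single 1 at position `index`."""
    return [1 if i == index else 0 for i in range(num_classes)]


multiclass_one_hot = [one_hot(i, len(class_names)) for i in multiclass_labels]

# Multi-label classification: each sample can have zero, one or several categories.
categories = ["science", "history", "sport"]
article_categories = [{"science", "history"}, {"sport"}, set()]
multilabel_targets = [
    [1 if c in cats else 0 for c in categories] for cats in article_categories
]

print(multiclass_one_hot)
print(multilabel_targets)
```

Each one-hot row in the multi-class case contains exactly one `1`, while a multi-label row may contain any number of them.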

## Typical Architecture of neural network classification models with TensorFlow

| **Hyperparameter** | **Binary Classification** | **Multiclass Classification** |
| --- | --- | --- |
| Input layer shape | Same shape as number of features (e.g. 5 for age, sex, height, weight, smoking status in heart disease prediction) | Same as binary classification |
| Hidden layer(s) | Problem specific, minimum = 1, maximum = unlimited | Same as binary classification |
| Neurons per hidden layer | Problem specific, generally 10 to 100 | Same as binary classification |
| Output layer shape | 1 (one class or the other) | 1 per class (e.g. 3 for food, person or dog photo) |
| Hidden activation | Usually ReLU (rectified linear unit) | Same as binary classification |
| Output activation | Sigmoid | Softmax |  
| Loss function | Cross entropy (`tf.keras.losses.BinaryCrossentropy` in TensorFlow) | Cross entropy (`tf.keras.losses.CategoricalCrossentropy` in TensorFlow) |
| Optimizer | SGD (stochastic gradient descent), Adam ... | Same as binary classification |

***Table 1:*** *Typical architecture of a classification network.* ***Source:*** *Adapted from page 295 of [Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow by Aurélien Géron](https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/)*
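
As a rough sketch of how the rows of Table 1 might translate into Keras code, the example below builds one binary and one multiclass model. The feature count, class count, single hidden layer of 10 neurons and the choice of Adam are assumptions made for illustration, not settings prescribed by this notebook.

```python
import tensorflow as tf

NUM_FEATURES = 5  # assumed: e.g. age, sex, height, weight, smoking status
NUM_CLASSES = 3   # assumed: e.g. food, person, dog

# Binary classification: 1 output neuron with a sigmoid activation.
binary_model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(10, activation="relu"),    # hidden layer
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output layer
])
binary_model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(),
    metrics=["accuracy"],
)

# Multiclass classification: 1 output neuron per class with softmax.
multiclass_model = tf.keras.Sequential([
    tf.keras.Input(shape=(NUM_FEATURES,)),
    tf.keras.layers.Dense(10, activation="relu"),              # hidden layer
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),  # output layer
])
multiclass_model.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),  # expects one-hot labels
    optimizer=tf.keras.optimizers.Adam(),
    metrics=["accuracy"],
)

print(binary_model.output_shape)
print(multiclass_model.output_shape)
```

Note that `tf.keras.losses.CategoricalCrossentropy` expects one-hot encoded labels; if the labels are integer class indices, `tf.keras.losses.SparseCategoricalCrossentropy` is the usual alternative.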

In [1]:
# Import TensorFlow and print the installed version
import tensorflow as tf
print(tf.__version__)

2.8.0


## Creating and Viewing classification data