## Model Hyperparameters

### Regression Hyperparameters

**Input Layer Shape:**  Same shape as the number of features (e.g. 3 in our example)

**Hidden Layer(s):**    Problem specific. Ranges from 1 to unlimited.

**Neurons per Hidden Layer:** Problem specific. Generally 10 to 100.

**Output Layer Shape:** Same shape as desired prediction (e.g. 1 for house price)

**Hidden Activation:** Usually Rectified Linear Unit (ReLU)

**Output Activation:** None, ReLU, logistic/tanh

**Loss Function:** Mean Squared Error (MSE), Mean Absolute Error (MAE), Huber (combination of MSE & MAE)

**Optimizer:** Stochastic Gradient Descent (SGD), Adam

Note: Stochastic means random
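
As a rough illustration of how the three regression losses listed above differ, here is a minimal pure-Python sketch. The function names and sample values are made up for this example, and the Huber form (quadratic for errors up to `delta`, linear beyond it) is assumed to follow the usual definition with `delta=1.0`.

```python
def mse(y_true, y_pred):
    """Mean squared error: large errors are penalized quadratically."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: every error is penalized linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def huber(y_true, y_pred, delta=1.0):
    """Huber loss: MSE-like for small errors, MAE-like for large ones."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2              # quadratic region
        else:
            total += delta * (err - 0.5 * delta)  # linear region
    return total / len(y_true)


# Illustrative data: the last prediction is a large outlier.
y_true = [3.0, 5.0, 10.0]
y_pred = [2.5, 5.0, 4.0]

print("MSE:  ", mse(y_true, y_pred))
print("MAE:  ", mae(y_true, y_pred))
print("Huber:", huber(y_true, y_pred))
```

On this data the single outlier dominates MSE far more than it does MAE or Huber, which is the usual reason to pick Huber when the targets contain outliers.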

### Binary Classification Hyperparameters

**Input Layer Shape:** Same as number of features

**Hidden Layer(s):** Problem specific. Minimum 1, no fixed maximum.

**Neurons per Hidden Layer:** Problem specific. Generally 10 to 100

**Output Layer Shape:** 1 (one class or the other)

**Hidden Activation:** Usually Rectified Linear Unit (ReLU)

**Output Activation:** Sigmoid

**Loss Function:** Cross entropy (tf.keras.losses.BinaryCrossentropy in TensorFlow)

**Optimizer:** Stochastic Gradient Descent (SGD), Adam (Adam is usually the safest default)
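
To make the sigmoid output and cross-entropy loss above more concrete, here is a minimal pure-Python sketch. The function names, the sample logits, and the 0.5 decision threshold are assumptions for illustration; in practice `tf.keras.losses.BinaryCrossentropy` does this work for you.

```python
import math


def sigmoid(z):
    """Squash a raw model output (logit) into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Average binary cross-entropy over a batch of labels and probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Illustrative raw outputs of the final Dense(1) layer before the sigmoid.
logits = [2.0, -1.0, 0.0]
labels = [1, 0, 1]

probs = [sigmoid(z) for z in logits]
loss = binary_cross_entropy(labels, probs)

# A single output neuron covers both classes: threshold the probability.
predicted = [1 if p >= 0.5 else 0 for p in probs]

print("probabilities:", probs)
print("loss:", loss)
print("predicted classes:", predicted)
```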


### Multi-Class Classification Hyperparameters

**Input Layer Shape:** Same as number of features

**Hidden Layer(s):** Problem specific. Minimum 1, no fixed maximum.

**Neurons per Hidden Layer:** Problem specific. Generally 10 to 100

**Output Layer Shape:** 1 per class

**Hidden Activation:** Usually Rectified Linear Unit (ReLU)

**Output Activation:** Softmax

**Loss Function:** Cross entropy
  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*Integer Labels:* tf.keras.losses.SparseCategoricalCrossentropy()

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*One-Hot Labels:* tf.keras.losses.CategoricalCrossentropy()

**Optimizer:** Stochastic Gradient Descent (SGD), Adam (Adam is usually the safest default)

**Metrics:** Accuracy
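
The sketch below illustrates, in plain Python, what the softmax output and the two cross-entropy variants above compute. The function names and logits are invented for the example. The point it tries to show is that the integer-label and one-hot forms give the same loss and differ only in how the label is encoded.

```python
import math


def softmax(logits):
    """Turn one raw score per class into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(label, probs):
    """Loss for an integer label, e.g. 0, 1 or 2."""
    return -math.log(probs[label])


def categorical_cross_entropy(one_hot, probs):
    """Loss for a one-hot label, e.g. [1, 0, 0]."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


# Illustrative raw outputs for a 3-class problem (one output neuron per class).
probs = softmax([2.0, 1.0, 0.1])

loss_int = sparse_categorical_cross_entropy(0, probs)
loss_one_hot = categorical_cross_entropy([1, 0, 0], probs)

print("probabilities:", probs)
print("loss (integer label):", loss_int)
print("loss (one-hot label):", loss_one_hot)
```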

### Convolutional Neural Network (CNN) Hyperparameters

**Input Image(s):** The images that we're going to discover patterns in

**Input Layer:** Takes in the target images and preprocesses them for further layers; input_shape = [batch_size, image_height, image_width, color_channels]

**Convolution Layers:** Learn the most important features from the images; created with tf.keras.layers.Conv2D for images, or more generally tf.keras.layers.ConvXD, where X is 1, 2, or 3

**Hidden Activation:** Usually ReLU (tf.keras.activations.relu)

**Pooling Layer:** Reduces the dimensionality of learned image features. Usually either:

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*Average:* tf.keras.layers.AvgPool2D

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*Max:* tf.keras.layers.MaxPool2D

**Fully connected layer:** Further refines learned features from the convolution layers: tf.keras.layers.Dense

**Output Layer:** output_shape = [number of classes]

**Output Activation:** Sigmoid (binary classification) or softmax (multi-class classification)
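
One way the components above might be assembled in Keras is sketched below, assuming a multi-class task with integer labels. The function name `cnn_base_model`, the filter counts, kernel sizes, and the default 28x28 grayscale input are illustrative assumptions, not values from these notes.

```python
import tensorflow as tf


def cnn_base_model(input_shape=(28, 28, 1), num_classes=10):
  """Hypothetical small CNN: two Conv2D + MaxPool2D stages, then a Dense output."""
  model = tf.keras.Sequential([
      # Keras takes the per-sample shape here; the batch dimension is implicit.
      tf.keras.layers.Input(shape=input_shape),
      tf.keras.layers.Conv2D(filters=10, kernel_size=3, activation="relu"),
      tf.keras.layers.MaxPool2D(pool_size=2),  # halves height and width
      tf.keras.layers.Conv2D(filters=10, kernel_size=3, activation="relu"),
      tf.keras.layers.MaxPool2D(pool_size=2),
      tf.keras.layers.Flatten(),  # feature maps -> 1D vector for the Dense layer
      tf.keras.layers.Dense(num_classes, activation="softmax")  # one neuron per class
  ])

  model.compile(
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),  # assumes integer labels
      optimizer=tf.keras.optimizers.Adam(),
      metrics=["accuracy"]
  )
  return model


model = cnn_base_model(input_shape=(28, 28, 1), num_classes=10)
print(model.output_shape)
```

For a binary task the same sketch would end in `Dense(1, activation="sigmoid")` with `BinaryCrossentropy` as the loss.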

## Binary Classification Models

### Binary Classification Base Model

import tensorflow as tf

def binary_classification_base_model():
  model = tf.keras.Sequential([
      tf.keras.layers.Input(shape=(2,)),  # 2 input features
      tf.keras.layers.Dense(10, activation="relu"),
      tf.keras.layers.Dense(10, activation="relu"),
      # The output layer needs a sigmoid activation for binary classification
      tf.keras.layers.Dense(1, activation="sigmoid")
  ])

  model.compile(loss=tf.keras.losses.BinaryCrossentropy(),  # binary because there are 2 classes (0 and 1)
                optimizer=tf.keras.optimizers.Adam(),
                metrics=["accuracy"])
  return model

## Multi-Class Classification Models

### Multi-Class Classification Base Models

**If labels are in one-hot representation** then loss = CategoricalCrossentropy

**If labels are in integer format** then loss = SparseCategoricalCrossentropy
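
To show what the two label formats look like, here is a small pure-Python sketch that converts between them. The helper names `to_one_hot` and `to_integer` are hypothetical; in TensorFlow, `tf.one_hot` performs the first conversion.

```python
def to_one_hot(labels, num_classes):
    """Convert integer labels such as [0, 2, 1] into one-hot rows."""
    return [[1 if i == label else 0 for i in range(num_classes)]
            for label in labels]


def to_integer(one_hot_labels):
    """Convert one-hot rows back into integer labels."""
    return [row.index(max(row)) for row in one_hot_labels]


int_labels = [0, 2, 1]                # use with SparseCategoricalCrossentropy
one_hot = to_one_hot(int_labels, 3)   # use with CategoricalCrossentropy

print(one_hot)
print(to_integer(one_hot))
```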


import tensorflow as tf
def multi_class_classification_base_model_integer_labels(input_shape, num_classes):
  model = tf.keras.Sequential([
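      tf.keras.layers.Input(shape=input_shape),  # declare the per-sample input shape explicitly
      tf.keras.layers.Flatten(),  # flatten each sample to a 1D vector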
      tf.keras.layers.Dense(4, activation="relu"),
      tf.keras.layers.Dense(4, activation="relu"),
      tf.keras.layers.Dense(num_classes, activation=tf.keras.activations.softmax)
  ])

  model.compile(
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      optimizer=tf.keras.optimizers.Adam(),
      metrics=["accuracy"]
  )
  return model

def multi_class_classification_base_model_one_hot_encoded_layers(input_shape, num_classes):
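  # Labels must be one-hot encoded first, e.g. tf.one_hot(y_train, depth=num_classes)
  model = tf.keras.Sequential([
      tf.keras.layers.Input(shape=input_shape),  # declare the per-sample input shape explicitly
      tf.keras.layers.Flatten(),  # flatten each sample to a 1D vector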
      tf.keras.layers.Dense(4, activation="relu"),
      tf.keras.layers.Dense(4, activation="relu"),
      tf.keras.layers.Dense(num_classes, activation=tf.keras.activations.softmax)
  ])

  model.compile(
      loss=tf.keras.losses.CategoricalCrossentropy(),
      optimizer=tf.keras.optimizers.Adam(),
      metrics=["accuracy"]
  )
  return model

# Example usage (28x28 inputs, 10 classes):
# model = multi_class_classification_base_model_integer_labels((28, 28), 10)