# Introduction to Neural Networks

In essence, for each epoch and for each training instance, the backpropagation algorithm first makes a prediction (forward pass) and measures the error. It then goes through each layer in reverse to measure the error contribution of each connection (reverse pass), and finally tweaks the connection weights slightly to reduce the error (Gradient Descent step).
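
The loop described above can be sketched in plain Python. This is a minimal illustration, not a production implementation: it assumes a toy network with one sigmoid hidden neuron and one linear output neuron, a halved squared-error loss, and a single made-up training instance. The names `train_step`, `params`, and `lr` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(params, x, y, lr=0.1):
    """One backpropagation step on a single training instance (toy network)."""
    w1, b1, w2, b2 = params

    # Forward pass: make a prediction, keeping the intermediate value h.
    h = sigmoid(w1 * x + b1)      # hidden neuron
    y_hat = w2 * h + b2           # linear output neuron

    # Measure the error (halved squared error keeps the derivative simple).
    loss = 0.5 * (y_hat - y) ** 2

    # Reverse pass: chain rule from the output layer back to the input layer.
    d_y_hat = y_hat - y
    d_w2 = d_y_hat * h
    d_b2 = d_y_hat
    d_h = d_y_hat * w2
    d_z = d_h * h * (1.0 - h)     # sigmoid'(z) = h * (1 - h)
    d_w1 = d_z * x
    d_b1 = d_z

    # Gradient Descent step: nudge each weight against its gradient.
    new_params = (w1 - lr * d_w1, b1 - lr * d_b1,
                  w2 - lr * d_w2, b2 - lr * d_b2)
    return new_params, loss

params = (0.5, 0.0, -0.3, 0.1)    # arbitrary starting weights (assumption)
losses = []
for epoch in range(50):
    params, loss = train_step(params, x=1.5, y=2.0)
    losses.append(loss)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
```

Each call repeats the three phases from the paragraph: forward pass, reverse pass, weight update. The loss recorded at the last epoch should be far smaller than at the first.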


<img src="../img/derivatives_activation.png" width="80%">

*Derivatives: rate of change*
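
To make "rate of change" concrete, the following sketch estimates derivatives with a central finite difference and compares the result with the known closed form for the sigmoid. The helper names (`numerical_derivative`, `step`, `relu`) and the sample points are chosen for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # closed form: sigmoid'(z) = s * (1 - s)

def relu(z):
    return max(0.0, z)

def step(z):
    return 1.0 if z >= 0 else 0.0

def numerical_derivative(f, z, eps=1e-6):
    """Central finite difference: change in f per unit change in z."""
    return (f(z + eps) - f(z - eps)) / (2 * eps)

for z in (-2.0, 0.5, 3.0):
    print(
        f"z={z:+.1f}",
        f"sigmoid'~{numerical_derivative(sigmoid, z):.4f}",
        f"(exact {sigmoid_derivative(z):.4f})",
        f"relu'~{numerical_derivative(relu, z):.1f}",
        f"step'~{numerical_derivative(step, z):.1f}",
    )
```

Away from zero, the step function's derivative is 0 everywhere, so it gives Gradient Descent no signal to follow, while the sigmoid and ReLU have nonzero slopes over part of their input range.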


**Key Differences Between Perceptron and Neuron**

| Feature                 | Perceptron                                      | Neuron in MLP                         |
|-------------------------|-------------------------------------------------|---------------------------------------|
| **Activation Function** | Step function                                   | Nonlinear (e.g., ReLU, sigmoid)       |
| **Output**              | Binary (0 or 1)                                 | Continuous values                     |
| **Usage**               | Single-layer models (linearly separable tasks)  | Multilayer networks (nonlinear tasks) |
| **Problem Solving**     | Only linearly separable problems                | Handles nonlinear problems            |
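
As an illustration of the last row, the sketch below contrasts a single perceptron, which can represent a linearly separable function such as AND, with a tiny two-neuron ReLU hidden layer that computes XOR, the classic example of a problem that is not linearly separable. All weights here are hand-picked for the example rather than learned, and the function names are hypothetical.

```python
def step(z):
    return 1 if z >= 0 else 0

def relu(z):
    return max(0.0, z)

def perceptron(x1, x2, w1, w2, b):
    """Single perceptron: weighted sum followed by a step function."""
    return step(w1 * x1 + w2 * x2 + b)

def xor_mlp(x1, x2):
    """Two ReLU hidden neurons plus a linear output, with hand-picked weights."""
    h1 = relu(x1 + x2)            # counts how many inputs are on
    h2 = relu(x1 + x2 - 1.0)      # fires only when both inputs are on
    return h1 - 2.0 * h2          # cancels the "both on" case

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# AND is linearly separable, so one perceptron is enough.
and_outputs = [perceptron(x1, x2, 1.0, 1.0, -1.5) for x1, x2 in inputs]
print(and_outputs)   # [0, 0, 0, 1]

# XOR needs the hidden layer.
xor_outputs = [xor_mlp(x1, x2) for x1, x2 in inputs]
print(xor_outputs)   # [0.0, 1.0, 1.0, 0.0]
```

No choice of `w1`, `w2`, and `b` lets the single perceptron reproduce the XOR outputs, because no straight line separates `(0, 1)` and `(1, 0)` from `(0, 0)` and `(1, 1)`.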


## Regression MLPs

| Hyperparameter        | Typical Value                                                                 |
|-----------------------|------------------------------------------------------------------------------|
| # input neurons       | One per input feature (e.g., 28 x 28 = 784 for MNIST)                       |
| # hidden layers       | Depends on the problem. Typically 1 to 5.                                   |
| # neurons per hidden layer | Depends on the problem. Typically 10 to 100.                            |
| # output neurons      | One per prediction dimension (e.g., 2 outputs to predict 2 values)          |
| Hidden activation     | ReLU (or SELU)                                                               |
| Output activation     | None, or ReLU/Softplus (if positive outputs), or Logistic (0 to 1) / Tanh (-1 to 1) (if bounded outputs) |
| Loss function         | MSE, or MAE/Huber (if the training set contains outliers)                    |
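
To show why the loss row recommends MAE or Huber when outliers are present, here is a small plain-Python sketch of the three losses on made-up values where one prediction is off by 10. The data and the `delta=1.0` threshold are assumptions for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error: penalizes large errors quadratically."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: penalizes all errors linearly."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def huber(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for errors up to delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 14.0]    # the last prediction is an outlier (error of 10)

print(mse(y_true, y_pred))     # 25.125
print(mae(y_true, y_pred))     # 2.75
print(huber(y_true, y_pred))   # 2.4375
```

The single outlier dominates the MSE, while MAE and Huber grow only linearly with it. Keras ships its own implementations of these losses (for example `keras.losses.Huber`), so the functions above are only meant to show the arithmetic.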

## Classification MLPs

| Hyperparameter            | Binary classification | Multilabel binary classification | Multiclass classification |
|---------------------------|-----------------------|-----------------------------------|---------------------------|
| Input and hidden layers   | Same as regression   | Same as regression               | Same as regression       |
| # output neurons          | 1                   | 1 per label                      | 1 per class              |
| Output layer activation   | Logistic            | Logistic                          | Softmax                  |
| Loss function             | Cross-entropy         | Cross-entropy                     | Cross-entropy             |
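
The difference between the logistic and softmax output activations can be sketched in plain Python. The logits below are made up, and the helper names are hypothetical; the point is that softmax yields one probability distribution over mutually exclusive classes, while the logistic function scores each label independently.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def binary_cross_entropy(y_true, p):
    """Cross-entropy for one logistic output; y_true is 0 or 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def categorical_cross_entropy(class_index, probs):
    """Cross-entropy for a softmax output; class_index is the true class."""
    return -math.log(probs[class_index])

logits = [2.0, 1.0, 0.1]                     # raw scores from 3 output neurons

# Multiclass: softmax probabilities compete and sum to 1.
probs = softmax(logits)
print([round(p, 3) for p in probs], round(sum(probs), 6))

# Multilabel: each logistic output is an independent probability.
label_probs = [sigmoid(z) for z in logits]
print([round(p, 3) for p in label_probs])

# The loss is small when the true class gets a high probability.
print(categorical_cross_entropy(0, probs), categorical_cross_entropy(2, probs))
```

Because the logistic outputs are independent, they can sum to more than 1, which is what allows several labels to be active for the same instance.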



## Simple image classifier with Keras

```python
import tensorflow as tf
from tensorflow import keras

tf.__version__     # not displayed: a notebook cell only echoes its last expression
keras.__version__  # displayed below
```

```
'3.6.0'
```

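```python
import ssl

# Fashion MNIST has 10 classes, like MNIST, but the images show clothing items.
fashion_mnist = keras.datasets.fashion_mnist

# Workaround for SSL certificate errors during the download. This disables
# HTTPS certificate verification for the whole process, so use it with care.
ssl._create_default_https_context = ssl._create_unverified_context

(X_train_full, y_train_full), (x_test, y_test) = fashion_mnist.load_data()
```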