# Training Neural Networks with Keras - Basic Concepts

## 2. Basic Structure of a Neural Network
In Keras, a neural network model is defined using layers that transform input data into desired outputs. 
Layers are the basic building blocks of neural networks in Keras. A layer consists of a tensor-in, tensor-out computation function and some state, held in TensorFlow variables (the layer's weights).

Each layer performs a specific mathematical operation and passes the processed data to the next layer. 


- **Layers**: Building blocks of neural networks.
- **Input Layer**: Receives input data.
- **Hidden Layers**: Perform transformations on data.
- **Output Layer**: Produces final predictions.
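To make the "computation plus state" idea concrete, here is a minimal sketch using a single `Dense` layer. The batch size and feature counts are arbitrary values chosen for illustration, not taken from the text above.

```python
import tensorflow as tf

# A Dense layer maps a tensor of shape (batch, 3) to one of shape (batch, 4).
layer = tf.keras.layers.Dense(4, activation="relu")

x = tf.ones((2, 3))   # illustrative batch: 2 samples, 3 features each
y = layer(x)          # the first call creates the layer's weights

# The layer's state: a kernel matrix and a bias vector.
kernel, bias = layer.weights
print(y.shape)        # shape (2, 4)
print(kernel.shape)   # shape (3, 4)
print(bias.shape)     # shape (4,)
```

The output tensor is the computation; the kernel and bias are the state that training will later adjust.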


### What is a Tensor in Keras?

A **tensor** in Keras is a multi-dimensional array used to represent data. Tensors can have various shapes and data types, which makes them the basic data structure of deep learning models. They are used to represent inputs, outputs, weights, and activations within neural networks.

Tensors are similar to NumPy arrays, but deep learning frameworks like TensorFlow and PyTorch can run tensor operations in parallel on GPUs, which makes them efficient for the large-scale computations in neural networks.

#### Key Points:
1. **Multi-Dimensional Array**: A tensor can be a scalar (0D), vector (1D), matrix (2D), or higher-dimensional array (3D, 4D, etc.).
2. **Shape**: Defined by the number of dimensions and the size of each dimension. For example:
   - `Scalar`: Shape `()`
   - `Vector`: Shape `(3,)`
   - `Matrix`: Shape `(3, 4)`
   - `4D Tensor`: Shape `(batch_size, height, width, channels)`
3. **Data Types**: Tensors can have different data types, such as `float32`, `int32`, etc.
4. **Manipulation**: Tensors support various operations like addition, multiplication, reshaping, and slicing.

#### Tensor Example in Keras:
```python
import tensorflow as tf

# Creating a 2D tensor (Matrix)
tensor_example = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(tensor_example)


# Output:
# <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
# array([[1., 2.],
#        [3., 4.]], dtype=float32)>
```
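The key points above (shape, data type, manipulation, and the relationship to NumPy) can be sketched as follows. The specific values and the image batch dimensions are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

# Tensors of different ranks
scalar = tf.constant(3.0)            # 0D, shape ()
vector = tf.constant([1, 2, 3])      # 1D, shape (3,), integer dtype
matrix = tf.zeros((3, 4))            # 2D, shape (3, 4), float32 by default
images = tf.zeros((8, 28, 28, 3))    # 4D: (batch_size, height, width, channels)

# Common manipulations
m = tf.constant([[1.0, 2.0], [3.0, 4.0]])
shifted = m + 1.0                    # element-wise addition
product = tf.matmul(m, m)            # matrix multiplication
flat = tf.reshape(m, (4,))           # reshaping
first_row = m[0]                     # slicing

# Converting between tensors and NumPy arrays
as_numpy = product.numpy()
back_to_tensor = tf.constant(np.arange(6).reshape(2, 3))

print(scalar.dtype, vector.dtype)
print(as_numpy)
```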

## 3. Defining a Simple Neural Network in Keras

There are multiple ways to build a model in Keras depending on the complexity and structure of the neural network.

### 3.1 Sequential Model

The **Sequential** model is a linear stack of layers, where each layer has exactly one input and one output. It is used for simple feed-forward architectures without branching or skip connections.

**Example: Simple Feed-Forward Network**
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Number of input features (placeholder value; set it to match your data)
input_dim = 20

# Define a sequential model
model = Sequential([
    Dense(32, activation='relu', input_shape=(input_dim,)),  # Input and Hidden Layer 1
    Dense(16, activation='relu'),                            # Hidden Layer 2
    Dense(1, activation='sigmoid')                           # Output Layer
])
```
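One way to check a Sequential model is to compare its parameter count with a hand calculation: each `Dense` layer has `inputs * units` weights plus `units` biases. The sketch below assumes 20 input features (a made-up value) and declares the input with `tf.keras.Input`, so the model is built as soon as it is defined.

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense

input_dim = 20  # assumed number of input features, for illustration only

model = tf.keras.Sequential([
    tf.keras.Input(shape=(input_dim,)),
    Dense(32, activation="relu"),
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid"),
])

# Hand calculation: weights + biases for each Dense layer
expected = (input_dim * 32 + 32) + (32 * 16 + 16) + (16 * 1 + 1)

model.summary()
print(model.count_params(), expected)
```

With these sizes the three layers contribute 672, 528, and 17 parameters respectively.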

### 3.2 Functional API

The **Functional API** in Keras is more flexible than the Sequential model, allowing for complex architectures such as multi-input, multi-output models, or models with shared layers. Instead of stacking layers linearly, the Functional API enables defining models by connecting layers like a graph, which provides greater control over the network structure.

- **Define Input Layer**: Start by defining the input layer using `Input()`.
- **Connect Layers**: Use layer instances (e.g., `Dense`) and connect them using function calls.
- **Create the Model**: Pass the input and output layers to the `Model()` class.

**Example:**
```python
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

# Number of input features (placeholder value; set it to match your data)
input_dim = 20

# Define the input layer
inputs = Input(shape=(input_dim,))

# Define the layers
x1 = Dense(32, activation='relu')(inputs)   # Hidden Layer 1
x2 = Dense(16, activation='relu')(x1)       # Hidden Layer 2
outputs = Dense(1, activation='sigmoid')(x2) # Output Layer

# Create the model using the Functional API
model = Model(inputs=inputs, outputs=outputs)
```
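The example above is still a linear chain, so it does not show what the Functional API adds. As a hedged illustration of the multi-input case mentioned earlier, the sketch below merges two input branches with `Concatenate`. The input names and sizes (`features_a` with 8 values, `features_b` with 4) are hypothetical.

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model

# Two separate inputs (hypothetical names and sizes)
input_a = Input(shape=(8,), name="features_a")
input_b = Input(shape=(4,), name="features_b")

# Each input gets its own branch
branch_a = Dense(16, activation="relu")(input_a)
branch_b = Dense(16, activation="relu")(input_b)

# Merge the branches and produce a single output
merged = Concatenate()([branch_a, branch_b])
outputs = Dense(1, activation="sigmoid")(merged)

model = Model(inputs=[input_a, input_b], outputs=outputs)

# Call the model on a dummy batch of 5 samples per input
preds = model([tf.zeros((5, 8)), tf.zeros((5, 4))])
print(preds.shape)  # shape (5, 1)
```

A Sequential model cannot express this graph, because the `Concatenate` layer takes two inputs.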

### 3.3 Model Subclassing

**Model Subclassing** provides maximum flexibility and customization for building neural networks in Keras. It involves creating a custom class that inherits from the `tf.keras.Model` class and defining the network layers and their forward pass within the class. This approach is particularly useful when building complex models that require dynamic behavior or custom training loops.

- **Create a Custom Class**: Inherit from `tf.keras.Model` and define layers in the `__init__` method.
- **Define the Forward Pass**: Implement the `call()` method, which specifies how the data flows through the layers.

**Example:**
```python
import tensorflow as tf
from tensorflow.keras.layers import Dense

# Define a custom model by subclassing tf.keras.Model
class CustomModel(tf.keras.Model):
    def __init__(self):
        super(CustomModel, self).__init__()
        self.dense1 = Dense(32, activation='relu')   # Hidden Layer 1
        self.dense2 = Dense(16, activation='relu')   # Hidden Layer 2
        self.out = Dense(1, activation='sigmoid')    # Output Layer

    def call(self, inputs):
        x = self.dense1(inputs)
        x = self.dense2(x)
        return self.out(x)

# Instantiate the custom model (its weights are created the first time it is called on data)
model = CustomModel()
```

### 3.4 Popular Artificial Neural Network Types Supported by Keras

1. **Feed-Forward Neural Network (FFNN)**  
   - **Keras Class**: `tensorflow.keras.layers.Dense`
   - Basic architecture where data flows in one direction without cycles.
   - Suitable for general-purpose tasks like tabular data classification and regression.

2. **Convolutional Neural Network (CNN)**  
   - **Keras Class**: `tensorflow.keras.layers.Conv2D`, `tensorflow.keras.layers.MaxPooling2D`
   - Designed for image-related tasks like image classification, object detection, and segmentation.
   - Uses convolutional and pooling layers to detect spatial patterns.

   <a href="./images/Convolutional-Neural-Networks.webp"><img src='./images/Convolutional-Neural-Networks.webp' style=height:12em></a><br>
   *Image source: https://learnopencv.com/understanding-convolutional-neural-networks-cnn/*
   
3. **Recurrent Neural Network (RNN)**  
   - **Keras Class**: `tensorflow.keras.layers.SimpleRNN`
   - Designed for sequential data like time series, text, and audio.
   - Capable of maintaining state and capturing temporal dependencies.

   <a href="./images/FFNN.jpg"><img src='./images/FFNN.jpg' style=height:12em></a><br>
   *Image source: https://learnopencv.com/understanding-feedforward-neural-networks/*

4. **Long Short-Term Memory (LSTM)**  
   - **Keras Class**: `tensorflow.keras.layers.LSTM`
   - An advanced type of RNN that mitigates the vanishing gradient problem through gated memory cells.
   - Effective for learning long-term dependencies in sequences.

5. **Gated Recurrent Unit (GRU)**  
   - **Keras Class**: `tensorflow.keras.layers.GRU`
   - A simplified variant of the LSTM with fewer parameters.
   - Often faster to train, with comparable performance on many sequential tasks.

6. **Transformer Networks**  
   - **Keras Class**: `tensorflow.keras.layers.MultiHeadAttention` (combined with layers such as `tensorflow.keras.layers.LayerNormalization` and `Dense` to form a Transformer block)
   - Based on self-attention mechanisms.
   - Highly effective for NLP tasks such as machine translation and text summarization.
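As a rough illustration of how two of these layer families are assembled, the sketch below builds a tiny CNN and a tiny LSTM model and checks their output shapes on dummy data. The input shapes (28x28 grayscale images, sequences of 12 steps with 5 features) and layer sizes are assumptions made for this example only.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Small CNN for 28x28 single-channel images, 10 output classes (assumed sizes)
cnn = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(8, kernel_size=3, activation="relu"),  # -> (26, 26, 8)
    layers.MaxPooling2D(pool_size=2),                    # -> (13, 13, 8)
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])

# Small recurrent model for sequences of 12 time steps with 5 features each
rnn = tf.keras.Sequential([
    tf.keras.Input(shape=(12, 5)),
    layers.LSTM(16),   # returns the final hidden state, shape (batch, 16)
    layers.Dense(1),
])

cnn_out = cnn(tf.zeros((2, 28, 28, 1)))
rnn_out = rnn(tf.zeros((2, 12, 5)))
print(cnn_out.shape, rnn_out.shape)  # shapes (2, 10) and (2, 1)
```

Swapping `layers.LSTM` for `layers.GRU` or `layers.SimpleRNN` leaves the rest of the model unchanged.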



## 4. Compilation of the Model

Model compilation is a crucial step that defines how the model learns and optimizes its weights. During compilation, the optimizer, loss function, and evaluation metrics are specified. These parameters determine how the model will be trained and evaluated.

### Key Components:
1. **Optimizer**:
   - Controls how the model updates its weights during training.
   - Common optimizers: `adam`, `sgd`, `rmsprop`.
   - Example: `optimizer='adam'`

2. **Loss Function**:
   - Measures the difference between the true labels and model predictions.
   - For classification: `binary_crossentropy`, `categorical_crossentropy`.
   - For regression: `mean_squared_error`.
   - Example: `loss='binary_crossentropy'`

3. **Metrics**:
   - Used to evaluate the model's performance during training and testing.
   - Common metrics: `accuracy`, precision (`tf.keras.metrics.Precision`), recall (`tf.keras.metrics.Recall`).
   - Example: `metrics=['accuracy']`
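String shortcuts such as `'adam'` use default settings. When you need to configure a component, for example to set the learning rate, you can pass objects instead. The following is a minimal sketch; the tiny model and the learning rate of `1e-3` are illustrative assumptions.

```python
import tensorflow as tf

# Tiny placeholder model so that compile() has something to configure
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # explicit learning rate
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[
        tf.keras.metrics.BinaryAccuracy(name="accuracy"),
        tf.keras.metrics.Precision(name="precision"),
        tf.keras.metrics.Recall(name="recall"),
    ],
)
```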

### Example: Model Compilation
```python
# Compile the model with optimizer, loss function, and metrics
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```

**Explanation**:

`optimizer='adam'`: Uses the Adam optimizer, which is a popular choice due to its adaptive learning rate and efficient performance.

`loss='binary_crossentropy'`: Specifies the loss function for binary classification tasks.

`metrics=['accuracy']`: Sets the evaluation metric to track the accuracy of the model during training.
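To see what these two quantities measure, the sketch below computes binary cross-entropy and accuracy by hand with NumPy for four made-up predictions. It illustrates the formulas only and is not Keras's internal implementation (which, among other things, clips probabilities for numerical stability).

```python
import numpy as np

# Made-up labels and predicted probabilities
y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.9, 0.1, 0.6, 0.4])

# Binary cross-entropy: mean of -[y*log(p) + (1-y)*log(1-p)]
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Accuracy: fraction of predictions on the correct side of 0.5
accuracy = np.mean((y_pred > 0.5) == (y_true == 1.0))

print(round(float(bce), 4))  # about 0.3081
print(float(accuracy))       # 1.0
```

All four predictions are on the correct side of the threshold, so accuracy is 1.0, yet the loss is not zero because the less confident predictions (0.6 and 0.4) are still penalized.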


## 5. Training the Model

Training a model in Keras is done using the `fit` method. This method performs forward and backward passes through the network, updates the model weights based on the loss function, and monitors performance metrics over multiple epochs. During training, you can also specify validation data to track the model’s progress on data it is not trained on.

### Key Parameters of the `fit` Method:
1. **x**: Input data (e.g., training features).
2. **y**: Target labels corresponding to the input data.
3. **epochs**: Number of complete passes through the entire training dataset.
4. **batch_size**: Number of samples per gradient update. Smaller batches use less memory but may result in less stable updates.
5. **validation_split**: Fraction of training data to be used for validation (e.g., `0.2` for 20% validation data).
6. **callbacks**: List of callback functions to monitor the training (e.g., early stopping, model checkpointing).
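The interaction between `validation_split`, `batch_size`, and the number of weight updates per epoch is simple arithmetic. The sketch below works it through for a hypothetical dataset of 1000 samples.

```python
import math

num_samples = 1000       # hypothetical dataset size
validation_split = 0.2
batch_size = 32

# Samples left for training after the validation split
num_train = math.floor(num_samples * (1 - validation_split))
num_val = num_samples - num_train

# Gradient updates (batches) per epoch; a final partial batch still counts
steps_per_epoch = math.ceil(num_train / batch_size)

print(num_train, num_val, steps_per_epoch)  # 800 200 25
```

Over 20 epochs, this configuration would perform 20 * 25 = 500 gradient updates.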

### Example: Training a Simple Model
```python
# Train the model using the fit method
history = model.fit(x_train, y_train, 
                    epochs=20,                 # Number of epochs to train the model
                    batch_size=32,             # Number of samples per batch
                    validation_split=0.2)      # Use 20% of the data for validation

```

*Explanation*:

`epochs=20`: The model will train for 20 complete passes over the training data.

`batch_size=32`: Each weight update is computed from a batch of 32 samples.

`validation_split=0.2`: Keras will reserve 20% of the training data for validation. The split is taken from the last samples of `x_train` and `y_train`, before any shuffling.
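The following end-to-end sketch runs `fit` on synthetic data and adds an `EarlyStopping` callback, one of the callbacks mentioned in the parameter list. The data, the model size, and the `patience` value are assumptions chosen so the example runs quickly.

```python
import numpy as np
import tensorflow as tf

# Synthetic binary classification data (for illustration only)
rng = np.random.default_rng(0)
x_train = rng.normal(size=(200, 10)).astype("float32")
y_train = (x_train.sum(axis=1) > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop when validation loss has not improved for 3 epochs
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True
)

history = model.fit(
    x_train, y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.2,
    callbacks=[early_stop],
    verbose=0,
)

# history.history maps each metric name to a list with one value per epoch
print(sorted(history.history.keys()))
print(len(history.history["loss"]), "epochs run")
```

The returned `history` object is the usual starting point for plotting training and validation curves.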


## 6. Evaluating the Model

After training, the model needs to be evaluated to assess its performance on unseen test data. In Keras, this is done using the `evaluate` method, which computes the loss and other metrics defined during model compilation.

### Key Points:
- **Purpose**: Measure how well the model generalizes to new data.
- **Output**: Returns the loss value and any other specified metrics, such as accuracy.

### Parameters of the `evaluate` Method:
1. **x**: Test features or input data (e.g., `x_test`).
2. **y**: Test labels corresponding to the input data (e.g., `y_test`).
3. **batch_size**: Number of samples per batch. Defaults to 32.
4. **verbose**: Verbosity mode (0 = silent, 1 = progress bar, 2 = single line).

### Example: Evaluating a Model on Test Data
```python
# Evaluate the model on test data
test_loss, test_accuracy = model.evaluate(x_test, y_test, batch_size=32, verbose=1)
print(f"Test Loss: {test_loss}")
print(f"Test Accuracy: {test_accuracy}")
```

*Explanation*:

`x_test, y_test`: Test dataset used for evaluation.

`test_loss`: The loss value calculated on the test set.

`test_accuracy`: The accuracy of the model on the test set.
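When a model is compiled with several metrics, unpacking the returned list by position becomes error-prone. A hedged alternative is `return_dict=True`, which returns the results keyed by name. The sketch below uses synthetic data and an untrained model, so the printed values themselves are not meaningful.

```python
import numpy as np
import tensorflow as tf

# Synthetic test data (for illustration only)
rng = np.random.default_rng(0)
x_test = rng.normal(size=(50, 10)).astype("float32")
y_test = (x_test.sum(axis=1) > 0).astype("float32")

# Untrained placeholder model, compiled so that evaluate() can run
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Results are returned as a dictionary instead of a positional list
results = model.evaluate(x_test, y_test, verbose=0, return_dict=True)
print(results)
```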


## 7. Predicting with the Model

Once the model is trained and evaluated, it can be used to make predictions on new, unseen data. This is done using the `predict` method in Keras, which returns the predicted values for the input data.

### Key Points:
- **Purpose**: To generate predictions on new data (e.g., class labels, probabilities, or regression values).
- **Output**: Returns an array of predicted values based on the model's learned weights.

### Parameters of the `predict` Method:
1. **x**: Input data for which predictions are to be made (e.g., `x_new`).
2. **batch_size**: Number of samples per batch. Defaults to 32.
3. **verbose**: Verbosity mode (0 = silent, 1 = progress bar).

### Example: Making Predictions on New Data
```python
# Predict on new input data
predictions = model.predict(x_new, batch_size=32, verbose=1)
print(predictions)
```

*Explanation*:

`x_new`: New data samples for which predictions are to be made.

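`predictions`: Array of predicted values (e.g., class probabilities or regression values).

For a multi-class classifier with a softmax output layer, `predict` returns one probability per class, and `argmax` gives the predicted class label. For a binary classifier with a single sigmoid output, such as the models defined earlier, `predict` returns one probability per sample, so compare it against a threshold instead (for example, `predictions > 0.5`).

```python
# Multi-class (softmax) output: pick the class with the highest probability
predicted_classes = predictions.argmax(axis=-1)
print(predicted_classes)
```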
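The difference between the two classifier cases can be sketched with NumPy arrays standing in for the output of `predict`. The probability values below are made up for illustration.

```python
import numpy as np

# Stand-in for predict() output from a single sigmoid unit: shape (n_samples, 1)
binary_probs = np.array([[0.91], [0.12], [0.50], [0.67]])
binary_labels = (binary_probs > 0.5).astype(int).ravel()
print(binary_labels)   # [1 0 0 1]

# Stand-in for predict() output from a 3-class softmax layer: shape (n_samples, 3)
multi_probs = np.array([[0.1, 0.7, 0.2],
                        [0.8, 0.1, 0.1]])
multi_labels = multi_probs.argmax(axis=-1)
print(multi_labels)    # [1 0]

# argmax on a single-column output always returns 0, so it is the wrong tool there
print(binary_probs.argmax(axis=-1))  # [0 0 0 0]
```

The 0.5 threshold is a common default, not a requirement; it can be tuned when false positives and false negatives have different costs.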

For a regression model, `predict` returns the predicted numerical values directly.