# 3.1 Anatomy of a Neural Network

This notebook provides an overview of the fundamental components of a neural network: layers, models, loss functions, and optimizers.

## 1. Layers: The Building Blocks of Deep Learning

Neural networks are composed of layers. Each layer performs a transformation on the input data and passes it to the next layer.

*   **Input Layer:** The first layer that receives the raw input data.
*   **Hidden Layers:** Layers between the input and output layers. Each one transforms the previous layer's output into an intermediate representation of the data.
*   **Output Layer:** The final layer that produces the network's output.

### Types of Layers

*   **Dense (Fully Connected) Layers:** Each neuron in a dense layer is connected to every neuron in the previous layer.
    *   Equation: $y = \sigma(Wx + b)$
        *   $x$: Input vector
        *   $W$: Weight matrix
        *   $b$: Bias vector
        *   $\sigma$: Activation function
*   **Convolutional Layers:** Primarily used for processing grid-like data such as images. They apply convolutional filters to the input.
*   **Recurrent Layers:** Designed for sequential data, such as text or time series. They have internal memory to process sequences.
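The dense-layer equation $y = \sigma(Wx + b)$ can be sketched directly in NumPy. This is a minimal illustration rather than how Keras implements `Dense`; the function name `dense_forward` and the example values for `x`, `W`, and `b` are assumptions chosen for readability, with ReLU standing in for $\sigma$.

```python
import numpy as np

def relu(z):
    """Element-wise ReLU, used here as the activation sigma."""
    return np.maximum(0.0, z)

def dense_forward(x, W, b, activation=relu):
    """Compute y = activation(W @ x + b) for a single input vector x."""
    return activation(W @ x + b)

# Hypothetical example values: 3 input features, 2 output units
x = np.array([1.0, 2.0, 3.0])
W = np.array([[0.5, -1.0,  1.0],
              [1.0,  0.0, -1.0]])
b = np.array([0.5, 1.0])

y = dense_forward(x, W, b)
print(y)  # [2. 0.]
```

The second unit's pre-activation is negative, so ReLU clips it to zero. Keras stores the weight matrix with shape `(inputs, units)` and multiplies a batch of row vectors by it, which is the transposed form of the same computation.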

### Activation Functions

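Activation functions introduce non-linearity into the network, enabling it to learn complex patterns. Without them, a stack of dense layers would collapse into a single affine transformation, because a composition of affine maps is itself affine.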

*   **ReLU (Rectified Linear Unit):** $\text{ReLU}(x) = \max(0, x)$
*   **Sigmoid:** $\sigma(x) = \frac{1}{1 + e^{-x}}$
*   **Tanh (Hyperbolic Tangent):** $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$
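The three formulas above translate almost line for line into NumPy. The sketch below is only illustrative: the function names are assumptions, and in practice the built-in Keras activations would be used instead.

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """sigma(x) = 1 / (1 + e^{-x}), output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x}), output in (-1, 1)."""
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print("ReLU:   ", relu(x))
print("Sigmoid:", sigmoid(x))
print("Tanh:   ", tanh(x))
```

For large inputs the explicit tanh formula can overflow, so `np.tanh` is the safer choice outside of a demonstration.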

### Code Example: Dense Layer

This code demonstrates a simple dense layer using TensorFlow/Keras.

In [None]:
import tensorflow as tf
from tensorflow.keras.layers import Dense
import numpy as np

# Create a simple Dense layer; its weights are built on the first call,
# once the number of input features (5) is known
dense_layer = Dense(units=10, activation='relu')

# Generate some dummy input data
input_data = np.random.rand(1, 5) # Batch size of 1, 5 features

# Pass the input data through the layer
output_data = dense_layer(input_data)

print("Input data shape:", input_data.shape)
print("Output data shape:", output_data.shape)
print("Output data:\n", output_data.numpy())

Input data shape: (1, 5)
Output data shape: (1, 10)
Output data:
 [[0.         0.38461566 0.         0.10109182 0.3938445  0.
  0.         0.09975459 0.         0.15508273]]




## 2. Models: Networks of Layers

A neural network model is formed by stacking different layers together in a specific architecture.

*   **Sequential Models:** Layers are stacked in a linear fashion, where the output of one layer serves as the input to the next.
*   **Functional API Models:** Allows for more flexible architectures with multiple inputs, multiple outputs, and shared layers.
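To make the contrast with a linear stack concrete, here is a minimal Functional API sketch with two inputs merged into one output. The input names, branch sizes, and the variable `functional_model` are hypothetical and chosen only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Two hypothetical inputs: a 784-feature vector and a 10-feature side input
main_input = keras.Input(shape=(784,), name="main_features")
side_input = keras.Input(shape=(10,), name="side_features")

# Each input gets its own dense branch
main_branch = layers.Dense(64, activation="relu")(main_input)
side_branch = layers.Dense(8, activation="relu")(side_input)

# Merge the branches and produce class probabilities
merged = layers.Concatenate()([main_branch, side_branch])
outputs = layers.Dense(10, activation="softmax", name="class_probs")(merged)

functional_model = keras.Model(inputs=[main_input, side_input], outputs=outputs)
functional_model.summary()
```

A `Sequential` model cannot express this topology, because the two branches run in parallel before being concatenated.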

### Code Example: Sequential Model

This code builds a simple sequential model: two hidden ReLU layers of 64 units each, followed by a 10-unit softmax output layer.

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# Create a Sequential model
model = Sequential([
    Input(shape=(784,)), # Declares the input shape: 784 features per sample
    Dense(units=64, activation='relu'), # First hidden layer
    Dense(units=64, activation='relu'), # Second hidden layer
    Dense(units=10, activation='softmax') # Output layer for classification (e.g., 10 classes)
])

# Display the model summary
model.summary()

## 3. Loss Functions and Optimizers: Keys to Configuring the Learning Process

These components are crucial for training a neural network.

*   **Loss Function:** Measures the difference between the network's predictions and the actual target values. The goal of training is to minimize this loss.
    *   **Mean Squared Error (MSE):** Used for regression tasks. $MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$
    *   **Categorical Crossentropy:** Used for multi-class classification tasks where each target is a one-hot vector.
    *   **Binary Crossentropy:** Used for binary classification tasks.
*   **Optimizer:** An algorithm that updates the network's weights and biases based on the calculated loss to minimize it.
    *   **Stochastic Gradient Descent (SGD):** Updates weights in the direction opposite to the gradient of the loss function, estimated on a mini-batch of training samples.
    *   **Adam:** An adaptive optimization algorithm that uses estimates of first and second moments of the gradients.
    *   **RMSprop:** An adaptive method that divides each parameter's step by the square root of a running average of its recent squared gradients.
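The interplay between a loss function and an optimizer can be sketched in plain NumPy. The block below computes MSE and categorical crossentropy on small hypothetical arrays, then runs single-sample SGD on a one-weight model $\hat{y} = wx$. All data, the learning rate, and the function names are assumptions for illustration; Keras handles these steps internally.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average of squared differences."""
    return np.mean((y_true - y_pred) ** 2)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean crossentropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

# Hypothetical regression targets and predictions
mse_value = mse(np.array([1.0, 2.0, 3.0]), np.array([1.5, 2.0, 2.0]))

# Hypothetical 3-class targets (one-hot) and predicted probabilities
y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
y_prob = np.array([[0.1, 0.8, 0.1], [0.7, 0.2, 0.1]])
cce_value = categorical_crossentropy(y_true, y_prob)

# SGD on the model y_hat = w * x, where the true relationship is y = 2x
rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x
w, learning_rate = 0.0, 0.05
for _ in range(200):
    i = rng.integers(len(x))                # pick one random sample
    grad = -2.0 * x[i] * (y[i] - w * x[i])  # d/dw of (y_i - w * x_i)^2
    w -= learning_rate * grad               # step against the gradient

print("MSE:", mse_value)
print("Categorical crossentropy:", cce_value)
print("Learned w:", w)
```

Because the data here is noise-free, every single-sample step moves `w` closer to 2. With real data the individual steps are noisy and only their average points downhill.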

### Code Example: Compiling a Model

This code compiles the previously created sequential model with a loss function and an optimizer.

In [None]:
# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

print("Model compiled successfully!")

Model compiled successfully!
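Because the model is compiled with `categorical_crossentropy`, Keras expects one-hot encoded targets during training. If the labels are plain integers, they can either be converted first or paired with `sparse_categorical_crossentropy` instead. The sketch below shows the conversion in NumPy; the helper name `one_hot` and the example labels are hypothetical, and `tf.keras.utils.to_categorical` serves the same purpose.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Convert integer class labels into one-hot row vectors."""
    return np.eye(num_classes, dtype=np.float32)[labels]

labels = np.array([3, 0, 9])  # hypothetical integer class labels
targets = one_hot(labels, num_classes=10)

print(targets.shape)  # (3, 10)
print(targets[0])     # 1.0 at index 3, zeros elsewhere
```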


## Summary and Conclusion

This notebook covered the fundamental components of a neural network: layers, models, loss functions, and optimizers. We explored different types of layers, activation functions, and model architectures, as well as common loss functions and optimizers used for training.

Understanding these building blocks is essential for constructing and training effective neural networks. Natural next steps are specific layer types, advanced model architectures, and hyperparameter tuning.