## **Neural Network Assignment Questions**

### 1. **Describe the Basic Structure of a Feedforward Neural Network (FNN). What is the Purpose of the Activation Function?**

A **Feedforward Neural Network (FNN)** is the simplest type of artificial neural network. It consists of an input layer, one or more hidden layers, and an output layer. Data moves in one direction only: forward from the input layer, through the hidden layers, to the output layer, with no cycles or feedback connections.

#### **Structure**:
- **Input Layer**: The first layer of the network that receives the input data.
- **Hidden Layers**: Layers of neurons that process the input data. Each neuron in a hidden layer computes a weighted sum of its inputs (plus a bias term) and passes the result through an activation function.
- **Output Layer**: The final layer that provides the predicted output.
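The structure above can be sketched as a short forward pass. This is a minimal illustration, not a full implementation: the layer sizes, the random weights, the choice of ReLU, and the names `forward` and `params` are all assumptions made for the example.

```python
import numpy as np

def relu(z):
    """ReLU activation: keeps positive values, zeroes out the rest."""
    return np.maximum(0.0, z)

def forward(x, params):
    """Forward pass through a small feedforward network.

    params is a list of (W, b) pairs, one per layer (hypothetical layout).
    """
    a = x
    for W, b in params[:-1]:
        a = relu(W @ a + b)      # hidden layer: weighted sum, then activation
    W_out, b_out = params[-1]
    return W_out @ a + b_out     # output layer, left linear in this sketch

rng = np.random.default_rng(0)
# Assumed sizes: 3 inputs -> 4 hidden units -> 2 outputs
params = [
    (rng.normal(size=(4, 3)), np.zeros(4)),
    (rng.normal(size=(2, 4)), np.zeros(2)),
]
x = np.array([0.5, -1.0, 2.0])
y = forward(x, params)           # one prediction vector with 2 entries
```

Data only ever flows from `x` towards `y`; nothing computed in a later layer is fed back into an earlier one.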

#### **Purpose of the Activation Function**:
The **activation function** introduces non-linearity to the network, allowing it to learn complex patterns. Without a non-linear activation function, any stack of layers collapses into a single linear transformation, so the network would behave like a linear model no matter how many layers it has. Common choices include **ReLU**, **Sigmoid**, and **Tanh**; each applies a fixed non-linear function to the neuron's weighted sum to determine how strongly the neuron activates.
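A small numerical check can make the need for non-linearity concrete. The weights and input below are hand-picked for illustration (they are assumptions, not values from any trained model).

```python
import numpy as np

# Hand-picked weights and input, chosen so the arithmetic is easy to follow
W1 = np.array([[1.0, 0.0],
               [0.0, 1.0]])
W2 = np.array([[1.0, 1.0]])
x = np.array([1.0, -1.0])

# Two linear layers with no activation in between...
two_layer = W2 @ (W1 @ x)
# ...give the same result as one linear layer with the combined matrix W2 @ W1.
one_layer = (W2 @ W1) @ x

# With ReLU between the layers, the negative component is zeroed out first,
# so the result differs from what the combined linear map produces.
with_relu = W2 @ np.maximum(0.0, W1 @ x)

print(two_layer, one_layer, with_relu)  # [0.] [0.] [1.]
```

The first two results agree because composing linear maps yields another linear map; inserting ReLU breaks that equivalence, which is what lets additional layers add expressive power.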

---

### 2. **Explain the Role of Convolutional Layers in a CNN. Why Are Pooling Layers Commonly Used, and What Do They Achieve?**

#### **Convolutional Layers**:
In a **Convolutional Neural Network (CNN)**, **convolutional layers** are responsible for detecting local patterns in input data (such as images). These layers use **filters** (or kernels) to convolve with the input, producing feature maps that represent the presence of features like edges, textures, or more complex patterns as we move deeper into the network.

- **Role**: The primary role of convolutional layers is to extract local spatial features from the input. These layers allow the CNN to learn hierarchical representations, detecting simple features in lower layers (e.g., edges) and complex features (e.g., shapes or objects) in higher layers.
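To illustrate how a filter responds to a local pattern, here is a minimal sketch of a single-channel "valid" convolution. Like most deep learning libraries, it slides the kernel without flipping it (strictly speaking, a cross-correlation). The function name `conv2d_valid`, the toy image, and the hand-picked edge kernel are assumptions for the example; in a real CNN the kernel values are learned.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]   # local region under the kernel
            out[i, j] = np.sum(patch * kernel)
    return out

# Toy 4x4 image: dark left half (0), bright right half (1)
image = np.array([[0, 0, 1, 1]] * 4, dtype=float)
# Hand-picked kernel that responds to a dark-to-bright step along a row
kernel = np.array([[-1.0, 1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map)
# Each row is [0. 1. 0.]: the filter fires only where the vertical edge sits.
```

The same small kernel is reused at every position, which is why convolutional layers need far fewer parameters than a fully connected layer over the same input.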

#### **Pooling Layers**:
**Pooling layers** are used after convolutional layers to reduce the spatial dimensions (height and width) of the feature maps.

- **Why Used**: Pooling helps in reducing the computational load, controlling overfitting, and providing invariance to small translations of the input.
- **What They Achieve**: Pooling layers reduce the resolution of the feature maps, making the network less sensitive to small changes or distortions in the input data. The most common type is **max pooling**, which takes the maximum value in a sub-region of the feature map.
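Max pooling can be sketched in a few lines. The example below assumes a 2x2 window with stride 2 (a common but not universal choice), and the helper name `max_pool2d` and the sample feature map are made up for illustration.

```python
import numpy as np

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling with a size x size window (stride = size)."""
    h, w = fmap.shape
    h2, w2 = h // size, w // size
    trimmed = fmap[:h2 * size, :w2 * size]       # drop rows/cols that do not fit
    # Group pixels into (h2, size, w2, size) blocks and take the max per block
    return trimmed.reshape(h2, size, w2, size).max(axis=(1, 3))

fmap = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
], dtype=float)

pooled = max_pool2d(fmap)
print(pooled)
# [[4. 2.]
#  [2. 8.]]
```

The 4x4 map shrinks to 2x2, and moving a strong activation by one pixel inside its window would leave the pooled output unchanged, which is the source of the small-translation invariance.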

---

### 3. **What Is the Key Characteristic That Differentiates Recurrent Neural Networks (RNNs) From Other Neural Networks? How Does an RNN Handle Sequential Data?**

The key characteristic of **Recurrent Neural Networks (RNNs)** is their ability to process sequential data by maintaining an internal state (memory). Unlike traditional neural networks (like FNNs or CNNs), RNNs have loops in their architecture that allow information to persist over time.

#### **Handling Sequential Data**:
RNNs handle sequential data by maintaining a **hidden state** that updates at each time step based on the previous hidden state and the current input. This allows the network to remember information from earlier time steps, making it suitable for tasks involving sequences such as speech recognition, language modeling, or time series prediction.

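- At each time step, the hidden state is updated using the formula:  
  \[ h_t = f(W \cdot x_t + U \cdot h_{t-1}) \]
  Where:
  - \( h_t \) is the hidden state at time \( t \),
  - \( x_t \) is the input at time \( t \),
  - \( h_{t-1} \) is the previous hidden state,
  - \( W \) and \( U \) are the input-to-hidden and hidden-to-hidden weight matrices, shared across all time steps,
  - \( f \) is a non-linear activation function such as tanh (a bias term is usually added inside \( f \) as well).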
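The update rule can be unrolled over a sequence as a simple loop. This is a sketch under stated assumptions: \( f \) is taken to be tanh, the bias is omitted to match the formula as written, and the sizes, random weights, and the name `rnn_forward` are illustrative only.

```python
import numpy as np

def rnn_forward(xs, W, U, h0):
    """Apply h_t = tanh(W @ x_t + U @ h_{t-1}) over a sequence of inputs."""
    h = h0
    states = []
    for x_t in xs:                       # one iteration per time step
        h = np.tanh(W @ x_t + U @ h)     # new state depends on input AND old state
        states.append(h)
    return np.stack(states)              # shape: (seq_len, hidden_size)

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5   # assumed sizes
W = rng.normal(scale=0.5, size=(hidden_size, input_size))   # input-to-hidden
U = rng.normal(scale=0.5, size=(hidden_size, hidden_size))  # hidden-to-hidden
xs = rng.normal(size=(seq_len, input_size))

states = rnn_forward(xs, W, U, h0=np.zeros(hidden_size))
```

The same `W` and `U` are reused at every step; only the hidden state `h` changes, and it is the sole channel through which earlier inputs influence later outputs.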

---

### 4. **Discuss the Components of a Long Short-Term Memory (LSTM) Network. How Does It Address the Vanishing Gradient Problem?**

**Long Short-Term Memory (LSTM)** is a specialized type of RNN designed to address the **vanishing gradient problem** and improve the ability of the network to learn long-term dependencies.

#### **Components of an LSTM**:
1. **Forget Gate**: Determines what proportion of the previous memory should be forgotten.
2. **Input Gate**: Decides which new information should be added to the memory.
3. **Cell State**: Represents the long-term memory of the network, which is updated by the input and forget gates.
4. **Output Gate**: Determines what the next hidden state should be, based on the current cell state.

#### **Addressing the Vanishing Gradient Problem**:
The vanishing gradient problem occurs when gradients become extremely small during backpropagation, making it difficult for the network to learn long-term dependencies. LSTMs mitigate this issue through the **cell state** and the careful gating mechanism that allows gradients to flow more easily across time steps.

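- The **forget gate** allows LSTMs to retain important information over long sequences, while the **input gate** controls how much new information is written to the cell state and the **output gate** controls how much of the cell state is exposed as the hidden state.
- The cell state is updated additively: the previous cell state is scaled element-wise by the forget gate, and the gated new information is added to it. It therefore serves as a highway for information. Along this path the gradient is scaled only by the forget gate, rather than being repeatedly multiplied by a weight matrix and squashed by an activation function, so when the forget gate stays close to 1, gradients are preserved across long time horizons.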
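A single LSTM time step, in the standard formulation, can be sketched as follows. The stacked weight layout (forget, input, candidate, output), the sizes, and the name `lstm_step` are assumptions chosen for compactness; libraries differ in how they order and store these parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,),
    stacked in the (assumed) order: forget, input, candidate, output."""
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:H])          # forget gate: how much old memory to keep
    i = sigmoid(z[H:2 * H])      # input gate: how much new information to write
    g = np.tanh(z[2 * H:3 * H])  # candidate values for the cell state
    o = sigmoid(z[3 * H:4 * H])  # output gate: how much memory to expose
    c = f * c_prev + i * g       # additive cell-state update
    h = o * np.tanh(c)           # new hidden state
    return h, c, f

rng = np.random.default_rng(0)
D, H = 3, 4                      # assumed input and hidden sizes
W = rng.normal(scale=0.5, size=(4 * H, D))
U = rng.normal(scale=0.5, size=(4 * H, H))
b = np.zeros(4 * H)

h, c, f = lstm_step(rng.normal(size=D), np.zeros(H), np.ones(H), W, U, b)
```

On the direct path from `c_prev` to `c`, the derivative is simply `f` (element-wise), so a forget gate near 1 passes the gradient back almost unchanged.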

---

### 5. **Describe the Roles of the Generator and Discriminator in a Generative Adversarial Network (GAN). What Is the Training Objective for Each?**

In a **Generative Adversarial Network (GAN)**, there are two main components: the **generator** and the **discriminator**. These components are trained in opposition to each other, which leads to the generation of highly realistic synthetic data.

#### **Generator**:
- **Role**: The generator creates synthetic data (e.g., images) from random noise. Its goal is to generate data that is indistinguishable from real data.
- **Training Objective**: The generator tries to **fool** the discriminator into classifying fake data as real. It does this by minimizing the **generator loss**, which is low when the discriminator assigns a high "real" probability to generated samples.

#### **Discriminator**:
- **Role**: The discriminator classifies input data as either real or fake, distinguishing between genuine data and data generated by the generator.
- **Training Objective**: The discriminator tries to **correctly classify** real and fake data. It minimizes the **discriminator loss** (typically a binary cross-entropy), which is low when it assigns a high probability to real samples and a low probability to generated ones.

#### **Training Process**:
- The generator and discriminator are trained together in a two-player **minimax game**, usually in alternating update steps: the generator aims to produce better fake data, and the discriminator aims to become better at distinguishing fake from real data.
- The goal is for the generator to generate data that the discriminator cannot distinguish from real data, leading to the generator producing realistic outputs.
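The two objectives can be written down as loss functions over the discriminator's output probabilities. The sketch below assumes binary cross-entropy and the commonly used non-saturating generator loss (the generator is scored as if its samples carried the label "real"); the function names and the sample probabilities are illustrative, and no actual training loop is shown.

```python
import numpy as np

def bce(pred, target, eps=1e-12):
    """Binary cross-entropy between predicted probabilities and 0/1 targets."""
    pred = np.clip(pred, eps, 1.0 - eps)   # avoid log(0)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))

def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and generated samples near 0."""
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    """Non-saturating form: low when generated samples are scored as real."""
    return bce(d_fake, np.ones_like(d_fake))

# Made-up discriminator outputs (probability that a sample is real)
d_real = np.array([0.9, 0.8])
d_fake_caught = np.array([0.1, 0.2])    # discriminator spots the fakes
d_fake_fooling = np.array([0.9, 0.8])   # discriminator is fooled

g_when_caught = generator_loss(d_fake_caught)
g_when_fooling = generator_loss(d_fake_fooling)
d_when_winning = discriminator_loss(d_real, d_fake_caught)
d_when_fooled = discriminator_loss(d_real, d_fake_fooling)
```

The generator's loss falls as the discriminator's rises, which is the adversarial tension. If the discriminator outputs 0.5 for everything, meaning it cannot tell real from fake, its loss under this formulation is 2 ln 2 ≈ 1.386.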

