### **`model.ipynb` — Neural Network Architecture**

- *In this notebook, we define the architecture of our Feedforward Neural Network (FNN).*

- *The model consists of one input layer, two hidden layers (each using the ReLU activation function), and one output layer.*

- *This structure allows the network to learn non-linear relationships between the input words and the word that follows, which we use for next-word prediction on the Penn Treebank dataset.*



*Here, we import PyTorch and its neural network module (`torch.nn`), which provides the essential building blocks for defining and training deep learning models.*


In [1]:
import torch
import torch.nn as nn

*In this step, we define our model class using PyTorch’s `nn.Module`.*

*Each hidden layer applies a linear transformation followed by a ReLU activation. The output layer applies only a linear transformation and returns raw scores (logits), one per output class.*

*The `forward()` method defines how data flows through the layers — from input to output.*
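
*The data flow through the layers can also be written compactly. The symbols are notation introduced here for illustration only: $x$ is the input vector, $W_i$ and $b_i$ stand for the weights and bias of `fc1`, `fc2`, and `fc3`, and $s$ is the vector of raw output scores.*

$$
\begin{aligned}
h_1 &= \mathrm{ReLU}(W_1 x + b_1) \\
h_2 &= \mathrm{ReLU}(W_2 h_1 + b_2) \\
s &= W_3 h_2 + b_3
\end{aligned}
$$

*No activation is applied to $s$; it is what `forward()` returns.*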

---

### *Activation Functions Used*

* **Hidden Layers → ReLU (`max(0, z)`):**
  *We use ReLU in hidden layers because it mitigates the vanishing-gradient problem (its gradient is 1 for every positive input), produces sparse activations (negative inputs are set to 0), and is cheap to compute. In practice this tends to speed up training, especially as the model gets deeper.*

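* **Output Layer → Softmax (applied outside the model):**
  *Softmax converts the raw output scores (logits) into probabilities for each word in the vocabulary. The highest probability indicates the predicted next word. The `forward()` method in this notebook returns the raw logits and does not apply Softmax itself; it is applied afterwards, for example inside the loss function (`nn.CrossEntropyLoss` expects raw logits) or at prediction time.*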
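
*As a small illustration of the two functions, the sketch below applies ReLU and Softmax to a hand-picked score vector. The values in `z` are hypothetical and chosen only to make the effect easy to see; they do not come from the model.*

```python
import torch

# Hypothetical raw scores for a 4-word vocabulary (illustrative values only).
z = torch.tensor([-1.0, 0.0, 2.0, 0.5])

# ReLU computes max(0, z) element-wise: negative entries become 0.
relu_out = torch.relu(z)

# Softmax rescales the scores into a probability distribution that sums to 1.
probs = torch.softmax(z, dim=0)

# The prediction is the index with the largest probability.
predicted = torch.argmax(probs).item()

print(relu_out.tolist())  # [0.0, 0.0, 2.0, 0.5]
print(probs)              # four positive values that sum to 1
print(predicted)          # 2
```

*Softmax preserves the ordering of the scores, so the index of the largest logit is also the index of the largest probability.*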

---

In [2]:
class FeedforwardNN(nn.Module):
    """Simple feedforward neural network for next-word prediction"""
    def __init__(self, input_size, hidden_size1, hidden_size2, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size1)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size1, hidden_size2)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(hidden_size2, output_size)
    
    def forward(self, x):
        x = self.fc1(x)
        x = self.relu1(x)
        x = self.fc2(x)
        x = self.relu2(x)
        x = self.fc3(x)
        return x

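*This code creates an instance of the model using example dimensions for each layer. The values are placeholders: for next-word prediction, `output_size` would equal the vocabulary size so that the model produces one score per word.*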

In [3]:
model = FeedforwardNN(input_size=100, hidden_size1=64, hidden_size2=32, output_size=10)
print(model)

FeedforwardNN(
  (fc1): Linear(in_features=100, out_features=64, bias=True)
  (relu1): ReLU()
  (fc2): Linear(in_features=64, out_features=32, bias=True)
  (relu2): ReLU()
  (fc3): Linear(in_features=32, out_features=10, bias=True)
)
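
*To check that the layers fit together, one option is to pass a dummy batch through the model and inspect the output shape. This is a minimal sketch, not part of the original notebook: the class is repeated so the cell runs on its own, and the random `batch` is a hypothetical stand-in for real input features.*

```python
import torch
import torch.nn as nn


class FeedforwardNN(nn.Module):
    """Same architecture as above, repeated so this cell is self-contained."""
    def __init__(self, input_size, hidden_size1, hidden_size2, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size1)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size1, hidden_size2)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(hidden_size2, output_size)

    def forward(self, x):
        x = self.relu1(self.fc1(x))
        x = self.relu2(self.fc2(x))
        return self.fc3(x)  # raw logits, no Softmax applied here


model = FeedforwardNN(input_size=100, hidden_size1=64, hidden_size2=32, output_size=10)

# Hypothetical batch: 4 random vectors of length 100 (placeholder inputs).
batch = torch.randn(4, 100)

with torch.no_grad():                     # gradients are not needed for a shape check
    logits = model(batch)                 # shape (4, 10): one row of scores per input
    probs = torch.softmax(logits, dim=1)  # each row becomes a probability distribution
    predicted = probs.argmax(dim=1)       # index of the most likely class per row

# Trainable parameters: (100*64 + 64) + (64*32 + 32) + (32*10 + 10) = 8874
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(logits.shape)  # torch.Size([4, 10])
print(n_params)      # 8874
```

*The predicted indices are arbitrary at this point because the weights are randomly initialised; only the shapes and the parameter count are meaningful before training.*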
