# CS 4770 Natural Language Processing (Fall 2025)

## Assignment 1: PyTorch Warm-Up

In this assignment, you will use PyTorch to implement and train a simple neural network on the SST-2 dataset. It will help you get familiar with the PyTorch library and the basic concepts of neural networks.

In [2]:
import numpy as np
import torch

# Set random seeds
np.random.seed(99)
torch.manual_seed(99)

<torch._C.Generator at 0x210ba764590>

## Question 1: Python Lists and NumPy Arrays (5 pts)

TODO: Create a Python list of the first 12 cube numbers $(1, 8, 27, …, 1728)$.

Convert this list to a NumPy array and reshape it into a $3\times4$ matrix. Print out the matrix as a 2-dimensional array.

In [5]:
cube = [i ** 3 for i in range(1, 13)]
cube_arr = np.array(cube).reshape((3, 4))
cube_arr

array([[   1,    8,   27,   64],
       [ 125,  216,  343,  512],
       [ 729, 1000, 1331, 1728]])

## Question 2: PyTorch Tensors and Operations (11 pts)

Convert the NumPy array from Question 1 into a PyTorch float32 tensor and perform the operations below, using only PyTorch interfaces.

TODO: Subtract 5 from every element and print out the mean of all elements. (5 pts)

In [6]:
cube_tensor = torch.tensor(cube_arr, dtype=torch.float32)
cube_broadcasted = cube_tensor - 5
print(f"Mean of all elements: {torch.mean(cube_broadcasted).item()}")

Mean of all elements: 502.0
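
As a sanity check on the value above, the mean can be derived without PyTorch from the identity that the sum of the first $n$ cubes equals $(n(n+1)/2)^2$. The sketch below is an independent check, not part of the required solution; the variable names are illustrative.

```python
# Sum of the first n cubes equals (n(n+1)/2)^2 (Nicomachus's identity).
n = 12
cubes = [i ** 3 for i in range(1, n + 1)]

closed_form_sum = (n * (n + 1) // 2) ** 2
assert sum(cubes) == closed_form_sum  # both are 6084

# Subtracting 5 from every element lowers the mean by exactly 5.
mean_after_shift = closed_form_sum / n - 5
print(mean_after_shift)  # 502.0
```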


TODO: Create a new $4 \times 3$ tensor by transposing the original tensor. Print out the transposed tensor. (3 pts)

In [8]:
cube_T = cube_tensor.T
cube_T

tensor([[1.0000e+00, 1.2500e+02, 7.2900e+02],
        [8.0000e+00, 2.1600e+02, 1.0000e+03],
        [2.7000e+01, 3.4300e+02, 1.3310e+03],
        [6.4000e+01, 5.1200e+02, 1.7280e+03]])

TODO: Perform matrix multiplication between the original tensor and the transposed tensor. Print out the result. (3 pts)

In [9]:
cube_mult = cube_tensor @ cube_T
cube_mult

tensor([[4.8900e+03, 4.3882e+04, 1.5526e+05],
        [4.3882e+04, 4.4207e+05, 1.6484e+06],
        [1.5526e+05, 1.6484e+06, 6.2890e+06]])
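
The printed tensor shows only five significant digits. A minimal pure-Python sketch, using nested lists in place of the tensor, recovers the exact integer entries and confirms that the product of a matrix with its own transpose is symmetric. The names `rows` and `gram` are illustrative, not part of the assignment.

```python
cube = [i ** 3 for i in range(1, 13)]
rows = [cube[r * 4:(r + 1) * 4] for r in range(3)]  # 3x4 matrix as nested lists

# Entry (i, j) of A @ A.T is the dot product of row i and row j of A.
gram = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(3)]
        for i in range(3)]

for row in gram:
    print(row)

# Every entry is below 2**24, so float32 can represent each one exactly.
assert max(max(row) for row in gram) < 2 ** 24
```

The exact entries are 4890, 43882, 155258, 442074, 1648394 and 6288986, which agree with the rounded values in the tensor output.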

## Question 3: SST-2 Sentiment Classification (30 pts)

In this question, you will build a simple neural network to predict whether a sentence expresses positive or negative sentiment. We will use the SST-2 (Stanford Sentiment Treebank, Binary version) dataset. This dataset is widely used in NLP research and is available through the HuggingFace datasets library.

Install the HuggingFace `datasets` library.

In [None]:
# Installed via uv instead
# !pip install datasets

Load and preprocess the dataset. (You don't need to modify this cell.)

In [7]:
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from sklearn.feature_extraction.text import CountVectorizer
from datasets import load_dataset

np.random.seed(99)
torch.manual_seed(99)

def preprocess_text(text):
    """
    Function that preprocesses the string
    """
    preprocessed_text = text.lower()
    return preprocessed_text

# Load SST-2 dataset from HuggingFace
dataset = load_dataset("glue", "sst2")

# Extract train, validation and test sets
train_data = dataset["train"]
val_data = dataset["validation"]
test_data = dataset["test"]

# Process training data
train_contents = [preprocess_text(example["sentence"]) for example in train_data]
train_labels = [example["label"] for example in train_data]

# Process validation data
val_contents = [preprocess_text(example["sentence"]) for example in val_data]
val_labels = [example["label"] for example in val_data]

# Process test data (note: SST-2 test set doesn't have labels, using validation as test)
test_contents = [preprocess_text(example["sentence"]) for example in val_data]
test_labels = [example["label"] for example in val_data]

sentiments = ["Negative", "Positive"]

print(f"Train size: {len(train_contents)}, "
      f"\nVal size: {len(val_contents)}, "
      f"\nTest size: {len(test_contents)}")

# Show the first training sentence and its sentiment label
print("Sentence: ", train_data[0]["sentence"])
print("Sentiment: ", sentiments[train_labels[0]])

Train size: 67349, 
Val size: 872, 
Test size: 872
Sentence:  hide new secretions from the parental units 
Sentiment:  Negative


Define the SST-2 dataset class. (You don't need to modify this cell.)

In [10]:
# Vectorize the text data using CountVectorizer
vectorizer = CountVectorizer(max_features=10000)
X_train = vectorizer.fit_transform(train_contents).toarray()
X_val = vectorizer.transform(val_contents).toarray()
X_test = vectorizer.transform(test_contents).toarray()

y_train = np.array(train_labels)
y_val = np.array(val_labels)
y_test = np.array(test_labels)

class SST2Dataset(Dataset):
    def __init__(self, texts, labels):
        self.texts = torch.tensor(texts, dtype=torch.float32)
        self.labels = torch.tensor(labels, dtype=torch.float32).unsqueeze(1)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.texts[idx], self.labels[idx]

# Create DataLoader objects
train_dataset = SST2Dataset(X_train, y_train)
val_dataset = SST2Dataset(X_val, y_val)
test_dataset = SST2Dataset(X_test, y_test)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
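
To make the input representation concrete, the sketch below builds bag-of-words count vectors in pure Python on a toy corpus. It is a simplified stand-in for `CountVectorizer`, not its implementation: the helper names (`tokenize`, `build_vocabulary`, `vectorize`) are hypothetical, the tokenizer assumes the default behavior of keeping tokens with at least two word characters, and ties between equally frequent tokens may be broken differently than in scikit-learn.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and keep tokens of two or more word characters."""
    return re.findall(r"\b\w\w+\b", text.lower())

def build_vocabulary(corpus, max_features):
    """Keep the max_features most frequent tokens, indexed alphabetically."""
    counts = Counter(tok for doc in corpus for tok in tokenize(doc))
    kept = [tok for tok, _ in counts.most_common(max_features)]
    return {tok: idx for idx, tok in enumerate(sorted(kept))}

def vectorize(text, vocab):
    """Count occurrences of each vocabulary token; unseen tokens are ignored."""
    vector = [0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vector[vocab[tok]] += 1
    return vector

toy_corpus = ["a great great movie", "a dull movie"]
vocab = build_vocabulary(toy_corpus, max_features=3)
print(vocab)                             # {'dull': 0, 'great': 1, 'movie': 2}
print(vectorize(toy_corpus[0], vocab))   # [0, 2, 1]
print(vectorize("a great film", vocab))  # [0, 1, 0]
```

Each sentence becomes a fixed-length vector of counts, which is why the network's input dimension equals the vocabulary size rather than the sentence length.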

TODO: Implement a simple two-layer neural network using PyTorch's `nn.Module` for binary classification. The network should have:

- An input layer of **input\_dim** features.
- One hidden layer with 64 neurons and **ReLU** activation.
- An output layer with 1 neuron and **Sigmoid** activation.

Then initialize the model with **input\_dim** equal to the number of features (columns) in **X\_train**, and use the Binary Cross-Entropy loss function and the Adam optimizer with a learning rate of 0.001. (10 pts)

In [11]:
class SimpleNN(nn.Module):
    def __init__(self, input_dim):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(input_dim, 64)
        self.fc2 = nn.Linear(64, 1)

    def forward(self, x):
        x = self.fc1(x)
        x = torch.relu(x)
        x = self.fc2(x)
        return torch.sigmoid(x)

# TODO: initialize the model, loss and optimizer
input_dim = X_train.shape[1]
model = SimpleNN(input_dim=input_dim)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(params=model.parameters(), lr=0.001)
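
For intuition about the loss and the model size, the pure-Python sketch below computes the mean binary cross-entropy on a toy batch and counts the parameters of the two linear layers. It mirrors the default mean reduction of `nn.BCELoss` but omits its numerical safeguards, so probabilities of exactly 0 or 1 would fail here. The function names are hypothetical, and the value 10000 for `input_dim` is an assumption that the vectorizer fills its `max_features` cap.

```python
import math

def binary_cross_entropy(probs, targets):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over the batch."""
    losses = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
              for p, y in zip(probs, targets)]
    return sum(losses) / len(losses)

def count_parameters(input_dim, hidden_dim=64):
    """Weights plus biases of Linear(input_dim, hidden_dim) and Linear(hidden_dim, 1)."""
    return (input_dim * hidden_dim + hidden_dim) + (hidden_dim + 1)

# Confident, correct predictions give a small loss.
confident = binary_cross_entropy([0.9, 0.1], [1.0, 0.0])
# Predicting 0.5 everywhere gives log(2), about 0.6931.
uncertain = binary_cross_entropy([0.5, 0.5], [1.0, 0.0])
print(f"{confident:.4f} {uncertain:.4f}")

# Assumed input_dim of 10000 (hypothetical; check X_train.shape[1]).
print(count_parameters(10000))
```

A training loss well below `log(2)` indicates that the model is doing better than an uninformed prediction of 0.5 for every example.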

TODO: Train the neural network for 10 epochs, and print out the training loss and accuracy on the training set at the end of each epoch. (10 pts)

In [12]:
num_epochs = 10

for epoch in range(num_epochs):
    model.train()
    train_loss = 0.0
    correct = 0
    total = 0
    # TODO: Complete the training loop, update the training loss and accuracy
    for i, (texts, labels) in enumerate(train_loader):
        # Forward prop
        outputs = model(texts)
        loss = criterion(outputs, labels)

        # Back prop
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Metrics
        train_loss += loss.item()
        predictions = (outputs >= 0.5).float()
        correct += (predictions == labels).sum().item()
        total += labels.shape[0]

    train_loss /= len(train_loader)
    train_acc = correct / total

    print(f"Epoch {epoch+1}/{num_epochs}, "
          f"Train Loss: {train_loss:.4f}, "
          f"Train Acc: {train_acc:.4f}, ")

Epoch 1/10, Train Loss: 0.3478, Train Acc: 0.8485, 
Epoch 2/10, Train Loss: 0.2039, Train Acc: 0.9163, 
Epoch 3/10, Train Loss: 0.1569, Train Acc: 0.9361, 
Epoch 4/10, Train Loss: 0.1264, Train Acc: 0.9482, 
Epoch 5/10, Train Loss: 0.1042, Train Acc: 0.9571, 
Epoch 6/10, Train Loss: 0.0866, Train Acc: 0.9642, 
Epoch 7/10, Train Loss: 0.0749, Train Acc: 0.9691, 
Epoch 8/10, Train Loss: 0.0650, Train Acc: 0.9723, 
Epoch 9/10, Train Loss: 0.0573, Train Acc: 0.9756, 
Epoch 10/10, Train Loss: 0.0517, Train Acc: 0.9773, 
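
The loop above reports the average of per-batch mean losses. With 67349 training examples and a batch size of 32, the final batch holds only 21 examples, so this differs slightly from the true per-example average. The sketch below contrasts the two on made-up loss values; the function names and the numbers in `toy_batches` are purely illustrative.

```python
def batch_mean_average(batches):
    """Average of per-batch mean losses; every batch counts equally."""
    return sum(loss for loss, _ in batches) / len(batches)

def sample_weighted_average(batches):
    """Per-example average; each batch is weighted by its size."""
    total = sum(loss * size for loss, size in batches)
    count = sum(size for _, size in batches)
    return total / count

# Batch layout for the training set: 2104 full batches plus one of 21.
full_batches, remainder = divmod(67349, 32)
print(full_batches, remainder)

# Hypothetical (mean_loss, batch_size) pairs with a smaller final batch.
toy_batches = [(0.2, 32), (0.2, 32), (0.8, 21)]
print(batch_mean_average(toy_batches))
print(sample_weighted_average(toy_batches))
```

With over two thousand batches per epoch the discrepancy is negligible here, but accumulating `loss.item() * labels.shape[0]` and dividing by `total` would give the exact per-example figure.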


TODO: Evaluate the model on the test set and report the test accuracy. (10 pts)

In [13]:
model.eval()
test_correct = 0
with torch.no_grad():
    for i, (texts, labels) in enumerate(test_loader):
        # Forward prop
        outputs = model(texts)

        # Metrics
        predictions = (outputs >= 0.5).float()
        test_correct += (predictions == labels).sum().item()

test_acc = test_correct / len(test_loader.dataset)
print(f"Test Accuracy: {test_acc:.4f}")

Test Accuracy: 0.8234
