<a href="https://colab.research.google.com/github/Josh-robins/Deep-Learning/blob/main/Assignment_Mark_Prediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Introduction**

This notebook implements an **assignment marker** that classifies student marks into binary categories: pass (1) or fail (0).

The dataset consists of assignment scores and corresponding labels, which are used to train a simple neural network. While this task could be solved efficiently with basic conditional logic (e.g., an `if-else` statement with a pass threshold of 75), the goal here is educational: to explore deep learning concepts and gain hands-on experience with PyTorch.
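
For comparison, the rule-based approach mentioned above can be sketched in a few lines. This is only an illustrative baseline: the names `PASS_THRESHOLD` and `rule_based_marker` are assumptions made for this sketch and are not used elsewhere in the notebook.

```python
import numpy as np

PASS_THRESHOLD = 75  # pass mark stated in the introduction


def rule_based_marker(marks):
    """Return 1 (pass) for marks >= PASS_THRESHOLD, otherwise 0 (fail)."""
    return (np.asarray(marks) >= PASS_THRESHOLD).astype(int)


# A few marks taken from the training set defined later in the notebook
example_marks = np.array([78, 100, 52, 89, 65, 72])
example_labels = rule_based_marker(example_marks)
print(example_labels)
```

The neural network built in the rest of the notebook has to learn this same decision boundary from labeled examples alone.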
        
By leveraging PyTorch, we define a neural network, preprocess the data, train the model, and evaluate its performance. This over-engineered approach serves as a practical introduction to key deep learning components—such as datasets, data loaders, neural network architecture, loss functions, and optimizers—while demonstrating how these tools can be applied to even a simple classification problem.

    
The notebook is structured as follows:
1. Importing libraries
2. Defining the dataset and data loaders
3. Scaling the data
4. Building the neural network model
5. Setting up the loss function and optimizer
6. Training the model
7. Evaluating the results
        
This project highlights the flexibility of PyTorch and provides a foundation for tackling more complex machine learning tasks.

**1. Importing Libraries**

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from sklearn.preprocessing import StandardScaler
import numpy as np

In [2]:
# Assignment marks: training set
train_samples_np = np.array([78, 100, 52, 89, 92, 87, 65, 40, 78, 82, 64, 78, 98, 86, 72, 81, 94, 92, 51, 71])
train_labels_np = np.array([  1,   1,  0,  1,  1,  1,  0,  0,  1,  1,  0,  1,  1,  1,  0,  1,  1,  1,  0,  0])

# Assignment marks: testing set
test_samples_np = np.array([75, 68, 99, 82, 71, 70, 68, 84, 87, 72, 61, 92, 93, 54, 63, 45, 74, 76, 83, 91])
test_labels_np = np.array([  1,  0,  1,  1,  0,  0,  0,  1,  1,  0,  0,  1,  1,  0,  0,  0,  0,  1,  1,  1])

In [None]:
# Optional sanity check: the labels follow a simple rule (pass if mark >= 75)
# for grade in train_samples_np:
#     if grade >= 75:
#         print(f'{grade}:1')
#     else:
#         print(f'{grade}:0')

**2. Define the Dataset and Data Loader**

In [5]:
class SimpleDataset(Dataset):
    def __init__(self, samples, labels):
        self.samples = torch.tensor(samples, dtype=torch.float32)  # Convert samples to a float tensor
        self.labels = torch.tensor(labels, dtype=torch.long)       # Convert labels to a long (integer) tensor
        self.n_samples = len(self.samples)                         # Store the number of samples

    def __len__(self):
        # Return the total number of samples
        return self.n_samples

    def __getitem__(self, index):
        return self.samples[index], self.labels[index]


# Quick check on the raw (unscaled) marks: first item and dataset size
raw_dataset = SimpleDataset(train_samples_np, train_labels_np)
print(raw_dataset[0])
print(raw_dataset.n_samples)


(tensor(78.), tensor(1))
20
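
To see how a `DataLoader` would split a dataset of this size into mini-batches, here is a minimal sketch. It assumes a stand-in `TensorDataset` with 20 made-up marks instead of `SimpleDataset`, purely to keep the block self-contained; `toy_dataset` and `toy_loader` are hypothetical names.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 20 marks (40, 43, ..., 97), labeled 1 when mark >= 75
marks = torch.arange(40, 100, 3, dtype=torch.float32).unsqueeze(1)  # shape (20, 1)
labels = (marks.squeeze(1) >= 75).long()                            # shape (20,)

toy_dataset = TensorDataset(marks, labels)
toy_loader = DataLoader(toy_dataset, batch_size=6, shuffle=False)

# 20 samples with batch_size=6 give three full batches and one short batch
batch_sizes = [inputs.shape[0] for inputs, _ in toy_loader]
print(batch_sizes)  # [6, 6, 6, 2]
```

With `shuffle=True`, as used for the training loader in the next section, the batch sizes stay the same but the samples in each batch change from epoch to epoch.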


**3. Scaling the data**

In [6]:
# Rescale the samples to have a mean of 0 and a variance of 1
scaler = StandardScaler()
train_samples_scaled = scaler.fit_transform(train_samples_np.reshape(-1, 1))
test_samples_scaled = scaler.transform(test_samples_np.reshape(-1, 1))

# Create PyTorch Datasets
train_dataset = SimpleDataset(train_samples_scaled, train_labels_np)
test_dataset = SimpleDataset(test_samples_scaled, test_labels_np)

# Create DataLoaders
train_loader = DataLoader(train_dataset, batch_size=6, shuffle=True)   # Shuffle the training data each epoch
test_loader = DataLoader(test_dataset, batch_size=5, shuffle=False)    # Keep the test order fixed
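
As a rough illustration of what the scaling step computes, the sketch below reproduces standardization by hand with NumPy: subtract the training mean and divide by the training standard deviation (the population form, `ddof=0`, which is what `StandardScaler` uses). The helper `scale` is a hypothetical name introduced for this example.

```python
import numpy as np

# Training marks copied from the data cell above
train_marks = np.array([78, 100, 52, 89, 92, 87, 65, 40, 78, 82,
                        64, 78, 98, 86, 72, 81, 94, 92, 51, 71], dtype=float)

mean = train_marks.mean()
std = train_marks.std()  # population standard deviation (ddof=0)


def scale(marks):
    """Standardize marks using statistics from the training set only."""
    return (np.asarray(marks, dtype=float) - mean) / std


scaled_train = scale(train_marks)
scaled_threshold = scale(75.0)

print(f"mean={mean:.2f}, std={std:.2f}")
print(f"pass mark 75 maps to {float(scaled_threshold):.3f} after scaling")
```

Because the training mean is 77.5, the pass mark of 75 lands slightly below zero in the scaled space. The test set is transformed with the same training statistics, which is why the notebook calls `transform` rather than `fit_transform` on it.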

**4. Define the PyTorch Model**

In [7]:
class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden = nn.Linear(input_size, hidden_size)    # Linear layer: input_size -> hidden_size
        self.sigmoid = nn.Sigmoid()                         # Sigmoid activation for the hidden layer
        self.output = nn.Linear(hidden_size, output_size)   # Linear layer: hidden_size -> output_size
        self.softmax = nn.Softmax(dim=1)                    # Softmax over the class dimension (dim=1)

    def forward(self, x):
        x = self.hidden(x)   # Apply the hidden layer
        x = self.sigmoid(x)  # Apply the sigmoid activation
        x = self.output(x)   # Pass through the output layer
        x = self.softmax(x)  # Convert the outputs to class probabilities
        return x


# Instantiate the model
input_size = 1
hidden_size = 4
output_size = 2 # Two output classes
model = SimpleNN(input_size, hidden_size, output_size)
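
To get a feel for how small this network is, the sketch below rebuilds the same layer stack with `nn.Sequential` and counts its parameters. `sketch_model` is a hypothetical stand-in for `SimpleNN`, used only so the block runs on its own.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same layer sizes as above: 1 input -> 4 hidden units -> 2 output classes
sketch_model = nn.Sequential(
    nn.Linear(1, 4),    # 4 weights + 4 biases = 8 parameters
    nn.Sigmoid(),
    nn.Linear(4, 2),    # 8 weights + 2 biases = 10 parameters
    nn.Softmax(dim=1),
)

n_params = sum(p.numel() for p in sketch_model.parameters())

# A batch of 6 scaled marks has shape (6, 1); the output has one row per sample
batch = torch.zeros(6, 1)
probs = sketch_model(batch)

print(n_params)     # 18
print(probs.shape)  # torch.Size([6, 2])
```

Each output row sums to 1 because of the final softmax, so the two columns can be read as the probabilities of fail (0) and pass (1).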

**5. Define Loss Function and Optimizer**

In [9]:
criterion = nn.CrossEntropyLoss()  # Cross-entropy loss over the two classes
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer with a learning rate of 0.001
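
One detail worth knowing: `nn.CrossEntropyLoss` applies log-softmax to its input internally, so it is normally given raw scores (logits). Since `SimpleNN.forward` already ends in a softmax, the loss here receives values between 0 and 1. The following is a minimal sketch, not part of the original notebook, of what that implies for a single two-class sample; the tensors are illustrative values chosen for this example.

```python
import math

import torch
import torch.nn.functional as F

target = torch.tensor([0])  # the true class of a single sample

# Best case when the model output is already a probability vector
perfect_probs = torch.tensor([[1.0, 0.0]])
loss_on_probs = F.cross_entropy(perfect_probs, target).item()

# The same loss on confident raw logits can get arbitrarily close to 0
confident_logits = torch.tensor([[10.0, -10.0]])
loss_on_logits = F.cross_entropy(confident_logits, target).item()

# Closed form of the floor for two classes: log(1 + e^-1)
floor = math.log(1 + math.exp(-1))

print(f"loss on probabilities: {loss_on_probs:.4f} (floor {floor:.4f})")
print(f"loss on logits:        {loss_on_logits:.2e}")
```

With probabilities as input, the two-class loss cannot fall below about 0.3133. If lower loss values matter, one option is to return the output layer's raw scores from `forward` and apply softmax only when probabilities are needed; softmax does not change which class has the largest score.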

**6. Train the Model**

In [10]:
num_epochs = 1000  # Set the number of training epochs

for epoch in range(num_epochs):
    for inputs, labels in train_loader:
        # Reset the gradients accumulated in the previous step
        optimizer.zero_grad()

        # Forward pass: compute the model outputs for this batch
        outputs = model(inputs)

        # Compute the loss between the outputs and the true labels
        loss = criterion(outputs, labels)

        # Backward pass: compute the gradients, then update the parameters
        loss.backward()
        optimizer.step()

    if (epoch + 1) % 50 == 0:
        # Print the epoch number and the loss of the last mini-batch in this epoch
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

print("Finished Training")

Epoch [50/1000], Loss: 0.5339
Epoch [100/1000], Loss: 0.6880
Epoch [150/1000], Loss: 0.6861
Epoch [200/1000], Loss: 0.4972
Epoch [250/1000], Loss: 0.4420
Epoch [300/1000], Loss: 0.3935
Epoch [350/1000], Loss: 0.4233
Epoch [400/1000], Loss: 0.5222
Epoch [450/1000], Loss: 0.4017
Epoch [500/1000], Loss: 0.3659
Epoch [550/1000], Loss: 0.3543
Epoch [600/1000], Loss: 0.4676
Epoch [650/1000], Loss: 0.3436
Epoch [700/1000], Loss: 0.3245
Epoch [750/1000], Loss: 0.3202
Epoch [800/1000], Loss: 0.3269
Epoch [850/1000], Loss: 0.3174
Epoch [900/1000], Loss: 0.3679
Epoch [950/1000], Loss: 0.3202
Epoch [1000/1000], Loss: 0.3152
Finished Training
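
The values logged above are the loss of the last mini-batch in each reported epoch, which is one reason they fluctuate rather than decrease steadily. A smoother signal is the average loss over all samples in an epoch. The sketch below shows one way to track it, assuming hypothetical toy data and a toy model (`toy_loader`, `toy_model`) so that it runs independently of the cells above.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical toy data: 20 already-scaled inputs, labeled 1 when positive
x = torch.linspace(-2, 2, 20).unsqueeze(1)
y = (x.squeeze(1) > 0).long()
toy_loader = DataLoader(TensorDataset(x, y), batch_size=6, shuffle=True)

# Toy model that outputs raw logits (unlike SimpleNN, it has no final softmax)
toy_model = nn.Sequential(nn.Linear(1, 4), nn.Sigmoid(), nn.Linear(4, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(20):
    running_loss, n_seen = 0.0, 0
    for inputs, labels in toy_loader:
        optimizer.zero_grad()
        loss = criterion(toy_model(inputs), labels)
        loss.backward()
        optimizer.step()
        # Weight each batch by its size so the short final batch is not over-counted
        running_loss += loss.item() * inputs.size(0)
        n_seen += inputs.size(0)
    epoch_losses.append(running_loss / n_seen)

print(f"first epoch: {epoch_losses[0]:.4f}, last epoch: {epoch_losses[-1]:.4f}")
```

The same bookkeeping (a running sum and a sample count per epoch) could be added to the training loop above without changing how the model is updated.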


**7. Evaluate the Model**

In [11]:
# Set the model to evaluation mode
model.eval()
all_predicted_labels = []
all_test_labels = []

with torch.no_grad():
    for inputs, labels in test_loader:
        outputs = model(inputs)  # Forward pass: class probabilities for this batch

        # The predicted class is the index of the largest output in each row
        _, predicted = torch.max(outputs, 1)
        all_predicted_labels.extend(predicted.numpy())
        all_test_labels.extend(labels.numpy())

predicted_labels_np = np.array(all_predicted_labels)  # Predictions as a NumPy array
test_labels_np = np.array(all_test_labels)            # True labels as a NumPy array

# Print predicted and true labels
print("Predicted labels on testing set:", predicted_labels_np)
print("True labels on testing set:", test_labels_np)

# Compute the prediction error: the percentage of test samples that were misclassified
prediction_error_test = np.mean(predicted_labels_np != test_labels_np) * 100
print("Prediction error on testing set:", prediction_error_test)

Predicted labels on testing set: [1 0 1 1 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 1]
True labels on testing set: [1 0 1 1 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 1]
Prediction error on testing set: 0.0
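
Beyond a single error percentage, it can help to break the result down into true/false positives and negatives. The sketch below does this with plain NumPy on the predicted and true labels printed above; the variable names are chosen for this example only.

```python
import numpy as np

# Labels copied from the evaluation output above
predicted = np.array([1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1])
true = np.array([1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1])

# Percentage of misclassified samples
error_pct = np.mean(predicted != true) * 100

# Confusion counts for the binary case
tp = int(np.sum((predicted == 1) & (true == 1)))  # passes predicted as pass
tn = int(np.sum((predicted == 0) & (true == 0)))  # fails predicted as fail
fp = int(np.sum((predicted == 1) & (true == 0)))  # fails predicted as pass
fn = int(np.sum((predicted == 0) & (true == 1)))  # passes predicted as fail

print(f"error: {error_pct:.1f}%  TP={tp} TN={tn} FP={fp} FN={fn}")
```

Here the test set is balanced (10 passes and 10 fails), and every sample falls into the TP or TN cell.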


This notebook implemented an assignment marker using a deep learning approach with PyTorch. The neural network, despite its simplicity, classified all 20 marks in the test set correctly (0% prediction error), labeling each as pass (1) or fail (0).

Through this exercise, we explored critical concepts: data preprocessing with `StandardScaler`, custom dataset creation with `Dataset` and `DataLoader`, neural network design with `nn.Module`, and training with loss functions and optimizers. The model’s performance demonstrates that even a small network can learn simple patterns effectively, though its capacity far exceeds the needs of this binary classification task. For real-world applications, such complexity might be reserved for problems with more intricate data relationships.

Happy Coding

Joshua Robin: Author
      