## Student Pass/Fail Prediction

In this notebook, we build a simple binary classifier that predicts whether a student will **pass (1)** or **fail (0)**, using a neural network with one hidden layer.
The prediction is based on two features:  
- **Study Hours** (time dedicated to preparation)  
- **Previous Exam Score** (performance in the last exam)  

This task is a **classification problem**, and the goal is to learn how these features relate to the outcome.  
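
Written out, the network defined in the model cell of this notebook (two inputs, ten hidden ReLU units, one output) computes the quantities below. The symbols $W_1$, $b_1$, $w_2$, $b_2$ are notation introduced here for the layer weights and biases; they do not appear in the code.

$$
h = \mathrm{ReLU}(W_1 x + b_1), \qquad z = w_2^\top h + b_2, \qquad \hat{p} = \sigma(z) = \frac{1}{1 + e^{-z}}
$$

Here $x \in \mathbb{R}^2$ holds the two features, $W_1 \in \mathbb{R}^{10 \times 2}$, $b_1 \in \mathbb{R}^{10}$, $w_2 \in \mathbb{R}^{10}$, and $b_2 \in \mathbb{R}$. The model outputs the logit $z$; the predicted class is 1 when $\hat{p} > 0.5$, which is equivalent to $z > 0$.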

In [94]:
# Import libraries
import pandas as pd
import torch
from torch import nn

In [95]:
# Load the dataset into a DataFrame
df = pd.read_csv("./data/student_exam_data.csv")

df

Unnamed: 0,Study Hours,Previous Exam Score,Pass/Fail
0,4.370861,81.889703,0
1,9.556429,72.165782,1
2,7.587945,58.571657,0
3,6.387926,88.827701,1
4,2.404168,81.083870,0
...,...,...,...
495,4.180170,45.494924,0
496,6.252905,95.038815,1
497,1.699612,48.209118,0
498,9.769553,97.014241,1


In [96]:
# Define input (X)
X = torch.tensor(df[["Study Hours", "Previous Exam Score"]].values, dtype=torch.float32)

In [97]:
# Define target (y) as a column vector of shape (N, 1)
y = torch.tensor(df["Pass/Fail"].values, dtype=torch.float32).reshape(-1, 1)

In [98]:
# Define model
model = nn.Sequential(
    nn.Linear(2, 10),
    nn.ReLU(),
    nn.Linear(10, 1)
)

# Define loss function
criterion = nn.BCEWithLogitsLoss()

# Define optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
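
The model above ends in a plain `nn.Linear` layer with no sigmoid because `nn.BCEWithLogitsLoss` expects raw logits and applies the sigmoid internally. The sketch below, in plain Python, illustrates why this is numerically preferable: it compares a naive "sigmoid then log" loss with a commonly used stable rewriting of the same formula. The function names are hypothetical, and PyTorch's actual implementation may differ in its details.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def bce_naive(z: float, y: float) -> float:
    """Binary cross-entropy computed from the probability sigmoid(z)."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_with_logits(z: float, y: float) -> float:
    """Algebraically equivalent form that works directly on the logit z."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# For moderate logits the two forms agree
for z, y in [(2.0, 1.0), (-1.5, 0.0), (0.3, 1.0)]:
    print(z, y, bce_naive(z, y), bce_with_logits(z, y))

# For a large logit, sigmoid(z) rounds to exactly 1.0 and log(1 - p) fails
try:
    bce_naive(40.0, 0.0)
except ValueError as err:
    print("naive form failed:", err)
print("stable form:", bce_with_logits(40.0, 0.0))
```

With this loss, a sigmoid is only needed at prediction time, when logits are converted to probabilities.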

In [99]:
from torch.utils.data import TensorDataset, DataLoader

# Create dataset from tensor
dataset = TensorDataset(X, y)

# Split the dataset into shuffled mini-batches of 32 samples
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
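
To see what the `DataLoader` yields, the following sketch iterates over a small toy dataset and records the size of each batch. The tensors `X_toy` and `y_toy` are hypothetical stand-ins for the notebook's `X` and `y`, not the real data.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical toy data: 100 samples, 2 features, binary labels
X_toy = torch.randn(100, 2)
y_toy = torch.randint(0, 2, (100, 1)).float()

loader = DataLoader(TensorDataset(X_toy, y_toy), batch_size=32, shuffle=True)

# With drop_last left at its default (False), the final batch holds the remainder
batch_sizes = [X_batch.shape[0] for X_batch, _ in loader]
print(len(loader), batch_sizes)  # 4 [32, 32, 32, 4]
```

Because `shuffle=True`, the samples are drawn in a different order on every pass over the loader, but the number and sizes of the batches stay the same.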

In [100]:
# Training loop
epochs = 1000

for epoch in range(epochs):
    loss_sum = 0
    for X_batch, y_batch in dataloader:
        optimizer.zero_grad()             # Reset gradients from the previous step
        y_pred = model(X_batch)           # Forward pass: compute the logits
        loss = criterion(y_pred, y_batch) # Compute the batch loss
        loss.backward()                   # Backpropagate: gradients of the loss w.r.t. the parameters
        optimizer.step()                  # Update the weights using the gradients

        loss_sum += loss.item()           # Accumulate the per-batch mean loss
    if epoch % 100 == 0:
        print(f"Loss: {loss_sum}")        # Loss summed over all batches in this epoch

Loss: 28.17239820957184
Loss: 0.9778385031968355
Loss: 1.4370754826813936
Loss: 0.6084625795483589
Loss: 0.45090202847495675
Loss: 0.3494054996408522
Loss: 0.3218006163369864
Loss: 0.7088543307036161
Loss: 0.2643882459960878
Loss: 0.7249264883866999
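
The values printed above are sums of per-batch mean losses, since `nn.BCEWithLogitsLoss` averages over each batch by default, so their scale depends on the number of batches. One possible alternative is to report the mean loss per sample. The helper below is a hypothetical sketch of that calculation in plain Python; it is not part of the notebook's training loop.

```python
def epoch_mean_loss(batch_losses, batch_sizes):
    """Average per-sample loss, weighting each batch mean by its batch size."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)

# Made-up batch losses for three batches of sizes 32, 32 and 4
example_losses = [0.5, 0.25, 1.0]
example_sizes = [32, 32, 4]
print(epoch_mean_loss(example_losses, example_sizes))
```

Weighting by batch size matters because the last batch of an epoch is usually smaller than the others.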


In [101]:
# Evaluate model on the training data (no held-out set is used here)
model.eval()
with torch.no_grad():
    logits = model(X)
    y_pred = torch.sigmoid(logits) > 0.5
    y_pred = y_pred.type(torch.float32)

    accuracy = (y_pred == y).type(torch.float32).mean()
    print(f"Accuracy: {accuracy * 100:.2f} %")

Accuracy: 99.40 %
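
The accuracy above is measured on the same samples the model was trained on, so it may overstate how well the model generalizes. A common remedy is to hold out part of the data before training. The sketch below shows one way to do this with `random_split`; the toy tensors, the 80/20 ratio, and the `accuracy` helper are assumptions for illustration, and the model is left untrained.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Hypothetical stand-in data: 500 samples, 2 features, binary labels
X_toy = torch.randn(500, 2)
y_toy = (X_toy.sum(dim=1, keepdim=True) > 0).float()
toy_dataset = TensorDataset(X_toy, y_toy)

# Hold out 20% of the samples for evaluation
n_test = len(toy_dataset) // 5
n_train = len(toy_dataset) - n_test
train_set, test_set = random_split(toy_dataset, [n_train, n_test])

def accuracy(model, subset):
    """Fraction of samples in `subset` that the model classifies correctly."""
    model.eval()
    correct = 0
    with torch.no_grad():
        for X_batch, y_batch in DataLoader(subset, batch_size=64):
            # A logit above 0 corresponds to a sigmoid output above 0.5
            preds = (model(X_batch) > 0).float()
            correct += (preds == y_batch).sum().item()
    return correct / len(subset)

# Same architecture as the notebook's model, left untrained here
toy_model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1))
print(f"Train size: {len(train_set)}, test size: {len(test_set)}")
print(f"Test accuracy (untrained): {accuracy(toy_model, test_set) * 100:.2f} %")
```

In the notebook, the training `DataLoader` would be built from `train_set` only, and the final accuracy reported on `test_set`.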
