<a href="https://colab.research.google.com/github/LukeSchreiber/FastAI-Projects/blob/main/Lesson5StudentGradePredictor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Hello, this is my fastai lesson 5 project. I built a student grade classifier: a simple PyTorch model, built from scratch, that predicts whether a student will pass or fail based on hours studied, attendance, and previous grade.

First, we're going to import everything we need, set the seed, and check for a GPU.

In [10]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim

#set the seed
torch.manual_seed(42)
np.random.seed(42)

#gpu check
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

Using device: cuda


Here we're going to create a fake dataset. You can delete this block and adapt the next one to load your own CSV, but for the sake of this project I'm going to generate the data like this.

In [11]:
import numpy as np
import pandas as pd

# Set random seed for reproducibility
np.random.seed(42)

# Generate 100,000 student records
n_students = 100000
hours = np.random.uniform(0, 10, n_students)  # Hours studied (0-10)
attendance = np.random.uniform(0, 100, n_students)  # Attendance (0-100%)
prev_grade = np.random.uniform(0, 100, n_students)  # Previous grade (0-100)

# Determine pass/fail
pass_prob = np.clip(prev_grade / 100 + np.random.normal(0, 0.1, n_students), 0, 1)
pass_status = (pass_prob > 0.6).astype(int)  # Pass if prob > 0.6

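# Data frame
df = pd.DataFrame({
    'hours': hours,
    # NOTE: this stores prev_grade in the attendance column, so the two columns are
    # identical (see the output below); use `attendance` here to keep the generated values
    'attendance': prev_grade,
    'prev_grade': prev_grade,
    'pass': pass_status
})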

# Save to CSV
df.to_csv('student_grades_100k.csv', index=False)
print("Generated and saved student_grades_100k.csv with 100,000 students!")

# Quick check
print("\nFirst 5 rows:\n", df.head())

Generated and saved student_grades_100k.csv with 100,000 students!

First 5 rows:
       hours  attendance  prev_grade  pass
0  3.745401   28.258797   28.258797     0
1  9.507143   45.867659   45.867659     0
2  7.319939    9.921550    9.921550     0
3  5.986585   44.683703   44.683703     0
4  1.560186   20.308135   20.308135     0


Now we're going to take the CSV we just created and load it back in.

In [12]:
import pandas as pd

# Load the CSV generated in the previous cell
df = pd.read_csv('student_grades_100k.csv')
print("CSV loaded; file name: student_grades_100k.csv")

# Print the columns to see if the CSV is correct
print("Available columns:", df.columns.tolist())

CSV loaded; file name: student_grades_100k.csv
Available columns: ['hours', 'attendance', 'prev_grade', 'pass']
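
If you swap in your own CSV instead of the generated one, it helps to confirm that the file has the columns the rest of the notebook expects. The sketch below is one way to do that; `load_student_csv` and `REQUIRED_COLUMNS` are hypothetical names introduced here for illustration, and the in-memory CSV stands in for a real file.

```python
import io

import pandas as pd

# Columns the later cells rely on (taken from the generated dataset above)
REQUIRED_COLUMNS = ["hours", "attendance", "prev_grade", "pass"]


def load_student_csv(source):
    """Hypothetical helper: read a CSV and check it has the expected columns."""
    frame = pd.read_csv(source)
    missing = [col for col in REQUIRED_COLUMNS if col not in frame.columns]
    if missing:
        raise ValueError(f"CSV is missing required columns: {missing}")
    return frame[REQUIRED_COLUMNS]


# In-memory stand-in for a file on disk; a path string works the same way
sample = io.StringIO("hours,attendance,prev_grade,pass\n4.5,80,72,1\n1.0,35,40,0\n")
demo_df = load_student_csv(sample)
print(demo_df.shape)
```

Failing early with a clear message is easier to debug than a `KeyError` several cells later.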


We now have to define the features and the target and put them into arrays.

In [13]:
# Define Features
feature_cols = ["hours", "attendance", "prev_grade"]  # These are the input features
target_col = ["pass"]  # This is what we want to predict (0 or 1)

# Convert to NumPy arrays for training
x = df[feature_cols].values  # feature matrix (all input data)
y = df[target_col].values   # target vector (pass/fail outcomes)

# Print shapes and sample data to verify
print("Features shape:", x.shape)
print("Target shape:", y.shape)
print("\nFirst few rows of features:\n", x[:5])
print("\nFirst few targets:\n", y[:5])

Features shape: (100000, 3)
Target shape: (100000, 1)

First few rows of features:
 [[ 3.74540119 28.2587967  28.2587967 ]
 [ 9.50714306 45.86765905 45.86765905]
 [ 7.31993942  9.92154979  9.92154979]
 [ 5.98658484 44.68370253 44.68370253]
 [ 1.5601864  20.30813496 20.30813496]]

First few targets:
 [[0]
 [0]
 [0]
 [0]
 [0]]


Go ahead and split everything into 80% train, 10% validation, and 10% test. We also need the mean and standard deviation of the training features so that every feature ends up on the same scale. After that we can convert everything into tensors.

In [14]:
# Split rows (80/10/10) with a shuffled index
idx = np.arange(len(df))
np.random.shuffle(idx)

cut_train = int(0.8 * len(df))
cut_valid = int(0.9 * len(df))

train_df = df.iloc[idx[:cut_train]].reset_index(drop=True)           # First 80%: training set
valid_df = df.iloc[idx[cut_train:cut_valid]].reset_index(drop=True)  # Next 10%: validation set
test_df  = df.iloc[idx[cut_valid:]].reset_index(drop=True)           # Last 10%: test set

# Standardize using TRAIN stats only - subtract mean and divide by std
means = train_df[feature_cols].mean()  # Mean of training features
stds  = train_df[feature_cols].std().replace(0, 1)  # Std dev, avoid div-by-zero

Xtr = ((train_df[feature_cols] - means) / stds).values.astype("float32")
Xva = ((valid_df[feature_cols] - means) / stds).values.astype("float32")
Xte = ((test_df[feature_cols] - means) / stds).values.astype("float32")
# float32 for speed and memory efficiency (it is also PyTorch's default dtype)
ytr = train_df[target_col].values.astype("float32").reshape(-1, 1)
yva = valid_df[target_col].values.astype("float32").reshape(-1, 1)
yte = test_df[target_col].values.astype("float32").reshape(-1, 1)

# Convert to PyTorch tensors and move to GPU if available
tXtr = torch.tensor(Xtr, device=device)
tXva = torch.tensor(Xva, device=device)
tXte = torch.tensor(Xte, device=device)
tytr = torch.tensor(ytr, device=device)
tyva = torch.tensor(yva, device=device)
tyte = torch.tensor(yte, device=device)

# Print shapes to verify
print("Shapes — X:", tXtr.shape, "y:", tytr.shape)

Shapes — X: torch.Size([80000, 3]) y: torch.Size([80000, 1])
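
To see what standardizing with training statistics does, here is a minimal sketch on made-up numbers. The toy arrays are assumptions for illustration only; the point is that the validation rows are scaled with the mean and standard deviation of the training rows, never their own.

```python
import numpy as np

# Hypothetical toy data: 5 training rows and 2 validation rows, 2 features each
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]])
valid = np.array([[3.0, 30.0], [6.0, 60.0]])

# Statistics come from the training rows only
mean = train.mean(axis=0)
std = train.std(axis=0, ddof=1)  # ddof=1 matches the pandas .std() default
std[std == 0] = 1.0              # guard against constant columns

train_scaled = (train - mean) / std
valid_scaled = (valid - mean) / std  # reuse the training statistics

print(train_scaled.mean(axis=0))         # close to 0 for every feature
print(train_scaled.std(axis=0, ddof=1))  # close to 1 for every feature
print(valid_scaled)                      # first row equals the training mean, so it maps to 0
```

Because both toy features differ only by a factor of 10, they become identical after scaling, which is exactly the "same scale" effect described above.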


Here we set up the model from scratch using nn, in an OOP structure. It has 3 layers: 1. a linear layer, 2. a ReLU activation, 3. another linear layer that outputs a single number. We are NOT applying a sigmoid inside the model: BCEWithLogitsLoss applies it internally during training, and the accuracy function applies it explicitly.

In [15]:
class StudentGradePredictor(nn.Module):
    # Define the neural network structure in a class structure - figured this is the cleanest way. Also sharpens my OOP skills
    def __init__(self, num_inputs, hidden_size=32):
        super().__init__()
        self.fc1 = nn.Linear(num_inputs, hidden_size)  # First layer: inputs to hidden
        self.relu = nn.ReLU()  # Activation function to add non-linearity
        self.fc2 = nn.Linear(hidden_size, 1)  # Output layer: hidden to 1 output (pass/fail)

    def forward(self, x):
        x = self.fc1(x)  # Pass through first layer
        x = self.relu(x)  # Apply activation
        x = self.fc2(x)  # Pass through output layer (raw logits)
        return x  # Return logits (BCEWithLogitsLoss handles sigmoid)

# Get number of input features and create model
num_inputs = tXtr.shape[1]  # Number of features (3 in this case)
model = StudentGradePredictor(num_inputs, hidden_size=32).to(device)
model  # Display the model structure

StudentGradePredictor(
  (fc1): Linear(in_features=3, out_features=32, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=32, out_features=1, bias=True)
)
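
The reason the model can return raw logits is that `nn.BCEWithLogitsLoss` folds the sigmoid into the loss. As a small sanity check (random stand-in tensors, not the notebook's data), the sketch below compares it with applying `torch.sigmoid` first and then `nn.BCELoss`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 1)                     # stand-in for raw model outputs
targets = torch.randint(0, 2, (8, 1)).float()  # stand-in pass/fail labels

# Loss computed directly from logits (sigmoid applied inside the loss)
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Loss computed from probabilities (sigmoid applied by hand)
loss_from_probs = nn.BCELoss()(torch.sigmoid(logits), targets)

same = torch.allclose(loss_from_logits, loss_from_probs, atol=1e-5)
print(loss_from_logits.item(), loss_from_probs.item(), same)
```

The two values agree up to floating-point error. The logits version is generally preferred because it is more numerically stable for very large or very small logits.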

Now that the data is cleaned and the model is set up, we can write the accuracy function. We turn off gradient tracking inside it because gradients aren't needed for evaluation, which also makes it more efficient.

In [16]:
@torch.no_grad()  # Disable gradient tracking to save memory
def compute_accuracy(model, features, labels, device):
    logits = model(features.to(device))            # Get raw model outputs
    probabilities = torch.sigmoid(logits)          # Convert logits to probabilities
    predictions = (probabilities >= 0.5).float()   # Convert to 0 or 1 based on threshold
    correct = (predictions == labels.to(device)).float()  # Compare with true labels
    accuracy = correct.mean().item()              # Average correctness
    return accuracy

# This is just the accuracy function; we will call it in the training loop in the next block
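
To make the steps inside the accuracy function concrete, here is a tiny worked example with hand-picked logits and labels (made-up values, chosen only so the arithmetic is easy to follow).

```python
import torch

# Four hypothetical students: raw model outputs and true labels
logits = torch.tensor([[2.0], [-1.0], [0.5], [-3.0]])
labels = torch.tensor([[1.0], [0.0], [0.0], [0.0]])

probabilities = torch.sigmoid(logits)         # roughly 0.88, 0.27, 0.62, 0.05
predictions = (probabilities >= 0.5).float()  # threshold at 0.5
accuracy = (predictions == labels).float().mean().item()

print(predictions.squeeze(1).tolist())  # [1.0, 0.0, 1.0, 0.0]
print(accuracy)                         # 0.75
```

The third student is predicted to pass but actually failed, so 3 of the 4 predictions are correct.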

Setup is done! Now let's train this thing.

We're using binary cross-entropy rather than MSE because this whole project is a binary classification problem. We use the optimizer module from PyTorch to update the model's parameters after backpropagation. Then we set up the training loop, call the accuracy function inside it, and we're done!

In [17]:
loss_fn = nn.BCEWithLogitsLoss()  # Loss function for binary classification
optimizer = optim.SGD(model.parameters(), lr=0.01)  # Optimizer with learning rate

epochs = 200
for epoch in range(1, epochs + 1):
    model.train()  # Set model to training mode
    optimizer.zero_grad()  # Clear old gradients

    logits = model(tXtr)  # Get model predictions
    loss = loss_fn(logits, tytr)  # Calculate loss

    loss.backward()  # Compute gradients
    optimizer.step()  # Update weights

    if epoch == 1 or epoch % 10 == 0:
        model.eval()  # Set model to evaluation mode
        train_acc = compute_accuracy(model, tXtr, tytr, device)
        valid_acc = compute_accuracy(model, tXva, tyva, device)
        test_acc = compute_accuracy(model, tXte, tyte, device)
        print(f"Epoch {epoch:03d} | Loss {loss.item():.4f} | "
              f"Train {train_acc:.3f} | Val {valid_acc:.3f} | Test {test_acc:.3f}")

Epoch 001 | Loss 0.6409 | Train 0.585 | Val 0.578 | Test 0.579
Epoch 010 | Loss 0.6187 | Train 0.651 | Val 0.646 | Test 0.648
Epoch 020 | Loss 0.5961 | Train 0.725 | Val 0.720 | Test 0.723
Epoch 030 | Loss 0.5753 | Train 0.765 | Val 0.760 | Test 0.761
Epoch 040 | Loss 0.5561 | Train 0.792 | Val 0.789 | Test 0.789
Epoch 050 | Loss 0.5384 | Train 0.812 | Val 0.809 | Test 0.810
Epoch 060 | Loss 0.5219 | Train 0.828 | Val 0.824 | Test 0.827
Epoch 070 | Loss 0.5066 | Train 0.840 | Val 0.836 | Test 0.841
Epoch 080 | Loss 0.4924 | Train 0.849 | Val 0.846 | Test 0.850
Epoch 090 | Loss 0.4790 | Train 0.857 | Val 0.854 | Test 0.857
Epoch 100 | Loss 0.4665 | Train 0.863 | Val 0.860 | Test 0.863
Epoch 110 | Loss 0.4548 | Train 0.868 | Val 0.866 | Test 0.869
Epoch 120 | Loss 0.4438 | Train 0.873 | Val 0.870 | Test 0.872
Epoch 130 | Loss 0.4335 | Train 0.876 | Val 0.873 | Test 0.875
Epoch 140 | Loss 0.4237 | Train 0.879 | Val 0.876 | Test 0.879
Epoch 150 | Loss 0.4145 | Train 0.882 | Val 0.878 | Tes

The model is done, so let's make a simple little app to use it. Here we're going to put the model weights plus the scaler stats into one dictionary.

In [None]:
import torch

# Bundle model state + scaler stats into one dictionary
checkpoint = {
    "model_state": model.state_dict(),
    "means": means.values.astype("float32"),
    "stds": stds.values.astype("float32"),
    "feature_cols": feature_cols
}

torch.save(checkpoint, "student_grade_model.pkl")
print("Saved model to student_grade_model.pkl")


Once the model is saved you can load it again. This cell lets you reload the saved model.

In [None]:
import torch
import torch.nn as nn

class StudentGradePredictor(nn.Module):
    def __init__(self, num_inputs, hidden_size=32):
        super().__init__()
        self.fc1 = nn.Linear(num_inputs, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, 1)
    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

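# Note: on PyTorch 2.6+, torch.load defaults to weights_only=True, which rejects the NumPy
# arrays stored in this checkpoint; pass weights_only=False when loading a file you trust
checkpoint = torch.load("student_grade_model.pkl", map_location="cpu")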

# Restore model
num_inputs = len(checkpoint["feature_cols"])
model = StudentGradePredictor(num_inputs, hidden_size=32)
model.load_state_dict(checkpoint["model_state"])
model.eval()

# Restore scaler stats
means = checkpoint["means"]
stds = checkpoint["stds"]
feature_cols = checkpoint["feature_cols"]

print("Model and scaler loaded successfully ✅")
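
With the model and scaler stats restored, a prediction for one new student could look like the sketch below. `predict_pass_probability` is a hypothetical helper, not part of the notebook, and the demo model, means, and stds are stand-ins with untrained weights so the block runs on its own. In the notebook you would pass the restored `model`, `means`, and `stds` instead. It assumes the inputs are given in the same order as `feature_cols`.

```python
import numpy as np
import torch
import torch.nn as nn


class StudentGradePredictor(nn.Module):
    def __init__(self, num_inputs, hidden_size=32):
        super().__init__()
        self.fc1 = nn.Linear(num_inputs, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, 1)

    def forward(self, x):
        return self.fc2(self.relu(self.fc1(x)))


def predict_pass_probability(model, means, stds, hours, attendance, prev_grade):
    """Hypothetical helper: standardize one student's features and return P(pass)."""
    raw = np.array([[hours, attendance, prev_grade]], dtype="float32")
    scaled = (raw - means) / stds           # same scaling as during training
    with torch.no_grad():                   # no gradients needed for inference
        logit = model(torch.from_numpy(scaled))
    return torch.sigmoid(logit).item()      # logit -> probability


# Stand-ins so the sketch is self-contained (untrained weights, made-up stats)
torch.manual_seed(0)
demo_model = StudentGradePredictor(num_inputs=3).eval()
demo_means = np.array([5.0, 50.0, 50.0], dtype="float32")
demo_stds = np.array([2.9, 28.9, 28.9], dtype="float32")

prob = predict_pass_probability(
    demo_model, demo_means, demo_stds, hours=6.0, attendance=90.0, prev_grade=75.0
)
print(f"Pass probability: {prob:.3f}")
print("Prediction:", "pass" if prob >= 0.5 else "fail")
```

The 0.5 cutoff mirrors the threshold used in the accuracy function, so the app and the evaluation agree on what counts as a pass.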
