# Neural Networks Homework

Welcome to your homework on **Neural Networks with PyTorch**!

In this notebook, you will practice the concepts from today's lesson:
- Data preprocessing (missing values, encoding, normalization)
- Building a Perceptron model with `torch.nn`
- Training loop (forward pass, loss, backward pass, optimizer)
- Interpreting learned parameters

**Instructions:**
- Fill in the code cells marked with `# YOUR CODE HERE`
- Do NOT change any other code
- Make sure the CSV file `StudentPerformanceFactors.csv` is in the same folder as this notebook

---

In [None]:
import torch
import pandas as pd
import numpy as np

---

## Question 1: Conceptual Questions

Answer the following questions by assigning the correct option letter (as a string) to each variable.

**1a)** What type of machine learning trains a model on data that includes both inputs and known outputs?
- A: Unsupervised Learning
- B: Supervised Learning
- C: Reinforcement Learning
- D: Transfer Learning

**1b)** What is the difference between an **algorithm** and a **model**?
- A: An algorithm is the trained result; a model is the procedure
- B: An algorithm is a set of rules for learning; a model is the result after training with learned parameters
- C: They are the same thing
- D: A model is always a neural network; an algorithm is not

**1c)** A Perceptron with an identity activation function is mathematically equivalent to:
- A: Logistic Regression
- B: K-Nearest Neighbors
- C: Multiple Linear Regression
- D: Decision Tree

**1d)** Why do we normalize features before training a model?
- A: To make the dataset smaller
- B: To remove missing values
- C: So that features with larger scales don't dominate those with smaller scales
- D: To convert categorical data to numbers

In [None]:
# YOUR CODE HERE
answer_1a = "__"  # Replace __ with A, B, C, or D
answer_1b = "__"
answer_1c = "__"
answer_1d = "__"

---

## Question 2: Handling Missing Values

The small dataset below has missing values. Your task is to:

1. Find which columns have missing values
2. Fill the missing values in `City` with the **mode** (most frequent value)
3. Fill the missing values in `Rating` with the **mean**
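
As a rough sketch of the two fill strategies, here they are on separate toy data (the names `colors` and `scores` are made up for illustration and are not part of the homework dataset). `Series.mode()` returns a Series because there can be ties, so `[0]` picks the first entry; `Series.mean()` skips missing values by default.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the homework DataFrame
colors = pd.Series(["red", "blue", None, "red"])
scores = pd.Series([1.0, None, 3.0])

# mode() returns a Series (there can be ties), so take the first entry
colors_filled = colors.fillna(colors.mode()[0])

# mean() ignores NaN by default, so the mean here is (1.0 + 3.0) / 2
scores_filled = scores.fillna(scores.mean())

print(colors_filled.tolist())
print(scores_filled.tolist())
```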

In [None]:
# --- Dataset ---
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve', 'Frank'],
    'City': ['London', 'London', None, 'Paris', 'London', None],
    'Rating': [4.5, None, 3.8, 4.2, None, 3.5]
})

print("Before:")
print(df)
print("\nMissing values:")
print(df.isnull().sum())

In [None]:
# YOUR CODE HERE
# Fill missing values in 'City' with the mode (most frequent value)
df['City'] = ...

# Fill missing values in 'Rating' with the mean
df['Rating'] = ...

print("After:")
print(df)
print("\nMissing values:")
print(df.isnull().sum())

---

## Question 3: Encoding Categorical Data

Given the dataset below:

1. Apply **Label Encoding** to the binary column `Has_Car` (Yes -> 1, No -> 0)
2. Apply **One-Hot Encoding** to the multi-class column `Department`

**Reminder:**
- Label Encoding is for binary (2 options) or ordinal data
- One-Hot Encoding is for nominal (unranked) multi-class data
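
The sketch below shows both encodings on a small made-up table (the `toy` DataFrame and its column names are assumptions for illustration only). It uses `Series.map` for label encoding and `pd.get_dummies` for one-hot encoding; passing `dtype=int` is optional and only makes the indicator columns print as 0/1.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the homework DataFrame
toy = pd.DataFrame({
    "Member": ["yes", "no", "yes"],
    "Plan": ["basic", "pro", "team"],
})

# Label encoding: map each of the two categories to an integer
toy["Member"] = toy["Member"].map({"yes": 1, "no": 0})

# One-hot encoding: one indicator column per category of 'Plan'
toy = pd.get_dummies(toy, columns=["Plan"], dtype=int)

print(toy)
```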

In [None]:
# --- Dataset ---
df_enc = pd.DataFrame({
    'Employee': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'Has_Car': ['Yes', 'No', 'Yes', 'No', 'Yes'],
    'Department': ['Sales', 'Engineering', 'Sales', 'HR', 'Engineering'],
    'Salary': [50000, 70000, 52000, 60000, 75000]
})

print("Before encoding:")
print(df_enc)

In [None]:
# YOUR CODE HERE

# Step 1: Label encode 'Has_Car' (Yes -> 1, No -> 0)
df_enc['Has_Car'] = ...

# Step 2: One-hot encode 'Department'
df_enc = ...

print("After encoding:")
print(df_enc)

---

## Question 4: Z-Score Normalization

Implement the Z-score normalization function:

$$X_{\text{norm}} = \frac{X - \mu}{\sigma}$$

Then apply it to the `salaries` array below (the `Salary` values from Question 3).
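
To make the formula concrete, here is a small worked example on made-up numbers (the `values` array is an assumption, not the homework data). Note that NumPy's `std` computes the population standard deviation (`ddof=0`) by default.

```python
import numpy as np

# Hypothetical toy values, not the homework salaries
values = np.array([2.0, 4.0, 6.0])

mean = values.mean()   # 4.0
std = values.std()     # population standard deviation (ddof=0)

z = (values - mean) / std

# After z-scoring, the values have mean 0 and standard deviation 1
print(z)
print(z.mean(), z.std())
```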

In [None]:
salaries = np.array([50000, 70000, 52000, 60000, 75000], dtype=np.float64)
print(f"Original salaries: {salaries}")

# YOUR CODE HERE
# Step 1: Compute the mean
mu = ...

# Step 2: Compute the standard deviation
sigma = ...

# Step 3: Apply z-score normalization
salaries_norm = ...

print(f"Mean: {mu}")
print(f"Std: {sigma:.2f}")
print(f"Normalized salaries: {salaries_norm}")

---

## Question 5: Encoding Questions

For each column described below, decide whether you should use **Label Encoding** or **One-Hot Encoding**.

**5a)** A column `Color` with values: `Red`, `Blue`, `Green`
- A: Label Encoding
- B: One-Hot Encoding

**5b)** A column `Smoker` with values: `Yes`, `No`
- A: Label Encoding
- B: One-Hot Encoding

**5c)** A column `Education` with values: `High School`, `Bachelor`, `Master`, `PhD` (assume the values have a clear ranking)
- A: Label Encoding
- B: One-Hot Encoding

**5d)** A column `Country` with values: `UK`, `France`, `Germany`, `Italy`
- A: Label Encoding
- B: One-Hot Encoding

In [None]:
# YOUR CODE HERE
answer_5a = "__"  # Replace __ with A or B
answer_5b = "__"
answer_5c = "__"
answer_5d = "__"

---

## Question 6: Build and Train a Perceptron

Now let's put it all together! You will build a **Perceptron** model to predict a student's exam score using the `StudentPerformanceFactors.csv` dataset.

The data preprocessing has been done for you. Your tasks are to:

1. Define the Perceptron model class
2. Create the loss function and optimizer
3. Implement the training loop

In [None]:
# --- Data Preprocessing (already done for you) ---

# Load the data
data = pd.read_csv('StudentPerformanceFactors.csv')

# Fill missing values
data['Teacher_Quality'] = data['Teacher_Quality'].fillna(data['Teacher_Quality'].mode()[0])
data['Parental_Education_Level'] = data['Parental_Education_Level'].fillna(data['Parental_Education_Level'].mode()[0])
data['Distance_from_Home'] = data['Distance_from_Home'].fillna(data['Distance_from_Home'].mode()[0])

# Label encode binary columns
binary_cols = [col for col in data.columns if data[col].dtype == 'object' and data[col].nunique() == 2]
for col in binary_cols:
    data[col] = data[col].map({data[col].unique()[0]: 0, data[col].unique()[1]: 1})

# One-hot encode multi-class columns
multi_cols = [col for col in data.columns if data[col].dtype == 'object' and data[col].nunique() > 2]
data = pd.get_dummies(data, columns=multi_cols)

# Normalize features
for column in data.columns:
    if column != 'Exam_Score':
        mu = np.mean(data[column])
        sigma = np.std(data[column])
        data[column] = (data[column] - mu) / sigma

# Split features and target
x = data.drop('Exam_Score', axis=1)
y = data['Exam_Score']

# Convert to tensors
x_tensor = torch.tensor(x.values, dtype=torch.float32)
y_tensor = torch.tensor(y.values, dtype=torch.float32).reshape(-1, 1)

input_size = x_tensor.shape[1]
print(f"Input features: {input_size}")
print(f"x_tensor shape: {x_tensor.shape}")
print(f"y_tensor shape: {y_tensor.shape}")

### Step 1: Define the Perceptron Model

Create a class called `Perceptron` that:
- Inherits from `torch.nn.Module`
- Has a single `torch.nn.Linear` layer with the correct input and output sizes
- Has a `forward` method that passes the input through the linear layer
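
Before writing the class, it may help to see how a `torch.nn.Linear` layer handles shapes. This is only a sketch with arbitrary sizes (3 input features, 4 samples), not the homework model. It also checks that the layer computes `x @ W.T + b`.

```python
import torch

# Hypothetical sizes chosen only for illustration
layer = torch.nn.Linear(in_features=3, out_features=1)

batch = torch.randn(4, 3)   # 4 samples, 3 features each
out = layer(batch)

print(out.shape)            # torch.Size([4, 1])
print(layer.weight.shape)   # torch.Size([1, 3])
print(layer.bias.shape)     # torch.Size([1])

# The layer is a weighted sum of the inputs plus a bias
manual = batch @ layer.weight.T + layer.bias
print(torch.allclose(out, manual))
```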

In [None]:
# YOUR CODE HERE
class Perceptron(torch.nn.Module):

    def __init__(self, input_size):
        super().__init__()
        # Create a linear layer: input_size -> 1
        self.linear = ...

    def forward(self, x):
        # Pass x through the linear layer and return the result
        return ...

# Create the model
model = Perceptron(input_size)
print(f"Model: {model}")
print(f"Number of parameters: {sum(p.numel() for p in model.parameters())}")

### Step 2: Define Loss Function and Optimizer

- Use `torch.nn.MSELoss()` as the loss function
- Use `torch.optim.SGD` as the optimizer with `learning_rate = 0.01`
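
The following sketch works through what these two objects compute, using made-up tensors (all names prefixed with `demo_` are assumptions for illustration). The first part checks `MSELoss` against the mean of squared differences by hand; the second shows a single SGD update, `w <- w - lr * grad`, on one parameter.

```python
import torch

# --- MSE on hypothetical predictions and targets ---
demo_criterion = torch.nn.MSELoss()
demo_pred = torch.tensor([[2.0], [4.0]])
demo_target = torch.tensor([[1.0], [6.0]])

# ((2 - 1)^2 + (4 - 6)^2) / 2 = (1 + 4) / 2 = 2.5
demo_loss = demo_criterion(demo_pred, demo_target)
demo_manual = ((demo_pred - demo_target) ** 2).mean()
print(demo_loss.item(), demo_manual.item())

# --- One SGD step on a single hypothetical parameter ---
demo_w = torch.tensor([1.0], requires_grad=True)
demo_optimizer = torch.optim.SGD([demo_w], lr=0.1)

demo_objective = (demo_w ** 2).sum()   # gradient is 2 * w = 2.0
demo_objective.backward()
demo_optimizer.step()                  # w becomes 1.0 - 0.1 * 2.0 = 0.8
print(demo_w.item())
```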

In [None]:
# YOUR CODE HERE
criterion = ...  # MSE Loss
learning_rate = 0.01
optimizer = ...  # SGD optimizer with model parameters and learning rate

### Step 3: Train the Model

Implement the training loop for **1000 epochs**. For each epoch:
1. Forward pass: get predictions from the model
2. Compute the loss using the criterion
3. Backward pass: call `loss.backward()`
4. Update weights: call `optimizer.step()`
5. Clear gradients: call `optimizer.zero_grad()`
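6. Record the loss: append `loss.item()` to `loss_history` (the final `print` reads `loss_history[-1]`)
7. Print the loss every 100 epochs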
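
The steps above can be sketched on a tiny made-up regression problem (fitting `y = 2x + 1` from four points). Everything here, including the `toy_` names, the learning rate, and the number of steps, is an assumption for illustration and differs from the homework setup.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy problem: learn y = 2x + 1 from four points
x_toy = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
y_toy = 2 * x_toy + 1

toy_model = torch.nn.Linear(1, 1)
toy_criterion = torch.nn.MSELoss()
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.05)

toy_losses = []
for step in range(200):
    pred = toy_model(x_toy)              # forward pass
    loss = toy_criterion(pred, y_toy)    # compute the loss
    loss.backward()                      # backward pass
    toy_optimizer.step()                 # update weights
    toy_optimizer.zero_grad()            # clear gradients for the next step
    toy_losses.append(loss.item())       # record the loss as a plain float

print(f"first loss: {toy_losses[0]:.4f}, last loss: {toy_losses[-1]:.4f}")
```

Calling `zero_grad()` matters because PyTorch accumulates gradients across `backward()` calls; without it, each update would use the sum of all previous gradients.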

In [None]:
# YOUR CODE HERE
num_epochs = 1000
loss_history = []

for epoch in range(num_epochs):
    pass  # Replace with your implementation

print("\nTraining complete!")
print(f"Final Loss: {loss_history[-1]:.4f}")

---

## Question 7: Interpret the Results

After training, examine the model's learned weights and answer the questions below.

In [None]:
# --- This cell extracts and displays the model's learned parameters ---
weights = model.linear.weight.data.numpy().flatten()
bias = model.linear.bias.data.numpy()[0]

feature_importance = pd.DataFrame({
    'Feature': x.columns,
    'Weight': weights
})
feature_importance['Abs_Weight'] = feature_importance['Weight'].abs()
feature_importance = feature_importance.sort_values('Abs_Weight', ascending=False)

print(f"Bias: {bias:.4f}")
print("\nTop 5 most important features:")
print(feature_importance.head(5).to_string(index=False))

**7a)** If a feature has a **positive** weight, what does that mean?
- A: As the feature increases, the predicted exam score decreases
- B: As the feature increases, the predicted exam score increases
- C: The feature has no effect on the prediction
- D: The feature should be removed

**7b)** If a feature has a **negative** weight, what does that mean?
- A: The feature is not important
- B: As the feature increases, the predicted exam score increases
- C: As the feature increases, the predicted exam score decreases
- D: The model is broken

**7c)** Which of the following tells you how **important** a feature is to the model? (The features were normalized, so their weights are on a comparable scale.)
- A: The sign (positive/negative) of the weight
- B: The absolute value (magnitude) of the weight
- C: The name of the feature
- D: The order in which features appear in the dataset

In [None]:
# YOUR CODE HERE
answer_7a = "__"  # Replace __ with A, B, C, or D
answer_7b = "__"
answer_7c = "__"

---

## Question 8: Training Loop Components

Put the training loop steps in the correct order by assigning numbers 1-5.

- `optimizer.zero_grad()` - Clear previous gradients
- `loss = criterion(predictions, y_tensor)` - Calculate loss
- `predictions = model(x_tensor)` - Forward pass
- `optimizer.step()` - Update weights
- `loss.backward()` - Compute gradients

In [None]:
# YOUR CODE HERE
# Assign the correct order number (1-5) for each step
step_forward_pass = ...       # predictions = model(x_tensor)
step_calculate_loss = ...     # loss = criterion(predictions, y_tensor)
step_backward_pass = ...      # loss.backward()
step_update_weights = ...     # optimizer.step()
step_zero_gradients = ...     # optimizer.zero_grad()

---

## Bonus: Make a Prediction

Using the trained model, predict the exam score for a new student. The data has already been preprocessed (normalized and encoded) for you.

Remember: to make a prediction with a trained PyTorch model, you pass the input tensor through the model.
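
As a small sketch of inference, the snippet below uses an untrained stand-in layer (`demo_model`, with an arbitrary 3 input features) in place of the homework model. Inside `torch.no_grad()`, PyTorch does not track operations for backpropagation, which is why the output reports `requires_grad=False`.

```python
import torch

# Hypothetical stand-in for a trained model
demo_model = torch.nn.Linear(3, 1)
demo_input = torch.zeros(1, 3)   # one sample with three features

with torch.no_grad():
    demo_output = demo_model(demo_input)

# Inside no_grad, the result is not tracked for backpropagation
print(demo_output.shape, demo_output.requires_grad)

# With an all-zero input, a linear layer returns just its bias
print(torch.allclose(demo_output, demo_model.bias.detach().reshape(1, 1)))
```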

In [None]:
# A new student's data (already preprocessed)
# This represents a student who studies a lot, has high attendance, etc.
new_student = torch.zeros(1, input_size)  # Start with zeros
new_student[0, 0] = 1.5   # High hours studied (normalized)
new_student[0, 1] = 1.5   # High attendance (normalized)
new_student[0, 4] = 1.0   # High previous scores (normalized)

# YOUR CODE HERE
# Use the model to predict this student's exam score
# Hint: wrap the prediction in torch.no_grad() since we're not training
with torch.no_grad():
    predicted_score = ...

print(f"Predicted exam score: {predicted_score.item():.2f}")

---

## Summary

In this homework you practiced:

1. **Core concepts** of supervised learning, algorithms vs models, and perceptrons
2. **Handling missing values** using mode and mean
3. **Encoding categorical data** using label encoding and one-hot encoding
4. **Z-score normalization** to standardize feature scales
5. **Choosing the right encoding** for different types of data
6. **Building a Perceptron** model using `torch.nn.Module`
7. **Interpreting model weights** to understand feature importance
8. **Understanding the training loop** step by step

Great work! You now have the foundations for building neural networks!