# Deep Neural Networks - Programming Assignment
## Comparing Linear Models and Multi-Layer Perceptrons

**Student Name:** ___________________  
**Student ID:** ___________________  
**Date:** ___________________

---

## ‚ö†Ô∏è IMPORTANT INSTRUCTIONS

1. **Complete ALL sections** marked with `TODO`
2. **DO NOT modify** the `get_assignment_results()` function structure
3. **Fill in all values accurately** - these will be auto-verified
4. **After submission**, you'll receive a verification quiz based on YOUR results
5. **Run all cells** before submitting (Kernel ‚Üí Restart & Run All)

---

In [2]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import time
import warnings
warnings.filterwarnings('ignore')

# Set random seed for reproducibility
np.random.seed(42)
print('‚úì Libraries imported successfully')

‚úì Libraries imported successfully


## Section 1: Dataset Selection and Loading

**Requirements:**
- ‚â•500 samples
- ‚â•5 features
- Public dataset (UCI/Kaggle)
- Regression OR Classification problem

In [3]:
# TODO: Load your dataset
# Example: data = pd.read_csv('your_dataset.csv')
df = pd.read_csv("winequality-red.csv", sep=';')

# Dataset information (TODO: Fill these)
dataset_name = "Wine Quality"  # e.g., "Breast Cancer Wisconsin"
dataset_source = "UCI Machine Learning Repository"  # e.g., "UCI ML Repository"
n_samples = 1599      # Total number of rows
n_features = 11     # Number of features (excluding target)
problem_type = "regression"  # "regression" or "binary_classification" or "multiclass_classification"

# Problem statement (TODO: Write 2-3 sentences)
problem_statement = """
Predicting sensory wine quality scores from physicochemical measurements of red Vinho Verde wines.
This helps winemakers understand how acidity, alcohol, sulphates, and other chemical properties relate to perceived quality,
which can guide process control and product improvement.
"""

# Primary evaluation metric (TODO: Fill this)
primary_metric = "rmse"  # e.g., "recall", "accuracy", "rmse", "r2"

# Metric justification (TODO: Write 2-3 sentences)
metric_justification = """
Root Mean Squared Error (RMSE) directly measures the typical deviation between predicted and true quality scores in the
same units as the target, making the error magnitude easy to interpret.‚Äã
Since this is a numeric quality score with no dominant threshold of interest, minimizing overall prediction error is more
important than optimizing classification-style metrics, so RMSE is an appropriate primary metric.
"""

print(f"Dataset: {dataset_name}")
print(f"Source: {dataset_source}")
print(f"Samples: {n_samples}, Features: {n_features}")
print(f"Problem Type: {problem_type}")
print(f"Primary Metric: {primary_metric}")

Dataset: Wine Quality
Source: UCI Machine Learning Repository
Samples: 1599, Features: 11
Problem Type: regression
Primary Metric: rmse


## Section 2: Data Preprocessing

Preprocess your data:
1. Handle missing values
2. Encode categorical variables
3. Split into train/test sets
4. Scale features

In [5]:
# TODO: Preprocess your data
X = df.drop('quality', axis=1).values
y = df['quality'].values
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Fill these after preprocessing
train_samples = 1279       # Number of training samples
test_samples = 320        # Number of test samples
train_test_ratio = train_samples / (train_samples + test_samples)  # e.g., 0.8 for 80-20 split

print(f"Train samples: {train_samples}")
print(f"Test samples: {test_samples}")
print(f"Split ratio: {train_test_ratio:.1%}")

Train samples: 1279
Test samples: 320
Split ratio: 80.0%


## Section 3: Baseline Model Implementation

Implement from scratch (NO sklearn models!):
- Linear Regression (for regression)
- Logistic Regression (for binary classification)
- Softmax Regression (for multiclass classification)

**Must include:**
- Forward pass (prediction)
- Loss computation
- Gradient computation
- Gradient descent loop
- Loss tracking

In [7]:
class BaselineModel:
    """
    Baseline linear model with gradient descent
    Implemented here as Linear Regression (MSE loss)
    """
    def __init__(self, learning_rate=0.01, n_iterations=1000):
        self.lr = learning_rate
        self.n_iterations = n_iterations
        self.weights = None
        self.bias = None
        self.loss_history = []
    
    def fit(self, X, y):
        """
        Gradient descent training
        
        Steps:
        1. Initialize weights and bias
        2. For each iteration:
           a. Compute predictions (forward pass)
           b. Compute loss (MSE)
           c. Compute gradients
           d. Update weights and bias
           e. Store loss in self.loss_history
        """
        n_samples, n_features = X.shape
        
        # 1. Initialize parameters
        self.weights = np.zeros(n_features)   # shape: (n_features,)
        self.bias = 0.0
        
        # Ensure y is 1D
        y = y.ravel()
        
        # 2. Gradient descent loop
        for _ in range(self.n_iterations):
            # a. Forward pass
            y_pred = np.dot(X, self.weights) + self.bias   # (n_samples,)
            
            # b. Loss (Mean Squared Error)
            errors = y_pred - y                            # (n_samples,)
            loss = np.mean(errors ** 2)
            
            # c. Gradients of MSE w.r.t. weights and bias
            dw = (2 / n_samples) * np.dot(X.T, errors)     # (n_features,)
            db = (2 / n_samples) * np.sum(errors)          # scalar
            
            # d. Parameter update
            self.weights -= self.lr * dw
            self.bias    -= self.lr * db
            
            # e. Record loss
            self.loss_history.append(loss)
        
        return self
    
    def predict(self, X):
        """
        For regression: return linear outputs.
        """
        return np.dot(X, self.weights) + self.bias

print("‚úì Baseline model class defined")

‚úì Baseline model class defined


In [8]:
# Train baseline model
print("Training baseline model...")
baseline_start_time = time.time()

# Initialize and train your baseline model
baseline_model = BaselineModel(learning_rate=0.01, n_iterations=1000)
baseline_model.fit(X_train_scaled, y_train)  # Uses scaled features for better convergence

# Make predictions on test set
baseline_predictions = baseline_model.predict(X_test_scaled)

baseline_training_time = time.time() - baseline_start_time
print(f"‚úì Baseline training completed in {baseline_training_time:.2f}s")
print(f"‚úì Loss decreased from {baseline_model.loss_history[0]:.4f} to {baseline_model.loss_history[-1]:.4f}")


Training baseline model...
‚úì Baseline training completed in 0.04s
‚úì Loss decreased from 32.2791 to 0.4242


## Section 4: Multi-Layer Perceptron Implementation

Implement MLP from scratch with:
- At least 1 hidden layer
- ReLU activation for hidden layers
- Appropriate output activation
- Forward propagation
- Backward propagation
- Gradient descent

In [15]:
class MLP:
    """
    Multi-Layer Perceptron implemented from scratch
    """
    def __init__(self, architecture, learning_rate=0.01, n_iterations=1000):
        self.architecture = architecture
        self.lr = learning_rate
        self.n_iterations = n_iterations
        self.parameters = {}
        self.loss_history = []
        self.cache = {}
    
    def initialize_parameters(self):
        """Xavier initialization for weights, zeros for biases"""
        np.random.seed(42)
        for l in range(1, len(self.architecture)):
            n_l, n_l1 = self.architecture[l], self.architecture[l-1]
            self.parameters[f'W{l}'] = np.random.randn(n_l, n_l1) * np.sqrt(2.0 / n_l1)
            self.parameters[f'b{l}'] = np.zeros((n_l, 1))
    
    def relu(self, Z):
        """ReLU activation function"""
        return np.maximum(0, Z)
    
    def relu_derivative(self, Z):
        """ReLU derivative"""
        return (Z > 0).astype(float)
    
    def sigmoid(self, Z):
        """Sigmoid activation (for binary classification output)"""
        return 1 / (1 + np.exp(-np.clip(Z, -500, 500)))
    
    def forward_propagation(self, X):
        """Fixed forward pass with proper bias broadcasting"""
        self.cache['A0'] = X.reshape(-1, self.architecture[0])  # Ensure (m, n_input)
        
        for l in range(1, len(self.architecture)):
            # Z = W @ A_prev + b  (broadcast b across samples)
            Z = self.parameters[f'W{l}'] @ self.cache[f'A{l-1}'] + self.parameters[f'b{l}']
            self.cache[f'Z{l}'] = Z
            
            if l == len(self.architecture) - 1:  # Output layer
                A = Z  # Linear activation for regression
            else:  # Hidden layers
                A = self.relu(Z)
            
            self.cache[f'A{l}'] = A
        
        return self.cache[f'A{len(self.architecture)-1}']
  

    def backward_propagation(self, X, y):
        """Corrected backward pass with proper dimension handling"""
        m = X.shape[0]
        L = len(self.architecture) - 1
        grads = {}
        
        # Output layer gradients (MSE loss)
        y_pred = self.cache[f'A{L}']  # (m, 1)
        dZ = 2 * (y_pred - y) / m    # (m, 1)
        
        grads[f'dW{L}'] = (1/m) * (dZ @ self.cache[f'A{L-1}'].T)  # (1, n[L-1])
        grads[f'db{L}'] = np.sum(dZ, axis=0, keepdims=True)       # (1, 1)
        
        # Backpropagate through hidden layers
        for l in reversed(range(1, L)):
            # dZ[l] = dA[l] * g'(Z[l]) where dA[l] = W[l+1].T @ dZ[l+1]
            dA = grads[f'dW{l+1}'].T @ dZ  # (n[l], m) @ (m, n[l+1]) -> (n[l], 1)? Wait no
            
            # CORRECT: dZ[l] = (W[l+1].T @ dZ[l+1]) * g'(Z[l])
            dZ = (self.parameters[f'W{l+1}'].T @ dZ) * self.relu_derivative(self.cache[f'Z{l}'])
            
            grads[f'dW{l}'] = (1/m) * (dZ @ self.cache[f'A{l-1}'].T)
            grads[f'db{l}'] = np.sum(dZ, axis=0, keepdims=True)
        
        return grads

    
    def update_parameters(self, grads):
        """Update all parameters using gradient descent"""
        L = len(self.architecture) - 1
        for l in range(1, L+1):
            self.parameters[f'W{l}'] -= self.lr * grads[f'dW{l}']
            self.parameters[f'b{l}'] -= self.lr * grads[f'db{l}']
    
    def compute_loss(self, y_pred, y_true):
        """MSE loss for regression"""
        return np.mean((y_pred - y_true) ** 2)
    
    def fit(self, X, y):
        """Fixed training loop with proper input shapes"""
        self.initialize_parameters()
        
        # Ensure proper shapes
        if X.ndim == 1:
            X = X.reshape(-1, 1)
        y = y.reshape(-1, 1)  # (m, 1)
        
        for i in range(self.n_iterations):
            # Forward
            y_pred = self.forward_propagation(X)
            
            # Loss
            loss = self.compute_loss(y_pred, y)
            self.loss_history.append(loss)
            
            # Backward + Update
            grads = self.backward_propagation(X, y)
            self.update_parameters(grads)
        
        return self
    
    def predict(self, X):
        """Prediction using forward propagation"""
        return self.forward_propagation(X)

print("‚úì MLP class defined")

‚úì MLP class defined


In [16]:
# Train MLP
print("Training MLP...")
mlp_start_time = time.time()

# Define architecture: [11 input features, 16 hidden, 8 hidden, 1 output]
mlp_architecture = [11, 16, 8, 1]  
mlp_model = MLP(architecture=mlp_architecture, learning_rate=0.001, n_iterations=2000)
mlp_model.fit(X_train_scaled, y_train)

# Make predictions on test set
mlp_predictions = mlp_model.predict(X_test_scaled)

mlp_training_time = time.time() - mlp_start_time
print(f"‚úì MLP training completed in {mlp_training_time:.2f}s")
print(f"‚úì Loss decreased from {mlp_model.loss_history[0]:.4f} to {mlp_model.loss_history[-1]:.4f}")

Training MLP...


ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1279 is different from 11)

## Section 5: Evaluation and Metrics

Calculate appropriate metrics for your problem type

In [None]:
def calculate_metrics(y_true, y_pred, problem_type):
    """
    TODO: Calculate appropriate metrics based on problem type
    
    For regression: MSE, RMSE, MAE, R¬≤
    For classification: Accuracy, Precision, Recall, F1
    """
    metrics = {}
    
    if problem_type == "regression":
        # TODO: Calculate regression metrics
        pass
    elif problem_type in ["binary_classification", "multiclass_classification"]:
        # TODO: Calculate classification metrics
        pass
    
    return metrics

# Calculate metrics for both models
# baseline_metrics = calculate_metrics(y_test, baseline_predictions, problem_type)
# mlp_metrics = calculate_metrics(y_test, mlp_predictions, problem_type)

print("Baseline Model Performance:")
# print(baseline_metrics)

print("\nMLP Model Performance:")
# print(mlp_metrics)

## Section 6: Visualization

Create visualizations:
1. Training loss curves
2. Performance comparison
3. Additional domain-specific plots

In [None]:
# 1. Training loss curves
plt.figure(figsize=(14, 5))

plt.subplot(1, 2, 1)
# TODO: Plot baseline loss
# plt.plot(baseline_model.loss_history, label='Baseline', color='blue')
plt.xlabel('Iteration')
plt.ylabel('Loss')
plt.title('Baseline Model - Training Loss')
plt.legend()
plt.grid(True, alpha=0.3)

plt.subplot(1, 2, 2)
# TODO: Plot MLP loss
# plt.plot(mlp_model.loss_history, label='MLP', color='red')
plt.xlabel('Iteration')
plt.ylabel('Loss')
plt.title('MLP Model - Training Loss')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
# 2. Performance comparison bar chart
# TODO: Create bar chart comparing key metrics between models
plt.figure(figsize=(10, 6))

# Example:
# metrics = ['Accuracy', 'Precision', 'Recall', 'F1']
# baseline_scores = [baseline_metrics[m] for m in metrics]
# mlp_scores = [mlp_metrics[m] for m in metrics]
# 
# x = np.arange(len(metrics))
# width = 0.35
# 
# plt.bar(x - width/2, baseline_scores, width, label='Baseline')
# plt.bar(x + width/2, mlp_scores, width, label='MLP')
# plt.xlabel('Metrics')
# plt.ylabel('Score')
# plt.title('Model Performance Comparison')
# plt.xticks(x, metrics)
# plt.legend()
# plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## Section 7: Analysis and Discussion

Write your analysis (minimum 200 words)

In [None]:
analysis_text = """
TODO: Write your analysis here (minimum 200 words)

Address these questions:
1. Which model performed better and by how much?
2. Why do you think one model outperformed the other?
3. What was the computational cost difference (training time)?
4. Any surprising findings or challenges you faced?
5. What insights did you gain about neural networks vs linear models?

Write your thoughtful analysis here. Be specific and reference your actual results.
Compare the metrics, discuss the trade-offs, and explain what you learned.
"""

print(f"Analysis word count: {len(analysis_text.split())} words")
if len(analysis_text.split()) < 200:
    print("‚ö†Ô∏è  Warning: Analysis should be at least 200 words")
else:
    print("‚úì Analysis meets word count requirement")

---
---

## ‚≠ê REQUIRED: Structured Output Function

### **DO NOT MODIFY THE STRUCTURE BELOW**

This function will be called by the auto-grader. Fill in all values accurately based on your actual results.

In [None]:
def get_assignment_results():
    """
    Return all assignment results in structured format.
    
    CRITICAL: Fill in ALL values based on your actual results!
    This will be automatically extracted and validated.
    """
    
    # Calculate loss convergence flags
    baseline_initial_loss = 0.0  # TODO: baseline_model.loss_history[0]
    baseline_final_loss = 0.0    # TODO: baseline_model.loss_history[-1]
    mlp_initial_loss = 0.0       # TODO: mlp_model.loss_history[0]
    mlp_final_loss = 0.0         # TODO: mlp_model.loss_history[-1]
    
    results = {
        # ===== Dataset Information =====
        'dataset_name': dataset_name,
        'dataset_source': dataset_source,
        'n_samples': n_samples,
        'n_features': n_features,
        'problem_type': problem_type,
        'problem_statement': problem_statement,
        
        # ===== Evaluation Setup =====
        'primary_metric': primary_metric,
        'metric_justification': metric_justification,
        'train_samples': train_samples,
        'test_samples': test_samples,
        'train_test_ratio': train_test_ratio,
        
        # ===== Baseline Model Results =====
        'baseline_model': {
            'model_type': '',  # 'linear_regression', 'logistic_regression', or 'softmax_regression'
            'learning_rate': 0.0,
            'n_iterations': 0,
            'initial_loss': baseline_initial_loss,
            'final_loss': baseline_final_loss,
            'training_time_seconds': baseline_training_time,
            
            # Metrics (fill based on your problem type)
            'test_accuracy': 0.0,      # For classification
            'test_precision': 0.0,     # For classification
            'test_recall': 0.0,        # For classification
            'test_f1': 0.0,            # For classification
            'test_mse': 0.0,           # For regression
            'test_rmse': 0.0,          # For regression
            'test_mae': 0.0,           # For regression
            'test_r2': 0.0,            # For regression
        },
        
        # ===== MLP Model Results =====
        'mlp_model': {
            'architecture': mlp_architecture,
            'n_hidden_layers': len(mlp_architecture) - 2 if len(mlp_architecture) > 0 else 0,
            'total_parameters': 0,     # TODO: Calculate total weights + biases
            'learning_rate': 0.0,
            'n_iterations': 0,
            'initial_loss': mlp_initial_loss,
            'final_loss': mlp_final_loss,
            'training_time_seconds': mlp_training_time,
            
            # Metrics
            'test_accuracy': 0.0,
            'test_precision': 0.0,
            'test_recall': 0.0,
            'test_f1': 0.0,
            'test_mse': 0.0,
            'test_rmse': 0.0,
            'test_mae': 0.0,
            'test_r2': 0.0,
        },
        
        # ===== Comparison =====
        'improvement': 0.0,            # MLP primary_metric - baseline primary_metric
        'improvement_percentage': 0.0,  # (improvement / baseline) * 100
        'baseline_better': False,       # True if baseline outperformed MLP
        
        # ===== Analysis =====
        'analysis': analysis_text,
        'analysis_word_count': len(analysis_text.split()),
        
        # ===== Loss Convergence Flags =====
        'baseline_loss_decreased': baseline_final_loss < baseline_initial_loss,
        'mlp_loss_decreased': mlp_final_loss < mlp_initial_loss,
        'baseline_converged': False,  # Optional: True if converged
        'mlp_converged': False,
    }
    
    return results

## Test Your Output

Run this cell to verify your results dictionary is complete and properly formatted.

In [None]:
# Test the output
import json

try:
    results = get_assignment_results()
    
    print("="*70)
    print("ASSIGNMENT RESULTS SUMMARY")
    print("="*70)
    print(json.dumps(results, indent=2, default=str))
    print("\n" + "="*70)
    
    # Check for missing values
    missing = []
    def check_dict(d, prefix=""):
        for k, v in d.items():
            if isinstance(v, dict):
                check_dict(v, f"{prefix}{k}.")
            elif (v == 0 or v == "" or v == 0.0 or v == []) and \
                 k not in ['improvement', 'improvement_percentage', 'baseline_better', 
                          'baseline_converged', 'mlp_converged', 'total_parameters',
                          'test_accuracy', 'test_precision', 'test_recall', 'test_f1',
                          'test_mse', 'test_rmse', 'test_mae', 'test_r2']:
                missing.append(f"{prefix}{k}")
    
    check_dict(results)
    
    if missing:
        print(f"‚ö†Ô∏è  Warning: {len(missing)} fields still need to be filled:")
        for m in missing[:15]:  # Show first 15
            print(f"  - {m}")
        if len(missing) > 15:
            print(f"  ... and {len(missing)-15} more")
    else:
        print("‚úÖ All required fields are filled!")
        print("\nüéâ You're ready to submit!")
        print("\nNext steps:")
        print("1. Kernel ‚Üí Restart & Clear Output")
        print("2. Kernel ‚Üí Restart & Run All")
        print("3. Verify no errors")
        print("4. Save notebook")
        print("5. Rename as: YourStudentID_assignment.ipynb")
        print("6. Submit to LMS")
        
except Exception as e:
    print(f"‚ùå Error in get_assignment_results(): {str(e)}")
    print("\nPlease fix the errors above before submitting.")

---

## üì§ Before Submitting - Final Checklist

- [ ] **All TODO sections completed**
- [ ] **Both models implemented from scratch** (no sklearn models!)
- [ ] **get_assignment_results() function filled accurately**
- [ ] **Loss decreases for both models**
- [ ] **Analysis ‚â• 200 words**
- [ ] **All cells run without errors** (Restart & Run All)
- [ ] **Visualizations created**
- [ ] **File renamed correctly**: YourStudentID_assignment.ipynb

---

## ‚è≠Ô∏è What Happens Next

After submission:
1. ‚úÖ Your notebook will be **auto-graded** (executes automatically)
2. ‚úÖ You'll receive a **verification quiz** (10 questions, 5 minutes)
3. ‚úÖ Quiz questions based on **YOUR specific results**
4. ‚úÖ Final score released after quiz validation

**The verification quiz ensures you actually ran your code!**

---

**Good luck! üöÄ**