<a href="https://colab.research.google.com/github/sprince0031/ICT-Python-ML/blob/main/Week%205/Notebooks/week5_solutions.ipynb" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python and ML Foundations: Session 5 - Solutions
## Perceptrons, MLPs & Advanced Metrics

Complete solutions for session 5 practice challenges.

## Utility code

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier, MLPRegressor
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.metrics import confusion_matrix, classification_report, mean_squared_error, r2_score
from sklearn.datasets import make_classification, fetch_california_housing

sns.set_style('whitegrid')
np.random.seed(42)

---
## Video 1: Perceptrons and MLPs - NAND Gate Solution

In [None]:
# Task 1: Create the NAND dataset
X_nand = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_nand = np.array([1, 1, 1, 0])

print("NAND Truth Table:")
print("Input | Output")
print("------|-------")
for i in range(len(X_nand)):
    print(f"{X_nand[i]} | {y_nand[i]}")

In [None]:
# Task 2: Train a Perceptron on NAND
perceptron_nand = Perceptron(max_iter=1000, random_state=42)
perceptron_nand.fit(X_nand, y_nand)

# Test
y_pred_perceptron = perceptron_nand.predict(X_nand)

print("\nPerceptron Predictions on NAND:")
print("Input | Actual | Predicted")
print("------|--------|----------")
for i in range(len(X_nand)):
    print(f"{X_nand[i]} | {y_nand[i]:6d} | {y_pred_perceptron[i]:9d}")

acc_perceptron = accuracy_score(y_nand, y_pred_perceptron)
print(f"\nPerceptron Accuracy: {acc_perceptron:.2f}")

In [None]:
# Task 3: Train an MLP on NAND
mlp_nand = MLPClassifier(hidden_layer_sizes=(4,), activation='relu',
                         max_iter=5000, random_state=42)
mlp_nand.fit(X_nand, y_nand)

# Test
y_pred_mlp = mlp_nand.predict(X_nand)

print("\nMLP Predictions on NAND:")
print("Input | Actual | Predicted")
print("------|--------|----------")
for i in range(len(X_nand)):
    print(f"{X_nand[i]} | {y_nand[i]:6d} | {y_pred_mlp[i]:9d}")

acc_mlp = accuracy_score(y_nand, y_pred_mlp)
print(f"\nMLP Accuracy: {acc_mlp:.2f}")

In [None]:
# Task 4: Compare results
print("\n" + "="*50)
print("COMPARISON: Perceptron vs MLP on NAND")
print("="*50)
print(f"Perceptron Accuracy: {acc_perceptron:.2f}")
print(f"MLP Accuracy:        {acc_mlp:.2f}")
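print("\nConclusion:")
if acc_perceptron == 1.0 and acc_mlp == 1.0:
    print("  ✅ Both models successfully solve NAND")
else:
    print("  ⚠️  At least one model did not reach 100%; check convergence settings (e.g. max_iter, tol)")
print("  ✅ NAND is linearly separable")
print("  ✅ A single perceptron can represent NAND")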
print("="*50)
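
To see why NAND is linearly separable, it can help to write down one separating line by hand. The sketch below is illustrative only: the weights `w` and bias `b` are hand-picked values, not the ones `Perceptron.fit` learns, and any line that puts `[1, 1]` on one side and the other three points on the other would work equally well.

```python
import numpy as np

# All four NAND input combinations and their target outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([1, 1, 1, 0])

# Hand-picked (hypothetical) parameters: fire when -x1 - x2 + 1.5 > 0
w = np.array([-1.0, -1.0])
b = 1.5

# Step activation: output 1 when the weighted sum is positive, else 0
manual_pred = (X @ w + b > 0).astype(int)

print("Manual predictions:", manual_pred)
print("Matches NAND:", np.array_equal(manual_pred, y))
```

Because a single weighted sum plus a threshold reproduces the truth table, no hidden layer is required for this gate.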

---
## Video 2: MLPs for Regression - California Housing Solution

In [None]:
# Task 1: Load California Housing dataset
california = fetch_california_housing()
X_cal = california.data
y_cal = california.target

print("California Housing Dataset:")
print(f"  Samples: {X_cal.shape[0]}")
print(f"  Features: {X_cal.shape[1]}")
print(f"  Target (median house value): {y_cal.min():.2f} to {y_cal.max():.2f} (in units of $100,000)")

In [None]:
# Task 2: Split and scale
X_train_cal, X_test_cal, y_train_cal, y_test_cal = train_test_split(
    X_cal, y_cal, test_size=0.2, random_state=42
)

scaler_cal = StandardScaler()
X_train_cal_scaled = scaler_cal.fit_transform(X_train_cal)
X_test_cal_scaled = scaler_cal.transform(X_test_cal)

print(f"Training samples: {X_train_cal.shape[0]}")
print(f"Test samples: {X_test_cal.shape[0]}")

In [None]:
# Task 3: Train MLP with identity activation
mlp_identity = MLPRegressor(hidden_layer_sizes=(20, 10), activation='identity',
                            max_iter=1000, random_state=42)
mlp_identity.fit(X_train_cal_scaled, y_train_cal)

y_pred_identity = mlp_identity.predict(X_test_cal_scaled)
r2_identity = r2_score(y_test_cal, y_pred_identity)
mse_identity = mean_squared_error(y_test_cal, y_pred_identity)

print("MLP with Identity Activation:")
print(f"  R² Score: {r2_identity:.4f}")
print(f"  MSE: {mse_identity:.4f}")

In [None]:
# Task 4: Train MLP with tanh activation
mlp_tanh = MLPRegressor(hidden_layer_sizes=(20, 10), activation='tanh',
                        max_iter=1000, random_state=42)
mlp_tanh.fit(X_train_cal_scaled, y_train_cal)

y_pred_tanh = mlp_tanh.predict(X_test_cal_scaled)
r2_tanh = r2_score(y_test_cal, y_pred_tanh)
mse_tanh = mean_squared_error(y_test_cal, y_pred_tanh)

print("MLP with Tanh Activation:")
print(f"  R² Score: {r2_tanh:.4f}")
print(f"  MSE: {mse_tanh:.4f}")

In [None]:
# Task 5: Compare R² scores
print("\n" + "="*50)
print("COMPARISON: Identity vs Tanh Activation")
print("="*50)
print(f"Identity R²: {r2_identity:.4f}")
print(f"Tanh R²:     {r2_tanh:.4f}")
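print(f"\nDifference (tanh - identity): {(r2_tanh - r2_identity):+.4f}")
if r2_tanh > r2_identity:
    print("\n✅ Non-linear activation (tanh) performs better!")
else:
    print("\n⚠️  Tanh did not outperform identity on this run")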
print("="*50)
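
One way to understand the ceiling on the identity network is that stacked layers without a non-linearity collapse into a single linear map. The sketch below checks this numerically with random, hypothetical weights for an 8 -> 20 -> 10 -> 1 network (the same layer sizes as above, but not the trained `mlp_identity` weights).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights for an 8 -> 20 -> 10 -> 1 network with identity activations
W1, b1 = rng.normal(size=(8, 20)), rng.normal(size=20)
W2, b2 = rng.normal(size=(20, 10)), rng.normal(size=10)
W3, b3 = rng.normal(size=(10, 1)), rng.normal(size=1)

X_demo = rng.normal(size=(5, 8))

# Forward pass layer by layer (identity activation = no non-linearity)
layered = ((X_demo @ W1 + b1) @ W2 + b2) @ W3 + b3

# Equivalent single linear map: one weight matrix and one bias
W_eq = W1 @ W2 @ W3
b_eq = (b1 @ W2 + b2) @ W3 + b3
collapsed = X_demo @ W_eq + b_eq

print("Same output:", np.allclose(layered, collapsed))
```

However many identity layers are stacked, the model can represent no more than a linear regression could; adding `tanh` or `relu` between layers is what breaks this equivalence.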

In [None]:
# Task 6: Visualize predictions
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Identity activation plot
axes[0].scatter(y_test_cal, y_pred_identity, alpha=0.5)
axes[0].plot([y_test_cal.min(), y_test_cal.max()], 
             [y_test_cal.min(), y_test_cal.max()], 'r--', lw=2)
axes[0].set_xlabel('Actual Values')
axes[0].set_ylabel('Predicted Values')
axes[0].set_title(f'Identity Activation\nR² = {r2_identity:.4f}')
axes[0].grid(True, alpha=0.3)

# Tanh activation plot
axes[1].scatter(y_test_cal, y_pred_tanh, alpha=0.5, color='green')
axes[1].plot([y_test_cal.min(), y_test_cal.max()], 
             [y_test_cal.min(), y_test_cal.max()], 'r--', lw=2)
axes[1].set_xlabel('Actual Values')
axes[1].set_ylabel('Predicted Values')
axes[1].set_title(f'Tanh Activation\nR² = {r2_tanh:.4f}')
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\nInterpretation:")
print("  - Red dashed line = perfect predictions")
print("  - Points closer to line = better predictions")
print("  - Compare how tightly each panel clusters around the line; the higher R² should show the tighter fit")

---
## Video 3: Advanced Metrics - Fraud Detection Solution

In [None]:
# Task 1: Create imbalanced dataset
X_fraud, y_fraud = make_classification(
    n_samples=10000, n_features=20, n_informative=15,
    n_redundant=5, n_classes=2, weights=[0.98, 0.02],
    random_state=42
)

print("Credit Card Fraud Dataset:")
unique, counts = np.unique(y_fraud, return_counts=True)
for cls, count in zip(unique, counts):
    label = "Legitimate" if cls == 0 else "Fraudulent"
    print(f"  {label} (Class {cls}): {count:5d} samples ({count/len(y_fraud)*100:5.1f}%)")

# Visualize
plt.figure(figsize=(8, 5))
plt.bar(['Legitimate\n(Class 0)', 'Fraudulent\n(Class 1)'], counts, 
        color=['lightgreen', 'salmon'])
plt.ylabel('Number of Transactions')
plt.title('Class Distribution - Highly Imbalanced Dataset')
plt.ylim(0, max(counts) * 1.1)
for i, count in enumerate(counts):
    plt.text(i, count + 100, f'{count} ({count/len(y_fraud)*100:.1f}%)', 
             ha='center', fontweight='bold')
plt.show()

In [None]:
# Task 2: Train MLP classifier
X_train_fraud, X_test_fraud, y_train_fraud, y_test_fraud = train_test_split(
    X_fraud, y_fraud, test_size=0.3, random_state=42, stratify=y_fraud
)

scaler_fraud = StandardScaler()
X_train_fraud_scaled = scaler_fraud.fit_transform(X_train_fraud)
X_test_fraud_scaled = scaler_fraud.transform(X_test_fraud)

mlp_fraud = MLPClassifier(hidden_layer_sizes=(30, 20), activation='relu',
                          max_iter=1000, random_state=42)
mlp_fraud.fit(X_train_fraud_scaled, y_train_fraud)

y_pred_fraud = mlp_fraud.predict(X_test_fraud_scaled)

print("Model trained successfully!")
print(f"Training samples: {X_train_fraud.shape[0]}")
print(f"Test samples: {X_test_fraud.shape[0]}")

In [None]:
# Task 3: Confusion matrix
cm_fraud = confusion_matrix(y_test_fraud, y_pred_fraud)

plt.figure(figsize=(10, 8))
sns.heatmap(cm_fraud, annot=True, fmt='d', cmap='Blues', cbar=False,
            xticklabels=['Predicted Legitimate', 'Predicted Fraud'],
            yticklabels=['Actual Legitimate', 'Actual Fraud'])
plt.title('Confusion Matrix - Fraud Detection', fontsize=16, fontweight='bold')
plt.ylabel('Actual Class', fontsize=12)
plt.xlabel('Predicted Class', fontsize=12)
plt.show()

print("\nConfusion Matrix Components:")
print(f"  True Negatives (TN):  {cm_fraud[0, 0]:5d} - Correctly identified legitimate")
print(f"  False Positives (FP): {cm_fraud[0, 1]:5d} - Legitimate flagged as fraud")
print(f"  False Negatives (FN): {cm_fraud[1, 0]:5d} - Fraud missed (dangerous!)")
print(f"  True Positives (TP):  {cm_fraud[1, 1]:5d} - Correctly identified fraud")

In [None]:
# Task 4: Manual calculations
TN = cm_fraud[0, 0]
FP = cm_fraud[0, 1]
FN = cm_fraud[1, 0]
TP = cm_fraud[1, 1]

accuracy_manual = (TP + TN) / (TP + TN + FP + FN)
precision_manual = TP / (TP + FP) if (TP + FP) > 0 else 0
recall_manual = TP / (TP + FN) if (TP + FN) > 0 else 0
pr_sum = precision_manual + recall_manual
f1_manual = 2 * (precision_manual * recall_manual) / pr_sum if pr_sum > 0 else 0

print("="*70)
print("MANUAL METRIC CALCULATIONS FOR FRAUD DETECTION")
print("="*70)

print("\n1. ACCURACY:")
print(f"   Formula: (TP + TN) / Total")
print(f"   Calculation: ({TP} + {TN}) / {TP + TN + FP + FN}")
print(f"   Result: {accuracy_manual:.4f} ({accuracy_manual*100:.2f}%)")
print(f"   ⚠️  High accuracy can be misleading with imbalanced data!")

print("\n2. PRECISION (Fraud Class):")
print(f"   Formula: TP / (TP + FP)")
print(f"   Meaning: Of all predicted frauds, how many were actually fraud?")
print(f"   Calculation: {TP} / ({TP} + {FP})")
print(f"   Result: {precision_manual:.4f} ({precision_manual*100:.2f}%)")

print("\n3. RECALL (Fraud Class):")
print(f"   Formula: TP / (TP + FN)")
print(f"   Meaning: Of all actual frauds, how many did we catch?")
print(f"   Calculation: {TP} / ({TP} + {FN})")
print(f"   Result: {recall_manual:.4f} ({recall_manual*100:.2f}%)")
print(f"   ⚠️  Critical metric! Missing fraud is costly!")

print("\n4. F1-SCORE (Fraud Class):")
print(f"   Formula: 2 * (Precision * Recall) / (Precision + Recall)")
print(f"   Meaning: Harmonic mean balancing precision and recall")
print(f"   Result: {f1_manual:.4f} ({f1_manual*100:.2f}%)")

print("\n" + "="*70)

# Verify
accuracy_sklearn = accuracy_score(y_test_fraud, y_pred_fraud)
precision_sklearn = precision_score(y_test_fraud, y_pred_fraud)
recall_sklearn = recall_score(y_test_fraud, y_pred_fraud)
f1_sklearn = f1_score(y_test_fraud, y_pred_fraud)

print("\nVerification (sklearn):")
print(f"  Accuracy:  {accuracy_sklearn:.4f} ✓")
print(f"  Precision: {precision_sklearn:.4f} ✓")
print(f"  Recall:    {recall_sklearn:.4f} ✓")
print(f"  F1-Score:  {f1_sklearn:.4f} ✓")

In [None]:
# Task 5: Classification report
print("\n" + "="*70)
print("CLASSIFICATION REPORT")
print("="*70)
print(classification_report(y_test_fraud, y_pred_fraud,
                          target_names=['Legitimate', 'Fraudulent']))
print("="*70)

### Task 6: Discussion - Answers

**1. Which metric is most important for fraud detection?**

Answer: **RECALL** is the most critical metric for fraud detection. Here's why:
- Recall measures how many fraudulent transactions we successfully detect
- Missing a fraudulent transaction (False Negative) can result in significant financial losses
- It's better to flag some legitimate transactions as suspicious (False Positives) than to miss actual fraud
- High recall ensures we catch most fraud cases, even if we have some false alarms

**2. Why is accuracy not sufficient for this problem?**

Answer: Accuracy is misleading with imbalanced data:
- With 98% legitimate transactions, a naive model that predicts everything as "legitimate" achieves 98% accuracy
- This model would catch ZERO fraud cases but still have "high accuracy"
- Accuracy doesn't tell us how well we detect the minority class (fraud)
- For imbalanced datasets, we need metrics that focus on the minority class performance
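
The naive-model argument above can be checked directly. This is a minimal sketch on hypothetical labels (980 legitimate, 20 fraudulent) rather than the notebook's `y_test_fraud`; `y_true_demo` and `y_naive` are names introduced only for this illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical labels: 980 legitimate (0) and 20 fraudulent (1), i.e. 98% / 2%
y_true_demo = np.array([0] * 980 + [1] * 20)

# Naive "model": predict legitimate for every transaction
y_naive = np.zeros_like(y_true_demo)

naive_accuracy = accuracy_score(y_true_demo, y_naive)
naive_recall = recall_score(y_true_demo, y_naive, zero_division=0)

print(f"Accuracy: {naive_accuracy:.2f}")
print(f"Recall (fraud): {naive_recall:.2f}")
```

The accuracy looks excellent while the recall for the fraud class is zero, which is why the per-class metrics are needed.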

**3. What's worse in fraud detection: False Positives or False Negatives?**

Answer: **False Negatives are usually worse:**
- False Negative = Missing actual fraud (letting fraud go through)
  - Results in direct financial losses
  - Damages customer trust
  - Can lead to larger fraud patterns

- False Positive = Flagging legitimate transaction as fraud
  - Causes customer inconvenience
  - Requires manual review
  - But usually no direct financial loss

**4. How would you improve the model to better detect fraud?**

Possible improvements:
- **Class weights**: Penalize fraud misclassification more heavily
- **Resampling**: Use SMOTE or undersampling to balance classes
- **Ensemble methods**: Use Random Forest or XGBoost
- **Threshold tuning**: Lower the classification threshold to increase recall
- **Feature engineering**: Create fraud-specific features (transaction patterns, time, amount)
- **Anomaly detection**: Use isolation forests or autoencoders
- **Cost-sensitive learning**: Assign different costs to different types of errors
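
As an illustration of the threshold-tuning idea from the list above, the sketch below compares recall at the default 0.5 cut-off with a lower one. It is a rough sketch under stated assumptions: the dataset is a small synthetic one, `LogisticRegression` stands in for the MLP to keep it fast, and the threshold 0.2 is an arbitrary example value, not a recommendation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

# Small synthetic imbalanced dataset (hypothetical, roughly 95% / 5%)
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0, stratify=y_demo
)

# LogisticRegression stands in for the MLP to keep the sketch fast
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba_fraud = clf.predict_proba(X_te)[:, 1]  # probability of class 1

recalls = {}
for threshold in (0.5, 0.2):
    # Flag a transaction as fraud when its probability reaches the threshold
    y_hat = (proba_fraud >= threshold).astype(int)
    recalls[threshold] = recall_score(y_te, y_hat, zero_division=0)
    precision = precision_score(y_te, y_hat, zero_division=0)
    print(f"threshold={threshold}: recall={recalls[threshold]:.2f}, precision={precision:.2f}")
```

Lowering the threshold can only add predicted positives, so recall never decreases; the price is usually lower precision, which is the trade-off to weigh against the cost of missed fraud.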

---
## Summary

### Video 1: Perceptrons and MLPs
- NAND gate is linearly separable, solvable by both Perceptron and MLP
- Perceptrons work for simple linear problems
- MLPs add flexibility with hidden layers

### Video 2: MLPs for Regression
- Identity (linear) activation limits the model to linear relationships
- Non-linear activations (tanh, ReLU) enable modeling complex patterns
- On this dataset, R² is typically higher with a non-linear activation (tanh) than with identity

### Video 3: Advanced Metrics
- Accuracy is misleading with imbalanced data
- Confusion matrix reveals TP, TN, FP, FN
- Precision = TP / (TP + FP) - "How many predicted positives are correct?"
- Recall = TP / (TP + FN) - "How many actual positives did we find?"
- F1-score balances precision and recall
- For fraud detection, prioritize **RECALL** to minimize missed frauds
- Classification report provides comprehensive metrics for all classes
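
To make the "F1 balances precision and recall" point concrete, the short sketch below contrasts the harmonic mean with a plain average. The precision and recall values are hypothetical, chosen to represent a model that is precise but misses most fraud; `f1_from` is a helper defined only for this example.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical values: high precision, very low recall
p, r = 0.9, 0.1
arithmetic_mean = (p + r) / 2
f1 = f1_from(p, r)

print(f"Arithmetic mean: {arithmetic_mean:.2f}")
print(f"F1 (harmonic mean): {f1:.2f}")
```

The harmonic mean is pulled toward the smaller of the two values, so a model cannot earn a high F1 by excelling at only one of them.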