<a href="https://colab.research.google.com/github/sprince0031/ICT-Python-ML/blob/main/Week%205/Notebooks/week5.ipynb" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python and ML Foundations: Session 5
## Perceptrons, MLPs & Advanced Metrics

Welcome to the Session 5 practice notebook! Complete the following challenges to reinforce your learning.

## Utility code
Run the cell below to import necessary libraries:

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier, MLPRegressor
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.metrics import confusion_matrix, classification_report, mean_squared_error, r2_score
from sklearn.datasets import load_diabetes, make_classification

sns.set_style('whitegrid')
np.random.seed(42)

---
# Video Challenges

## Video 1: Perceptrons and MLPs

### Challenge: Implement NAND Gate with MLP

The NAND gate is a fundamental logic gate. Its output is 0 (False) only when both inputs are 1 (True); otherwise it is 1 (True).

**NAND Truth Table:**
```
Input A | Input B | Output
--------|---------|--------
   0    |    0    |   1
   0    |    1    |   1
   1    |    0    |   1
   1    |    1    |   0
```

**Your tasks:**
1. Create the NAND dataset
2. Try training a Perceptron on NAND
3. Train an MLP on NAND
4. Compare their performance

**Hint:** NAND is linearly separable, so even a Perceptron should work!
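
To see why the hint holds, here is a minimal sketch of a single threshold unit that reproduces the NAND truth table. The weights and bias are hand-picked for illustration (an assumption, not values a trained `Perceptron` would necessarily find), and the names `w`, `b`, and `step_unit` are hypothetical.

```python
import numpy as np

# All four input combinations from the NAND truth table
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked parameters (assumed for illustration): the line
# -a - b + 1.5 = 0 separates (1, 1) from the other three points
w = np.array([-1.0, -1.0])
b = 1.5

def step_unit(X, w, b):
    """Single threshold unit: output 1 when w.x + b > 0, else 0."""
    return (X @ w + b > 0).astype(int)

outputs = step_unit(X, w, b)
print(outputs)  # [1 1 1 0]
```

Because one straight line is enough to split the outputs, a model with no hidden layer can represent NAND. Whether a given training run actually reaches 100% accuracy is what Tasks 2 to 4 ask you to check.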

In [None]:
# Task 1: Create the NAND dataset
# Define X_nand and y_nand arrays


In [None]:
# Task 2: Train a Perceptron on NAND
# Create a Perceptron, fit it, and evaluate


In [None]:
# Task 3: Train an MLP on NAND
# Create an MLPClassifier with hidden layers, fit it, and evaluate


In [None]:
# Task 4: Compare and discuss the results
# Print predictions and accuracies for both models


---
## Video 2: MLPs for Regression and Classification

### Challenge: Compare Activations on California Housing Dataset

Use the California Housing dataset to compare MLP regressors with different activation functions.

**Your tasks:**
1. Load the California Housing dataset from sklearn
2. Split and scale the data
3. Train an MLP regressor with identity (linear) activation
4. Train an MLP regressor with tanh activation
5. Compare their R² scores
6. Visualize predictions vs actual values for both

**Hint:** Use `from sklearn.datasets import fetch_california_housing`
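
When interpreting the comparison, it may help to recall that layers joined by an identity activation compose into a single linear map. The sketch below checks this numerically with made-up random weights and inputs; it is a plain NumPy illustration under those assumptions, not scikit-learn's internal implementation, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer network: 8 inputs -> 5 hidden units -> 1 output
W1, b1 = rng.normal(size=(8, 5)), rng.normal(size=5)
W2, b2 = rng.normal(size=(5, 1)), rng.normal(size=1)

X = rng.normal(size=(10, 8))  # 10 made-up samples

# Forward pass with identity activation in the hidden layer
hidden = X @ W1 + b1
y_two_layer = hidden @ W2 + b2

# The same mapping written as a single linear layer
W = W1 @ W2
b = b1 @ W2 + b2
y_one_layer = X @ W + b

print(np.allclose(y_two_layer, y_one_layer))  # True
```

A nonlinear activation such as tanh breaks this equivalence, which is one reason the two regressors in Tasks 3 and 4 can score differently.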

In [None]:
# Task 1: Load California Housing dataset
from sklearn.datasets import fetch_california_housing
# Load the dataset and create X and y


In [None]:
# Task 2: Split and scale the data
# Use train_test_split and StandardScaler


In [None]:
# Task 3: Train MLP with identity activation
# Create MLPRegressor with activation='identity'


In [None]:
# Task 4: Train MLP with tanh activation
# Create MLPRegressor with activation='tanh'


In [None]:
# Task 5: Compare R² scores
# Calculate and print R² for both models


In [None]:
# Task 6: Visualize predictions vs actual
# Create scatter plots comparing predictions with actual values


---
## Video 3: Advanced Metrics

### Challenge: Analyze Credit Card Fraud Detection Metrics

Imagine you're building a credit card fraud detection system. You have a highly imbalanced dataset where only 2% of transactions are fraudulent.

**Your tasks:**
1. Create an imbalanced dataset (98% legitimate, 2% fraud)
2. Train an MLP classifier
3. Generate and visualize the confusion matrix
4. Manually calculate precision, recall, and F1-score
5. Display the classification report
6. Explain which metric is most important for fraud detection and why

**Hint:** Use `make_classification` with `weights=[0.98, 0.02]`
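
Before training anything, it can be useful to see how the metrics behave on a trivial baseline. The sketch below uses made-up labels for 100 transactions and a hypothetical helper, `binary_metrics`, written in plain Python; returning 0.0 when a denominator is zero is an assumed convention for this example.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for the positive class (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    # Fall back to 0.0 when a denominator is zero (an assumed convention)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up data: 98 legitimate (0) and 2 fraudulent (1) transactions
y_true = [0] * 98 + [1] * 2
# A naive baseline that never flags fraud
never_fraud = [0] * 100

baseline = binary_metrics(y_true, never_fraud)
print(baseline["accuracy"], baseline["recall"])  # 0.98 0.0
```

The baseline is right 98% of the time yet catches no fraud at all, which is worth keeping in mind when you reach the discussion questions in Task 6.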

In [None]:
# Task 1: Create imbalanced dataset
# Use make_classification with appropriate weights


In [None]:
# Task 2: Train MLP classifier
# Split data, scale, and train an MLPClassifier


In [None]:
# Task 3: Generate and visualize confusion matrix
# Use confusion_matrix and create a heatmap


In [None]:
# Task 4: Manually calculate precision, recall, F1-score
# Extract TP, TN, FP, FN from confusion matrix and calculate metrics


In [None]:
# Task 5: Display classification report
# Use classification_report


### Task 6: Discussion

Answer the following questions:

1. **Which metric is most important for fraud detection?**
   - Your answer:

2. **Why is accuracy not sufficient for this problem?**
   - Your answer:

3. **Which is worse in fraud detection: false positives or false negatives?**
   - Your answer:

4. **How would you improve the model to better detect fraud?**
   - Your answer: