# Logistic Regression Exercises

Welcome to the hands-on exercises for Logistic Regression! 🎯

This notebook contains practical exercises to help you understand and master logistic regression concepts.

## 📚 Learning Objectives
By completing these exercises, you will:
- Understand the mathematical foundations of logistic regression
- Implement logistic regression from scratch
- Use scikit-learn for practical applications
- Evaluate model performance using various metrics
- Handle real-world classification problems
- Tune hyperparameters for optimal performance

## 🏁 Getting Started
Run the cell below to import all necessary libraries:

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, classification_report, roc_curve, roc_auc_score
)
from sklearn.datasets import make_classification, load_breast_cancer
import warnings
warnings.filterwarnings('ignore')

# Set style for better plots
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("✅ All libraries imported successfully!")
print("📚 Ready to start learning Logistic Regression!")

---

## Exercise 1: Understanding the Sigmoid Function 📈

The sigmoid function is the heart of logistic regression. Let's explore it!
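
Two properties of the sigmoid are worth checking numerically before plotting it: the symmetry σ(−z) = 1 − σ(z), and the derivative σ′(z) = σ(z)(1 − σ(z)), which is largest at z = 0. The sketch below is a minimal check of both; `sigmoid_demo` is a name introduced here for illustration and is not part of the exercise.

```python
import numpy as np

def sigmoid_demo(z):
    """Logistic sigmoid, clipped so np.exp cannot overflow."""
    z = np.clip(z, -500, 500)
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-10, 10, 201)
s = sigmoid_demo(z)

# Symmetry: sigma(-z) = 1 - sigma(z)
symmetry_ok = np.allclose(sigmoid_demo(-z), 1.0 - s)

# Derivative: compare the closed form with a central finite difference
eps = 1e-5
numeric_grad = (sigmoid_demo(z + eps) - sigmoid_demo(z - eps)) / (2 * eps)
analytic_grad = s * (1.0 - s)
grad_ok = np.allclose(numeric_grad, analytic_grad, atol=1e-8)

print(f"Symmetry holds: {symmetry_ok}")
print(f"Derivative matches closed form: {grad_ok}")
print(f"Largest slope: {analytic_grad.max():.4f}")
```

The derivative identity is what keeps the gradient of the logistic loss simple, which becomes relevant in the from-scratch implementation later in the notebook.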

In [None]:
def sigmoid(z):
    """Sigmoid activation function"""
    return 1 / (1 + np.exp(-np.clip(z, -500, 500)))

# TODO: Create a range of z values from -10 to 10
z = np.linspace(-10, 10, 100)

# TODO: Calculate sigmoid values
sigmoid_values = sigmoid(z)

# TODO: Plot the sigmoid function
plt.figure(figsize=(10, 6))
plt.plot(z, sigmoid_values, 'b-', linewidth=3, label='Sigmoid Function')
plt.axhline(y=0.5, color='r', linestyle='--', alpha=0.7, label='Decision Threshold')
plt.axvline(x=0, color='gray', linestyle=':', alpha=0.7)
plt.xlabel('z (Linear Combination)', fontsize=12)
plt.ylabel('σ(z) (Probability)', fontsize=12)
plt.title('Sigmoid Function - The Heart of Logistic Regression', fontsize=14)
plt.grid(True, alpha=0.3)
plt.legend()
plt.ylim(-0.05, 1.05)
plt.show()

print(f"When z=0: σ(z) = {sigmoid(0):.3f}")
print(f"When z=2: σ(z) = {sigmoid(2):.3f}")
print(f"When z=-2: σ(z) = {sigmoid(-2):.3f}")

### 🤔 Question 1.1
**What happens to the sigmoid function as z approaches positive and negative infinity?**

*Write your answer here:*

### 🤔 Question 1.2
**Why is the sigmoid function well suited to binary classification?**

*Write your answer here:*

---

## Exercise 2: Generate and Visualize Classification Data 📊

In [None]:
# TODO: Generate a synthetic binary classification dataset
# Use make_classification with:
# - n_samples=500
# - n_features=2 (for easy visualization)
# - n_redundant=0
# - n_informative=2
# - random_state=42

X, y = make_classification(
    n_samples=500,
    n_features=2,
    n_redundant=0,
    n_informative=2,
    random_state=42
)

# TODO: Create a scatter plot to visualize the data
plt.figure(figsize=(10, 8))
plt.scatter(X[y==0, 0], X[y==0, 1], c='red', alpha=0.6, label='Class 0', s=50)
plt.scatter(X[y==1, 0], X[y==1, 1], c='blue', alpha=0.6, label='Class 1', s=50)
plt.xlabel('Feature 1', fontsize=12)
plt.ylabel('Feature 2', fontsize=12)
plt.title('Binary Classification Dataset', fontsize=14)
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

print(f"Dataset shape: {X.shape}")
print(f"Class distribution: {np.bincount(y)}")
print(f"Feature 1 range: [{X[:, 0].min():.2f}, {X[:, 0].max():.2f}]")
print(f"Feature 2 range: [{X[:, 1].min():.2f}, {X[:, 1].max():.2f}]")

---

## Exercise 3: Train Your First Logistic Regression Model 🚀

In [None]:
# TODO: Split the data into training and testing sets
# Use test_size=0.2 and random_state=42; stratify=y keeps the class ratio the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# TODO: Create and train a LogisticRegression model
model = LogisticRegression(random_state=42)
model.fit(X_train, y_train)

# TODO: Make predictions on the test set
y_pred = model.predict(X_test)
y_pred_proba = model.predict_proba(X_test)[:, 1]

# TODO: Calculate and display performance metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_pred_proba)

print("📊 Model Performance Metrics:")
print(f"Accuracy:  {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall:    {recall:.4f}")
print(f"F1-Score:  {f1:.4f}")
print(f"ROC-AUC:   {roc_auc:.4f}")

# Display model coefficients
print(f"\n🔢 Model Coefficients:")
print(f"Intercept (β₀): {model.intercept_[0]:.4f}")
print(f"Feature 1 (β₁): {model.coef_[0][0]:.4f}")
print(f"Feature 2 (β₂): {model.coef_[0][1]:.4f}")

### 🤔 Question 3.1
**What do the coefficients tell you about the importance of each feature?**

*Write your answer here:*
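
One way to explore this question is to convert coefficients into odds ratios. Each coefficient is the change in log-odds per unit increase of its feature, so `exp(coefficient)` is the factor by which the odds of class 1 change. The sketch below regenerates the same synthetic data so that it runs on its own; `demo_model` and `odds_ratios` are illustrative names. Keep in mind that coefficient sizes are only comparable across features when the features share a scale.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X_demo, y_demo = make_classification(
    n_samples=500, n_features=2, n_redundant=0,
    n_informative=2, random_state=42
)
demo_model = LogisticRegression().fit(X_demo, y_demo)

# exp(beta_j): factor by which the odds of class 1 change per unit of feature j
odds_ratios = np.exp(demo_model.coef_[0])
for j, (beta, ratio) in enumerate(zip(demo_model.coef_[0], odds_ratios), start=1):
    print(f"Feature {j}: coefficient={beta:+.3f}, odds ratio={ratio:.3f}")

# Check: moving feature 1 by one unit multiplies the odds by odds_ratios[0]
p0 = demo_model.predict_proba(np.array([[0.0, 0.0]]))[0, 1]
p1 = demo_model.predict_proba(np.array([[1.0, 0.0]]))[0, 1]
observed_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
print(f"Observed odds ratio for feature 1: {observed_ratio:.3f}")
```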

---

## Exercise 4: Visualize the Decision Boundary 🎨

In [None]:
def plot_decision_boundary(X, y, model, title="Decision Boundary"):
    """Plot decision boundary for 2D data"""
    h = 0.02
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    
    Z = model.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:, 1]
    Z = Z.reshape(xx.shape)
    
    plt.figure(figsize=(12, 8))
    plt.contourf(xx, yy, Z, levels=50, alpha=0.6, cmap='RdYlBu')
    plt.colorbar(label='Probability of Class 1')
    plt.contour(xx, yy, Z, levels=[0.5], colors='black', linestyles='--', linewidths=3)
    
    scatter = plt.scatter(X[:, 0], X[:, 1], c=y, cmap='RdYlBu', edgecolors='black', s=50)
    plt.xlabel('Feature 1', fontsize=12)
    plt.ylabel('Feature 2', fontsize=12)
    plt.title(title, fontsize=14)
    plt.show()

# TODO: Plot the decision boundary for your model
plot_decision_boundary(X_test, y_test, model, "Logistic Regression Decision Boundary")

### 🤔 Question 4.1
**What does the decision boundary represent? Why is it linear?**

*Write your answer here:*
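
To make the geometry concrete: the model outputs a probability of exactly 0.5 wherever β₀ + β₁x₁ + β₂x₂ = 0, and that equation describes a straight line in the feature plane. The sketch below solves it for x₂ and confirms that points on the line receive probability 0.5. It assumes β₂ is nonzero and refits a model on regenerated data so that it is self-contained; `clf`, `x1_line`, and `x2_line` are illustrative names.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X_demo, y_demo = make_classification(
    n_samples=500, n_features=2, n_redundant=0,
    n_informative=2, random_state=42
)
clf = LogisticRegression().fit(X_demo, y_demo)

b0 = clf.intercept_[0]
b1, b2 = clf.coef_[0]

# Solve b0 + b1*x1 + b2*x2 = 0 for x2 (assumes b2 != 0)
x1_line = np.linspace(X_demo[:, 0].min(), X_demo[:, 0].max(), 5)
x2_line = -(b0 + b1 * x1_line) / b2

# Every point on that line should be assigned probability 0.5
boundary_points = np.column_stack([x1_line, x2_line])
boundary_proba = clf.predict_proba(boundary_points)[:, 1]
print(np.round(boundary_proba, 6))
```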

---

## Exercise 5: ROC Curve Analysis 📈

In [None]:
# TODO: Plot the ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
roc_auc = roc_auc_score(y_test, y_pred_proba)

plt.figure(figsize=(10, 8))
plt.plot(fpr, tpr, color='darkorange', lw=3, 
         label=f'ROC Curve (AUC = {roc_auc:.3f})')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--', 
         label='Random Classifier')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate', fontsize=12)
plt.ylabel('True Positive Rate', fontsize=12)
plt.title('Receiver Operating Characteristic (ROC) Curve', fontsize=14)
plt.legend(loc="lower right")
plt.grid(alpha=0.3)
plt.show()

# TODO: Find the optimal threshold
# Calculate Youden's J statistic (TPR - FPR)
j_scores = tpr - fpr
optimal_idx = np.argmax(j_scores)
optimal_threshold = thresholds[optimal_idx]

print(f"📊 ROC Analysis:")
print(f"AUC Score: {roc_auc:.4f}")
print(f"Optimal Threshold: {optimal_threshold:.4f}")
print(f"TPR at optimal threshold: {tpr[optimal_idx]:.4f}")
print(f"FPR at optimal threshold: {fpr[optimal_idx]:.4f}")

### 🤔 Question 5.1
**What does an AUC of 0.5 mean? What about 1.0?**

*Write your answer here:*
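
A small experiment can help build intuition here. The sketch below scores synthetic labels three ways: with scores unrelated to the labels, with scores that rank every positive above every negative, and with that ranking reversed. All variable names and the random seed are choices made for this illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=10_000)

# Scores that carry no information about the labels
random_scores = rng.random(10_000)
# Negatives fall in [0, 0.1), positives in [1, 1.1): a perfect ranking
perfect_scores = y_true + 0.1 * rng.random(10_000)
# The same ranking, exactly backwards
inverted_scores = -perfect_scores

auc_random = roc_auc_score(y_true, random_scores)
auc_perfect = roc_auc_score(y_true, perfect_scores)
auc_inverted = roc_auc_score(y_true, inverted_scores)

print(f"Random scores:   AUC = {auc_random:.3f}")
print(f"Perfect ranking: AUC = {auc_perfect:.3f}")
print(f"Inverted ranking: AUC = {auc_inverted:.3f}")
```

AUC depends only on how the scores rank the samples, which is why rescaling the perfect scores would not change the result.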

### 🤔 Question 5.2
**When would you use a different threshold than 0.5?**

*Write your answer here:*
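
To see the trade-off behind this question, a custom threshold can be applied directly to the predicted probabilities instead of calling `predict`. The sketch below is self-contained and uses three arbitrary thresholds chosen for illustration; lowering the threshold can only add predicted positives, so recall never decreases, while precision typically moves the other way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

X_demo, y_demo = make_classification(
    n_samples=500, n_features=2, n_redundant=0,
    n_informative=2, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)
clf = LogisticRegression().fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]

results = {}
for threshold in (0.3, 0.5, 0.7):
    # Predict class 1 whenever the probability reaches the threshold
    y_hat = (proba >= threshold).astype(int)
    results[threshold] = (
        precision_score(y_te, y_hat, zero_division=0),
        recall_score(y_te, y_hat, zero_division=0),
    )
    precision, recall = results[threshold]
    print(f"threshold={threshold:.1f}: precision={precision:.3f}, recall={recall:.3f}")
```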

---

## Exercise 6: Real-World Dataset - Breast Cancer Diagnosis 🏥

Now let's work with a real medical dataset!

In [None]:
# TODO: Load the breast cancer dataset
data = load_breast_cancer()
X_cancer, y_cancer = data.data, data.target

print("🏥 Breast Cancer Dataset Information:")
print(f"Dataset shape: {X_cancer.shape}")
print(f"Number of features: {X_cancer.shape[1]}")
print(f"Classes: {data.target_names}")
print(f"Class distribution: {np.bincount(y_cancer)}")
print(f"\nFirst 5 feature names:")
for i, name in enumerate(data.feature_names[:5]):
    print(f"  {i+1}. {name}")

In [None]:
# TODO: Split the data
X_train_cancer, X_test_cancer, y_train_cancer, y_test_cancer = train_test_split(
    X_cancer, y_cancer, test_size=0.2, random_state=42, stratify=y_cancer
)

# TODO: Apply feature scaling (very important for this dataset!)
scaler = StandardScaler()
X_train_cancer_scaled = scaler.fit_transform(X_train_cancer)
X_test_cancer_scaled = scaler.transform(X_test_cancer)

# TODO: Train logistic regression with and without scaling
# Model without scaling
model_unscaled = LogisticRegression(random_state=42, max_iter=1000)
model_unscaled.fit(X_train_cancer, y_train_cancer)

# Model with scaling
model_scaled = LogisticRegression(random_state=42)
model_scaled.fit(X_train_cancer_scaled, y_train_cancer)

# TODO: Compare performance
y_pred_unscaled = model_unscaled.predict(X_test_cancer)
y_pred_scaled = model_scaled.predict(X_test_cancer_scaled)

acc_unscaled = accuracy_score(y_test_cancer, y_pred_unscaled)
acc_scaled = accuracy_score(y_test_cancer, y_pred_scaled)

print("📊 Feature Scaling Impact:")
print(f"Accuracy without scaling: {acc_unscaled:.4f}")
print(f"Accuracy with scaling:    {acc_scaled:.4f}")
print(f"Improvement: {acc_scaled - acc_unscaled:.4f}")

### 🤔 Question 6.1
**How much did feature scaling change the accuracy on this dataset? Why can scaling matter for logistic regression even when the accuracy gain is small?**

*Write your answer here:*
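
A useful starting point for this question is to look at how different the raw feature scales are. The sketch below compares per-feature standard deviations before and after `StandardScaler`; the variable names are illustrative. Widely differing scales tend to slow the optimizer down and make the L2 penalty weigh features unevenly, even in cases where test accuracy changes only a little.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

data_demo = load_breast_cancer()
X_raw = data_demo.data

# Spread of each raw feature
raw_std = X_raw.std(axis=0)
widest = np.argmax(raw_std)
narrowest = np.argmin(raw_std)
print(f"Largest std:  {data_demo.feature_names[widest]} = {raw_std[widest]:.4f}")
print(f"Smallest std: {data_demo.feature_names[narrowest]} = {raw_std[narrowest]:.4f}")
print(f"Ratio: {raw_std[widest] / raw_std[narrowest]:.0f}x")

# After standardization every feature has mean 0 and std 1
X_std = StandardScaler().fit_transform(X_raw)
scaled_std = X_std.std(axis=0)
print(f"Scaled std range: [{scaled_std.min():.3f}, {scaled_std.max():.3f}]")
```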

In [None]:
# TODO: Analyze feature importance
feature_importance = np.abs(model_scaled.coef_[0])
top_features_idx = np.argsort(feature_importance)[-10:]

plt.figure(figsize=(12, 8))
top_10_features = feature_importance[top_features_idx]
top_10_names = [data.feature_names[i] for i in top_features_idx]

plt.barh(range(len(top_10_features)), top_10_features, color='skyblue')
plt.yticks(range(len(top_10_features)), top_10_names)
plt.xlabel('Coefficient Magnitude', fontsize=12)
plt.title('Top 10 Most Important Features (Breast Cancer Diagnosis)', fontsize=14)
plt.grid(axis='x', alpha=0.3)
plt.tight_layout()
plt.show()

print("🔍 Top 5 Most Important Features:")
for i, idx in enumerate(reversed(top_features_idx[-5:])):
    print(f"{i+1}. {data.feature_names[idx]}: {feature_importance[idx]:.4f}")

---

## Exercise 7: Hyperparameter Tuning Challenge 🎛️

In [None]:
# TODO: Experiment with different C values (regularization strength)
C_values = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
train_scores = []
test_scores = []

for C in C_values:
    # TODO: Train model with different C values
    model = LogisticRegression(C=C, random_state=42)
    model.fit(X_train_cancer_scaled, y_train_cancer)
    
    # TODO: Calculate scores
    train_score = model.score(X_train_cancer_scaled, y_train_cancer)
    test_score = model.score(X_test_cancer_scaled, y_test_cancer)
    
    train_scores.append(train_score)
    test_scores.append(test_score)
    
    print(f"C={C:7.3f}: Train={train_score:.4f}, Test={test_score:.4f}")

# TODO: Plot train and test accuracy against C (a validation curve)
plt.figure(figsize=(12, 8))
plt.semilogx(C_values, train_scores, 'b-o', label='Training Accuracy', linewidth=2)
plt.semilogx(C_values, test_scores, 'r-o', label='Test Accuracy', linewidth=2)
plt.xlabel('C (Inverse of Regularization Strength)', fontsize=12)
plt.ylabel('Accuracy', fontsize=12)
plt.title('Validation Curve - Effect of C Parameter', fontsize=14)
plt.legend()
plt.grid(alpha=0.3)
plt.show()

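# Find best C
# Note: picking C by test accuracy makes the reported score optimistic;
# cross-validation on the training data is the safer choice in practice.
best_idx = np.argmax(test_scores)
best_C = C_values[best_idx]
print(f"\n🏆 Best C value: {best_C}")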
print(f"Best test accuracy: {test_scores[best_idx]:.4f}")

### 🤔 Question 7.1
**What happens when C is very small versus very large? Which extreme is more likely to overfit?**

*Write your answer here:*
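
One way to investigate is to watch the size of the coefficient vector as C changes. Because C is the inverse of the regularization strength, small values shrink the coefficients toward zero and large values let them grow. The sketch below is self-contained and uses three arbitrary C values; `coef_norms` is an illustrative name, and `max_iter=5000` is simply a generous budget chosen so the weakly regularized fit has room to converge.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X_bc, y_bc = load_breast_cancer(return_X_y=True)
X_bc_scaled = StandardScaler().fit_transform(X_bc)

coef_norms = {}
for C in (0.001, 1, 1000):
    clf = LogisticRegression(C=C, max_iter=5000).fit(X_bc_scaled, y_bc)
    # Euclidean length of the coefficient vector
    coef_norms[C] = np.linalg.norm(clf.coef_)
    print(f"C={C:>8}: ||coef|| = {coef_norms[C]:.3f}")
```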

---

## Exercise 8: Cross-Validation Deep Dive 🔄
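
A caveat for cross-validating on pre-scaled data: when the scaler has been fitted on the whole training set, each validation fold has already influenced the scaling statistics. The effect is usually small, but wrapping the scaler and the model in a scikit-learn pipeline refits the scaler inside every fold. A minimal sketch, loading the data afresh so that it runs standalone (`cv_pipeline` and `pipeline_scores` are illustrative names):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_bc, y_bc = load_breast_cancer(return_X_y=True)

# The scaler is refit on the training part of each fold
cv_pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipeline_scores = cross_val_score(cv_pipeline, X_bc, y_bc, cv=5, scoring="accuracy")

print(f"Fold accuracies: {pipeline_scores.round(4)}")
print(f"Mean accuracy:   {pipeline_scores.mean():.4f}")
```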

In [None]:
# TODO: Perform cross-validation with different scoring metrics
model = LogisticRegression(C=best_C, random_state=42)

scoring_metrics = ['accuracy', 'precision', 'recall', 'f1', 'roc_auc']
cv_results = {}

print("🔄 Cross-Validation Results (5-fold):")
print("-" * 50)

for metric in scoring_metrics:
    scores = cross_val_score(model, X_train_cancer_scaled, y_train_cancer, 
                           cv=5, scoring=metric)
    cv_results[metric] = scores
    print(f"{metric:10s}: {scores.mean():.4f} (+/- {scores.std() * 2:.4f})")

# TODO: Create a box plot of CV scores
plt.figure(figsize=(12, 8))
data_for_boxplot = [cv_results[metric] for metric in scoring_metrics]
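plt.boxplot(data_for_boxplot)
plt.ylabel('Score', fontsize=12)
plt.title('Cross-Validation Scores Distribution', fontsize=14)
# Label the boxes via xticks; the boxplot `labels` keyword was renamed in newer Matplotlib
plt.xticks(range(1, len(scoring_metrics) + 1), scoring_metrics, rotation=45)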
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

---

## Exercise 9: Create Your Own Implementation 🛠️

Challenge: Implement a simplified version of logistic regression!
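
The gradient expressions used in `fit` follow from the mean binary cross-entropy loss. Here is a short derivation sketch, writing $n$ for the number of samples, $w$ for `weights`, $b$ for `bias`, and $p_i$ for the predicted probability of sample $i$ (this notation is introduced here and is not taken from the exercise):

$$
z_i = w^\top x_i + b, \qquad p_i = \sigma(z_i) = \frac{1}{1 + e^{-z_i}}
$$

$$
L(w, b) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]
$$

Since $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$, the chain rule gives $\partial L / \partial z_i = (p_i - y_i)/n$, and therefore

$$
\nabla_w L = \frac{1}{n} X^\top (p - y), \qquad \frac{\partial L}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} (p_i - y_i)
$$

These two expressions correspond to `dw` and `db` in the class that follows.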

In [None]:
class SimpleLogisticRegression:
    def __init__(self, learning_rate=0.01, max_iterations=1000):
        self.learning_rate = learning_rate
        self.max_iterations = max_iterations
        self.weights = None
        self.bias = None
        
    def sigmoid(self, z):
        # TODO: Implement the sigmoid function
        return 1 / (1 + np.exp(-np.clip(z, -500, 500)))
    
    def fit(self, X, y):
        # TODO: Initialize weights and bias
        n_features = X.shape[1]
        self.weights = np.zeros(n_features)
        self.bias = 0
        
        # TODO: Implement gradient descent
        for i in range(self.max_iterations):
            # Forward pass
            z = np.dot(X, self.weights) + self.bias
            predictions = self.sigmoid(z)
            
            # Calculate gradients
            dw = (1/len(X)) * np.dot(X.T, (predictions - y))
            db = (1/len(X)) * np.sum(predictions - y)
            
            # Update parameters
            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db
    
    def predict_proba(self, X):
        # TODO: Implement probability prediction
        z = np.dot(X, self.weights) + self.bias
        return self.sigmoid(z)
    
    def predict(self, X):
        # TODO: Implement binary prediction
        probabilities = self.predict_proba(X)
        return (probabilities >= 0.5).astype(int)

# TODO: Test your implementation
my_model = SimpleLogisticRegression(learning_rate=0.1, max_iterations=1000)
my_model.fit(X_train, y_train)

# Make predictions
my_predictions = my_model.predict(X_test)
my_accuracy = accuracy_score(y_test, my_predictions)

print(f"🚀 Your Implementation Results:")
print(f"Accuracy: {my_accuracy:.4f}")
print(f"Weights: {my_model.weights}")
print(f"Bias: {my_model.bias:.4f}")

# Compare with sklearn
sklearn_model = LogisticRegression(random_state=42)
sklearn_model.fit(X_train, y_train)
sklearn_accuracy = sklearn_model.score(X_test, y_test)

print(f"\n📊 Comparison:")
print(f"Your implementation: {my_accuracy:.4f}")
print(f"Scikit-learn:       {sklearn_accuracy:.4f}")
print(f"Difference:         {abs(my_accuracy - sklearn_accuracy):.4f}")

### 🤔 Question 9.1
**How close is your implementation to scikit-learn's? What could cause differences?**

*Write your answer here:*
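
Two likely sources of difference are worth testing: scikit-learn applies an L2 penalty by default, and plain gradient descent may stop before it has converged. The sketch below tracks the loss during gradient descent and compares the outcome with a scikit-learn model whose penalty is made negligible through a large `C`. It is a standalone illustration; `log_loss_demo`, the iteration count, and `C=1e6` are choices made here, not part of the exercise.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def log_loss_demo(y, p, eps=1e-12):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

X_demo, y_demo = make_classification(
    n_samples=500, n_features=2, n_redundant=0,
    n_informative=2, random_state=42
)

# Plain batch gradient descent, recording the loss before each update
w = np.zeros(X_demo.shape[1])
b = 0.0
learning_rate = 0.1
losses = []
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(X_demo @ w + b)))
    losses.append(log_loss_demo(y_demo, p))
    error = p - y_demo
    w -= learning_rate * (X_demo.T @ error) / len(X_demo)
    b -= learning_rate * error.mean()

# A large C makes scikit-learn's default L2 penalty negligible
sk = LogisticRegression(C=1e6, max_iter=1000).fit(X_demo, y_demo)
sk_loss = log_loss_demo(y_demo, sk.predict_proba(X_demo)[:, 1])

print(f"GD loss: first={losses[0]:.4f}, last={losses[-1]:.4f}")
print(f"scikit-learn (C=1e6) loss: {sk_loss:.4f}")
print(f"GD weights: {w}, bias: {b:.4f}")
print(f"scikit-learn weights: {sk.coef_[0]}, bias: {sk.intercept_[0]:.4f}")
```

If the two sets of weights still differ noticeably, running more iterations or raising the learning rate moderately is a reasonable next experiment.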

---

## 🏆 Final Challenge: Build a Complete Classification Pipeline

Put everything together!
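
For the hyperparameter tuning step, one possible approach is to combine a scikit-learn `Pipeline` with `GridSearchCV`, so that C is chosen by cross-validation on the training data and the test set is touched only once at the end. The sketch below is one way to do this, not the required solution; the grid values, the `roc_auc` scoring choice, and names such as `tuning_pipeline` are assumptions made for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X_bc, y_bc = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_bc, y_bc, test_size=0.2, random_state=42, stratify=y_bc
)

tuning_pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=5000)),
])
# "clf__C" addresses the C parameter of the pipeline step named "clf"
param_grid = {"clf__C": [0.01, 0.1, 1, 10, 100]}

search = GridSearchCV(tuning_pipeline, param_grid, cv=5, scoring="roc_auc")
search.fit(X_tr, y_tr)

# score() uses the same metric as the search, here ROC-AUC
test_auc = search.score(X_te, y_te)
print(f"Best C: {search.best_params_['clf__C']}")
print(f"Mean CV ROC-AUC for best C: {search.best_score_:.4f}")
print(f"Test ROC-AUC: {test_auc:.4f}")
```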

In [None]:
def complete_classification_pipeline(X, y, test_size=0.2):
    """
    Complete end-to-end classification pipeline
    
    TODO: Complete this function to include:
    1. Data splitting
    2. Feature scaling
    3. Model training
    4. Hyperparameter tuning
    5. Model evaluation
    6. Visualization
    
    Returns: trained model, fitted scaler, performance metrics
    """
    
    # TODO: Implement the complete pipeline
    print("🚀 Starting Complete Classification Pipeline...")
    
    # 1. Data splitting
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=42, stratify=y
    )
    print(f"✅ Data split: {X_train.shape[0]} train, {X_test.shape[0]} test")
    
    # 2. Feature scaling
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)
    print("✅ Features scaled")
    
    # 3. Model training (C=1.0 is the default; substitute a tuned value if you have one)
    model = LogisticRegression(C=1.0, random_state=42)
    model.fit(X_train_scaled, y_train)
    print("✅ Model trained")
    
    # 4. Evaluation
    y_pred = model.predict(X_test_scaled)
    y_pred_proba = model.predict_proba(X_test_scaled)[:, 1]
    
    metrics = {
        'accuracy': accuracy_score(y_test, y_pred),
        'precision': precision_score(y_test, y_pred),
        'recall': recall_score(y_test, y_pred),
        'f1': f1_score(y_test, y_pred),
        'roc_auc': roc_auc_score(y_test, y_pred_proba)
    }
    
    print("\n📊 Final Results:")
    for metric, value in metrics.items():
        print(f"{metric.upper():10s}: {value:.4f}")
    
    return model, scaler, metrics

# TODO: Run your pipeline on the breast cancer dataset
final_model, final_scaler, final_metrics = complete_classification_pipeline(
    X_cancer, y_cancer
)

print("\n🎉 Pipeline completed successfully!")

---

## 🎓 Congratulations!

You've completed all the Logistic Regression exercises! 🎉

### What you've learned:
- ✅ Understanding the sigmoid function
- ✅ Mathematical foundations of logistic regression
- ✅ Implementation from scratch
- ✅ Using scikit-learn effectively
- ✅ Feature scaling importance
- ✅ Hyperparameter tuning
- ✅ Model evaluation techniques
- ✅ Cross-validation
- ✅ Real-world application

### Next steps:
1. Try logistic regression on your own datasets
2. Experiment with regularization (L1 vs L2)
3. Learn about multinomial logistic regression
4. Compare with other classification algorithms
5. Study advanced topics like class imbalance handling

### 🤔 Final Reflection Questions:

1. **When would you choose logistic regression over other algorithms?**

*Write your answer here:*

2. **What are the main limitations of logistic regression?**

*Write your answer here:*

3. **How would you handle a dataset with 10 classes using logistic regression?**

*Write your answer here:*
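
As a starting point for this question, recent scikit-learn versions let `LogisticRegression` accept multiclass labels directly, fitting a multinomial (softmax) model with the default `lbfgs` solver; a one-vs-rest wrapper is the other common route. The sketch below tries the direct approach on the ten-class digits dataset, which is not used elsewhere in this notebook and is brought in purely for illustration, as are the variable names.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_digits, y_digits = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_digits, y_digits, test_size=0.2, random_state=42, stratify=y_digits
)

multi_clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))
multi_clf.fit(X_tr, y_tr)

# One probability column per class; each row sums to 1
proba = multi_clf.predict_proba(X_te)
multi_accuracy = multi_clf.score(X_te, y_te)

print(f"Classes: {multi_clf.classes_}")
print(f"Probability matrix shape: {proba.shape}")
print(f"Test accuracy: {multi_accuracy:.4f}")
```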

Great job! 🌟 Keep practicing and exploring!