# 1.4 Tune Regularization Hyperparameters

## Model Cycle: The 5 Key Steps

### 1. Build the Model: Create the pipeline with regularization.
### 2. Train the Model: Fit the model on the training data.
### 3. Generate Predictions: Use the trained model to make predictions.
### 4. Evaluate the Model: Assess performance using evaluation metrics.
### **5. Improve the Model: Tune hyperparameters for optimal performance.**

### **Table of Contents**

<div style="overflow-x: auto;">

- [Introduction](#scrollTo=intro)
- [1. Load Dependencies and Data](#scrollTo=section1)
- [2. Understanding Hyperparameter Tuning](#scrollTo=section2)
  - [2.1 What is GridSearchCV?](#scrollTo=section2_1)
  - [2.2 Key Hyperparameters to Tune](#scrollTo=section2_2)
- [3. Tune the C Parameter](#scrollTo=section3)
  - [3.1 L2 (Ridge) Model Tuning](#scrollTo=section3_1)
  - [3.2 L1 (Lasso) Model Tuning](#scrollTo=section3_2)
- [4. Tune ElasticNet Parameters](#scrollTo=section4)
  - [4.1 Tuning C and l1_ratio Together](#scrollTo=section4_1)
  - [4.2 Visualize Hyperparameter Grid](#scrollTo=section4_2)
- [5. Select the Best Model](#scrollTo=section5)
  - [5.1 Compare All Tuned Models](#scrollTo=section5_1)
  - [5.2 Final Model Selection](#scrollTo=section5_2)
- [6. Evaluate on Test Set](#scrollTo=section6)
  - [6.1 Final Performance Metrics](#scrollTo=section6_1)
  - [6.2 Confusion Matrix](#scrollTo=section6_2)
- [7. Visualize Hyperparameter Performance](#scrollTo=section7)
- [8. Summary](#scrollTo=section8)

</div>

## Introduction

In the previous notebooks, we built and trained regularized logistic regression models using fixed hyperparameter values (`C=1.0`, `l1_ratio=0.5`). However, these starting values may not be optimal for our specific dataset.

In this notebook, we use **GridSearchCV** to systematically search for the best hyperparameter values. This is a critical step in the machine learning workflow that can significantly improve model performance.

### Learning Objectives

By the end of this notebook, you will be able to:

1. Use GridSearchCV to tune the regularization strength (`C` parameter)
2. Tune the ElasticNet mixing parameter (`l1_ratio`)
3. Visualize hyperparameter performance
4. Select the best model based on cross-validation results
5. Evaluate the final model on the held-out test set

## 1. Load Dependencies and Data

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
import pandas as pd
import numpy as np
import pickle
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.metrics import (
    f1_score, precision_score, recall_score, accuracy_score,
    classification_report, confusion_matrix, make_scorer
)

pd.options.display.max_columns = None

In [None]:
# Set up file paths
root_filepath = '/content/drive/MyDrive/projects/Applied-Data-Analytics-For-Higher-Education-Course-2/'
data_filepath = f'{root_filepath}data/'
course3_models = f'{root_filepath}course_3/models/'

In [None]:
# Load training and test data
df_training = pd.read_csv(f'{data_filepath}training.csv')
df_testing = pd.read_csv(f'{data_filepath}testing.csv')

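# Separate the features from the target so the label cannot leak into the model inputs
X_train = df_training.drop(columns=['SEM_3_STATUS'])
y_train = df_training['SEM_3_STATUS']

X_test = df_testing.drop(columns=['SEM_3_STATUS'])
y_test = df_testing['SEM_3_STATUS']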

print(f"Training data: {X_train.shape[0]} samples")
print(f"Test data: {X_test.shape[0]} samples")
print(f"\nTarget distribution (Training):")
print(y_train.value_counts(normalize=True))

In [None]:
# Load the regularized models from Course 3
def load_model(filename):
    # Use a context manager so the file handle is closed after loading
    with open(f'{course3_models}{filename}', 'rb') as f:
        return pickle.load(f)

l2_model = load_model('l2_ridge_logistic_model.pkl')
l1_model = load_model('l1_lasso_logistic_model.pkl')
elasticnet_model = load_model('elasticnet_logistic_model.pkl')

print("Models loaded successfully!")

## 2. Understanding Hyperparameter Tuning

### 2.1 What is GridSearchCV?

**GridSearchCV** is a method for systematically working through multiple combinations of hyperparameter values, cross-validating as it goes, to determine which combination gives the best performance.

**How it works:**
1. Define a grid of hyperparameter values to try
2. For each combination, perform k-fold cross-validation
3. Compute the average score across folds
4. Select the combination with the best average score
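
The four steps above can be sketched in plain Python. This is a simplified illustration of the search loop, not scikit-learn's implementation: `grid_search`, `toy_fit_and_score`, and the placeholder folds are hypothetical names, and the toy scoring function stands in for fitting and scoring a real model.

```python
import math
from itertools import product
from statistics import mean


def grid_search(param_grid, folds, fit_and_score):
    """Exhaustive search: return (best_params, best_mean_score, all_results)."""
    names = list(param_grid)
    results = []
    # Steps 1-2: enumerate every combination and cross-validate it
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        fold_scores = [fit_and_score(params, train, valid) for train, valid in folds]
        # Step 3: average the score across folds
        results.append((params, mean(fold_scores)))
    # Step 4: keep the combination with the best average score
    best_params, best_score = max(results, key=lambda item: item[1])
    return best_params, best_score, results


def toy_fit_and_score(params, train, valid):
    # Hypothetical stand-in for "fit on train, score on valid":
    # the score peaks at C = 1.0 and differs slightly from fold to fold.
    return 0.70 - 0.05 * abs(math.log10(params["C"])) - 0.01 * valid


folds = [(None, k) for k in range(5)]  # placeholder (train, valid) pairs
grid = {"C": [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]}

best_params, best_score, results = grid_search(grid, folds, toy_fit_and_score)
print(best_params, round(best_score, 4))
```

In the real workflow, `GridSearchCV` performs this loop for us and additionally refits the best combination on the full training set.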

**Why use GridSearchCV instead of manual tuning?**
- Exhaustively searches all combinations
- Uses cross-validation, so the choice does not depend on a single lucky (or unlucky) validation split
- Automatically tracks results for comparison

### 2.2 Key Hyperparameters to Tune

| Hyperparameter | Description | Range to Search |
|:---------------|:------------|:----------------|
| `C` | Inverse of regularization strength | 0.001 to 100 (log scale) |
| `l1_ratio` | ElasticNet mixing (0=L2, 1=L1) | 0.0 to 1.0 |

**Important Notes:**
- **Small C** = Strong regularization (simpler model, may underfit)
- **Large C** = Weak regularization (complex model, may overfit)
- We search on a **logarithmic scale** because the effect of C is multiplicative: what matters is the ratio between candidate values (0.01 vs. 0.1), not their absolute difference
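
As a small illustration of the logarithmic-scale point, the sketch below builds one candidate per decade and compares it with a linear grid over the same range. The variable names (`C_grid`, `linear_grid`) are chosen for this example only.

```python
import math

# Six candidates, one per decade from 10^-3 to 10^2
C_grid = [10.0 ** k for k in range(-3, 3)]

# On a log scale, neighbouring candidates differ by a constant factor of 10
ratios = [b / a for a, b in zip(C_grid, C_grid[1:])]

# A linear grid with the same end points places almost every candidate
# in the weak-regularization region (C > 1)
step = (100.0 - 0.001) / 5
linear_grid = [0.001 + i * step for i in range(6)]

log_below_one = sum(c < 1.0 for c in C_grid)
linear_below_one = sum(c < 1.0 for c in linear_grid)

print("Log grid:   ", C_grid)
print("Linear grid:", [round(c, 3) for c in linear_grid])
print(f"Candidates with C < 1: log={log_below_one}, linear={linear_below_one}")
```

The log grid explores strong and weak regularization equally, while the linear grid tests only a single strongly regularized value.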

In [None]:
# Define cross-validation strategy
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Define scorer - we optimize for F1 on the minority class ('N' = students who leave)
f1_scorer = make_scorer(f1_score, pos_label='N')

print("Cross-validation: 5-fold stratified")
print("Optimization metric: F1 score (positive class = 'N')")

## 3. Tune the C Parameter

### 3.1 L2 (Ridge) Model Tuning

First, we tune the `C` parameter for the L2 regularized model.
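
The parameter grid uses keys such as `classifier__C`, where the part before the double underscore names a pipeline step and the part after it names a parameter of that step. The helper below is a hypothetical illustration of how such a key decomposes. It is not a scikit-learn function, and it assumes the final pipeline step is named `classifier`, as the grid keys in this notebook imply.

```python
def split_param_name(name):
    """Split a 'step__parameter' key into (step, parameter)."""
    step, delimiter, parameter = name.partition("__")
    if not delimiter:
        raise ValueError(f"{name!r} has no '__' separator")
    return step, parameter


print(split_param_name("classifier__C"))
print(split_param_name("classifier__l1_ratio"))
```

Only the double underscore is treated as a separator, so parameter names that contain a single underscore (such as `l1_ratio`) stay intact.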

In [None]:
# Define C values to search (logarithmic scale)
C_values = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]

# Create parameter grid for L2 model
# Note: parameters inside pipeline use 'step__parameter' naming
l2_param_grid = {
    'classifier__C': C_values
}

print(f"L2 Parameter Grid:")
print(f"  C values: {C_values}")
print(f"  Total combinations to try: {len(C_values)}")

In [None]:
# Run GridSearchCV for L2 model
print("Tuning L2 (Ridge) model...")

l2_grid_search = GridSearchCV(
    estimator=l2_model,
    param_grid=l2_param_grid,
    cv=cv,
    scoring=f1_scorer,
    return_train_score=True,
    n_jobs=-1,  # Use all available cores
    verbose=1
)

l2_grid_search.fit(X_train, y_train)

print(f"\nBest C value: {l2_grid_search.best_params_['classifier__C']}")
print(f"Best CV F1 score: {l2_grid_search.best_score_:.4f}")

In [None]:
# View all L2 results
l2_results = pd.DataFrame(l2_grid_search.cv_results_)
l2_results_display = l2_results[[
    'param_classifier__C', 'mean_train_score', 'mean_test_score', 'std_test_score', 'rank_test_score'
]].sort_values('rank_test_score')

l2_results_display.columns = ['C', 'Train F1 (Mean)', 'CV F1 (Mean)', 'CV F1 (Std)', 'Rank']
print("L2 (Ridge) GridSearch Results:")
display(l2_results_display)

### 3.2 L1 (Lasso) Model Tuning

Now we tune the L1 regularized model.

In [None]:
# Create parameter grid for L1 model
l1_param_grid = {
    'classifier__C': C_values
}

print(f"L1 Parameter Grid:")
print(f"  C values: {C_values}")
print(f"  Total combinations to try: {len(C_values)}")

In [None]:
# Run GridSearchCV for L1 model
print("Tuning L1 (Lasso) model...")

l1_grid_search = GridSearchCV(
    estimator=l1_model,
    param_grid=l1_param_grid,
    cv=cv,
    scoring=f1_scorer,
    return_train_score=True,
    n_jobs=-1,
    verbose=1
)

l1_grid_search.fit(X_train, y_train)

print(f"\nBest C value: {l1_grid_search.best_params_['classifier__C']}")
print(f"Best CV F1 score: {l1_grid_search.best_score_:.4f}")

In [None]:
# View all L1 results
l1_results = pd.DataFrame(l1_grid_search.cv_results_)
l1_results_display = l1_results[[
    'param_classifier__C', 'mean_train_score', 'mean_test_score', 'std_test_score', 'rank_test_score'
]].sort_values('rank_test_score')

l1_results_display.columns = ['C', 'Train F1 (Mean)', 'CV F1 (Mean)', 'CV F1 (Std)', 'Rank']
print("L1 (Lasso) GridSearch Results:")
display(l1_results_display)

## 4. Tune ElasticNet Parameters

### 4.1 Tuning C and l1_ratio Together

ElasticNet has two hyperparameters to tune:
- **C**: Inverse of regularization strength (smaller values regularize more)
- **l1_ratio**: Balance between L1 and L2 penalties
  - `l1_ratio=0`: Pure L2 (Ridge)
  - `l1_ratio=1`: Pure L1 (Lasso)
  - `l1_ratio=0.5`: Equal mix

We use a 2D grid search to find the best combination.
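
To get a feel for the size of a 2D search, the sketch below enumerates every (`C`, `l1_ratio`) pair for the grid values used in this notebook and counts the resulting model fits under 5-fold cross-validation. It only counts combinations; no models are trained.

```python
from itertools import product

C_values = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
l1_ratio_values = [0.1, 0.3, 0.5, 0.7, 0.9]
n_folds = 5

# Every C value is paired with every l1_ratio value
combinations = [
    {"classifier__C": c, "classifier__l1_ratio": r}
    for c, r in product(C_values, l1_ratio_values)
]

# Each combination is fitted once per fold (the final refit is not counted)
n_fits = len(combinations) * n_folds

print(f"{len(combinations)} combinations x {n_folds} folds = {n_fits} model fits")
print("First:", combinations[0])
print("Last: ", combinations[-1])
```

Because the number of fits is the product of the grid sizes, adding values to either axis increases run time quickly.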

In [None]:
# Define l1_ratio values to search
l1_ratio_values = [0.1, 0.3, 0.5, 0.7, 0.9]

# Create parameter grid for ElasticNet model
elasticnet_param_grid = {
    'classifier__C': C_values,
    'classifier__l1_ratio': l1_ratio_values
}

print(f"ElasticNet Parameter Grid:")
print(f"  C values: {C_values}")
print(f"  l1_ratio values: {l1_ratio_values}")
print(f"  Total combinations to try: {len(C_values) * len(l1_ratio_values)}")

In [None]:
# Run GridSearchCV for ElasticNet model
print("Tuning ElasticNet model (this may take a moment)...")

elasticnet_grid_search = GridSearchCV(
    estimator=elasticnet_model,
    param_grid=elasticnet_param_grid,
    cv=cv,
    scoring=f1_scorer,
    return_train_score=True,
    n_jobs=-1,
    verbose=1
)

elasticnet_grid_search.fit(X_train, y_train)

print(f"\nBest parameters:")
print(f"  C: {elasticnet_grid_search.best_params_['classifier__C']}")
print(f"  l1_ratio: {elasticnet_grid_search.best_params_['classifier__l1_ratio']}")
print(f"Best CV F1 score: {elasticnet_grid_search.best_score_:.4f}")

In [None]:
# View all ElasticNet results
elasticnet_results = pd.DataFrame(elasticnet_grid_search.cv_results_)
elasticnet_results_display = elasticnet_results[[
    'param_classifier__C', 'param_classifier__l1_ratio', 
    'mean_train_score', 'mean_test_score', 'std_test_score', 'rank_test_score'
]].sort_values('rank_test_score').head(10)

elasticnet_results_display.columns = ['C', 'l1_ratio', 'Train F1 (Mean)', 'CV F1 (Mean)', 'CV F1 (Std)', 'Rank']
print("ElasticNet GridSearch Results (Top 10):")
display(elasticnet_results_display)

### 4.2 Visualize Hyperparameter Grid

Let's visualize how different combinations of C and l1_ratio affect model performance.
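
A heatmap needs the scores arranged as a 2D grid, with one axis per hyperparameter. The sketch below shows that reshaping step in plain Python on a small set of made-up scores; the numbers are hypothetical and are not results from this dataset.

```python
# Hypothetical flat results: (C, l1_ratio, mean CV score) for a 2 x 3 grid
flat_results = [
    (0.1, 0.1, 0.61), (0.1, 0.5, 0.63), (0.1, 0.9, 0.62),
    (1.0, 0.1, 0.66), (1.0, 0.5, 0.68), (1.0, 0.9, 0.67),
]

c_axis = sorted({c for c, _, _ in flat_results})      # heatmap columns
ratio_axis = sorted({r for _, r, _ in flat_results})  # heatmap rows
lookup = {(c, r): score for c, r, score in flat_results}

# Row i holds l1_ratio = ratio_axis[i]; column j holds C = c_axis[j]
score_grid = [[lookup[(c, r)] for c in c_axis] for r in ratio_axis]

for r, row in zip(ratio_axis, score_grid):
    print(f"l1_ratio={r}: {row}")
```

Sorting both axes matters: the axis labels passed to a plotting function only line up with the grid if they follow the same order as its rows and columns.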

In [None]:
# Create heatmap of ElasticNet CV scores
elasticnet_pivot = elasticnet_results.pivot_table(
    values='mean_test_score',
    index='param_classifier__l1_ratio',
    columns='param_classifier__C'
)

fig = px.imshow(
    elasticnet_pivot,
    labels=dict(x='C (Inverse Regularization Strength)', y='l1_ratio', color='CV F1 Score'),
    x=[str(c) for c in C_values],
    y=[str(r) for r in l1_ratio_values],
    color_continuous_scale='Viridis',
    aspect='auto',
    title='ElasticNet Hyperparameter Grid: CV F1 Scores'
)

fig.update_layout(height=400)
fig.show()

In [None]:
# Create 3D surface plot of ElasticNet CV scores
fig = go.Figure(data=[go.Surface(
    z=elasticnet_pivot.values,
    x=np.log10(C_values),  # Log scale for better visualization
    y=l1_ratio_values,
    colorscale='Viridis'
)])

fig.update_layout(
    title='ElasticNet Hyperparameter Surface',
    scene=dict(
        xaxis_title='log10(C)',
        yaxis_title='l1_ratio',
        zaxis_title='CV F1 Score'
    ),
    height=500
)

fig.show()

## 5. Select the Best Model

### 5.1 Compare All Tuned Models

Now let's compare the best performing model from each regularization type.

In [None]:
# Collect best results from each model
comparison_results = [
    {
        'Model': 'L2 (Ridge)',
        'Best C': l2_grid_search.best_params_['classifier__C'],
        'Best l1_ratio': 'N/A',
        'CV F1 (Mean)': l2_grid_search.best_score_,
        'CV F1 (Std)': l2_results.loc[l2_results['rank_test_score']==1, 'std_test_score'].values[0]
    },
    {
        'Model': 'L1 (Lasso)',
        'Best C': l1_grid_search.best_params_['classifier__C'],
        'Best l1_ratio': 'N/A',
        'CV F1 (Mean)': l1_grid_search.best_score_,
        'CV F1 (Std)': l1_results.loc[l1_results['rank_test_score']==1, 'std_test_score'].values[0]
    },
    {
        'Model': 'ElasticNet',
        'Best C': elasticnet_grid_search.best_params_['classifier__C'],
        'Best l1_ratio': elasticnet_grid_search.best_params_['classifier__l1_ratio'],
        'CV F1 (Mean)': elasticnet_grid_search.best_score_,
        'CV F1 (Std)': elasticnet_results.loc[elasticnet_results['rank_test_score']==1, 'std_test_score'].values[0]
    }
]

comparison_df = pd.DataFrame(comparison_results)
print("Best Tuned Models Comparison:")
display(comparison_df)

In [None]:
# Visualize comparison
fig = go.Figure()

fig.add_trace(go.Bar(
    x=comparison_df['Model'],
    y=comparison_df['CV F1 (Mean)'],
    error_y=dict(type='data', array=comparison_df['CV F1 (Std)']),
    marker_color=['#1f77b4', '#ff7f0e', '#2ca02c']
))

fig.update_layout(
    title='Comparison of Tuned Models: Cross-Validation F1 Scores',
    xaxis_title='Model',
    yaxis_title='CV F1 Score (Mean +/- Std)',
    height=400
)

fig.show()

### 5.2 Final Model Selection

We select the model with the highest cross-validation F1 score. In case of a tie, we prefer simpler models (fewer features or stronger regularization).
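
One way to make the tie-breaking rule explicit is sketched below. The function `select_model`, the tolerance, and the candidate scores are all hypothetical. Using the smaller `C` as a proxy for "simpler" is an assumption of this sketch, since `C` is only roughly comparable across penalty types.

```python
def select_model(candidates, tolerance=1e-4):
    """Pick the highest mean CV score; among near-ties, prefer the smallest C."""
    best_score = max(c["cv_f1"] for c in candidates)
    tied = [c for c in candidates if best_score - c["cv_f1"] <= tolerance]
    # Smaller C means stronger regularization, used here as a proxy for simplicity
    return min(tied, key=lambda c: c["C"])


# Made-up scores for illustration only
candidates = [
    {"model": "L2 (Ridge)", "C": 1.0, "cv_f1": 0.6512},
    {"model": "L1 (Lasso)", "C": 0.1, "cv_f1": 0.6512},
    {"model": "ElasticNet", "C": 10.0, "cv_f1": 0.6498},
]

chosen = select_model(candidates)
print(chosen["model"], chosen["C"])
```

By contrast, a plain maximum over the scores returns whichever tied model happens to be listed first.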

In [None]:
# Identify the best overall model
best_idx = comparison_df['CV F1 (Mean)'].idxmax()
best_model_name = comparison_df.loc[best_idx, 'Model']

# Get the best estimator from the corresponding grid search
if best_model_name == 'L2 (Ridge)':
    best_model = l2_grid_search.best_estimator_
elif best_model_name == 'L1 (Lasso)':
    best_model = l1_grid_search.best_estimator_
else:
    best_model = elasticnet_grid_search.best_estimator_

print(f"Selected Best Model: {best_model_name}")
print(f"\nBest Hyperparameters:")
print(f"  C: {comparison_df.loc[best_idx, 'Best C']}")
print(f"  l1_ratio: {comparison_df.loc[best_idx, 'Best l1_ratio']}")
print(f"\nCV F1 Score: {comparison_df.loc[best_idx, 'CV F1 (Mean)']:.4f} (+/- {comparison_df.loc[best_idx, 'CV F1 (Std)']:.4f})")

## 6. Evaluate on Test Set

Now that we've selected the best model using cross-validation, we evaluate it on the **held-out test set**. This gives us an unbiased estimate of how the model will perform on new data.

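**Important**: We only evaluate on the test set ONCE, after all tuning is complete. If test results influenced our hyperparameter or model choices, the reported performance would be optimistically biased.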

### 6.1 Final Performance Metrics
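
The metrics in this section treat `'N'` (students who leave) as the positive class. As a reminder of what precision, recall, and F1 measure, the sketch below computes them by hand on a tiny made-up label set. `minority_class_metrics` is a hypothetical helper written for this example, not a library function.

```python
def minority_class_metrics(y_true, y_pred, positive="N"):
    """Return (precision, recall, f1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Made-up labels: 3 students left ('N'), 5 stayed ('Y')
y_true = ["N", "N", "N", "Y", "Y", "Y", "Y", "Y"]
y_pred = ["N", "N", "Y", "N", "Y", "Y", "Y", "Y"]

precision, recall, f1 = minority_class_metrics(y_true, y_pred)
print(f"Precision: {precision:.4f}  Recall: {recall:.4f}  F1: {f1:.4f}")
```

Accuracy on the same toy data would be 6/8 = 0.75, which is why accuracy alone can look better than the minority-class metrics.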

In [None]:
# Generate predictions on test set
y_pred = best_model.predict(X_test)
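# classes_ is sorted alphabetically (['N', 'Y']), so look up the column for 'N' explicitly
positive_idx = list(best_model.classes_).index('N')
y_pred_proba = best_model.predict_proba(X_test)[:, positive_idx]  # Probability of the positive class ('N')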

# Calculate metrics for minority class ('N')
test_f1 = f1_score(y_test, y_pred, pos_label='N')
test_precision = precision_score(y_test, y_pred, pos_label='N')
test_recall = recall_score(y_test, y_pred, pos_label='N')
test_accuracy = accuracy_score(y_test, y_pred)

print("="*50)
print(f"FINAL TEST SET EVALUATION: {best_model_name}")
print("="*50)
print(f"\nMetrics for Minority Class ('N' = Students who leave):")
print(f"  F1 Score:    {test_f1:.4f}")
print(f"  Precision:   {test_precision:.4f}")
print(f"  Recall:      {test_recall:.4f}")
print(f"\nOverall Accuracy: {test_accuracy:.4f}")

In [None]:
# Full classification report
print("\nFull Classification Report:")
print(classification_report(y_test, y_pred))

### 6.2 Confusion Matrix
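
When the label order is `['N', 'Y']` and `'N'` is the positive class, the first row and first column of the confusion matrix belong to `'N'`. The sketch below builds the matrix by hand on made-up labels to show which cell is which. `confusion_counts` is a hypothetical helper that follows the same convention as the matrix in this section (rows = actual, columns = predicted).

```python
def confusion_counts(y_true, y_pred, labels):
    """Rows are actual classes, columns are predicted classes, in `labels` order."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


# Made-up labels: 3 students left ('N'), 5 stayed ('Y')
y_true = ["N", "N", "N", "Y", "Y", "Y", "Y", "Y"]
y_pred = ["N", "N", "Y", "N", "Y", "Y", "Y", "Y"]

cm = confusion_counts(y_true, y_pred, labels=["N", "Y"])

# With 'N' listed first and treated as positive:
#   row 0 = actually left   -> [TP, FN]
#   row 1 = actually stayed -> [FP, TN]
(tp, fn), (fp, tn) = cm

print(cm)
print(f"TP={tp}  FN={fn}  FP={fp}  TN={tn}")
```

The familiar `tn, fp, fn, tp` unpacking order only applies when the negative class is listed first, so it is worth checking the label order before naming the cells.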

In [None]:
# Create confusion matrix
cm = confusion_matrix(y_test, y_pred, labels=['N', 'Y'])

# Create annotated heatmap
fig = px.imshow(
    cm,
    labels=dict(x='Predicted', y='Actual', color='Count'),
    x=['Left (N)', 'Stayed (Y)'],
    y=['Left (N)', 'Stayed (Y)'],
    color_continuous_scale='Blues',
    text_auto=True,
    aspect='equal',
    title=f'Confusion Matrix: {best_model_name} on Test Set'
)

fig.update_layout(height=400)
fig.show()

In [None]:
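# Interpret confusion matrix
# With labels=['N', 'Y'] and 'N' as the positive class, rows and columns are ordered
# [N, Y], so the flattened matrix reads: TP, FN, FP, TN
tp, fn, fp, tn = cm.ravel()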

print("Confusion Matrix Interpretation:")
print(f"\n  True Positives (correctly identified departures):  {tp}")
print(f"  True Negatives (correctly identified stayers):     {tn}")
print(f"  False Positives (predicted departure, but stayed): {fp}")
print(f"  False Negatives (predicted stay, but left):        {fn}")

## 7. Visualize Hyperparameter Performance

Let's create comprehensive visualizations to understand how hyperparameters affect model performance.

In [None]:
# Compare C parameter effect across all models
fig = make_subplots(rows=1, cols=3, subplot_titles=('L2 (Ridge)', 'L1 (Lasso)', 'ElasticNet (best l1_ratio)'))

# L2 Results
fig.add_trace(
    go.Scatter(
        x=np.log10(l2_results['param_classifier__C'].astype(float)),
        y=l2_results['mean_test_score'],
        mode='lines+markers',
        error_y=dict(type='data', array=l2_results['std_test_score']),
        name='L2 CV Score',
        line=dict(color='#1f77b4')
    ),
    row=1, col=1
)

# L1 Results
fig.add_trace(
    go.Scatter(
        x=np.log10(l1_results['param_classifier__C'].astype(float)),
        y=l1_results['mean_test_score'],
        mode='lines+markers',
        error_y=dict(type='data', array=l1_results['std_test_score']),
        name='L1 CV Score',
        line=dict(color='#ff7f0e')
    ),
    row=1, col=2
)

# ElasticNet Results (best l1_ratio for each C)
elasticnet_best_per_c = elasticnet_results.loc[
    elasticnet_results.groupby('param_classifier__C')['mean_test_score'].idxmax()
]

fig.add_trace(
    go.Scatter(
        x=np.log10(elasticnet_best_per_c['param_classifier__C'].astype(float)),
        y=elasticnet_best_per_c['mean_test_score'],
        mode='lines+markers',
        error_y=dict(type='data', array=elasticnet_best_per_c['std_test_score']),
        name='ElasticNet CV Score',
        line=dict(color='#2ca02c')
    ),
    row=1, col=3
)

fig.update_xaxes(title_text='log10(C)')
fig.update_yaxes(title_text='CV F1 Score', row=1, col=1)

fig.update_layout(
    height=400,
    title_text='Effect of C (Inverse Regularization Strength) on Model Performance',
    showlegend=False
)

fig.show()

In [None]:
# Visualize l1_ratio effect for ElasticNet at best C
best_c = elasticnet_grid_search.best_params_['classifier__C']
elasticnet_at_best_c = elasticnet_results[
    elasticnet_results['param_classifier__C'] == best_c
]

fig = go.Figure()

fig.add_trace(go.Scatter(
    x=elasticnet_at_best_c['param_classifier__l1_ratio'].astype(float),
    y=elasticnet_at_best_c['mean_test_score'],
    mode='lines+markers',
    error_y=dict(type='data', array=elasticnet_at_best_c['std_test_score']),
    marker=dict(size=10),
    line=dict(color='#2ca02c', width=2)
))

fig.update_layout(
    title=f'ElasticNet: Effect of l1_ratio at C={best_c}',
    xaxis_title='l1_ratio (0=Ridge, 1=Lasso)',
    yaxis_title='CV F1 Score',
    height=400
)

fig.show()

In [None]:
# Training vs Validation Score (checking for overfitting)
fig = make_subplots(rows=1, cols=3, subplot_titles=('L2 (Ridge)', 'L1 (Lasso)', 'ElasticNet'))

for i, (results, name, color) in enumerate([
    (l2_results, 'L2', '#1f77b4'),
    (l1_results, 'L1', '#ff7f0e'),
    (elasticnet_best_per_c, 'ElasticNet', '#2ca02c')
], 1):
    x_vals = np.log10(results['param_classifier__C'].astype(float))
    
    # Training score
    fig.add_trace(
        go.Scatter(
            x=x_vals,
            y=results['mean_train_score'],
            mode='lines+markers',
            name=f'{name} Train',
            line=dict(color=color, dash='dash')
        ),
        row=1, col=i
    )
    
    # Validation score
    fig.add_trace(
        go.Scatter(
            x=x_vals,
            y=results['mean_test_score'],
            mode='lines+markers',
            name=f'{name} Validation',
            line=dict(color=color)
        ),
        row=1, col=i
    )

fig.update_xaxes(title_text='log10(C)')
fig.update_yaxes(title_text='F1 Score', row=1, col=1)

fig.update_layout(
    height=400,
    title_text='Training vs Validation Scores: Checking for Overfitting',
    showlegend=True,
    legend=dict(orientation='h', yanchor='bottom', y=1.02)
)

fig.show()

**Interpretation:**

- When training and validation curves are close together, the model generalizes well
- A large gap (training >> validation) indicates overfitting
- Strong regularization (small C) reduces the risk of overfitting but may underfit
- The optimal C balances bias and variance
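
These rules of thumb can be turned into a quick check of the train/validation gap at each `C`. The sketch below is illustrative only: the `diagnose` helper, both thresholds, and the scores are hypothetical, and sensible thresholds depend on the dataset and metric.

```python
def diagnose(train_score, valid_score, gap_threshold=0.05, low_score=0.5):
    """Label a (train, validation) score pair using simple, arbitrary thresholds."""
    gap = train_score - valid_score
    if gap > gap_threshold:
        return "possible overfitting"
    if train_score < low_score:
        return "possible underfitting"
    return "reasonable fit"


# Made-up (C, mean train F1, mean validation F1) values
curve = [
    (0.001, 0.41, 0.40),
    (1.0, 0.67, 0.65),
    (100.0, 0.78, 0.64),
]

labels = {c: diagnose(train, valid) for c, train, valid in curve}
for c, train, valid in curve:
    print(f"C={c:<7} gap={train - valid:.2f}  {labels[c]}")
```

A low training score with a small gap points to underfitting, while a high training score with a large gap points to overfitting.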

In [None]:
# Save the best tuned model
best_model_filename = best_model_name.lower().replace(' ', '_').replace('(', '').replace(')', '')
save_path = f'{course3_models}{best_model_filename}_tuned.pkl'

with open(save_path, 'wb') as f:
    pickle.dump(best_model, f)
print(f"Best model saved to: {save_path}")

# Also save all grid search results for reference
grid_search_results = {
    'l2_grid_search': l2_grid_search,
    'l1_grid_search': l1_grid_search,
    'elasticnet_grid_search': elasticnet_grid_search,
    'best_model_name': best_model_name
}

results_path = f'{course3_models}grid_search_results.pkl'
with open(results_path, 'wb') as f:
    pickle.dump(grid_search_results, f)
print(f"Grid search results saved to: {results_path}")

## 8. Summary

In this notebook, we tuned regularization hyperparameters using GridSearchCV and evaluated the best model on the test set.

### Key Findings

In [None]:
# Final summary table
summary_data = {
    'Metric': ['Best Model', 'Best C', 'Best l1_ratio', 'CV F1 Score', 'Test F1 Score', 'Test Precision', 'Test Recall'],
    'Value': [
        best_model_name,
        str(comparison_df.loc[best_idx, 'Best C']),
        str(comparison_df.loc[best_idx, 'Best l1_ratio']),
        f"{comparison_df.loc[best_idx, 'CV F1 (Mean)']:.4f}",
        f"{test_f1:.4f}",
        f"{test_precision:.4f}",
        f"{test_recall:.4f}"
    ]
}

summary_df = pd.DataFrame(summary_data)
print("Final Model Summary:")
display(summary_df)

### Key Takeaways

1. **GridSearchCV** systematically searches hyperparameter combinations using cross-validation

2. **The C parameter** controls regularization strength:
   - Small C = strong regularization (simpler model)
   - Large C = weak regularization (more complex model)

3. **ElasticNet's l1_ratio** controls the balance between L1 and L2 penalties

4. **Visualizations** help understand how hyperparameters affect performance

5. **Test set evaluation** should only be done once, after all tuning is complete

### What We Learned About Our Data

Which regularization type scored best in cross-validation hints at the structure of the data:

| Best-Scoring Regularization Type | What This Suggests |
|:---------------------------------|:-------------------|
| L2 (Ridge) | Most features contribute somewhat |
| L1 (Lasso) | Some feature selection is beneficial |
| ElasticNet | A combination of shrinkage and selection helps |

### Practical Implications

- The tuned model can be used to identify at-risk students early
- Tuning the regularization hyperparameters can improve generalization relative to the untuned starting values
- The model's interpretability is maintained while improving performance

### Next Steps

Congratulations on completing Module 1 of Course 3! You have learned how to:

1. Understand regularization and its importance (Notebook 1.1)
2. Build regularized logistic regression models (Notebook 1.2)
3. Train and compare different regularization types (Notebook 1.3)
4. Tune hyperparameters and select the best model (This notebook)

In the next module, we will explore more advanced topics including:
- Additional classification algorithms
- Feature engineering techniques
- Model deployment considerations