
### **01_classification_basics.ipynb**

1. **Introduction to Classification**
   - What is classification?  
   - Binary vs multiclass problems  
   - Overview of the Scikit-learn API

2. **Dataset Preparation**
   - Loading datasets (`load_iris`, `make_classification`)  
   - Splitting data (`train_test_split`)  
   - Feature scaling (`StandardScaler`)

3. **Training a Basic Classifier**
   - Fitting models: `LogisticRegression`, `KNeighborsClassifier`, etc.  
   - Predicting and understanding outputs  
   - Class probability predictions (`predict_proba()`)

4. **Decision Boundaries**
   - Plotting classification regions  
   - Visualizing decision surfaces

5. **Basic Evaluation Metrics**
   - Accuracy score  
   - Confusion matrix  
   - Classification report (precision, recall, F1-score)

6. **Handling Imbalanced Datasets**
   - Class weights  
   - Stratified splitting  
   - Resampling (optional intro)
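
The steps in this notebook could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the choice of `load_iris`, the 25% test split, the `random_state` values, and `class_weight="balanced"` are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Stratified split: class proportions are preserved in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the scaler on the training data only, then reuse it on the test data.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# class_weight="balanced" reweights classes by inverse frequency.
# Iris is already balanced, so it is shown here only to illustrate the option.
clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(X_train_s, y_train)

y_pred = clf.predict(X_test_s)          # hard class labels
y_proba = clf.predict_proba(X_test_s)   # shape (n_samples, n_classes)

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)   # rows: true class, columns: predicted

print(f"accuracy: {accuracy:.3f}")
print(cm)
print(classification_report(y_test, y_pred))
```

Each row of `y_proba` sums to 1, and `predict()` returns the class with the highest probability in that row.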

---

### **02_regression_pipeline.ipynb**

1. **Regression Problem Setup**
   - What is regression?  
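   - Loading regression datasets (e.g., `fetch_california_housing`, `load_diabetes`, synthetic data from `make_regression`; note that `load_boston` was removed in scikit-learn 1.2)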

2. **Linear Regression Models**
   - `LinearRegression`  
   - `Ridge`, `Lasso` for regularization  
   - Polynomial regression with `PolynomialFeatures`

3. **Scikit-learn Pipelines**
   - Creating a preprocessing + modeling pipeline  
   - Scaling, polynomial features, and model in one step  
   - `Pipeline()` and `make_pipeline()`

4. **Pipeline Advantages**
   - Clean code structure  
   - Prevention of data leakage  
   - Grid search compatibility

5. **Model Evaluation**
   - Regression metrics: MSE, RMSE, MAE, R²  
   - Train vs test performance comparison  
   - Residual plots
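
A pipeline combining scaling, polynomial features, and a regularized linear model might look like the sketch below. The synthetic data from `make_regression`, the polynomial degree, and the `alpha` value are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic regression data (assumed here in place of a real dataset).
X, y = make_regression(n_samples=300, n_features=4, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# All preprocessing lives inside the pipeline, so it is fitted on the
# training data only and test data cannot leak into the scaler.
model = make_pipeline(
    StandardScaler(),
    PolynomialFeatures(degree=2, include_bias=False),
    Ridge(alpha=1.0),
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Train vs test comparison: a large gap suggests overfitting.
train_r2 = model.score(X_train, y_train)

# Residuals are the input for a residual plot (residuals vs predictions).
residuals = y_test - y_pred

print(f"MSE={mse:.2f} RMSE={rmse:.2f} MAE={mae:.2f}")
print(f"R2 train={train_r2:.3f} test={r2:.3f}")
```

Because the pipeline behaves like a single estimator, it can be passed directly to `cross_val_score` or `GridSearchCV`.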

---

### **03_cross_validation.ipynb**

1. **Why Cross-Validation Matters**
   - Bias-variance trade-off  
   - Overfitting detection  
   - Evaluation beyond train/test split

2. **Cross-Validation Methods**
   - `KFold`, `StratifiedKFold`, `ShuffleSplit`  
   - `cross_val_score()` and `cross_validate()`  
   - Scoring options

3. **Using CV in Pipelines**
   - Integrating with `Pipeline`  
   - `cross_val_score()` on a pipeline  

4. **Nested Cross-Validation**
   - Hyperparameter tuning inside CV  
   - Avoiding leakage during tuning

5. **Visualization of CV Results**
   - Boxplots of scores  
   - Mean and std visualization  
   - Fold-wise performance breakdown
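
The cross-validation ideas above could be sketched as follows. The dataset (`load_breast_cancer`), the fold counts, and the small grid over `C` are assumptions for the example rather than the notebook's actual settings.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    GridSearchCV,
    StratifiedKFold,
    cross_val_score,
    cross_validate,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# The scaler is refitted inside every fold because it is part of the pipeline.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One score per fold for a single metric.
scores = cross_val_score(pipe, X, y, cv=outer_cv, scoring="accuracy")

# Several metrics at once, plus training scores and timings.
results = cross_validate(
    pipe, X, y, cv=outer_cv, scoring=["accuracy", "f1"], return_train_score=True
)

# Nested CV: the inner loop tunes C, the outer loop estimates how well the
# whole tuning procedure generalizes.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
search = GridSearchCV(
    pipe, {"logisticregression__C": [0.1, 1.0, 10.0]}, cv=inner_cv
)
nested_scores = cross_val_score(search, X, y, cv=outer_cv)

print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
print(f"nested:   {nested_scores.mean():.3f} +/- {nested_scores.std():.3f}")
```

The per-fold arrays (`scores`, `results["test_f1"]`, `nested_scores`) are what a boxplot or fold-wise bar chart would visualize.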

---

### **04_metrics_visualization.ipynb**

1. **Classification Metrics and Visualization**
   - Confusion matrix (heatmap)  
   - ROC curve & AUC  
   - Precision-Recall curve  
   - Visualizing thresholds

2. **Regression Metrics and Visualization**
   - Residual plot  
   - Prediction error plot  
   - Actual vs predicted scatter plot

3. **Advanced Evaluation Tools**
   - `classification_report` as a DataFrame  
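   - Display utilities such as `ConfusionMatrixDisplay`, `RocCurveDisplay`, and `PrecisionRecallDisplay` (`from_estimator()` / `from_predictions()`), which replace the `sklearn.metrics.plot_*()` functions removed in scikit-learn 1.2  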
   - Visualizing overfitting with learning curves

4. **Custom Metric Functions**
   - Creating custom scoring functions for CV  
   - Using `make_scorer()`
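
The numbers behind these plots can be computed without any plotting code, as in the sketch below. The dataset, the logistic regression model, and the F2 score used as the custom metric are assumptions for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    classification_report,
    fbeta_score,
    make_scorer,
    precision_recall_curve,
    roc_auc_score,
    roc_curve,
)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Probability of the positive class; curves are traced over its thresholds.
pos_scores = model.predict_proba(X_test)[:, 1]

fpr, tpr, roc_thresholds = roc_curve(y_test, pos_scores)
auc = roc_auc_score(y_test, pos_scores)
precision, recall, pr_thresholds = precision_recall_curve(y_test, pos_scores)

# output_dict=True returns a nested dict, which can be passed to
# pandas.DataFrame(report).T if pandas is available.
report = classification_report(y_test, model.predict(X_test), output_dict=True)

# Custom scorer: F2 weights recall more heavily than precision.
f2_scorer = make_scorer(fbeta_score, beta=2)
f2_scores = cross_val_score(model, X, y, cv=5, scoring=f2_scorer)

print(f"AUC: {auc:.3f}")
print(f"F2 (5-fold): {f2_scores.mean():.3f}")
```

Plotting `fpr` against `tpr` gives the ROC curve, and `recall` against `precision` gives the precision-recall curve; `precision` and `recall` each have one more element than `pr_thresholds`.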

---

### **05_model_tuning.ipynb**

1. **Overview of Hyperparameter Tuning**
   - What are hyperparameters?  
   - Search strategies overview  

2. **Grid Search**
   - Using `GridSearchCV`  
   - Param grid definition  
   - Best estimator extraction

3. **Randomized Search**
   - `RandomizedSearchCV` for faster exploration  
   - Defining distributions and iteration control  

4. **Tuning with Pipelines**
   - Tuning preprocessing and model together  
   - Nested parameter names in `param_grid`

5. **Evaluation during Tuning**
   - Cross-validation during search  
   - Scoring metrics selection  
   - Analyzing `cv_results_`

6. **Visualization of Tuning Results**
   - Heatmaps for parameter scores  
   - Line plots and validation curves  
   - Best parameter vs performance trade-offs

7. **Final Model Deployment**
   - Refitting on full data  
   - Exporting model with `joblib`  
   - Prediction on new/test data
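
A tuning workflow along these lines might look like the following sketch. The step names `scale` and `svc`, the parameter ranges, and the temporary file name are hypothetical choices made for the example.

```python
import os
import tempfile

import joblib
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import (
    GridSearchCV,
    RandomizedSearchCV,
    train_test_split,
)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC())])

# Nested parameter names follow the pattern <step name>__<parameter>.
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]}
grid = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
grid.fit(X_train, y_train)

# Randomized search samples n_iter candidates from a distribution instead
# of enumerating a grid.
param_dist = {"svc__C": loguniform(1e-2, 1e2)}
rand = RandomizedSearchCV(pipe, param_dist, n_iter=8, cv=5, random_state=0)
rand.fit(X_train, y_train)

# With the default refit=True, best_estimator_ is refitted on all of X_train.
best = grid.best_estimator_
test_acc = best.score(X_test, y_test)

# cv_results_ holds one entry per candidate (here 3 * 2 = 6 for the grid).
n_candidates = len(grid.cv_results_["params"])

# Persist the fitted pipeline and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")
    joblib.dump(best, path)
    restored = joblib.load(path)

same_predictions = bool((restored.predict(X_test) == best.predict(X_test)).all())

print(grid.best_params_, f"test accuracy: {test_acc:.3f}")
```

The arrays in `grid.cv_results_` (for example `mean_test_score` and `params`) are the raw material for heatmaps and validation curves.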


---

### **06_advanced_sklearn.ipynb**

1. **Introduction to Advanced Scikit-Learn**
   - Recap of core algorithms and pipelines  
   - Motivation for deepening model and feature engineering techniques  

2. **Ensemble Methods Deep Dive**
   - Advanced ensemble techniques: bagging, boosting, stacking  
   - Hyperparameter considerations for ensemble models (e.g., Random Forests, Gradient Boosting)  
   - Best practices in ensemble design and interpretation

3. **Advanced Hyperparameter Optimization**
   - Beyond grid and randomized search: Bayesian optimization and meta-learning (provided by third-party libraries rather than scikit-learn itself)  
   - Automated hyperparameter tuning frameworks  
   - Cross-validation strategies for robust tuning  

4. **Feature Selection and Dimensionality Reduction**
   - Techniques for feature importance estimation (permutation importance, LASSO-based methods)  
   - Advanced feature selection: Recursive Feature Elimination (RFE), SelectFromModel  
   - Dimensionality reduction: PCA for preprocessing; t-SNE and UMAP (the latter from the third-party `umap-learn` package) mainly for visualization

5. **Model Interpretability and Diagnostics**
   - Tools for model interpretability: Partial Dependence Plots (`sklearn.inspection`), plus the third-party SHAP and LIME libraries in depth  
   - Error analysis and residual diagnostics  
   - Strategies for detecting and handling overfitting/underfitting

6. **Advanced Pipeline Integration**
   - Creating custom transformers and estimators  
   - Building multi-stage pipelines for complex workflows  
   - Strategies for pipeline automation and reproducibility

7. **Case Studies and Best Practices**
   - End-to-end projects showcasing advanced model building  
   - Discussion of trade-offs in model complexity versus interpretability  
   - Lessons learned from real-world applications
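
Several of the topics above (a custom transformer, model-based feature selection, stacking, and permutation importance) can be combined in one pipeline, as in this sketch. `Log1pTransformer` is a hypothetical transformer written for the example, and the dataset and estimator settings are assumptions. SHAP, LIME, UMAP, and Bayesian optimization live in third-party packages and are not shown.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class Log1pTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical custom transformer: log(1 + x) on non-negative features."""

    def fit(self, X, y=None):
        # Nothing is learned; record the input width as a fitted attribute.
        self.n_features_in_ = np.asarray(X).shape[1]
        return self

    def transform(self, X):
        return np.log1p(np.asarray(X, dtype=float))


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Stacking: base models' cross-validated predictions feed a final estimator.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Multi-stage pipeline: custom transform -> scaling -> selection -> ensemble.
model = make_pipeline(
    Log1pTransformer(),
    StandardScaler(),
    SelectFromModel(
        RandomForestClassifier(n_estimators=50, random_state=0),
        threshold="median",
    ),
    stack,
)
model.fit(X_train, y_train)

test_acc = model.score(X_test, y_test)
n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())

# Permutation importance on held-out data, measured on the original features.
perm = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)
top_features = np.argsort(perm.importances_mean)[::-1][:5]

print(f"test accuracy: {test_acc:.3f}, features kept: {n_selected}")
print("top feature indices:", top_features)
```

Permutation importance is computed here for the whole pipeline, so a feature dropped by the selection step should show an importance of zero.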

---
