### Week 15: Intro to Predictive Modeling (Optional)
---
* Linear Regression with scikit-learn
* Model training, evaluation basics
* Avoiding overfitting, train/test split
  
**Project:** 
* Predict future sales or employee attrition using linear regression

**Learning Objectives**

By the end of this week, you will be able to:
- Understand what a predictive model is
- Build a simple linear regression model using `scikit-learn`
- Split data into training and testing sets
- Evaluate model performance (R², MAE, RMSE)
- Use the model to make predictions


# What is Predictive Modeling?

> Predictive modeling uses **historical data** to forecast **future outcomes**.

Examples:
- Predict next month’s sales
- Predict employee attrition
- Predict customer ratings or satisfaction

## Linear Regression Overview

- One of the simplest predictive models
- Assumes a **linear relationship** between input (X) and output (y)
- Predicts **continuous** values
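
To make the idea of a linear relationship concrete, here is a minimal sketch that fits a line to a few made-up points. The data is synthetic and follows `y = 3x + 5` exactly, an assumption chosen so the fitted slope and intercept are easy to check by eye; real data will be noisier.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data that follows y = 3x + 5 (illustration only)
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # 2-D: one row per sample
y = 3.0 * X.ravel() + 5.0                          # 1-D: one target per sample

model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]        # learned weight for the single feature
intercept = model.intercept_  # learned value of y when x = 0

# Predict a continuous value for an unseen input, x = 6
prediction = model.predict(np.array([[6.0]]))[0]

print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={prediction:.2f}")
```

Note that scikit-learn expects `X` to be two-dimensional (rows are samples, columns are features), even when there is only one feature.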


## Modeling Workflow

1. **Define X (features) and y (target)**
2. **Split** the data into training and test sets
3. **Train** the model using `.fit()`
4. **Predict** on new data using `.predict()`
5. **Evaluate** the model
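
The five steps above can be sketched end to end as follows. The dataset is randomly generated, and the column names (`ad_spend`, `store_visits`, `sales`) and the coefficients used to create it are hypothetical, so treat this as an outline of the workflow rather than a finished sales model.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Hypothetical synthetic sales data (values are made up for illustration)
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "ad_spend": rng.uniform(1_000, 10_000, n),
    "store_visits": rng.uniform(100, 1_000, n),
})
df["sales"] = 2.5 * df["ad_spend"] + 40 * df["store_visits"] + rng.normal(0, 2_000, n)

# 1. Define X (features) and y (target)
X = df[["ad_spend", "store_visits"]]
y = df["sales"]

# 2. Split: hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 3. Train on the training set only
model = LinearRegression()
model.fit(X_train, y_train)

# 4. Predict on data the model has not seen
y_pred = model.predict(X_test)

# 5. Evaluate predictions against the held-out targets
test_r2 = r2_score(y_test, y_pred)
print(f"Test R²: {test_r2:.3f}")
```

Setting `random_state` makes the split reproducible, which helps when comparing results between runs.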

## Key Metrics

| Metric | Use |
|--------|-----|
| R²     | Proportion of variance in the target that the model explains (closer to 1 = better) |
| MAE    | Mean Absolute Error (lower = better) |
| RMSE   | Root Mean Squared Error (lower = better) |
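
As a rough illustration of what these metrics measure, the sketch below computes each one by hand with NumPy and then with the matching scikit-learn function. The four actual and predicted values are made up so the arithmetic is easy to follow.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up actual and predicted values (illustration only)
y_true = np.array([100.0, 150.0, 200.0, 250.0])
y_pred = np.array([110.0, 140.0, 210.0, 240.0])

errors = y_true - y_pred  # every prediction is off by 10 in one direction or the other

# MAE: average size of the errors, ignoring sign
mae = np.mean(np.abs(errors))

# RMSE: square root of the average squared error (penalizes large errors more)
rmse = np.sqrt(np.mean(errors ** 2))

# R²: 1 minus (unexplained variation / total variation around the mean)
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# The same metrics via scikit-learn
sk_mae = mean_absolute_error(y_true, y_pred)
sk_rmse = np.sqrt(mean_squared_error(y_true, y_pred))
sk_r2 = r2_score(y_true, y_pred)

print(f"MAE={mae:.1f}, RMSE={rmse:.1f}, R²={r2:.3f}")
```

MAE and RMSE are in the same units as the target (for example, dollars of sales), while R² is unitless.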


### Tips for Good Predictions

- Make sure your features (X) are numeric
- Remove or fill missing values
- Avoid using features that "leak" future information
- A train/test split is critical for detecting overfitting: always evaluate on data the model did not see during training
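
The data-preparation tips above might look like this with pandas. The employee table and its column names are hypothetical; `exit_interview_score` stands in for a feature that would only be known after the outcome, which is why it is dropped.

```python
import pandas as pd

# Hypothetical employee data (all values are made up for illustration)
df = pd.DataFrame({
    "department": ["Sales", "IT", "Sales", "HR"],
    "years_at_company": [2.0, None, 5.0, 8.0],
    "monthly_hours": [160, 172, 150, 165],
    "exit_interview_score": [3.0, None, None, 4.0],  # only exists after someone leaves
})

# Drop features that "leak" future information about the outcome
features = df.drop(columns=["exit_interview_score"])

# Fill missing numeric values, here with the column median.
# In a real project, compute the median on the training set only.
median_years = features["years_at_company"].median()
features["years_at_company"] = features["years_at_company"].fillna(median_years)

# Convert text categories to numeric 0/1 columns (one-hot encoding)
X = pd.get_dummies(features, columns=["department"], dtype=int)

print(X)
```

Filling with the median is only one option; dropping rows with `dropna()` is another, at the cost of losing data.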

### scikit-learn Model Summary

| Step        | Function                              |
| ----------- | ------------------------------------- |
| Split data  | `train_test_split()`                  |
| Train model | `model.fit()`                         |
| Predict     | `model.predict()`                     |
| Evaluate    | `r2_score()`, `mean_absolute_error()`, `mean_squared_error()` (take the square root for RMSE) |
