## Car Price Prediction – Phase 2: Tracking & Model Management with MLflow
Kalisch & Pfaffenlehner

## **1️⃣ Introduction**  
### **Objective of Phase 2**  
## Experiment Overview

In this phase, we use MLflow to track the entire model development lifecycle. This includes:
- Logging key hyperparameters and configuration settings.
- Recording evaluation metrics such as RMSE and R².
- Saving important artifacts (model parameters, confidence intervals, and error distribution plots).
- Setting tags for easy filtering (e.g., dataset version, algorithm).
- Registering the best-performing model in the MLflow Model Registry.

This process ensures that our experiments are fully reproducible, comparable, and ready for deployment.


### **Model Versions in Phase 2**  
A total of **9 model versions** were developed, each building on the previous one. Not every change lowered the RMSE. The key changes are summarized below:

| Version | Change | RMSE | R² |
|---------|---------|------|----|
| 1 | Baseline: Linear Regression | *530.56* | *0.9694* |
| 2 | Polynomial Features + Ridge Regression | *216.50* | *0.9950* |
| 3 | Improved Feature Standardization | *566.83* | *0.9655* |
| 4 | Hyperparameter Tuning for Ridge | *565.58* | *0.9657* |
| 5 | Comparison of Ridge, Lasso & ElasticNet | *527.72* | *0.9672* |
| 6 | Feature Selection with Lasso | *1339.54* | *0.8075* |
| 7 | Inclusion of Additional Features (Owner_Count) | *1333.45* | *0.8092* |
| 8 | Optimized Model with Best Hyperparameters | *527.82* | *0.9673* |
| 9 | Fine-tuning with Lower Lasso Alpha | *792.66* | *0.9326* |

**Code for initializing MLflow:** 


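```python
import mlflow

# "Version X" is a placeholder for the actual experiment version.
mlflow.set_experiment("Car Price Prediction - Version X")

# `ridge`, `rmse`, and `r2` are assumed to come from the preceding
# training and evaluation cells.
with mlflow.start_run():
    mlflow.log_param("Model", "Ridge")
    mlflow.log_param("alpha", 1.0)
    mlflow.log_metric("RMSE", rmse)
    mlflow.log_metric("R2_Score", r2)
    mlflow.sklearn.log_model(ridge, "ridge_model_vX")
```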

## MLflow Tracking Details
### **Why Use MLflow for Experiment Tracking?**
MLflow provides **structured tracking of model performance** and supports reproducibility. The key benefits include:
- ✅ Easy **comparison** of different models
- ✅ Ability to **restore previous models** and configurations
- ✅ Centralized **artifact storage** (models, plots, parameters, and metrics)

### **What Was Logged in MLflow?**
| Logged Item | Description |
|------------|-------------|
| **Parameters** | Model hyperparameters such as `alpha`, `degree` for polynomial features |
| **Metrics** | RMSE (Root Mean Squared Error), R² (coefficient of determination) |
| **Artifacts** | Model files (`.pkl`), confidence interval CSV, plots (residuals, error distribution) |
| **Tags** | Model type, dataset version, experiment version |

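For reference, the two logged metrics can be computed directly from predictions. The sketch below uses NumPy and toy numbers that are not taken from the car dataset; the function names are chosen for illustration. scikit-learn offers equivalent functions in `sklearn.metrics`.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (here: price)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return float(1.0 - ss_res / ss_tot)


# Toy values for illustration only.
y_true = [10_000, 12_000, 15_000, 20_000]
y_pred = [10_500, 11_500, 15_500, 19_500]
print(rmse(y_true, y_pred), r2_score(y_true, y_pred))
```

RMSE is expressed in price units, so it is easy to interpret but depends on the price scale; R² is unitless, which makes it more convenient for comparing versions.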
### **MLflow UI Overview**
To view the experiment results, the MLflow UI was launched using:  
```bash
mlflow ui
```


![image.png](attachment:image.png)

## **Understanding MLflow Tracking in Depth**
### **How Are Runs Tracked in MLflow?**
Each MLflow run is recorded in a **tracking store** (a local directory or database, or a remote tracking server), which holds:
1. **Parameters:** Hyperparameters such as `alpha` and the polynomial `degree`.
2. **Metrics:** RMSE, R²
3. **Artifacts:** Model files, confidence intervals, plots
4. **Tags:** Model type, dataset version

### **Retrieving Past Runs from MLflow**
```python
import mlflow

# Get all runs of the active experiment as a pandas DataFrame
runs = mlflow.search_runs()
print(runs[["run_id", "metrics.RMSE", "metrics.R2_Score", "params.alpha"]])
```



---



## **Confidence Intervals - Visualizing Uncertainty**
### **How to Interpret Confidence Intervals?**
A confidence interval (CI) provides an **uncertainty range** for a prediction:
- **Narrow CI → More confident prediction**
- **Wide CI → More uncertainty (fewer data points in that range)**

### **Visualizing the Confidence Intervals**
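```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Load the interval bounds. This assumes the CSV holds one row per test sample,
# with the lower bound in the first data column and the upper bound in the second.
conf_intervals = pd.read_csv("confidence_intervals.csv", index_col=0)
lower, upper = conf_intervals.iloc[:, 0], conf_intervals.iloc[:, 1]

# Plot predictions with CI bands; `y_test` and `y_pred` come from the evaluation step.
x = np.arange(len(y_pred))
plt.figure(figsize=(8, 5))
plt.plot(x, np.asarray(y_test), label="True Prices")
plt.plot(x, y_pred, label="Predicted Prices")
plt.fill_between(x, lower, upper, color="b", alpha=0.2, label="95% CI")
plt.xlabel("Sample Index")
plt.ylabel("Car Price (€)")
plt.title("Predicted Prices with 95% Confidence Intervals")
plt.legend()
plt.show()
```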


## **Data Preprocessing & Bias Considerations**
### **How Were Missing Values Handled?**
Rows with missing values in categorical columns were dropped.  
Missing values in numerical fields were imputed with the column median.

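The two rules above could be expressed in pandas roughly as follows. This is a sketch under assumptions: the helper name `handle_missing` and the toy data are invented for illustration, and the medians are computed after the rows with missing categoricals have been dropped, which the report does not specify.

```python
import numpy as np
import pandas as pd


def handle_missing(df, categorical_cols, numerical_cols):
    """Drop rows with missing categoricals, then median-impute numerics."""
    cleaned = df.dropna(subset=categorical_cols).copy()
    medians = cleaned[numerical_cols].median()  # one median per numeric column
    cleaned[numerical_cols] = cleaned[numerical_cols].fillna(medians)
    return cleaned


# Toy frame for illustration; column names follow the feature list in this report.
cars = pd.DataFrame({
    "Brand": ["Audi", None, "BMW", "Mercedes", "Audi"],
    "Fuel_Type": ["Diesel", "Petrol", "Petrol", "Diesel", "Petrol"],
    "Engine_Size": [2.0, 1.6, np.nan, 3.0, 1.4],
    "Mileage": [50_000, 80_000, 120_000, np.nan, 30_000],
})

cleaned = handle_missing(
    cars,
    categorical_cols=["Brand", "Fuel_Type"],
    numerical_cols=["Engine_Size", "Mileage"],
)
print(cleaned)
```

In a train/test setting, the medians would normally be computed on the training split only and reused for the test split, so that no information leaks from test data into preprocessing.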
### **Could Brand Preference Introduce Bias?**
**Problem:** Some brands (e.g., Mercedes) **have a higher resale value**, which could bias predictions.  
**Mitigation:** Features such as `Car_Age` were normalized with the aim of reducing brand-specific price effects.

### **Dataset Limitations**
1️⃣ **Unbalanced price distribution** – Fewer expensive cars in dataset → could bias predictions for high-end vehicles.  
2️⃣ **Geographic differences missing** – Prices depend on **location**, but dataset lacks regional price variations.  




## **2️⃣ Model Performance Comparison – Insights & Analysis**
This section provides **a deeper interpretation of the model performance trends**.


## **📊 Model Performance Comparison**
### **How Did Performance Evolve?**
Analyzing the evolution of RMSE and R² across different versions:

| Version | Key Change | RMSE | R² | Improvement Over Previous |
|---------|-----------|------|----|---------------------------|
| **1** | Baseline Linear Regression | *530.56* | *0.9694* | - |
| **2** | Added Polynomial Features (Degree 2) | *216.50* | *0.9950* | ✅ RMSE decreased |
| **3** | Feature Scaling & Selection | *566.83* | *0.9655* | 🔻 RMSE increased; intended to add stability |
| **4** | Hyperparameter Tuning | *565.58* | *0.9657* | ✅ Best Ridge model |
| **5** | Comparing Ridge, Lasso, ElasticNet | *527.72* | *0.9672* | ✅ Lasso performed best |
| **6** | Feature Reduction via Lasso | *1339.54* | *0.8075* | 🔻 Higher RMSE (Overfitting reduced?) |
| **7** | Additional Features (Owner_Count) | *1333.45* | *0.8092* | ✅ Minor improvement |
| **8** | Optimized Best Model | *527.82* | *0.9673* | ✅ Best overall model |
| **9** | Final Fine-tuning | *792.66* | *0.9326* | 🔻 RMSE increased vs. Version 8; intended to improve generalization |

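The version-to-version changes can be checked numerically. The short sketch below copies the RMSE values from the comparison table above and computes the absolute and relative change against the previous version; the function name `rmse_changes` is an illustrative choice.

```python
# RMSE per version, copied from the comparison table above.
rmse_by_version = {
    1: 530.56, 2: 216.50, 3: 566.83, 4: 565.58, 5: 527.72,
    6: 1339.54, 7: 1333.45, 8: 527.82, 9: 792.66,
}


def rmse_changes(rmse_by_version):
    """Return {version: (absolute change, percent change)} vs. the previous version."""
    versions = sorted(rmse_by_version)
    changes = {}
    for prev, curr in zip(versions, versions[1:]):
        delta = rmse_by_version[curr] - rmse_by_version[prev]
        pct = 100 * delta / rmse_by_version[prev]
        changes[curr] = (round(delta, 2), round(pct, 1))
    return changes


changes = rmse_changes(rmse_by_version)
for version, (delta, pct) in changes.items():
    direction = "lower" if delta < 0 else "higher"
    print(f"v{version}: RMSE {delta:+.2f} ({pct:+.1f}%), {direction} than v{version - 1}")

# Version with the lowest RMSE among all logged versions.
best_version = min(rmse_by_version, key=rmse_by_version.get)
print("Lowest RMSE:", best_version)
```

A negative change means the RMSE dropped relative to the previous version; a positive change means it rose.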
### **Key Observations**
- **Feature engineering** (e.g., `Mileage_sqrt`) led to **better generalization**.
- **Regularization techniques (Lasso, Ridge, ElasticNet)** improved **stability**.
- **Some feature reductions led to increased RMSE**, indicating that the removed features carried predictive signal.
- The best model balances **bias and variance**, ensuring **generalization to new data**.



## 3️⃣ Model Selection Justification
This section explains why the best model was chosen.

## **Selecting the Best Model**
### **Criteria for Selection**
The final model was selected based on:
- ✅ **Lowest RMSE**
- ✅ **Highest R²**
- ✅ **Stable test performance** (avoiding overfitting)
- ✅ **Interpretability & practical use**

**Final Model: [Insert Model Name]**
- **Hyperparameters:** Alpha = 0.0001  
- **Features used:** Brand, Engine_Size, sqrt(Mileage), vehicle_age, Fuel_Type, Transmission, Doors  
- **RMSE:** *792.66*  
- **R²:** *0.9326*  

### **Why Not Another Model?**
- Some models **overfitted** (e.g., high R², low RMSE on train but bad test performance).
- Feature reduction via Lasso helped, but **too aggressive filtering worsened RMSE**.
- ElasticNet balanced L1/L2 penalties, but Lasso was **slightly better**.

### **Conclusion**
The **best-performing model** was logged and **registered in MLflow for Phase 3 deployment**.



## 4️⃣ Confidence Intervals: Why They Matter
This section explains confidence intervals and their importance.

## **Confidence Intervals for Model Predictions**
### **What Are Confidence Intervals?**
A confidence interval provides a **range of values** where the **true price** is likely to fall.  
For example:  

Predicted price: €15,000

95% Confidence Interval: [€14,200 - €15,800]

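This is read as being **95% confident** that the actual car price lies in this range. Strictly speaking, an interval for an individual car's price is a prediction interval, which is wider than the confidence interval for the average predicted price.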

### **Implementation in Python**
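```python
import mlflow
import statsmodels.api as sm

# `X_train` and `y_train` are assumed to be pandas objects from the training step.
alpha = 0.05  # significance level, i.e. 95% confidence
X_train_sm = sm.add_constant(X_train)  # add the intercept column
model_sm = sm.OLS(y_train, X_train_sm).fit()

# Note: conf_int() returns lower/upper bounds for each model coefficient
# (one row per coefficient), not per-prediction intervals.
conf_interval = model_sm.conf_int(alpha)

conf_interval.to_csv("confidence_intervals.csv")
mlflow.log_artifact("confidence_intervals.csv")
```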


✅ Logged confidence_intervals.csv in MLflow for later usage.






## **5️⃣ Final Summary & Phase 3 Outlook**
This section **wraps up Phase 2 and introduces Phase 3**.


# **Final Summary & Outlook for Phase 3**
### **Phase 2 Key Takeaways**
✅ **MLflow was successfully used** to track experiments.  
✅ **Multiple model versions** were tested, tuned, and compared.  
✅ **Lasso Regularization** improved feature selection.  
✅ **The best model was registered in MLflow.**  
✅ **Confidence intervals were computed and stored.**  

### **What’s Next? Phase 3: Streamlit Deployment**
In the next phase, a **web-based dashboard** will be built to:
- **Allow users to input car features**
- **Predict car prices in real time**
- **Display confidence intervals** for each prediction

➡ **The MLflow-registered model will be loaded into a Streamlit app!**  
➡ **See Phase 3 Notebook for implementation.**
