# Automobile Insurance Claim Cost Prediction

This project aims to predict automobile insurance claim costs using machine learning models, with a focus on actuarial relevance and robustness to high-severity claims.

---

## Project Description
Insurance claim costs are highly skewed, with rare but expensive claims having a major financial impact.  
This project applies and compares several regression models to predict the total claim cost (`charge_totale`) and selects the most suitable model based on performance and business criteria.

---

## Dataset
- Source: Kaggle – Automobile Insurance Dataset  
- Number of observations: ~47,500  
- Target variable: `charge_totale`  
- Features: driver characteristics, vehicle information, exposure, and past claims

---

## Methodology
The project follows a complete machine learning workflow:

- Exploratory Data Analysis (EDA)
- Data preprocessing (encoding, scaling, train/test split)
- Model training and comparison
- Hyperparameter tuning using cross-validation
- Evaluation using MAE, RMSE, and R²
- Overfitting and underfitting analysis using learning curves
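
The three evaluation metrics listed above can be sketched in plain Python. This is an illustration only, not the notebook's code: the function names `mae`, `rmse`, and `r2` and the sample values are assumptions made for this example, and the notebook may well rely on library implementations instead.

```python
import math

# Illustrative helpers (hypothetical names, not taken from the notebook).

def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: squaring gives large errors more weight."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up claim costs and predictions, for illustration only.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 320.0, 380.0]

print(f"MAE:  {mae(y_true, y_pred):.2f}")
print(f"RMSE: {rmse(y_true, y_pred):.2f}")
print(f"R2:   {r2(y_true, y_pred):.2f}")
```

MAE and RMSE are expressed in the same unit as `charge_totale`, while R² is unitless.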

---

## Models Evaluated
- Linear Regression (baseline)
- Support Vector Regression (SVR)
- Random Forest
- XGBoost (baseline and tuned)

The final model was selected based on its ability to control large prediction errors, which is critical in insurance applications.

---

## Results
While Random Forest achieved the lowest MAE, the tuned XGBoost model obtained the best RMSE and R².  
RMSE penalizes large errors more heavily than MAE, and large errors are particularly costly in insurance, so **XGBoost (tuned)** was selected as the final model.
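
A small sketch of how MAE and RMSE can rank two models differently. The error values below are invented for illustration and are not results from this project: hypothetical model A is accurate on most claims but misses one badly, while hypothetical model B is uniformly mediocre.

```python
import math

def mae_from_errors(errors):
    """Mean absolute error computed from a list of prediction errors."""
    return sum(abs(e) for e in errors) / len(errors)

def rmse_from_errors(errors):
    """Root mean squared error computed from a list of prediction errors."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# Invented prediction errors (same unit as the target), not project results.
errors_a = [1.0, 1.0, 1.0, 1.0, 16.0]  # one large miss on a severe claim
errors_b = [5.0, 5.0, 5.0, 5.0, 5.0]   # consistent moderate misses

for name, errors in [("A", errors_a), ("B", errors_b)]:
    print(f"Model {name}: MAE={mae_from_errors(errors):.2f}, "
          f"RMSE={rmse_from_errors(errors):.2f}")
```

Here model A has the lower MAE and model B has the lower RMSE, because squaring makes the single large miss dominate. A selection rule that prioritizes RMSE therefore favors the model that avoids large misses.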

---

## Overfitting and Underfitting
Learning curves and cross-validation results indicate that the final model generalizes well.  
No significant overfitting or underfitting was detected, which suggests a reasonable bias–variance trade-off.

---

## Files
- `projet_lastf.ipynb` – Final notebook (data processing, modeling, evaluation)
- `Rapport Machine Learning.pdf` – Project report
- `requirements.txt` – Required Python libraries
- `README.md` – Project overview

---

## Installation and Execution
To run the project:

```bash
pip install -r requirements.txt
jupyter notebook projet_lastf.ipynb
```
