DaneshCode/CarPricePrediction

🚗 Car Price Prediction

A Machine Learning system for predicting car prices using advanced regression models and custom evaluation metrics.

Features • Installation • Usage • Models • NewMetric • Project Structure


📋 Overview

This project predicts car prices with machine learning. It trains several regression models for comparison and scores them with NewMetric, a custom evaluation metric designed for car price prediction.

Key Highlights

  • 🎯 Custom NewMetric for specialized car price evaluation
  • 🤖 Multiple ML Models including Gradient Boosting, Random Forest, and more
  • 📊 Visual Analytics with detailed comparison charts
  • 💾 Model Persistence for easy deployment and reuse
  • 🔄 Interactive Prediction system for real-time price estimation

✨ Features

| Feature | Description |
| --- | --- |
| Model Training | Train multiple regression models and compare their performance |
| NewMetric Evaluation | Custom metric combining MAE, RMSE, and Relative Error |
| Model Comparison | Side-by-side comparison of 5 different ML algorithms |
| Visualization | Generate publication-ready charts and graphs |
| Model Export | Save trained models for production use |
| Batch Prediction | Predict prices for multiple cars from Excel files |
| Interactive CLI | User-friendly command-line interface |

🛠 Installation

Prerequisites

  • Python 3.8 or higher
  • pip package manager

Step 1: Clone the Repository

git clone https://github.com/DaneshCode/CarPricePrediction.git
cd CarPricePrediction

Step 2: Install Dependencies

pip install pandas numpy scikit-learn matplotlib openpyxl

Required Libraries

| Library | Version | Purpose |
| --- | --- | --- |
| pandas | ≥1.3.0 | Data manipulation and analysis |
| numpy | ≥1.20.0 | Numerical computing |
| scikit-learn | ≥1.0.0 | Machine learning algorithms |
| matplotlib | ≥3.4.0 | Data visualization |
| openpyxl | ≥3.0.0 | Excel file support |
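
To confirm that an environment meets these minimums, a small standard-library check along the following lines may help. This is a sketch, not part of the repository: `MINIMUM_VERSIONS`, `parse_version`, and `check_requirements` are illustrative names, and the version comparison is deliberately simple (numeric parts only, pre-release suffixes ignored).

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the Required Libraries table.
MINIMUM_VERSIONS = {
    "pandas": "1.3.0",
    "numpy": "1.20.0",
    "scikit-learn": "1.0.0",
    "matplotlib": "3.4.0",
    "openpyxl": "3.0.0",
}


def parse_version(text):
    """Turn '1.20.3' into (1, 20, 3); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad to three components so '1.3' compares equal to '1.3.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return {package: status}: 'ok', 'missing', or 'too old (found X)'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if parse_version(installed) >= parse_version(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old (found {installed})"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

Any package reported as missing or too old can be fixed by rerunning the pip command from Step 2.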

🚀 Usage

1. Training the Model

To train and build the final prediction model:

python car_price_prediction.py

What this does:

  • Loads data from data.xlsx
  • Trains a Gradient Boosting model with optimized parameters
  • Evaluates performance using NewMetric
  • Saves the trained model to car_price_model.pkl
  • Generates visualization charts

Output Files:

  • car_price_model.pkl - Trained model file
  • final_model_results.png - Prediction vs Actual charts
  • final_feature_importance.png - Feature importance visualization

2. Comparing Multiple Models

To compare different ML algorithms:

python "Comparison of models.py"

Models Compared:

  • Linear Regression
  • Ridge Regression
  • Lasso Regression
  • Random Forest Regressor
  • Gradient Boosting Regressor

Output Files:

  • model_results.png - Model comparison charts
  • feature_importance.png - Feature importance for best model

3. Using the Trained Model

To make predictions with the saved model:

python use_model.py

Available Options:

| Option | Description |
| --- | --- |
| 1 | Display list of features |
| 2 | Interactive prediction (enter values manually) |
| 3 | Batch prediction from Excel file |
| 4 | Exit |

4. Programmatic Usage

from use_model import load_model, predict_price

# Load the trained model
model, feature_names = load_model("car_price_model.pkl")

# Prepare feature values (normalized between 0 and 1)
feature_values = {
    "کیلومتر_نرمال": 0.3,
    "سال_نرمال": 0.8,
    # ... other features
}

# Get prediction
predicted_price = predict_price(model, feature_names, feature_values)
print(f"Predicted Price: {predicted_price:,.0f} Toman")
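
This README does not show how `predict_price` turns the dictionary into model input. One plausible approach, sketched here under that caveat, is to order the values by `feature_names` and fill any missing feature with a default while warning about it. `build_feature_row` and the English feature names below are hypothetical and exist only for illustration.

```python
import warnings


def build_feature_row(feature_names, feature_values, default=0.0):
    """Order a {name: value} dict into the column order the model expects.

    Hypothetical helper: names missing from feature_values get `default`
    and trigger a warning; names the model does not know are ignored.
    """
    missing = [name for name in feature_names if name not in feature_values]
    if missing:
        warnings.warn(f"Missing features filled with {default}: {missing}")
    # A single row wrapped in a list, the 2-D shape scikit-learn's predict() expects.
    return [[float(feature_values.get(name, default)) for name in feature_names]]


# Placeholder feature names, for illustration only.
row = build_feature_row(
    ["mileage_norm", "year_norm"],
    {"year_norm": 0.8, "mileage_norm": 0.3},
)
print(row)  # [[0.3, 0.8]]
```

The dictionary order does not matter here because the row is always built in the order given by `feature_names`.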

5. Batch Prediction from Excel

from use_model import load_model, predict_from_excel

# Load model
model, feature_names = load_model()

# Predict for all cars in Excel file
results = predict_from_excel(
    model,
    feature_names,
    excel_path="new_cars.xlsx",
    output_path="predictions.xlsx"
)

🤖 Models

Supported Algorithms

| Model | Description | Best For |
| --- | --- | --- |
| Gradient Boosting | Ensemble of weak learners | ⭐ Best overall performance |
| Random Forest | Ensemble of decision trees | Robust to overfitting |
| Ridge Regression | L2 regularized linear | When features are correlated |
| Lasso Regression | L1 regularized linear | Feature selection |
| Linear Regression | Basic linear model | Baseline comparison |

Final Model Configuration

The production model uses Gradient Boosting Regressor with optimized parameters:

GradientBoostingRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=5,
    min_samples_split=5,
    min_samples_leaf=2,
    subsample=0.8,
    random_state=42
)
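
As a rough illustration of how this configuration could be trained and persisted, the sketch below fits it on synthetic data and round-trips the model through pickle. The synthetic dataset, the 80/20 split, and the in-memory pickle round trip are assumptions made so the example runs on its own; the repository's script reads data.xlsx and writes car_price_model.pkl, and its exact steps are not shown here.

```python
import pickle

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for data.xlsx: 300 rows, 4 features already scaled to [0, 1].
rng = np.random.default_rng(42)
X = rng.random((300, 4))
y = 1e9 * (0.5 + X[:, 0] - 0.3 * X[:, 1]) + rng.normal(0, 2e7, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Same hyperparameters as the configuration above.
model = GradientBoostingRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=5,
    min_samples_split=5,
    min_samples_leaf=2,
    subsample=0.8,
    random_state=42,
)
model.fit(X_train, y_train)

# Round-trip through pickle, standing in for saving and reloading a .pkl file.
restored = pickle.loads(pickle.dumps(model))
predictions = restored.predict(X_test)
print(predictions.shape)  # (60,)
```

Setting `subsample=0.8` makes each tree see a random 80% of the training rows, so fixing `random_state` is what keeps runs reproducible.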

🎯 NewMetric

What is NewMetric?

NewMetric is a custom evaluation metric designed for car price prediction. It is a weighted sum of three error measures: normalized MAE, normalized RMSE, and mean relative error.

Formula

$$\text{NewMetric} = 0.4 \times \text{MAE}_{norm} + 0.4 \times \text{RMSE}_{norm} + 0.2 \times \text{RelativeError}$$

Where:

  • MAE_norm = MAE / Mean Price (Normalized Mean Absolute Error)
  • RMSE_norm = RMSE / Mean Price (Normalized Root Mean Square Error)
  • RelativeError = Mean of |Actual - Predicted| / Actual
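
The formula and definitions above translate directly into code. The following dependency-free sketch is one way to compute it; it is not the repository's own implementation, and `new_metric` is an illustrative name. The inputs are the three rows from the Sample Predictions section of this README.

```python
import math


def new_metric(actual, predicted):
    """Weighted sum of normalized MAE, normalized RMSE, and mean relative error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and the same length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_price = sum(actual) / n

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Assumes every actual price is non-zero; a zero price would divide by zero.
    relative_error = sum(abs(e) / a for e, a in zip(errors, actual)) / n

    return 0.4 * (mae / mean_price) + 0.4 * (rmse / mean_price) + 0.2 * relative_error


actual = [1_200_000_000, 850_000_000, 500_000_000]
predicted = [1_180_000_000, 870_000_000, 450_000_000]
print(round(new_metric(actual, predicted), 4))  # 0.0391
```

Because MAE and RMSE are divided by the mean price, all three terms are unitless fractions, which is what allows them to be added together.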

Interpretation

| NewMetric Value | Performance |
| --- | --- |
| < 0.10 | 🟢 Excellent |
| 0.10 - 0.15 | 🟡 Good |
| 0.15 - 0.20 | 🟠 Average |
| > 0.20 | 🔴 Needs Improvement |

Note: Lower values indicate better performance.
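
The interpretation bands can be applied programmatically. In the sketch below, `rate_new_metric` is an illustrative name, and the handling of the shared boundaries is an assumption because the table does not say which band 0.10 and 0.15 belong to.

```python
def rate_new_metric(value):
    """Map a NewMetric value to a performance label from the interpretation table."""
    # Assumption: 0.10 counts as Good and 0.15 as Average (the table is ambiguous).
    # 0.20 counts as Average because only values above 0.20 need improvement.
    if value < 0.10:
        return "Excellent"
    if value < 0.15:
        return "Good"
    if value <= 0.20:
        return "Average"
    return "Needs Improvement"


for score in (0.04, 0.12, 0.18, 0.25):
    print(score, rate_new_metric(score))
```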


📁 Project Structure

CarPricePrediction/
│
├── 📄 car_price_prediction.py    # Main training script with final model
├── 📄 Comparison of models.py    # Model comparison and evaluation
├── 📄 use_model.py               # Inference and prediction utilities
│
├── 📊 data.xlsx                  # Training dataset (required)
├── 🤖 car_price_model.pkl        # Saved model (generated)
│
├── 📈 final_model_results.png    # Prediction charts (generated)
├── 📈 final_feature_importance.png
├── 📈 model_results.png
├── 📈 feature_importance.png
│
├── 📖 README.md                  # English documentation
└── 📖 README_FA.md               # Persian documentation

File Descriptions

| File | Purpose |
| --- | --- |
| car_price_prediction.py | Trains the final Gradient Boosting model, evaluates it, and saves it for production use |
| Comparison of models.py | Compares 5 different ML models using NewMetric and traditional metrics |
| use_model.py | Provides utilities for loading saved models and making predictions |
| data.xlsx | Excel file containing training data with normalized features |
| car_price_model.pkl | Serialized trained model for deployment |

📊 Data Format

The input Excel file (data.xlsx) should contain:

Required Columns

| Column | Type | Description |
| --- | --- | --- |
| قیمت | Numeric | Target variable (price in Toman) |
| *_نرمال | Numeric (0-1) | Normalized feature columns |
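
Before training, it can be useful to check that a spreadsheet follows this layout. The sketch below works on a plain list of column names and assumes the target column is the Persian word for "price" (قیمت) and that feature columns end in the Persian suffix for "normal" (_نرمال). `split_columns` is an illustrative name, not a function from the repository.

```python
# Target column: Persian for "price".
TARGET_COLUMN = "قیمت"
# Suffix shared by the normalized feature columns: Persian for "normal".
FEATURE_SUFFIX = "_نرمال"


def split_columns(columns):
    """Separate column names into feature columns and columns that will be ignored."""
    if TARGET_COLUMN not in columns:
        raise ValueError(f"Missing target column: {TARGET_COLUMN}")
    features = [c for c in columns if c.endswith(FEATURE_SUFFIX)]
    if not features:
        raise ValueError(f"No feature columns ending in {FEATURE_SUFFIX}")
    ignored = [c for c in columns if c != TARGET_COLUMN and c not in features]
    return features, ignored


features, ignored = split_columns(["قیمت", "کیلومتر_نرمال", "سال_نرمال", "id"])
print(len(features), ignored)  # 2 ['id']
```

With pandas, the column list could come from `list(pandas.read_excel("data.xlsx").columns)`.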

Example Features

  • کیلومتر_نرمال - Normalized mileage
  • سال_نرمال - Normalized year
  • رنگ_نرمال - Normalized color encoding
  • And more...
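
The README states that features lie in the 0-1 range but does not say how they were normalized. Assuming plain min-max scaling, which is an assumption and not confirmed by the source, a raw value could be converted as sketched below. The ranges used are hypothetical.

```python
def min_max_normalize(value, minimum, maximum):
    """Scale value into [0, 1] relative to a known minimum and maximum."""
    if maximum == minimum:
        raise ValueError("minimum and maximum must differ")
    scaled = (value - minimum) / (maximum - minimum)
    # Clip so values outside the known range still land in [0, 1].
    return min(1.0, max(0.0, scaled))


# Hypothetical ranges, for illustration only.
print(min_max_normalize(60_000, 0, 200_000))  # 0.3
print(min_max_normalize(2018, 1990, 2025))    # 0.8
```

Whatever method was used for the training data, new inputs need the same minimum and maximum values, otherwise the model sees features on a different scale than it was trained on.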

📈 Output Examples

Model Comparison Chart

The system generates comparison charts showing:

  • NewMetric scores for all models
  • MAPE (Mean Absolute Percentage Error)
  • Actual vs Predicted scatter plot
  • Error distribution histogram

Sample Predictions

✅ Actual: 1,200,000,000 | Predicted: 1,180,000,000 | Error: 1.7%
✅ Actual:   850,000,000 | Predicted:   870,000,000 | Error: 2.4%
⚠️ Actual:   500,000,000 | Predicted:   450,000,000 | Error: 10.0%
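
The error column is the absolute difference divided by the actual price. The short sketch below reproduces the three percentages; `percent_error` is an illustrative name, and the rule that decides between the check mark and the warning sign is not stated in this README, so it is left out.

```python
samples = [
    (1_200_000_000, 1_180_000_000),
    (850_000_000, 870_000_000),
    (500_000_000, 450_000_000),
]


def percent_error(actual, predicted):
    """Absolute error as a percentage of the actual price."""
    return abs(actual - predicted) / actual * 100


for actual, predicted in samples:
    error = percent_error(actual, predicted)
    print(f"Actual: {actual:>13,} | Predicted: {predicted:>13,} | Error: {error:.1f}%")
```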

🔧 Troubleshooting

Common Issues

| Issue | Solution |
| --- | --- |
| Model file not found | Run car_price_prediction.py first to generate the model |
| Missing features warning | Check that the column names in your data match the features the model expects (option 1 in use_model.py lists them) |
| Memory error | Reduce dataset size or use a machine with more RAM |

Font Issues (Persian Display)

If Persian text doesn't display correctly in charts, set matplotlib to use a font that covers Persian script, such as DejaVu Sans:

import matplotlib.pyplot as plt

plt.rcParams["font.family"] = "DejaVu Sans"

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


👥 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📬 Contact

For questions or support, please open an issue on GitHub.


Made with ❤️ for the Car Industry

⭐ Star this repo if you find it helpful!
