# Prediction of New Car Prices in 2025 Based on Technical Specifications

# Model Inference with XGBoost

---

**Name:** Hafiz Alfariz

**Dataset:** This project uses the "Cars Datasets 2025" from Kaggle, which contains technical specifications and prices for various new car models released in 2025.
https://www.kaggle.com/datasets/abdulmalik1518/cars-datasets-2025

---

# 1. Import Libraries

In [1]:
import pandas as pd
import numpy as np
import cloudpickle

# 2. Load pipeline and model

In [2]:
# Load pipeline and model
with open("preprocessing_pipeline.pkl", "rb") as f:
    preprocessing_pipeline = cloudpickle.load(f)
with open("best_xgb_rand.pkl", "rb") as f:
    model = cloudpickle.load(f)

print("Model and pipeline successfully loaded.")

Model and pipeline successfully loaded.
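
If either `.pkl` file is missing, the bare `open` call above fails with a traceback that does not say which artifact was expected. A minimal sketch of a loader that fails early with a clearer message is shown below. The helper name `load_artifact` is hypothetical, and the sketch uses the standard-library `pickle` module only to stay self-contained; in this notebook the call would be swapped for `cloudpickle.load`, which is assumed to be used the same way for reading.

```python
import pickle
import tempfile
from pathlib import Path


def load_artifact(path):
    """Load a pickled object, failing early with a clear message if absent.

    Hypothetical helper; not part of the original notebook.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Artifact not found: {path.resolve()}")
    with path.open("rb") as f:
        return pickle.load(f)


# Round-trip demo with a throwaway object in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.pkl"
    with demo_path.open("wb") as f:
        pickle.dump({"ok": True}, f)
    loaded = load_artifact(demo_path)

print(loaded)
```

Only load pickle files from sources you trust, since unpickling can execute arbitrary code.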


# 3. Prediction Function

In [3]:
# Prediction function
def predict_price(df_input):
    """Return predicted prices for a DataFrame of car specifications."""
    df_input = df_input.copy()  # avoid mutating the caller's DataFrame
    df_input['cars_prices'] = 0  # dummy target column to prevent pipeline errors

    # Replace empty strings in the numeric columns with NaN so they count as missing
    numeric_cols = ['horsepower', 'total_speed', 'performance0__100_km_h',
                    'seats', 'torque', 'engine_cc', 'battery_capacity_kwh']
    for col in numeric_cols:
        df_input[col] = df_input[col].replace('', np.nan)

    # Apply the fitted preprocessing pipeline, then predict
    X_new = preprocessing_pipeline.transform(df_input)
    pred_log = model.predict(X_new)

    # The model output is treated as log1p(price); expm1 maps it back to a price
    pred_price = np.expm1(pred_log)
    df_input['predicted_price'] = pred_price
    return df_input[['company_names', 'cars_names', 'predicted_price']]
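
The `np.expm1` call in `predict_price` suggests that the model was trained on `log1p(price)` rather than on the raw price, although the training code is not shown here, so this is an assumption. The sketch below uses made-up prices to show that `np.log1p` and `np.expm1` are inverses of each other, which is why the prediction can be mapped back to a price.

```python
import numpy as np

# Hypothetical prices in USD, used only to illustrate the transform
prices = np.array([26_500.0, 55_800.0, 110_400.0])

# Forward transform a training script might apply to the target (assumption)
log_prices = np.log1p(prices)

# Inverse transform, matching the np.expm1 call in predict_price
recovered = np.expm1(log_prices)

# True when the round trip reproduces the original prices
print(np.allclose(recovered, prices))
```

Training on a log-scaled target is a common choice for prices because it compresses the wide gap between economy cars and supercars.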


# 4. New Input for Prediction

In [4]:
# New data for prediction
new_input = pd.DataFrame([
    {
        'company_names': 'BYD',
        'cars_names': 'Seal EV',
        'engines': 'Electric Motor',
        'horsepower': '530',
        'total_speed': '210',
        'performance0__100_km_h': '3.8',
        'fuel_types': 'Electric',
        'seats': '5',
        'torque': '670',
        'engine_cc': '',  # Electric, there is no engine cc
        'battery_capacity_kwh': '82'
    },
    {
        'company_names': 'Honda',
        'cars_names': 'CR-V Hybrid',
        'engines': '2.0L Hybrid',
        'horsepower': '184',
        'total_speed': '190',
        'performance0__100_km_h': '8.6',
        'fuel_types': 'Hybrid',
        'seats': '5',
        'torque': '315',
        'engine_cc': '1993',
        'battery_capacity_kwh': '1.4'
    },
    {
        'company_names': 'Toyota',
        'cars_names': 'Corolla Hybrid',
        'engines': '1.8L Hybrid',
        'horsepower': '121',
        'total_speed': '180',
        'performance0__100_km_h': '10.9',
        'fuel_types': 'Hybrid',
        'seats': '5',
        'torque': '142',
        'engine_cc': '1798',
        'battery_capacity_kwh': '1.3'
    },
    {
        'company_names': 'BMW',
        'cars_names': 'M340i XDRIVE',
        'engines': 'I6',
        'horsepower': '382',
        'total_speed': '250',
        'performance0__100_km_h': '4.1',
        'fuel_types': 'Petrol',
        'seats': '5',
        'torque': '500',
        'engine_cc': '2998',
        'battery_capacity_kwh': ''
    },
    {
        'company_names': 'Nissan',
        'cars_names': 'GT-R',
        'engines': 'V6',
        'horsepower': '600',
        'total_speed': '315',
        'performance0__100_km_h': '2.9',
        'fuel_types': 'Petrol',
        'seats': '4',
        'torque': '637',
        'engine_cc': '3799',
        'battery_capacity_kwh': ''
    },
    {
        'company_names': 'Hyundai',
        'cars_names': 'IONIQ 5',
        'engines': 'Electric Motor',
        'horsepower': '225',
        'total_speed': '185',
        'performance0__100_km_h': '7.4',
        'fuel_types': 'Electric',
        'seats': '5',
        'torque': '350',
        'engine_cc': '',
        'battery_capacity_kwh': '72.6'
    },
    {
        'company_names': 'Mazda',
        'cars_names': 'CX-5',
        'engines': '2.5L SkyActiv-G',
        'horsepower': '187',
        'total_speed': '200',
        'performance0__100_km_h': '8.8',
        'fuel_types': 'Petrol',
        'seats': '5',
        'torque': '252',
        'engine_cc': '2500',
        'battery_capacity_kwh': ''
    },
    {
        'company_names': 'FERRARI',
        'cars_names': 'F8 TRIBUTO',
        'engines': 'V8',
        'horsepower': '710',
        'total_speed': '340',
        'performance0__100_km_h': '2.9',
        'fuel_types': 'Petrol',
        'seats': '2',
        'torque': '770',
        'engine_cc': '3900',
        'battery_capacity_kwh': ''
    },
    {
        'company_names': 'TOYOTA',
        'cars_names': 'PRIUS',
        'engines': 'I4',
        'horsepower': '121',
        'total_speed': '180',
        'performance0__100_km_h': '10.5',
        'fuel_types': 'Hybrid',
        'seats': '5',
        'torque': '142',
        'engine_cc': '1798',
        'battery_capacity_kwh': ''
    },
    {
        'company_names': 'AUDI',
        'cars_names': 'RS7 SPORTBACK',
        'engines': 'V8',
        'horsepower': '591',
        'total_speed': '305',
        'performance0__100_km_h': '3.6',
        'fuel_types': 'Petrol',
        'seats': '5',
        'torque': '800',
        'engine_cc': '3993',
        'battery_capacity_kwh': ''
    },
    {
        'company_names': 'TESLA',
        'cars_names': 'Model S Plaid',
        'engines': 'Tri Electric Motors',
        'horsepower': '1020',
        'total_speed': '322',
        'performance0__100_km_h': '2.1',
        'fuel_types': 'Electric',
        'seats': '5',
        'torque': '1400',
        'engine_cc': '',
        'battery_capacity_kwh': '100'
    },
    {
        'company_names': 'HONDA',
        'cars_names': 'CIVIC TYPE R',
        'engines': 'I4',
        'horsepower': '315',
        'total_speed': '272',
        'performance0__100_km_h': '5.7',
        'fuel_types': 'Petrol',
        'seats': '5',
        'torque': '400',
        'engine_cc': '1996',
        'battery_capacity_kwh': ''
    }
])
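
All numeric fields above are entered as strings, with `''` marking values that do not apply. Whether the saved pipeline expects strings or numbers is not shown in this notebook, so the following is only a sketch of one alternative: converting those fields to numeric dtypes up front with `pd.to_numeric`. The helper name `coerce_numeric` and the one-row `sample` frame are hypothetical.

```python
import numpy as np
import pandas as pd

NUMERIC_COLS = ['horsepower', 'total_speed', 'performance0__100_km_h',
                'seats', 'torque', 'engine_cc', 'battery_capacity_kwh']


def coerce_numeric(df, cols=NUMERIC_COLS):
    """Return a copy of df with the given columns converted to numeric dtypes."""
    out = df.copy()
    for col in cols:
        # errors='coerce' turns '' and any unparseable text into NaN
        out[col] = pd.to_numeric(out[col], errors='coerce')
    return out


# One illustrative row in the same string-based format as the input above
sample = pd.DataFrame([{
    'horsepower': '530', 'total_speed': '210', 'performance0__100_km_h': '3.8',
    'seats': '5', 'torque': '670', 'engine_cc': '', 'battery_capacity_kwh': '82',
}])

clean = coerce_numeric(sample)
print(clean.dtypes)
```

Coercing early also surfaces typos such as `'21O'` as NaN instead of passing them silently into the pipeline.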



# 5. Predict Prices for the New Cars

In [5]:
# Predict prices for the new cars
result_new = predict_price(new_input)

# 6. Display Results

In [6]:
# Display results as a complete table
result_display = new_input.copy()
result_display['predicted_price_$'] = result_new['predicted_price'].apply(lambda x: f"${x:,.0f}")

print("\nTable of predicted car prices (with input):")
print(result_display.to_string(justify='right'))



Table of predicted car prices (with input):
   company_names      cars_names              engines horsepower total_speed performance0__100_km_h fuel_types seats torque engine_cc battery_capacity_kwh predicted_price_$
0            BYD         Seal EV       Electric Motor        530         210                    3.8   Electric     5    670                             82          $110,431
1          Honda     CR-V Hybrid          2.0L Hybrid        184         190                    8.6     Hybrid     5    315      1993                  1.4           $37,844
2         Toyota  Corolla Hybrid          1.8L Hybrid        121         180                   10.9     Hybrid     5    142      1798                  1.3           $26,520
3            BMW    M340i XDRIVE                   I6        382         250                    4.1     Petrol     5    500      2998                                $55,816
4         Nissan            GT-R                   V6        600         315              
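
The display cell relies on two behaviours: assigning a Series to a DataFrame column aligns values by index label rather than by position, and the format spec `,.0f` rounds to whole dollars with thousands separators. The sketch below demonstrates both with made-up values; the names `inputs`, `preds`, and `table` are hypothetical.

```python
import pandas as pd

# Hypothetical inputs with the default 0..n-1 index
inputs = pd.DataFrame({'cars_names': ['Car A', 'Car B']})

# Hypothetical predictions, deliberately stored in reverse row order
preds = pd.Series([98765.4, 12345.6], index=[1, 0])

table = inputs.copy()
# Assignment aligns on index labels, so each price lands on the matching row
table['predicted_price_$'] = preds.apply(lambda x: f"${x:,.0f}")

print(table.to_string())
```

Because alignment is by label, the approach in the display cell stays correct as long as `predict_price` preserves the index of its input.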

# Model Inference Conclusion

* The trained and tuned XGBoost model produces price predictions for new 2025 cars from technical specification inputs.

* The preprocessing pipeline handles mixed data types and missing values in the input features, such as the empty `engine_cc` for electric cars and the empty `battery_capacity_kwh` for petrol cars.

* Predicted prices are presented in a table that shows each car's specifications alongside its predicted price in USD.

* This inference workflow can be used to estimate new car prices automatically, which can support decisions by consumers, dealers, and manufacturers.

* Overall, the system is ready to predict new car prices for inputs that follow the specification format used above.