
Used Car Price Prediction

1. Preparing the Dataset (Starting with Sample Data)
First, let's create a small sample dataset (you can replace it with real data later):

In [None]:
import pandas as pd
from io import StringIO

# Sample dataset (5 cars)
sample_data = """
name,year,km_driven,fuel,seller_type,transmission,owner,selling_price
Hyundai Creta 1.6 CRDi,2015,41000,Diesel,Individual,Manual,First Owner,650000
Honda City i-VTEC,2017,28000,Petrol,Individual,Automatic,First Owner,850000
Maruti Swift VDI,2014,80000,Diesel,Individual,Manual,Second Owner,300000
Toyota Innova Crysta,2019,35000,Diesel,Dealer,Automatic,First Owner,1200000
Kia Seltos HTK,2020,15000,Petrol,Individual,Manual,First Owner,950000
"""

df = pd.read_csv(StringIO(sample_data))
print("Dataset preview:")
print(df.head())

Dataset preview:
                     name  year  km_driven    fuel seller_type transmission  \
0  Hyundai Creta 1.6 CRDi  2015      41000  Diesel  Individual       Manual   
1       Honda City i-VTEC  2017      28000  Petrol  Individual    Automatic   
2        Maruti Swift VDI  2014      80000  Diesel  Individual       Manual   
3    Toyota Innova Crysta  2019      35000  Diesel      Dealer    Automatic   
4          Kia Seltos HTK  2020      15000  Petrol  Individual       Manual   

          owner  selling_price  
0   First Owner         650000  
1   First Owner         850000  
2  Second Owner         300000  
3   First Owner        1200000  
4   First Owner         950000  

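When the sample is later swapped for real data, the same DataFrame can be read from a CSV file instead of an inline string. The following is a minimal sketch, assuming the file carries the same eight columns as the sample; the helper `load_car_data` and the file name `cars_demo.csv` are hypothetical and exist only for illustration.

```python
import os
import tempfile

import pandas as pd

# Columns the rest of the notebook relies on (taken from the sample data above).
REQUIRED_COLUMNS = [
    "name", "year", "km_driven", "fuel",
    "seller_type", "transmission", "owner", "selling_price",
]


def load_car_data(path):
    """Hypothetical helper: read a CSV file and check the expected columns."""
    data = pd.read_csv(path)
    missing = [c for c in REQUIRED_COLUMNS if c not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return data[REQUIRED_COLUMNS]


# Demonstration with a throwaway file; replace the path with a real dataset.
demo_path = os.path.join(tempfile.mkdtemp(), "cars_demo.csv")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write(
        "name,year,km_driven,fuel,seller_type,transmission,owner,selling_price\n"
        "Hyundai Creta 1.6 CRDi,2015,41000,Diesel,Individual,Manual,First Owner,650000\n"
    )

real_df = load_car_data(demo_path)
print(real_df.shape)
```

Checking the column names up front makes a mismatch between the real file and the sample layout fail early, before the encoding step.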

2. Data Preprocessing
A. Converting Categorical Variables (One-Hot Encoding)

In [None]:
# Convert categorical columns to numeric indicator columns
df = pd.get_dummies(df, columns=['fuel', 'seller_type', 'transmission', 'owner'])
print("\nAfter One-Hot Encoding:")
print(df.head())


After One-Hot Encoding:
                     name  year  km_driven  selling_price  fuel_Diesel  \
0  Hyundai Creta 1.6 CRDi  2015      41000         650000         True   
1       Honda City i-VTEC  2017      28000         850000        False   
2        Maruti Swift VDI  2014      80000         300000         True   
3    Toyota Innova Crysta  2019      35000        1200000         True   
4          Kia Seltos HTK  2020      15000         950000        False   

   fuel_Petrol  seller_type_Dealer  seller_type_Individual  \
0        False               False                    True   
1         True               False                    True   
2        False               False                    True   
3        False                True                   False   
4         True               False                    True   

   transmission_Automatic  transmission_Manual  owner_First Owner  \
0                   False                 True               True   
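1                    True                False               True   
2                   False                 True              False   
3                    True                False               True   
4                   False                 True               True   

   owner_Second Owner  
0               False  
1               False  
2                True  
3               False  
4               False  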

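To see what `pd.get_dummies` does in isolation, here is a small sketch on a toy column (the `toy` frame is made up for this illustration). It also shows the `dtype=int` option, which yields 0/1 values instead of the True/False values seen in the output above.

```python
import pandas as pd

# Toy frame with a single categorical column (illustrative data only).
toy = pd.DataFrame({"fuel": ["Diesel", "Petrol", "Diesel"]})

# Default call: one indicator column per category, named <column>_<value>.
encoded = pd.get_dummies(toy, columns=["fuel"])
print(list(encoded.columns))

# dtype=int produces 0/1 integers instead of booleans.
encoded_int = pd.get_dummies(toy, columns=["fuel"], dtype=int)
print(encoded_int)
```
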
B. Dropping Unnecessary Columns


In [None]:
# The 'name' column is not needed by the model
X = df.drop(['selling_price', 'name'], axis=1)
y = df['selling_price']
print("\nFeature matrix (X):")
print(X.head())


Feature matrix (X):
   year  km_driven  fuel_Diesel  fuel_Petrol  seller_type_Dealer  \
0  2015      41000         True        False               False   
1  2017      28000        False         True               False   
2  2014      80000         True        False               False   
3  2019      35000         True        False                True   
4  2020      15000        False         True               False   

   seller_type_Individual  transmission_Automatic  transmission_Manual  \
0                    True                   False                 True   
1                    True                    True                False   
2                    True                   False                 True   
3                   False                    True                False   
4                    True                   False                 True   

   owner_First Owner  owner_Second Owner  
0               True               False  
1               True               False  
2              False                True  
3               True               False  
4               True               False  

3. Train/Test Split


In [None]:
from sklearn.model_selection import train_test_split

# 80% training, 20% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print("\nTraining data shape:", X_train.shape)
print("Test data shape:", X_test.shape)


Training data shape: (4, 10)
Test data shape: (1, 10)

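The shapes above follow from how `train_test_split` turns a fractional `test_size` into a row count: the test share is rounded up, and the remaining rows go to training. A short sketch of that arithmetic for 5 rows (the `rows` array is a stand-in for the real feature matrix):

```python
import math

import numpy as np
from sklearn.model_selection import train_test_split

n_samples = 5
test_size = 0.2

# The test share is rounded up: ceil(0.2 * 5) = 1 test row.
expected_test = math.ceil(test_size * n_samples)

# Stand-in data: five rows with one column each.
rows = np.arange(n_samples).reshape(-1, 1)
train_rows, test_rows = train_test_split(rows, test_size=test_size, random_state=42)
print(len(train_rows), len(test_rows), expected_test)
```

With a single test row, any test metric is computed from one prediction only, which matters for the scores reported in the next steps.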

4. Building the Model (RandomForestRegressor)


In [None]:
from sklearn.ensemble import RandomForestRegressor

# A simple model
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Test set performance
from sklearn.metrics import mean_absolute_error, r2_score

y_pred = model.predict(X_test)
print("\nModel Performance:")
print("MAE:", mean_absolute_error(y_test, y_pred))
# R2 is undefined for a single test sample, so r2_score returns nan here
print("R2:", r2_score(y_test, y_pred))


Model Performance:
MAE: 39500.0
R2: nan

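R2 compares the squared prediction error with the spread of the true values around their mean; with one test row that spread is zero, which is why `nan` is printed. One possible alternative for a dataset this small is leave-one-out cross-validation scored with MAE, so that every car serves as the test row once. The sketch below is an assumption about how that could look, not part of the original workflow; it uses only the two numeric columns of the sample so that it runs on its own.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Numeric columns of the five sample cars, repeated so the block is self-contained.
cars = pd.DataFrame({
    "year": [2015, 2017, 2014, 2019, 2020],
    "km_driven": [41000, 28000, 80000, 35000, 15000],
    "selling_price": [650000, 850000, 300000, 1200000, 950000],
})
features = cars[["year", "km_driven"]]
target = cars["selling_price"]

# Each of the 5 rows is held out once; MAE is defined even for a single row.
scores = cross_val_score(
    RandomForestRegressor(n_estimators=100, random_state=42),
    features,
    target,
    cv=LeaveOneOut(),
    scoring="neg_mean_absolute_error",
)
mae_per_fold = -scores  # scikit-learn reports MAE negated, so flip the sign
print("MAE per held-out car:", mae_per_fold)
print("Mean MAE:", mae_per_fold.mean())
```
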



5. Hyperparameter Optimization (GridSearchCV)


In [None]:
from sklearn.model_selection import GridSearchCV

# Parameter grid
param_grid = {
    'n_estimators': [50, 100],
    'max_depth': [None, 5, 10],
    'min_samples_split': [2, 5]
}

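# GridSearchCV (cv=3 because the dataset is small).
# Note: with only 4 training rows, some folds hold a single sample, where R2 is undefined (nan).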
grid_search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring='r2'
)
grid_search.fit(X_train, y_train)

# Best parameters
print("\nBest parameters:", grid_search.best_params_)
print("Best R2 score:", grid_search.best_score_)

# Optimized model
best_model = grid_search.best_estimator_

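Because 3-fold splitting of 4 rows produces validation folds with a single row, an R2-based search may yield `nan` scores. A hedged variant is to score the grid with negated MAE and inspect `cv_results_` to compare the candidates. The sketch below assumes a reduced grid and the two numeric sample columns so that it stays quick and self-contained; it is an illustration, not the search used above.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Two numeric columns of the five sample cars (self-contained stand-in for X, y).
features = pd.DataFrame({
    "year": [2015, 2017, 2014, 2019, 2020],
    "km_driven": [41000, 28000, 80000, 35000, 15000],
})
target = pd.Series([650000, 850000, 300000, 1200000, 950000])

# Reduced grid (assumption for speed): 2 x 2 = 4 candidates.
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    {"n_estimators": [10, 20], "max_depth": [None, 2]},
    cv=3,
    scoring="neg_mean_absolute_error",  # defined even for one-row folds
)
search.fit(features, target)

# One row per candidate, ordered by rank (rank 1 is the best).
results = pd.DataFrame(search.cv_results_)[
    ["params", "mean_test_score", "rank_test_score"]
]
print(results.sort_values("rank_test_score"))
print("Best cross-validated MAE:", -search.best_score_)
```
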
6. Model Evaluation


In [None]:
# Test performance of the optimized model
# (with one test row, the MAE is a single absolute error and R2 stays nan)
y_pred_best = best_model.predict(X_test)
print("\nOptimized Model Performance:")
print("MAE:", mean_absolute_error(y_test, y_pred_best))
print("R2:", r2_score(y_test, y_pred_best))


Optimized Model Performance:
MAE: 91000.0
R2: nan




7. Making a New Prediction


In [None]:
# Features of the new car
new_car = {
    'year': 2018,
    'km_driven': 45000,
    'fuel_Diesel': 1,
    'fuel_Petrol': 0,
    'seller_type_Individual': 1,
    'seller_type_Dealer': 0,
    'transmission_Manual': 1,
    'transmission_Automatic': 0,
    'owner_First Owner': 1,
    'owner_Second Owner': 0
}

# Convert to a DataFrame
new_car_df = pd.DataFrame([new_car])

# Fill in any missing columns (all columns the model was trained on)
for col in X_train.columns:
    if col not in new_car_df.columns:
        new_car_df[col] = 0

# Prediction
predicted_price = best_model.predict(new_car_df[X_train.columns])
print("\nPredicted Price:", predicted_price[0])


Predicted Price: 842000.0

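The column-filling loop can also be written with `DataFrame.reindex`, which adds missing columns and enforces the training order in one call. A minimal sketch, assuming the column list below matches the feature matrix printed earlier; `train_columns` and `partial_car` are illustrative names.

```python
import pandas as pd

# Column order the model was trained on (copied from the feature matrix above).
train_columns = [
    "year", "km_driven", "fuel_Diesel", "fuel_Petrol",
    "seller_type_Dealer", "seller_type_Individual",
    "transmission_Automatic", "transmission_Manual",
    "owner_First Owner", "owner_Second Owner",
]

# Only the indicators that are "on" need to be listed.
partial_car = {
    "year": 2018,
    "km_driven": 45000,
    "fuel_Diesel": 1,
    "seller_type_Individual": 1,
    "transmission_Manual": 1,
    "owner_First Owner": 1,
}

# reindex adds the missing columns as 0 and puts everything in training order.
aligned = pd.DataFrame([partial_car]).reindex(columns=train_columns, fill_value=0)
print(aligned.iloc[0].tolist())
```

Note that `reindex` also silently drops any column that is not in `train_columns`, so a misspelled feature name would be discarded rather than raise an error.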

8. Saving the Model (Optional)


In [None]:
import joblib

# Save the model
joblib.dump(best_model, 'arac_fiyat_tahmini_modeli.pkl')

# Load the saved model
loaded_model = joblib.load('arac_fiyat_tahmini_modeli.pkl')
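
Since predictions depend on the column order used during training, it can help to store that order next to the model. The sketch below is one way to do this, under the assumption that a plain dictionary is an acceptable container; `demo_model`, `bundle`, and `demo_bundle.pkl` are hypothetical names, and a small stand-in model is trained so the block runs on its own.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Stand-in training data: two numeric columns of the five sample cars.
features = pd.DataFrame({
    "year": [2015, 2017, 2014, 2019, 2020],
    "km_driven": [41000, 28000, 80000, 35000, 15000],
})
target = [650000, 850000, 300000, 1200000, 950000]

demo_model = RandomForestRegressor(n_estimators=10, random_state=42)
demo_model.fit(features, target)

# Bundle the model with the column order it expects.
bundle = {"model": demo_model, "columns": list(features.columns)}
bundle_path = os.path.join(tempfile.mkdtemp(), "demo_bundle.pkl")
joblib.dump(bundle, bundle_path)

# Reload and confirm the restored model predicts exactly as before.
restored = joblib.load(bundle_path)
same = (restored["model"].predict(features) == demo_model.predict(features)).all()
print("Columns:", restored["columns"], "| identical predictions:", same)
```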