## Credit Card Fraud Detection - Model Training
### This notebook walks through training two models, XGBoost and logistic regression, for credit card fraud detection.

In [9]:
import os
import pickle
import numpy as np
import pandas as pd
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, classification_report, roc_auc_score
)
import matplotlib.pyplot as plt
import seaborn as sns

# Import project modules
import sys
sys.path.append('..')
from src.utils.logger import setup_logger
from src.constants import TARGET, MODELS_DIR, RANDOM_STATE
from src.models.xgboost import train_xgboost
from src.models.logistic_regression import train_logistic_regression
from src.preprocessing.pipeline import FraudPreprocessor
from src.config import XGBOOST_PARAMS, LOGISTIC_REGRESSION_PARAMS

In [10]:
logger = setup_logger("model_training")

# Load the data
processed_dir = '../data/processed/'
balanced_train = pd.read_csv(f'{processed_dir}balanced_train.csv')
processed_test = pd.read_csv(f'{processed_dir}processed_test.csv')

print(f"Training set shape: {balanced_train.shape}")
print(f"Test set shape: {processed_test.shape}")

Training set shape: (68235, 41)
Test set shape: (56962, 41)
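
The training file is named `balanced_train.csv`, while the test file carries no such label, so it can be worth confirming the class distribution of each set before training. A minimal sketch on toy data, assuming the label column is called `Class` (a hypothetical name; the notebook takes the real one from `TARGET` in `src.constants`). `class_distribution` is an illustrative helper, not part of the project:

```python
import pandas as pd

def class_distribution(df: pd.DataFrame, target: str) -> pd.DataFrame:
    """Return per-class row counts and fractions for the target column."""
    counts = df[target].value_counts().sort_index()
    return pd.DataFrame({"count": counts, "fraction": counts / len(df)})

# Toy frame standing in for balanced_train; "Class" is a hypothetical column name.
toy = pd.DataFrame({"amount": [10.0, 250.0, 12.5, 999.0], "Class": [0, 1, 0, 1]})
summary = class_distribution(toy, "Class")
print(summary)
```

In the notebook itself the same helper could be called as `class_distribution(balanced_train, TARGET)` and `class_distribution(processed_test, TARGET)`.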


In [11]:
with open('../models/preprocessing_pipeline.pkl', 'rb') as f:
    preprocessor = pickle.load(f)

X_train, y_train = preprocessor.get_features_and_target(balanced_train)
X_test, y_test = preprocessor.get_features_and_target(processed_test)

logger.info("Data loaded and preprocessed successfully")

2025-05-09 00:30:15,894 | INFO | pipeline.py:110 | Splitting features and target
2025-05-09 00:30:15,904 | INFO | pipeline.py:110 | Splitting features and target
2025-05-09 00:30:15,913 | INFO | 2841981692.py:7 | Data loaded and preprocessed successfully


In [12]:
logger.info("Starting XGBoost model training...")
xgb_model = train_xgboost(X_train, y_train, XGBOOST_PARAMS)

# Test-set predictions
xgb_probs = xgb_model.predict(X_test)  # assumes predict() returns probabilities, not class labels
xgb_preds = (xgb_probs >= 0.5).astype(int)

logger.info("XGBoost training finished")

2025-05-09 00:30:15,922 | INFO | 317311089.py:1 | Starting XGBoost model training...
2025-05-09 00:30:15,924 | INFO | base_model.py:29 | Initialized xgboost model
2025-05-09 00:30:15,925 | INFO | xgboost.py:31 | Training XGBoost model with 68235 samples


Parameters: { "n_estimators" } might not be used.

  This could be a false alarm, with some parameters getting used by language bindings but
  then being mistakenly passed down to XGBoost core, or some parameter actually being used
  but getting flagged wrongly here. Please open an issue if you find any such cases.


2025-05-09 00:30:41,431 | INFO | xgboost.py:60 | XGBoost model training completed
2025-05-09 00:30:41,435 | INFO | xgboost.py:68 | Making predictions on 56962 samples
2025-05-09 00:30:41,653 | INFO | 317311089.py:8 | XGBoost training finished
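
The `Parameters: { "n_estimators" } might not be used` warning in the output above is typical when a scikit-learn style `n_estimators` key is passed inside the parameter dict to XGBoost's native training API, where the number of trees is set by the `num_boost_round` argument of `xgb.train` instead. Whether `train_xgboost` works this way is an assumption, since its source is not shown here. Under that assumption, a sketch of a hypothetical helper (`split_boosting_rounds` is not part of the project, and its default of 100 rounds is arbitrary) that separates the two:

```python
from typing import Any, Dict, Tuple

def split_boosting_rounds(
    params: Dict[str, Any], default_rounds: int = 100
) -> Tuple[Dict[str, Any], int]:
    """Pull n_estimators out of params so it can be passed as num_boost_round."""
    booster_params = dict(params)  # copy so the caller's dict is untouched
    rounds = int(booster_params.pop("n_estimators", default_rounds))
    return booster_params, rounds

# Hypothetical parameter values, for illustration only.
example_params = {"objective": "binary:logistic", "max_depth": 6, "n_estimators": 300}
booster_params, num_boost_round = split_boosting_rounds(example_params)
print(booster_params, num_boost_round)

# With the native API this would then be used roughly as:
#   dtrain = xgb.DMatrix(X_train, label=y_train)
#   booster = xgb.train(booster_params, dtrain, num_boost_round=num_boost_round)
```

If the project instead wraps `xgboost.XGBClassifier`, `n_estimators` is a valid constructor argument and no such split is needed.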


In [13]:
logger.info("Starting Logistic Regression model training...")
lr_model = train_logistic_regression(X_train, y_train, LOGISTIC_REGRESSION_PARAMS)

# Test-set predictions
lr_probs = lr_model.predict_proba(X_test)[:, 1]
lr_preds = (lr_probs >= 0.5).astype(int)

logger.info("Logistic Regression training finished")

2025-05-09 00:30:41,662 | INFO | 1604418239.py:1 | Starting Logistic Regression model training...
2025-05-09 00:30:41,664 | INFO | base_model.py:29 | Initialized logistic_regression model
2025-05-09 00:30:41,665 | INFO | logistic_regression.py:36 | Training Logistic Regression model with 68235 samples
2025-05-09 00:30:42,374 | INFO | logistic_regression.py:39 | Logistic Regression model training completed
2025-05-09 00:30:42,375 | INFO | logistic_regression.py:55 | Making probability predictions on 56962 samples
2025-05-09 00:30:42,389 | INFO | 1604418239.py:8 | Logistic Regression training finished
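
Both models are cut at a fixed 0.5 probability threshold. If the test set keeps its natural class ratio (which the file names suggest but this notebook does not show), that cut-off is not necessarily the best trade-off between precision and recall, so it can help to compare a few alternatives. A minimal NumPy sketch on made-up labels and scores, not the notebook's results. `precision_recall_at` is a hypothetical helper; `precision_score` and `recall_score`, imported at the top of the notebook, would be the usual choice in practice:

```python
import numpy as np

def precision_recall_at(y_true, probs, threshold):
    """Precision and recall of the positive class at a given probability threshold."""
    y_true = np.asarray(y_true)
    preds = (np.asarray(probs) >= threshold).astype(int)
    tp = int(np.sum((preds == 1) & (y_true == 1)))  # true positives
    fp = int(np.sum((preds == 1) & (y_true == 0)))  # false positives
    fn = int(np.sum((preds == 0) & (y_true == 1)))  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up labels and scores, for illustration only.
y_demo = [0, 0, 0, 1, 1, 0, 1, 0]
p_demo = [0.05, 0.40, 0.55, 0.45, 0.90, 0.10, 0.70, 0.20]

for t in (0.3, 0.5, 0.7):
    precision, recall = precision_recall_at(y_demo, p_demo, t)
    print(f"threshold={t:.1f} precision={precision:.2f} recall={recall:.2f}")
```

In the notebook the same call could be made with `y_test` and either `xgb_probs` or `lr_probs`.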


In [14]:
# Cell 6: Save the models and data (corrected)

import joblib  # more efficient than plain pickle for objects holding large NumPy arrays

def save_model(model, model_name):
    path = os.path.join(MODELS_DIR, f'{model_name}.pkl')
    joblib.dump(model, path)  # joblib instead of pickle
    logger.info(f"{model_name} saved: {path}")

save_model(xgb_model, 'xgboost_model')
save_model(lr_model, 'logistic_regression_model')

# Save the test data as DataFrames
joblib.dump(X_test, os.path.join(MODELS_DIR, 'X_test.pkl'))  # feature names are preserved
joblib.dump(y_test, os.path.join(MODELS_DIR, 'y_test.pkl'))
logger.info("Test data saved successfully")

2025-05-09 00:30:42,415 | INFO | 4249473250.py:8 | xgboost_model saved: c:\Users\PC\Desktop\fraud_eye\notebooks\..\models\xgboost_model.pkl
2025-05-09 00:30:42,418 | INFO | 4249473250.py:8 | logistic_regression_model saved: c:\Users\PC\Desktop\fraud_eye\notebooks\..\models\logistic_regression_model.pkl
2025-05-09 00:30:42,431 | INFO | 4249473250.py:16 | Test data saved successfully
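
To confirm that saved artifacts can be read back, a round-trip with `joblib.load` is a quick check. The sketch below uses a temporary directory and a plain dict as a stand-in object, so the file name and contents are illustrative assumptions, not the project's models:

```python
import os
import tempfile

import joblib

# Stand-in for a fitted model or test frame; purely illustrative.
artifact = {"name": "demo_model", "threshold": 0.5}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")
    joblib.dump(artifact, path)   # same call the notebook uses to save
    restored = joblib.load(path)  # read the object back from disk

print(restored == artifact)
```

As with pickle, `joblib.load` should only be used on files from a trusted source.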
