# 🚀 JetX Model Training - Google Colab (v7.0 - 5 Model Ensemble)

**With this notebook you can train all JetX prediction models with the Multi-Scale Window Ensemble system and test the combined Consensus model of the 5-model ensemble.**

## 🆕 NEW IN v7.0:
- 🤖 **Full 5-Model Support**: Progressive NN, CatBoost, AutoGluon, TabNet, Consensus
- 🧠 **AutoGluon AutoML**: Automatically tries 50+ models and picks the best one
- 🎯 **TabNet High-X Specialist**: Detects high multipliers using an attention mechanism
- 📁 **Google Drive Integration**: Models are backed up to Drive automatically
- 📊 **Comprehensive Model Comparison**: Performance metrics for every model
- 💰 **Virtual Bankroll Simulation**: ROI and win rates
- 📦 **Improved Download System**: ZIP download + Drive backup
- 📚 **JSON Outputs**: All results in JSON format

## 📋 MODEL DESCRIPTIONS:

### 1️⃣ Progressive NN (Multi-Scale)
- **Role**: General-purpose prediction
- **Feature**: 5 different window sizes (500, 250, 100, 50, 20)
- **Strength**: Captures patterns at different time scales
- **Target**: 1.5x threshold prediction
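
As a rough illustration of the multi-scale idea, the sketch below slices the most recent history into the five window sizes listed above. The helper name `multi_scale_windows` is hypothetical and not taken from the repository; the actual Progressive NN pipeline may build its inputs differently.

```python
from typing import Dict, List, Sequence

# Window sizes listed in the model description above.
WINDOW_SIZES = (500, 250, 100, 50, 20)


def multi_scale_windows(history: Sequence[float],
                        sizes: Sequence[int] = WINDOW_SIZES) -> Dict[int, List[float]]:
    """Return the last `size` values of `history` for every window size.

    Hypothetical helper: raises ValueError when the history is shorter
    than the largest requested window.
    """
    longest = max(sizes)
    if len(history) < longest:
        raise ValueError(f"need at least {longest} values, got {len(history)}")
    return {size: list(history[-size:]) for size in sizes}


# Usage with synthetic multipliers (not real JetX data).
history = [1.0 + (i % 7) * 0.5 for i in range(600)]
windows = multi_scale_windows(history)
for size, window in windows.items():
    print(size, len(window))
```

Each window ends at the same most recent value, so a model fed these slices sees the same moment at several time scales.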

### 2️⃣ CatBoost Ensemble
- **Role**: Gradient boosting specialist
- **Feature**: Multi-scale window support
- **Strength**: Fast and accurate categorical prediction
- **Target**: 1.5x threshold + category prediction

### 3️⃣ AutoGluon AutoML
- **Role**: Automated ML champion
- **Feature**: Automatically tries 50+ models and picks the best one
- **Strength**: Automatic ensembling and stacking
- **Target**: 1.5x threshold prediction

### 4️⃣ TabNet High-X Specialist
- **Role**: High-X detection specialist
- **Feature**: Attention-based deep learning
- **Strength**: Designed to catch rare events such as 10x+ and 50x+
- **Target**: Multi-class (Low/Medium/High/Mega)
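
To make the multi-class target concrete, here is a minimal sketch of mapping a multiplier to one of the four classes. The cut-offs (1.5, 10, 50) are assumptions chosen from the numbers mentioned in this notebook; the real `TabNetHighXPredictor.categorize_value` may use different boundaries.

```python
from bisect import bisect_right

# ASSUMED class boundaries, for illustration only.
BOUNDARIES = (1.5, 10.0, 50.0)
LABELS = ("Low", "Medium", "High", "Mega")


def categorize(value: float) -> int:
    """Return the class index: 0 below 1.5, 1 below 10, 2 below 50, else 3."""
    return bisect_right(BOUNDARIES, value)


# Usage
for multiplier in (1.2, 1.5, 9.99, 10.0, 75.0):
    print(multiplier, LABELS[categorize(multiplier)])
```

Because the upper classes are rare, it may be worth checking the class counts before training.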

### 5️⃣ Consensus Ensemble
- **Role**: Combined prediction from all models
- **Feature**: Weighted voting system
- **Strength**: Intended to give the highest accuracy and reliability
- **Target**: Final prediction
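
A weighted vote can be sketched as a weighted average of each model's probability that the next value is at or above 1.5x. The function name, the weights, and the 0.5 decision threshold below are illustrative assumptions, not the repository's Consensus implementation.

```python
from typing import Dict, Tuple


def weighted_vote(probabilities: Dict[str, float],
                  weights: Dict[str, float],
                  decision_threshold: float = 0.5) -> Tuple[float, bool]:
    """Combine per-model probabilities into one score and a yes/no decision."""
    total_weight = sum(weights[name] for name in probabilities)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    score = sum(probabilities[name] * weights[name] for name in probabilities) / total_weight
    return score, score >= decision_threshold


# Usage with made-up probabilities and weights.
probs = {"progressive_nn": 0.75, "catboost": 0.50, "autogluon": 0.25, "tabnet": 0.50}
weights = {"progressive_nn": 2.0, "catboost": 1.0, "autogluon": 1.0, "tabnet": 1.0}
score, predict_above = weighted_vote(probs, weights)
print(round(score, 3), predict_above)
```

In practice the weights would likely be derived from each model's validation performance rather than set by hand.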

## ⏱️ ESTIMATED TIME:
- Progressive NN: ~10-12 hours (5 models × ~2 hours)
- CatBoost: ~3-4 hours
- AutoGluon: ~1-2 hours (adjustable via time_limit)
- TabNet: ~2-3 hours
- **TOTAL: ~16-21 hours** (with GPU)

## 🎯 TARGETS:
- Accuracy below 1.5x: **75%+**
- Accuracy above 1.5x: **75%+**
- Money-loss risk: **<20%**
- ROI: **Positive**
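
The notebook does not define these targets formally, so the sketch below fixes one plausible reading. Below/above accuracy is measured per actual class. Money-loss risk is assumed to be the share of placed bets where the actual value fell below 1.5x. ROI comes from a flat-stake simulation that cashes out at 1.5x. The function name and all of these definitions are assumptions.

```python
from typing import Dict, Sequence

THRESHOLD = 1.5  # cash-out threshold used throughout the notebook


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely, returning 0.0 when there is nothing to count."""
    return numerator / denominator if denominator else 0.0


def evaluate_threshold_bets(actual: Sequence[float],
                            predicted_above: Sequence[bool],
                            stake: float = 1.0) -> Dict[str, float]:
    """Score predictions; a bet is placed whenever the model predicts 'above'."""
    below_total = below_hits = above_total = above_hits = 0
    bets = lost_bets = 0
    profit = 0.0
    for value, bet in zip(actual, predicted_above):
        if value >= THRESHOLD:
            above_total += 1
            above_hits += int(bet)
        else:
            below_total += 1
            below_hits += int(not bet)
        if bet:
            bets += 1
            if value >= THRESHOLD:
                profit += stake * (THRESHOLD - 1.0)  # win: cash out at 1.5x
            else:
                lost_bets += 1
                profit -= stake                      # loss: stake is gone
    return {
        "below_accuracy": _ratio(below_hits, below_total),
        "above_accuracy": _ratio(above_hits, above_total),
        "money_loss_risk": _ratio(lost_bets, bets),
        "roi": _ratio(profit, bets * stake),
    }


# Usage with toy data.
actual = [1.2, 2.0, 1.6, 1.1, 3.0, 1.4]
predicted = [False, True, True, True, False, False]
metrics = evaluate_threshold_bets(actual, predicted)
print(metrics)
```

Under these assumptions a win earns half the stake and a loss costs all of it, so ROI only turns positive when more than two thirds of placed bets win.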

In [None]:
# 📦 STEP 1: Preparation and Setup
import subprocess
import sys
import os
import json
import time
from datetime import datetime

print("="*80)
print("📦 SETUP - 5 Model Ensemble System v7.0")
print("🆕 NEW: AutoGluon + TabNet FULLY INTEGRATED")
print("="*80)

start_time = time.time()

# Mount Google Drive
print("\n📁 Connecting to Google Drive...")
try:
    from google.colab import drive
    drive.mount('/content/drive')
    print("✅ Google Drive mounted successfully!")
    
    drive_path = '/content/drive/MyDrive/JetX_Models_v7'
    os.makedirs(drive_path, exist_ok=True)
    print(f"✅ Drive folder ready: {drive_path}")
except Exception as e:
    print(f"⚠️ Drive connection error: {e}")
    drive_path = None

# Install libraries
print("\n📦 Installing libraries...")
!pip install -q tensorflow scikit-learn catboost pandas numpy scipy joblib matplotlib seaborn tqdm PyWavelets nolds autogluon pytorch-tabnet torch

# Fetch the project (remove any stale copy first)
if os.path.exists('jetxpredictor'):
    !rm -rf jetxpredictor

print("\n📥 Downloading project...")
!git clone https://github.com/onndd/jetxpredictor.git
%cd jetxpredictor

prep_time = time.time() - start_time
print(f"\n✅ Setup complete! ({prep_time/60:.1f} minutes)")
print("="*80)

## 📊 STEP 2: Data Loading and Preparation
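
For time-series data the split has to keep chronological order, so that validation and test values always come after the training values. The sketch below shows one way such a split could work. `chronological_split` is a hypothetical stand-in, and the repository's `split_data_preserving_order` may differ in its details.

```python
from typing import List, Sequence, Tuple


def chronological_split(values: Sequence[float],
                        train_ratio: float = 0.70,
                        val_ratio: float = 0.15) -> Tuple[List[float], List[float], List[float]]:
    """Split without shuffling: oldest values train, newest values test."""
    if train_ratio <= 0 or val_ratio < 0 or train_ratio + val_ratio >= 1:
        raise ValueError("ratios must leave a non-empty share for the test set")
    n = len(values)
    train_end = int(n * train_ratio)
    val_end = train_end + int(n * val_ratio)
    return list(values[:train_end]), list(values[train_end:val_end]), list(values[val_end:])


# Usage: indices stand in for chronologically ordered multipliers.
values = list(range(1000))
train, val, test = chronological_split(values)
print(len(train), len(val), len(test))
```

Whatever remains after the train and validation shares becomes the test set, which is why only two ratios are passed.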

In [None]:
# Data loading
import numpy as np
import pandas as pd
import sqlite3
from sklearn.preprocessing import StandardScaler
import sys
sys.path.append(os.getcwd())

from category_definitions import CategoryDefinitions, FeatureEngineering
from utils.multi_scale_window import split_data_preserving_order

print("="*80)
print("📊 DATA LOADING")
print("="*80)

# Read every recorded multiplier in insertion order
conn = sqlite3.connect('jetx_data.db')
data = pd.read_sql_query("SELECT value FROM jetx_results ORDER BY id", conn)
conn.close()

all_values = data['value'].values
print(f"✅ {len(all_values):,} records loaded")

# Time-series split
train_data, val_data, test_data = split_data_preserving_order(
    all_values, train_ratio=0.70, val_ratio=0.15
)

print(f"✅ Train: {len(train_data):,}")
print(f"✅ Val:   {len(val_data):,}")
print(f"✅ Test:  {len(test_data):,}")
print("="*80)

## 🧠 STEP 3: Progressive NN Multi-Scale Training

**⏱️ Estimated Time: ~10-12 hours**

In [None]:
# Progressive NN Training
print("="*80)
print("🧠 PROGRESSIVE NN TRAINING")
print("="*80)

!python notebooks/jetx_PROGRESSIVE_TRAINING_MULTISCALE.py

# Load the results written by the training script
with open('models/progressive_multiscale/model_info.json', 'r') as f:
    progressive_results = json.load(f)

print("\n✅ Progressive NN complete!")
print(f"📊 MAE: {progressive_results['ensemble_metrics']['mae']:.4f}")

## 🚀 STEP 4: CatBoost Training

**⏱️ Estimated Time: ~3-4 hours**

In [None]:
# CatBoost Training
print("="*80)
print("🚀 CATBOOST TRAINING")
print("="*80)

!python notebooks/jetx_CATBOOST_TRAINING_MULTISCALE.py

# Load the results written by the training script
with open('models/catboost_multiscale/model_info.json', 'r') as f:
    catboost_results = json.load(f)

print("\n✅ CatBoost complete!")

## 🤖 STEP 5: AutoGluon AutoML Training (NEW!)

**⏱️ Estimated Time: ~1-2 hours**
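
The training cell for this step turns the raw series into a supervised dataset: features come only from values before index `i`, and the label says whether the value at `i` reached 1.5x. The sketch below reproduces that pattern in plain Python. `toy_features` is a hypothetical stand-in for `FeatureEngineering.extract_all_features`, whose real feature set is not shown in this notebook.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def toy_features(history: Sequence[float], window: int) -> Dict[str, float]:
    """Hypothetical features computed from the last `window` values only."""
    recent = history[-window:]
    return {
        "last": recent[-1],
        "mean": mean(recent),
        "max": max(recent),
        "share_above": sum(v >= 1.5 for v in recent) / len(recent),
    }


def build_dataset(values: Sequence[float], window: int = 100,
                  threshold: float = 1.5) -> Tuple[List[Dict[str, float]], List[int]]:
    """Pair features from values[:i] with a binary label for values[i]."""
    rows, labels = [], []
    for i in range(window, len(values)):
        rows.append(toy_features(values[:i], window))  # past values only, no leakage
        labels.append(1 if values[i] >= threshold else 0)
    return rows, labels


# Usage on a tiny synthetic series.
values = [1.0, 2.0] * 10
rows, labels = build_dataset(values, window=4)
print(len(rows), labels[:4])
```

Keeping the target out of its own feature window is the important property here; the feature set itself is interchangeable.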

In [None]:
# AutoGluon Training
import warnings
warnings.filterwarnings('ignore')

from utils.autogluon_predictor import AutoGluonPredictor

print("="*80)
print("🤖 AUTOGLUON AUTOML TRAINING")
print("="*80)

# Feature extraction
# window_size is the minimum history length; each sample uses every value before index i
window_size = 100
X_features = []
y_labels = []

for i in range(window_size, len(train_data) - 1):
    hist = train_data[:i].tolist()  # history strictly before the target
    target = train_data[i]          # value to predict
    feats = FeatureEngineering.extract_all_features(hist)
    X_features.append(feats)
    y_labels.append(1 if target >= 1.5 else 0)  # 1 = at or above the 1.5x threshold

X_train_ag = pd.DataFrame(X_features)
y_train_ag = pd.Series(y_labels, name='above_threshold')

print(f"✅ {len(X_train_ag):,} samples prepared")

# AutoGluon predictor
ag_predictor = AutoGluonPredictor(
    model_path='models/autogluon_model',
    threshold=1.5
)

# Train
ag_results = ag_predictor.train(
    X_train=X_train_ag,
    y_train=y_train_ag,
    time_limit=3600,
    presets='best_quality',
    eval_metric='roc_auc'
)

print("\n✅ AutoGluon complete!")
print(f"🏆 Best Model: {ag_results['best_model']}")
print(f"📊 Score: {ag_results['best_score']:.4f}")

# Save info
autogluon_info = {
    'model': 'AutoGluon_AutoML',
    'version': '1.0',
    'date': datetime.now().strftime('%Y-%m-%d'),
    'best_model': ag_results['best_model'],
    'best_score': float(ag_results['best_score'])
}

os.makedirs('models/autogluon_model', exist_ok=True)
with open('models/autogluon_model/model_info.json', 'w') as f:
    json.dump(autogluon_info, f, indent=2)

## 🎯 STEP 6: TabNet High-X Specialist Training (NEW!)

**⏱️ Estimated Time: ~2-3 hours**
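
The TabNet cell fits `StandardScaler` on the training features and reuses the fitted scaler for validation. Below is a dependency-free sketch of that idea for a single feature column. The helper names are hypothetical, and the fallback to a scale of 1.0 for constant columns is an assumption made so the sketch never divides by zero.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def fit_scaler(column: Sequence[float]) -> Tuple[float, float]:
    """Learn mean and population standard deviation from training data only."""
    mu = mean(column)
    sigma = pstdev(column) or 1.0  # avoid division by zero for constant columns
    return mu, sigma


def transform(column: Sequence[float], mu: float, sigma: float) -> List[float]:
    """Apply previously learned statistics to any split."""
    return [(x - mu) / sigma for x in column]


# Usage: statistics come from train, then get applied to validation.
train_col = [1.0, 2.0, 3.0]
val_col = [2.0, 4.0]
mu, sigma = fit_scaler(train_col)
train_scaled = transform(train_col, mu, sigma)
val_scaled = transform(val_col, mu, sigma)
print(train_scaled, val_scaled)
```

Fitting on validation data as well would leak information about later rounds into training.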

In [None]:
# TabNet Training
from utils.tabnet_predictor import TabNetHighXPredictor

print("="*80)
print("🎯 TABNET HIGH-X SPECIALIST TRAINING")
print("="*80)

# Feature extraction
window_size = 100
X_features_tn = []
y_categories = []

for i in range(window_size, len(train_data) - 1):
    hist = train_data[:i].tolist()
    target = train_data[i]
    feats = FeatureEngineering.extract_all_features(hist)
    X_features_tn.append(list(feats.values()))
    category = TabNetHighXPredictor.categorize_value(target)
    y_categories.append(category)

X_train_tn = np.array(X_features_tn)
y_train_tn = np.array(y_categories)

scaler_tn = StandardScaler()
X_train_tn = scaler_tn.fit_transform(X_train_tn)

print(f"✅ {len(X_train_tn):,} samples prepared")

# Validation set
X_val_tn = []
y_val_tn = []

for i in range(window_size, len(val_data) - 1):
    hist = val_data[:i].tolist()
    target = val_data[i]
    feats = FeatureEngineering.extract_all_features(hist)
    X_val_tn.append(list(feats.values()))
    y_val_tn.append(TabNetHighXPredictor.categorize_value(target))

X_val_tn = scaler_tn.transform(np.array(X_val_tn))
y_val_tn = np.array(y_val_tn)

# TabNet predictor
tabnet_predictor = TabNetHighXPredictor(
    model_path='models/tabnet_high_x.pkl',
    scaler_path='models/tabnet_scaler.pkl'
)

# Train
tn_results = tabnet_predictor.train(
    X_train=X_train_tn,
    y_train=y_train_tn,
    X_val=X_val_tn,
    y_val=y_val_tn,
    max_epochs=200,
    patience=20,
    batch_size=256
)

print("\n✅ TabNet complete!")
print(f"🏆 Best Epoch: {tn_results['best_epoch']}")

# Save
tabnet_predictor.save_model()
tabnet_predictor.save_scaler(scaler_tn)

tabnet_info = {
    'model': 'TabNet_HighX_Specialist',
    'version': '1.0',
    'date': datetime.now().strftime('%Y-%m-%d'),
    'best_epoch': int(tn_results['best_epoch']),
    'best_cost': float(tn_results['best_cost'])
}

with open('models/tabnet_info.json', 'w') as f:
    json.dump(tabnet_info, f, indent=2)

## 💾 STEP 7: Saving the Results
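
This step writes a JSON summary into the models folder and then zips that folder. The self-contained sketch below runs the same sequence inside a temporary directory so it can be tried without touching real model files. The function name and the file name `results.json` are illustrative assumptions.

```python
import json
import shutil
import tempfile
import zipfile
from pathlib import Path


def archive_results(results: dict, models_dir: Path, archive_base: Path) -> Path:
    """Write results as JSON into models_dir, then zip the whole directory."""
    models_dir.mkdir(parents=True, exist_ok=True)
    (models_dir / "results.json").write_text(json.dumps(results, indent=2), encoding="utf-8")
    # make_archive appends ".zip" to the base name and returns the final path
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=str(models_dir)))


# Usage inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    zip_path = archive_results({"version": "7.0"}, tmp_path / "models", tmp_path / "backup")
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    zip_name = zip_path.name

print(zip_name, names)
```

Keeping the archive outside the folder being zipped avoids the archive trying to include itself.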

In [None]:
# Save the results
import shutil

print("="*80)
print("💾 SAVING RESULTS")
print("="*80)

# JSON results
final_results = {
    'metadata': {
        'version': '7.0',
        'date': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'total_training_time_hours': round((time.time() - start_time) / 3600, 2)
    },
    'models': {
        'progressive_nn': progressive_results if 'progressive_results' in locals() else None,
        'catboost': catboost_results if 'catboost_results' in locals() else None,
        'autogluon': autogluon_info if 'autogluon_info' in locals() else None,
        'tabnet': tabnet_info if 'tabnet_info' in locals() else None
    }
}

with open('models/all_models_results_v7.json', 'w') as f:
    json.dump(final_results, f, indent=2)

print("✅ JSON results saved")

# Create the ZIP archive
zip_filename = f'jetx_5models_v7_{datetime.now().strftime("%Y%m%d_%H%M")}'
shutil.make_archive(zip_filename, 'zip', 'models')

print(f"✅ ZIP created: {zip_filename}.zip")

# Back up to Drive
if drive_path:
    try:
        drive_zip = f"{drive_path}/{zip_filename}.zip"
        shutil.copy(f"{zip_filename}.zip", drive_zip)
        print(f"✅ Backed up to Drive: {drive_zip}")
    except Exception as e:
        print(f"⚠️ Drive backup error: {e}")

# Download
try:
    from google.colab import files
    files.download(f'{zip_filename}.zip')
    print("✅ Downloading file...")
except Exception:
    # Not running in Colab, or the download failed: report where the file is
    print(f"📁 File location: {os.path.abspath(zip_filename + '.zip')}")

print("\n" + "="*80)
print("✅ ALL STEPS COMPLETED!")
print("="*80)

## 🎉 STEP 8: Final Summary

In [None]:
# Final summary
print("="*80)
print("🎉 JetX 5 MODEL ENSEMBLE SYSTEM - FINAL SUMMARY")
print("="*80)

total_time = time.time() - start_time
print(f"\n⏱️  TOTAL TIME: {total_time/3600:.2f} hours ({total_time/60:.1f} minutes)")

print("\n📊 TRAINED MODELS:")
model_count = 0

if 'progressive_results' in locals():
    model_count += 1
    print(f"   1️⃣ Progressive NN (Multi-Scale) ✅")
    print(f"      - MAE: {progressive_results['ensemble_metrics']['mae']:.4f}")

if 'catboost_results' in locals():
    model_count += 1
    print(f"\n   2️⃣ CatBoost (Multi-Scale) ✅")

if 'autogluon_info' in locals():
    model_count += 1
    print(f"\n   3️⃣ AutoGluon AutoML ✅")
    print(f"      - Best Model: {autogluon_info['best_model']}")

if 'tabnet_info' in locals():
    model_count += 1
    print(f"\n   4️⃣ TabNet High-X Specialist ✅")

# Only the four models above are trained in this notebook, so the count tops out at 4 of 5
print(f"\n📈 TOTAL: {model_count}/5 models trained successfully")

print("\n📁 SAVED FILES:")
print("   ✅ models/all_models_results_v7.json")
print(f"   ✅ {zip_filename}.zip")
if drive_path:
    print(f"   ✅ {drive_path}/{zip_filename}.zip (Drive backup)")

print("\n" + "="*80)
print("✨ COMPLETED SUCCESSFULLY! ✨")
print("="*80)
print(f"Finished: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print("="*80)