# Qwen3-8B Model Fine-tuning
## TEKNOFEST 2025 - Education Technologies

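This notebook fine-tunes a Qwen model with 4-bit quantization and LoRA. The title refers to Qwen3-8B, but the loading cell sets `MODEL_ID` to `Qwen/Qwen2.5-7B-Instruct`, which the code describes as the more stable choice. Loading a Qwen3 checkpoint would also need a newer transformers release than the 4.44.2 pinned here, since Qwen3 support arrived in 4.51.0.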

In [None]:
# First restart the runtime, then do a clean install
!pip uninstall transformers accelerate -y
!pip install transformers==4.44.2 accelerate==0.34.2 bitsandbytes==0.43.3 peft==0.12.0 -q
!pip install sentencepiece protobuf -q

In [None]:
# Clear the Hugging Face cache
import os
import shutil

cache_dir = os.path.expanduser('~/.cache/huggingface')
if os.path.exists(cache_dir):
    shutil.rmtree(cache_dir)
    print("Cache cleared")

In [None]:
# Restart the runtime
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

## After the runtime restarts, continue from here:

In [None]:
# Import the required libraries
import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# The checkpoint that is actually loaded (the notebook title mentions Qwen3-8B)
MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"  # A more stable model

print("="*70)
print(f"MODEL LOADING - {MODEL_ID}")
print("="*70)

# Version check
print(f"Transformers version: {transformers.__version__}")
print(f"Model: {MODEL_ID}")
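
The version check above only covers transformers. A minimal sketch of extending it to every package pinned in the install cell is shown below, using only the standard library. The helper name `find_mismatches` and its `lookup` parameter are illustrative and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install cell at the top of the notebook.
PINNED = {
    "transformers": "4.44.2",
    "accelerate": "0.34.2",
    "bitsandbytes": "0.43.3",
    "peft": "0.12.0",
}


def find_mismatches(pinned, lookup=version):
    """Return {package: (expected, found)} for every pin that is not satisfied.

    `found` is None when the package is not installed at all.
    `lookup` is injectable so the function can be exercised without real installs.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


problems = find_mismatches(PINNED)
if problems:
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found}")
else:
    print("All pinned versions are installed")
```

If a mismatch is reported after the restart, rerunning the install cell and restarting the runtime again is usually the simplest fix.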

In [None]:
# 4-bit quantization settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
print("✅ 4-bit quantization configured")
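
To get a feel for what these settings buy, the weight footprint can be estimated with simple arithmetic. The sketch below assumes about 7.6 billion parameters for the checkpoint. It also assumes about 0.5 bits per parameter of quantization-constant overhead, dropping to about 0.127 bits with double quantization. Those overhead figures come from the QLoRA paper, not from this notebook. The estimate treats every weight as quantized and ignores activations, gradients and optimizer state, so it is a rough lower bound rather than a measurement.

```python
GIB = 1024 ** 3


def weight_memory_gib(n_params, bits_per_weight, overhead_bits=0.0):
    """Approximate storage for n_params weights, in GiB."""
    total_bits = n_params * (bits_per_weight + overhead_bits)
    return total_bits / 8 / GIB


N_PARAMS = 7.6e9  # assumption: approximate parameter count of the checkpoint

bf16 = weight_memory_gib(N_PARAMS, 16)
nf4 = weight_memory_gib(N_PARAMS, 4, overhead_bits=0.5)       # without double quantization
nf4_dq = weight_memory_gib(N_PARAMS, 4, overhead_bits=0.127)  # with double quantization

print(f"bf16 weights:            {bf16:.2f} GiB")
print(f"NF4 weights:             {nf4:.2f} GiB")
print(f"NF4 + double quant:      {nf4_dq:.2f} GiB")
```

In practice, layers that stay in higher precision (such as embeddings) push the real number above this estimate. The GPU memory cell later in the notebook reports the actual figure.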

In [None]:
# Load the tokenizer
print("\n⏳ Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_ID,
    trust_remote_code=True
)

# Fall back to the EOS token when the tokenizer defines no padding token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id

print("✅ Tokenizer loaded")
print(f"   Vocab size: {len(tokenizer)}")

In [None]:
# Load the model
print("\n⏳ Loading model (may take 5-10 minutes)...")
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    use_cache=False
)
print("✅ Model loaded (4-bit)")

In [None]:
# LoRA configuration
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Prepare the model for LoRA
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)

# Statistics
# Note: 4-bit weights are stored packed, so numel() undercounts the quantized
# layers and the percentage below is overstated; model.print_trainable_parameters()
# corrects for this.
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
all_params = sum(p.numel() for p in model.parameters())

print("\n📊 Model statistics:")
print(f"   Total parameters: {all_params:,}")
print(f"   Trainable: {trainable_params:,}")
print(f"   Trainable %: {100 * trainable_params / all_params:.2f}%")
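
The trainable count printed above can be cross-checked by hand: each adapted linear layer with `in_features` inputs and `out_features` outputs gains `r * (in_features + out_features)` parameters, because LoRA adds an `r x in` matrix and an `out x r` matrix. The sketch below applies that formula to the seven target modules. The layer dimensions are assumptions for a Qwen2.5-7B-style architecture and should be checked against `model.config`. The function names are illustrative.

```python
def lora_params(in_features, out_features, r):
    """Parameters added by one LoRA adapter: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


# Assumed dimensions for a Qwen2.5-7B-style decoder; verify against model.config.
HIDDEN = 3584         # hidden_size
INTERMEDIATE = 18944  # intermediate_size
KV_DIM = 512          # num_key_value_heads * head_dim
N_LAYERS = 28         # num_hidden_layers

# (in_features, out_features) for each module listed in target_modules.
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def total_lora_params(shapes, r, n_layers):
    """Sum the adapter parameters over all target modules and all layers."""
    per_layer = sum(lora_params(i, o, r) for i, o in shapes.values())
    return per_layer * n_layers


print(f"Expected trainable parameters: {total_lora_params(SHAPES, r=64, n_layers=N_LAYERS):,}")
```

With `bias="none"` no bias terms are trained, so if the assumed dimensions match the loaded model this figure should agree with the trainable count reported by the cell above.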

In [None]:
# GPU memory status
if torch.cuda.is_available():
    allocated = torch.cuda.memory_allocated() / 1024**3
    reserved = torch.cuda.memory_reserved() / 1024**3
    print("\n💾 GPU memory:")
    print(f"   Allocated: {allocated:.2f} GB")
    print(f"   Reserved: {reserved:.2f} GB")

print("\n✅ MODEL READY")

## Alternative: Using a Smaller Model

If the model above is too large for your GPU, you can use a smaller one:

In [None]:
# Alternative: a smaller Qwen model
# MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"
# or
# MODEL_ID = "Qwen/Qwen2.5-3B-Instruct"
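
One way to make this choice less of a guess is to compare a rough weight estimate against the free GPU memory. The sketch below is only a heuristic. The parameter counts are the nominal sizes from the model names. The 4.5 bits per weight and the `headroom` multiplier of 3 (to leave room for activations, LoRA gradients and optimizer state) are arbitrary illustrative values. `pick_model` is a hypothetical helper, not part of the original notebook.

```python
# Candidates ordered from largest to smallest; sizes are nominal (assumption).
CANDIDATES = [
    ("Qwen/Qwen2.5-7B-Instruct", 7e9),
    ("Qwen/Qwen2.5-3B-Instruct", 3e9),
    ("Qwen/Qwen2.5-1.5B-Instruct", 1.5e9),
]


def pick_model(free_gib, candidates=CANDIDATES, bits_per_weight=4.5, headroom=3.0):
    """Return the first (largest) model whose estimated 4-bit weights,
    multiplied by `headroom`, fit into `free_gib`; None if nothing fits."""
    for model_id, n_params in candidates:
        weights_gib = n_params * bits_per_weight / 8 / 1024**3
        if weights_gib * headroom <= free_gib:
            return model_id
    return None


for free in (15, 8, 3, 1):
    print(f"{free:>2} GiB free -> {pick_model(free)}")
```

In the notebook, `free_gib` could come from `torch.cuda.mem_get_info()[0] / 1024**3`. The multiplier should be tuned to the batch size and sequence length actually used for training.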