# 🎮 Fine-tuning Qwen2.5:3b for the GW2 WvW Counter-Picker

This notebook fine-tunes the Qwen2.5:3b model on Guild Wars 2 (GW2) World vs World (WvW) fight data.

**Prerequisites**:
- Google Colab (free tier)
- T4 GPU (enabled automatically)
- ~30 minutes for the fine-tuning run

**Result**:
- A fine-tuned model that can be exported to GGUF for Ollama
- Better understanding of GW2 WvW compositions
- More accurate answers in the CONTER/FOCUS/TACTIQUE format

## 1️⃣ Check the GPU and install dependencies

In [None]:
# Check which GPU is available
!nvidia-smi

import torch
print(f"\n✓ PyTorch version: {torch.__version__}")
print(f"✓ CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"✓ GPU: {torch.cuda.get_device_name(0)}")
    print(f"✓ VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

In [None]:
%%capture
# Install Unsloth (optimized for fast fine-tuning).
# The %%capture cell magic must be the first line of the cell; it hides all cell output.
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps trl peft accelerate bitsandbytes triton
!pip install datasets huggingface_hub

print("✓ Dependencies installed")

## 2️⃣ Load the GW2 WvW dataset

In [None]:
# Upload the dataset from your PC
# Option 1: manual upload
from google.colab import files
print("📁 Upload the file 'finetune_dataset_qwen.jsonl' from your PC:")
uploaded = files.upload()

# Check the uploaded file
import os
for filename in uploaded.keys():
    print(f"✓ Uploaded file: {filename} ({os.path.getsize(filename)} bytes)")

In [None]:
# Load and prepare the dataset
from datasets import load_dataset

# Load the JSONL dataset
dataset = load_dataset("json", data_files="finetune_dataset_qwen.jsonl", split="train")

print(f"✓ Dataset loaded: {len(dataset)} examples")
print("\n📋 Example:")
print(f"Instruction: {dataset[0]['instruction'][:200]}...")
print(f"Output: {dataset[0]['output']}")
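Before training, it can help to confirm that every line of the JSONL file parses and carries the two fields the later cells read (`instruction` and `output`). The following is a minimal sketch; the helper name `validate_jsonl` is hypothetical and is not part of the original notebook.

```python
import json

REQUIRED_KEYS = ("instruction", "output")  # fields read by the cells in this notebook


def validate_jsonl(lines):
    """Return a list of (line_number, problem) for rows that would break training."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(row, dict):
            problems.append((number, "row is not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            value = row.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty '{key}'"))
    return problems
```

In the notebook it could be called on the open file, for example `validate_jsonl(open("finetune_dataset_qwen.jsonl", encoding="utf-8"))`; an empty list means no problems were found.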

In [None]:
# Format the dataset for Qwen2.5
def format_prompt(example):
    """Wrap one example in the Qwen2.5 chat template (ChatML markers)."""
    return {
        "text": f"""<|im_start|>user
{example['instruction']}<|im_end|>
<|im_start|>assistant
{example['output']}<|im_end|>"""
    }

# Apply the formatting
formatted_dataset = dataset.map(format_prompt)
print("✓ Dataset formatted for Qwen2.5")
print("\n📋 Formatted example:")
print(formatted_dataset[0]['text'][:500])
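The same ChatML layout has to be used at training time and at inference time (section 5 builds the inference prompt by hand). One way to keep both in sync is a single helper. `build_chatml` below is a hypothetical name, and the sample row is made up for illustration; this is a sketch, not the notebook's own code.

```python
def build_chatml(instruction, output=None):
    """Build a Qwen2.5-style ChatML string.

    With `output`, returns a full training example.
    Without it, returns a prompt that ends where the assistant should start writing.
    """
    prompt = f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"
    if output is None:
        return prompt
    return f"{prompt}{output}<|im_end|>"


# Made-up row, for illustration only
example = {"instruction": "Enemy: 2x Soulbeast", "output": "CONTER: 2x Spellbreaker"}
print(build_chatml(example["instruction"], example["output"]))
```

Because the inference prompt is a strict prefix of the training text, the model sees at test time exactly the context it was trained to complete.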

## 3️⃣ Load the Qwen2.5:3b model with Unsloth

In [None]:
from unsloth import FastLanguageModel

# Model configuration
max_seq_length = 2048  # Maximum sequence length
dtype = None  # Auto-detect (float16 on T4)
load_in_4bit = True  # 4-bit quantization to save VRAM

# Load Qwen2.5:3b
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-3B-Instruct",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

print("✓ Qwen2.5-3B-Instruct model loaded")
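To see why 4-bit loading matters on a T4, here is a back-of-the-envelope estimate of the memory taken by the weights alone. It assumes roughly 3 billion parameters and ignores activations, LoRA adapters, optimizer state, and any layers kept in higher precision, so treat it as a rough lower bound rather than a measurement.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for the weights only, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 3e9  # assumption: "3B" taken as ~3 billion parameters

for label, bits in [("float16", 16), ("4-bit", 4)]:
    print(f"{label:>8}: ~{weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights shrink from about 6 GB to about 1.5 GB, which leaves more of the GPU's VRAM for activations and the optimizer.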

In [None]:
# Add LoRA adapters for fine-tuning
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank (16 = good quality/speed balance)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,  # No dropout, for more stable training
    bias="none",
    use_gradient_checkpointing="unsloth",  # Saves about 30% of VRAM
    random_state=42,
)

print("✓ LoRA adapters added")
# print_trainable_parameters() prints its own summary and returns None,
# so call it directly instead of embedding it in an f-string.
model.print_trainable_parameters()
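The number of trainable parameters follows from the LoRA rank: for a linear layer with `d_in` inputs and `d_out` outputs, LoRA adds two small matrices totalling `r * (d_in + d_out)` weights. The sketch below applies that formula to the seven target modules. The layer dimensions are assumptions about Qwen2.5-3B (they are not stated in this notebook); read the real values from `model.config` before relying on the result.

```python
def lora_params(d_in, d_out, r):
    """Weights added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def count_lora_params(hidden, kv_dim, intermediate, n_layers, r):
    """Total LoRA weights when adapting all attention and MLP projections."""
    shapes = {
        "q_proj": (hidden, hidden),
        "k_proj": (hidden, kv_dim),
        "v_proj": (hidden, kv_dim),
        "o_proj": (hidden, hidden),
        "gate_proj": (hidden, intermediate),
        "up_proj": (hidden, intermediate),
        "down_proj": (intermediate, hidden),
    }
    per_layer = sum(lora_params(d_in, d_out, r) for d_in, d_out in shapes.values())
    return per_layer * n_layers


# Assumed Qwen2.5-3B dimensions (check model.config): hidden size 2048,
# key/value projection width 256, MLP width 11008, 36 layers.
total = count_lora_params(hidden=2048, kv_dim=256, intermediate=11008, n_layers=36, r=16)
print(f"{total:,} trainable LoRA parameters")
```

Doubling `r` doubles this count, which is the main lever for trading adapter capacity against memory and speed.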

## 4️⃣ Configure and run the fine-tuning

In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments

# Training configuration
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=formatted_dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    packing=False,  # No packing (examples have variable length)
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=10,
        num_train_epochs=3,  # 3 epochs for solid learning
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=10,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=42,
        output_dir="outputs",
        report_to="none",  # No external logging
    ),
)

print("✓ Trainer configured")
print(f"✓ Effective batch size: 2 x 4 = {2 * 4}")
print("✓ Epochs: 3")
print(f"✓ Examples: {len(formatted_dataset)}")
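The settings above determine how many optimizer updates the run performs, which is useful for judging whether `warmup_steps=10` and `logging_steps=10` are sensible. The sketch below is a rough single-GPU estimate; the step count actually reported by the Trainer may differ by a few steps depending on how the installed version handles the last partial batch. The helper name is hypothetical.

```python
import math


def estimate_optimizer_steps(n_examples, per_device_batch, grad_accum, epochs):
    """Rough number of optimizer updates for a single-GPU run."""
    effective_batch = per_device_batch * grad_accum
    steps_per_epoch = math.ceil(n_examples / effective_batch)
    return steps_per_epoch * epochs


# 1537 is the dataset size quoted in the summary of this notebook.
steps = estimate_optimizer_steps(n_examples=1537, per_device_batch=2, grad_accum=4, epochs=3)
print(f"~{steps} optimizer steps, of which 10 are warmup")
```

With these numbers the warmup covers only a small fraction of the run, and a loss value is logged a few dozen times in total.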

In [None]:
# 🚀 Run the fine-tuning
print("🚀 Starting fine-tuning...")
print("⏱️ Estimated duration: 20-30 minutes on a T4 GPU")
print("-" * 50)

trainer_stats = trainer.train()

print("-" * 50)
print("✓ Fine-tuning finished!")
print(f"✓ Final loss: {trainer_stats.training_loss:.4f}")
print(f"✓ Total time: {trainer_stats.metrics['train_runtime']:.0f} seconds")

## 5️⃣ Test the fine-tuned model

In [None]:
# Switch to inference mode
FastLanguageModel.for_inference(model)

# Test with an enemy composition
test_prompt = """Guild Wars 2 WvW counter-picker.

VALID SPECS: Firebrand, Willbender, Dragonhunter, Spellbreaker, Berserker, Bladesworn, Herald, Vindicator, Renegade, Scrapper, Holosmith, Mechanist, Druid, Soulbeast, Untamed, Daredevil, Deadeye, Specter, Tempest, Weaver, Catalyst, Chronomancer, Mirage, Virtuoso, Reaper, Scourge, Harbinger

Mode: ZERG (25+ players)
Enemy: 4x Firebrand, 3x Scourge, 2x Scrapper, 2x Spellbreaker

[ENEMY ANALYSIS]
- Firebrand: support, heal, stability (weak to: boon strip, boon corrupt)
- Scourge: condi, corrupt, barrier (weak to: burst, focus fire)
- Scrapper: support, superspeed, cleanse (weak to: boon strip, focus fire)
- Spellbreaker: frontline, strip, cc (weak to: condi pressure, kiting)

Respond EXACTLY in this format:
CONTER: Nx Spec, Nx Spec
FOCUS: Target1 > Target2
TACTIQUE: One tactical advice"""

# Generate the response
inputs = tokenizer(
    f"<|im_start|>user\n{test_prompt}<|im_end|>\n<|im_start|>assistant\n",
    return_tensors="pt"
).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.1,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

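# Decode only the newly generated tokens, so the prompt is not echoed back
# (splitting on "assistant" would break if that word appeared in the answer)
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print("📋 Fine-tuned model test:")
print("=" * 50)
print(response.strip())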

In [None]:
# Additional test - Roaming
test_roam = """Guild Wars 2 WvW counter-picker.

VALID SPECS: Firebrand, Willbender, Dragonhunter, Spellbreaker, Berserker, Bladesworn, Herald, Vindicator, Renegade, Scrapper, Holosmith, Mechanist, Druid, Soulbeast, Untamed, Daredevil, Deadeye, Specter, Tempest, Weaver, Catalyst, Chronomancer, Mirage, Virtuoso, Reaper, Scourge, Harbinger

Mode: ROAMING (1-10 players)
Enemy: 2x Soulbeast, 1x Deadeye

[ENEMY ANALYSIS]
- Soulbeast: dps, burst, roam (weak to: CC, sustain fights)
- Deadeye: sniper, burst, backline (weak to: mobility, stealth reveal)

Respond EXACTLY in this format:
CONTER: Nx Spec, Nx Spec
FOCUS: Target1 > Target2
TACTIQUE: One tactical advice"""

inputs = tokenizer(
    f"<|im_start|>user\n{test_roam}<|im_end|>\n<|im_start|>assistant\n",
    return_tensors="pt"
).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.1,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

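# Decode only the newly generated tokens, so the prompt is not echoed back
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print("📋 Roaming test:")
print("=" * 50)
print(response.strip())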
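Since the prompts ask for an exact three-line format, the answers can be checked automatically instead of by eye. The parser below is a minimal sketch: `parse_response` is a hypothetical helper, the spec list is copied from the test prompts above, and the strictness of the checks (for example, requiring every CONTER entry to look like `Nx Spec`) is an assumption about what counts as a valid answer.

```python
import re

VALID_SPECS = {
    "Firebrand", "Willbender", "Dragonhunter", "Spellbreaker", "Berserker",
    "Bladesworn", "Herald", "Vindicator", "Renegade", "Scrapper", "Holosmith",
    "Mechanist", "Druid", "Soulbeast", "Untamed", "Daredevil", "Deadeye",
    "Specter", "Tempest", "Weaver", "Catalyst", "Chronomancer", "Mirage",
    "Virtuoso", "Reaper", "Scourge", "Harbinger",
}

# One pattern per expected line; MULTILINE lets ^ and $ match at line boundaries.
LINE_PATTERNS = {
    name: re.compile(rf"^{name}:\s*(.+)$", re.MULTILINE)
    for name in ("CONTER", "FOCUS", "TACTIQUE")
}


def parse_response(text):
    """Return a dict with the three fields, or None if the format is not respected."""
    fields = {}
    for name, pattern in LINE_PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            return None
        fields[name] = match.group(1).strip()

    # CONTER must be a comma-separated list of "Nx Spec" entries with known specs.
    picks = []
    for item in fields["CONTER"].split(","):
        entry = re.fullmatch(r"(\d+)x\s+(\w+)", item.strip())
        if entry is None or entry.group(2) not in VALID_SPECS:
            return None
        picks.append((int(entry.group(1)), entry.group(2)))
    fields["CONTER"] = picks
    return fields
```

Running `parse_response(response)` over a held-out set of prompts and counting the `None` results gives a simple format-compliance rate to compare before and after fine-tuning.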

## 6️⃣ Export the model for Ollama (GGUF)

In [None]:
# Save the LoRA model
model.save_pretrained("qwen25-3b-gw2-lora")
tokenizer.save_pretrained("qwen25-3b-gw2-lora")
print("✓ LoRA model saved")

In [None]:
# Export to GGUF for Ollama (Q4_K_M quantization recommended)
# Q4_K_M = good quality/size balance (~2GB)

model.save_pretrained_gguf(
    "qwen25-3b-gw2-gguf",
    tokenizer,
    quantization_method="q4_k_m",  # 4-bit quantization
)

print("✓ Model exported to GGUF (Q4_K_M)")
print("📁 File: qwen25-3b-gw2-gguf/unsloth.Q4_K_M.gguf")

In [None]:
# Download the GGUF model
from google.colab import files
import os

# Find the GGUF file
gguf_dir = "qwen25-3b-gw2-gguf"
gguf_files = [f for f in os.listdir(gguf_dir) if f.endswith('.gguf')]

if gguf_files:
    gguf_path = os.path.join(gguf_dir, gguf_files[0])
    print(f"📥 Downloading {gguf_path}...")
    print(f"   Size: {os.path.getsize(gguf_path) / 1e9:.2f} GB")
    files.download(gguf_path)
else:
    print("❌ GGUF file not found")

## 7️⃣ Using the model with Ollama

Once the `.gguf` file is downloaded, here is how to use it:

### On your server:

```bash
# 1. Copy the GGUF file to the server
scp unsloth.Q4_K_M.gguf user@server:/path/to/models/

# 2. Create a Modelfile for Ollama
cat > Modelfile << 'EOF'
FROM /path/to/models/unsloth.Q4_K_M.gguf

TEMPLATE """<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
{{ .Response }}<|im_end|>"""

PARAMETER temperature 0.1
PARAMETER num_predict 80
PARAMETER num_ctx 1024
PARAMETER stop "<|im_end|>"
EOF

# 3. Create the Ollama model
ollama create qwen25-gw2 -f Modelfile

# 4. Test
ollama run qwen25-gw2 "Test prompt..."
```
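Once the model has been created, it can also be queried from Python through Ollama's local HTTP API rather than the CLI. The sketch below assumes Ollama is running on the same machine at its default address (`http://localhost:11434`) and that the model was created under the name `qwen25-gw2` as above; the helper names are hypothetical, and the options mirror the Modelfile parameters.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumption: default local Ollama address


def build_payload(prompt, model="qwen25-gw2"):
    """Request body for a single, non-streamed completion."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a stream of chunks
        "options": {"temperature": 0.1, "num_predict": 80},
    }


def ask_ollama(prompt, url=OLLAMA_URL, timeout=60):
    """Send the prompt to Ollama and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Passing `data` makes urllib issue a POST request.
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]
```

Calling `ask_ollama("Test prompt...")` requires the Ollama server to be running; the prompt is wrapped by the `TEMPLATE` from the Modelfile, so it should be passed without the `<|im_start|>` markers.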

### In counter_ai.py:

Change `MODEL_NAME` to use the fine-tuned model:
```python
MODEL_NAME = "qwen25-gw2"  # Fine-tuned model
```

## ✅ Summary

You have now:
1. ✓ Fine-tuned Qwen2.5:3b on 1537 examples of GW2 WvW fights
2. ✓ Exported the model to GGUF for Ollama
3. ✓ Downloaded the file (~2GB)

**Next steps:**
1. Copy the GGUF file to your server
2. Create the Ollama model with the Modelfile
3. Update `counter_ai.py` to use the new model

The fine-tuned model should:
- Follow the CONTER/FOCUS/TACTIQUE format more reliably
- Understand GW2 WvW synergies
- Give more relevant tactical advice