# Training the Smart Chess Neural Network on Google Colab

This notebook trains the neural network used for chess position evaluation on Google Colab GPU resources.

**Project path on Drive:** `MyDrive/smart_chess_drive/smart-chess`

## Instructions
1. Go to **Runtime > Change runtime type > GPU** (T4 or better)
2. Run the cells in order
3. The models are saved automatically to your Drive

## 1. GPU check

In [2]:
# Check that a GPU is available
!nvidia-smi

Sun Nov 16 16:56:52 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   66C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

## 2. Mounting Google Drive

In [3]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


## 3. Project path configuration

In [4]:
# Set the path to the project on your Drive
import os
import sys

PROJECT_PATH = '/content/drive/MyDrive/smart_chess_drive/smart-chess'
os.chdir(PROJECT_PATH)
sys.path.insert(0, PROJECT_PATH)

print(f"Working directory: {os.getcwd()}")
print("\nDirectory contents:")
for item in sorted(os.listdir('.')):
    print(f"  - {item}")

Working directory: /content/drive/MyDrive/smart_chess_drive/smart-chess

Directory contents:
  - .git
  - .gitignore
  - README.md
  - ai
  - docs
  - prototypes


## 4. Installing dependencies

In [5]:
# Install the required packages
# (pandas and scikit-learn are used by later cells and by the trainer)
!pip install -q torch torchvision torchaudio
!pip install -q numpy pandas scikit-learn matplotlib tqdm

print("✓ Installation complete")

✓ Installation complete


## 5. Checking the PyTorch environment

In [6]:
import torch
import numpy as np

print("=" * 60)
print("SYSTEM CONFIGURATION")
print("=" * 60)
print(f"PyTorch version: {torch.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"\nCUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU name: {torch.cuda.get_device_name(0)}")
    props = torch.cuda.get_device_properties(0)
    print(f"Total GPU memory: {props.total_memory / 1e9:.2f} GB")
    print(f"Compute capability: {props.major}.{props.minor}")
else:
    print("⚠️ WARNING: no GPU available, training will be very slow!")
    print("   Go to Runtime > Change runtime type > GPU")

print("=" * 60)

SYSTEM CONFIGURATION
PyTorch version: 2.8.0+cu126
NumPy version: 2.0.2

CUDA available: True
CUDA version: 12.6
GPU name: Tesla T4
Total GPU memory: 15.83 GB
Compute capability: 7.5


## 6. Importing the project modules

In [7]:
# Import the project modules (robust to where the repo lives on Drive)
import os
import sys
import importlib

# Default location of the repo on Drive
PROJECT_PATH = '/content/drive/MyDrive/smart_chess_drive/smart-chess'

# Alternative location (if the repo was copied to /content)
ALT_PATH = '/content/smart-chess'

# Pick a path that exists
if not os.path.isdir(PROJECT_PATH) and os.path.isdir(ALT_PATH):
    PROJECT_PATH = ALT_PATH

if not os.path.isdir(PROJECT_PATH):
    raise FileNotFoundError(f"Project directory not found: {PROJECT_PATH}. Mount Drive and check the path.")

# Derive the `ai` folder only once the project path is final,
# so that the ALT_PATH fallback is honoured
AI_SUBDIR = os.path.join(PROJECT_PATH, 'ai')

# Add both folders to sys.path if needed
if PROJECT_PATH not in sys.path:
    sys.path.insert(0, PROJECT_PATH)
if AI_SUBDIR not in sys.path and os.path.isdir(AI_SUBDIR):
    sys.path.insert(0, AI_SUBDIR)

# Switch to the project directory
os.chdir(PROJECT_PATH)

print('Working directory:', os.getcwd())
print('\nSome files at the project root:')
print(sorted(os.listdir(PROJECT_PATH))[:50])
print('\nContents of the ai/ folder:')
print(sorted(os.listdir(AI_SUBDIR))[:100])

# Diagnostic: direct import of the Chess module
try:
    import Chess
    print('\n✅ Direct import of `Chess` OK (module found via sys.path)')
except Exception as e:
    print('\n❌ Direct import of `Chess` failed:', e)
    print('Check that `ai/Chess.py` exists and that the ai/ folder is in sys.path')

# Now import the training module (ai.NN.torch_train)
try:
    import ai.NN.torch_train as trainer
    import ai.NN.torch_nn_evaluator as torch_eval
    from ai.Chess_v2 import Chess  # rebinds the name `Chess` imported above
    print('\n✓ Modules imported successfully!')
except Exception as e:
    print('\n❌ Import error while importing the trainer:', e)
    raise


Working directory: /content/drive/MyDrive/smart_chess_drive/smart-chess

Some files at the project root:
['.git', '.gitignore', 'README.md', 'ai', 'docs', 'prototypes']

Contents of the ai/ folder:
['AI_reduction', 'Chess.py', 'ChessInteractif - v7.py', 'ChessInteractifv10.py', 'ChessInteractifv2.py', 'Chess_v2.py', 'NN', 'Null_move_AI', 'Old_AI', 'Player.py', 'Profile', 'Tests.py', '__init__.py', '__pycache__', 'alphabeta.py', 'alphabeta_engine.py', 'alphabeta_engine_v2.py', 'analyze_reduction_overhead.py', 'base_engine.py', 'check_dataset_stats.py', 'check_gpu.py', 'check_performance.py', 'checkpoints', 'chess_model_checkpoint.pt', 'debug_conversion.py', 'engine.py', 'engine_match.py', 'evaluator.py', 'example_move_reduction.py', 'fast_evaluator.py', 'gaviota.py', 'journal-experiments.md', 'optimized_chess.py', 'pgn.py', 'polyglot.py', 'profile_report_1760344602.txt', 'py.typed', 'svg.py', 'syzygy.py', 'test_depth_6_performance.py', 'test_depth_6_quick.py', 'test_depth_eff
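
The path fallback in the cell above can be isolated into a small helper. The sketch below is illustrative only: `resolve_project_dir` is a hypothetical name, not part of the project, and the demo uses a temporary directory in place of the real Drive and `/content` paths. It shows why the `ai` sub-folder should be derived from the path that was actually chosen.

```python
import os
import tempfile

def resolve_project_dir(candidates):
    """Return (project_dir, ai_dir) for the first candidate that exists."""
    for path in candidates:
        if os.path.isdir(path):
            # Derive the sub-folder from the path actually selected
            return path, os.path.join(path, "ai")
    raise FileNotFoundError(f"No project directory found among: {candidates}")

# Demo: a temporary directory stands in for the fallback location
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "not-mounted")    # plays the role of the Drive path
    fallback = os.path.join(tmp, "smart-chess")   # plays the role of /content/smart-chess
    os.makedirs(os.path.join(fallback, "ai"))
    project_dir, ai_dir = resolve_project_dir([missing, fallback])
    print(project_dir == fallback, os.path.isdir(ai_dir))
```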

## 7. Training configuration

In [8]:
# Training parameters (NNUE architecture)
CONFIG = {
    # Data generation
    'num_games': 10000,          # Number of games to generate for training

    # NNUE hyperparameters
    'batch_size': 256,           # Batch size (increase on a more powerful GPU)
    'epochs': 50,                # Number of training epochs
    'learning_rate': 0.001,      # Learning rate

    # NNUE architecture (768 → 4096 → 256 → 32 → 1)
    'hidden1': 4096,
    'hidden2': 256,
    'hidden3': 32,
    'dropout': 0.0,              # NNUE does not use dropout

    # System configuration
    'device': 'cuda' if torch.cuda.is_available() else 'cpu',
    'num_workers': 2,            # Workers for the DataLoader

    # Saving
    'checkpoint_path': 'ai/chess_model_checkpoint.pt',
    'save_interval': 5,          # Save every N epochs
}

print("=" * 60)
print("TRAINING CONFIGURATION (NNUE)")
print("=" * 60)
for key, value in CONFIG.items():
    print(f"{key:20s}: {value}")
print("=" * 60)

if CONFIG['device'] == 'cpu':
    print("\n⚠️ WARNING: training on CPU detected!")
    print("   Reduce num_games and epochs for a quick test.")


TRAINING CONFIGURATION (NNUE)
num_games           : 10000
batch_size          : 256
epochs              : 50
learning_rate       : 0.001
hidden1             : 4096
hidden2             : 256
hidden3             : 32
dropout             : 0.0
device              : cuda
num_workers         : 2
checkpoint_path     : ai/chess_model_checkpoint.pt
save_interval       : 5


## 8. Loading the training data

This step locates the CSV dataset of evaluated positions on Drive and loads it with the trainer's `load_data` function. No games are generated here.
**Note:** in the run recorded below, loading about 12.8 million positions took roughly 20 seconds.

In [9]:
# Locate the dataset on Google Drive and prepare the checkpoint folder
import os
from glob import glob

# Folder expected to contain the dataset (the CSV sits directly in smart_chess_drive)
DATASET_DIR = '/content/drive/MyDrive/smart_chess_drive/'

# Look for a .csv file in DATASET_DIR
DATASET_CSV = None
if os.path.exists(DATASET_DIR):
    # Sort so the choice is deterministic if several CSV files are present
    csvs = sorted(glob(os.path.join(DATASET_DIR, '*.csv')))
    if len(csvs) > 0:
        # Assumes a single relevant CSV in that folder; take the first one
        DATASET_CSV = csvs[0]
        print(f'✅ Dataset CSV found: {DATASET_CSV}')
    else:
        print(f'❌ No .csv file found in {DATASET_DIR}. Put your chessData.csv file in this folder.')
else:
    print(f'❌ Dataset folder not found: {DATASET_DIR}. Check the path on your Drive.')

# Create a checkpoint folder inside the repo on Drive (persistent)
CKPT_DIR = '/content/drive/MyDrive/smart_chess_drive/smart-chess/ai/checkpoints'
os.makedirs(CKPT_DIR, exist_ok=True)
print('Checkpoint folder (created if missing):', CKPT_DIR)

# Show the variables used by later cells
print('\nExposed variables:')
print(' DATASET_CSV =', DATASET_CSV)
print(' CKPT_DIR =', CKPT_DIR)

✅ Dataset CSV found: /content/drive/MyDrive/smart_chess_drive/chessData.csv
Checkpoint folder (created if missing): /content/drive/MyDrive/smart_chess_drive/smart-chess/ai/checkpoints

Exposed variables:
 DATASET_CSV = /content/drive/MyDrive/smart_chess_drive/chessData.csv
 CKPT_DIR = /content/drive/MyDrive/smart_chess_drive/smart-chess/ai/checkpoints


In [10]:
from tqdm import tqdm
import time
from sklearn.model_selection import train_test_split

print("Loading the dataset (from chessData)...")

# Use the DATASET_CSV variable defined in the previous cell; fail early if it is missing
dataset_path = globals().get('DATASET_CSV')

if dataset_path is None:
    raise FileNotFoundError('No dataset path defined. Mount Drive and put the CSV file in MyDrive/smart_chess_drive/')

start_time = time.time()

# Use the training script's loader so that preprocessing is identical
fens, evaluations = trainer.load_data(dataset_path)

# Variables expected further down in the notebook (train/validation split: 99% train, 1% validation)
X_train, X_val, y_train, y_val = train_test_split(fens, evaluations, test_size=0.01, random_state=42)

elapsed_time = time.time() - start_time

print("\n" + "=" * 60)
print("DATA LOADED")
print("=" * 60)
print(f"Training set: {len(X_train):,} positions")
print(f"Validation set: {len(X_val):,} positions")
print(f"Total: {len(fens):,} positions")
print(f"Elapsed time: {elapsed_time:.1f}s ({elapsed_time/60:.1f} min)")
print("=" * 60)



Loading the dataset (from chessData)...
📂 Loading dataset from /content/drive/MyDrive/smart_chess_drive/chessData.csv...
🧹 Cleanup: 190154 corrupted rows removed.
✅ 12,767,881 valid positions loaded.

DATA LOADED
Training set: 12,640,202 positions
Validation set: 127,679 positions
Total: 12,767,881 positions
Elapsed time: 19.7s (0.3 min)
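
The split sizes reported above can be reproduced by hand. The sketch below assumes the validation share is rounded up and the training set takes the remainder, which is consistent with the counts in the log; `split_sizes` is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
import math

def split_sizes(n_samples, test_fraction):
    """Return (train, validation) sizes for a fractional hold-out split.

    The validation size is rounded up; the training set gets the rest.
    """
    n_val = math.ceil(n_samples * test_fraction)
    return n_samples - n_val, n_val

n_train, n_val = split_sizes(12_767_881, 0.01)
print(f"train={n_train:,}  validation={n_val:,}")
```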


In [11]:
import inspect
import ai.NN.torch_train as trainer

try:
    # Get the source code of the load_data function
    source_code = inspect.getsource(trainer.load_data)
    print("Source code of trainer.load_data:")
    print("=" * 60)
    print(source_code)
    print("=" * 60)
except TypeError:
    print("Could not get source code for trainer.load_data. It might not be a function defined in the file.")
except OSError:
    print("Could not retrieve the source file for trainer.load_data.")
except Exception as e:
    print(f"An error occurred while trying to get source code: {e}")


Source code of trainer.load_data:
def load_data(filepath: str):
    """Load the FEN,Evaluation dataset and clean it."""
    print(f"📂 Loading dataset from {filepath}...")
    
    df = pd.read_csv(
        filepath, 
        names=['FEN', 'Evaluation'], 
        skiprows=1,
        comment='#'
    )
    
    initial_count = len(df)
    df.dropna(inplace=True)
    cleaned_count = len(df)
    
    if initial_count > cleaned_count:
        print(f"🧹 Cleanup: {initial_count - cleaned_count} corrupted rows removed.")
    
    fens = df['FEN'].values
    EVAL_SCALE_FACTOR = 1000.0
    evaluations = (df['Evaluation'].astype(int).values) / EVAL_SCALE_FACTOR
    
    print(f"✅ {len(fens):,} valid positions loaded.")
    return fens, evaluations
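
The loader above passes `comment='#'` to `pd.read_csv`, which makes pandas ignore the rest of a line from the first `#` onward. If the CSV encodes forced mates as `#+3` or `#-2` (an assumption about this dataset that the notebook does not confirm), those rows would lose their evaluation, become NaN and be removed by `dropna`. That could account for some of the rows reported as corrupted. The plain-Python sketch below shows one hypothetical way to keep such rows by mapping mate scores to a capped value. `parse_eval`, `MATE_CP` and the 10,000-centipawn cap are illustrative choices, not part of the project.

```python
EVAL_SCALE_FACTOR = 1000.0   # same scaling as load_data: centipawns / 1000
MATE_CP = 10_000             # hypothetical centipawn value assigned to a forced mate

def parse_eval(raw: str) -> float:
    """Convert a raw evaluation string to a scaled float.

    Accepts plain centipawn values ("+56", "-120") and, as an assumption,
    mate scores written "#+N" / "#-N".
    """
    text = raw.strip()
    if text.startswith("#"):
        # The sign after '#' gives the direction of the mate score
        sign = -1 if text[1:2] == "-" else 1
        return sign * MATE_CP / EVAL_SCALE_FACTOR
    return int(text) / EVAL_SCALE_FACTOR

samples = ["+56", "-120", "#+3", "#-2", "0"]
print([parse_eval(s) for s in samples])
```

Whether a capped mate value helps or hurts training is an open design question; dropping those positions, as the current loader effectively does, is also a defensible choice.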



In [12]:
import os

file_path = os.path.join(PROJECT_PATH, 'ai/NN/torch_train.py')

# Read the file content
with open(file_path, 'r', encoding='utf-8') as f:
    content = f.read()

# Check that load_data takes a filepath parameter,
# e.g. `def load_data(filepath: str):`
if 'def load_data(filepath:' in content or 'def load_data(filepath)' in content:
    print(f"✅ load_data in {file_path} already accepts a filepath parameter.")
    print("No change needed.")
else:
    print("⚠️ load_data may need to be modified.")
    print("Check manually whether it accepts a file path as a parameter.")


✅ load_data in /content/drive/MyDrive/smart_chess_drive/smart-chess/ai/NN/torch_train.py already accepts a filepath parameter.
No change needed.


## 9. Creating the dataset and the dataloader

In [13]:
from torch.utils.data import DataLoader
from ai.NN.torch_train import ChessDataset

# Build the training dataset
dataset = ChessDataset(X_train, y_train)

# Build the training dataloader
train_loader = DataLoader(
    dataset,
    batch_size=CONFIG['batch_size'],
    shuffle=True,
    num_workers=CONFIG['num_workers'],
    pin_memory=(CONFIG['device'] == 'cuda')
)

# Size of the final batch (a full batch when the dataset divides evenly)
last_batch = len(dataset) % CONFIG['batch_size'] or CONFIG['batch_size']

print("=" * 60)
print("DATALOADER CONFIGURED")
print("=" * 60)
print(f"Training dataset size: {len(dataset):,} samples")
print(f"Number of batches: {len(train_loader):,}")
print(f"Batch size: {CONFIG['batch_size']}")
print(f"Last batch: {last_batch} samples")
print(f"\nValidation dataset: {len(X_val):,} samples")
print("=" * 60)



DATALOADER CONFIGURED
Training dataset size: 12,640,202 samples
Number of batches: 49,376
Batch size: 256
Last batch: 202 samples

Validation dataset: 127,679 samples
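
The 768-wide input in the configured architecture matches a one-hot encoding of 12 piece types over 64 squares (12 × 64 = 768). How `ChessDataset` actually encodes a FEN is not shown in this notebook, so the following is only a sketch under assumed conventions: the plane order in `PIECES` and the square indexing (a8 = 0, h1 = 63) are illustrative choices, and `fen_to_features` is a hypothetical name.

```python
PIECES = "PNBRQKpnbrqk"  # assumed plane order: white pieces, then black

def fen_to_features(fen: str) -> list:
    """One-hot encode the piece placement of a FEN into 768 floats."""
    features = [0.0] * (len(PIECES) * 64)
    placement = fen.split()[0]
    # FEN lists ranks from 8 down to 1, and files from a to h within a rank
    for rank_index, row in enumerate(placement.split("/")):
        file_index = 0
        for char in row:
            if char.isdigit():
                file_index += int(char)      # run of empty squares
            else:
                square = rank_index * 8 + file_index
                features[PIECES.index(char) * 64 + square] = 1.0
                file_index += 1
    return features

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
x = fen_to_features(start)
print(len(x), int(sum(x)))  # 768 features, 32 pieces set
```

This sketch ignores side to move, castling rights and en passant; a real encoder may fold those in differently.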


## 10. Creating the model

In [14]:
# Create the NNUE model and move it to the appropriate device
from ai.NN.torch_nn_evaluator import TorchNNEvaluator

model = TorchNNEvaluator(
    hidden1=CONFIG['hidden1'],
    hidden2=CONFIG['hidden2'],
    hidden3=CONFIG['hidden3'],
    dropout=CONFIG['dropout']
).to(CONFIG['device'])

# Show the architecture
print("=" * 60)
print("MODEL ARCHITECTURE (NNUE-LIKE)")
print("=" * 60)
print(model)
print("=" * 60)

# Count the parameters
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"\nTotal number of parameters: {total_params:,}")
print(f"Trainable parameters: {trainable_params:,}")
print(f"Device: {CONFIG['device']}")

# Estimate the memory footprint of the weights
param_size_mb = total_params * 4 / (1024 ** 2)  # 4 bytes per float32
print(f"Estimated model size: {param_size_mb:.2f} MB")

# Show the layer dimensions
print("\nDetailed architecture:")
print(f"  Input:  {model.l1.in_features}")
print(f"  Layer 1: {model.l1.out_features} (ReLU)")
print(f"  Layer 2: {model.l2.out_features} (ReLU)")
print(f"  Layer 3: {model.l3.out_features} (ReLU)")
print(f"  Output: {model.l4.out_features} (Linear)")


MODEL ARCHITECTURE (NNUE-LIKE)
TorchNNEvaluator(
  (l1): Linear(in_features=768, out_features=4096, bias=True)
  (l2): Linear(in_features=4096, out_features=256, bias=True)
  (l3): Linear(in_features=256, out_features=32, bias=True)
  (l4): Linear(in_features=32, out_features=1, bias=True)
  (act): ReLU()
)

Total number of parameters: 4,206,913
Trainable parameters: 4,206,913
Device: cuda
Estimated model size: 16.05 MB

Detailed architecture:
  Input:  768
  Layer 1: 4096 (ReLU)
  Layer 2: 256 (ReLU)
  Layer 3: 32 (ReLU)
  Output: 1 (Linear)
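
The parameter count above can be checked by hand: a fully connected layer with bias has `n_in * n_out + n_out` parameters. The short sketch below redoes the sum for the layer sizes printed above, without PyTorch; `linear_params` is a helper introduced only for this illustration.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Parameters of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

layer_sizes = [768, 4096, 256, 32, 1]
per_layer = [linear_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)
print(f"total={total:,}  size={total * 4 / 1024**2:.2f} MB")
```

Nearly three quarters of the weights sit in the first layer (768 → 4096), so that is where a change of width has the largest effect on model size.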


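## 11. Training the model

This step launches training through `trainer.main()`, which reloads the dataset and builds its own model, so the `train_loader` and `model` objects created in sections 9 and 10 serve only as sanity checks. In the run logged below, each epoch trains on a sample of 200,000 positions drawn from the training set rather than on all 12.6 million. Checkpoints are saved automatically to your Drive.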
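
The progress bars in the training log count 782 batches per epoch. With a batch size of 256, that figure matches 200,000 sampled positions, assuming the trainer keeps the final partial batch (an inference from the numbers, not something read from the trainer's source). A quick check, using a helper named `batches_per_epoch` for this illustration:

```python
import math

def batches_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of batches when the final, smaller batch is kept."""
    return math.ceil(n_samples / batch_size)

print(batches_per_epoch(200_000, 256))      # sampled epoch, as in the log: 782
print(batches_per_epoch(12_640_202, 256))   # full training set: 49376
```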

In [None]:
# Configure and launch the training script `ai.NN.torch_train`, pointing its paths at Colab/Drive
import os
import importlib

if DATASET_CSV is None:
    raise FileNotFoundError(f"Dataset not found in: {DATASET_DIR}")

# Import the training module
import ai.NN.torch_train as trainer

# Reload the module to pick up recent changes
importlib.reload(trainer)

# Redirect the dataset and checkpoint paths to Drive
trainer.DATASET_PATH = DATASET_CSV
trainer.CHECKPOINT_FILE = os.path.join(CKPT_DIR, os.path.basename(trainer.CHECKPOINT_FILE))
trainer.WEIGHTS_FILE = os.path.join(CKPT_DIR, os.path.basename(trainer.WEIGHTS_FILE))

# Align the trainer's parameters with CONFIG.
# Note: only the batch size and layer sizes are forwarded. CONFIG['epochs'] and
# CONFIG['learning_rate'] are not, so the trainer keeps its own values (EPOCHS=10 in this run).
try:
    trainer.BATCH_SIZE = CONFIG['batch_size']
    print(f'✅ Alignment: trainer.BATCH_SIZE = {trainer.BATCH_SIZE}')
except Exception as e:
    print('⚠️ Could not set trainer.BATCH_SIZE:', e)

# Apply the NNUE architecture
try:
    trainer.HIDDEN1 = CONFIG['hidden1']
    trainer.HIDDEN2 = CONFIG['hidden2']
    trainer.HIDDEN3 = CONFIG['hidden3']
    trainer.DROPOUT = CONFIG['dropout']
    print(f"✅ NNUE architecture applied: {trainer.HIDDEN1} → {trainer.HIDDEN2} → {trainer.HIDDEN3}")
except Exception as e:
    print('⚠️ Could not set the NNUE architecture:', e)

# Optionally adjust MAX_SAMPLES (the log shows this many positions sampled per epoch)
try:
    trainer.MAX_SAMPLES = 200_000
    print(f"✅ MAX_SAMPLES = {trainer.MAX_SAMPLES}")
except Exception as e:
    print('⚠️ Could not set trainer.MAX_SAMPLES:', e)

# Optional: reduce for a quick test (uncomment if needed)
# trainer.EPOCHS = 2
# trainer.MAX_SAMPLES = 5000

print('\nTrainer configuration:')
print(' DATASET_PATH=', trainer.DATASET_PATH)
print(' CHECKPOINT_FILE=', trainer.CHECKPOINT_FILE)
print(' WEIGHTS_FILE=', trainer.WEIGHTS_FILE)
print(' Architecture: 768 →', trainer.HIDDEN1, '→', trainer.HIDDEN2, '→', trainer.HIDDEN3, '→ 1')
print(' EPOCHS=', trainer.EPOCHS)
print(' MAX_SAMPLES=', trainer.MAX_SAMPLES)

# Launch training
trainer.main()


🖥️  Device: cuda
🚀 GPU: Tesla T4
💾 GPU Memory: 15.83 GB
✅ Alignment: trainer.BATCH_SIZE = 256
✅ NNUE architecture applied: 4096 → 256 → 32
✅ MAX_SAMPLES = 200000

Trainer configuration:
 DATASET_PATH= /content/drive/MyDrive/smart_chess_drive/chessData.csv
 CHECKPOINT_FILE= /content/drive/MyDrive/smart_chess_drive/smart-chess/ai/checkpoints/chess_model_checkpoint.pt
 WEIGHTS_FILE= /content/drive/MyDrive/smart_chess_drive/smart-chess/ai/checkpoints/chess_nn_weights.npz
 Architecture: 768 → 4096 → 256 → 32 → 1
 EPOCHS= 10
 MAX_SAMPLES= 200000
📂 Loading dataset from /content/drive/MyDrive/smart_chess_drive/chessData.csv...
🧹 Cleanup: 190154 corrupted rows removed.
✅ 12,767,881 valid positions loaded.

📊 Full dataset: 12,767,881 positions
📊 Train: 12,640,202 positions
📊 Validation: 127,679 positions
📥 Loading PyTorch checkpoint: /content/drive/MyDrive/smart_chess_drive/smart-chess/ai/checkpoints/chess_mo

Epoch 1/10:   0%|          | 0/782 [00:00<?, ?it/s]

[GRAD] epoch=1 batch=0 grad_norm=0.794061 max_abs_grad=0.554259 param_norm=2863.824657


Epoch 1/10:   1%|          | 8/782 [00:00<00:28, 26.78it/s, loss=0.4939]


[DEBUG batch 0] targets mean=-0.0036 std=0.6496; preds mean=0.0153 std=0.4761; RMSE=0.4593; corr=0.7082


Epoch 1/10:  14%|█▍        | 110/782 [00:01<00:10, 63.55it/s, loss=0.5189]

[GRAD] epoch=1 batch=100 grad_norm=3.081154 max_abs_grad=2.777257 param_norm=2863.830112


Epoch 1/10:  27%|██▋       | 210/782 [00:03<00:08, 66.46it/s, loss=0.5175]

[GRAD] epoch=1 batch=200 grad_norm=0.376880 max_abs_grad=0.227519 param_norm=2863.835622


Epoch 1/10:  39%|███▉      | 308/782 [00:05<00:07, 64.45it/s, loss=0.5207]

[GRAD] epoch=1 batch=300 grad_norm=0.543626 max_abs_grad=0.374502 param_norm=2863.839858


Epoch 1/10:  52%|█████▏    | 408/782 [00:06<00:06, 60.37it/s, loss=0.5164]

[GRAD] epoch=1 batch=400 grad_norm=0.623933 max_abs_grad=0.463305 param_norm=2863.846829


Epoch 1/10:  66%|██████▌   | 514/782 [00:08<00:04, 66.55it/s, loss=0.5153]

[GRAD] epoch=1 batch=500 grad_norm=0.769188 max_abs_grad=0.532162 param_norm=2863.853879


Epoch 1/10:  78%|███████▊  | 612/782 [00:09<00:02, 60.71it/s, loss=0.5143]

[GRAD] epoch=1 batch=600 grad_norm=1.615108 max_abs_grad=1.083929 param_norm=2863.860112


Epoch 1/10:  91%|█████████ | 711/782 [00:11<00:01, 61.11it/s, loss=0.5110]

[GRAD] epoch=1 batch=700 grad_norm=3.064898 max_abs_grad=2.595213 param_norm=2863.866442


Epoch 1/10: 100%|██████████| 782/782 [00:12<00:00, 61.76it/s, loss=0.5121]

🔍 Evaluating epoch 1...

EPOCH 1/10 - Evaluation on 5,000 positions
  RMSE:        0.5038  (baseline: 0.8191)
  MAE:         0.2284
  Improvement: +38.5% vs baseline
  Correlation: 0.7885
  Std preds:   0.6573  (target: 0.8191)
  Mean preds:  0.0389  (target: 0.0373)
  ✓  Good learning!
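
The figures in this report are consistent with a baseline equal to the standard deviation of the targets (the RMSE obtained by always predicting the mean) and an improvement computed as `1 - RMSE / baseline`. For epoch 1 that gives 1 − 0.5038 / 0.8191 ≈ 38.5%. This relationship is inferred from the logged numbers, not read from the trainer's source. A minimal pure-Python sketch of such a report, with `regression_report` as a hypothetical name and toy data that does not come from the notebook:

```python
import math

def regression_report(preds, targets):
    """RMSE, MAE, mean-predictor baseline and relative improvement."""
    n = len(targets)
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets)) / n)
    mae = sum(abs(p - t) for p, t in zip(preds, targets)) / n
    mean_t = sum(targets) / n
    # Baseline: RMSE of always predicting the target mean (population std)
    baseline = math.sqrt(sum((t - mean_t) ** 2 for t in targets) / n)
    improvement = 1.0 - rmse / baseline
    return {"rmse": rmse, "mae": mae, "baseline": baseline, "improvement": improvement}

# Toy data for illustration only
targets = [0.0, 1.0, -1.0, 2.0]
preds = [0.1, 0.8, -0.7, 1.5]
report = regression_report(preds, targets)
print({k: round(v, 4) for k, v in report.items()})

# Check against the epoch 1 figures in the log
print(f"{1 - 0.5038 / 0.8191:+.1%}")
```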


[Epoch 2] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000100


Epoch 2/10:   1%|          | 6/782 [00:00<00:14, 52.73it/s, loss=0.5331]

[GRAD] epoch=2 batch=0 grad_norm=3.397603 max_abs_grad=2.924154 param_norm=2863.871161


Epoch 2/10:  14%|█▍        | 108/782 [00:01<00:09, 67.90it/s, loss=0.5154]

[GRAD] epoch=2 batch=100 grad_norm=2.124271 max_abs_grad=1.980428 param_norm=2863.874466


Epoch 2/10:  27%|██▋       | 213/782 [00:03<00:09, 59.30it/s, loss=0.5045]

[GRAD] epoch=2 batch=200 grad_norm=2.011799 max_abs_grad=1.688606 param_norm=2863.878319


Epoch 2/10:  39%|███▉      | 307/782 [00:04<00:07, 65.15it/s, loss=0.5115]

[GRAD] epoch=2 batch=300 grad_norm=2.140440 max_abs_grad=1.900238 param_norm=2863.884574


Epoch 2/10:  52%|█████▏    | 409/782 [00:06<00:05, 66.79it/s, loss=0.5156]

[GRAD] epoch=2 batch=400 grad_norm=1.766196 max_abs_grad=1.498516 param_norm=2863.890151


Epoch 2/10:  65%|██████▌   | 509/782 [00:08<00:04, 62.19it/s, loss=0.5149]

[GRAD] epoch=2 batch=500 grad_norm=0.820678 max_abs_grad=0.614220 param_norm=2863.896641


Epoch 2/10:  78%|███████▊  | 607/782 [00:09<00:02, 64.30it/s, loss=0.5130]

[GRAD] epoch=2 batch=600 grad_norm=2.981407 max_abs_grad=2.730450 param_norm=2863.905316


Epoch 2/10:  91%|█████████ | 709/782 [00:11<00:01, 65.72it/s, loss=0.5130]

[GRAD] epoch=2 batch=700 grad_norm=0.958292 max_abs_grad=0.758375 param_norm=2863.909344


Epoch 2/10: 100%|██████████| 782/782 [00:12<00:00, 64.03it/s, loss=0.5142]

🔍 Evaluating epoch 2...

EPOCH 2/10 - Evaluation on 5,000 positions
  RMSE:        0.5521  (baseline: 0.8517)
  MAE:         0.2344
  Improvement: +35.2% vs baseline
  Correlation: 0.7616
  Std preds:   0.6391  (target: 0.8517)
  Mean preds:  0.0310  (target: 0.0357)
  ✓  Good learning!


[Epoch 3] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000100


Epoch 3/10:   1%|          | 6/782 [00:00<00:14, 55.42it/s, loss=0.5025]

[GRAD] epoch=3 batch=0 grad_norm=0.724918 max_abs_grad=0.409585 param_norm=2863.913266


Epoch 3/10:  14%|█▍        | 112/782 [00:01<00:10, 62.45it/s, loss=0.5031]

[GRAD] epoch=3 batch=100 grad_norm=0.974419 max_abs_grad=0.932668 param_norm=2863.917484


Epoch 3/10:  27%|██▋       | 211/782 [00:03<00:09, 62.95it/s, loss=0.5003]

[GRAD] epoch=3 batch=200 grad_norm=3.206643 max_abs_grad=2.662209 param_norm=2863.922192


Epoch 3/10:  40%|████      | 313/782 [00:04<00:06, 68.70it/s, loss=0.5022]

[GRAD] epoch=3 batch=300 grad_norm=1.830021 max_abs_grad=1.750971 param_norm=2863.928772


Epoch 3/10:  52%|█████▏    | 408/782 [00:06<00:05, 63.99it/s, loss=0.5056]

[GRAD] epoch=3 batch=400 grad_norm=1.296922 max_abs_grad=1.007384 param_norm=2863.933113


Epoch 3/10:  65%|██████▌   | 512/782 [00:08<00:04, 60.16it/s, loss=0.5044]

[GRAD] epoch=3 batch=500 grad_norm=0.955218 max_abs_grad=0.883657 param_norm=2863.938655


Epoch 3/10:  78%|███████▊  | 612/782 [00:09<00:02, 64.55it/s, loss=0.5092]

[GRAD] epoch=3 batch=600 grad_norm=1.232452 max_abs_grad=1.044017 param_norm=2863.945574


Epoch 3/10:  91%|█████████ | 711/782 [00:11<00:01, 66.01it/s, loss=0.5094]

[GRAD] epoch=3 batch=700 grad_norm=2.995108 max_abs_grad=2.796591 param_norm=2863.951818


Epoch 3/10: 100%|██████████| 782/782 [00:12<00:00, 63.46it/s, loss=0.5092]

🔍 Evaluating epoch 3...

EPOCH 3/10 - Evaluation on 5,000 positions
  RMSE:        0.5091  (baseline: 0.8303)
  MAE:         0.2336
  Improvement: +38.7% vs baseline
  Correlation: 0.7901
  Std preds:   0.6493  (target: 0.8303)
  Mean preds:  0.0247  (target: 0.0278)
  ✓  Good learning!


[Epoch 4] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000100


Epoch 4/10:   1%|          | 5/782 [00:00<00:17, 45.53it/s, loss=0.5161]

[GRAD] epoch=4 batch=0 grad_norm=3.540634 max_abs_grad=3.139233 param_norm=2863.956739


Epoch 4/10:  14%|█▍        | 110/782 [00:01<00:10, 63.07it/s, loss=0.5190]

[GRAD] epoch=4 batch=100 grad_norm=1.153062 max_abs_grad=1.044369 param_norm=2863.963500


Epoch 4/10:  27%|██▋       | 210/782 [00:03<00:08, 67.21it/s, loss=0.5161]

[GRAD] epoch=4 batch=200 grad_norm=2.218287 max_abs_grad=1.797508 param_norm=2863.968250


Epoch 4/10:  40%|███▉      | 309/782 [00:04<00:07, 66.65it/s, loss=0.5120]

[GRAD] epoch=4 batch=300 grad_norm=4.677987 max_abs_grad=4.343200 param_norm=2863.973098


Epoch 4/10:  52%|█████▏    | 407/782 [00:06<00:06, 61.72it/s, loss=0.5084]

[GRAD] epoch=4 batch=400 grad_norm=1.901593 max_abs_grad=1.740224 param_norm=2863.977814


Epoch 4/10:  65%|██████▌   | 512/782 [00:08<00:04, 62.42it/s, loss=0.5119]

[GRAD] epoch=4 batch=500 grad_norm=4.141823 max_abs_grad=3.356983 param_norm=2863.982841


Epoch 4/10:  78%|███████▊  | 611/782 [00:09<00:02, 65.15it/s, loss=0.5131]

[GRAD] epoch=4 batch=600 grad_norm=1.191742 max_abs_grad=0.718456 param_norm=2863.987194


Epoch 4/10:  91%|█████████ | 711/782 [00:11<00:01, 63.81it/s, loss=0.5136]

[GRAD] epoch=4 batch=700 grad_norm=0.703314 max_abs_grad=0.542745 param_norm=2863.992914


Epoch 4/10: 100%|██████████| 782/782 [00:12<00:00, 63.28it/s, loss=0.5129]

🔍 Evaluating epoch 4...

EPOCH 4/10 - Evaluation on 5,000 positions
  RMSE:        0.5427  (baseline: 0.7941)
  MAE:         0.2336
  Improvement: +31.7% vs baseline
  Correlation: 0.7337
  Std preds:   0.6384  (target: 0.7941)
  Mean preds:  0.0221  (target: 0.0390)
  ✓  Good learning!


[Epoch 5] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000100


Epoch 5/10:   1%|          | 5/782 [00:00<00:15, 49.25it/s, loss=0.5192]

[GRAD] epoch=5 batch=0 grad_norm=0.672186 max_abs_grad=0.622433 param_norm=2863.995371


Epoch 5/10:  14%|█▍        | 109/782 [00:01<00:10, 65.22it/s, loss=0.5146]

[GRAD] epoch=5 batch=100 grad_norm=1.993751 max_abs_grad=1.456071 param_norm=2864.001399


Epoch 5/10:  27%|██▋       | 208/782 [00:03<00:08, 68.23it/s, loss=0.5136]

[GRAD] epoch=5 batch=200 grad_norm=2.832227 max_abs_grad=2.327103 param_norm=2864.006708


Epoch 5/10:  39%|███▉      | 307/782 [00:04<00:07, 63.76it/s, loss=0.5188]

[GRAD] epoch=5 batch=300 grad_norm=2.543439 max_abs_grad=2.347676 param_norm=2864.011389


Epoch 5/10:  53%|█████▎    | 413/782 [00:06<00:05, 62.02it/s, loss=0.5172]

[GRAD] epoch=5 batch=400 grad_norm=2.471264 max_abs_grad=2.067453 param_norm=2864.017426


Epoch 5/10:  65%|██████▍   | 507/782 [00:08<00:04, 65.75it/s, loss=0.5195]

[GRAD] epoch=5 batch=500 grad_norm=0.648273 max_abs_grad=0.403222 param_norm=2864.022081


Epoch 5/10:  78%|███████▊  | 607/782 [00:09<00:02, 64.37it/s, loss=0.5188]

[GRAD] epoch=5 batch=600 grad_norm=1.967470 max_abs_grad=1.173296 param_norm=2864.027917


Epoch 5/10:  91%|█████████ | 708/782 [00:11<00:01, 69.44it/s, loss=0.5195]

[GRAD] epoch=5 batch=700 grad_norm=0.616067 max_abs_grad=0.441355 param_norm=2864.032713


Epoch 5/10: 100%|██████████| 782/782 [00:12<00:00, 64.40it/s, loss=0.5209]

🔍 Evaluating epoch 5...

EPOCH 5/10 - Evaluation on 5,000 positions
  RMSE:        0.5137  (baseline: 0.8240)
  MAE:         0.2320
  Improvement: +37.7% vs baseline
  Correlation: 0.7824
  Std preds:   0.6237  (target: 0.8240)
  Mean preds:  0.0369  (target: 0.0434)
  ✓  Good learning!


[Epoch 6] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000050


Epoch 6/10:   1%|          | 6/782 [00:00<00:13, 57.04it/s, loss=0.5161]

[GRAD] epoch=6 batch=0 grad_norm=2.538424 max_abs_grad=2.215186 param_norm=2864.035425


Epoch 6/10:  14%|█▍        | 111/782 [00:01<00:10, 63.45it/s, loss=0.5136]

[GRAD] epoch=6 batch=100 grad_norm=0.825253 max_abs_grad=0.747685 param_norm=2864.038068


Epoch 6/10:  27%|██▋       | 211/782 [00:03<00:09, 63.36it/s, loss=0.5183]

[GRAD] epoch=6 batch=200 grad_norm=0.441753 max_abs_grad=0.272720 param_norm=2864.040808


Epoch 6/10:  39%|███▉      | 307/782 [00:05<00:08, 57.00it/s, loss=0.5160]

[GRAD] epoch=6 batch=300 grad_norm=1.174782 max_abs_grad=1.109916 param_norm=2864.043741


Epoch 6/10:  52%|█████▏    | 410/782 [00:06<00:05, 62.91it/s, loss=0.5161]

[GRAD] epoch=6 batch=400 grad_norm=1.593721 max_abs_grad=1.403701 param_norm=2864.046508


Epoch 6/10:  65%|██████▌   | 509/782 [00:08<00:04, 63.42it/s, loss=0.5182]

[GRAD] epoch=6 batch=500 grad_norm=0.797848 max_abs_grad=0.710464 param_norm=2864.049305


Epoch 6/10:  78%|███████▊  | 609/782 [00:09<00:02, 63.65it/s, loss=0.5182]

[GRAD] epoch=6 batch=600 grad_norm=1.675811 max_abs_grad=1.544051 param_norm=2864.052238


Epoch 6/10:  91%|█████████ | 708/782 [00:11<00:01, 64.40it/s, loss=0.5171]

[GRAD] epoch=6 batch=700 grad_norm=0.193486 max_abs_grad=0.148709 param_norm=2864.054996


Epoch 6/10: 100%|██████████| 782/782 [00:12<00:00, 62.40it/s, loss=0.5161]

🔍 Evaluating epoch 6...

EPOCH 6/10 - Evaluation on 5,000 positions
  RMSE:        0.5419  (baseline: 0.8450)
  MAE:         0.2381
  Improvement: +35.9% vs baseline
  Correlation: 0.7674
  Std preds:   0.6406  (target: 0.8450)
  Mean preds:  0.0457  (target: 0.0385)
  ✓  Good learning!


[Epoch 7] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000050


Epoch 7/10:   1%|          | 6/782 [00:00<00:15, 51.51it/s, loss=0.4311]

[GRAD] epoch=7 batch=0 grad_norm=1.538235 max_abs_grad=1.455298 param_norm=2864.057336


Epoch 7/10:  14%|█▍        | 111/782 [00:01<00:10, 63.39it/s, loss=0.5000]

[GRAD] epoch=7 batch=100 grad_norm=0.468247 max_abs_grad=0.375585 param_norm=2864.060055


Epoch 7/10:  27%|██▋       | 211/782 [00:03<00:08, 63.82it/s, loss=0.5095]

[GRAD] epoch=7 batch=200 grad_norm=0.414182 max_abs_grad=0.258776 param_norm=2864.061881


Epoch 7/10:  40%|███▉      | 309/782 [00:04<00:07, 62.57it/s, loss=0.5080]

[GRAD] epoch=7 batch=300 grad_norm=1.608223 max_abs_grad=1.245239 param_norm=2864.064404


Epoch 7/10:  52%|█████▏    | 410/782 [00:06<00:05, 66.26it/s, loss=0.5079]

[GRAD] epoch=7 batch=400 grad_norm=0.427560 max_abs_grad=0.298484 param_norm=2864.066241


Epoch 7/10:  66%|██████▌   | 513/782 [00:08<00:04, 59.57it/s, loss=0.5097]

[GRAD] epoch=7 batch=500 grad_norm=0.670480 max_abs_grad=0.560051 param_norm=2864.069483


Epoch 7/10:  78%|███████▊  | 612/782 [00:09<00:02, 65.74it/s, loss=0.5096]

[GRAD] epoch=7 batch=600 grad_norm=1.255506 max_abs_grad=1.117853 param_norm=2864.072348


Epoch 7/10:  91%|█████████ | 711/782 [00:11<00:01, 63.63it/s, loss=0.5127]

[GRAD] epoch=7 batch=700 grad_norm=0.905500 max_abs_grad=0.677804 param_norm=2864.074660


Epoch 7/10: 100%|██████████| 782/782 [00:12<00:00, 61.83it/s, loss=0.5135]



🔍 Evaluating epoch 7...

EPOCH 7/10 - Evaluation on 5,000 positions
  RMSE:        0.5600  (baseline: 0.8593)
  MAE:         0.2400
  Improvement: +34.8% vs baseline
  Correlation: 0.7584
  Std preds:   0.6459  (target: 0.8593)
  Mean preds:  0.0514  (target: 0.0523)
  ✓  Good learning progress!


[Epoch 8] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000050


Epoch 8/10:   1%|          | 5/782 [00:00<00:16, 47.54it/s, loss=0.5032]

[GRAD] epoch=8 batch=0 grad_norm=1.191977 max_abs_grad=1.041030 param_norm=2864.077749


Epoch 8/10:  14%|█▍        | 111/782 [00:01<00:11, 57.82it/s, loss=0.5068]

[GRAD] epoch=8 batch=100 grad_norm=2.261725 max_abs_grad=2.032706 param_norm=2864.081146


Epoch 8/10:  27%|██▋       | 213/782 [00:03<00:09, 60.76it/s, loss=0.5011]

[GRAD] epoch=8 batch=200 grad_norm=1.551285 max_abs_grad=1.431441 param_norm=2864.083482


Epoch 8/10:  39%|███▉      | 308/782 [00:05<00:06, 67.96it/s, loss=0.5055]

[GRAD] epoch=8 batch=300 grad_norm=1.930723 max_abs_grad=1.738414 param_norm=2864.085359


Epoch 8/10:  53%|█████▎    | 413/782 [00:06<00:05, 62.62it/s, loss=0.5017]

[GRAD] epoch=8 batch=400 grad_norm=1.480405 max_abs_grad=1.327017 param_norm=2864.088476


Epoch 8/10:  65%|██████▍   | 507/782 [00:08<00:04, 64.30it/s, loss=0.5044]

[GRAD] epoch=8 batch=500 grad_norm=1.844896 max_abs_grad=1.531610 param_norm=2864.090890


Epoch 8/10:  78%|███████▊  | 608/782 [00:09<00:02, 63.89it/s, loss=0.5063]

[GRAD] epoch=8 batch=600 grad_norm=0.287045 max_abs_grad=0.180724 param_norm=2864.093629


Epoch 8/10:  90%|████████▉ | 702/782 [00:11<00:01, 65.62it/s, loss=0.5087]

[GRAD] epoch=8 batch=700 grad_norm=1.462516 max_abs_grad=1.024278 param_norm=2864.096632


Epoch 8/10: 100%|██████████| 782/782 [00:12<00:00, 62.42it/s, loss=0.5128]



🔍 Evaluating epoch 8...

EPOCH 8/10 - Evaluation on 5,000 positions
  RMSE:        0.4696  (baseline: 0.7787)
  MAE:         0.2220
  Improvement: +39.7% vs baseline
  Correlation: 0.7981
  Std preds:   0.6145  (target: 0.7787)
  Mean preds:  0.0659  (target: 0.0487)
  ✓  Good learning progress!


[Epoch 9] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000050


Epoch 9/10:   1%|          | 6/782 [00:00<00:13, 55.69it/s, loss=0.4693]

[GRAD] epoch=9 batch=0 grad_norm=1.993881 max_abs_grad=1.670438 param_norm=2864.098259


Epoch 9/10:  14%|█▍        | 111/782 [00:01<00:11, 60.55it/s, loss=0.4919]

[GRAD] epoch=9 batch=100 grad_norm=0.342615 max_abs_grad=0.304939 param_norm=2864.100143


Epoch 9/10:  27%|██▋       | 209/782 [00:03<00:08, 64.92it/s, loss=0.5039]

[GRAD] epoch=9 batch=200 grad_norm=1.086846 max_abs_grad=0.936283 param_norm=2864.103204


Epoch 9/10:  40%|███▉      | 311/782 [00:04<00:06, 67.93it/s, loss=0.5092]

[GRAD] epoch=9 batch=300 grad_norm=0.438400 max_abs_grad=0.345720 param_norm=2864.105041


Epoch 9/10:  53%|█████▎    | 411/782 [00:06<00:05, 65.20it/s, loss=0.5046]

[GRAD] epoch=9 batch=400 grad_norm=1.368421 max_abs_grad=1.015411 param_norm=2864.107768


Epoch 9/10:  65%|██████▌   | 510/782 [00:07<00:04, 65.98it/s, loss=0.5070]

[GRAD] epoch=9 batch=500 grad_norm=1.921076 max_abs_grad=1.747695 param_norm=2864.110425


Epoch 9/10:  78%|███████▊  | 608/782 [00:09<00:02, 63.70it/s, loss=0.5073]

[GRAD] epoch=9 batch=600 grad_norm=0.633907 max_abs_grad=0.601677 param_norm=2864.112481


Epoch 9/10:  91%|█████████ | 713/782 [00:11<00:01, 62.15it/s, loss=0.5068]

[GRAD] epoch=9 batch=700 grad_norm=1.746314 max_abs_grad=1.338913 param_norm=2864.115317


Epoch 9/10: 100%|██████████| 782/782 [00:12<00:00, 64.15it/s, loss=0.5066]



🔍 Evaluating epoch 9...

EPOCH 9/10 - Evaluation on 5,000 positions
  RMSE:        0.5202  (baseline: 0.8177)
  MAE:         0.2259
  Improvement: +36.4% vs baseline
  Correlation: 0.7717
  Std preds:   0.6414  (target: 0.8177)
  Mean preds:  0.0361  (target: 0.0422)
  ✓  Good learning progress!


[Epoch 10] 🎲 Sampling: 200,000 positions out of 12,640,202
➡️ Current learning rate: 0.000050


Epoch 10/10:   1%|          | 6/782 [00:00<00:14, 53.68it/s, loss=0.5406]

[GRAD] epoch=10 batch=0 grad_norm=3.880266 max_abs_grad=3.482190 param_norm=2864.117445


Epoch 10/10:  14%|█▍        | 111/782 [00:01<00:10, 62.71it/s, loss=0.5265]

[GRAD] epoch=10 batch=100 grad_norm=1.666374 max_abs_grad=1.355342 param_norm=2864.119222


Epoch 10/10:  27%|██▋       | 209/782 [00:03<00:10, 52.38it/s, loss=0.5187]

[GRAD] epoch=10 batch=200 grad_norm=1.241821 max_abs_grad=1.018399 param_norm=2864.122207


Epoch 10/10:  29%|██▊       | 223/782 [00:03<00:09, 58.72it/s, loss=0.5186]
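
The per-epoch evaluation blocks in the log above report RMSE, MAE, an improvement percentage, a correlation, and the spread of the predictions. In every block the printed baseline equals the standard deviation of the targets, which suggests the baseline is the RMSE of always predicting the mean target. The sketch below recomputes these quantities under that assumption. The function names `improvement_pct` and `evaluation_summary` are hypothetical, not taken from the project, and the code uses plain Python rather than the notebook's arrays so that it runs without any dependency.

```python
import math


def improvement_pct(rmse, baseline):
    """Relative RMSE reduction versus the baseline, in percent."""
    return 100.0 * (1.0 - rmse / baseline)


def evaluation_summary(y_true, y_pred):
    """Recompute the logged metrics from two equal-length sequences.

    Assumption: the baseline is the population standard deviation of the
    targets, i.e. the RMSE of a constant predictor that outputs their mean.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n

    rmse = math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)
    mae = sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n

    std_true = math.sqrt(sum((t - mean_true) ** 2 for t in y_true) / n)
    std_pred = math.sqrt(sum((p - mean_pred) ** 2 for p in y_pred) / n)

    # Pearson correlation; undefined when either side is constant.
    cov = sum((t - mean_true) * (p - mean_pred)
              for t, p in zip(y_true, y_pred)) / n
    denom = std_true * std_pred
    correlation = cov / denom if denom > 0 else float("nan")

    return {
        "rmse": rmse,
        "mae": mae,
        "baseline": std_true,
        "improvement_pct": (improvement_pct(rmse, std_true)
                            if std_true > 0 else float("nan")),
        "correlation": correlation,
        "std_pred": std_pred,
        "mean_pred": mean_pred,
    }
```

Applied to the logged values, `improvement_pct(0.5419, 0.8450)` rounds to 35.9 and `improvement_pct(0.4696, 0.7787)` rounds to 39.7, matching epochs 6 and 8. Because the baseline moves from one epoch to the next (0.7787 to 0.8593), the 5,000 evaluation positions appear to be resampled each time, so RMSE values from different epochs are not directly comparable; the improvement percentage is the steadier indicator.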

## 14. Testing the model on random positions

In [None]:
# np and torch are used below; re-importing is harmless if earlier cells already did
import numpy as np
import torch

# Switch the model to evaluation mode
model.eval()

# Test on a few random positions FROM THE VALIDATION SET
num_tests = 10
test_indices = np.random.choice(len(X_val), num_tests, replace=False)

print("=" * 60)
print(f"TEST ON {num_tests} RANDOM POSITIONS (VALIDATION SET)")
print("=" * 60)

errors = []

with torch.no_grad():
    for i, idx in enumerate(test_indices, 1):
        # Build a temporary one-item dataset to encode the position
        temp_dataset = ChessDataset([X_val[idx]], [y_val[idx]])
        x, _ = temp_dataset[0]
        # Add a batch dimension and move the input to the configured device
        x = x.unsqueeze(0).to(CONFIG['device'])

        y_true = y_val[idx]
        y_pred = model(x).cpu().numpy()[0, 0]
        error = abs(y_true - y_pred)
        errors.append(error)

        print(f"\nPosition {i}:")
        print(f"  True evaluation:   {y_true:+8.4f}")
        print(f"  Model prediction:  {y_pred:+8.4f}")
        print(f"  Absolute error:    {error:8.4f}")

        # Visual indicator of prediction quality
        if error < 0.1:
            print("  Quality: ✅ Excellent")
        elif error < 0.3:
            print("  Quality: ✓ Good")
        elif error < 0.5:
            print("  Quality: ⚠ Average")
        else:
            print("  Quality: ❌ Poor")

print("\n" + "=" * 60)
print("TEST STATISTICS")
print("=" * 60)
print(f"Mean error:   {np.mean(errors):.4f}")
print(f"Median error: {np.median(errors):.4f}")
print(f"Min error:    {np.min(errors):.4f}")
print(f"Max error:    {np.max(errors):.4f}")
print(f"Std dev:      {np.std(errors):.4f}")
print("=" * 60)
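
With only 10 sampled positions, the statistics printed by this cell will vary noticeably from run to run. One possible complement is to tally the quality buckets over a larger list of absolute errors. The sketch below factors the cell's thresholds (0.1, 0.3, 0.5) into a small helper; the names `quality_label` and `label_counts` and the English label strings are illustrative assumptions, not part of the project.

```python
from bisect import bisect_right

# Upper bounds of the first three buckets, mirroring the if/elif chain above.
THRESHOLDS = (0.1, 0.3, 0.5)
# Illustrative English labels, one more than there are thresholds.
LABELS = ("excellent", "good", "average", "poor")


def quality_label(abs_error):
    """Map an absolute error to its quality bucket.

    bisect_right reproduces the strict '<' comparisons of the cell:
    an error exactly equal to a threshold falls into the next bucket.
    """
    return LABELS[bisect_right(THRESHOLDS, abs_error)]


def label_counts(errors):
    """Count how many absolute errors land in each bucket."""
    counts = {label: 0 for label in LABELS}
    for e in errors:
        counts[quality_label(e)] += 1
    return counts
```

For example, `label_counts(errors)` could be called on the `errors` list built in the cell, ideally after raising `num_tests`, to see how the predictions are distributed across the four buckets rather than reading them position by position.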
