# GENERAL: Strategic Game AI Training

GPU-accelerated training using AlphaZero-style reinforcement learning, with optimizations for both Colab (T4) and local (RTX) environments.

This notebook uses the optimized `GenGameAI` codebase with:
- **Non-blocking Inference** (Asyncio + ThreadPool)
- **Efficient Data Loading** (DataLoader with pinned memory)
- **Auto-Configuration** (Detects GPU VRAM and CPU cores)
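
To illustrate the first bullet, here is a minimal sketch of non-blocking inference with `asyncio` and a thread pool. It is not the `GenGameAI` implementation: `fake_model_forward`, `infer`, and `run_demo` are hypothetical names, and the sleep stands in for a blocking GPU forward pass.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def fake_model_forward(batch):
    """Stand-in for a blocking forward pass (hypothetical, not GenGameAI code)."""
    time.sleep(0.05)  # simulate the time the device spends on the batch
    return [x * 2 for x in batch]


async def infer(executor, batch):
    # Hand the blocking call to a worker thread so the event loop stays free
    # to advance other self-play games while this batch is evaluated.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, fake_model_forward, batch)


async def run_demo():
    with ThreadPoolExecutor(max_workers=2) as executor:
        # Both requests are in flight at once; gather keeps the input order.
        return await asyncio.gather(
            infer(executor, [1, 2]),
            infer(executor, [3, 4]),
        )
```

In a notebook cell, call it with `await run_demo()`; in a plain script, use `asyncio.run(run_demo())`.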

## 1. Setup

In [None]:
!git clone https://github.com/Tanish-2006/Generals.git
%cd Generals

In [None]:
%pip install torch numpy

In [1]:
%cd ..

e:\Projects\GenGameAI


## 2. Configuration & Hardware Detection

In [2]:
import sys
sys.path.insert(0, '.')
import torch
from config import TRAINING, NETWORK

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("WARNING: No GPU detected.")

print("\nAuto-Detected Configuration:")
print(f"  Workers: {TRAINING.num_workers}")
print(f"  Batch Size: {TRAINING.batch_size}")
print(f"  Games/Iter: {TRAINING.games_per_iter}")

GPU: NVIDIA GeForce RTX 4060 Laptop GPU
VRAM: 8.59 GB

Auto-Detected Configuration:
  Workers: 0
  Batch Size: 64
  Games/Iter: 16
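
The values above come from the repository's `config` module, whose detection logic is not shown in this notebook. As a rough illustration of how such auto-configuration might work, the sketch below maps VRAM and CPU cores to a batch size and worker count. The function `suggest_config` and every threshold in it are assumptions made for this example, not the project's actual rules.

```python
import os


def suggest_config(vram_gb, cpu_cores, on_windows):
    """Hypothetical heuristic; thresholds are illustrative, not GenGameAI's."""
    # Larger cards can hold larger training batches.
    if vram_gb >= 12:
        batch_size = 128
    elif vram_gb >= 6:
        batch_size = 64
    else:
        batch_size = 32
    # DataLoader worker processes are often disabled on Windows, where
    # process spawning makes them expensive; elsewhere leave one core free.
    num_workers = 0 if on_windows else max(0, min(4, cpu_cores - 1))
    return {"batch_size": batch_size, "num_workers": num_workers}


print(suggest_config(vram_gb=8.59, cpu_cores=os.cpu_count() or 1, on_windows=True))
```

Keeping the heuristic a pure function of its inputs makes it easy to check against different hardware profiles without a GPU present.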


## 3. Upload Previous Model (Optional)
If you have a `model_latest.pth` or `model_old.pth` from a previous run, upload it here to resume training.

In [None]:
# Colab only: google.colab is not available in a local Jupyter session.
from google.colab import files
from pathlib import Path

CHECKPOINT_DIR = Path("data/checkpoints")
CHECKPOINT_DIR.mkdir(parents=True, exist_ok=True)

print("Upload model_latest.pth or model_old.pth if you have one:")
uploaded = files.upload()  # dict mapping filename -> file contents (bytes)

# Write each uploaded file into the checkpoint directory.
for filename, content in uploaded.items():
    target_path = CHECKPOINT_DIR / filename
    target_path.write_bytes(content)
    print(f"Saved {filename} to {target_path}")

## 4. Training Loop
Runs the main training loop. If checkpoints or replay batches from an earlier run exist, training resumes from them.
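
One way a loop like this could decide where to resume is to inspect the replay files already on disk. The sketch below assumes replay batches are named `batch_NNNN.npz`, matching the paths in the trainer's log; the helper `next_iteration` is hypothetical and is not taken from `main.py`.

```python
import re
import tempfile
from pathlib import Path


def next_iteration(replay_dir):
    """Return the iteration to run next, based on saved replay files.

    Assumes files are named batch_NNNN.npz; this helper is hypothetical.
    """
    pattern = re.compile(r"batch_(\d+)\.npz$")
    indices = [
        int(m.group(1))
        for p in Path(replay_dir).glob("batch_*.npz")
        if (m := pattern.match(p.name))
    ]
    # With no saved batches, start from iteration 1.
    return max(indices, default=0) + 1


# Demo against a throwaway directory holding one finished batch.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "batch_0001.npz").touch()
    print(next_iteration(tmp))  # 2
```

Using the highest index rather than the file count keeps the result stable if an older batch has been deleted.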

In [None]:
from main import main_loop

await main_loop(max_iterations=20)

[main] Found 1 replay batches. Resuming from iteration 2.
[Trainer] Using device: cuda
[Trainer] AMP enabled for faster training
[Trainer] JIT compilation skipped on Windows (Triton not supported)
[main] Resuming with Best Model (model_old.pth)
[main] Successfully loaded model from E:\Projects\GenGameAI\data\checkpoints\model_old.pth
[InferenceServer] Started on cuda:0 with batch_size=16

[main] ITERATION 2 - self-play 16 games
[main] Generating 16 games concurrently...
[ReplayBuffer] Saved: E:\Projects\GenGameAI\data\replay\batch_0002.npz
[main] Loading replay data to train
[ReplayBuffer] Loaded 2 batches
[main] Training for 3 epochs...

[Trainer] Training on 3560 samples

Epoch 1/3
  Batch   0/55 | Loss: 9.9759 | Policy: 8.9100 | Value: 1.0659
  Batch  20/55 | Loss: 8.9699 | Policy: 8.3899 | Value: 0.5799
  Batch  40/55 | Loss: 8.6113 | Policy: 8.0985 | Value: 0.5128

Epoch 1 Average Loss: 9.0051

Epoch 2/3
  Batch   0/55 | Loss: 7.6151 | Policy: 7.2907 | Value: 0.3244
  Batch  20/55
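
In the log above, each total is the sum of its policy and value terms (for example, 8.9100 + 1.0659 = 9.9759). That is consistent with the usual AlphaZero-style objective of policy cross-entropy plus value squared error. The sketch below computes that objective for a single sample in plain Python; it assumes the standard formulation and is not the trainer's actual code, which may weight or regularize the terms differently.

```python
import math


def alphazero_loss(policy_probs, target_policy, value, outcome):
    """Policy cross-entropy plus value squared error for one sample."""
    # Cross-entropy between the search-derived target and the network policy.
    policy_loss = -sum(
        t * math.log(p) for t, p in zip(target_policy, policy_probs) if t > 0
    )
    # Squared error between the predicted value and the game outcome.
    value_loss = (value - outcome) ** 2
    return policy_loss + value_loss, policy_loss, value_loss


total_loss, policy_loss, value_loss = alphazero_loss(
    policy_probs=[0.7, 0.2, 0.1],   # network output after softmax
    target_policy=[1.0, 0.0, 0.0],  # e.g. normalized visit counts from search
    value=0.5,                      # predicted outcome
    outcome=1.0,                    # actual game result
)
print(f"Loss: {total_loss:.4f} | Policy: {policy_loss:.4f} | Value: {value_loss:.4f}")
```

A policy term far larger than the value term, as in the early batches here, is typical when the action space is large and the policy head is still close to uniform.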

## 5. Download Trained Model
Download the latest model checkpoint.

In [None]:
# Colab only: google.colab is not available in a local Jupyter session.
from google.colab import files
from pathlib import Path

model_path = Path("data/checkpoints/model_latest.pth")
if model_path.exists():
    files.download(str(model_path))
    print(f"Downloaded: {model_path}")
else:
    print("No model found.")

## 6. Monitoring (Optional)

In [None]:
from pathlib import Path

import torch  # imported here so the cell also runs on its own

CHECKPOINT_DIR = Path("data/checkpoints")
REPLAY_DIR = Path("data/replay")

def show_training_status():
    print("=" * 50)
    print("TRAINING STATUS")
    print("=" * 50)
    
    if CHECKPOINT_DIR.exists():
        checkpoints = list(CHECKPOINT_DIR.glob("*.pth"))
        print(f"\nCheckpoints: {len(checkpoints)}")
        for cp in checkpoints:
            size_mb = cp.stat().st_size / (1024 * 1024)
            print(f"  - {cp.name}: {size_mb:.2f} MB")
    
    if REPLAY_DIR.exists():
        replays = list(REPLAY_DIR.glob("*.npz"))
        print(f"\nReplay batches: {len(replays)}")
        if replays:
            total_size = sum(r.stat().st_size for r in replays) / (1024 * 1024)
            print(f"  Total size: {total_size:.2f} MB")
    
    if torch.cuda.is_available():
        print("\nGPU Memory:")
        print(f"  Allocated: {torch.cuda.memory_allocated(0) / 1e9:.2f} GB")
        print(f"  Reserved: {torch.cuda.memory_reserved(0) / 1e9:.2f} GB")

show_training_status()