# 🧩 PoT Sudoku Benchmark - Master-Level Sudoku Solver

Train a **master-level Sudoku AI** using the PoT (Pointer-over-Heads Transformer) architecture.

This notebook replicates the [HRM paper's Sudoku demo](https://github.com/sapientinc/HRM) using PoT components:
- **PoHStack + IterRefiner** for iterative reasoning
- **HRMPointerController** for hierarchical head routing
- **QHaltingController** for adaptive computation
- **PuzzleEmbedding** for per-instance specialization

## Expected Results
| Model | Parameters | Grid Accuracy | Runtime |
|-------|------------|---------------|---------|
| HRM (paper) | 27M | ~100% | 10 hours |
| **PoT (ours)** | ~27M | 95%+ | 10 hours |
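
The table reports grid accuracy without defining it. A minimal sketch, assuming "grid accuracy" means the fraction of puzzles whose 81 cells are all predicted correctly (as opposed to per-cell accuracy). The function name `cell_and_grid_accuracy` and the toy data are illustrative and are not taken from the PoT repository:

```python
def cell_and_grid_accuracy(preds, targets):
    """Return (cell accuracy, grid accuracy) for lists of flattened 81-cell grids.

    Assumption: a grid counts as solved only if every cell matches the target.
    """
    total_cells = correct_cells = solved = 0
    for pred, target in zip(preds, targets):
        matches = [p == t for p, t in zip(pred, target)]
        correct_cells += sum(matches)
        total_cells += len(matches)
        solved += all(matches)
    return correct_cells / total_cells, solved / len(targets)


# Toy data: four identical target grids, with one wrong cell in the first prediction.
targets = [[(i % 9) + 1 for i in range(81)] for _ in range(4)]
preds = [row[:] for row in targets]
preds[0][0] = 9  # the target value at this position is 1

cell_acc, grid_acc = cell_and_grid_accuracy(preds, targets)
print(f"cell accuracy: {cell_acc:.4f}, grid accuracy: {grid_acc:.2f}")
```

A single wrong cell barely moves cell accuracy but costs a whole grid, which is why the two metrics can diverge sharply on hard puzzles.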

## Hardware Requirements
- **GPU**: RTX 4070 / T4 / A100 (any modern GPU with 8GB+ VRAM)
- **Runtime**: ~10 hours for full training


## 1. Setup


In [None]:
# Clone PoT repository
!git clone https://github.com/Eran-BA/PoT.git /content/PoT 2>/dev/null || (cd /content/PoT && git pull)
%cd /content/PoT

# Install dependencies
!pip install -q torch torchvision torchaudio
!pip install -q tqdm numpy huggingface_hub


In [None]:
# Verify GPU
import torch
print(f"GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")


## 2. Download Sudoku Dataset

Downloads the **Sudoku-Extreme** dataset from Hugging Face (`sapientinc/sudoku-extreme`) and builds the training set:
- 1,000 extreme-difficulty puzzles, subsampled with `--subsample 1000`
- 1,000 augmentations per puzzle (validity-preserving transforms), set with `--num-aug 1000`
- Total: ~1,000,000 training samples
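
To illustrate what a validity-preserving transform can look like, here is a minimal sketch using the standard Sudoku symmetries: relabeling digits, permuting bands and the rows inside each band, doing the same for columns, and transposing. This is an assumption about the kind of augmentation involved, not the benchmark script's actual code. The helpers `is_valid_solution`, `shuffled_axis`, and `augment` are hypothetical names.

```python
import random


def is_valid_solution(grid):
    """Check that every row, column, and 3x3 box holds the digits 1-9 exactly once."""
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(unit == digits for unit in rows + cols + boxes)


def shuffled_axis(rng):
    """Permute the three bands, then the three lines inside each band."""
    bands = rng.sample(range(3), 3)
    return [3 * b + i for b in bands for i in rng.sample(range(3), 3)]


def augment(puzzle, solution, rng):
    """Apply one random transform identically to a puzzle and its solution."""
    relabel = dict(zip(range(1, 10), rng.sample(range(1, 10), 9)))
    relabel[0] = 0  # 0 marks a blank cell and must stay blank
    row_order, col_order = shuffled_axis(rng), shuffled_axis(rng)
    transpose = rng.random() < 0.5

    def transform(grid):
        out = [[relabel[grid[r][c]] for c in col_order] for r in row_order]
        return [list(col) for col in zip(*out)] if transpose else out

    return transform(puzzle), transform(solution)


# Toy pair: a pattern-generated valid solution with every other cell blanked out.
solution = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [[solution[r][c] if (r + c) % 2 == 0 else 0 for c in range(9)] for r in range(9)]

aug_puzzle, aug_solution = augment(puzzle, solution, random.Random(0))
print(is_valid_solution(aug_solution))
```

Because the same relabeling and permutations are applied to both grids, every clue in the augmented puzzle still agrees with the augmented solution.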


In [None]:
# Download and build dataset
!python experiments/sudoku_poh_benchmark.py \
    --download \
    --data-dir data/sudoku-extreme-1k-aug-1000 \
    --subsample 1000 \
    --num-aug 1000 \
    --epochs 0


## 3. Train PoH Sudoku Solver (~10 hours)

### Hyperparameters (from HRM paper)
```yaml
d_model: 512
n_heads: 8
n_layers: 2
R: 8               # refinement iterations
T: 4               # HRM period
epochs: 20000
batch_size: 384
lr: 7e-5
weight_decay: 1.0
```


In [None]:
# Full training (~10 hours on RTX 4070)
!python experiments/sudoku_poh_benchmark.py \
    --data-dir data/sudoku-extreme-1k-aug-1000 \
    --model poh \
    --d-model 512 \
    --n-heads 8 \
    --n-layers 2 \
    --R 8 \
    --T 4 \
    --epochs 20000 \
    --batch-size 384 \
    --lr 7e-5 \
    --puzzle-lr 7e-5 \
    --weight-decay 1.0 \
    --eval-interval 500 \
    --output experiments/results/sudoku_poh


## 4. Quick Test (~30 min)

For a quick sanity check, run with fewer epochs and a smaller batch size:


In [None]:
# Quick test (~30 minutes)
!python experiments/sudoku_poh_benchmark.py \
    --data-dir data/sudoku-extreme-1k-aug-1000 \
    --model poh \
    --epochs 1000 \
    --batch-size 256 \
    --eval-interval 100 \
    --output experiments/results/sudoku_poh_quick


## 5. View Results


## References

- [HRM Paper](https://arxiv.org/abs/2506.21734): Hierarchical Reasoning Model
- [HRM GitHub](https://github.com/sapientinc/HRM): Official implementation  
- [PoT GitHub](https://github.com/Eran-BA/PoT): Pointer-over-Heads Transformer
