# 🧩 HybridPoHHRM Sudoku Benchmark

Train a **master-level Sudoku AI** using **HybridPoHHRM**, which combines HRM's two-timescale reasoning with PoT (Pointer-over-Heads Transformer) head routing.

## Architecture
- **z_H, z_L**: Persistent hidden states (HRM-style)
- **L_level**: Fast reasoning (inner loop, 8 cycles)
- **H_level**: Slow reasoning (outer loop, 2 cycles)
- **PoT head routing**: Dynamic attention head selection in both levels
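
The nesting of the two loops can be sketched as follows. This is a minimal illustration, not the repository's implementation: `hybrid_forward`, `l_step`, and `h_step` are hypothetical names, the update functions are toy stand-ins for the transformer blocks with PoT head routing, and it assumes `z_H` is updated once after each run of inner cycles, as in the HRM-style scheme described above.

```python
def hybrid_forward(x, z_h, z_l, l_step, h_step, h_cycles=2, l_cycles=8):
    """Nested two-timescale update (illustrative sketch, hypothetical API)."""
    for _ in range(h_cycles):            # slow outer loop (H_level)
        for _ in range(l_cycles):        # fast inner loop (L_level)
            z_l = l_step(z_l, z_h, x)    # z_L refines using the current z_H and the input
        z_h = h_step(z_h, z_l)           # z_H absorbs the result of the inner loop
    return z_h, z_l


# Toy stand-ins that only count how often each level is updated.
z_h, z_l = hybrid_forward(
    x=None, z_h=0, z_l=0,
    l_step=lambda z_l, z_h, x: z_l + 1,
    h_step=lambda z_h, z_l: z_h + 1,
)
print(z_h, z_l)  # 2 16
```

With the default settings, the fast level runs 2 x 8 = 16 times per forward pass while the slow level runs twice.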

## Expected Results
| Model | Parameters | Grid Accuracy | Note |
|-------|------------|---------------|------|
| HRM (paper) | 27M | ~55% | On 1K training puzzles |
| **HybridPoHHRM** | ~26M | 50-55% | Matching HRM |
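
For reading the table, it helps to separate grid accuracy from per-cell accuracy. The sketch below assumes that "Grid Accuracy" means exact match on all 81 cells of a puzzle; the benchmark script may compute it differently. The function names `grid_accuracy` and `cell_accuracy` are hypothetical and not taken from the repository.

```python
def grid_accuracy(predictions, solutions):
    """Fraction of puzzles where every cell matches the solution."""
    if len(predictions) != len(solutions):
        raise ValueError("predictions and solutions must have the same length")
    if not predictions:
        return 0.0
    solved = sum(list(p) == list(s) for p, s in zip(predictions, solutions))
    return solved / len(predictions)


def cell_accuracy(predictions, solutions):
    """Fraction of individual cells that match, pooled over all puzzles."""
    pairs = [(a, b) for p, s in zip(predictions, solutions) for a, b in zip(p, s)]
    return sum(a == b for a, b in pairs) / len(pairs) if pairs else 0.0


solution = [(i % 9) + 1 for i in range(81)]   # placeholder digits, not a real Sudoku
almost = list(solution)
almost[0] = 9 if solution[0] != 9 else 1      # a single wrong cell
preds = [solution, almost]
targets = [solution, solution]
grid_acc = grid_accuracy(preds, targets)      # one of two grids is fully correct
cell_acc = cell_accuracy(preds, targets)      # 161 of 162 cells are correct
```

A single wrong cell fails the whole grid, so grid accuracy can sit far below cell accuracy.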

## Hardware Requirements
- **GPU**: A100 recommended (40GB VRAM)
- **Runtime**: ~6-10 hours for full training


## 1. Setup


In [None]:
# Clone PoT repository
!git clone https://github.com/Eran-BA/PoT.git /content/PoT 2>/dev/null || (cd /content/PoT && git pull)
%cd /content/PoT

# Install dependencies
!pip install -q torch torchvision torchaudio
!pip install -q tqdm numpy huggingface_hub


In [None]:
# Verify GPU
import torch
print(f"GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")


## 2. Download Sudoku Dataset

This step downloads the **Sudoku-Extreme** dataset from Hugging Face (`sapientinc/sudoku-extreme`):
- 1000 extreme-difficulty puzzles
- 1000 augmentations per puzzle (validity-preserving transforms)
- Total: ~1,000,000 training samples
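
A validity-preserving transform maps a solved grid to another solved grid. The sketch below shows one common family of such transforms: digit relabeling, transposition, and shuffling rows and columns within and across 3-row and 3-column bands. Whether the benchmark script uses exactly these transforms is an assumption, and `augment` and `is_valid_solution` are hypothetical helper names.

```python
import random


def augment(puzzle, solution, rng):
    """Apply one random validity-preserving transform to both grids (0 = blank)."""
    digit_map = [0] + rng.sample(range(1, 10), 9)          # relabel digits, keep blanks
    bands = rng.sample(range(3), 3)                        # shuffle the 3-row bands
    rows = [3 * b + r for b in bands for r in rng.sample(range(3), 3)]
    stacks = rng.sample(range(3), 3)                       # shuffle the 3-column stacks
    cols = [3 * s + c for s in stacks for c in rng.sample(range(3), 3)]
    transpose = rng.random() < 0.5

    def transform(grid):
        if transpose:
            grid = [list(col) for col in zip(*grid)]
        return [[digit_map[grid[r][c]] for c in cols] for r in rows]

    return transform(puzzle), transform(solution)


def is_valid_solution(grid):
    """Check that every row, column, and 3x3 box contains the digits 1-9."""
    full = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [{grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
             for br in (0, 3, 6) for bc in (0, 3, 6)]
    return all(unit == full for unit in rows + cols + boxes)


# A synthetic solved grid and a puzzle derived from it, for illustration only.
base = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [[v if (r + c) % 2 == 0 else 0 for c, v in enumerate(row)]
          for r, row in enumerate(base)]
new_puzzle, new_solution = augment(puzzle, base, random.Random(0))
```

The same transform is applied to the puzzle and its solution, so every clue in the augmented puzzle still agrees with the augmented solution.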


In [None]:
# Optional: Pre-download dataset (--download in training command also works)
!python experiments/sudoku_poh_benchmark.py --download --epochs 0


## 3. Train HybridPoHHRM Sudoku Solver

### Hyperparameters
```yaml
d_model: 512
n_heads: 8
H_cycles: 2  # Two-timescale reasoning: slow outer loop
L_cycles: 8  # Two-timescale reasoning: fast inner loop
H_layers: 2
L_layers: 2
lr: 3e-4
weight_decay: 0.01
epochs: 20000
batch_size: 128
```


In [None]:
# Full training (A100)
!python experiments/sudoku_poh_benchmark.py \
    --download \
    --model hybrid \
    --epochs 20000 \
    --batch-size 128 \
    --eval-interval 500 \
    --hrm-grad-style


## 4. Quick Test (~30 min)

For a quick sanity check, run with fewer epochs:


In [None]:
# Quick test: 1,000 epochs instead of 20,000, evaluating every 100
!python experiments/sudoku_poh_benchmark.py \
    --download \
    --model hybrid \
    --epochs 1000 \
    --batch-size 128 \
    --eval-interval 100


## 5. View Results


## References

- [HRM Paper](https://arxiv.org/abs/2506.21734): Hierarchical Reasoning Model
- [HRM GitHub](https://github.com/sapientinc/HRM): Official implementation  
- [PoT GitHub](https://github.com/Eran-BA/PoT): Pointer-over-Heads Transformer
