A self-correcting pipeline that solves NP-hard combinatorial optimization problems using Genetic Algorithms, Iterated Local Search, and SPIN/Promela formal verification — orchestrated through a blackboard architecture with an autonomous feedback loop.
- Overview
- Architecture
- Key Results
- Prerequisites
- Installation
- Project Structure
- Quick Start
- How It Works
- Core Components
- Distance Convention (Critical)
- Configuration
- Algorithmic Toolkit
- Optimization Guide
- Extending to Other Problems
- Hard Rules
- Troubleshooting
- Benchmarks
- Contributing
- License
NP-Solver is a pipeline that combines metaheuristic search with formal verification to solve NP-hard problems — starting with the Travelling Salesman Problem (TSP). It does not just find solutions; it proves properties about them using the SPIN model checker.
The system is designed to run autonomously via a feedback loop (the "Ralph Loop"): it solves, verifies, diagnoses failures, patches parameters, and iterates until a formally verified solution meeting the quality threshold is found.
What makes it different from a plain GA/TSP solver:
| Feature | Plain GA | NP-Solver |
|---|---|---|
| Finds good tours | ✅ | ✅ |
| Proves tour validity (Hamiltonian cycle) | ❌ | ✅ via SPIN |
| Verifies claimed cost matches actual cost | ❌ | ✅ via SPIN |
| Self-tunes parameters on failure | ❌ | ✅ via feedback agent |
| Autonomous iteration with diagnosis | ❌ | ✅ via orchestrator |
| Emits formal promise on success | ❌ | ✅ GAP_LT_5_PERCENT_VERIFIED |
┌───────────────────────────────────────────────────────────────────┐
│ ORCHESTRATOR │
│ (agents/orchestrator.py) │
│ │
│ ┌──────────┐ ┌──────────────┐ ┌───────────────────────┐ │
│ │ GA + │───▶│ BLACKBOARD │───▶│ SPIN VERIFICATION │ │
│ │ ILS │ │ (JSON) │ │ (Promela model) │ │
│ │ Solver │ │ │◀───│ │ │
│ └──────────┘ │ • problem │ └───────────────────────┘ │
│ ▲ │ • best_cost │ │ │
│ │ │ • best_tour │ ▼ │
│ │ │ • gap │ ┌───────────────────────┐ │
│ │ │ • params │ │ FEEDBACK AGENT │ │
│ └──────────│ • promise │◀───│ (gap analysis, │ │
│ │ │ │ param tuning) │ │
│ └──────────────┘ └───────────────────────┘ │
│ │
│ Repeat until promise == "GAP_LT_5_PERCENT_VERIFIED" │
└───────────────────────────────────────────────────────────────────┘
Data Flow:
- Solver (GA + ILS) writes
best_solutionandbest_costto the blackboard - Promela generator encodes the candidate tour as a SPIN model
- SPIN runner compiles and verifies the model (permutation check + cost check)
- Feedback agent reads verification results, computes gap vs. known optimal, tunes parameters
- Orchestrator checks for the success promise; if not met, loops back to step 1
| Instance | Cities | Optimal | Found | Gap | Iterations | SPIN Status |
|---|---|---|---|---|---|---|
| eil51 | 51 | 426 | 426 | 0.00% | 1 | ✅ PASS |
| Tool | Version | Purpose |
|---|---|---|
| Python | 3.10+ | Runtime |
| SPIN | 6.5+ | Model checking / formal verification |
| GCC | Any recent | Compiling SPIN-generated verifiers (pan.c) |
| pip | Latest | Python package management |
Verify installation:
python3 --version # Python 3.10+
which spin # /usr/bin/spin or similar
which gcc # /usr/bin/gcc- macOS:
brew install spin - Ubuntu/Debian:
apt-get install spin - From source: spinroot.com
# Clone the repository
git clone https://github.com/your-org/np-solver.git
cd np-solver
# Install Python dependencies
pip install -r requirements.txt
# Verify everything compiles
python3 -m py_compile agents/orchestrator.py
python3 -m py_compile strategies/ga_tsp.py
python3 -m py_compile models/promela_gen.py
python3 -m py_compile verification/spin_runner.py
python3 -m py_compile agents/feedback.py
# Run tests
pytest tests/ -vdeap>=1.4 # Genetic algorithm framework
pytest>=7.0 # Testing
np-solver/
├── agents/
│ ├── orchestrator.py # Main loop: solver → verify → feedback → repeat
│ └── feedback.py # Gap analysis, parameter tuning, stagnation detection
├── strategies/
│ └── ga_tsp.py # GA + 2-opt + Or-opt + ILS + double-bridge solver
├── models/
│ └── promela_gen.py # Generates Promela models from candidate tours
├── verification/
│ └── spin_runner.py # Compiles and runs SPIN verification
├── tests/
│ └── test_np_solver.py # Integration tests (DO NOT MODIFY)
├── blackboard.json # Shared state: problem, solution, params, verification
├── ralph_improvements.log # Changelog from autonomous iterations
├── ralph_loop_eil51_prompt.md # Full task specification
└── README.md # This file
{
"problem": {
"type": "TSP",
"instance": [
[37, 52], [49, 49], [52, 64], [20, 26], [40, 30],
[21, 47], [17, 63], [31, 62], [52, 33], [51, 21],
[42, 41], [31, 32], [5, 25], [12, 42], [36, 16],
[52, 41], [27, 23], [17, 33], [13, 13], [57, 58],
[62, 42], [42, 57], [16, 57], [8, 52], [7, 38],
[27, 68], [30, 48], [43, 67], [58, 48], [58, 27],
[37, 69], [38, 46], [46, 10], [61, 33], [62, 63],
[63, 69], [32, 22], [45, 35], [59, 15], [5, 6],
[10, 17], [21, 10], [5, 64], [30, 15], [39, 10],
[32, 39], [25, 32], [25, 55], [48, 28], [56, 37],
[30, 40]
],
"optimal": 426,
"n": 51,
"distance_type": "EUC_2D"
},
"strategies": {
"current": "ga_tsp",
"tried": [],
"params": {
"pop_size": 200,
"gens": 500,
"mutpb": 0.15,
"cxpb": 0.7
}
},
"best_solution": null,
"best_cost": null,
"verification": { "pass_rate": 0, "status": "pending", "error": null },
"fitness_history": [],
"feedback": { "gap": 1.0, "pass_rate": 0, "stagnant_count": 0, "ready_for_test": false },
"iteration": 0,
"test_passed": false,
"promise": null
}# Clean SPIN artifacts
rm -f pan pan.c pan.h pan.b pan.m pan.t pan.p pan.trail _spin_nvr.tmp
# Run the orchestrator
python3 -m agents.orchestratorpython3 -c "
import json
with open('blackboard.json') as f:
bb = json.load(f)
print(f'Cost: {bb["best_cost"]}')
print(f'Gap: {bb["feedback"]["gap"]:.2%}')
print(f'SPIN: {bb["verification"]["status"]}')
print(f'Promise: {bb["promise"]}')
"pytest tests/test_np_solver.py -v- Load the TSP instance coordinates into
blackboard.json - Set
distance_type: "EUC_2D"andoptimal: 426 - Ensure all distance computations use
round(sqrt(...))— not raw float Euclidean - Configure SPIN runner for N=51:
pan -m100000, timeout 120s
Each iteration performs:
CLEAN → RUN orchestrator → OBSERVE blackboard → DIAGNOSE → IMPROVE → REPEAT
The orchestrator internally runs:
- GA phase — evolve a population of tours (200 pop × 500 gens)
- Seeding — inject nearest-neighbor heuristic tours into 20% of the population
- Elite extraction — pull top 3 from Hall of Fame + NN seeds as ILS starting points
- ILS phase — for each starting tour, run 200 double-bridge kicks with 2-opt refinement
- Write best tour to blackboard
- Promela generation — encode the tour as a SPIN model
- SPIN verification — compile and run the model checker
- Feedback — compute gap, update parameters, check stagnation
- Promise check — if gap < 5% and SPIN passes, emit
GAP_LT_5_PERCENT_VERIFIED
Python exception?
├── YES → Fix the bug, py_compile, re-run
└── NO ↓
SPIN compile error?
├── YES → Fix promela_gen.py (N mismatch, indexing)
└── NO ↓
SPIN verification fails?
├── YES → Fix Promela model logic
└── NO ↓
gap ≥ 5% and pass_rate ≥ 0.95?
├── YES → Improve solver (escalation ladder)
└── NO ↓
stagnant_count > 3?
├── YES → Escalate improvements
└── NO → Run another iteration
| Level | Trigger | Action |
|---|---|---|
| 1 | gap > 10% | Verify EUC_2D distances, ensure 2-opt is working |
| 2 | gap > 7% | Add nearest-neighbor seeding + Or-opt |
| 3 | gap > 6% | Add ILS with double-bridge kicks (200+) |
| 4 | gap > 5% | Increase ILS kicks to 500, add more start points |
| 5 | gap > 5% still | Add 3-opt moves inside local search |
| 6 | stagnant > 10 | Double population, fresh restart |
| 7 | stagnant > 20 | Add simulated annealing post-processing |
File: blackboard.json
The central shared state. All agents read from and write to this JSON file. Key fields:
| Field | Type | Description |
|---|---|---|
problem.instance |
[[x,y], ...] |
City coordinates |
problem.optimal |
int |
Known optimal tour cost |
problem.distance_type |
string |
"EUC_2D" — determines distance function |
strategies.params |
object |
GA parameters: pop_size, gens, mutpb, cxpb |
best_solution |
[int, ...] |
Best tour found (0-indexed city permutation) |
best_cost |
int |
Cost of best tour |
verification.status |
string |
"PASS", "FAIL", or "pending" |
verification.pass_rate |
float |
0.0 – 1.0 |
feedback.gap |
float |
(best_cost - optimal) / optimal |
feedback.stagnant_count |
int |
Consecutive iterations without improvement |
promise |
string or null |
"GAP_LT_5_PERCENT_VERIFIED" on success |
File: strategies/ga_tsp.py
The solver combines multiple techniques:
- Genetic Algorithm (via DEAP): ordered crossover, swap mutation, tournament selection
- Nearest-Neighbor Seeding: deterministic greedy tours seed 20% of the initial population
- 2-opt Local Search: iterative edge-swap improvement
- Or-opt: segment relocation (segments of 1, 2, and 3 cities)
- Double-Bridge Perturbation: 4-edge random reconnection to escape local optima
- Iterated Local Search (ILS):
local_search → double_bridge → local_searchrepeated
Files: models/promela_gen.py, verification/spin_runner.py
The Promela model encodes:
- Tour validity — every city appears exactly once (permutation check)
- Cost correctness — the computed cost matches the claimed cost
SPIN exhaustively checks these properties. If verification passes, the solution is formally proven correct — not just tested.
promela_gen.py: Candidate tour → Promela source (.pml)
spin_runner.py: .pml → spin -a → gcc pan.c → ./pan -m100000 → PASS/FAIL
File: agents/feedback.py
Reads blackboard state after each iteration and:
- Computes gap:
(best_cost - optimal) / optimal - Detects stagnation (no improvement for N iterations)
- Tunes GA parameters (increase gens, adjust mutation rate)
- Guards parameter bounds:
pop_size ≤ 1000,gens ≤ 2000,mutpb ≤ 0.3
File: agents/orchestrator.py
The main loop that ties everything together. Run with:
python3 -m agents.orchestratorIt calls solver → verification → feedback → promise check in sequence, and runs pytest at the end.
This is the single most important implementation detail.
TSPLIB EUC_2D instances use integer-rounded Euclidean distances:
import math
def euc_2d(a, b):
# TSPLIB EUC_2D: round to nearest integer.
return round(math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2))
def tour_cost(tour, coords):
# Total tour cost using TSPLIB EUC_2D convention.
n = len(tour)
return sum(euc_2d(coords[tour[i]], coords[tour[(i+1) % n]]) for i in range(n))Why this matters:
- Raw
math.distreturns floats (e.g.,14.142135...) - TSPLIB expects integers (e.g.,
14) - The proven optimal of 426 is computed with integer distances
- Using float distances produces a different optimal, making the 5% gap check meaningless
The previous run's only bug was this mismatch. Once fixed, the solver hit optimal on the first iteration.
| Parameter | Default | Range | Description |
|---|---|---|---|
pop_size |
200 | 50–1000 | Population size |
gens |
500 | 100–2000 | Generations per GA run |
mutpb |
0.15 | 0.05–0.30 | Mutation probability |
cxpb |
0.7 | 0.5–0.9 | Crossover probability |
| Parameter | Default | Description |
|---|---|---|
n_kicks |
200 | Double-bridge perturbations per ILS run |
n_starts |
~56 | Starting tours (NN seeds + GA elites) |
| Parameter | Default | Description |
|---|---|---|
| Search depth | -m100000 |
Max search depth for pan |
| Timeout | 120s |
Subprocess timeout |
Iteratively reverses tour segments to remove crossing edges:
Before: ... → A → B → ... → C → D → ...
After: ... → A → C → ... → B → D → ... (segment B..C reversed)
Applied if delta = d(A,C) + d(B,D) - d(A,B) - d(C,D) < 0.
Relocates segments of 1, 2, or 3 consecutive cities to a better position in the tour. Complements 2-opt by handling moves that 2-opt cannot express.
A 4-opt perturbation that randomly splits the tour into 4 segments and reconnects them in a non-sequential order. This is the standard escape mechanism from 2-opt local optima:
def double_bridge(tour):
n = len(tour)
cuts = sorted(random.sample(range(1, n), 3))
a, b, c, d = cuts[0], cuts[1], cuts[2], n
return tour[:a] + tour[c:d] + tour[b:c] + tour[a:b]The core metaheuristic:
start → 2-opt → (best so far)
↓
double-bridge kick
↓
2-opt on perturbed tour
↓
accept if improved → repeat N times
Greedy construction: start at a city, always move to the nearest unvisited city. Fast (O(n²)), produces tours within ~20% of optimal. Used to seed the GA population.
- Precompute the distance matrix — replace per-call
euc_2d()with an N×N lookup table - Replace GA seeding with pure NN — skip 200×500 GA generations, seed ILS directly
- Parallelize ILS restarts — use
multiprocessing.Poolacross CPU cores - Use Numba for 2-opt — JIT-compile the inner loop for 10-50× speedup
- Add Lin-Kernighan style variable-depth moves
- Increase ILS kicks to 500–1000
- Add 3-opt when ILS with 2-opt plateaus
- Use portfolio approach — run multiple strategies concurrently
- Simplify the Promela model — verify only permutation, compute cost in Python
- Skip SPIN on intermediate iterations — only verify the final candidate
- Use neighbor lists in 2-opt (only check nearest 10–15 neighbors per city)
The architecture is problem-agnostic. To add a new problem type:
- Add a solver in
strategies/(e.g.,strategies/ga_knapsack.py) - Add a Promela model in
models/for the verification properties you care about - Update
blackboard.jsonwith the problem definition and optimality target - Update
feedback.pywith problem-specific gap calculation - Update
orchestrator.pyto dispatch to the new solver
Candidate problem classes:
- Vehicle Routing Problem (VRP)
- Job-Shop Scheduling
- Graph Coloring
- Knapsack / Bin Packing
- Satisfiability (SAT)
- ❌ NEVER modify
tests/test_np_solver.py - ❌ NEVER manually set
promise,test_passed,best_solution, orbest_costinblackboard.json - ❌ NEVER hardcode a tour
- ❌ NEVER skip
python3 -m agents.orchestrator(always run the full pipeline) - ✅ ALWAYS use
euc_2d(nearest-integer) distances — never raw Euclidean - ✅ ALWAYS clean SPIN artifacts before each run
- ✅ ALWAYS
py_compilechanged files before running the orchestrator - ✅ ALWAYS log changes to
ralph_improvements.log
| Symptom | Cause | Fix |
|---|---|---|
| Gap is slightly off from expected | Using math.dist instead of euc_2d |
Replace all distance calls with round(math.sqrt(...)) |
| SPIN compile error | Promela N doesn't match city count |
Update #define N in promela_gen.py |
| SPIN timeout | Search depth too shallow or timeout too short | Increase -m flag and subprocess timeout |
pan: invalid end state |
Promela model missing progress/end labels | Add end: labels to blocking states |
| GA finds same cost every run | Random seed is fixed | Remove or vary random.seed(42) |
| stagnant_count keeps growing | ILS stuck in basin of attraction | Escalate: more kicks, double-bridge, 3-opt |
ModuleNotFoundError: deap |
Missing dependency | pip install deap |
Permission denied on ./pan |
Pan binary not executable | chmod +x pan (handled by spin_runner) |
- Instance: 51 cities, Christofides/Eilon 1969
- Proven optimal: 426 (Concorde exact solver)
- Distance type: EUC_2D (integer-rounded Euclidean)
- Pipeline result: 426 (exact optimal, 0.00% gap)
- Time: ~3 minutes on Apple M-series
- Verification: SPIN PASS, 5/5 pytest PASS
| Instance | Cities | Optimal | Status |
|---|---|---|---|
| eil51 | 51 | 426 | ✅ Solved |
| berlin52 | 52 | 7542 | 🔲 Planned |
| kroA100 | 100 | 21282 | 🔲 Planned |
| ch150 | 150 | 6528 | 🔲 Planned |
| tsp225 | 225 | 3916 | 🔲 Planned |
- Fork the repository
- Create a feature branch (
git checkout -b feature/lin-kernighan) - Ensure all tests pass (
pytest tests/ -v) - Ensure the orchestrator still reaches
GAP_LT_5_PERCENT_VERIFIED - Submit a pull request
Do not modify tests/test_np_solver.py. Add new tests in separate files.
MIT License. See LICENSE for details.
- TSPLIB — Gerhard Reinelt, University of Heidelberg
- SPIN — Gerard Holzmann, Bell Labs / NASA JPL
- DEAP — Distributed Evolutionary Algorithms in Python
- Concorde — Applegate, Bixby, Chvátal, Cook (optimal TSP solver)
Built with 🧬 metaheuristics, 🔍 formal verification, and ♻️ autonomous feedback loops.