🧬 NP-Solver — Formally Verified Metaheuristic Optimization

A self-correcting pipeline that solves NP-hard combinatorial optimization problems using Genetic Algorithms, Iterated Local Search, and SPIN/Promela formal verification — orchestrated through a blackboard architecture with an autonomous feedback loop.

Overview

NP-Solver is a pipeline that combines metaheuristic search with formal verification to solve NP-hard problems — starting with the Travelling Salesman Problem (TSP). It does not just find solutions; it proves properties about them using the SPIN model checker.

The system is designed to run autonomously via a feedback loop (the "Ralph Loop"): it solves, verifies, diagnoses failures, patches parameters, and iterates until a formally verified solution meeting the quality threshold is found.

What makes it different from a plain GA/TSP solver:

Feature	Plain GA	NP-Solver
Finds good tours	✅	✅
Proves tour validity (Hamiltonian cycle)	❌	✅ via SPIN
Verifies claimed cost matches actual cost	❌	✅ via SPIN
Self-tunes parameters on failure	❌	✅ via feedback agent
Autonomous iteration with diagnosis	❌	✅ via orchestrator
Emits formal promise on success	❌	✅ `GAP_LT_5_PERCENT_VERIFIED`

Architecture

┌───────────────────────────────────────────────────────────────────┐
│                        ORCHESTRATOR                                 │
│                  (agents/orchestrator.py)                            │
│                                                                     │
│   ┌──────────┐    ┌──────────────┐    ┌───────────────────────┐     │
│   │  GA +    │───▶│  BLACKBOARD  │───▶│   SPIN VERIFICATION   │     │
│   │  ILS     │    │  (JSON)      │    │   (Promela model)     │     │
│   │  Solver  │    │              │◀───│                       │     │
│   └──────────┘    │  • problem   │    └───────────────────────┘     │
│        ▲          │  • best_cost │              │                   │
│        │          │  • best_tour │              ▼                   │
│        │          │  • gap       │    ┌───────────────────────┐     │
│        │          │  • params    │    │   FEEDBACK AGENT      │     │
│        └──────────│  • promise   │◀───│   (gap analysis,      │     │
│                   │              │    │    param tuning)       │     │
│                   └──────────────┘    └───────────────────────┘     │
│                                                                     │
│   Repeat until promise == "GAP_LT_5_PERCENT_VERIFIED"              │
└───────────────────────────────────────────────────────────────────┘

Data Flow:

Solver (GA + ILS) writes best_solution and best_cost to the blackboard
Promela generator encodes the candidate tour as a SPIN model
SPIN runner compiles and verifies the model (permutation check + cost check)
Feedback agent reads verification results, computes gap vs. known optimal, tunes parameters
Orchestrator checks for the success promise; if not met, loops back to step 1

Key Results

Instance	Cities	Optimal	Found	Gap	Iterations	SPIN Status
eil51	51	426	426	0.00%	1	✅ PASS

Prerequisites

Tool	Version	Purpose
Python	3.10+	Runtime
SPIN	6.5+	Model checking / formal verification
GCC	Any recent	Compiling SPIN-generated verifiers (`pan.c`)
pip	Latest	Python package management

Verify installation:

python3 --version        # Python 3.10+
which spin               # /usr/bin/spin or similar
which gcc                # /usr/bin/gcc

Installing SPIN

macOS: brew install spin
Ubuntu/Debian: apt-get install spin
From source: spinroot.com

Installation

# Clone the repository
git clone https://github.com/your-org/np-solver.git
cd np-solver

# Install Python dependencies
pip install -r requirements.txt

# Verify everything compiles
python3 -m py_compile agents/orchestrator.py
python3 -m py_compile strategies/ga_tsp.py
python3 -m py_compile models/promela_gen.py
python3 -m py_compile verification/spin_runner.py
python3 -m py_compile agents/feedback.py

# Run tests
pytest tests/ -v

Python Dependencies

deap>=1.4          # Genetic algorithm framework
pytest>=7.0        # Testing

Project Structure

np-solver/
├── agents/
│   ├── orchestrator.py       # Main loop: solver → verify → feedback → repeat
│   └── feedback.py           # Gap analysis, parameter tuning, stagnation detection
├── strategies/
│   └── ga_tsp.py             # GA + 2-opt + Or-opt + ILS + double-bridge solver
├── models/
│   └── promela_gen.py        # Generates Promela models from candidate tours
├── verification/
│   └── spin_runner.py        # Compiles and runs SPIN verification
├── tests/
│   └── test_np_solver.py     # Integration tests (DO NOT MODIFY)
├── blackboard.json           # Shared state: problem, solution, params, verification
├── ralph_improvements.log    # Changelog from autonomous iterations
├── ralph_loop_eil51_prompt.md # Full task specification
└── README.md                 # This file

Quick Start

1. Set up the problem in `blackboard.json`

{
  "problem": {
    "type": "TSP",
    "instance": [
      [37, 52], [49, 49], [52, 64], [20, 26], [40, 30],
      [21, 47], [17, 63], [31, 62], [52, 33], [51, 21],
      [42, 41], [31, 32], [5, 25],  [12, 42], [36, 16],
      [52, 41], [27, 23], [17, 33], [13, 13], [57, 58],
      [62, 42], [42, 57], [16, 57], [8, 52],  [7, 38],
      [27, 68], [30, 48], [43, 67], [58, 48], [58, 27],
      [37, 69], [38, 46], [46, 10], [61, 33], [62, 63],
      [63, 69], [32, 22], [45, 35], [59, 15], [5, 6],
      [10, 17], [21, 10], [5, 64],  [30, 15], [39, 10],
      [32, 39], [25, 32], [25, 55], [48, 28], [56, 37],
      [30, 40]
    ],
    "optimal": 426,
    "n": 51,
    "distance_type": "EUC_2D"
  },
  "strategies": {
    "current": "ga_tsp",
    "tried": [],
    "params": {
      "pop_size": 200,
      "gens": 500,
      "mutpb": 0.15,
      "cxpb": 0.7
    }
  },
  "best_solution": null,
  "best_cost": null,
  "verification": { "pass_rate": 0, "status": "pending", "error": null },
  "fitness_history": [],
  "feedback": { "gap": 1.0, "pass_rate": 0, "stagnant_count": 0, "ready_for_test": false },
  "iteration": 0,
  "test_passed": false,
  "promise": null
}

2. Run the pipeline

# Clean SPIN artifacts
rm -f pan pan.c pan.h pan.b pan.m pan.t pan.p pan.trail _spin_nvr.tmp

# Run the orchestrator
python3 -m agents.orchestrator

3. Check results

python3 -c "
import json
with open('blackboard.json') as f:
    bb = json.load(f)
print(f'Cost:    {bb["best_cost"]}')
print(f'Gap:     {bb["feedback"]["gap"]:.2%}')
print(f'SPIN:    {bb["verification"]["status"]}')
print(f'Promise: {bb["promise"]}')
"

4. Run tests

pytest tests/test_np_solver.py -v

How It Works

Phase 0 — Setup

Load the TSP instance coordinates into blackboard.json
Set distance_type: "EUC_2D" and optimal: 426
Ensure all distance computations use round(sqrt(...)) — not raw float Euclidean
Configure SPIN runner for N=51: pan -m100000, timeout 120s

Phase 1 — Iteration Loop

Each iteration performs:

CLEAN  →  RUN orchestrator  →  OBSERVE blackboard  →  DIAGNOSE  →  IMPROVE  →  REPEAT

The orchestrator internally runs:

GA phase — evolve a population of tours (200 pop × 500 gens)
Seeding — inject nearest-neighbor heuristic tours into 20% of the population
Elite extraction — pull top 3 from Hall of Fame + NN seeds as ILS starting points
ILS phase — for each starting tour, run 200 double-bridge kicks with 2-opt refinement
Write best tour to blackboard
Promela generation — encode the tour as a SPIN model
SPIN verification — compile and run the model checker
Feedback — compute gap, update parameters, check stagnation
Promise check — if gap < 5% and SPIN passes, emit GAP_LT_5_PERCENT_VERIFIED

Diagnosis Decision Tree

Python exception?
├── YES → Fix the bug, py_compile, re-run
└── NO ↓

SPIN compile error?
├── YES → Fix promela_gen.py (N mismatch, indexing)
└── NO ↓

SPIN verification fails?
├── YES → Fix Promela model logic
└── NO ↓

gap ≥ 5% and pass_rate ≥ 0.95?
├── YES → Improve solver (escalation ladder)
└── NO ↓

stagnant_count > 3?
├── YES → Escalate improvements
└── NO → Run another iteration

Escalation Ladder

Level	Trigger	Action
1	gap > 10%	Verify EUC_2D distances, ensure 2-opt is working
2	gap > 7%	Add nearest-neighbor seeding + Or-opt
3	gap > 6%	Add ILS with double-bridge kicks (200+)
4	gap > 5%	Increase ILS kicks to 500, add more start points
5	gap > 5% still	Add 3-opt moves inside local search
6	stagnant > 10	Double population, fresh restart
7	stagnant > 20	Add simulated annealing post-processing

Core Components

Blackboard

File: blackboard.json

The central shared state. All agents read from and write to this JSON file. Key fields:

Field	Type	Description
`problem.instance`	`[[x,y], ...]`	City coordinates
`problem.optimal`	`int`	Known optimal tour cost
`problem.distance_type`	`string`	`"EUC_2D"` — determines distance function
`strategies.params`	`object`	GA parameters: `pop_size`, `gens`, `mutpb`, `cxpb`
`best_solution`	`[int, ...]`	Best tour found (0-indexed city permutation)
`best_cost`	`int`	Cost of best tour
`verification.status`	`string`	`"PASS"`, `"FAIL"`, or `"pending"`
`verification.pass_rate`	`float`	0.0 – 1.0
`feedback.gap`	`float`	`(best_cost - optimal) / optimal`
`feedback.stagnant_count`	`int`	Consecutive iterations without improvement
`promise`	`string or null`	`"GAP_LT_5_PERCENT_VERIFIED"` on success

GA + Local Search Solver

File: strategies/ga_tsp.py

The solver combines multiple techniques:

Genetic Algorithm (via DEAP): ordered crossover, swap mutation, tournament selection
Nearest-Neighbor Seeding: deterministic greedy tours seed 20% of the initial population
2-opt Local Search: iterative edge-swap improvement
Or-opt: segment relocation (segments of 1, 2, and 3 cities)
Double-Bridge Perturbation: 4-edge random reconnection to escape local optima
Iterated Local Search (ILS): local_search → double_bridge → local_search repeated

SPIN Verification

Files: models/promela_gen.py, verification/spin_runner.py

The Promela model encodes:

Tour validity — every city appears exactly once (permutation check)
Cost correctness — the computed cost matches the claimed cost

SPIN exhaustively checks these properties. If verification passes, the solution is formally proven correct — not just tested.

promela_gen.py:   Candidate tour → Promela source (.pml)
spin_runner.py:   .pml → spin -a → gcc pan.c → ./pan -m100000 → PASS/FAIL

Feedback Agent

File: agents/feedback.py

Reads blackboard state after each iteration and:

Computes gap: (best_cost - optimal) / optimal
Detects stagnation (no improvement for N iterations)
Tunes GA parameters (increase gens, adjust mutation rate)
Guards parameter bounds: pop_size ≤ 1000, gens ≤ 2000, mutpb ≤ 0.3

Orchestrator

File: agents/orchestrator.py

The main loop that ties everything together. Run with:

python3 -m agents.orchestrator

It calls solver → verification → feedback → promise check in sequence, and runs pytest at the end.

Distance Convention (Critical)

This is the single most important implementation detail.

TSPLIB EUC_2D instances use integer-rounded Euclidean distances:

import math

def euc_2d(a, b):
    # TSPLIB EUC_2D: round to nearest integer.
    return round(math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2))

def tour_cost(tour, coords):
    # Total tour cost using TSPLIB EUC_2D convention.
    n = len(tour)
    return sum(euc_2d(coords[tour[i]], coords[tour[(i+1) % n]]) for i in range(n))

Why this matters:

Raw math.dist returns floats (e.g., 14.142135...)
TSPLIB expects integers (e.g., 14)
The proven optimal of 426 is computed with integer distances
Using float distances produces a different optimal, making the 5% gap check meaningless

The previous run's only bug was this mismatch. Once fixed, the solver hit optimal on the first iteration.

Configuration

GA Parameters (in `blackboard.json`)

Parameter	Default	Range	Description
`pop_size`	200	50–1000	Population size
`gens`	500	100–2000	Generations per GA run
`mutpb`	0.15	0.05–0.30	Mutation probability
`cxpb`	0.7	0.5–0.9	Crossover probability

ILS Parameters (in `ga_tsp.py`)

Parameter	Default	Description
`n_kicks`	200	Double-bridge perturbations per ILS run
`n_starts`	~56	Starting tours (NN seeds + GA elites)

SPIN Parameters (in `spin_runner.py`)

Parameter	Default	Description
Search depth	`-m100000`	Max search depth for `pan`
Timeout	`120s`	Subprocess timeout

Algorithmic Toolkit

2-opt

Iteratively reverses tour segments to remove crossing edges:

Before:  ... → A → B → ... → C → D → ...
After:   ... → A → C → ... → B → D → ...  (segment B..C reversed)

Applied if delta = d(A,C) + d(B,D) - d(A,B) - d(C,D) < 0.

Or-opt

Relocates segments of 1, 2, or 3 consecutive cities to a better position in the tour. Complements 2-opt by handling moves that 2-opt cannot express.

Double-Bridge

A 4-opt perturbation that randomly splits the tour into 4 segments and reconnects them in a non-sequential order. This is the standard escape mechanism from 2-opt local optima:

def double_bridge(tour):
    n = len(tour)
    cuts = sorted(random.sample(range(1, n), 3))
    a, b, c, d = cuts[0], cuts[1], cuts[2], n
    return tour[:a] + tour[c:d] + tour[b:c] + tour[a:b]

Iterated Local Search (ILS)

The core metaheuristic:

start → 2-opt → (best so far)
              ↓
        double-bridge kick
              ↓
        2-opt on perturbed tour
              ↓
        accept if improved → repeat N times

Nearest-Neighbor Heuristic

Greedy construction: start at a city, always move to the nearest unvisited city. Fast (O(n²)), produces tours within ~20% of optimal. Used to seed the GA population.

Optimization Guide

For Speed

Precompute the distance matrix — replace per-call euc_2d() with an N×N lookup table
Replace GA seeding with pure NN — skip 200×500 GA generations, seed ILS directly
Parallelize ILS restarts — use multiprocessing.Pool across CPU cores
Use Numba for 2-opt — JIT-compile the inner loop for 10-50× speedup

For Solution Quality (larger instances)

Add Lin-Kernighan style variable-depth moves
Increase ILS kicks to 500–1000
Add 3-opt when ILS with 2-opt plateaus
Use portfolio approach — run multiple strategies concurrently

For Scalability (500+ cities)

Simplify the Promela model — verify only permutation, compute cost in Python
Skip SPIN on intermediate iterations — only verify the final candidate
Use neighbor lists in 2-opt (only check nearest 10–15 neighbors per city)

Extending to Other Problems

The architecture is problem-agnostic. To add a new problem type:

Add a solver in strategies/ (e.g., strategies/ga_knapsack.py)
Add a Promela model in models/ for the verification properties you care about
Update blackboard.json with the problem definition and optimality target
Update feedback.py with problem-specific gap calculation
Update orchestrator.py to dispatch to the new solver

Candidate problem classes:

Vehicle Routing Problem (VRP)
Job-Shop Scheduling
Graph Coloring
Knapsack / Bin Packing
Satisfiability (SAT)

Hard Rules

❌ NEVER modify tests/test_np_solver.py
❌ NEVER manually set promise, test_passed, best_solution, or best_cost in blackboard.json
❌ NEVER hardcode a tour
❌ NEVER skip python3 -m agents.orchestrator (always run the full pipeline)
✅ ALWAYS use euc_2d (nearest-integer) distances — never raw Euclidean
✅ ALWAYS clean SPIN artifacts before each run
✅ ALWAYS py_compile changed files before running the orchestrator
✅ ALWAYS log changes to ralph_improvements.log

Troubleshooting

Symptom	Cause	Fix
Gap is slightly off from expected	Using `math.dist` instead of `euc_2d`	Replace all distance calls with `round(math.sqrt(...))`
SPIN compile error	Promela `N` doesn't match city count	Update `#define N` in `promela_gen.py`
SPIN timeout	Search depth too shallow or timeout too short	Increase `-m` flag and subprocess timeout
`pan: invalid end state`	Promela model missing progress/end labels	Add `end:` labels to blocking states
GA finds same cost every run	Random seed is fixed	Remove or vary `random.seed(42)`
stagnant_count keeps growing	ILS stuck in basin of attraction	Escalate: more kicks, double-bridge, 3-opt
`ModuleNotFoundError: deap`	Missing dependency	`pip install deap`
Permission denied on `./pan`	Pan binary not executable	`chmod +x pan` (handled by spin_runner)

Benchmarks

TSPLIB eil51

Instance: 51 cities, Christofides/Eilon 1969
Proven optimal: 426 (Concorde exact solver)
Distance type: EUC_2D (integer-rounded Euclidean)
Pipeline result: 426 (exact optimal, 0.00% gap)
Time: ~3 minutes on Apple M-series
Verification: SPIN PASS, 5/5 pytest PASS

Future Targets

Instance	Cities	Optimal	Status
eil51	51	426	✅ Solved
berlin52	52	7542	🔲 Planned
kroA100	100	21282	🔲 Planned
ch150	150	6528	🔲 Planned
tsp225	225	3916	🔲 Planned

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/lin-kernighan)
Ensure all tests pass (pytest tests/ -v)
Ensure the orchestrator still reaches GAP_LT_5_PERCENT_VERIFIED
Submit a pull request

Do not modify tests/test_np_solver.py. Add new tests in separate files.

License

MIT License. See LICENSE for details.

Acknowledgments

TSPLIB — Gerhard Reinelt, University of Heidelberg
SPIN — Gerard Holzmann, Bell Labs / NASA JPL
DEAP — Distributed Evolutionary Algorithms in Python
Concorde — Applegate, Bixby, Chvátal, Cook (optimal TSP solver)

Built with 🧬 metaheuristics, 🔍 formal verification, and ♻️ autonomous feedback loops.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
docs/plans		docs/plans
np-solver		np-solver
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

🧬 NP-Solver — Formally Verified Metaheuristic Optimization

Table of Contents

Overview

Architecture

Key Results

Prerequisites

Installing SPIN

Installation

Python Dependencies

Project Structure

Quick Start

1. Set up the problem in blackboard.json

2. Run the pipeline

3. Check results

4. Run tests

How It Works

Phase 0 — Setup

Phase 1 — Iteration Loop

Diagnosis Decision Tree

Escalation Ladder

Core Components

Blackboard

GA + Local Search Solver

SPIN Verification

Feedback Agent

Orchestrator

Distance Convention (Critical)

Configuration

GA Parameters (in blackboard.json)

ILS Parameters (in ga_tsp.py)

SPIN Parameters (in spin_runner.py)

Algorithmic Toolkit

2-opt

Or-opt

Double-Bridge

Iterated Local Search (ILS)

Nearest-Neighbor Heuristic

Optimization Guide

For Speed

For Solution Quality (larger instances)

For Scalability (500+ cities)

Extending to Other Problems

Hard Rules

Troubleshooting

Benchmarks

TSPLIB eil51

Future Targets

Contributing

License

Acknowledgments

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

1. Set up the problem in `blackboard.json`

GA Parameters (in `blackboard.json`)

ILS Parameters (in `ga_tsp.py`)

SPIN Parameters (in `spin_runner.py`)

Packages