Ferrox

Constraint solving as a Converge Suggestor.

LLMs are remarkable at understanding intent, drafting plans, explaining tradeoffs, and generating candidate solutions. They are not optimisers. Given a staffing problem with 60 tasks, 12 agents, and tight time windows, a language model will produce a reasonable-sounding schedule — but it cannot prove that schedule is the best possible, cannot guarantee every constraint is met, and cannot tell you how far from optimal it is.

Ferrox fills that gap. It exposes industrial-strength mathematical solvers — Google OR-Tools CP-SAT and HiGHS MIP — as first-class Converge Suggestors that live alongside LLM agents in the same Formation. The LLM understands the business context. Ferrox finds the provably correct answer within it.

The Problem LLMs Cannot Solve Alone

Most real business decisions are constrained optimisation problems dressed in plain language:

"Schedule our field crews for next week" — thousands of valid schedules exist; only one is cheapest
"Which projects should we fund this quarter?" — capital is finite, returns are interdependent, regulations apply
"Route our delivery vehicles through the city" — time windows, service durations, and a hard return deadline
"Plan our factory floor for the next shift" — machines cannot share jobs; precedence cannot be violated

A language model is extraordinarily good at the surrounding work: understanding the request, pulling relevant context, communicating the result. The inner loop — "given these constraints, what is the optimal assignment?" — is where mathematical solvers are decisive.

Ferrox makes those solvers available to any Converge Formation, with confidence scores that tell the Formation exactly how trustworthy each answer is.

Formations and Suggestors

Converge is an open Agent OS built around two primitives:

Suggestor — an agent that reads facts from a shared context and proposes new facts, tagged with a confidence score. Any number of Suggestors can run against the same context simultaneously.

Formation — a group of Suggestors registered in a single Engine. The Engine runs all accepting Suggestors, collects their proposals, and lets consumers pick the highest-confidence answer — or compare all of them.

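Ferrox contributes Suggestors that compete on provable quality. For the scheduling and routing problem classes, both a fast heuristic and an optimal, solver-backed implementation are available; the generic LP, MIP, and CP-SAT Suggestors are solver-only: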

Problem class	Fast Suggestor	Confidence	Optimal Suggestor	Confidence
Task scheduling (MAATW)	`GreedySchedulerSuggestor`	≤ 0.65	`CpSatSchedulerSuggestor`	≤ 1.0
Job Shop scheduling	`GreedyJobShopSuggestor`	≤ 0.55	`CpSatJobShopSuggestor`	≤ 1.0
Vehicle routing (VRPTW)	`NearestNeighborSuggestor`	≤ 0.60	`CpSatVrptwSuggestor`	≤ 1.0
Linear programs	—	—	`GlopLpSuggestor`	≤ 1.0
Mixed-integer programs	—	—	`HighsMipSuggestor`	≤ 1.0
General CP-SAT	—	—	`CpSatSuggestor`	≤ 1.0

The greedy Suggestor answers in microseconds. The solver Suggestor runs in parallel and either proves the greedy answer was optimal, or beats it. The Formation selects by confidence — no orchestration code required.
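Selection by confidence can be pictured as a simple maximum over the proposals in context. The sketch below is illustrative only: the `Proposal` struct, its fields, and `best_proposal` are hypothetical names, not the Converge API.

```rust
/// A proposed fact, reduced to the fields needed for ranking.
/// Field names are hypothetical, not taken from the Converge API.
#[derive(Debug, Clone, PartialEq)]
pub struct Proposal {
    pub key: String,
    pub suggestor: String,
    pub confidence: f64,
}

/// Returns the proposal with the highest finite confidence, if any.
pub fn best_proposal(proposals: &[Proposal]) -> Option<&Proposal> {
    proposals
        .iter()
        .filter(|p| p.confidence.is_finite())
        .max_by(|a, b| a.confidence.total_cmp(&b.confidence))
}

/// Convenience constructor for building example proposals.
pub fn proposal(key: &str, suggestor: &str, confidence: f64) -> Proposal {
    Proposal {
        key: key.to_string(),
        suggestor: suggestor.to_string(),
        confidence,
    }
}
```

Given a greedy plan at 0.15 and a CP-SAT plan at 0.40, `best_proposal` returns the CP-SAT plan; with no proposals it returns `None`.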

How confidence works

optimal solution found  →  confidence = visit_ratio  (1.0 if all tasks scheduled)
feasible but not proven →  confidence = visit_ratio × 0.85
infeasible or error     →  confidence = 0.0
greedy heuristic        →  confidence = throughput_ratio × cap  (cap ≤ 0.65)

A greedy plan capped at 0.65 will always yield to a proven optimal plan that schedules everything (1.0). If the solver times out with a feasible-but-not-proven plan (visit_ratio × 0.85, so at most 0.85), the Formation can still use it, and the lower score signals the remaining uncertainty.
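The confidence rules above translate directly into two small functions. This is a sketch of the stated formulas, not Ferrox's source: `SolveStatus`, `solver_confidence`, and `greedy_confidence` are hypothetical names.

```rust
/// Outcome reported by a solver run (illustrative, not Ferrox's own enum).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SolveStatus {
    Optimal,
    Feasible,
    Infeasible,
    Error,
}

/// Confidence for a solver-backed plan.
/// `visit_ratio` is scheduled items divided by requested items.
pub fn solver_confidence(status: SolveStatus, visit_ratio: f64) -> f64 {
    let ratio = visit_ratio.clamp(0.0, 1.0);
    match status {
        SolveStatus::Optimal => ratio,
        SolveStatus::Feasible => ratio * 0.85,
        SolveStatus::Infeasible | SolveStatus::Error => 0.0,
    }
}

/// Confidence for a greedy plan: throughput ratio scaled by a
/// per-heuristic cap that never exceeds 0.65.
pub fn greedy_confidence(throughput_ratio: f64, cap: f64) -> f64 {
    throughput_ratio.clamp(0.0, 1.0) * cap.min(0.65)
}
```

Applied to the vehicle routing benchmark figures, an optimal plan visiting 8 of 20 customers scores 0.40, and a nearest-neighbour plan visiting 5 of 20 with a cap of 0.60 scores 0.15.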

Benchmarks

Multi-Agent Task Assignment with Time Windows

60 tasks · 12 specialist agents · 5 skills · 360 min horizon

GreedySchedulerSuggestor   56 / 60 tasks   93.3%    0.03 ms   confidence 0.60
CpSatSchedulerSuggestor    60 / 60 tasks  100.0%     260 ms   confidence 1.00  ← optimal

Job Shop Scheduling

15 jobs × 10 machines — Taillard-style instance

GreedyJobShopSuggestor     makespan 2038    0.3 ms   confidence 0.55
CpSatJobShopSuggestor      makespan 1044   30.0 s    confidence 0.85  ← feasible (48.8% improvement)

Vehicle Routing with Time Windows

20 customers — Solomon-style instance · depot at (50, 50) · horizon 480 min

NearestNeighborSuggestor    5 / 20 customers   < 0.1 ms   confidence 0.15
CpSatVrptwSuggestor         8 / 20 customers     4.9 s    confidence 0.40  ← optimal (+60%)
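The percentages quoted in the benchmark output can be reproduced from the raw figures. The two helpers below are hypothetical and exist only to show the arithmetic.

```rust
/// Relative reduction of a cost metric (lower is better), in percent.
pub fn reduction_pct(baseline: f64, improved: f64) -> f64 {
    (baseline - improved) / baseline * 100.0
}

/// Relative gain of a throughput metric (higher is better), in percent.
pub fn gain_pct(baseline: f64, improved: f64) -> f64 {
    (improved - baseline) / baseline * 100.0
}
```

With the job shop makespans, `reduction_pct(2038.0, 1044.0)` is about 48.8; with the routing results, `gain_pct(5.0, 8.0)` is 60.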

Four Business Flows

Each flow below shows how LLMs, Cedar policy, Knapsack/MIP, and constraint solvers work as peers in a Formation. No single technology handles the full decision. Each does what it is actually good at.

Flow 1 — Investment Portfolio Allocation

Scenario: A fund manager wants to deploy €50 M across a shortlist of projects. Each project has a return estimate, a risk score, a sector, and a minimum ticket size. Regulations prohibit concentrating more than 40 % of capital in any single sector. ESG policy requires at least three sustainable projects in the portfolio.

Step	Actor	Role
1	LLM Suggestor	Reads analyst notes and CRM history. Writes a narrative summary of each candidate to `ContextKey::Seeds`. Tags which candidates are flagged as ESG-eligible.
2	Cedar policy	Enforces hard regulatory rules — sector concentration cap, minimum ticket, excluded geographies. Any candidate that violates policy is removed from context before solvers see it.
3	`HighsMipSuggestor`	Formulates a binary knapsack: select projects to maximise expected return subject to total capital ≤ €50 M, sector caps, and ESG count ≥ 3. Returns the optimal portfolio with proven optimality gap.
4	LLM Suggestor	Reads the optimal portfolio from `ContextKey::Strategies`. Drafts the investment committee memo, explains the tradeoffs, and flags any candidates that were close to inclusion.

Why the solver, not the LLM, picks the portfolio: A shortlist of 30 candidates has 2^30 (about a billion) possible portfolios. A MIP solver does not enumerate them; branch-and-cut prunes the space, typically in seconds, and proves that no better combination exists. An LLM cannot. It will produce a plausible-sounding list that may miss €2 M of return and violate a sector cap it failed to track.
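To make the model in step 3 concrete, here is a toy version of the selection problem solved by exhaustive search. It is not how `HighsMipSuggestor` works (a MIP solver replaces the loop), and several things are assumptions: the `Project` fields, integer units, the demo data, and reading the 40 % rule as an absolute cap per sector (€20 M of a €50 M budget).

```rust
/// A candidate project. Fields and units are illustrative assumptions.
#[derive(Debug, Clone)]
pub struct Project {
    pub ticket_m: u32, // capital required, in EUR millions
    pub ret: u32,      // expected return, arbitrary integer units
    pub sector: usize, // sector index, 0-based
    pub esg: bool,     // flagged as ESG-eligible
}

/// Tries every subset and returns (selected indices, total return).
/// Only suitable for tiny inputs.
pub fn best_portfolio(
    projects: &[Project],
    budget_m: u32,
    sector_cap_m: u32,
    min_esg: usize,
) -> Option<(Vec<usize>, u32)> {
    assert!(projects.len() <= 20, "exhaustive search is for tiny inputs only");
    let sectors = projects.iter().map(|p| p.sector + 1).max().unwrap_or(0);
    let mut best: Option<(u32, u32)> = None; // (subset bitmask, total return)
    for mask in 0u32..(1u32 << projects.len()) {
        let mut total = 0u32;
        let mut per_sector = vec![0u32; sectors];
        let mut esg_count = 0usize;
        let mut ret = 0u32;
        for (i, p) in projects.iter().enumerate() {
            if ((mask >> i) & 1) == 1 {
                total += p.ticket_m;
                per_sector[p.sector] += p.ticket_m;
                esg_count += p.esg as usize;
                ret += p.ret;
            }
        }
        let feasible = total <= budget_m
            && per_sector.iter().all(|&s| s <= sector_cap_m)
            && esg_count >= min_esg;
        if feasible && best.map_or(true, |(_, r)| ret > r) {
            best = Some((mask, ret));
        }
    }
    best.map(|(mask, ret)| {
        let picked: Vec<usize> = (0..projects.len())
            .filter(|&i| ((mask >> i) & 1) == 1)
            .collect();
        (picked, ret)
    })
}

/// Six made-up candidates across three sectors.
pub fn demo_projects() -> Vec<Project> {
    let p = |ticket_m, ret, sector, esg| Project { ticket_m, ret, sector, esg };
    vec![
        p(20, 30, 0, false),
        p(15, 20, 0, true),
        p(10, 12, 1, true),
        p(10, 11, 2, true),
        p(15, 25, 1, false),
        p(5, 4, 2, true),
    ]
}
```

On the demo data, `best_portfolio(&demo_projects(), 50, 20, 3)` selects projects 1, 3, 4, and 5 for a total return of 60. Picking the two highest-return projects first (0 and 4) leaves no feasible way to reach three ESG projects, which is the kind of interaction a list ranked by return alone would miss.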

Flow 2 — Field Service Crew Scheduling

Scenario: A utilities company has 40 field technicians and 180 work orders for the coming week. Each work order requires a specific certification, has a customer-committed time window, and a service duration. Technicians have different skill sets, working hours, and geographic zones. Labour agreements cap overtime.

Step	Actor	Role
1	LLM Suggestor	Parses the incoming work orders from unstructured emails and PDFs. Extracts customer, location, window, skill requirement, and priority. Seeds `scheduling-request:week-42` into context.
2	Cedar policy	Enforces labour agreement rules — no technician works more than 10 hours, no consecutive overnight shifts, union jurisdiction by zone. Removes violations before the scheduler runs.
3	`GreedySchedulerSuggestor`	Runs EDF + earliest-available in < 1 ms. Immediately seeds a baseline plan. Confidence ≤ 0.65.
3	`CpSatSchedulerSuggestor`	Runs in parallel. Finds the maximum number of work orders that can be scheduled within all constraints. Returns optimal (or feasible) plan with proven gap. Confidence ≤ 1.0.
4	LLM Suggestor	Takes the CP-SAT plan from context. Writes the technician briefing emails, drafts customer notifications for unscheduled orders, and suggests overflow options.

Why this cannot be done with an LLM alone: A 40-person, 180-job scheduling problem with time windows is NP-hard. The LLM would produce a schedule that looks reasonable but misses 20–30 jobs a skilled human planner or solver would have fit. The CP-SAT model proves the maximum achievable.
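The greedy baseline in step 3 is described as EDF (earliest deadline first) plus earliest-available agent. A minimal sketch of that idea follows; the types, the tie-breaking by id, and the single-skill-per-task model are assumptions, not Ferrox's implementation.

```rust
#[derive(Debug, Clone)]
pub struct Task {
    pub id: usize,
    pub skill: u8,
    pub release: u32,  // earliest start, minutes
    pub deadline: u32, // latest finish, minutes
    pub duration: u32, // service time, minutes
}

#[derive(Debug, Clone)]
pub struct Agent {
    pub id: usize,
    pub skills: Vec<u8>,
    pub free_at: u32, // minute at which the agent is next available
}

#[derive(Debug, Clone, PartialEq)]
pub struct Assignment {
    pub task: usize,
    pub agent: usize,
    pub start: u32,
}

/// Visits tasks in deadline order and gives each one to the qualified
/// agent that becomes free first. Tasks that cannot finish inside
/// their window are skipped, never revisited.
pub fn edf_schedule(tasks: &[Task], agents: &mut [Agent]) -> Vec<Assignment> {
    let mut order: Vec<&Task> = tasks.iter().collect();
    order.sort_by_key(|t| (t.deadline, t.id));
    let mut plan = Vec::new();
    for t in order {
        let candidate = agents
            .iter_mut()
            .filter(|a| a.skills.contains(&t.skill))
            .min_by_key(|a| (a.free_at, a.id));
        if let Some(agent) = candidate {
            let start = agent.free_at.max(t.release);
            if start + t.duration <= t.deadline {
                agent.free_at = start + t.duration;
                plan.push(Assignment { task: t.id, agent: agent.id, start });
            }
        }
    }
    plan
}
```

Because a skipped task is never reconsidered and no assignment is ever undone, the result is a fast baseline rather than a maximum; that gap is what the CP-SAT Suggestor is meant to close.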

Flow 3 — Multi-Stop Delivery Routing

Scenario: A logistics operator runs a same-day delivery fleet. At 9 AM, 60 new delivery requests arrive with pick-up windows, drop-off windows, and service times. Each vehicle must return to the depot by 6 PM. The objective is to maximise deliveries completed; cost per vehicle is fixed so maximising throughput maximises margin.

Step	Actor	Role
1	LLM Suggestor	Reads customer messages, extracts delivery addresses, time preferences, and special instructions (fragile, signature required). Geocodes addresses. Seeds `vrptw-request:2026-04-22` into context.
2	Cedar policy	Applies driver hours-of-service rules, vehicle payload limits, and restricted delivery zones. Flags any delivery that cannot legally be served and writes that back to context before routing.
3	`NearestNeighborSuggestor`	Runs in < 1 ms. Provides an instant baseline route for dispatch visibility.
3	`CpSatVrptwSuggestor`	Runs in parallel. Uses `AddCircuit` + time-window propagation to find the route that maximises customers visited while respecting all windows and the return deadline. Returns proven-optimal (or best-found-feasible) route.
4	LLM Suggestor	Reads the optimal route from context. Generates turn-by-turn driver instructions, customer ETA notifications, and a capacity summary for operations.

Why routing is not a prompt: VRPTW is one of the canonical NP-hard combinatorial problems in operations research. On the tight-window benchmark above, CP-SAT serves 60 % more customers than greedy nearest-neighbour (8 versus 5 of 20). Those missed deliveries are missed revenue and broken SLAs.
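A small nearest-neighbour sketch shows why the greedy baseline drops customers when windows are tight. It assumes a single vehicle leaving the depot at time 0 and travel time equal to Euclidean distance; the `Stop` type and function names are hypothetical, and this is not the `NearestNeighborSuggestor` source.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Stop {
    pub x: f64,
    pub y: f64,
    pub ready: f64,   // earliest service start
    pub due: f64,     // latest service start
    pub service: f64, // service duration
}

fn dist(a: (f64, f64), b: (f64, f64)) -> f64 {
    ((a.0 - b.0).powi(2) + (a.1 - b.1).powi(2)).sqrt()
}

/// Repeatedly drives to the closest unvisited stop whose window can
/// still be met and from which the depot is reachable by `horizon`.
/// Returns the visited stop indices in order.
pub fn nearest_neighbor_route(depot: (f64, f64), horizon: f64, stops: &[Stop]) -> Vec<usize> {
    let mut visited = vec![false; stops.len()];
    let mut route = Vec::new();
    let (mut pos, mut now) = (depot, 0.0_f64);
    loop {
        let mut best: Option<(usize, f64, f64)> = None; // (index, travel, departure)
        for (i, s) in stops.iter().enumerate() {
            if visited[i] {
                continue;
            }
            let here = (s.x, s.y);
            let travel = dist(pos, here);
            let begin = (now + travel).max(s.ready); // wait if early
            let depart = begin + s.service;
            let feasible = begin <= s.due && depart + dist(here, depot) <= horizon;
            if feasible && best.map_or(true, |(_, d, _)| travel < d) {
                best = Some((i, travel, depart));
            }
        }
        match best {
            Some((i, _, depart)) => {
                visited[i] = true;
                route.push(i);
                pos = (stops[i].x, stops[i].y);
                now = depart;
            }
            None => return route,
        }
    }
}
```

With a depot at (50, 50) and three stops where stop 1 must be reached by minute 12, the heuristic drives to the closer stop 0 first and can no longer reach stop 1 in time, so it returns `[0, 2]`. Visiting in the order 1, 0, 2 would have served all three, which is the kind of ordering a solver can find and a distance-only rule cannot.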

Flow 4 — Factory Production Scheduling

Scenario: A precision manufacturer runs a 10-machine job shop. Each evening, a new batch of 15–30 jobs arrives, each requiring a fixed sequence of machining operations. No two jobs can occupy the same machine simultaneously. The target is to minimise makespan — finishing the batch as early as possible to free capacity for the next shift.

Step	Actor	Role
1	LLM Suggestor	Reads the ERP export and production notes. Identifies rush jobs, quality-hold items, and maintenance windows. Seeds `jspbench-request:shift-evening` into context with a structured `JobShopRequest`.
2	Cedar policy	Enforces maintenance windows (machine M03 offline 22:00–23:00), operator certification requirements for certain operations, and priority overrides for rush orders. Modifies the request in context accordingly.
3	`GreedyJobShopSuggestor`	SPT list scheduling in < 1 ms. Provides an immediate baseline for the floor supervisor screen. Confidence 0.55.
3	`CpSatJobShopSuggestor`	CP-SAT interval variables + `NoOverlap` per machine. Proven minimum makespan. Confidence 1.0 on optimal, 0.85 if time budget exhausted. On the 15×10 benchmark: 48.8 % shorter than greedy.
4	LLM Suggestor	Takes the optimal schedule from context. Generates shift handover notes, machine loading reports, and flags if any rush jobs have been delayed beyond their committed window.

Why the floor supervisor needs the solver, not just the LLM: A job shop with 15 jobs and 10 machines has more valid orderings than atoms in the observable universe. Greedy SPT gets you to the floor faster. CP-SAT gets you out of the factory 49 % sooner. On a three-shift operation, that difference compounds into days of recovered capacity per month.
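SPT (shortest processing time) list scheduling, the greedy baseline named in step 3, can be sketched in a few lines. The representation of a job as a list of (machine, duration) pairs and the tie-break by job index are assumptions made for this example.

```rust
/// One operation: (machine index, duration).
pub type Op = (usize, u32);

/// Among the next pending operation of every job, repeatedly schedules
/// the shortest one at the earliest time both its job and its machine
/// are free. Returns the resulting makespan.
pub fn spt_makespan(jobs: &[Vec<Op>], machines: usize) -> u32 {
    let mut next = vec![0usize; jobs.len()]; // next operation per job
    let mut job_ready = vec![0u32; jobs.len()];
    let mut machine_ready = vec![0u32; machines];
    loop {
        let pick = (0..jobs.len())
            .filter(|&j| next[j] < jobs[j].len())
            .min_by_key(|&j| (jobs[j][next[j]].1, j));
        let j = match pick {
            Some(j) => j,
            None => break, // every operation is scheduled
        };
        let (m, d) = jobs[j][next[j]];
        let start = job_ready[j].max(machine_ready[m]);
        job_ready[j] = start + d;
        machine_ready[m] = start + d;
        next[j] += 1;
    }
    job_ready.into_iter().max().unwrap_or(0)
}
```

For two jobs, `[(0, 3), (1, 2)]` and `[(0, 2), (1, 4)]`, on two machines, this rule yields a makespan of 11. A schedule with makespan 8 exists (run the second job's long operation on machine 1 before the first job's), so even this tiny instance shows the gap that an exact solver closes.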

Solvers

Library	Version	Algorithm	Best for
Google OR-Tools CP-SAT	9.15	DPLL(T) + LNS + clause learning	Scheduling, routing, combinatorial assignment
HiGHS	1.14	Revised simplex + branch-and-cut	LP relaxations, pure MIP, capital allocation

Both are compiled from source and linked statically into the gRPC server. No external services, no API calls, no rate limits.

Architecture

┌──────────────────────────────────────────────────────────────┐
│  Converge Formation (Engine)                                 │
│                                                              │
│  ContextKey::Seeds                                           │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  vrptw-request:run-001    { depot, customers, ... }  │    │
│  └─────────────────────────────────────────────────────┘    │
│                          │                                   │
│           ┌──────────────┼──────────────┐                   │
│           ▼                             ▼                   │
│  NearestNeighborSuggestor      CpSatVrptwSuggestor          │
│  (sub-ms, confidence 0.15)     (seconds, confidence 0.40)   │
│           │                             │                   │
│           ▼                             ▼                   │
│  ContextKey::Strategies                                      │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  vrptw-plan-greedy:run-001   { route: [...] }        │    │
│  │  vrptw-plan-cpsat:run-001    { route: [...] }        │    │
│  └─────────────────────────────────────────────────────┘    │
│                          │                                   │
│           ▼ (highest confidence wins)                        │
│  LLM Suggestor reads vrptw-plan-cpsat:run-001               │
│  → drafts driver instructions, customer ETAs                │
└──────────────────────────────────────────────────────────────┘

Each Suggestor writes to a solver-prefixed key so all plans coexist. Downstream consumers select by confidence score. The LLM never sees raw constraint data — it sees structured, solved plans.
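The solver-prefixed keys in the diagram follow a visible pattern, `<problem>-plan-<solver>:<run-id>`. The helpers below build and parse such keys; they are hypothetical and only mirror the naming shown above, not a documented Ferrox API.

```rust
/// Builds a plan key such as "vrptw-plan-cpsat:run-001".
pub fn plan_key(problem: &str, solver: &str, run_id: &str) -> String {
    format!("{problem}-plan-{solver}:{run_id}")
}

/// Extracts the solver part of a plan key, or None if the key
/// does not follow the "<problem>-plan-<solver>:<run-id>" pattern.
pub fn solver_of(key: &str) -> Option<&str> {
    let (prefix, _run_id) = key.split_once(':')?;
    let (_problem, solver) = prefix.split_once("-plan-")?;
    Some(solver)
}
```

Because the solver name is part of the key, the greedy and CP-SAT plans for the same run never overwrite each other, and a consumer can tell them apart without inspecting the payload.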

Running the Showcases

Build the C++ solver libraries first:

make all          # builds OR-Tools v9.15 and HiGHS v1.14 from source

Then run any showcase:

just example-maatw      # Multi-Agent Task Assignment with Time Windows
just example-jspbench   # Job Shop Scheduling (15 jobs × 10 machines)
just example-vrptw      # Vehicle Routing with Time Windows (20 customers)
just example-cp         # Sudoku via CP-SAT (generic CpSatSuggestor)
just example-mip        # Capital allocation via HiGHS MIP

Each example registers its Suggestors in a Formation and runs them. Where both a greedy and an optimal Suggestor exist for the problem class, the example prints the quality comparison with confidence scores.

gRPC Server

Ferrox ships a production-ready gRPC server that exposes all Suggestors over the network. Any Converge Formation can call it as a Provider.

just server             # local, no TLS
just up                 # Docker Compose with mTLS

Authentication via `Authorization: Bearer <token>` (set `FERROX_AUTH_TOKEN`). TLS certificates live in `./tls/`; generate dev certs with `just tls-dev-certs`.
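For readers unfamiliar with bearer authentication, the check amounts to parsing the header value and comparing the token with the configured secret. The sketch below is a generic illustration using only the standard library; the function names are hypothetical, and it does not show how `ferrox-server` implements the check.

```rust
/// Extracts the token from an `Authorization` header value such as
/// "Bearer abc123". The scheme is matched case-insensitively.
pub fn bearer_token(header_value: &str) -> Option<&str> {
    let (scheme, rest) = header_value.trim().split_once(' ')?;
    let token = rest.trim();
    if scheme.eq_ignore_ascii_case("Bearer") && !token.is_empty() {
        Some(token)
    } else {
        None
    }
}

/// Compares two byte strings without exiting early on the first
/// mismatch, so timing does not reveal how many bytes matched.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Returns true only if a bearer token is present and matches
/// the expected secret. An empty secret never authorises anyone.
pub fn is_authorized(header_value: Option<&str>, expected: &str) -> bool {
    match header_value.and_then(bearer_token) {
        Some(token) => {
            !expected.is_empty() && constant_time_eq(token.as_bytes(), expected.as_bytes())
        }
        None => false,
    }
}
```

In a server, `expected` would typically come from `std::env::var("FERROX_AUTH_TOKEN")` read once at start-up.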

Adding a Suggestor

Implement the `Suggestor` trait from `converge-pack`:

#[async_trait]
impl Suggestor for MyCustomSuggestor {
    fn name(&self) -> &str { "MyCustomSuggestor" }

    fn dependencies(&self) -> &[ContextKey] { &[ContextKey::Seeds] }

    fn complexity_hint(&self) -> Option<&'static str> {
        Some("O(n log n) — describe what this costs and what scale it handles")
    }

    fn accepts(&self, ctx: &dyn Context) -> bool {
        ctx.get(ContextKey::Seeds).iter().any(|f| f.id.starts_with("my-request:"))
    }

    async fn execute(&self, ctx: &dyn Context) -> AgentEffect {
        // read from Seeds, compute, write to Strategies
        AgentEffect::with_proposals(vec![...])
    }
}

Register it in an Engine:

let mut engine = Engine::new();
engine.register_suggestor(GreedySuggestor);
engine.register_suggestor(MyCustomSuggestor);   // competes on the same seeds

The Formation handles concurrency, confidence ranking, and fact deduplication.

Project Layout

crates/
  ferrox/               Core library — all Suggestors and problem types
    src/
      scheduling/       MAATW — task assignment with time windows
      jobshop/          JSP  — job shop scheduling (N jobs, M machines)
      vrptw/            VRPTW — vehicle routing with time windows
      cp/               Generic CP-SAT Suggestor (any CpSatRequest)
      lp/               Generic LP Suggestor (GLOP)
      mip/              Generic MIP Suggestor (HiGHS)
  ferrox-server/        gRPC server (TLS, auth, Docker)
  ortools-sys/          Rust FFI to OR-Tools CP-SAT
  highs-sys/            Rust FFI to HiGHS

examples/
  maatw/                Formation demo: task scheduling
  jspbench/             Formation demo: job shop
  vrptw/                Formation demo: vehicle routing
  cp_sudoku/            Formation demo: sudoku via generic CP-SAT
  highs_mip/            Formation demo: capital allocation via MIP

proto/                  Protobuf definitions (ferrox.v1)
vendor/
  ortools/              OR-Tools v9.15 source
  highs/                HiGHS v1.14 source

Why Rust

Rust gives Ferrox zero-copy FFI to C++ solver libraries with no garbage-collection pauses, no JVM warm-up, and no Python GIL. The OR-Tools and HiGHS bindings call directly into the solver libraries, which are compiled from source and linked into the binary. An end-to-end Formation run, from seed to plan, adds no observable latency beyond what the solver itself takes.

The `unsafe` keyword does not appear in `ferrox` library code. All C boundary code lives in `ortools-sys` and `highs-sys`, wrapped in safe Rust APIs before any Suggestor touches it.
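The pattern of confining `unsafe` to a sys layer and exposing a safe wrapper looks roughly like this. Everything here is a stand-in: `mock_sys` is implemented in Rust so the sketch runs without linking a solver, and none of the names come from `ortools-sys` or `highs-sys`.

```rust
use std::ptr::NonNull;

/// Stand-in for a `*-sys` crate. A real one would declare
/// `extern "C"` functions here instead of Rust bodies.
mod mock_sys {
    pub struct RawModel {
        pub upper_bounds: Vec<i64>,
    }

    /// Allocates a model; the caller owns the pointer and must free it once.
    pub unsafe fn model_new() -> *mut RawModel {
        Box::into_raw(Box::new(RawModel { upper_bounds: Vec::new() }))
    }

    /// Adds a variable and returns its index. `m` must be a live pointer.
    pub unsafe fn model_add_var(m: *mut RawModel, upper_bound: i64) -> usize {
        let model = unsafe { &mut *m };
        model.upper_bounds.push(upper_bound);
        model.upper_bounds.len() - 1
    }

    /// Frees a model previously returned by `model_new`.
    pub unsafe fn model_free(m: *mut RawModel) {
        drop(unsafe { Box::from_raw(m) });
    }
}

/// Safe owner of a raw model handle. Callers never see the pointer.
pub struct Model {
    raw: NonNull<mock_sys::RawModel>,
}

impl Model {
    pub fn new() -> Self {
        // SAFETY: model_new has no preconditions; null is checked below.
        let ptr = unsafe { mock_sys::model_new() };
        Model { raw: NonNull::new(ptr).expect("solver returned a null model") }
    }

    pub fn add_var(&mut self, upper_bound: i64) -> usize {
        // SAFETY: self.raw stays live until Drop, and &mut self rules out aliasing.
        unsafe { mock_sys::model_add_var(self.raw.as_ptr(), upper_bound) }
    }
}

impl Drop for Model {
    fn drop(&mut self) {
        // SAFETY: the pointer came from model_new and is freed exactly once.
        unsafe { mock_sys::model_free(self.raw.as_ptr()) }
    }
}
```

Code that uses `Model` needs no `unsafe` at all: ownership, aliasing, and cleanup are enforced by the wrapper's types and its `Drop` implementation.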
