POUNCE v0.3.0
Added — Multiple-minima & critical-point global search (PR #94)
pounce.find_minima(fun, x0, n_minima=..., method=...) returns several
distinct local minima from a single call. Methods: flooding /
deflation (add a repulsive Gaussian / pole "hump" — with analytic
gradient and Hessian — at each found minimum and re-solve), multistart,
mlsl, and basinhopping. Anisotropic auto bump widths and a
curvature-derived auto amplitude; Hessian-based saddle rejection; global
box restarts; bounds and constraints pass through minimize untouched.
The six-hump-camel demo recovers all six minima.
- pounce.find_critical_points / pounce.find_saddles: stationary
points via the squared-gradient merit ½‖∇f‖², classified by Morse
index from the Hessian eigenvalues (degenerate / non-Morse points are
flagged, not mislabeled); eigenvector-following saddle search with
box-clipped steps.
- pounce.reaction_network: minima, transition states, and barriers
(Müller–Brown example).
- Robustness: non-finite candidates and objectives are rejected before
acceptance; the de-duplication metric is the same per-dimension scaled
distance across the minima and saddle routes.
- Examples (gaussian_hump_minima.py, critical_points.py,
reaction_barrier.py), notebooks 15–17, docs (find-minima.md,
find-minima-choosing.md), and 29 unit tests.
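
The flooding / deflation idea can be illustrated on a 1-D double well. This is a minimal sketch, not the pounce.find_minima implementation: the fixed-step gradient descent, the hump amplitude and width, and every function name below are assumptions chosen for illustration.

```python
import math

def f(x):
    """Double-well objective with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

def grad_f(x):
    return 4.0 * x * (x * x - 1.0)

def hump_grad(x, center, amplitude, width):
    """Analytic gradient of amplitude * exp(-(x - center)^2 / (2 width^2))."""
    d = x - center
    return -amplitude * d / width**2 * math.exp(-d * d / (2.0 * width**2))

def descend(grad, x, step=0.05, iters=5000):
    """Plain fixed-step gradient descent (stand-in for a real local solver)."""
    for _ in range(iters):
        x -= step * grad(x)
    return x

def find_minima_by_deflation(x0, n_minima, amplitude=2.0, width=0.5, tol=1e-3):
    found = []
    for _ in range(n_minima):
        # Deflated gradient: original gradient plus one hump per known minimum.
        def deflated_grad(x):
            return grad_f(x) + sum(hump_grad(x, c, amplitude, width) for c in found)
        x = descend(deflated_grad, x0)
        # Polish on the original objective so the hump does not bias the result.
        x = descend(grad_f, x)
        if all(abs(x - c) > tol for c in found):
            found.append(x)
    return sorted(found)

minima = find_minima_by_deflation(x0=0.5, n_minima=2)
```

Both solves start from the same x0; the second one is pushed out of the first basin by the hump placed at the minimum already found.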
Fixed
- Acceptable-point termination now rejects a non-finite objective. A
near-feasible iterate whose objective evaluates to NaN/Inf (e.g. CUTE
himmelbj) exits Invalid_Number_Detected instead of a spurious
Solved_To_Acceptable_Level carrying a nan objective; the benchmark
driver's objective scrape is null-safe to match.
- No spurious jacobian() call on unconstrained problems
(pounce-py). eval_jac_g short-circuits when there are no constraint
entries (mirroring the Hessian guard), so the unconstrained minimize
facade, which legitimately omits jacobian, no longer logs an
AttributeError at ERROR level on every iteration.
Benchmarks & docs
- The benchmark suite now runs single-threaded and sequential by
default (OMP/VECLIB/RAYON=1) so POUNCE and Ipopt solve times are
directly comparable on one host; the ipopt-ma57 reference was
regenerated and BENCHMARK_REPORT.md carries a threading/timing note
(POUNCE's faer/rayon dense linear algebra is up to ~2× faster
multi-threaded, so the single-threaded numbers are a controlled lower
bound).
- Example notebooks re-executed against the current solver and corrected
where the prose had drifted: warm-starting now demonstrates the
mu_init + tight warm_start_*_bound_push tuning that actually cuts
iterations (11 → 5 on HS071); the scaling example uses a 1e12
constraint where gradient scaling visibly helps (31 → 11 iters); the
FBBT notebook shows the domain-safety and false-infeasibility wins.
- mdBook restyled in the POUNCE tiger / cream brand palette.
Added — Inverse-map ODE recipe over a sensitivity RHS (pounce#91)
pounce.jax.inverse_map_rhs(jp, dy_ds, *, output=None, x0=None) builds
the right-hand side of the Alves–Kitchin–Lima inverse / uncertainty
mapping ODE (pounce#84, Eq. 3):
dθ/ds = (∂y/∂θ)^{-1} · dy/ds
where y = output(x*(θ), θ) is an output of the embedded optimizer.
POUNCE supplies the RHS; an off-the-shelf adaptive integrator (diffrax,
scipy) does the stepping — no NLP inversion.
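
The structure of the recipe can be seen without an optimizer in the loop. The sketch below is not pounce.jax.inverse_map_rhs: it assumes a hypothetical analytic 2-D map in place of output(x*(θ), θ), a hand-written 2×2 solve, and a fixed-step RK4 integrator, and it traces a closed circle in output space.

```python
import math

def output(theta):
    """Toy smooth output map y(theta); stands in for output(x*(theta), theta)."""
    t0, t1 = theta
    return (t0 + 0.1 * t1 * t1, t1 + 0.1 * t0 * t0)

def output_jacobian(theta):
    """Analytic dy/dtheta for the toy map (2x2, row-major)."""
    t0, t1 = theta
    return ((1.0, 0.2 * t1), (0.2 * t0, 1.0))

def solve2(a, b):
    """Solve the 2x2 linear system a @ z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det)

RADIUS = 0.2

def dy_ds(s):
    """Tangent of the closed path y(s) = y0 + R*(cos(2 pi s) - 1, sin(2 pi s))."""
    w = 2.0 * math.pi
    return (-RADIUS * w * math.sin(w * s), RADIUS * w * math.cos(w * s))

def rhs(s, theta):
    """Inverse-map RHS: dtheta/ds = (dy/dtheta)^{-1} dy/ds, as a linear solve."""
    return solve2(output_jacobian(theta), dy_ds(s))

def rk4(theta, s0, s1, steps):
    h = (s1 - s0) / steps
    s = s0
    for _ in range(steps):
        k1 = rhs(s, theta)
        k2 = rhs(s + h / 2, tuple(t + h / 2 * k for t, k in zip(theta, k1)))
        k3 = rhs(s + h / 2, tuple(t + h / 2 * k for t, k in zip(theta, k2)))
        k4 = rhs(s + h, tuple(t + h * k for t, k in zip(theta, k3)))
        theta = tuple(t + h / 6 * (a + 2 * b + 2 * c + d)
                      for t, a, b, c, d in zip(theta, k1, k2, k3, k4))
        s += h
    return theta

theta_start = (0.3, 0.2)
y_start = output(theta_start)
theta_half = rk4(theta_start, 0.0, 0.5, 200)
theta_full = rk4(theta_half, 0.5, 1.0, 200)
y_half, y_full = output(theta_half), output(theta_full)
```

Mapping the integrated θ back through output reproduces the prescribed path, which is the same round-trip check the worked example performs through the optimizer.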
- The inverse map is a linear solve against the total output
sensitivity ∂y/∂θ = (∂h/∂x) J + ∂h/∂θ (with J = ∂x*/∂θ from
the held factor), not a Jacobian-vector product, so it wants the full
J and jnp.linalg.solve.
- The whole evaluation rides one jax.pure_callback, so the RHS is
JAX-traceable and composes under jax.jit and diffrax (which
jit-compiles the vector field).
- Worked example python/examples/inverse_map_diffrax.py integrates a
closed output boundary with diffrax Dopri5 and round-trips back through
the optimizer onto the boundary (~1e-7). diffrax is an optional
extra (pip install pounce[diffrax]); the example falls back to RK4
if it's absent.
- inverse_map_rhs(..., warm=True) warm-starts each inner solve from the
previous evaluation's primal/duals/μ (pounce#86). Result-invariant (up
to solver tolerance); a modest lever: ~1.4-1.7× fewer IPM iterations,
~1.3× wall-clock, roughly flat in problem size (interior-point
warm-start ceiling + per-eval Jacobian-build overhead). Benchmark:
python/benchmarks/inverse_map_warm.py. For a real speedup on a smooth
map, prefer PathFollower (it skips solves, not just cheapens them).
- Switch to PathFollower when the path folds or the active set changes.
- Worked notebook python/notebooks/14_path_following.ipynb tours the
whole family (sensitivity → margin → continuation → fold → inverse map).
Also fixes the build-once / stacked path for unconstrained problems
(g=None, m=0): the constraint callbacks no longer dereference the
(None) constraint-Jacobian jit, so JaxProblem.solve /
solve_with_jacobian / the batched solves now work with no constraints.
Added — Predictor–corrector path-following engine (pounce#90)
pounce.jax.PathFollower traces a solution path of a parametric NLP by
composing the post-solve sensitivity primitives instead of re-solving
at every step:
    from pounce.jax import PathFollower
    pf = PathFollower(jp, monitor_tol=1e-6, ds0=0.05)
    trace = pf.follow(theta_of_s, (0.0, 1.0), x0)  # parameter continuation
    # trace.x, trace.theta, trace.s, trace.lam,
    # trace.n_correctors, trace.n_accepts, trace.active_set_changes

- predict: extrapolate primal and duals along the held-factor
sensitivity (jvp_from_state(..., with_duals=True)); monitor
(no solve): KKT residual + active-set margin (#89) at the predicted
point; correct: only when the monitor trips, a warm-μ re-solve
that also re-anchors the factor in one solve (warm_anchor, #86).
- Adaptive step size; detects and records active-set changes and
re-anchors on the new active set.
- PathFollower.trace_arclength(...): pseudo-arclength continuation for
a scalar-parameter, equality/unconstrained family, tracing past
folds where ∂x*/∂θ is singular (parameter continuation cannot).
Reports turning points. Bifurcation/branch-switching and
inequality-active folds are out of scope for v1.
- On a linear-response NLP the predictor is exact, so the whole path is
traced with zero correctors (one anchor solve vs one cold solve
per step); nonlinear paths correct adaptively and still trace to
tolerance.
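
The predict / monitor / correct loop, and the zero-corrector claim for a linear response, can be reproduced on a scalar root-finding problem. This is a toy sketch under stated assumptions, not PathFollower: trace_path and its arguments are hypothetical, the "solve" is a Newton iteration on a scalar residual, and the step size is fixed rather than adaptive.

```python
def trace_path(residual, d_residual_dx, d_residual_dtheta, x, thetas, monitor_tol=1e-8):
    """Follow x*(theta) defined by residual(x, theta) = 0 along a theta grid.

    predict: first-order step along dx*/dtheta = -(dr/dx)^{-1} dr/dtheta;
    monitor: check |residual| at the predicted point (no solve);
    correct: Newton iterations, only if the monitor trips.
    """
    path, n_correctors = [x], 0
    for theta_prev, theta_next in zip(thetas, thetas[1:]):
        sens = -d_residual_dtheta(x, theta_prev) / d_residual_dx(x, theta_prev)
        x = x + sens * (theta_next - theta_prev)          # predict
        if abs(residual(x, theta_next)) > monitor_tol:    # monitor
            n_correctors += 1
            for _ in range(50):                           # correct
                x -= residual(x, theta_next) / d_residual_dx(x, theta_next)
                if abs(residual(x, theta_next)) <= monitor_tol:
                    break
        path.append(x)
    return path, n_correctors

thetas = [i / 20 for i in range(21)]   # theta from 0 to 1

# Linear response: x - 2*theta = 0, so the first-order predictor is exact.
lin_path, lin_corr = trace_path(
    lambda x, t: x - 2.0 * t, lambda x, t: 1.0, lambda x, t: -2.0, 0.0, thetas)

# Nonlinear response: x^3 + x - theta = 0, so the predictor drifts.
nl_path, nl_corr = trace_path(
    lambda x, t: x**3 + x - t, lambda x, t: 3.0 * x * x + 1.0, lambda x, t: -1.0,
    0.0, thetas)
```

On the linear problem the monitor never trips, so lin_corr stays at zero; on the cubic it trips and the corrector restores the residual to tolerance.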
New supporting public surface:
- JaxProblem.warm_anchor(p, x0, *, duals=None, mu=None): a warm-started,
μ-seeded re-solve that pins the converged factor and returns a B=1
AnchorState (the corrector + anchor in one solve). Threads μ through
the reusable build-once path (the #86 follow-up).
- JaxProblem.jvp_from_state(..., with_duals=True) /
batched_jvp_from_state(..., with_duals=True): also return the dual
sensitivity ∂λ*/∂θ · dp from the same held-factor back-solve.
Added — Active-set-proximity monitor (pounce#89)
JaxProblem.active_set_margin(state) reports the distance to an
active-set change at the anchor point — the "predictor is about to
become invalid" signal for predictor–corrector path following. The
post-solve sensitivity is a derivative on a fixed active set; this
flags when a bound / inequality is about to cross its critical-region
boundary (where the sensitivity is discontinuous).
    r = jp.active_set_margin(state)
    # r["margin"], r["min_mult"], r["min_slack"]: each of shape (B,)

- By complementarity: an active bound/inequality (multiplier
> active_tol) is about to leave the set when its multiplier heads to
zero; an inactive one is about to enter when its slack heads to
zero. min_mult / min_slack track each; margin = min(min_mult, min_slack).
- Equalities (cl == cu) are excluded (always active); ±inf bounds
and the slack side of a one-sided inequality drop out naturally.
An unconstrained interior point returns inf.
- Pure-JAX reduction over state the AnchorState already holds: no
solve, no back-solve. Pairs with the caller-side KKT-residual
(smooth-drift) monitor: re-anchor when either trips.
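
The reduction described above is small enough to write out for simple variable bounds. The function below is a hypothetical plain-Python stand-in, not JaxProblem.active_set_margin: it handles bounds only (no general inequalities), is un-batched, and its argument names and the active_tol default are assumptions.

```python
import math

def active_set_margin(x, lb, ub, z_lo, z_hi, active_tol=1e-8):
    """Distance to an active-set change for simple bounds lb <= x <= ub.

    Active sides contribute their multiplier, inactive finite sides
    contribute their slack.
    """
    min_mult, min_slack = math.inf, math.inf
    for xi, lo, hi, zl, zu in zip(x, lb, ub, z_lo, z_hi):
        if lo == hi:
            continue  # fixed variable: always active, excluded
        for slack, mult, bound in ((xi - lo, zl, lo), (hi - xi, zu, hi)):
            if math.isinf(bound):
                continue  # an infinite bound can never become active
            if mult > active_tol:
                min_mult = min(min_mult, mult)     # active: may leave the set
            else:
                min_slack = min(min_slack, slack)  # inactive: may enter the set
    return {"margin": min(min_mult, min_slack),
            "min_mult": min_mult, "min_slack": min_slack}

r = active_set_margin(
    x=[0.0, 0.7, 3.0],
    lb=[0.0, 0.0, -math.inf],
    ub=[1.0, 1.0, math.inf],
    z_lo=[0.25, 0.0, 0.0],
    z_hi=[0.0, 0.0, 0.0],
)
free = active_set_margin([1.0], [-math.inf], [math.inf], [0.0], [0.0])
```

Here the first variable sits on its lower bound with multiplier 0.25 and the second is 0.3 away from its upper bound, so the multiplier is the binding signal; the fully unbounded point reports inf.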
Added — Single-problem ergonomic sensitivity wrappers (pounce#88)
Thin un-batched wrappers over the batched_* post-solve sensitivity
methods, for the scalar / path-following user (one NLP at a time):
    x_star, (lam, zL, zU), J = jp.solve_with_jacobian(theta, x0)  # J: (n, p)
    state = jp.anchor(theta, x0)             # un-batched point → B=1 state
    J = jp.sensitivity(state)                # (n, p) from the held factor
    dx = jp.jvp_from_state(state, dtheta)    # J @ dtheta -> (n,)
    dp = jp.vjp_from_state(state, x_bar)     # J^T @ x_bar -> (p,)

- solve_with_jacobian / sensitivity / jvp_from_state /
vjp_from_state accept and return un-batched shapes, delegating to
the batched methods with B=1 and squeezing; no new numerics.
- anchor now accepts a single un-batched point (p_shape) in addition
to a batch ((B,) + p_shape); a single point yields a B=1
AnchorState. The single-problem from-state wrappers reject a B>1
state rather than silently mis-shaping.
- Implemented as JaxProblem methods (mirroring the batched_* names)
rather than free functions, for consistency with the existing surface.
Added — Exact post-solve sensitivity at a supplied point (pounce#87)
JaxProblem.sensitivity_at(x_star, theta, duals, *, wrt_cols=None)
returns the exact primal sensitivity ∂x*/∂θ evaluated at a
caller-supplied primal-dual point, by re-assembling and factoring the
KKT system there — no IPM re-solve.
    J = jp.sensitivity_at(x_star, theta, (lam, zL, zU))  # (n, p_dim)

- Re-factor, not reuse. A held FERAL factor encodes the anchor
point's H/J, so back-solving it at a moved x_star gives a
first-order-stale sensitivity. sensitivity_at assembles the dense
(n+m)×(n+m) KKT at the supplied point, which is exact there
(assuming a KKT point for theta). The cheap-but-local reuse path
stays as the predictor batched_jvp_from_state; this is its
exact-refresh complement.
- Active set is read from the supplied bound multipliers (zL, zU),
exactly like the custom_vjp backward: the caller passes the duals
the anchoring solve / solve_with_warm returned at this point.
- Pure-JAX, so itself differentiable (second-order sensitivities work);
matches jax.jacobian over a fresh solve to ~1e-6 at every point
along a swept path, including a binding bound.
This is the exact-refresh primitive for the inverse map, where x*
traces a known output boundary and the sensitivity must be evaluated at
the known point without paying a full re-solve per RK stage.
Added — Barrier-μ warm start for predictor–corrector correctors (pounce#86)
The interior-point barrier parameter μ is now reported on every solve and
can be threaded into a warm-started re-solve, so a predictor–corrector
corrector resumes near the central path instead of re-walking the barrier
homotopy from the default initial μ.
- info["mu"]: every Problem.solve / Solver.solve /
solve_with_sens info dict now carries the converged barrier parameter
(0.0 on the barrier-free SQP path).
- pounce.jax.solve_with_warm accepts a 4-element warm-state
(lam, zL, zU, mu) that seeds mu_init / warm_start_target_mu, and
returns the converged μ in a matching 4-tuple. The 3-tuple form is
unchanged; passing mu=None inside a 4-tuple reports μ out without
seeding it in. Differentiability w.r.t. p is preserved (the μ
input/output are stop-gradient, like the duals).
On a small parametric NLP, seeding μ from the previous solve's converged
barrier cut a warm-started corrector from 5 interior-point iterations to
1 (same optimum). The mu_init / warm_start_target_mu algorithm
options already existed; this exposes the converged μ needed to drive
them along a path.
Added — Post-solve Jacobian / sensitivity API from the held KKT factor (pounce#82)
JaxProblem now exposes a first-class post-solve sensitivity surface
that reuses the held FERAL stacked KKT factor instead of round-tripping
through jax.vjp / jax.jacrev:
    x_star, (lam, zL, zU), J, state = jp.batched_solve_with_jacobian(
        p_batch, x0,
        wrt_cols=slice(0, ny),   # optional parameter-column selection (1-D p)
        return_state=True,
    )
    dp_bar = jp.batched_vjp_from_state(state, x_bar)   # J^T @ x_bar
    state.close()

- batched_solve_with_jacobian(...) returns the full per-block
primal Jacobian J of shape (B, n, p_dim) (or (B, n, len(wrt_cols)))
alongside x_star and the (lam, zL, zU) duals (same contract as
batched_solve_with_warm). The Jacobian is assembled by evaluating the
existing factor-reuse backward over the n×n identity output basis:
one multi-RHS Solver.kkt_solve_many against the held LDLᵀ factor, no
NLP re-solve.
- anchor(p_batch, x0, *, wrt_cols=None) solves once and pins the
factor, returning an AnchorState handle for reuse across several
post-solve sensitivity calls (linear-update pattern).
- batched_vjp_from_state(state, x_bar) is the public reverse-mode
product Jᵀ x̄ against a held factor.
- batched_jvp_from_state(state, dp) is the forward-mode product
J @ dp, the cheap path for linear updates that never materialise the
full J. It assembles the parameter-side RHS
[∂²L/∂x∂p · dp; ∂g/∂p · dp] into the compound x- and constraint-blocks
and back-solves once against the held factor. Accepts a reduced dp
when the state was anchored with wrt_cols.
- AnchorState lifetime: works as a context manager
(with jp.anchor(...) as state:) and supports explicit ownership
(state.close(), state.reanchor(...)) for handles that outlive a
lexical block. Pinned factors are exempt from the LRU but capped
(_pinned_capacity, default 16) with a loud overflow error, and a
weakref finalizer reclaims the factor if a handle is dropped without
close().
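
The lifetime rules for a pinned handle (context manager, explicit close, capped pool, finalizer as a safety net) follow a common Python pattern. The sketch below is not the AnchorState implementation; PinnedFactorPool and Handle are hypothetical names, and strings stand in for factors.

```python
import gc
import weakref

class PinnedFactorPool:
    """Hypothetical registry of pinned factors with a hard capacity."""
    def __init__(self, capacity=16):
        self.capacity = capacity
        self.pinned = {}
        self._next_id = 0

    def pin(self, factor):
        if len(self.pinned) >= self.capacity:
            raise RuntimeError(
                f"pinned-factor capacity ({self.capacity}) exceeded; "
                "close() or drop unused handles")
        key = self._next_id
        self._next_id += 1
        self.pinned[key] = factor
        return key

    def release(self, key):
        self.pinned.pop(key, None)

class Handle:
    """Context-manager handle; a finalizer releases the pin if close() is skipped."""
    def __init__(self, pool, factor):
        key = pool.pin(factor)
        # The callback must not reference self, or the handle would never die.
        self._finalizer = weakref.finalize(self, pool.release, key)

    def close(self):
        self._finalizer()  # idempotent: the release runs at most once

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False

pool = PinnedFactorPool(capacity=2)
with Handle(pool, "factor-A"):
    inside = len(pool.pinned)        # pinned while the block is live
after_with = len(pool.pinned)        # released on exit

leaked = Handle(pool, "factor-B")
del leaked                           # dropped without close()
gc.collect()
after_drop = len(pool.pinned)        # reclaimed by the finalizer

keep = [Handle(pool, "factor-C"), Handle(pool, "factor-D")]
try:
    Handle(pool, "factor-E")
    overflowed = False
except RuntimeError:
    overflowed = True
```

Raising on overflow instead of silently evicting keeps a held handle from ever pointing at a reclaimed factor.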
Added — Structured logging + colored iteration table (pounce#71)
POUNCE now emits diagnostics through the
tracing ecosystem and renders the
per-iteration table in a tiger/rust branded color theme.
- Colored iteration table. Restoration lines take a background that
varies by restoration kind (soft-stay → tan, soft-exit → amber, hard →
deep rust); the row text shades smoothly from black toward red as the
primal step length alpha shrinks (a visual stalling cue, shifted to
cream → bright-yellow on the dark restoration backgrounds). Color is
emitted only when stdout is a terminal; redirected output and
NO_COLOR get plain text with identical column alignment.
- Structured logs. Solver-internal diagnostics, warnings, and
developer instrumentation are now tracing events under namespaced
targets (pounce::algorithm, pounce::linsol, pounce::mu,
pounce::sqp, pounce::linesearch, pounce::restoration,
pounce::presolve, pounce::py). Logs go to stderr; program
output (iteration table, summary, --dump) stays on stdout.
- Spans. solve, iteration, linear_solve, and restoration
spans tag nested events with context.
- New environment variables: RUST_LOG (verbosity / per-target
filtering, default info), POUNCE_LOG_FORMAT=text|json (JSON sink on
stderr, including the per-iteration pounce::iteration stream for
Studio / CI), NO_COLOR / CLICOLOR_FORCE (color policy). Documented
in docs/src/options.md and docs/src/troubleshooting.md.
- New pounce-observability crate (subscriber install + iteration
collector) and a pounce-common::style palette module.
- A log→tracing bridge (tracing_log::LogTracer) so any remaining
log::* call sites, chiefly transitive dependencies, surface through
the subscriber and obey RUST_LOG.
- Branded CLI header. The pounce banner now renders a molten
tiger/rust POUNCE logo (terminal-only; NO_COLOR / non-TTY get plain
text).
Changed
- Per-iteration JSON solve-report data is now sourced from the
pounce::iteration tracing event (via an in-process collector layer)
rather than an in-loop accumulation; the report contents are
unchanged. Capturing iteration history requires the tracing subscriber
installed by the CLI / Python / C frontends (or
pounce_observability::init_for_tests() in tests).
- Bumped the feral linear-algebra dependency from 0.8.0 to 0.9.0.
Removed
- Dropped the direct log crate dependency in favor of tracing.
Added — Active-set SQP with working-set warm start (Phase 5b + 5c + 5d)
A new sequential-quadratic-programming driver sits alongside the
existing interior-point method, opt-in via a single option flip.
Designed for warm-started NLP sequences (MPC, parametric
continuation, homotopy sweeps), where the previous solve's active
set is a strong starting point.
Tutorial: docs/tutorials/active-set-sqp.md.
Python notebook: python/notebooks/06_sqp_parametric_continuation.ipynb.
C example: crates/pounce-cinterface/examples/sqp_warm_start.c.
GAMS example: gams/examples/parametric_sqp_warm_start.gms.
Design note: dev-notes/research/active-set-sqp-warm-start.md.
Algorithm selection (cross-cutting)
- New top-level option algorithm, values interior-point
(default; existing IPM path) and active-set-sqp (new SQP driver).
Settable through every interface (add_option in Rust /
Python, AddIpoptStrOption in C, pounce.opt in GAMS), exactly
like linear_solver already is.
SQP suboptions (sqp_* namespace)
sqp_globalization (filter | l1-elastic),
sqp_hessian (exact | damped-bfgs | lbfgs),
sqp_max_iter, sqp_tol, sqp_constr_viol_tol,
sqp_dual_inf_tol, sqp_l1_penalty, sqp_l1_penalty_safety,
sqp_l1_penalty_max, sqp_bt_reduction, sqp_bt_min_alpha,
sqp_print_level, sqp_lbfgs_max_history. Defaults mirror
SqpOptions::default(). Each is "only consulted when algorithm
is active-set-sqp"; the IPM path ignores them silently.
Python — pounce.Problem
New keyword argument and methods:
    prob.add_option("algorithm", "active-set-sqp")
    x, info = prob.solve(x0, working_set=ws)
    ws = info["working_set"]   # always present; None on the IPM path
    ws = prob.get_working_set()
    prob.set_working_set(ws)
    prob.clear_working_set()

The working_set value is a 2-tuple (bounds, constraints) of
numpy int8 arrays with status codes 0..=3 (Inactive / AtLower /
AtUpper / Fixed-or-Equality). Module-level helper
pounce.classify_working_set(x, x_l, x_u, g, g_l, g_u, lambda_g, z_l, z_u, m_eq, ...) classifies an IPM-converged iterate
into a WS suitable for Problem.solve(working_set=…).
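
To make the status codes concrete, here is one plausible way to classify variable bounds from a converged primal-dual point. It is an illustrative rule only, not the logic of pounce.classify_working_set: the function name, the tolerances, and the use of plain lists instead of numpy int8 arrays are all assumptions.

```python
INACTIVE, AT_LOWER, AT_UPPER, FIXED_OR_EQ = 0, 1, 2, 3

def classify_bounds(x, x_l, x_u, z_l, z_u, mult_tol=1e-6, feas_tol=1e-8):
    """Assign a status code 0..3 to each variable bound.

    A side counts as active when the iterate sits on the bound (within
    feas_tol) and its multiplier exceeds mult_tol.
    """
    status = []
    for xi, lo, hi, zl, zu in zip(x, x_l, x_u, z_l, z_u):
        if lo == hi:
            status.append(FIXED_OR_EQ)
        elif xi - lo <= feas_tol and zl > mult_tol:
            status.append(AT_LOWER)
        elif hi - xi <= feas_tol and zu > mult_tol:
            status.append(AT_UPPER)
        else:
            status.append(INACTIVE)
    return status

bounds_status = classify_bounds(
    x=[0.0, 0.4, 2.0, 5.0],
    x_l=[0.0, 0.0, 0.0, 5.0],
    x_u=[1.0, 1.0, 2.0, 5.0],
    z_l=[0.8, 0.0, 0.0, 0.0],
    z_u=[0.0, 0.0, 1.3, 0.0],
)
```

The same rule applied to g against g_l / g_u with lambda_g would give the constraint half of the tuple.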
C ABI — four new entry points
    Bool IpoptGetWorkingSet(IpoptProblem, IpoptBoundStatus*, IpoptConsStatus*);
    Bool IpoptSetWarmStartWorkingSet(IpoptProblem, const IpoptBoundStatus*, const IpoptConsStatus*);
    Bool IpoptClearWarmStartWorkingSet(IpoptProblem);
    enum ApplicationReturnStatus IpoptSolveWarmStart(
        IpoptProblem, ipnumber *x, *g, *obj_val, *mult_g, *mult_x_L, *mult_x_U,
        const IpoptBoundStatus *bound_in,
        const IpoptConsStatus *cons_in,
        IpoptBoundStatus *bound_out,
        IpoptConsStatus *cons_out,
        UserDataPtr user_data);

Plus typedefs IpoptBoundStatus, IpoptConsStatus and the four
status constants POUNCE_WS_INACTIVE (= 0), POUNCE_WS_AT_LOWER
(= 1), POUNCE_WS_AT_UPPER (= 2), POUNCE_WS_FIXED_OR_EQ (= 3).
No existing C entry-point signature changed — cyipopt / JuMP /
AMPL clients link unchanged.
GAMS solver link
Two mechanisms ship in tandem:
- §7.4(a) marginal-based reconstruction (default, no
configuration). The solver link reads variable and equation
marginals (x.m, con.m) at the top of every pouCallSolver
invocation and reconstructs the SQP working set automatically.
Lossy at degenerate active sets, the same idiom as CONOPT, IPOPT,
KNITRO under GAMS.
- §7.4(b) persistent state file (opt-in via
sqp_state_file <path> in pounce.opt). A small binary blob
with FNV-1a checksum keyed by (n, m, x_l, x_u, g_l, g_u) so
structural changes invalidate cleanly. Falls back to §7.4(a) on
any read failure.
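
A structure-keyed checksum of this kind can be sketched in a few lines. The FNV-1a routine below is the standard 64-bit variant; the hash width, the byte layout, and the structure_key helper are assumptions for illustration and do not describe the actual state-file format.

```python
import struct

FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3

def fnv1a_64(data: bytes) -> int:
    """Standard 64-bit FNV-1a: xor each byte in, then multiply by the prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h

def structure_key(n, m, x_l, x_u, g_l, g_u):
    """Hash the problem dimensions and bound vectors into one 64-bit key."""
    payload = struct.pack("<QQ", n, m)
    for vec in (x_l, x_u, g_l, g_u):
        payload += struct.pack(f"<{len(vec)}d", *vec)
    return fnv1a_64(payload)

key_a = structure_key(2, 1, [0.0, 0.0], [1.0, 1.0], [0.0], [4.0])
key_b = structure_key(2, 1, [0.0, 0.0], [1.0, 1.0], [0.0], [4.0])
key_c = structure_key(2, 1, [0.0, 0.0], [1.0, 2.0], [0.0], [4.0])  # one bound moved
```

A stored working set would only be trusted when the key recomputed from the current problem matches the one saved alongside it; any mismatch takes the fallback path.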
Sensitivity (pounce-sensitivity)
SensResult now carries the converged user-space multipliers
(mult_g, mult_x_L, mult_x_U) and constraint values (g),
so the parametric "predictor + SQP corrector" pattern is a single
SensSolve::run followed by one classify_working_set call.
Hessian sources
The sqp_hessian option selects between three implementations:
- exact: uses eval_h; pounce-qp's inertia control handles
indefiniteness via diagonal-shift retry (§4.5).
- damped-bfgs: Powell-damped rank-2 BFGS, dense n×n,
guaranteed PSD (Powell 1978).
- lbfgs: limited-memory BFGS with circular history, default
6 pairs (matches IPOPT's limited_memory_max_history),
materialized to dense Triplet at QP-solve time.
Globalizations
sqp_globalization selects the SQP outer-loop step-acceptance
test:
- filter (default): Fletcher-Leyffer 2002 Pareto-frontier
filter on (constraint violation, objective). No penalty
parameter; recommended general default.
- l1-elastic: Han-Powell merit φ(x; ν) = f(x) + ν · violation(x)
with adaptive ν clamped by sqp_l1_penalty_safety /
sqp_l1_penalty_max. SNOPT-style behaviour.
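
The two acceptance tests differ in what they compare. The sketch below shows the bare dominance rule behind a Pareto filter next to the merit function quoted above; it is a simplified illustration, not the pounce SQP code. A practical filter also adds an envelope margin and a sufficient-decrease condition, both omitted here, and the class and function names are hypothetical.

```python
def dominates(a, b):
    """Entry a = (violation, objective) dominates b if it is no worse in both."""
    return a[0] <= b[0] and a[1] <= b[1]

class Filter:
    """Minimal Pareto filter over (constraint violation, objective) pairs."""
    def __init__(self):
        self.entries = []

    def acceptable(self, point):
        # A trial point is rejected only if some stored entry dominates it.
        return not any(dominates(e, point) for e in self.entries)

    def add(self, point):
        # Drop entries the new point dominates, then store it.
        self.entries = [e for e in self.entries if not dominates(point, e)]
        self.entries.append(point)

def l1_merit(f_val, violation, nu):
    """Merit phi(x; nu) = f(x) + nu * violation(x)."""
    return f_val + nu * violation

flt = Filter()
flt.add((1.0, 5.0))                        # (violation, objective)
ok_tradeoff = flt.acceptable((0.5, 6.0))   # less infeasible, worse objective
ok_worse = flt.acceptable((1.5, 5.5))      # worse in both measures
flt.add((0.5, 4.0))                        # dominates the first entry
n_entries = len(flt.entries)

merit = l1_merit(5.0, 1.0, nu=10.0)
```

The filter accepts a step that trades objective for feasibility without needing a weight, whereas the merit function folds both into one number through ν.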
Added — feral_ordering option (FERAL fill-reducing ordering)
User-facing knob for the FERAL backend's fill-reducing ordering. New
string option feral_ordering accepts auto (default; feral's
adaptive dispatcher — picks AMD / AMF / MetisND from cheap pattern
features), auto_race (runs symbolic factorization on AMD, MetisND,
ScotchND, KahipND and keeps the smallest factor_nnz; ~4× a single
symbolic pass, amortized across numeric refactorizations), and the
concrete methods amd, amf, metis, scotch, kahip. Settable
through every interface that consumes pounce.opt /
OptionsList — Rust, Python, C, GAMS, CLI — and also via the
POUNCE_FERAL_ORDERING environment variable for option-free
callers. Reuses the same explicit-set semantics as the other
feral_* options: leaving it unset keeps the FeralConfig::from_env
default (Auto).
The motivating case is pinene_3200_0009, where the cheap Auto
heuristic picks MetisND (88 s numeric) but AMD factors in 19.5 s on
the same matrix; feral_ordering auto_race measures both and lands
on the winner without per-problem manual tuning. See
docs/src/options.md "FERAL backend tuning" and
docs/src/troubleshooting.md for guidance.
Added — AMPL imported (external) function support (issue #49)
.nl files that declare imported functions in their F segments
and call them via f<id> <nargs> tokens are now solved end-to-end.
Set AMPLFUNC to a newline-separated list of shared-library paths;
pounce loads each library via the standard AMPL funcadd_ASL ABI,
binds every referenced funcall id to a (library, name) pair, and
emits TapeOp::Funcall nodes that participate in full forward /
reverse / Hessian sweeps (first- and second-derivative requests
are issued back through the library on demand, with the packed
upper-triangular Hessian indexed as hes[lo + hi*(hi+1)/2]).
Tested against the IDAES general_helmholtz_external.dylib
fixture from the issue report — pounce reaches
EXIT: Optimal Solution Found on the 3-variable Helmholtz
problem. Without AMPLFUNC set, problems that need external
functions fail with a clear error naming the offending function
and pointing at AMPLFUNC.
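
The packed index formula hes[lo + hi*(hi+1)/2] corresponds to storing the upper triangle column by column. A small sketch of the mapping, with hypothetical helper names:

```python
def packed_index(i, j):
    """Index of symmetric entry (i, j) in column-wise packed upper-triangular storage."""
    lo, hi = (i, j) if i <= j else (j, i)
    return lo + hi * (hi + 1) // 2

def unpack_symmetric(hes, n):
    """Expand the packed array into a full n x n symmetric matrix (list of lists)."""
    return [[hes[packed_index(i, j)] for j in range(n)] for i in range(n)]

# Packed upper triangle of a 3x3 matrix, stored column by column:
# (0,0), (0,1), (1,1), (0,2), (1,2), (2,2)
hes = [11.0, 12.0, 22.0, 13.0, 23.0, 33.0]
full = unpack_symmetric(hes, 3)
```

Sorting the pair into (lo, hi) first is what lets one formula serve both triangles of the symmetric Hessian.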
Limitations: only the Tape (default) AD path supports external
functions. The HybridTape partial-separability path and the
JIT-style HessianProgram path panic on TapeOp::Funcall — both
are alternative routes not on NlTnlp::new's critical path, so
the current production flow is unaffected.
Added — Phase 5a pounce-qp crate
Standalone sparse parametric active-set QP solver. Drives the
SQP subproblem solves; also exposed as a standalone crate
(pounce_qp::ParametricActiveSetSolver). Implements
Gill-Murray-Saunders elastic mode (§4.3), full GMSW EXPAND
anti-cycling (§4.4), Bunch-Kaufman inertia control via
diagonal-shift retry (§4.5), iterative refinement (§4.7), and
Sherman-Morrison-Woodbury Schur-complement factor updates (§4.2,
opt-in via QpOptions::use_schur_updates).
Added — In-repo regression fixtures
- crates/pounce-algorithm/tests/hock_schittkowski_subset.rs:
10 HS problems with published closed-form optima.
- crates/pounce-qp/tests/mm_published_optima.rs:
Maros-Mészáros-flavoured framework with 5 fixtures + reusable
compare_qps_to_published(text, x*, f*, …) helper.
- crates/pounce-algorithm/tests/parametric_sqp_corrector.rs:
IPM → classify_working_set → SQP corrector end-to-end.
- crates/pounce-algorithm/tests/sqp_filter_vs_l1_elastic.rs:
parity between the two globalizations.
Added — Auxiliary-equality preprocessing (Phase 0 presolve, issue #53)
A 14-PR series that scaffolds an opt-in Phase 0 presolve pass:
detects block-triangular structure in the equality system, solves
the dependent blocks ahead of the IPM, and substitutes the
recovered variables back into the user TNLP. Targets gas-network,
power-flow, and process-design problems where a few hundred
algebraic state variables eliminate cleanly.
The algorithm and reference implementation are a port of
ripopt PR #32 by
David Bernal Neira (@bernalde).
The ripopt work also vendored the
tutorial_flow_density{,_perturbed}.nl and gaslib11_steady.nl
fixtures we now use for end-to-end testing.
- Hopcroft-Karp incidence matching, Dulmage-Mendelsohn decomposition,
Tarjan SCC → block-triangular form.
- Coupling classification (linear / nonlinear / inequality-coupled)
plus a damped-Newton block solver with large-block fallback.
- Trivial-elimination pre-pass; inequality-coupled blocks handled
by projection.
- Reduction-frame bookkeeping with full multiplier recovery so
final_zL / final_zU round-trip back to the user space.
- Orchestrator wired into PresolveTnlp, gated by
presolve_auxiliary (default off). Diagnostics surfaced via
presolve_auxiliary_diagnostics.
- Design note: dev-notes/auxiliary-equality-preprocessing.md;
user docs in docs/src/auxiliary-presolve.md.
Added — FBBT (Feasibility-Based Bound Tightening, #62)
Three-commit landing of FBBT inside pounce-presolve:
- pounce-presolve::interval: outward-rounded interval arithmetic
on f64, with Interval::div reciprocal endpoints rounded
outward (fixes a subtle near-zero straddle case discovered in
review).
- ExpressionProvider trait + forward pass walks each constraint
expression and tightens variable bounds from the constraint's
g_l / g_u envelope.
- Reverse propagation + orchestrator wired through PresolveTnlp
end-to-end. New options: presolve_fbbt (master switch,
default off), fbbt_tol, fbbt_max_iter, fbbt_max_constraints.
- Docs: docs/src/fbbt.md; demo notebook
python/notebooks/08_fbbt.ipynb.
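
One forward and reverse pass over a single linear constraint shows the mechanics. This is a sketch under stated assumptions, not the pounce-presolve code: it handles only a plain sum of variables, the function names are hypothetical, and the one-ulp widening via math.nextafter (Python 3.9+) is a crude stand-in for properly directed rounding.

```python
import math

def down(v):
    """Nudge a float one step toward -inf (outward for a lower endpoint)."""
    return math.nextafter(v, -math.inf)

def up(v):
    """Nudge a float one step toward +inf (outward for an upper endpoint)."""
    return math.nextafter(v, math.inf)

def tighten_sum(bounds, g_l, g_u):
    """One forward + reverse FBBT pass for g_l <= sum(x_i) <= g_u.

    bounds is a list of (lo, hi) pairs; returns the tightened list.
    """
    # Forward pass: interval enclosure of the constraint body.
    body_lo = down(sum(lo for lo, _ in bounds))
    body_hi = up(sum(hi for _, hi in bounds))
    # Intersect the enclosure with the constraint's [g_l, g_u] envelope.
    env_lo, env_hi = max(body_lo, g_l), min(body_hi, g_u)
    if env_lo > env_hi:
        raise ValueError("constraint is infeasible over the current box")
    # Reverse pass: x_i lies in the envelope minus the range of the other terms.
    tightened = []
    for i, (lo, hi) in enumerate(bounds):
        others_lo = sum(l for k, (l, _) in enumerate(bounds) if k != i)
        others_hi = sum(h for k, (_, h) in enumerate(bounds) if k != i)
        new_lo = max(lo, down(env_lo - others_hi))
        new_hi = min(hi, up(env_hi - others_lo))
        tightened.append((new_lo, new_hi))
    return tightened

# x in [0, 10], y in [1, 10], with x + y <= 4.
new_bounds = tighten_sum([(0.0, 10.0), (1.0, 10.0)], g_l=-math.inf, g_u=4.0)
```

The upper bounds drop from 10 to roughly 3 and 4; the outward nudge means the new bounds sit a hair above those values, so no feasible point is cut off by rounding.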
Added — Problem and KKT-system scaling (#61, f00c1f9)
End-to-end wiring of the upstream nlp_scaling_* and
linear_system_scaling option families:
- nlp_scaling_method: none / user-scaling (new; pulled from the
set_problem_scaling Python API or the SetIpoptProblemScaling
C API) / gradient-based (existing, now with target-gradient
knobs nlp_scaling_obj_target_gradient and
nlp_scaling_constr_target_gradient).
- linear_system_scaling: none / mc19 / ruiz (iterative
symmetric infinity-norm equilibration, new) / slack-based.
Applied to the augmented system independent of NLP-level
scaling.
- Python Problem.set_problem_scaling(obj_scaling, x_scaling=None,
g_scaling=None) plus a worked example in
python/notebooks/07_scaling.ipynb.
- Documentation: docs/src/scaling.md.
Added — Mehrotra adaptive-μ defaults and init cascade (upstream parity)
- mehrotra_algorithm option routed through PdSearchDirCalc
(previously parsed but inert).
- adaptive_mu_globalization cascade finished per upstream Ipopt;
bound_push / bound_frac / bound_mult_init_val / alpha_for_y
cascade from mehrotra_algorithm yes.
- least_square_init_primal implemented in
DefaultIterateInitializer.
- accept_every_trial_step honored in the line search and
cascaded from mehrotra_algorithm (matches upstream
initialization behavior).
Added — FERAL backend tunables and 0.8.0 bump
- feral_pivtol exposed as an OptionsList option with
FERAL_PIVTOL environment-variable fallback.
- Tri-state cascade_break (#55): auto / on / off, inheriting
the FERAL Phase B default unless explicitly set.
- Workspace bump to feral 0.8.0, which ships the SSIDS-aligned
strict-zero-pivot inertia policy (feral gh#54 / pounce gh#52,
nuffield2_trap). The temporary [patch.crates-io] block
pointing at the local feral checkout has been removed.
Added — pounce-solve-report crate + IpoptWriteSolveReport C API
- New publishable crate pounce-solve-report (first crates.io
release) emits the machine-readable pounce.solve-report/v1
JSON shared by the CLI, the C ABI, and the GAMS driver.
- C ABI: IpoptWriteSolveReport(IpoptProblem, const char *path)
writes the report to disk after IpoptSolve.
- GAMS driver now emits pounce.solve-report/v1 alongside the
.lst so studio tooling can consume it directly.
Added — Diagnostics dumps
- --dump iterates:{summary,full} (#68): per-iteration trajectory
artefacts the studio can replay. summary writes one JSON line
per outer iteration; full adds the primal/dual vectors and
KKT residuals.
- --dump kkt:*+L (#69): augments the existing KKT-system dump
with the LDLᵀ factor pattern (block structure, fill-in, pivot
signs) for inertia post-mortems.
- print_options_documentation yes now actually walks the
registered options and emits a categorized dump (previously a
registered-but-inert toggle).
Added — Studio Claude-skill and MCP GAMS tools
- studio/skill/: Claude-skill front-end as an alternative to the
MCP server. Lighter-weight install path for users who just want
the studio prompts and don't need an MCP runtime.
- studio/mcp: new GAMS problem tools (run_gams_problem,
analyze_gams_problem, parse_gams_listing,
list_gams_examples) plus an install script.
Added — Parallel batched pounce.jax.vmap_solve_parallel + GIL release (pounce#74)
pounce_py::Problem::solve now releases the Python GIL across the
optimize_tnlp call (every TNLP callback was already
Python::with_gil-wrapped, so this is a localized
py.allow_threads block in crates/pounce-py/src/problem.rs).
That unlocks true concurrent IPM iteration across independent
Problem instances on different OS threads — Python-level
f / g callbacks still serialize on the GIL but the linear-algebra
heart of the solver runs in parallel.
pounce.jax.vmap_solve_parallel rides that change: a drop-in
replacement for vmap_solve that dispatches the batch over a
ThreadPoolExecutor of independent Problem instances. Forward
is parallel via the threadpool; backward is jax.vmap over the
per-element KKT solve (pure JAX, vectorizes naturally).
    from pounce.jax import vmap_solve_parallel
    x_batch = vmap_solve_parallel(
        p_batch, f=f, g=g, x0=x0, n=n, m=m,
        lb=lb, ub=ub, cl=cl, cu=cu,
        workers=8,   # default: min(B, 8)
    )

Microbench (n=30, B=16, nonlinear unconstrained, M1 8-core):
vmap_solve 1.00s → vmap_solve_parallel(workers=8) 0.37s
(~2.75×). Speedup grows with per-element solve cost. Numerically
identical to the sequential reference.
Added — pounce.jax.solve_with_warm (pounce#74)
Companion to pounce.jax.solve that threads the previous solve's
dual triple (mult_g, mult_x_L, mult_x_U) into the next call via
IPOPT's warm_start_init_point=yes machinery.
    from pounce.jax import solve_with_warm
    x_star, warm = solve_with_warm(
        p, f=f, g=g, x0=x0, n=n, m=m,
        lb=lb, ub=ub, cl=cl, cu=cu,
        warm_start=None,   # cold first call
    )
    for p_k in trajectory[1:]:
        x_star, warm = solve_with_warm(
            p_k, f=f, g=g, x0=x_star, n=n, m=m,
            lb=lb, ub=ub, cl=cl, cu=cu,
            warm_start=warm,   # threaded duals
        )

Differentiable w.r.t. p via the same implicit-function rule as
solve. Cotangents on the warm-state outputs and the warm-state
inputs are dropped (zero) — at the optimum the duals are a
function of p and the active set, not an independent input to
dx*/dp. solve itself is unchanged (non-breaking).
Added — pounce.jax.JaxProblem build-once/solve-many handle (pounce#75)
Iterative outer loops (differentiable constrained layers in a
training step, parametric sweeps) were paying a ~45ms rebuild on
every call to pounce.jax.solve / vmap_solve_parallel /
solve_with_warm — re-JIT of jax.grad/jacrev/hessian, the
one-shot random sparsity probe, plus a fresh pounce.Problem
construction — versus a ~3ms underlying solve. On n=5, m=6
problems that's a ~14× wrapper overhead.
JaxProblem is a build-once handle: do the JIT and sparsity probe
in __init__, then expose .solve(p, x0), .solve_with_warm(p, x0, warm), .vmap_solve(p_batch, x0), and .vmap_solve_parallel(p_batch, x0, workers=) as methods that reuse the prebuilt state across
calls. Each worker thread in vmap_solve_parallel keeps its own
cached pounce.Problem via threading.local so the build cost is
paid at most once per worker (typically min(B, 8) total) rather
than B times per batch.
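
The per-worker caching pattern can be sketched with the standard library alone. This is not the JaxProblem implementation: BuildOnceHandle, build_solver, and the squaring "solve" are hypothetical stand-ins for the JIT + sparsity-probe + pounce.Problem construction described above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class BuildOnceHandle:
    """Hypothetical handle: expensive per-thread state is built lazily and reused."""
    def __init__(self, build):
        self._build = build
        self._local = threading.local()
        self._lock = threading.Lock()
        self.build_count = 0

    def _state(self):
        state = getattr(self._local, "state", None)
        if state is None:
            state = self._build()          # paid once per worker thread
            self._local.state = state
            with self._lock:
                self.build_count += 1
        return state

    def solve(self, p):
        return self._state()(p)

    def solve_batch(self, p_batch, workers=8):
        workers = min(len(p_batch), workers)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(self.solve, p_batch))   # results keep batch order

def build_solver():
    # Stand-in for JIT compilation + problem construction.
    return lambda p: p * p

handle = BuildOnceHandle(build_solver)
results = handle.solve_batch(list(range(16)), workers=4)
```

With 16 batch elements and 4 workers the build runs at most 4 times, not 16; each worker thread sees only its own cached state through threading.local.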
    from pounce.jax import JaxProblem
    jp = JaxProblem(
        f=f, g=g, n=n, m=m, p_example=p0,
        lb=lb, ub=ub, cl=cl, cu=cu,
        options={"tol": 1e-9, "print_level": 0},
    )
    for p_k in trajectory:
        x_star = jp.solve(p_k, x0=x_prev)
        x_prev = x_star

Microbench on the issue's n=5, m=6 shape, 20 sequential solves at
different p:

    top-level solve (20 calls):   1.914s → 95.7ms/solve
    JaxProblem.solve (20 calls):  0.136s → 6.8ms/solve
    speedup:                      14.1x

Existing top-level solve / vmap_solve / vmap_solve_parallel /
solve_with_warm are unchanged (non-breaking) — JaxProblem is a
new surface for performance-sensitive iterative use.
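The per-worker caching described above can be sketched with the standard library alone. This is an illustrative pattern, not the JaxProblem implementation: `BuildOnceHandle`, `expensive_build`, and `solve_parallel` are hypothetical names, and the "build" here is a stand-in for the JIT and sparsity probe.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BuildOnceHandle:
    """Build expensive state lazily, at most once per worker thread."""

    def __init__(self, build):
        self._build = build
        self._local = threading.local()   # one slot per thread
        self._lock = threading.Lock()
        self.build_count = 0              # how many times build() actually ran

    def _state(self):
        state = getattr(self._local, "state", None)
        if state is None:
            state = self._build()
            self._local.state = state
            with self._lock:
                self.build_count += 1
        return state

    def solve(self, p):
        return self._state()(p)

    def solve_parallel(self, p_batch, workers=4):
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(self.solve, p_batch))


def expensive_build():
    # Stand-in for an expensive setup step; returns a cheap "solver".
    return lambda p: p * p


handle = BuildOnceHandle(expensive_build)
results = handle.solve_parallel(list(range(32)), workers=4)
print(handle.build_count, results[:4])
```

With 32 batch elements and 4 workers, the build runs at most 4 times instead of 32, which is the amortisation the handle is designed for.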
Added — JaxProblem factor-reuse backward (k_aug-style; pounce#76)
The custom_vjp backward of JaxProblem.solve /
solve_with_warm no longer assembles a dense
(n+m) × (n+m) KKT block in JAX and runs jnp.linalg.solve on it.
Instead it reuses the IPM's converged compound KKT factor through
pounce.Solver.kkt_solve — the same factor k_aug uses for
parametric sensitivity. Two wins:
- Perf. The dense back-solve is O((n+m)³) on every bwd call;
reusing the held LDLᵀ factor makes it O(nnz(L)). For modest n
the absolute savings are small; for n+m in the hundreds to
thousands it dominates the bwd.
- Correctness. The compound block's bound-multiplier rows
(z_l, z_u) already encode active-set behaviour: at convergence
active bounds have unbounded z (forces Δx_i = 0 in the
back-solve), inactive bounds have z ≈ 0 (leaves Δx_i free).
Slack inequality rows in the user's g are handled the same way
by (v_l, v_u). The factor-reuse path therefore drops the
explicit active-set masking the dense path does on H/J/v;
accuracy is O(μ) at the IPM barrier parameter, well below tol
after convergence.
Behaviour change: JaxProblem(factor_reuse=True) is the default. Set
factor_reuse=False for a verbatim fallback to the pre-#76 dense
backward (useful for higher-order differentiation, since the dense
backward stays JAX-traced and is itself differentiable).
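The idea of factoring once and back-solving many times can be shown on a tiny symmetric KKT-shaped matrix. The sketch below is a pure-Python, unpivoted LDLᵀ written for illustration only; it is not how pounce.Solver.kkt_solve is implemented, and `ldl_factor`, `ldl_solve`, and the 3×3 matrix are assumptions of this example.

```python
def ldl_factor(A):
    """Unpivoted LDL^T of a small symmetric matrix given as lists of lists."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n):
        D[j] = A[j][j] - sum(L[j][k] ** 2 * D[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] * D[k] for k in range(j))
            L[i][j] = (A[i][j] - s) / D[j]
    return L, D


def ldl_solve(L, D, b):
    """Solve (L D L^T) x = b by forward, diagonal, and backward sweeps."""
    n = len(b)
    y = list(b)
    for i in range(n):
        y[i] -= sum(L[i][k] * y[k] for k in range(i))
    x = [y[i] / D[i] for i in range(n)]
    for i in reversed(range(n)):
        x[i] -= sum(L[k][i] * x[k] for k in range(i + 1, n))
    return x


def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


# KKT-shaped matrix [[H, J^T], [J, 0]] with H = diag(2, 4), J = [1, 1].
K = [[2.0, 0.0, 1.0],
     [0.0, 4.0, 1.0],
     [1.0, 1.0, 0.0]]

L, D = ldl_factor(K)  # factor once

# Several right-hand sides (one per cotangent direction) reuse the factor.
cotangents = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
solutions = [ldl_solve(L, D, rhs) for rhs in cotangents]
residuals = [
    max(abs(r - b) for r, b in zip(matvec(K, u), rhs))
    for u, rhs in zip(solutions, cotangents)
]
print(D, residuals)
```

Each extra right-hand side costs only the triangular sweeps, while the dense alternative would repeat the cubic factorization on every backward call.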
Plumbing:
- pounce.Solver exposes a new block_dims getter returning the
(n_x, n_s, n_y_c, n_y_d, n_z_l, n_z_u, n_v_l, n_v_u) layout of
the compound KKT vector so the JAX bwd can pack a partial RHS
(just the x-block) and unpack u_x / u_y_c / u_y_d.
- Each fwd registers its converged Solver in a bounded-LRU cache
on the JaxProblem (default capacity 128; clear_solver_cache() is
exposed for early eviction). LRU rather than pop-on-read because
jax.jacobian calls the bwd N times per fwd; pop semantics would
crash from the second direction onward.
- The back-solve pure_callback uses vmap_method="sequential" so
jax.jacobian / jax.vmap of a loss-gradient correctly iterate one
cotangent at a time across the impure host call.
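The difference between a bounded LRU and pop-on-read can be sketched with collections.OrderedDict. This is a generic illustration under assumed names (`BoundedLRU`, `put`, `get`), not the cache class used inside JaxProblem.

```python
from collections import OrderedDict


class BoundedLRU:
    """Keep at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._data = OrderedDict()

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)            # mark as most recently used
        while len(self._data) > self.capacity:
            self._data.popitem(last=False)     # drop the oldest entry

    def get(self, key):
        value = self._data[key]                # entry stays in the cache
        self._data.move_to_end(key)
        return value

    def clear(self):
        self._data.clear()

    def __contains__(self, key):
        return key in self._data


cache = BoundedLRU(capacity=2)
cache.put("solve-1", "factor-1")
cache.put("solve-2", "factor-2")

# A Jacobian-style backward pass reads the same entry once per direction.
reads = [cache.get("solve-1") for _ in range(3)]

# Inserting a third entry evicts the least recently used one ("solve-2").
cache.put("solve-3", "factor-3")
print(reads, "solve-1" in cache, "solve-2" in cache)
```

With pop-on-read, the second of the three reads would raise a KeyError, which is the failure mode the LRU choice avoids.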
The standalone pounce.jax.solve / vmap_solve_parallel /
solve_with_warm keep the dense backward for now.
Added — JaxProblem.batched_solve stacked block-diagonal solve (pounce#76 (A))
JaxProblem.batched_solve(p_batch, x0) runs one IPM solve over a
single NLP whose variables are [x^(1); ...; x^(B)], constraints are
concat(g(x^(k), p^(k))), and objective is Σ_k f(x^(k), p^(k)).
The Jacobian and Lagrangian Hessian are block-diagonal (no
cross-block coupling, since each block-k constraint touches only
the block-k slice of X and the objective is a pure sum), so the
IPM sees one big sparse problem but spends linear-system work
proportional to B × (per-block factor cost).
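The block-diagonal claim can be checked numerically on a toy stacked problem. The sketch below uses plain Python and finite differences; `f_block`, `g_block`, and the stacking helpers are hypothetical stand-ins, not the functions batched_solve builds internally.

```python
def f_block(x, p):
    return (x[0] - p) ** 2 + x[1] ** 2


def g_block(x, p):
    return [x[0] + x[1] - p]          # m = 1 constraint per block


def stacked_objective(X, P, n):
    # Sum of per-block objectives, each seeing only its own slice of X.
    return sum(f_block(X[k * n:(k + 1) * n], p) for k, p in enumerate(P))


def stacked_constraints(X, P, n):
    out = []
    for k, p in enumerate(P):
        out.extend(g_block(X[k * n:(k + 1) * n], p))
    return out


def fd_jacobian(X, P, n, h=1e-6):
    """Forward-difference Jacobian, J[i][j] = d G_i / d X_j."""
    base = stacked_constraints(X, P, n)
    cols = []
    for j in range(len(X)):
        Xh = list(X)
        Xh[j] += h
        pert = stacked_constraints(Xh, P, n)
        cols.append([(a - b) / h for a, b in zip(pert, base)])
    return [[cols[j][i] for j in range(len(X))] for i in range(len(base))]


n, B = 2, 3
P = [1.0, 2.0, 3.0]
X = [0.1 * i for i in range(n * B)]

total = stacked_objective(X, P, n)
J = fd_jacobian(X, P, n)

# Row i belongs to block i (m = 1); column j belongs to block j // n.
off_block = [J[i][j] for i in range(B) for j in range(n * B) if j // n != i]
in_block = [J[i][j] for i in range(B) for j in range(n * B) if j // n == i]
print(off_block, in_block)
```

Every off-block entry is exactly zero because perturbing one block's variables leaves the other blocks' constraint values untouched.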
Complementary to the existing batched surfaces:
- vmap_solve: sequential jax.lax.map, one solve per iterate.
- vmap_solve_parallel: B independent IPMs in a
ThreadPoolExecutor (GIL released per solve). Wins when batch
elements have very different convergence behaviour.
- batched_solve: one stacked IPM. Wins when blocks have similar
convergence behaviour (shared barrier homotopy and shared
symbolic factorisation amortise across the batch) and when B is
large enough that the per-call Python overhead of B fwd
dispatches becomes visible (one Rust crossing instead of B).
custom_vjp-wrapped: jax.grad / jax.jacobian through
batched_solve work end-to-end. The bwd vmaps the per-element
dense KKT back-solve, which is exact because the block-diagonal
coupling means ∂x^(k)*/∂p^(j) = 0 for k ≠ j.
Plumbing:
- _StackedJaxNlp lifts the per-block sparsity pattern (cached on
the parent JaxProblem from the one-shot probe) to the stacked
problem's block-diagonal pattern at construction time, so the
per-solve jacobianstructure / hessianstructure callbacks are
O(1).
- Stacked Problems are built per (thread, B) with a tiny LRU on
the JaxProblem (cap 4) keyed by batch size, which guards against
cycling between a couple of sizes (e.g. eval batch ≠ train
batch).
- The single-block bounds lb/ub/cl/cu are tiled across the batch;
distinct bounds per batch element aren't exposed on this surface.
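Lifting a per-block sparsity pattern to the stacked block-diagonal pattern amounts to offsetting row and column indices by block. The helper below is a minimal sketch with a hypothetical name (`lift_pattern`) and a coordinate-list layout assumed for illustration; the internal representation may differ.

```python
def lift_pattern(rows, cols, n, m, B):
    """Replicate a per-block (rows, cols) pattern B times along the diagonal.

    Block k's entries are shifted by k*m rows and k*n columns, so no entry
    couples two different blocks.
    """
    stacked_rows = [r + k * m for k in range(B) for r in rows]
    stacked_cols = [c + k * n for k in range(B) for c in cols]
    return stacked_rows, stacked_cols


# Per-block Jacobian pattern: one constraint touching both variables.
block_rows, block_cols = [0, 0], [0, 1]
rows, cols = lift_pattern(block_rows, block_cols, n=2, m=1, B=3)
print(rows)
print(cols)
```

Because the lifted index arrays are computed once, returning them from a structure callback is a constant-time lookup rather than a per-solve rebuild.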
Changed
- pounce-qp::ParametricActiveSetSolver::solve_equality_plus_bounds
now falls through to solve_elastic when the equality-relaxed
cold start violates a variable bound. Previously it returned
UnsupportedFeature.
- optimize_sqp_tnlp now populates SolveStatistics
(iteration_count, final_dual_inf, final_constr_viol,
final_objective) so GetIpoptIterCount, info["iter_count"],
etc. report SQP-side numbers on the SQP path.
Fixed
- SQP check_kkt stationarity formula: was ∇f + Jᵀ λ_g + λ_x,
must be ∇f + Jᵀ λ_g − λ_x (pounce-qp packs
λ_x = z_l − z_u = −λ_sat). Latent: only triggered by problems
with an active variable bound. Discovered on a 3-D simplex
projection.
- fix(mu): guard probing oracle against corrupted iterate (#58):
the probing oracle no longer dereferences fields of an
iterate that the line-search rejected mid-update.
- fix(mu/probing): σ denominator uses curr_avrg_compl, not
data.curr_mu, matching upstream.
- fix(mu-oracle): allow inexact affine predictor solves to feed
the quality-function oracle (upstream parity).
- fix(l1-wrapper): use multi-pass restoration factory provider
(#24): the ℓ₁ penalty wrapper now nests a restoration sub-IPM
whose own restoration provider is the multi-pass factory,
matching the outer IPM path.
- fix(restoration): restoration sub-IPM inherits the outer
mu_strategy rather than resetting to monotone.
- fix(feral): zero-pivot factorizations on LP-shape KKT
systems route to Singular instead of bubbling up as
Internal.
- fix(fbbt): outward-round reciprocal endpoints in
Interval::div for the near-zero straddle case.
- fix(presolve): auxiliary preprocessing + presolve_bound_tightening
infeasibility paths (#60).
- fix(init/ls): perturb delta_c/delta_d by 1e-8 in the
least-squares-init augmented system to avoid exact rank
deficiency.
- fix(scaling): scale d_l/d_u in step with d(x) under
gradient-based scaling.
- fix(hsl): HSL build script is a no-op when COINHSL_DIR is
unset, so cargo build works on machines without HSL
installed even with the ma57 feature off.
- fix(benchmark-report): composite report now globs the newest
pounce_*.json under benchmarks/mittelmann/results/ instead
of hard-coding pounce_v0.1.0.json.
- fix(jax): pounce.jax.solve backward pass now respects the
constraint active set, not just variable bounds. Slack inequality
rows are dropped from the implicit-function-theorem KKT block via
the same identity-augment trick used for active bounds; previously
they were kept as equalities, silently returning the wrong
dx*/dp whenever an inequality was inactive at the optimum
(pounce#73).
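The sign in the SQP check_kkt stationarity fix above can be verified by hand on a two-variable bound-constrained problem. The sketch below is a toy check, not the check_kkt implementation: the problem, the helper `stationarity`, and the variable names are assumptions of this example, and it uses the convention stated in the entry, λ_x = z_l − z_u.

```python
# Toy problem: minimize 0.5 * ||x - c||^2 subject to x >= 0, with c = (-1, 2).
# The solution is x* = (0, 2): the first lower bound is active.
c = [-1.0, 2.0]
x_star = [0.0, 2.0]

grad_f = [xi - ci for xi, ci in zip(x_star, c)]   # (1, 0)
z_l = [1.0, 0.0]                                  # lower-bound multipliers
z_u = [0.0, 0.0]                                  # no upper bounds active
lam_x = [a - b for a, b in zip(z_l, z_u)]         # packed as z_l - z_u
jac_t_lam_g = [0.0, 0.0]                          # no general constraints


def stationarity(sign):
    """Residual grad_f + J^T lam_g + sign * lam_x, component by component."""
    return [g + j + sign * l for g, j, l in zip(grad_f, jac_t_lam_g, lam_x)]


old_residual = stationarity(+1.0)   # pre-fix formula
new_residual = stationarity(-1.0)   # corrected formula
print(old_residual, new_residual)
```

At the true optimum the corrected residual vanishes, while the old sign reports a nonzero residual on exactly the component whose bound is active, consistent with the bug being latent on problems without active variable bounds.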
Docs
- docs: adaptive-μ option tables, scaling worked example,
troubleshooting guide (docs/src/options.md,
docs/src/scaling.md, docs/src/troubleshooting.md refreshed).
- FBBT reference page (docs/src/fbbt.md) and Pyomo demo
notebook python/notebooks/08_fbbt.ipynb (#62).
- Scaling docs page (docs/src/scaling.md) + Python demo notebook
python/notebooks/07_scaling.ipynb (#61).
- studio/skill README: corrected POUNCE_BIN claim,
inspect --json, sibling-feral layout.
- README badges: PyPI version + downloads for pounce-solver and
pyomo-pounce; Zenodo DOI
10.5281/zenodo.20387011 published.
Compatibility
- All existing IPM users (IpoptSolve, Problem.solve(x0=…),
option nlp = pounce without algorithm set) continue to
behave identically. Every Phase 5 addition is opt-in.
- The C ABI is strictly additive: four new symbols, no signature
changes.
- The Python Problem.solve signature gained one optional kwarg
(working_set=None); positional callers are unaffected.
Algorithm-path isolation guarantees
The IPM and active-set SQP paths share the TNLP layer, options
registry, linear-solver backend, and finalize_solution, but are
otherwise isolated. Toggling algorithm is always safe:
- The default (algorithm = interior-point) runs zero Phase 5
code. Users who never set active-set-sqp are unaffected.
- sqp_* options are silently ignored on the IPM path.
- IPM warm-start options (warm_start_init_point, bound_push,
bound_frac, slack_bound_push, mult_init_max, mu_init,
mu_target, …) are silently ignored on the SQP path.
- Warm-start payloads are path-local:
set_sqp_warm_start(SqpIterates) /
Problem.solve(working_set=…) / IpoptSetWarmStartWorkingSet
feed the SQP loop only; lagrange= / zl= / zu= paired with
warm_start_init_point=yes feed the IPM only.
- info["working_set"] is always present in the Python info
dict but is None on the IPM path.
- Callers can flip between paths across solves on the same
problem handle; the parametric corrector pattern in the
tutorial uses this for cold IPM warm-up followed by an SQP
corrector.
These guarantees are exercised by the test suite: see
application_default_does_not_select_sqp,
application_sqp_warm_start_auto_clears_after_use,
application_sqp_warm_start_round_trip, and
test_get_working_set_returns_none_on_ipm_path (Python).