napkin-math: threshold-pairing rule for extract-parameters-from-digest by neoneye · Pull Request #738 · PlanExeOrg/PlanExe

neoneye · 2026-05-20T23:00:17Z

Scope

Single-file change: adds a "Threshold pairing rule" to system-prompt.txt for the extract-parameters-from-digest skill, slotted between "Coverage and capacity gate rule" and "Combined viability gate preservation". No code changes; no compress changes.

Split out of PR #737 because the threshold-pairing rule lives in the extract stage, not the compress stage, so it does not fit that PR's "Phase 1 compress-prompt cleanup" scope.

What the rule says

When the extract emits a key_value whose role is a numeric threshold — a floor, cap, ceiling, minimum, maximum, target volume, target share, or target deadline — it must also emit a paired margin/surplus calculation comparing the realised quantity against the threshold. The pairing has three parts:

The threshold goes in key_values.
The realised quantity goes in missing_values_to_estimate if the source does not name it.
The margin calculation goes in recommended_first_calculations or derived_questions, with realised - threshold (floor: positive = pass) or threshold - realised (cap: positive = pass), using the _margin / _surplus suffix.

Under cap pressure, the rule says to drop a less-load-bearing key_value or move a less-critical calc to derived_questions — never skip the pairing.
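The sign convention in the third part can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the repository: the `Threshold` class, the `kind` labels, and the numbers are hypothetical, chosen so that a positive margin always means "pass".

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    name: str
    kind: str  # "floor" or "cap" (hypothetical labels for this sketch)
    value: float


def margin(threshold: Threshold, realised: float) -> float:
    """Return the margin so that a positive result means the threshold is met."""
    if threshold.kind == "floor":
        # Floor: the realised quantity must be at least the threshold.
        return realised - threshold.value
    if threshold.kind == "cap":
        # Cap: the realised quantity must not exceed the threshold.
        return threshold.value - realised
    raise ValueError(f"unknown threshold kind: {threshold.kind}")


# Illustrative numbers only, not taken from any baseline plan.
example_floor = Threshold("example_floor", "floor", 500_000.0)
example_cap = Threshold("example_cap", "cap", 1_000_000.0)

print(margin(example_floor, 620_000.0))   # 120000.0  (floor met)
print(margin(example_cap, 1_100_000.0))   # -100000.0 (cap exceeded)
```

Keeping "positive = pass" for both floors and caps means a downstream reader can check any `_margin` / `_surplus` value the same way, without knowing which kind of threshold produced it.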

Why

Existing rules in the same file (No orphan formula rule, Coverage and capacity gate rule, Dead-end variable prevention) cover the principle abstractly, but extractor runs were leaving threshold key_values unpaired in practice. The new rule operationalises the pairing as a concrete three-part check.

Corpus-agnostic by construction

The rule names only structural categories (floor, cap, ceiling, target volume, target share, target deadline) and the existing _margin / _surplus naming convention. No corpus literals, no plan names, no domain-specific acronyms, no expected output ids.

Regression probes (not acceptance criteria)

Baseline plans are used to detect that the rule moves the right behaviour, not to define what the rule should target. Probes run against the gitignored output/v50/ digests:

A reverse-logistics campaign plan in the baseline previously had two unpaired threshold key_values (a volume target and a charitable-donation floor). The new rule produces paired *_margin calculations for both.
A disaster-response plan in the baseline previously had a generator-fuel priority floor unpaired and a geological-observation trigger mis-classified as a simulatable key_value. The new rule produces the missing pairing for the fuel floor and re-routes the geological trigger to unmodelled_gates.

The probes also surface that the existing 5-cap on missing_values_to_estimate interacts with the threshold-pairing rule in ways that can force tradeoffs (for example, a less-critical existing pairing is dropped to make room). This is a known limitation, not a fault of the new rule, and is left for a structural follow-up.

What this PR does NOT do

It does not add prompt content that targets any specific baseline plan.
It does not raise the 5-cap on missing_values_to_estimate or modify any other hard limit.
It does not address the unrelated compress-LLM run-to-run variance that drops other tripwires from the digest before the extract sees them (that issue is independent of this rule and remains open).

Test plan

CI green (no code changes; just a prompt-text edit)
Hand-check that the new rule paragraph reads corpus-agnostically (no plan names, no literals)
Re-run extract-parameters-from-digest against the napkin_math baseline; the rule should move only structural patterns, not target any single plan
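The hand-check for corpus literals could be backed by a rough heuristic scan, sketched below. Everything here is an assumption for illustration: the function name, the idea of passing in a list of baseline plan names, and the choice to flag any digit run as a possible literal. It would over-report on legitimate numbers, so it is a prompt for a human look rather than a gate.

```python
import re


def find_corpus_literals(rule_text: str, plan_names: list[str]) -> list[str]:
    """Return plan names and numeric literals found in the rule text (heuristic)."""
    lowered = rule_text.lower()
    # Plan names are matched case-insensitively as plain substrings.
    hits = [name for name in plan_names if name.lower() in lowered]
    # Any digit run (optionally with , or . separators) is treated as a possible literal.
    hits += re.findall(r"\d+(?:[.,]\d+)*", rule_text)
    return hits


clean = "Pair every floor or cap with a margin calculation."
leaky = "The yellowstone_evacuation floor is 75%"
print(find_corpus_literals(clean, ["yellowstone_evacuation"]))  # []
print(find_corpus_literals(leaky, ["yellowstone_evacuation"]))  # ['yellowstone_evacuation', '75']
```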

…gest

Adds a new "Threshold pairing rule" to the extract skill's system prompt, slotted between "Coverage and capacity gate rule" and "Combined viability gate preservation".

Why: extraction runs were leaving threshold key_values (floors, caps, targets, deadlines) unpaired with their realised-vs-threshold margin calcs. The 'no dead-end variables' rule already forbids this in principle, but in practice the abstract directive was being followed unevenly. The new rule makes the pairing explicit and operational: every extracted threshold gets a paired margin calculation in recommended_first_calculations or derived_questions, with the realised quantity declared in missing_values_to_estimate when the source does not name it.

The rule is corpus-agnostic: it uses only structural categories (floor, cap, ceiling, target volume, target share, target deadline) and the existing _margin / _surplus naming convention. No corpus literals.

Tested by re-running the skill on two v50 baselines that had unpaired thresholds:

- crate_recovery_campaign: previously had target_recovered_crates (108k volume target) and minimum_donation_threshold_dkk (500k donation floor) as unpaired key_values. Re-extraction emits q_volume_target_margin and q_donation_minimum_margin as derived_questions per the new rule. Also restructured average_effective_incentive_per_crate_dkk from a missing-value to an explicit recommended_first_calculation so that incentive_per_crate_dkk and pilot_incentive_decel_threshold_crates appear in real depends_on rather than narrative-only suggested_estimation_method prose.
- yellowstone_evacuation: previously had hospital_fuel_priority_share (75% generator-fuel floor) unpaired and vei7_uplift_trigger_cm_per_hour mis-classified as a simulatable key_value. Re-extraction emits hospital_fuel_priority_margin and moves vei7 to unmodelled_gates (geological observation, not deterministically simulatable). Dropped zone_zero_evacuation_target_people (population denominator, not a viability threshold) and the weakest existing pair (public_compliance_threshold_zone_one) under cap pressure, since hospital generator-fuel priority is more directly life-safety than downstream traffic congestion.

Both v50 parameters.json files now audit clean against the no-dead-end-variables rule (every key_value appears in at least one calc's depends_on).
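The audit described here (every key_value appears in at least one calc's depends_on) can be sketched as a small check. This is a minimal sketch under assumptions, not the project's audit tool: it assumes parameters.json holds lists of objects with an `id` field, and that entries in both recommended_first_calculations and derived_questions carry a `depends_on` list. The function name and the `orphan_threshold` / `realised_hospital_fuel_share` ids are hypothetical.

```python
def find_dead_end_key_values(params: dict) -> list[str]:
    """Return ids of key_values that no calculation lists in depends_on."""
    referenced: set[str] = set()
    for section in ("recommended_first_calculations", "derived_questions"):
        for calc in params.get(section, []):
            referenced.update(calc.get("depends_on", []))
    return [kv["id"] for kv in params.get("key_values", []) if kv["id"] not in referenced]


# Hypothetical, heavily trimmed parameters structure for illustration.
params = {
    "key_values": [
        {"id": "hospital_fuel_priority_share"},
        {"id": "orphan_threshold"},  # made-up unpaired threshold
    ],
    "missing_values_to_estimate": [{"id": "realised_hospital_fuel_share"}],
    "recommended_first_calculations": [
        {
            "id": "hospital_fuel_priority_margin",
            "depends_on": ["realised_hospital_fuel_share", "hospital_fuel_priority_share"],
        },
    ],
    "derived_questions": [],
}

print(find_dead_end_key_values(params))  # ['orphan_threshold']
```

An empty result corresponds to "audits clean"; any id returned is a threshold or other key_value that still lacks a paired calculation.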

neoneye · 2026-05-20T23:04:28Z

Consolidating into PR #737 per request — single PR makes it easier to verify the combined compress + extract prompt changes.

neoneye mentioned this pull request May 20, 2026

napkin-math: prompt cleanup for compress + extract #737

Merged

8 tasks

neoneye closed this May 20, 2026

neoneye deleted the napkin-math/phase2-extract-threshold-pairing branch May 20, 2026 23:04

neoneye wants to merge 1 commit into main from napkin-math/phase2-extract-threshold-pairing
