fix(annealing): repair promotion source assignment + autonomous approval #393
Merged
Two defects kept annealing proposals from behaving as advertised in
their basic golden path. Investigation traced both to confirmed root
causes with runtime evidence.
1. Promotions created empty ontologies.
get_first_order_source_ids() ran a Cypher query that combined a
bound variable with an aggregate in one WITH
(`neighbor_sources + collect(...)`) — legal in Neo4j, rejected by
Apache AGE ("must be either part of an explicitly listed key or
used inside an aggregate function"). The exception was swallowed,
the function returned [], and every execute_promotion() reassigned
zero sources — minting Ontology nodes with no concepts. Split into
two fixed-length queries (the anchor's own sources + its neighbors'
sources) unioned in Python. Verified against the live graph: a
known anchor now returns 4 primordial-scoped sources where the old
query threw.
2. Autonomous mode still had a human-raceable 'pending' window.
Auto-approval ran as a post-cycle batch ~30s after proposals were
written, so an operator could click Approve first (observed:
one proposal executed triggered_by=admin, the rest by the worker).
In autonomous mode the manager now stores proposals already
'approved' (reviewed_by=annealing_worker) — no pending window
exists. The worker's post-cycle step is dispatch-only
(_dispatch_approved_proposals), guarded on status='approved' so a
re-run cannot double-dispatch. hitl mode is unchanged: proposals
are born 'pending' for human review.
Not changed (by design): a proposal approved against a now-missing
ontology still fails gracefully with a clear error. That is correct
behaviour — a real reviewer would not approve a missing target — so
no state-reconciliation layer is warranted.
Files:
- api/app/lib/age_client/ontology_edges.py — AGE-safe source query
- api/app/services/annealing_manager.py — automation_level, born-approved
- api/app/workers/annealing_worker.py — dispatch-only post-cycle step
- tests/unit/services/test_annealing_manager.py — born-approved/pending tests
Tests: 465 unit passed (incl. 2 new); promotion query verified live.
Summary
Goal: get annealing proposals to behave as advertised, in their most basic golden-path mode. An investigation of the annealing agent surfaced three issues — two are real defects (fixed here), one is correct behaviour as-is.
Issue 1 — promotions created empty ontologies 🔴 fixed
`get_first_order_source_ids()` ran a Cypher query that combined a bound variable with an aggregate in a single `WITH` (`neighbor_sources + collect(...)`). That is legal in Neo4j but rejected by Apache AGE: "neighbor_sources" must be either part of an explicitly listed key or used inside an aggregate function. The `try/except` swallowed the error, the function returned `[]`, and every `execute_promotion()` reassigned zero sources, creating Ontology nodes with no concepts.
Runtime logs confirmed the error fired for all four recent promotions. The graph showed the effect: all 7 Source nodes were still `SCOPED_BY` primordial, and the 4 promoted ontologies were empty.
Fix: split into two fixed-length queries (the anchor's own sources + its neighbors' sources), unioned in Python. Verified against the live graph: a known anchor now returns 4 primordial-scoped sources where the old query threw.
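The shape of that fix can be sketched as follows. This is a minimal illustration, not the code in `ontology_edges.py`: the query strings, the node labels and relationship types inside them, the `run_query` callable, and the `source_id` column name are all assumptions made for the example. The point is only that neither query mixes a bound variable with an aggregate, and the union happens in Python.

```python
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical query texts. Labels, relationship types, and property names
# are placeholders, not the project's actual schema.
OWN_SOURCES_QUERY = """
MATCH (c:Concept {concept_id: $anchor_id})-[:APPEARS]->(s:Source)
RETURN DISTINCT s.source_id AS source_id
"""

NEIGHBOR_SOURCES_QUERY = """
MATCH (c:Concept {concept_id: $anchor_id})--(n:Concept)-[:APPEARS]->(s:Source)
RETURN DISTINCT s.source_id AS source_id
"""

Row = Dict[str, Any]
QueryRunner = Callable[[str, Dict[str, Any]], Iterable[Row]]


def first_order_source_ids_sketch(anchor_id: str, run_query: QueryRunner) -> List[str]:
    """Union the anchor's own sources with its neighbors' sources.

    `run_query` is an assumed helper that executes one query and yields
    rows as dicts. Each query is run separately, so no single WITH clause
    has to carry both a bound variable and an aggregate.
    """
    params = {"anchor_id": anchor_id}
    own_rows = run_query(OWN_SOURCES_QUERY, params)
    neighbor_rows = run_query(NEIGHBOR_SOURCES_QUERY, params)

    # Order-preserving union: own sources first, then unseen neighbor sources.
    seen = set()
    ordered: List[str] = []
    for row in list(own_rows) + list(neighbor_rows):
        source_id = row["source_id"]
        if source_id not in seen:
            seen.add(source_id)
            ordered.append(source_id)
    return ordered
```

The sketch deliberately lets query errors propagate instead of returning `[]`, so a rejected query cannot be mistaken for "no sources". Whether the PR also changed the error handling is not stated in the description.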
Issue 2 — autonomous mode still had a raceable "pending" window 🟠 fixed
Auto-approval ran as a post-cycle batch ~30s after proposals were written (each proposal needs an LLM evaluation). During that window proposals sat `pending` in the UI and a human could approve one first. This was observed in the logs: one proposal executed `triggered_by=admin`, the rest `triggered_by=annealing_worker`, plus a `Proposal N not pending — skipping auto-approve` warning.
Fix: in autonomous mode the manager now stores proposals already `approved` (`reviewed_by=annealing_worker`), so there is no pending window to race. The worker's post-cycle step is dispatch-only (`_dispatch_approved_proposals`), guarded on `status='approved'` so a re-run can't double-dispatch. The proposal still passes through the same `approved` → `proposal_execution` job lifecycle the manual review endpoint uses. hitl mode is unchanged: proposals are born `pending` for human review.
Issue 3 — failed demotions ("Ontology no longer exists") 🟡 no change, by design
When a proposal targets an ontology that was already removed (by a sibling proposal or manual edits), `execute_demotion()` fails gracefully with a clear error and `status=failed`. That is correct: a real reviewer would not approve a proposal against a missing target, and with autonomous approval working there is no drift to reconcile. No state-reconciliation layer was added.
Changes
- `api/app/lib/age_client/ontology_edges.py` — AGE-safe two-query rewrite of `get_first_order_source_ids`
- `api/app/services/annealing_manager.py` — `automation_level` param; `_store_proposal` born-approved in autonomous mode
- `api/app/workers/annealing_worker.py` — pass `automation_level` to the manager; `_auto_approve_and_dispatch` → dispatch-only `_dispatch_approved_proposals`
- `tests/unit/services/test_annealing_manager.py` — born-approved / born-pending tests
Verification
End-to-end (trigger a real annealing cycle and watch a promotion populate an ontology) is worth doing post-merge — it mutates the graph, so left for an explicit run.
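Until that end-to-end run happens, the Issue 2 lifecycle can be exercised in isolation. The following is a minimal in-memory sketch, not the code in `annealing_manager.py` or `annealing_worker.py`. The `Proposal` dataclass, the list-backed store, the `enqueue` callback, and the `executing` post-dispatch status are assumptions for illustration. Only the `autonomous` and `hitl` modes, the `approved` and `pending` statuses, and `reviewed_by=annealing_worker` come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Proposal:
    """Hypothetical stand-in for a stored annealing proposal."""
    proposal_id: int
    status: str
    reviewed_by: Optional[str] = None


def store_proposal(store: List[Proposal], proposal_id: int, automation_level: str) -> Proposal:
    """Store a proposal; in autonomous mode it is born 'approved'."""
    if automation_level == "autonomous":
        # No 'pending' window exists, so there is nothing for a human to race.
        proposal = Proposal(proposal_id, "approved", reviewed_by="annealing_worker")
    else:
        # hitl mode: proposals wait for human review.
        proposal = Proposal(proposal_id, "pending")
    store.append(proposal)
    return proposal


def dispatch_approved_proposals(store: List[Proposal], enqueue: Callable[[int], None]) -> List[int]:
    """Dispatch-only step: enqueue execution for proposals still 'approved'.

    Moving the status on before enqueueing (to the assumed 'executing'
    state) is what makes a re-run a no-op for already dispatched proposals.
    """
    dispatched: List[int] = []
    for proposal in store:
        if proposal.status != "approved":
            continue  # the status guard: skip pending, executing, failed, ...
        proposal.status = "executing"  # assumed post-dispatch status
        enqueue(proposal.proposal_id)
        dispatched.append(proposal.proposal_id)
    return dispatched
```

Calling `dispatch_approved_proposals` twice over the same store should enqueue each autonomous proposal once and leave hitl proposals `pending`, which is the behaviour the two new unit tests are described as covering.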