feat: composable prompt system (recall#794) by laynepenney · Pull Request #2 · synapt-dev/extract

laynepenney · 2026-04-26T13:43:05Z

Summary

buildExtractionPrompt() in both TypeScript and Python for composing capability-driven extraction prompts
17 prompt fragment files (prompts/v1/*.txt) matching the locked IL spec verbatim
3 profile files (minimal/standard/full) as capability set shorthand
Capability dependency closure enforcement (e.g., relations auto-includes entity_ids + entities)
resolveCapabilities() exposed for introspection and testing
Profile add/remove API for fine-grained customization without writing raw capability lists
53 new Python tests (109 total passing), TypeScript type-checks clean
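
The dependency closure described above can be pictured as a graph walk. The following is a minimal sketch, not the package's implementation: `DEPENDENCIES` and `close_over_dependencies` are hypothetical names, and the map contains only the edges named in this thread.

```python
from collections import deque

# Hypothetical dependency map: only the edges mentioned in this thread.
# The real map covers the full capability surface.
DEPENDENCIES = {
    "relations": ["entities", "entity_ids"],
    "relation_origin": ["relations"],
    "entity_state": ["entities"],
    "goal_entity_refs": ["goals", "entity_ids"],
}


def close_over_dependencies(requested):
    """Return the requested capabilities plus everything they depend on."""
    resolved = set()
    queue = deque(requested)
    while queue:
        cap = queue.popleft()
        if cap in resolved:
            continue  # already expanded; also guards against cycles
        resolved.add(cap)
        queue.extend(DEPENDENCIES.get(cap, []))
    return sorted(resolved)


print(close_over_dependencies(["relation_origin"]))
# ['entities', 'entity_ids', 'relation_origin', 'relations']
```

The sketch sorts alphabetically only to make the output deterministic; the actual resolver may return capabilities in canonical composition order instead.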

Premium boundary: core OSS (prompt composition utilities are the adoption surface for the IL).

Test plan

TypeScript type-checks clean (tsc --noEmit)
53 new prompt tests pass (pytest tests/python/test_prompt.py -v)
Full suite: 109 tests pass
All 19 prompt files (17 capability fragments plus preamble.txt and postamble.txt) exist and are non-empty
All 3 profile files are valid JSON with correct capability sets
Full profile is superset of standard, standard is superset of minimal
Dependency closure verified for all 8 dependency chains
Composition order verified (preamble before fragments before postamble, text at end)
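
As an illustration of the composition order checked in the last item, here is a rough sketch. It assumes fragments are already loaded into a dict and that the text is appended as a separate final block; `CANONICAL_ORDER` and `compose_prompt` are hypothetical names, and the order shown is a guess at "primary objects, modifiers, cross-cutting", not the locked spec.

```python
# Hypothetical canonical order: primary objects first, then a modifier,
# then the cross-cutting capabilities. The real list lives in the package.
CANONICAL_ORDER = [
    "entities", "goals", "facts", "relations",
    "entity_state",
    "assertion_signals", "evidence_anchoring",
]


def compose_prompt(text, capabilities, fragments, preamble, postamble):
    """Join preamble, ordered fragments, postamble, and finally the text."""
    ordered = [cap for cap in CANONICAL_ORDER if cap in capabilities]
    parts = [preamble]
    parts.extend(fragments[cap] for cap in ordered)
    parts.append(postamble)
    parts.append(text)  # the text block always comes last
    return "\n\n".join(parts)


fragments = {"entities": "Extract entities.", "facts": "Extract facts."}
prompt = compose_prompt(
    "hello", {"facts", "entities"}, fragments, "PREAMBLE", "POSTAMBLE"
)
print(prompt)
```

Because ordering is driven by `CANONICAL_ORDER` rather than by caller input, passing capabilities as an unordered set still yields the same prompt.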

🤖 Generated with Claude Code

Implements buildExtractionPrompt() in both TypeScript and Python.

- 17 capability prompt fragments (prompts/v1/*.txt) per locked spec
- Shared preamble/postamble with Mustache-style template variables
- 3 profile files (minimal/standard/full) as capability set shorthand
- Capability dependency closure (e.g., relations auto-includes entity_ids)
- Canonical composition order: primary objects, modifiers, cross-cutting
- Capability-specific rules appended before the text block
- Profile add/remove API for fine-grained customization
- resolve_capabilities() exposed for introspection
- 53 new Python tests, all 109 tests passing
- TypeScript type-checks clean

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

laynepenney

Adversarial review on the composable prompt system. Three concrete issues from runtime testing:

Unknown capabilities are not validated up front. resolve_capabilities(capabilities=["bogus"]) returns ["bogus"], and build_extraction_prompt(..., capabilities=["bogus"]) then crashes later with a raw FileNotFoundError when it tries to load prompts/v1/bogus.txt. That should fail as a clean contract error (ValueError / equivalent) before any file IO.
The template renderer is double-expanding caller-controlled values. Example: build_extraction_prompt("hello", profile="minimal", categories=["A{{text}}B"]) renders Available categories: AhelloB. That means caller-supplied metadata can interpolate other template variables during the second render pass. This is not code execution, but it is real prompt-template injection and it exists in both Python and TS implementations. Context values should be treated as opaque strings, not recursively templated.
The dependency map allows nonsensical modifier-only capability sets. resolve_capabilities(capabilities=["assertion_signals"]) returns ["assertion_signals"], and same for evidence_anchoring. The resulting prompt asks for signals/source on entities/goals/facts/relations without actually requesting any of those base object families. Either these capabilities need dependency closure onto at least one concrete base surface, or the API should reject modifier-only capability sets as invalid.

One additional edge case: build_extraction_prompt(..., capabilities=[]) currently returns a vacuous prompt with no extraction fields besides extracted_at. I would reject the empty capability set explicitly rather than letting callers generate a structurally pointless prompt.

laynepenney · 2026-04-26T14:45:24Z

Sentinel contract read for extract#2:

I ran the Sentinel prompt-spec suite from my Track 3 branch against feat/794-prompt-system as an external compatibility check:

PYTHONPATH=packages/python/src:/tmp pytest -q /tmp/extract_prompt_contract

Result: 3 passed.

What I verified against the locked IL spec:

build_extraction_prompt() passes the existing Sentinel contract for standard-profile expansion, metadata exclusion, dependency closure, canonical fragment ordering, and unknown-capability rejection.
TS and Python dependency-closure logic are structurally aligned (entity_state -> entities, goal_entity_refs -> goals + entity_ids, relation_origin -> relations -> entities + entity_ids, etc.).
The fragment set is complete for the capability surface. There are 17 capability fragment files plus preamble.txt and postamble.txt at prompts/v1/, which is the expected shape from the spec.
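
A fragment-set check of the kind described in the last item could look roughly like this. It is a sketch under stated assumptions: `check_fragment_set` is a hypothetical helper, and it counts files rather than matching them against the real capability names, which are not listed in this thread.

```python
from pathlib import Path

SHARED_FILES = {"preamble.txt", "postamble.txt"}
EXPECTED_CAPABILITY_FRAGMENTS = 17  # count stated in this thread


def check_fragment_set(prompts_dir):
    """Return a list of problems; an empty list means the set looks complete."""
    files = {p.name: p for p in Path(prompts_dir).glob("*.txt")}
    problems = []
    for name in sorted(SHARED_FILES - set(files)):
        problems.append(f"missing shared file: {name}")
    capability_files = set(files) - SHARED_FILES
    if len(capability_files) != EXPECTED_CAPABILITY_FRAGMENTS:
        problems.append(
            f"expected {EXPECTED_CAPABILITY_FRAGMENTS} capability fragments, "
            f"found {len(capability_files)}"
        )
    for name in sorted(files):
        # Whitespace-only files count as empty.
        if not files[name].read_text(encoding="utf-8").strip():
            problems.append(f"empty file: {name}")
    return problems
```

Calling `check_fragment_set("prompts/v1")` from the repo root would then be expected to return an empty list when all 19 files are present and non-empty.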

I do not have a spec/impl mismatch to flag on the composable prompt contract itself from this pass.

Address Atlas review findings on extract#2:

- Validate capability names upfront; unknown capabilities raise ValueError before any file IO (no more raw FileNotFoundError on bogus.txt)
- Fix template double-escaping: caller-controlled values (categories, source_type) are no longer recursively expanded through the template engine. Values are treated as opaque strings.
- Reject empty capability sets explicitly
- Reject modifier-only capability sets (assertion_signals/evidence_anchoring without entities, goals, or facts)
- Update PROMPTS_DIR to resolve from installed package path first, falling back to repo root for development

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

laynepenney · 2026-04-26T14:53:47Z

Fixes pushed (`bfd64d1`)

All four Atlas findings are addressed in both Python and TypeScript. 172 tests now pass, up from 161: the prompt suite grew from 53 to 64 tests, and the 108 validate/finalize tests are unchanged.

1. Unknown capability validation

resolve_capabilities() now validates all input capabilities against EXTRACTION_CAPABILITIES before any file IO. Unknown capabilities raise a clean ValueError listing all invalid names. Validation covers both the capabilities list and add list.

resolve_capabilities(capabilities=["bogus"])
# ValueError: Unknown capabilities: bogus
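
For readers who want to see the shape of this check, here is a minimal sketch. It assumes the registry is a plain set of names; `validate_capabilities` is a hypothetical helper, the registry below is a small stand-in for the real `EXTRACTION_CAPABILITIES`, and joining multiple names with a comma is a guess at the message format.

```python
# Hypothetical subset; the real EXTRACTION_CAPABILITIES has more entries.
EXTRACTION_CAPABILITIES = frozenset(
    {"entities", "entity_ids", "facts", "goals", "relations"}
)


def validate_capabilities(capabilities=(), add=()):
    """Raise ValueError naming every unknown capability, before any file IO."""
    unknown = sorted(
        {cap for cap in (*capabilities, *add) if cap not in EXTRACTION_CAPABILITIES}
    )
    if unknown:
        raise ValueError("Unknown capabilities: " + ", ".join(unknown))


try:
    validate_capabilities(capabilities=["entities", "bogus"], add=["also_bogus"])
except ValueError as exc:
    print(exc)
    # Unknown capabilities: also_bogus, bogus
```

Collecting all invalid names before raising means a caller with several typos sees them in one error instead of fixing them one at a time.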

2. Template double-expansion fix

The _render_template() / renderTemplate() functions were calling _render_vars() twice on if-block content: once when expanding the body, then again on the full result. Caller-controlled values containing {{text}} were expanded on the second pass.

Fix: the if-block handler now returns the body text unrendered; the single final _render_vars call handles all substitution. Caller values are treated as opaque strings.

build_extraction_prompt("hello", capabilities=["entities"], categories=["A{{text}}B"])
# categories renders as "A{{text}}B", NOT "AhelloB"
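
The difference between the two behaviours can be reproduced with a toy renderer. This is a sketch only: the `{{#if name}}...{{/if}}` block syntax, the template string, and the function names are assumptions for illustration, not the package's actual template grammar or code.

```python
import re

VAR_RE = re.compile(r"\{\{(\w+)\}\}")
# Hypothetical if-block syntax, assumed for this sketch.
IF_RE = re.compile(r"\{\{#if (\w+)\}\}(.*?)\{\{/if\}\}", re.DOTALL)


def render_vars(template, context):
    # re.sub scans the template once; substituted values are not rescanned.
    return VAR_RE.sub(lambda m: str(context.get(m.group(1), "")), template)


def render_buggy(template, context):
    # Renders variables inside each if-block, then renders the whole
    # result again, so caller values get a second expansion pass.
    expanded = IF_RE.sub(
        lambda m: render_vars(m.group(2), context) if context.get(m.group(1)) else "",
        template,
    )
    return render_vars(expanded, context)


def render_fixed(template, context):
    # Keeps if-block bodies unrendered; one final pass does all substitution.
    expanded = IF_RE.sub(
        lambda m: m.group(2) if context.get(m.group(1)) else "",
        template,
    )
    return render_vars(expanded, context)


template = "{{#if categories}}Available categories: {{categories}}\n{{/if}}Text: {{text}}"
context = {"text": "hello", "categories": "A{{text}}B"}
print(render_buggy(template, context))  # first line: Available categories: AhelloB
print(render_fixed(template, context))  # first line: Available categories: A{{text}}B
```

The fixed variant relies on the fact that a single `re.sub` pass never rescans text it has just substituted, which is what makes caller values opaque.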

3. Modifier-only capability set rejection

assertion_signals and evidence_anchoring are cross-cutting modifiers that need at least one base object family (entities, goals, or facts) to attach to. Requesting them without a base is now a ValueError.

resolve_capabilities(capabilities=["assertion_signals"])
# ValueError: Modifier capabilities ['assertion_signals'] require at least one
#   base capability (entities, facts, goals)
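
A check of this kind might be sketched as follows. The function name and the two tuples are hypothetical, and whether the real check runs before or after dependency closure is not stated in this thread.

```python
BASE_CAPABILITIES = ("entities", "facts", "goals")
MODIFIER_CAPABILITIES = ("assertion_signals", "evidence_anchoring")


def reject_modifier_only(capabilities):
    """Raise ValueError if modifiers are requested without any base family."""
    modifiers = sorted(cap for cap in capabilities if cap in MODIFIER_CAPABILITIES)
    has_base = any(cap in BASE_CAPABILITIES for cap in capabilities)
    if modifiers and not has_base:
        raise ValueError(
            f"Modifier capabilities {modifiers} require at least one "
            f"base capability ({', '.join(BASE_CAPABILITIES)})"
        )


reject_modifier_only(["entities", "assertion_signals"])  # passes silently

try:
    reject_modifier_only(["evidence_anchoring"])
except ValueError as exc:
    print(exc)
```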

4. Empty capability set rejection

Empty resolved sets (e.g., after remove strips everything) are now explicitly rejected.
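
To make the "remove strips everything" case concrete, here is a small sketch of a profile with add/remove applied. The profile contents and the `apply_profile` name are hypothetical; the real capability sets live in the three profile JSON files, and the precedence of remove over add is an assumption.

```python
# Hypothetical profile contents, for illustration only.
PROFILES = {"minimal": ["entities", "facts"]}


def apply_profile(profile, add=(), remove=()):
    """Start from a profile, apply add then remove, and reject an empty result."""
    selected = list(PROFILES[profile])
    selected += [cap for cap in add if cap not in selected]
    selected = [cap for cap in selected if cap not in remove]
    if not selected:
        raise ValueError("Resolved capability set is empty")
    return selected


print(apply_profile("minimal", add=["goals"], remove=["facts"]))
# ['entities', 'goals']

try:
    apply_profile("minimal", remove=["entities", "facts"])
except ValueError as exc:
    print(exc)
    # Resolved capability set is empty
```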

Bonus: installed-package path resolution

PROMPTS_DIR now checks for a package-local prompts/ directory first, falling back to repo-relative path for development. This prepares for extract#3's asset bundling.
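
The lookup order can be sketched with `pathlib`. This is an assumption-laden illustration: `find_prompts_dir` and its two parameters are hypothetical, and the real code presumably derives both locations from the module's own file path.

```python
from pathlib import Path


def find_prompts_dir(package_dir, repo_root):
    """Prefer prompts/ bundled inside the package; else use the repo copy."""
    package_local = Path(package_dir) / "prompts"
    if package_local.is_dir():
        return package_local
    # Development fallback: prompts/ at the repository root.
    return Path(repo_root) / "prompts"
```

In an installed wheel the first branch would be taken; in a source checkout without bundled assets the function falls through to the repo-relative path.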

Ready for re-review.

laynepenney · 2026-04-26T15:00:12Z

Sentinel follow-up after Apollo round 3 (bfd64d1):

I reran the Sentinel prompt-contract suite against feat/794-prompt-system:

PYTHONPATH=packages/python/src:/tmp pytest -q /tmp/extract_prompt_contract

Result: 3 passed.

This resolves the prompt-system issues I was checking for on extract#2:

unknown capability handling now matches the contract
template rendering / modifier handling no longer trips the prompt specs
Python installed-package prompt path resolution is no longer a contract concern from this pass

I also re-read both TS and Python buildExtractionPrompt() implementations, the fragment set, and dependency closure logic. I do not have a remaining spec/impl mismatch to flag on extract#2.

laynepenney · 2026-04-26T15:00:24Z

Follow-up adversarial rerun against bfd64d1:

The four runtime gaps from my earlier review now behave correctly. I reran the same fixture set directly against the Python prompt implementation and got the following outcomes:

unknown capability -> ValueError: Unknown capabilities: bogus
modifier-only assertion_signals -> ValueError requiring a base capability
modifier-only evidence_anchoring -> ValueError requiring a base capability
empty resolved capability set -> ValueError: Resolved capability set is empty
caller-controlled categories=["A{{text}}B"] now remains literal (A{{text}}B) instead of interpolating {{text}}

I also reran PYTHONPATH=packages/python/src pytest -q tests/python on this branch and got 172 passed.

From the adversarial lane, the specific prompt-composition issues I flagged are now closed. I do not have a remaining blocker on extract#2 from that original set.

Address all 6 findings from Atlas's second adversarial review:

HIGH #1 - No-network guard hardening:
- Add Reflect.get on global objects detection
- Add array .join("") assembling forbidden names detection
- Add importlib.import_module detection to Python scanner
- Create runtime dependency allowlist (scripts/allowed-deps.json) with CI enforcement
- Add negative test fixtures for all 4 Atlas bypass probes (tests/security-probes/)

HIGH #2 - Temporal schema/runtime parity:
- Add ISO 8601 pattern to resolved and resolved_end in temporal-ref/v1.json
- Add if/then/not constraint: resolved/resolved_end forbidden when type is "unresolved"
- Add 3 conformance fixtures (22 total): unresolved rejection, bad resolved date, bad resolved_end

HIGH #3 - Python schema self-containment:
- Commit schemas into packages/python/src/synapt_extract/schemas/
- Add CI drift-detection step (diff -r schemas vs Python package schemas)
- Add CI assertion: built wheel must contain exactly 13 schema JSON files
- Remove manual copy steps from build-python and reproducibility CI jobs

MODERATE #1 - README.md install strings updated to 0.3.1

MODERATE #2 - CHANGELOG conformance count updated (22 total)

CHANGELOG v0.3.1 entry updated to cite both rounds of Atlas adversarial review

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

laynepenney mentioned this pull request Apr 26, 2026

feat: @synapt/extract composable prompt system synapt-dev/recall#794

Closed

laynepenney force-pushed the feat/794-prompt-system branch from a55add0 to 22e1573 Compare April 26, 2026 14:38

laynepenney changed the base branch from feat/extract-init to sprint-31 April 26, 2026 14:38


laynepenney merged commit 1375318 into sprint-31 Apr 26, 2026

laynepenney mentioned this pull request Apr 26, 2026

Sprint 31 ceremony (extract): IL v1 package #5

Merged

7 tasks

laynepenney merged 2 commits into sprint-31 from feat/794-prompt-system

1 participant
