Add prompt-management surface refinements (proposal 0033 + wishes 1, 5) #79
Merged
Implements proposal 0033 (spec v0.26.0) plus python-side wishes 1 (FS flat layout) and 5 (Jinja-undefined opt-out).

New fields on Prompt and PromptResult:
- sampling: SamplingConfig | None, a RuntimeConfig subclass mirroring the seven declared fields plus extras. Splats directly into provider.complete(config=...) without translation.
- observability_entities: dict[str, Any] | None, with the spec-normative key langfuse_prompt holding the Langfuse SDK Prompt reference (replaces the implementation-defined metadata key from proposal 0031's v0.23.0 placeholder).

New LabelResolver primitive (openarmature.prompts.LabelResolver) with the spec §7 three-step fallback: per-name override > default override > spec fallback "production". The reference impl MappingLabelResolver is mapping-backed; the Protocol is open to JSON-file or remote-config implementations.

PromptManager accepts label_resolver= and jinja_undefined= constructor kwargs; the render Environment is now per-instance to let the undefined-class knob bite without affecting other managers.

FilesystemPromptBackend gains layout= (per-label default, flat opt-in) and sampling_source= (none default, per-prompt-sidecar reading <root>/<label>/<name>.config.json, or unified reading <root>/prompt_configs.json once at construction).

A latent import cycle between openarmature.llm and openarmature.prompts surfaced once prompt.py imported RuntimeConfig from the llm package (for the SamplingConfig subclass). Deferred the current_prompt_group / current_prompt_result imports in openai.py to function-local; same behavior, no top-level re-entry.

Spec submodule bumped to v0.26.0; conformance.toml grows entries for proposals 0033 (PR 2) and 0034 (PR 4), both not-yet pending the release PR. Fixture-parser defers prompt-management/015 and 016 (the PM-specific harness models the new shapes; the cross-capability parser doesn't) and observability/027-030 (PR 4 territory).
Tests: four new prompt-management fixtures (013-016) plus six new unit tests covering the python-only ergonomics (jinja opt-out, flat layout, sidecar variants, LabelResolver precedence). Second of 6 PRs in the v0.10.0 batch.
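The two FilesystemPromptBackend options in the commit message above could map onto paths roughly as sketched here. This is not the backend's code: `template_path` and `sampling_config_path` are hypothetical helpers, and the per-label template location `<root>/<label>/<name>.j2` is an assumption inferred from the sidecar path. Only the flat, sidecar, and unified paths are stated in the message.

```python
from pathlib import PurePosixPath
from typing import Literal, Optional

Layout = Literal["per-label", "flat"]
SamplingSource = Literal["none", "per-prompt-sidecar", "unified"]


def template_path(
    root: PurePosixPath, name: str, label: str, layout: Layout = "per-label"
) -> PurePosixPath:
    """Hypothetical helper: where a prompt template would be read from."""
    if layout == "flat":
        # Flat layout ignores the label: <root>/<name>.j2
        return root / f"{name}.j2"
    # ASSUMPTION: per-label templates sit next to their sidecars.
    return root / label / f"{name}.j2"


def sampling_config_path(
    root: PurePosixPath,
    name: str,
    label: str,
    sampling_source: SamplingSource = "none",
) -> Optional[PurePosixPath]:
    """Hypothetical helper: which file (if any) carries sampling config."""
    if sampling_source == "per-prompt-sidecar":
        # One JSON sidecar per prompt: <root>/<label>/<name>.config.json
        return root / label / f"{name}.config.json"
    if sampling_source == "unified":
        # One shared file, read once at construction: <root>/prompt_configs.json
        return root / "prompt_configs.json"
    # "none": the backend attaches no sampling config.
    return None


root = PurePosixPath("prompts")
print(template_path(root, "greet", "staging", "flat"))
print(sampling_config_path(root, "greet", "staging", "per-prompt-sidecar"))
```

With the flat layout the label plays no part in the template lookup, which is consistent with the backend returning the requested label on the Prompt verbatim.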
Pull request overview
Updates the prompt-management Python implementation to match spec v0.26.0 / proposal 0033, adding per-prompt sampling configuration, observability entity propagation, deployment-time label routing, and configurable Jinja undefined behavior.
Changes:
- Add `SamplingConfig`, `Prompt.sampling`, and `PromptResult.sampling` (plus filesystem sidecar/unified config loading).
- Add `LabelResolver`/`MappingLabelResolver` and make `PromptManager.fetch`/`get` support label resolution when label is omitted.
- Add `observability_entities` propagation, bump spec pin/version, and adjust conformance harness/tests accordingly.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/unit/test_prompts.py | Adds unit tests for Jinja undefined opt-out, filesystem layouts/sampling, and label resolver precedence. |
| tests/test_smoke.py | Updates asserted spec version to v0.26.0. |
| tests/conformance/test_prompt_management.py | Extends conformance runner to handle sampling/observability entities and multi-manager/cases fixture shapes. |
| tests/conformance/test_fixture_parsing.py | Skips new fixtures not yet modeled by the cross-capability parser. |
| tests/conformance/harness/prompt_management.py | Updates prompt-management fixture schema for new directive shapes and top-level expected capture assertions. |
| src/openarmature/prompts/prompt.py | Introduces SamplingConfig and adds sampling + observability_entities fields to Prompt/PromptResult. |
| src/openarmature/prompts/manager.py | Adds label resolver + per-instance Jinja environment; propagates sampling/observability entities into PromptResult. |
| src/openarmature/prompts/label_resolver.py | Adds the LabelResolver protocol and MappingLabelResolver reference implementation. |
| src/openarmature/prompts/backends/filesystem.py | Adds layout and sampling_source options; supports sidecar/unified sampling config. |
| src/openarmature/prompts/__init__.py | Exports new prompt-management symbols (SamplingConfig, label resolver types/constants). |
| src/openarmature/llm/providers/openai.py | Defers prompt context imports to avoid module import cycles. |
| src/openarmature/AGENTS.md | Updates embedded spec version reference. |
| src/openarmature/__init__.py | Bumps __spec_version__ to v0.26.0. |
| pyproject.toml | Updates tool-level spec_version pin. |
| docs/concepts/prompts.md | Documents new knobs/fields: jinja_undefined, sampling, label resolver, observability entities. |
| conformance.toml | Pins spec to v0.26.0 and adds proposal entries 0033/0034 as not-yet. |
Two defensive fixes from Copilot PR review on #79:
- Unified-mode sampling source: validate that each per-prompt entry in prompt_configs.json is a JSON object before calling _sampling_from_dict, raising a structured PromptStoreUnavailable on shape drift. Matches the symmetric top-level guard already in _load_unified_configs. Relaxed _unified_sampling's value type to Any so the runtime isinstance guard remains meaningful (the cast would have made it dead code).
- PromptResult construction: shallow-copy prompt.sampling (model_copy) and prompt.observability_entities (dict(...)) so a caller mutating the result can't leak into the source Prompt or whatever instance the backend may be caching.
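A minimal, stdlib-only sketch of the two guards described above, under stated assumptions. The `PromptStoreUnavailable` class here is a local stand-in for the library exception of the same name. `Sampling`, `Prompt`, `validate_entry`, and `copy_for_result` are hypothetical names. `dataclasses.replace` plays the role the commit assigns to `model_copy`.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


class PromptStoreUnavailable(Exception):
    """Local stand-in for the library's exception of the same name."""


def validate_entry(name: str, entry: Any) -> dict:
    """Reject per-prompt entries that are not JSON objects (shape drift)."""
    # `entry` is typed Any on purpose: a narrower static type would make
    # this runtime isinstance check look like dead code.
    if not isinstance(entry, dict):
        raise PromptStoreUnavailable(
            f"prompt_configs.json entry for {name!r} must be a JSON object, "
            f"got {type(entry).__name__}"
        )
    return entry


@dataclass
class Sampling:  # hypothetical stand-in for SamplingConfig
    temperature: Optional[float] = None


@dataclass
class Prompt:  # hypothetical stand-in for the real Prompt
    sampling: Optional[Sampling] = None
    observability_entities: Optional[dict] = None


def copy_for_result(prompt: Prompt) -> tuple:
    """Shallow-copy the mutable fields before handing them to a result."""
    sampling = replace(prompt.sampling) if prompt.sampling is not None else None
    entities = (
        dict(prompt.observability_entities)
        if prompt.observability_entities is not None
        else None
    )
    return sampling, entities


source = Prompt(Sampling(temperature=0.2), {"langfuse_prompt": "ref"})
sampling, entities = copy_for_result(source)
sampling.temperature = 0.9      # caller mutates the result's copy
entities["extra"] = True
print(source.sampling.temperature, sorted(source.observability_entities))
```

The copies are shallow, so nested mutable values inside the entities dict would still be shared with the source.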
Summary

Implements proposal 0033 (spec v0.26.0) plus python-side wishes 1 (`FilesystemPromptBackend` flat layout) and 5 (Jinja `Undefined` opt-out).

New fields on `Prompt` and `PromptResult`:

- `sampling: SamplingConfig | None`: a `RuntimeConfig` subclass mirroring the seven declared fields plus extras. Splats directly into `provider.complete(config=...)` without translation.
- `observability_entities: dict[str, Any] | None`: spec-normative key `langfuse_prompt` holds the Langfuse SDK Prompt reference; replaces the implementation-defined `metadata` placeholder from proposal 0031's v0.23.0.

New `LabelResolver` primitive at `openarmature.prompts.LabelResolver`. Three-step fallback: per-name override > default override > spec fallback `"production"`. `MappingLabelResolver` is the reference impl; the Protocol is open to JSON-file / remote-config implementations.

`PromptManager` knobs:

- `label_resolver=...`: optional resolver consulted when `fetch()` / `get()` is called without an explicit `label`.
- `jinja_undefined=...`: optional Jinja `Undefined` subclass; default `StrictUndefined` matches spec §8 (was §7). The render `Environment` is now per-instance so this knob bites without affecting other managers.

`FilesystemPromptBackend` knobs:

- `layout="per-label" | "flat"`: flat reads `<root>/<name>.j2` ignoring label; returns the requested label on the Prompt verbatim. Wish 1.
- `sampling_source="none" | "per-prompt-sidecar" | "unified"`: sidecar adapter for `Prompt.sampling` per spec §5's informative filesystem conventions.

Import-cycle fix. A latent cycle between `openarmature.llm` and `openarmature.prompts` surfaced once `prompts/prompt.py` imported `RuntimeConfig` (for the `SamplingConfig` subclass). Deferred the `current_prompt_group` / `current_prompt_result` imports in `openai.py` to function-local; same behavior, no top-level re-entry.

Spec submodule bumped to v0.26.0; `conformance.toml` grows entries for proposals 0033 (this PR) and 0034 (PR 4), both `not-yet` pending the release PR. The fixture-parser defers prompt-management/015 and 016 (the PM-specific harness models the new shapes; the cross-capability parser doesn't) and observability/027-030 (PR 4 territory).

Second of 6 PRs in the v0.10.0 batch. Plan in coord thread `04-python-batched-impl-plan.md`; spec greenlit in msg 05.

Test plan

- `sampling`, unified `prompt_configs.json` keys by name, three `MappingLabelResolver` precedence cases
- `python3 scripts/check_conformance_manifest.py` exits 0 with 30 accepted proposals / 30 manifest entries
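The three-step label fallback from the summary (per-name override, then default override, then the spec fallback `"production"`) can be sketched as follows. This is an illustration under assumptions, not the library's `MappingLabelResolver`: the class name `MappingResolverSketch`, the `resolve` method, and the `per_name` / `default` constructor arguments are all hypothetical.

```python
from typing import Mapping, Optional, Protocol

SPEC_FALLBACK_LABEL = "production"  # the spec fallback named in the summary


class LabelResolverLike(Protocol):
    """ASSUMED shape of the resolver Protocol; the method name is a guess."""

    def resolve(self, name: str) -> str: ...


class MappingResolverSketch:
    """Mapping-backed resolver illustrating the three precedence steps."""

    def __init__(
        self,
        per_name: Optional[Mapping[str, str]] = None,
        default: Optional[str] = None,
    ) -> None:
        self._per_name = dict(per_name or {})
        self._default = default

    def resolve(self, name: str) -> str:
        # Step 1: a per-name override wins outright.
        if name in self._per_name:
            return self._per_name[name]
        # Step 2: otherwise use the deployment-wide default override, if set.
        if self._default is not None:
            return self._default
        # Step 3: otherwise fall back to the spec default.
        return SPEC_FALLBACK_LABEL


resolver = MappingResolverSketch(per_name={"greet": "canary"}, default="staging")
print(resolver.resolve("greet"))        # per-name override
print(resolver.resolve("summarize"))    # default override
print(MappingResolverSketch().resolve("summarize"))  # spec fallback
```

Because the contract is a Protocol, a JSON-file or remote-config resolver would only need to provide the same method to be accepted in place of the mapping-backed one.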