44 changes: 37 additions & 7 deletions README.md
@@ -45,13 +45,13 @@ those forms.
`--json` enables machine-readable NDJSON output for non-interactive usage
(one complete JSON object per processed input line).

Preload options keep authoritative state and runtime continuation separate:
- `--initial-state-json` / `--initial-state-file` load authoritative state
Preload options keep saved rules separate from in-progress confirmation state:
- `--initial-state-json` / `--initial-state-file` load saved state
(via exported state JSON).
- `--initial-checkpoint-json` / `--initial-checkpoint-file` restore full
checkpoint continuation (authoritative state + pending clarification state).
continuation checkpoint (saved state + pending confirmation state).

REPL command-layer commands (host/controller layer, not engine directives):
REPL commands (controller layer, not engine directives):
- `state` shows current authoritative state.
- `preview <input>` runs deterministic dry-run without mutating live state.
- `step <input>` is an explicit alias of normal bare-input step behavior.
@@ -113,6 +113,20 @@ The idea is similar to a traditional compiler: user directives are translated in

---

## FAQ

**Is this just prompt reinjection?**
Partly. Hosts still pass state to models as context. The difference is that
state is maintained by a deterministic engine with explicit update rules,
clarification behavior, and inspectable checkpoints.

**Isn’t this just prompt engineering?**
It complements prompt engineering, but solves a different problem. Prompting
shapes model behavior; Context Compiler provides a deterministic state layer
that updates only through explicit directives.

---

## 10-Second Example

User sets a constraint once:
@@ -181,7 +195,7 @@ Host Application
```

The compiler owns state updates and never calls the LLM.
The host decides whether to call the model based on the returned `Decision`.
Your app decides whether to call the model based on the returned `Decision`.
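
As a rough illustration of that split, the sketch below routes one turn based on a decision's `kind`. It assumes decisions are dict-like with a `kind` field and an optional `prompt_to_user` field, and that `update` and `clarify` are answered locally. The function name `route_decision`, the `call_model` callback, and the acknowledgement text are hypothetical and not part of the package.

```python
from typing import Callable, Mapping


def route_decision(
    decision: Mapping[str, object],
    user_input: str,
    call_model: Callable[[str], str],
) -> str:
    """Pick the reply for one turn (hypothetical helper, not a package API)."""
    kind = decision.get("kind")
    if kind == "update":
        # State changed through an explicit directive: acknowledge locally,
        # without forwarding the turn to the model.
        return "Noted."
    if kind == "clarify":
        # The engine needs confirmation: relay its prompt, skip the model.
        return str(decision.get("prompt_to_user") or "Please clarify.")
    # Assumption: any other kind means the turn carries no directive,
    # so it is forwarded to the model.
    return call_model(user_input)
```

In this sketch the model callback is only reached on the final branch, so directive-only turns never depend on model output.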

---

@@ -227,8 +241,8 @@ Meaning:

### Controller API (Reusable Outside REPL)

These controller-layer APIs are public package exports and can be used directly
in host code (not just inside the REPL).
These controller APIs are public package exports and can be used directly
in app code (not just inside the REPL).

| API | Description |
|---|---|
@@ -337,6 +351,18 @@ Use policies instead when the constraint is explicit and enforceable:
- “prohibit introducing new external dependencies”
- “use single-step preparation methods”

### Example domains

Hosts define what policy items and premise mean in context. Common patterns:

- safety-oriented constraints (for example, prohibited materials or tools)
- authority/evidence constraints (for example, cite only approved sources)
- software workflow constraints (for example, require `uv`, prohibit `npm`)
- accessibility/environment constraints (for example, no audio-only outputs)

Context Compiler enforces explicit directive/state mechanics. Domain reasoning
still belongs to the host and model workflow.

---

## Directive Examples
@@ -381,6 +407,10 @@ For full directive grammar and edge-case behavior, see [DirectiveGrammarSpec.md]
- [demos](demos/) — concrete scenarios showing how behavior differs with and without the compiler
- [integrations](examples/integrations/) — production-style host integrations (OpenWebUI, LiteLLM, etc.)

Integration note: current OpenWebUI example pipes return deterministic local
acknowledgements for directive-only `update` decisions instead of forwarding
those turns to the downstream LLM.

---

## Guarantees
42 changes: 20 additions & 22 deletions docs/DescriptionAndMilestones.md
@@ -10,7 +10,7 @@ conversations, and state can conflict over time.
This project adds a deterministic state layer that is independent of the model.
The model handles interpretation and generation; the engine handles premise and
policies. Only explicit user directives can change state.
By separating reasoning from state authority, the system improves reliability
By separating reasoning from state ownership, the system improves reliability
without requiring model retraining. The system never derives authoritative state
from model responses.
The goal is not to make the model smarter, but to make interactions
@@ -26,7 +26,7 @@ The state engine is the source of truth and is model-independent.
Model output is never interpreted to derive or modify state.
All state transitions originate from explicit user directives.

Behavioral details are authoritative in `docs/DirectiveGrammarSpec.md`.
Behavioral details are defined in `docs/DirectiveGrammarSpec.md`.

## Project Milestones

@@ -45,15 +45,15 @@ The current authoritative state shape and directive semantics are defined in `Di
- Apply explicit state changes as deterministic replacements
- Block ambiguous updates until clarified
- Maintain a source-of-truth state that does not depend on prior model wording
- Provide structured state for host-provided model context
- Provide structured state for app-provided model context

**Deliverables:**

- Directive grammar (conservative pattern set)
- State data model (authoritative conversational state)
- Deterministic update rules for explicit directives and clarification
- Clarification mechanism for ambiguous mutations
- Context serialization interface (`export_json` / `import_json`, state → host application)
- Context serialization interface (`export_json` / `import_json`, state → app layer)
- Reference integration harness (example host)
- Tests: persistence and non-regression of deterministic state updates

@@ -64,7 +64,7 @@ After correcting or constraining the assistant once, the behavior remains consis
### M3 — Cross-Session Recall (implemented, engine-level / host-enabled)

**Goal**
Extend host-level workflows around persisted exported state safely and intentionally.
Extend app-level workflows around persisted exported state safely and intentionally.

**Core capability:**

@@ -77,17 +77,17 @@ Extend host-level workflows around persisted exported state safely and intention

**Deliverables:**

- Host-side storage/recovery patterns built on the existing import/export API
- Host-side storage/recovery patterns for checkpoint object/checkpoint JSON continuation restore
- App-side storage/recovery patterns built on the existing import/export API
- App-side storage/recovery patterns for checkpoint object/checkpoint JSON continuation restore

**User-visible outcome:**

When hosts persist exported state, assistants can carry decisions across sessions without reintroducing old conflicts.
Pending confirmation-required flows can be resumed when the host persists checkpoints.
When apps persist exported state, assistants can carry decisions across sessions without reintroducing old conflicts.
Pending confirmation-required flows can be resumed when the app persists checkpoints.

`export_json()` / `import_json()` remain authoritative-state only.
Checkpoint APIs are separate and represent runtime continuation.
Long-term memory remains a host persistence responsibility, not an engine-owned store.
Long-term memory remains an app persistence responsibility, not an engine-owned store.

### 0.6.x

@@ -104,19 +104,15 @@ Make engine behavior inspectable and externally controllable without guessing.
- State inspection
- Deterministic dry-run / preview
- Structural state diff
- Thin controller layer around step / preview / replay behavior
- Thin stateless controller layer around step / preview behavior
- Machine-readable REPL JSON output containing:
- `decision`
- `prompt_to_user`
- `state`
- JSON input for initial state only:
- versioned one-object-per-line output (`output_version`)
- step / preview / state command result envelopes
- JSON preload for authoritative state and checkpoint continuation:
- `--initial-state-json`
- `--initial-state-file`
- REPL LLM fallback as explicit optional mode:
- `--with-llm-fallback`
- requires `--with-preprocessor`
- never implicit
- inspectable via preview / JSON output
- `--initial-checkpoint-json`
- `--initial-checkpoint-file`
- Explicit preprocessor policy for multi-line, multi-sentence, and conversational-prefix input
(for example `ok. prohibit peanuts`, `sure - use docker`, mixed conversational + directive content)
that is rule-based, fixture-covered, and inspectable
@@ -133,9 +129,11 @@ Make engine behavior inspectable and externally controllable without guessing.

### Post-0.7 Direction

- Profile commands and workflow conveniences
- 0.8 candidate direction: model-assisted state suggestions (inspectable, previewable,
and never directly mutating authoritative state)
- MCP adapter likely as a separate/later track after 0.8 direction is clearer
- Optional 0.7.1 MCP-readiness helpers only if narrowly justified
- Additional tooling built on auditability surfaces
- Broader heuristic responsibility remains default-avoid unless tightly justified

### 1.0 Target

24 changes: 19 additions & 5 deletions docs/multi-engine.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
Most applications should start with a **single Context Compiler engine**.

A single engine is not a single rule.
It maintains a complete authoritative state consisting of:
It maintains a complete saved state consisting of:

- one premise (a single explicit conversational stance)
- a set of per-item policy states (`use` or `prohibit`)
@@ -39,7 +39,7 @@ Policies do not interact with each other.
- There is no grouping
- There is no domain model

Each policy entry is an independent authoritative key.
Each policy entry is an independent key in state.

## When to Use Multiple Engines

@@ -52,20 +52,34 @@ Typical cases:
- isolation between workflows
- independent persistence or reset behavior

## Composition Is a Host Concern
## Composition Is an App Concern

The compiler does not coordinate multiple engines.

The host is responsible for:
The app is responsible for:

- selecting which engine(s) apply
- combining state into model context
- managing lifecycle (reset, persistence, replay) per engine

The compiler only maintains a single authoritative state per instance.
The compiler only maintains a single state instance per engine.

## Guideline

Start with one engine.

Introduce multiple engines only when you need **independent lifecycle or isolation**, not because a single engine is insufficient.

## Combining Policies from Multiple Sources

If you need to combine constraints from separate sources, do it explicitly in
host code by replaying directives through `step(...)` into a target engine.

Pattern:

1. Select ordered source directives
2. Replay each directive via `engine.step(...)`
3. Handle any returned `clarify` decisions explicitly

This keeps conflict handling in normal engine semantics and avoids adding merge
semantics to core state APIs.
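
The three steps above can be sketched as follows. This is a minimal illustration, assuming `step(...)` returns a dict-like decision whose `kind` is `"clarify"` when the engine needs confirmation. `replay_directives` and `StubEngine` are hypothetical names. The stub stands in for a real engine only so the sketch runs on its own, and its conflict rule is invented for the demo rather than taken from the engine's actual semantics.

```python
from typing import Iterable, Optional


def replay_directives(engine, directives: Iterable[str]) -> tuple[list[str], Optional[str]]:
    """Replay directives in order; stop at the first one that needs clarification.

    Returns (applied_directives, blocked_directive_or_None).
    """
    applied: list[str] = []
    for directive in directives:
        decision = engine.step(directive)
        if decision["kind"] == "clarify":
            # Surface the blocked directive so the caller can resolve it explicitly.
            return applied, directive
        applied.append(directive)
    return applied, None


class StubEngine:
    """Hypothetical stand-in for a real engine (illustration only)."""

    def __init__(self) -> None:
        self.policies = {}  # item -> "use" | "prohibit"

    def step(self, user_input: str) -> dict:
        verb, _, item = user_input.partition(" ")
        # Invented demo rule: flipping an existing policy asks for clarification.
        if self.policies.get(item, verb) != verb:
            return {"kind": "clarify", "prompt_to_user": f"Conflict on {item!r}"}
        self.policies[item] = verb
        return {"kind": "update"}
```

Calling `replay_directives(StubEngine(), ["use uv", "prohibit npm", "prohibit uv"])` with this stub applies the first two directives and reports `"prohibit uv"` as blocked. The caller then decides how to resolve it, which keeps merge decisions out of the state APIs.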
39 changes: 22 additions & 17 deletions examples/integrations/litellm/basic.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,17 @@

is_confirmation_text = _confirmation.is_confirmation_text

from host_support.confirmation import summarize_confirmation_update
try:
    from host_support.confirmation import summarize_confirmation_update_from_checkpoint
except ImportError:
    from host_support.confirmation import (
        summarize_confirmation_update as _summarize_confirmation_update_from_pending,
    )

    def summarize_confirmation_update_from_checkpoint(user_input: str, checkpoint: object) -> str:
        pending = checkpoint.get("pending") if isinstance(checkpoint, dict) else None
        return _summarize_confirmation_update_from_pending(user_input, pending)


try:
    from host_support import build_trace
@@ -191,14 +201,6 @@ def _persist_session_checkpoint_if_needed(
_CHECKPOINTS_BY_SESSION_KEY[session_key] = engine.export_checkpoint_json()


def _has_pending_clarification(engine: Engine) -> bool:
    checker = getattr(engine, "has_pending_clarification", None)
    if callable(checker):
        return bool(checker())
    checkpoint = engine.export_checkpoint()
    return checkpoint.get("pending") is not None


def _normalize_confirmation_for_summary(value: str) -> str:
    normalized = value.strip().lower()
    normalized = re.sub(r"\s+", " ", normalized)
@@ -223,9 +225,9 @@ def _near_miss_directive_clarify(value: str) -> str | None:
return None


def _summarize_confirmation_update(user_input: str, pending: object) -> str:
    summarize_fn: Callable[[str, object], str] = summarize_confirmation_update
    return summarize_fn(user_input, pending)
def _summarize_confirmation_update(user_input: str, checkpoint: object) -> str:
    summarize_fn: Callable[[str, object], str] = summarize_confirmation_update_from_checkpoint
    return summarize_fn(user_input, checkpoint)


def _summarize_update_from_input(user_input: str) -> str:
@@ -294,9 +296,8 @@ def _append_trace(
def handle_turn(user_input: str, engine: Engine, *, session_key: str | None = None) -> str:
    _restore_session_checkpoint_if_needed(engine, session_key)
    state_before = engine.state
    pending_before = (
        engine.export_checkpoint().get("pending") if _has_pending_clarification(engine) else None
    )
    has_pending_before = engine.has_pending_clarification()
    checkpoint_before = engine.export_checkpoint() if has_pending_before else None
    logger.debug("litellm_basic: engine_input=%s", f"user_input len={len(user_input)}")
    decision = engine.step(user_input)
    kind = cast(str, decision["kind"])
@@ -326,8 +327,12 @@ def handle_turn(user_input: str, engine: Engine, *, session_key: str | None = No
llm_called=False,
)
    _persist_session_checkpoint_if_needed(engine, kind, session_key)
    if kind == DECISION_UPDATE and is_confirmation_text(user_input) and pending_before is not None:
        response_text = _summarize_confirmation_update(user_input, pending_before)
    if (
        kind == DECISION_UPDATE
        and is_confirmation_text(user_input)
        and checkpoint_before is not None
    ):
        response_text = _summarize_confirmation_update(user_input, checkpoint_before)
        return _append_trace(
            response_text,
            original_input=user_input,