[codex] Implement PRD-4 ontology and Phi-4 integration plan #46
Pull request overview
This PR implements the first concrete slices of PRD-4 by introducing canonical ontology primitives in ledger-core, wiring opt-in ontology emission through MCP ingest, adding typed Phi-4 “job” scaffolding with deterministic fallback behavior, and landing an end-to-end audit playbook contract across workbook/events/ontology.
Changes:
- Add canonical `ledger-core` ontology snapshot/types + proposal lifecycle primitives; adapt MCP ontology store/hashes to match.
- Extend MCP ingest (rows/PDF) with optional `ontology_path` and emit deterministic document→transaction ontology edges when enabled.
- Introduce deterministic “semantic retrieval” fallback in `RuleRegistry`, add typed Phi-4 classification job plumbing + fallback responses, and expand docs/playbook/test coverage.
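To make the idea of a deterministic lexical fallback concrete, here is a minimal sketch of token-overlap ranking with a stable tie-break. It is not the `RuleRegistry` implementation from this PR: the names `tokens` and `rank_candidates`, the `(id, description)` candidate shape, and the scoring rule are all assumptions chosen for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical tokenizer: lowercase alphanumeric words, de-duplicated.
fn tokens(text: &str) -> BTreeSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Hypothetical ranking: score each (id, description) candidate by the number
/// of tokens it shares with the query. Candidates with no overlap are dropped.
/// Ordering is score descending, then id ascending, so equal scores always
/// come back in the same order regardless of input order.
pub fn rank_candidates<'a>(
    query: &str,
    candidates: &[(&'a str, &'a str)],
) -> Vec<(&'a str, usize)> {
    let query_tokens = tokens(query);
    let mut scored: Vec<(&str, usize)> = candidates
        .iter()
        .map(|(id, text)| (*id, tokens(text).intersection(&query_tokens).count()))
        .filter(|(_, score)| *score > 0)
        .collect();
    scored.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    scored
}
```

Because the result depends only on the input strings and a total ordering over `(score, id)`, repeated runs yield identical candidate lists, which is the property a deterministic fallback needs when no model is available.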
Reviewed changes
Copilot reviewed 36 out of 36 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| scripts/mcp_cli_demo.sh | Adds demo modes + ontology snapshot export; updates default temp paths. |
| crates/ledgerr-mcp/tests/tax_evidence_chain_contract.rs | Updates ingest request with ontology_path: None. |
| crates/ledgerr-mcp/tests/restart_persistence.rs | Updates ingest request with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase6_mcp_exposure_gaps.rs | Updates ingest PDF requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase5_cpa_outputs.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase4_audit_integrity.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase3_mcp_classification.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase2_mcp_contract_remaining.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/ontology_contract.rs | Adds ontology conversion/mapping tests + ingest ontology emission + proposal provenance assertions. |
| crates/ledgerr-mcp/tests/interface.rs | Updates ingest requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/events_replay_contract.rs | Updates ingest requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/events_contract.rs | Updates ingest requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/e2e_mvp_flow.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/e2e_bdd.rs | Updates ingest PDF requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/document_inventory.rs | Updates ingest request with ontology_path: None. |
| crates/ledgerr-mcp/tests/contract_codegen.rs | Adds contract parsing coverage for optional ontology_path in ingest actions. |
| crates/ledgerr-mcp/tests/audit_playbook_contract.rs | New end-to-end contract test tying tx_id across workbook, ontology, and events. |
| crates/ledgerr-mcp/src/ontology.rs | Rehomes ontology kinds/hashes onto ledger-core::ontology and adds to_core_snapshot. |
| crates/ledgerr-mcp/src/mcp_adapter.rs | Plumbs optional ontology_path through ingest parsing and expands ontology kind parsing. |
| crates/ledgerr-mcp/src/lib.rs | Adds ontology_path to ingest requests and emits ingest ontology edges when enabled. |
| crates/ledgerr-mcp/src/contract.rs | Extends documents tool contract to accept optional ontology_path. |
| crates/ledgerr-host/src/internal_openai.rs | Extends deterministic Phi-4 fallback outputs + adds tests for typed JSON and playbook prompt. |
| crates/ledgerr-host/src/chat.rs | Adds InvalidTypedOutput error mapping. |
| crates/ledgerr-host/src/agent_runtime.rs | Adds typed classification job + typed output validation + tests. |
| crates/ledger-core/tests/rule_registry.rs | Adds tests for stable semantic candidate IDs and semantic selection behavior. |
| crates/ledger-core/src/rule_registry.rs | Implements deterministic lexical “semantic” index + stable candidate IDs. |
| crates/ledger-core/src/proposal.rs | New proposal lifecycle/policy + commit gating + provenance injection + tests. |
| crates/ledger-core/src/ontology.rs | New canonical ontology primitives + deterministic hashing/sorting + Rhai DSL conversion + tests. |
| crates/ledger-core/src/lib.rs | Exposes new ontology/proposal modules and adjusts exports ordering. |
| book/src/capability-map.md | Documents PRD-4 phases with Rhai visual blocks. |
| book/src/audit-playbook.md | New playbook chapter describing runnable paths and the audit graph chain. |
| book/src/SUMMARY.md | Adds Audit Playbook and removes duplicate/invalid chapter entries. |
| book/mdbook-admonish.css | Adds admonish CSS asset file at book root. |
| book/book.toml | Bumps admonish assets_version and updates CSS include path. |
| PRD-4.md | Adds PRD-4 staged requirements + visuals + acceptance tests. |
| AGENTS.md | Adds durable PRD-4 implementation guidance and docs toolchain notes. |
    DEMO_ROOT="${DEMO_ROOT:-/tmp/l3dg3rr-mcp-demo-$$}"
    JOURNAL_PATH="${JOURNAL_PATH:-$DEMO_ROOT/demo.beancount}"
    WORKBOOK_PATH="${WORKBOOK_PATH:-$DEMO_ROOT/demo.xlsx}"
    ONTOLOGY_PATH="${ONTOLOGY_PATH:-$DEMO_ROOT/demo.ontology.json}"
DEMO_ROOT is used as the parent directory for JOURNAL_PATH/WORKBOOK_PATH/ONTOLOGY_PATH, but the script never creates it. Since journal/workbook writers open the target file paths directly (and don’t create parent dirs), the demo will fail on a clean run. Create the directory (e.g., mkdir -p "$DEMO_ROOT") before invoking cargo run.
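The review asks for a `mkdir -p` in the script. A complementary option, shown here only as a sketch, is for the writer side to create missing parent directories itself. The helper name `write_with_parents` is hypothetical and does not exist in the repository; it uses only `std::fs::create_dir_all` and `std::fs::write`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: ensure the parent directory exists, then write the file.
/// `create_dir_all` succeeds if the directory is already present, so this is
/// safe to call on every write.
pub fn write_with_parents(path: &Path, contents: &[u8]) -> io::Result<()> {
    if let Some(parent) = path.parent() {
        // A bare file name has an empty parent; there is nothing to create then.
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent)?;
        }
    }
    fs::write(path, contents)
}
```

With a helper along these lines, a clean run would succeed even if the caller forgot to create `DEMO_ROOT`, at the cost of the writer silently creating directories the caller may not have intended.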
    if let Some(ontology_path) = request.ontology_path.as_deref() {
        emit_ingest_ontology_edges(ontology_path, &request.rows)?;
    }
ontology_path is written to via emit_ingest_ontology_edges, but unlike source_ref/rkyv_ref it isn’t validated against workbook_path.parent() or checked for .. components. This lets an MCP caller choose an arbitrary file write location (absolute path or traversal via relative paths). Add the same confinement/traversal checks used in ingest_pdf (reject ParentDir, and restrict absolute paths to the allowed base) before calling emit_ingest_ontology_edges.
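The confinement check described in this comment can be sketched with `std::path` alone. This is an illustration under stated assumptions, not the code that landed in `ingest_pdf` or `ingest_statement_rows`: the function name `confine_to_base`, the `String` error type, and the choice to join relative paths onto the base are all hypothetical.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical check: accept `candidate` only if it stays inside `base`.
/// - Any `..` component is rejected outright.
/// - Absolute paths must have `base` as a component-wise prefix.
/// - Relative paths are resolved against `base`.
/// The check is purely lexical; it does not resolve symlinks.
pub fn confine_to_base(candidate: &Path, base: &Path) -> Result<PathBuf, String> {
    if candidate
        .components()
        .any(|c| matches!(c, Component::ParentDir))
    {
        return Err(format!(
            "path contains a parent-directory component: {}",
            candidate.display()
        ));
    }
    if candidate.is_absolute() {
        // `Path::starts_with` compares whole components, so a sibling such as
        // `/work/booksX` does not count as being inside `/work/books`.
        if candidate.starts_with(base) {
            Ok(candidate.to_path_buf())
        } else {
            Err(format!(
                "absolute path escapes {}: {}",
                base.display(),
                candidate.display()
            ))
        }
    } else {
        Ok(base.join(candidate))
    }
}
```

In the scenario from the review, `base` would play the role of `workbook_path.parent()`, and the returned path is what would be handed to the writer. Because the check does not follow symlinks, a stricter variant might also canonicalize the parent directory before comparing.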
@copilot apply changes based on the comments in this thread
…nd clippy lints

- scripts/mcp_cli_demo.sh: add mkdir -p "$DEMO_ROOT" before cargo run invocations
- crates/ledgerr-mcp/src/contract.rs: sync generated_mcp_cli_demo_script() to match updated script
- crates/ledgerr-mcp/src/lib.rs: add ontology_path traversal/confinement checks in ingest_statement_rows (reject ParentDir components, reject absolute paths outside workbook_path.parent())
- crates/ledgerr-mcp/src/lib.rs:2173: remove redundant & on already-reference ontology_path (clippy)
- crates/ledgerr-host/src/internal_openai.rs:877,901: replace Phi4LocalFallbackBackend::default() with unit-struct literal (clippy)

Agent-Logs-Url: https://github.com/PromptExecution/l3dg3rr/sessions/4bd8b5e7-1929-4a78-bdfa-37a4043bc7ca
Co-authored-by: elasticdotventures <35611074+elasticdotventures@users.noreply.github.com>
Applied all changes in commit
Summary
Implements PRD-4 as a staged ontology and local Phi-4 integration plan, then lands the first implementation slices through the end-to-end audit playbook.
- `PRD-4.md` with phased product requirements, visual-first audit diagrams, and required tests.
- `ledger-core::ontology` primitives and MCP compatibility mapping while preserving legacy ontology payload shape and hash prefixes.
- `AGENTS.md` guidance.

Validation
- `cargo test -p ledger-core ontology_`
- `cargo test -p ledger-core ontology_snapshot_to_rhai_dsl`
- `cargo test -p ledger-core proposal`
- `cargo test -p ledger-core semantic`
- `cargo test -p ledger-core rule_registry`
- `cargo test -p ledgerr-host`
- `cargo test -p ledgerr-mcp`
- `cargo test -p ledgerr-mcp audit_playbook_ids_match_across_workbook_ontology_and_events`
- `cargo test -p ledgerr-host internal_phi_fallback_runs_audit_playbook_prompt`
- `just mcp-cli-basic`
- `just mcp-cli-spinning-wheels`
- `just docgen-check`
- `git diff --check`
- `just test` was started and progressed through the workspace into the real Phi-4 smoke tests; it was still running in the local model output tests when the publish request arrived. Focused Phase 7 and prior-phase gates passed.