
[codex] Implement PRD-4 ontology and Phi-4 integration plan #46

Merged
elasticdotventures merged 2 commits into main from review-ontology-phi4-course on Apr 30, 2026

@elasticdotventures
Member

Summary

Implements PRD-4 as a staged ontology and local Phi-4 integration plan, then lands the first implementation slices through the end-to-end audit playbook.

  • Adds PRD-4.md with phased product requirements, visual-first audit diagrams, and required tests.
  • Adds canonical ledger-core::ontology primitives and MCP compatibility mapping while preserving legacy ontology payload shape and hash prefixes.
  • Adds opt-in ingest ontology emission, ontology-to-Rhai visual DSL output, typed host-owned Phi-4 classification jobs, proposal lifecycle enforcement, deterministic local semantic retrieval, and an audit playbook contract.
  • Updates mdBook capability/playbook docs, MCP CLI demo flows, and durable AGENTS.md guidance.
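
For readers unfamiliar with the pattern, "proposal lifecycle enforcement" can be pictured as a small state machine in which a commit is only reachable from an approved state. The sketch below is illustrative only: the state names, the error type, and the `advance` function are assumptions made for this example and are not taken from ledger-core's proposal module.

```rust
// Hypothetical lifecycle states; the real ledger-core proposal types may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProposalState {
    Draft,
    Reviewed,
    Approved,
    Committed,
    Rejected,
}

// Hypothetical error returned when a transition is not permitted.
#[derive(Debug, PartialEq, Eq)]
enum LifecycleError {
    InvalidTransition { from: ProposalState, to: ProposalState },
}

/// Returns the new state only when the transition is on the allow-list.
/// Committed is reachable only from Approved, which is the "commit gate".
fn advance(from: ProposalState, to: ProposalState) -> Result<ProposalState, LifecycleError> {
    use ProposalState::*;
    let allowed = matches!(
        (from, to),
        (Draft, Reviewed)
            | (Reviewed, Approved)
            | (Approved, Committed)
            | (Draft, Rejected)
            | (Reviewed, Rejected)
    );
    if allowed {
        Ok(to)
    } else {
        Err(LifecycleError::InvalidTransition { from, to })
    }
}
```

Under these assumptions, advance(Draft, Committed) is rejected, so a proposal cannot skip review on its way to a commit.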

Validation

  • cargo test -p ledger-core ontology_
  • cargo test -p ledger-core ontology_snapshot_to_rhai_dsl
  • cargo test -p ledger-core proposal
  • cargo test -p ledger-core semantic
  • cargo test -p ledger-core rule_registry
  • cargo test -p ledgerr-host
  • cargo test -p ledgerr-mcp
  • cargo test -p ledgerr-mcp audit_playbook_ids_match_across_workbook_ontology_and_events
  • cargo test -p ledgerr-host internal_phi_fallback_runs_audit_playbook_prompt
  • just mcp-cli-basic
  • just mcp-cli-spinning-wheels
  • just docgen-check
  • git diff --check

just test was started and ran through the workspace into the real Phi-4 smoke tests; it was still running the local model output tests when the publish request arrived, so the full suite had not finished. The focused Phase 7 and prior-phase gates passed.

Comment thread crates/ledgerr-host/src/internal_openai.rs Fixed
Comment thread crates/ledgerr-host/src/internal_openai.rs Fixed
Comment thread crates/ledgerr-mcp/src/lib.rs Fixed
Comment thread crates/ledgerr-mcp/src/lib.rs Fixed
@elasticdotventures elasticdotventures marked this pull request as ready for review April 30, 2026 12:23
Copilot AI review requested due to automatic review settings April 30, 2026 12:23
Contributor

Copilot AI left a comment

Pull request overview

This PR implements the first concrete slices of PRD-4 by introducing canonical ontology primitives in ledger-core, wiring opt-in ontology emission through MCP ingest, adding typed Phi-4 “job” scaffolding with deterministic fallback behavior, and landing an end-to-end audit playbook contract across workbook/events/ontology.

Changes:

  • Add canonical ledger-core ontology snapshot/types + proposal lifecycle primitives; adapt MCP ontology store/hashes to match.
  • Extend MCP ingest (rows/PDF) with optional ontology_path and emit deterministic document→transaction ontology edges when enabled.
  • Introduce deterministic “semantic retrieval” fallback in RuleRegistry, add typed Phi-4 classification job plumbing + fallback responses, and expand docs/playbook/test coverage.
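
As a rough illustration of what a deterministic lexical "semantic" fallback with stable candidate IDs might look like, the sketch below ranks rule texts by token overlap and derives IDs from a fixed hash (FNV-1a). All function names, the `cand-` prefix, and the scoring scheme are assumptions for this example; the PR's RuleRegistry may work differently.

```rust
use std::collections::BTreeSet;

/// FNV-1a 64-bit: a fixed, published hash, so IDs do not vary between runs.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical stable ID: the same rule text always yields the same ID.
fn candidate_id(rule_text: &str) -> String {
    format!("cand-{:016x}", fnv1a64(rule_text.as_bytes()))
}

/// Lowercased alphanumeric tokens, kept in a sorted set.
fn tokens(text: &str) -> BTreeSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Ranks rules by token overlap with the query (highest first).
/// Ties are broken by candidate ID, so the order is fully deterministic.
fn rank<'a>(query: &str, rules: &[&'a str]) -> Vec<(String, &'a str, usize)> {
    let q = tokens(query);
    let mut scored: Vec<(String, &'a str, usize)> = rules
        .iter()
        .map(|rule| {
            let overlap = tokens(rule).intersection(&q).count();
            (candidate_id(rule), *rule, overlap)
        })
        .filter(|(_, _, overlap)| *overlap > 0)
        .collect();
    scored.sort_by(|a, b| b.2.cmp(&a.2).then_with(|| a.0.cmp(&b.0)));
    scored
}
```

Using a fixed hash rather than the standard library's default hasher is a deliberate choice in this sketch, since the default hasher's algorithm is not guaranteed to stay the same across Rust releases.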

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| scripts/mcp_cli_demo.sh | Adds demo modes + ontology snapshot export; updates default temp paths. |
| crates/ledgerr-mcp/tests/tax_evidence_chain_contract.rs | Updates ingest request with ontology_path: None. |
| crates/ledgerr-mcp/tests/restart_persistence.rs | Updates ingest request with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase6_mcp_exposure_gaps.rs | Updates ingest PDF requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase5_cpa_outputs.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase4_audit_integrity.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase3_mcp_classification.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/phase2_mcp_contract_remaining.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/ontology_contract.rs | Adds ontology conversion/mapping tests + ingest ontology emission + proposal provenance assertions. |
| crates/ledgerr-mcp/tests/interface.rs | Updates ingest requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/events_replay_contract.rs | Updates ingest requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/events_contract.rs | Updates ingest requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/e2e_mvp_flow.rs | Updates ingest PDF request with ontology_path: None. |
| crates/ledgerr-mcp/tests/e2e_bdd.rs | Updates ingest PDF requests with ontology_path: None. |
| crates/ledgerr-mcp/tests/document_inventory.rs | Updates ingest request with ontology_path: None. |
| crates/ledgerr-mcp/tests/contract_codegen.rs | Adds contract parsing coverage for optional ontology_path in ingest actions. |
| crates/ledgerr-mcp/tests/audit_playbook_contract.rs | New end-to-end contract test tying tx_id across workbook, ontology, and events. |
| crates/ledgerr-mcp/src/ontology.rs | Rehomes ontology kinds/hashes onto ledger-core::ontology and adds to_core_snapshot. |
| crates/ledgerr-mcp/src/mcp_adapter.rs | Plumbs optional ontology_path through ingest parsing and expands ontology kind parsing. |
| crates/ledgerr-mcp/src/lib.rs | Adds ontology_path to ingest requests and emits ingest ontology edges when enabled. |
| crates/ledgerr-mcp/src/contract.rs | Extends documents tool contract to accept optional ontology_path. |
| crates/ledgerr-host/src/internal_openai.rs | Extends deterministic Phi-4 fallback outputs + adds tests for typed JSON and playbook prompt. |
| crates/ledgerr-host/src/chat.rs | Adds InvalidTypedOutput error mapping. |
| crates/ledgerr-host/src/agent_runtime.rs | Adds typed classification job + typed output validation + tests. |
| crates/ledger-core/tests/rule_registry.rs | Adds tests for stable semantic candidate IDs and semantic selection behavior. |
| crates/ledger-core/src/rule_registry.rs | Implements deterministic lexical “semantic” index + stable candidate IDs. |
| crates/ledger-core/src/proposal.rs | New proposal lifecycle/policy + commit gating + provenance injection + tests. |
| crates/ledger-core/src/ontology.rs | New canonical ontology primitives + deterministic hashing/sorting + Rhai DSL conversion + tests. |
| crates/ledger-core/src/lib.rs | Exposes new ontology/proposal modules and adjusts exports ordering. |
| book/src/capability-map.md | Documents PRD-4 phases with Rhai visual blocks. |
| book/src/audit-playbook.md | New playbook chapter describing runnable paths and the audit graph chain. |
| book/src/SUMMARY.md | Adds Audit Playbook and removes duplicate/invalid chapter entries. |
| book/mdbook-admonish.css | Adds admonish CSS asset file at book root. |
| book/book.toml | Bumps admonish assets_version and updates CSS include path. |
| PRD-4.md | Adds PRD-4 staged requirements + visuals + acceptance tests. |
| AGENTS.md | Adds durable PRD-4 implementation guidance and docs toolchain notes. |


Comment thread scripts/mcp_cli_demo.sh
Comment on lines +4 to +7
DEMO_ROOT="${DEMO_ROOT:-/tmp/l3dg3rr-mcp-demo-$$}"
JOURNAL_PATH="${JOURNAL_PATH:-$DEMO_ROOT/demo.beancount}"
WORKBOOK_PATH="${WORKBOOK_PATH:-$DEMO_ROOT/demo.xlsx}"
ONTOLOGY_PATH="${ONTOLOGY_PATH:-$DEMO_ROOT/demo.ontology.json}"

Copilot AI Apr 30, 2026

DEMO_ROOT is used as the parent directory for JOURNAL_PATH/WORKBOOK_PATH/ONTOLOGY_PATH, but the script never creates it. Since journal/workbook writers open the target file paths directly (and don’t create parent dirs), the demo will fail on a clean run. Create the directory (e.g., mkdir -p "$DEMO_ROOT") before invoking cargo run.
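
The same failure mode can be avoided on the Rust side. A minimal sketch, assuming a hypothetical helper named `write_with_parents` (not part of the PR), shows a writer that creates missing parent directories before opening the file:

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: creates any missing parent directories, then writes the file.
/// File::create on its own fails with NotFound when the parent directory is absent.
fn write_with_parents(path: &Path, contents: &[u8]) -> io::Result<()> {
    if let Some(parent) = path.parent() {
        // A bare file name has an empty parent; there is nothing to create then.
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent)?;
        }
    }
    let mut file = File::create(path)?;
    file.write_all(contents)
}
```

Whether the directory is created by the script or by the writers is a design choice; the suggestion above keeps the writers unchanged and fixes the script.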

Comment thread crates/ledgerr-mcp/src/lib.rs Outdated
Comment on lines +1128 to +1130
if let Some(ontology_path) = request.ontology_path.as_deref() {
    emit_ingest_ontology_edges(ontology_path, &request.rows)?;
}

Copilot AI Apr 30, 2026

ontology_path is written to via emit_ingest_ontology_edges, but unlike source_ref/rkyv_ref it isn’t validated against workbook_path.parent() or checked for .. components. This lets an MCP caller choose an arbitrary file write location (absolute path or traversal via relative paths). Add the same confinement/traversal checks used in ingest_pdf (reject ParentDir, and restrict absolute paths to the allowed base) before calling emit_ingest_ontology_edges.
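
A minimal sketch of the kind of check being requested, using only std::path. The function name `confine_to_base` and the String error type are assumptions for illustration; the actual checks in ingest_pdf are not shown in this thread and may differ.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: resolves `candidate` under `base`, or rejects it.
/// - any `..` (ParentDir) component is rejected outright
/// - absolute paths are accepted only when they sit under `base`
/// - relative paths are joined onto `base`
fn confine_to_base(base: &Path, candidate: &Path) -> Result<PathBuf, String> {
    if candidate.components().any(|c| matches!(c, Component::ParentDir)) {
        return Err(format!("path contains '..': {}", candidate.display()));
    }
    if candidate.is_absolute() {
        // Path::starts_with compares whole components, so /data/books2 is not under /data/books.
        if candidate.starts_with(base) {
            Ok(candidate.to_path_buf())
        } else {
            Err(format!("absolute path outside allowed base: {}", candidate.display()))
        }
    } else {
        Ok(base.join(candidate))
    }
}
```

Note that this check is purely lexical: it does not resolve symlinks, so a stricter variant would need to canonicalize paths that already exist on disk.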

@elasticdotventures
Member Author

@copilot apply changes based on the comments in this thread

…nd clippy lints

- scripts/mcp_cli_demo.sh: add mkdir -p "$DEMO_ROOT" before cargo run invocations
- crates/ledgerr-mcp/src/contract.rs: sync generated_mcp_cli_demo_script() to match updated script
- crates/ledgerr-mcp/src/lib.rs: add ontology_path traversal/confinement checks in ingest_statement_rows (reject ParentDir components, reject absolute paths outside workbook_path.parent())
- crates/ledgerr-mcp/src/lib.rs:2173: remove redundant & on already-reference ontology_path (clippy)
- crates/ledgerr-host/src/internal_openai.rs:877,901: replace Phi4LocalFallbackBackend::default() with unit-struct literal (clippy)

Agent-Logs-Url: https://github.com/PromptExecution/l3dg3rr/sessions/4bd8b5e7-1929-4a78-bdfa-37a4043bc7ca

Co-authored-by: elasticdotventures <35611074+elasticdotventures@users.noreply.github.com>
Contributor

Copilot AI commented Apr 30, 2026

@copilot apply changes based on the comments in this thread

Applied all changes in commit 6f11460:

  1. scripts/mcp_cli_demo.sh – added mkdir -p "$DEMO_ROOT" before the first cargo run so the parent directory exists on a clean run.
  2. crates/ledgerr-mcp/src/contract.rs – synced generated_mcp_cli_demo_script() to match the updated script (required by the codegen drift test).
  3. crates/ledgerr-mcp/src/lib.rs – added ontology_path confinement and traversal checks in ingest_statement_rows: rejects .. (ParentDir) components, rejects absolute paths outside workbook_path.parent(), and resolves relative paths under the same allowed base. This mirrors the pattern already used for source_ref and rkyv_ref.
  4. crates/ledgerr-mcp/src/lib.rs:2173 – removed the redundant & on the already-reference ontology_path (clippy).
  5. crates/ledgerr-host/src/internal_openai.rs:877,901 – replaced Phi4LocalFallbackBackend::default() with the unit-struct literal Phi4LocalFallbackBackend (clippy).
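
For context on items 4 and 5, the sketch below shows the two patterns in isolation. The type `FallbackBackend` and both functions are hypothetical stand-ins, not the PR's code. The thread does not name the lints; they are probably clippy's needless_borrow and default_constructed_unit_structs, but that is an assumption.

```rust
// Hypothetical stand-in for a unit struct such as Phi4LocalFallbackBackend.
#[derive(Debug, Default, PartialEq)]
struct FallbackBackend;

// Hypothetical consumer that takes a borrowed path.
fn path_len(path: &str) -> usize {
    path.len()
}

fn build_backend() -> FallbackBackend {
    // Preferred: the unit-struct literal.
    // `FallbackBackend::default()` yields the same value but is flagged by Clippy.
    FallbackBackend
}

fn measure(ontology_path: &str) -> usize {
    // `ontology_path` is already a reference, so it is passed as-is.
    // `path_len(&ontology_path)` would add a redundant borrow that the compiler
    // then has to auto-dereference.
    path_len(ontology_path)
}
```

Neither change alters behavior; both only remove noise that Clippy reports.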

@elasticdotventures elasticdotventures merged commit d30ee7e into main Apr 30, 2026
1 check passed
@elasticdotventures elasticdotventures deleted the review-ontology-phi4-course branch April 30, 2026 12:44

4 participants