ProfRandom92/Comptextv7


CompText V7

Deterministic replay-integrity validation for compressed MCP-style operational traces.

No embeddings • No vector DB • No semantic scoring • No LLM judges


Research Positioning · Benchmark Details · Multi-Family Benchmark · Failure Taxonomy

CompText V7 validates whether compressed operational commitments survive deterministic replay reconstruction in MCP-style agent workflows.


In 30 seconds

Long-horizon agents compress prior work into smaller summaries. Those summaries can silently lose blockers, constraints, evidence, dependency order, recovery paths, and tool order.

CompText V7 treats that as a deterministic replay-validation problem. It checks whether compressed operational state remains admissible after reconstruction using fixture-defined contracts, exact scoring, failure labels, committed artifacts, and CI gates.


What CompText V7 is

  • Deterministic replay-validation infrastructure for operational state.
  • Fixture-bound and contract-linked.
  • Artifact-backed with reproducible JSON/SVG outputs.
  • CI-reproducible through repository checks.
  • Focused on operational admissibility, not prose quality.

What CompText V7 is not

  • Agent framework.
  • Workflow orchestrator.
  • Learned compressor.
  • Vector memory system.
  • RAG replacement.
  • KV-cache optimizer.
  • Production telemetry platform.
  • Clinical-grade system.
  • Universal AI-memory solution.
  • LLM judge.

Replay validation model

flowchart LR
    A["Checked-in fixture"] --> B["Original operational state"]
    B --> C["Reconstructed replay state"]
    C --> D["Contract validator"]
    D --> E["Admissibility scorer"]
    E --> F["Failure labels"]
    E --> G["Committed artifacts"]
    G --> H["CI gates"]
    F --> H

Operational commitments

CompText V7 validates whether deterministic replay reconstruction preserves:

  • evidence
  • constraints
  • blockers
  • dependencies
  • recovery paths
  • tool order
  • capability boundaries
  • governance/policy gates

The mcp_trace_replay fixture family validates deterministic replay safety for tool order, validation-before-action, dependency chains, recovery paths, and capability boundaries. Registered contracts:

  • tool_call_order_preserved
  • validation_before_unsafe_action
  • dependency_chain_preserved
  • recovery_path_available
  • capability_boundary_respected
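As a rough illustration of what two of these contracts could check, here is a minimal Python sketch. It is not the repository's implementation: the function bodies, the subsequence interpretation of "order preserved", and the tool names (`read_config`, `validate_change`, `apply_change`, `list_hosts`) are all assumptions made for the example. Only the contract names come from the list above.

```python
from typing import Sequence


def tool_call_order_preserved(required: Sequence[str], replayed: Sequence[str]) -> bool:
    """Assumed rule: `required` must appear in `replayed` as an in-order subsequence."""
    remaining = iter(replayed)
    # `step in remaining` consumes the iterator, so each match must come
    # after the previous one.
    return all(step in remaining for step in required)


def validation_before_unsafe_action(
    replayed: Sequence[str], validator: str, unsafe: str
) -> bool:
    """Assumed rule: no `unsafe` call may occur before the first `validator` call."""
    validated = False
    for call in replayed:
        if call == validator:
            validated = True
        elif call == unsafe and not validated:
            return False
    return True


# Hypothetical tool traces, for illustration only.
required_order = ["read_config", "validate_change", "apply_change"]
good_replay = ["read_config", "list_hosts", "validate_change", "apply_change"]
bad_replay = ["read_config", "apply_change", "validate_change"]

print(tool_call_order_preserved(required_order, good_replay))  # True
print(tool_call_order_preserved(required_order, bad_replay))   # False
print(validation_before_unsafe_action(bad_replay, "validate_change", "apply_change"))  # False
```

Both checks are pure functions over lists of strings, so the same inputs always produce the same verdict, which is the property a deterministic contract needs.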


Current fixture-bound signal

  • Four manifest-registered operational fixture families.
  • Standard levels: baseline, mild, moderate, severe.
  • Deterministic evaluation mode.
  • Exact rational scoring.
  • Reproducible artifacts.
  • No LLM judges or external APIs.

These are internal fixture-bound results, not external benchmark claims, production-readiness claims, or solved-memory claims.

| Signal | Current fixture-bound result |
| --- | --- |
| Agent trace replay consistency | 1.000000 |
| Paper replay consistency | 0.791667 |
| CONSERVATIVE replay consistency | 0.895833 |
| BALANCED replay consistency | 0.250000 |
| AGGRESSIVE replay consistency | 0.125000 |
| Paper avg compression | 1.347063 |
| Agent avg compression | 1.773954 |
| Agent replay consistency | 1.000000 |
| Agent operational drift | 0.000000 |
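"Exact rational scoring" can be pictured with Python's standard `fractions` module. The sketch below is an assumption about the general technique, not the project's scorer: the helper names `consistency` and `render` are hypothetical, and the counts are made up rather than taken from the artifacts. It shows why rational arithmetic avoids float rounding until the final six-decimal rendering.

```python
from fractions import Fraction


def consistency(passed: int, total: int) -> Fraction:
    """Exact ratio of passing checks; hypothetical helper for illustration."""
    if total <= 0:
        raise ValueError("total must be positive")
    return Fraction(passed, total)


def render(score: Fraction) -> str:
    """Convert to a six-decimal string only at the reporting boundary."""
    return f"{float(score):.6f}"


# Illustrative counts, not values from the committed artifacts.
scores = [consistency(7, 8), consistency(1, 8)]
mean = sum(scores, Fraction(0)) / len(scores)

print(mean)          # 1/2
print(render(mean))  # 0.500000

# Rational sums are exact, while the equivalent float sum is not.
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True
print(0.1 + 0.2 == 0.3)                                      # False
```

Keeping scores as fractions until the last step means two runs over the same fixtures cannot disagree because of accumulation order or platform float behavior.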

Artifact evidence pipeline

flowchart LR
    A["fixtures/manifest.json"] --> B["Fixture families"]
    B --> C["DegradationCurveGenerator"]
    B --> D["AdmissibilityScorer"]
    C --> E["multi_family_admissibility_curves.svg"]
    D --> F["layered_admissibility_results.json"]
    D --> G["multi_family_admissibility_results.json"]
    F --> H["Reproducibility tests"]
    G --> H
    E --> I["Progression tests"]
    H --> J["GitHub Actions"]
    I --> J

Minimal deterministic example

{
  "original_operational_state": {
    "policy_steps": ["identify_owner", "collect_evidence", "execute_recovery"],
    "causal_dependencies": [["alert", "triage"], ["triage", "recovery"]],
    "recovery_paths": ["ack -> mitigation_runbook"]
  },
  "reconstructed_state": {
    "policy_steps": ["collect_evidence", "identify_owner", "execute_recovery"],
    "causal_dependencies": [["alert", "recovery"]],
    "recovery_paths": []
  },
  "deterministic_validation_result": {
    "admissible": false,
    "failure_labels": [
      "POLICY_ORDER_BROKEN",
      "CAUSAL_DEPENDENCY_LOSS",
      "RECOVERY_PATH_INVALID",
      "INVARIANT_VIOLATION"
    ]
  }
}
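The JSON above could be produced by a comparison along the following lines. This is a minimal Python sketch under stated assumptions, not the repository's validator: `validate_replay` is a hypothetical function, and the three rules (identical step order, all causal edges retained, all recovery paths retained) are guesses inferred from the example. `INVARIANT_VIOLATION` is deliberately left out because the README does not define which invariant triggers it.

```python
from typing import Any


def validate_replay(original: dict[str, Any], reconstructed: dict[str, Any]) -> dict[str, Any]:
    """Compare two operational states and collect failure labels (assumed rules)."""
    labels: list[str] = []

    # Assumed rule: policy steps must survive in exactly the original order.
    if list(reconstructed["policy_steps"]) != list(original["policy_steps"]):
        labels.append("POLICY_ORDER_BROKEN")

    # Assumed rule: every original causal edge must still be present.
    original_edges = {tuple(edge) for edge in original["causal_dependencies"]}
    replayed_edges = {tuple(edge) for edge in reconstructed["causal_dependencies"]}
    if not original_edges <= replayed_edges:
        labels.append("CAUSAL_DEPENDENCY_LOSS")

    # Assumed rule: every original recovery path must still be present.
    if not set(original["recovery_paths"]) <= set(reconstructed["recovery_paths"]):
        labels.append("RECOVERY_PATH_INVALID")

    return {"admissible": not labels, "failure_labels": labels}


original_state = {
    "policy_steps": ["identify_owner", "collect_evidence", "execute_recovery"],
    "causal_dependencies": [["alert", "triage"], ["triage", "recovery"]],
    "recovery_paths": ["ack -> mitigation_runbook"],
}
reconstructed_state = {
    "policy_steps": ["collect_evidence", "identify_owner", "execute_recovery"],
    "causal_dependencies": [["alert", "recovery"]],
    "recovery_paths": [],
}

result = validate_replay(original_state, reconstructed_state)
print(result["admissible"])      # False
print(result["failure_labels"])  # ['POLICY_ORDER_BROKEN', 'CAUSAL_DEPENDENCY_LOSS', 'RECOVERY_PATH_INVALID']
```

Comparing a state with itself yields `admissible: True` and an empty label list, which makes a convenient baseline check.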

Proof artifacts

| Artifact | Purpose |
| --- | --- |
| artifacts/layered_admissibility_results.json | Layered admissibility outputs. |
| artifacts/multi_family_admissibility_results.json | Multi-family deterministic aggregates. |
| artifacts/multi_family_admissibility_curves.svg | Deterministic degradation curve rendering. |
| artifacts/mcp_trace_replay_results.json | Deterministic MCP trace replay contract outcomes. |
| artifacts/replay_semantic_integrity_results.json | Deterministic replay semantic integrity outcomes. |
| docs/benchmarks/multi_family_admissibility_benchmark.md | Benchmark method and interpretation boundaries. |
| docs/failure_taxonomy.md | Failure label documentation. |
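One common way to test that a JSON artifact is reproducible is to compare digests of a canonical serialization. The Python sketch below illustrates that general technique with the standard library only. It is an assumption about how such a check might look, not the repository's test code, and `canonical_digest` and the sample payloads are hypothetical.

```python
import hashlib
import json
from typing import Any


def canonical_digest(payload: Any) -> str:
    """SHA-256 of a canonical JSON encoding (sorted keys, fixed separators)."""
    encoded = json.dumps(
        payload,
        sort_keys=True,          # key order no longer depends on insertion order
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=True,       # stable byte output regardless of locale
    ).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Hypothetical payloads: same content, different key insertion order.
run_a = {"family": "mcp_trace_replay", "level": "mild", "admissible": True}
run_b = {"admissible": True, "level": "mild", "family": "mcp_trace_replay"}
run_c = {"family": "mcp_trace_replay", "level": "mild", "admissible": False}

print(canonical_digest(run_a) == canonical_digest(run_b))  # True
print(canonical_digest(run_a) == canonical_digest(run_c))  # False
```

A test built this way would regenerate the artifact in memory, digest it, and compare against the digest of the committed file, failing if they differ.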

Verify locally

python -m pip install -e '.[test]'
npm install --no-save --no-package-lock
npm run check
pytest tests/test_failure_taxonomy.py -q
pytest tests/test_multi_family_admissibility_artifact.py -q
pytest tests/test_multi_family_svg_renderer.py -q
pytest tests/test_paper_replay_bench.py tests/test_agent_trace_replay.py -q

Benchmark families

  • coding_workflow_pr_review
  • incident_response_page_triage
  • cross_domain_operational_dependency_workflow
  • mcp_trace_replay
flowchart LR
    A["coding_workflow_pr_review"] --> L1["baseline"]
    A --> L2["mild"]
    A --> L3["moderate"]
    A --> L4["severe"]
    B["incident_response_page_triage"] --> L1
    B --> L2
    B --> L3
    B --> L4
    C["cross_domain_operational_dependency_workflow"] --> L1
    C --> L2
    C --> L3
    C --> L4
    D["mcp_trace_replay"] --> L1
    D --> L2
    D --> L3
    D --> L4
    L1 --> M["manifest registration"]
    L2 --> M
    L3 --> M
    L4 --> M
    M --> N["multi-family artifact"]
    N --> O["deterministic SVG"]
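The diagram implies a full grid of four families by four levels. A minimal Python sketch of checking that grid for gaps follows. The family and level names come from this README, but the manifest schema is not shown here, so representing registrations as a set of `(family, level)` pairs and the helper names `expected_cases` and `missing_cases` are assumptions.

```python
from itertools import product

FAMILIES = (
    "coding_workflow_pr_review",
    "incident_response_page_triage",
    "cross_domain_operational_dependency_workflow",
    "mcp_trace_replay",
)
LEVELS = ("baseline", "mild", "moderate", "severe")


def expected_cases() -> list[tuple[str, str]]:
    """Every (family, level) pair the grid should contain."""
    return list(product(FAMILIES, LEVELS))


def missing_cases(registered: set[tuple[str, str]]) -> list[tuple[str, str]]:
    """Pairs absent from a registration set (assumed representation)."""
    return [case for case in expected_cases() if case not in registered]


# Hypothetical registration set with one gap, for illustration.
registered = set(expected_cases()) - {("mcp_trace_replay", "severe")}

print(len(expected_cases()))      # 16
print(missing_cases(registered))  # [('mcp_trace_replay', 'severe')]
```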

Failure labels

Primary registered labels used across deterministic admissibility validation:

  • POLICY_ORDER_BROKEN: required policy order failed.
  • TOOL_ORDER_VIOLATION: replayed tool sequence violated required order.
  • CAUSAL_DEPENDENCY_LOSS: required causal edges were not preserved.
  • DEPENDENCY_CHAIN_BREAK: required dependency chain broke.
  • RECOVERY_PATH_INVALID: recovery reachability contract failed.
  • RECOVERY_PATH_LOSS: required recovery route was not preserved.
  • INVARIANT_VIOLATION: declared invariant failed.
  • EVIDENCE_LOSS: required evidence did not survive replay.
  • EVIDENCE_SURVIVAL_LOSS: expected evidence units were not preserved.
  • HIGH_CRITICAL_EVIDENCE_LOSS: high-critical evidence was lost.
  • CONSTRAINT_DRIFT: constraint preservation drifted.
  • BLOCKER_DETACHMENT: blocker attachment was lost.
  • GOVERNANCE_DRIFT: governance constraint drifted.
  • ARTIFACT_INTEGRITY_VIOLATION: artifact integrity drifted.
  • REPLAY_NON_REPRODUCIBLE: deterministic replay was not reproducible.
flowchart LR
    O1["POLICY_ORDER_BROKEN"] --> C1["ordering"]
    O2["TOOL_ORDER_VIOLATION"] --> C1
    D1["CAUSAL_DEPENDENCY_LOSS"] --> C2["causality/dependency"]
    D2["DEPENDENCY_CHAIN_BREAK"] --> C2
    R1["RECOVERY_PATH_INVALID"] --> C3["recovery/reachability"]
    R2["RECOVERY_PATH_LOSS"] --> C3
    I1["INVARIANT_VIOLATION"] --> C4["invariant/no-orphan"]
    E1["EVIDENCE_LOSS"] --> C5["evidence/criticality"]
    E2["EVIDENCE_SURVIVAL_LOSS"] --> C5
    E3["HIGH_CRITICAL_EVIDENCE_LOSS"] --> C5
    E4["CONSTRAINT_DRIFT"] --> C5
    E5["BLOCKER_DETACHMENT"] --> C5
    E6["GOVERNANCE_DRIFT"] --> C5
    A1["ARTIFACT_INTEGRITY_VIOLATION"] --> C6["artifact/reproducibility"]
    A2["REPLAY_NON_REPRODUCIBLE"] --> C6
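The label-to-category grouping in the diagram can be written as a plain lookup table. The Python sketch below transcribes the edges shown above. The dictionary name `LABEL_CATEGORY` and the helper `categories_for` are hypothetical and are not claimed to exist in the repository.

```python
LABEL_CATEGORY = {
    "POLICY_ORDER_BROKEN": "ordering",
    "TOOL_ORDER_VIOLATION": "ordering",
    "CAUSAL_DEPENDENCY_LOSS": "causality/dependency",
    "DEPENDENCY_CHAIN_BREAK": "causality/dependency",
    "RECOVERY_PATH_INVALID": "recovery/reachability",
    "RECOVERY_PATH_LOSS": "recovery/reachability",
    "INVARIANT_VIOLATION": "invariant/no-orphan",
    "EVIDENCE_LOSS": "evidence/criticality",
    "EVIDENCE_SURVIVAL_LOSS": "evidence/criticality",
    "HIGH_CRITICAL_EVIDENCE_LOSS": "evidence/criticality",
    "CONSTRAINT_DRIFT": "evidence/criticality",
    "BLOCKER_DETACHMENT": "evidence/criticality",
    "GOVERNANCE_DRIFT": "evidence/criticality",
    "ARTIFACT_INTEGRITY_VIOLATION": "artifact/reproducibility",
    "REPLAY_NON_REPRODUCIBLE": "artifact/reproducibility",
}


def categories_for(labels: list[str]) -> list[str]:
    """Sorted, de-duplicated categories for a list of failure labels."""
    unknown = [label for label in labels if label not in LABEL_CATEGORY]
    if unknown:
        raise ValueError(f"unregistered failure labels: {unknown}")
    return sorted({LABEL_CATEGORY[label] for label in labels})


# Labels from the minimal deterministic example earlier in this README.
example_labels = [
    "POLICY_ORDER_BROKEN",
    "CAUSAL_DEPENDENCY_LOSS",
    "RECOVERY_PATH_INVALID",
    "INVARIANT_VIOLATION",
]
print(categories_for(example_labels))
# ['causality/dependency', 'invariant/no-orphan', 'ordering', 'recovery/reachability']
```

Rejecting unknown labels keeps a typo from silently producing an empty category list.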

How this differs from adjacent systems

| System type | Stores state | Compresses context | Orchestrates agents | Deterministically validates replay loss |
| --- | --- | --- | --- | --- |
| Workflow runtimes | Sometimes | No | Yes | No |
| Agent frameworks | Sometimes | Sometimes | Yes | Usually no |
| Vector memory / RAG | Yes | Retrieval-centric | No | No |
| Learned prompt compressors | Sometimes | Yes | No | Usually no |
| LLM-as-judge evaluators | Sometimes | N/A | No | No |
| CompText V7 | Yes | Yes | No | Yes |

CI and merge gate

flowchart LR
    A["PR head SHA"] --> B["GitHub Actions"]
    B --> C["Agent Workflow Checks"]
    B --> D["hash-companion-validation"]
    B --> E["CompText V7 Industrial Validation"]
    C --> F["all success"]
    D --> F
    E --> F
    F --> G["squash merge"]
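The "all success" node can be read as a simple conjunction over the three named checks. The Python sketch below expresses that reading. The check names are taken from the diagram, while `merge_allowed`, the dictionary input shape, and the sample results are assumptions for illustration and do not describe the repository's actual branch protection configuration.

```python
REQUIRED_CHECKS = (
    "Agent Workflow Checks",
    "hash-companion-validation",
    "CompText V7 Industrial Validation",
)


def merge_allowed(conclusions: dict[str, str]) -> bool:
    """True only if every required check reported the conclusion 'success'.

    Checks outside REQUIRED_CHECKS (for example deployment previews) are ignored.
    """
    return all(conclusions.get(name) == "success" for name in REQUIRED_CHECKS)


# Hypothetical check results for one PR head SHA.
passing = {
    "Agent Workflow Checks": "success",
    "hash-companion-validation": "success",
    "CompText V7 Industrial Validation": "success",
    "Vercel preview": "failure",  # not a merge gate, so it does not block
}
failing = dict(passing, **{"hash-companion-validation": "failure"})

print(merge_allowed(passing))  # True
print(merge_allowed(failing))  # False
print(merge_allowed({}))       # False (missing checks count as not passed)
```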

Vercel/Netlify/deployment previews are not merge gates unless explicitly scoped.


Repository map

Comptextv7/
├── artifacts/
├── docs/
├── fixtures/
├── reports/
├── scripts/
├── tests/
└── src/
    ├── core/
    └── validation/

Replay-validation roadmap

flowchart LR
    A["failure taxonomy"] --> B["cross-domain fixture families"]
    B --> C["forensic reports"]
    C --> D["schema stabilization"]
    D --> E["cross-family comparison"]
    E --> F["integrity gates"]
    F --> G["golden corpus"]
    G --> H["offline import/export"]
  • Forensic audit reports with deterministic exports.
  • Artifact schema stabilization.
  • Cross-family degradation comparison.
  • Minimal artifact integrity gates.
  • Golden corpus foundation.
  • Offline import/export schemas only.

Limitations

  • Metrics are fixture-bound and internal to checked-in datasets.
  • Fixtures are curated and checked in, not live production traces.
  • This is a deterministic prototype and makes no production-readiness claim.
  • This is not a universal AI-memory claim.
  • This does not claim runtime integration or orchestration coverage.
