Claude/add external fixture loader t ap7 n #86
Merged
Root cause: candidate texts explicitly stated pass/fail criteria, letting Raw LLM solve cases by surface reading without needing PSE tags.

Affected cases (8 fixture files):
- topo_t02: "exit route" / "local loop" → neutral path descriptions
- topo_t03: explicit TopologyGuardProof presence/absence → neutral "evaluation record" language; proof status in tags only
- mem_m02: explicit SHA-256/PoR/gate/drift pass/fail → structural chain descriptions only (length + domain)
- mem_m03: "All four recall preconditions passed" + explicit failures → neutral crystal descriptions (type, domain) only
- sched_sc01: "Resolves constraint violations" / "gate evaluation" → neutral log/audit/update language; urgency in tags only
- cog_c02/c03: "Budget and TTL within limits" / "trace_replay_valid=true" / explicit failures → reason code only; admission status in tags only
- hor_c01: event log named "MigrateCarrier"/"NeedsCarrierMigration" directly → opaque "failure policy selected per spec table"; candidate texts no longer mention carrier order / emission order explicitly; distractor texts no longer name their gate association
- dyn_c02: "directly reducing path_delta" / "that drive large per-tick displacements" → neutral operational descriptions; delta impact in tags
- nctcs_c01/c03: event log named reached_class directly → opaque "conformance classification complete/applied"; candidate texts no longer state which class is reached or reference obligation names
- pm_c02: "passed field is true" / "all sub-gate checks passed" / "passed=false" / "no StitcherGateReport reference" → neutral gate reference descriptions; validity in tags only
- pm_c03: "cycle N" / "cycle N-1" / "cycle N-2" / "passed=true" → neutral "evaluation window" language; cycle correctness in tags only

PSE tags remain unchanged and are the sole reliable signal for all cases.
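One way to guard against this kind of leak coming back is a small lint over candidate texts. The sketch below is illustrative only and is not part of this PR: `LEAK_PATTERNS` and `find_leaks` are hypothetical names, and the patterns are drawn from a few of the phrases removed above, not from an exhaustive list.

```python
import re

# Hypothetical patterns based on wording this commit removed from candidate texts.
# This list is an assumption for illustration, not the project's actual rule set.
LEAK_PATTERNS = [
    r"passed\s*=\s*(true|false)",
    r"trace_replay_valid\s*=\s*true",
    r"all four recall preconditions passed",
    r"within limits",
]


def find_leaks(candidate_text, patterns=LEAK_PATTERNS):
    """Return the patterns that match candidate_text (case-insensitive)."""
    return [p for p in patterns if re.search(p, candidate_text, re.IGNORECASE)]


# Example texts, invented for this sketch.
leaky = "Gate report attached; passed=true and budget within limits."
neutral = "Gate reference recorded for the evaluation window."

print(find_leaks(leaky))    # non-empty: the text states its own verdict
print(find_leaks(neutral))  # empty: the verdict would have to come from tags
```

A check like this could run over every candidate and distractor text in a fixture file before the fixture is accepted, so that pass/fail status stays in the PSE tags.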
…eration cases

15 productive cases (5 audit, 5 recovery_plan, 5 gate_trace) scored by required_elements substring matching instead of candidate selection. PSE constraints supply the formal vocabulary (DeterminismViolation, G_trace, Recondense, recondensation_status, MigrateCarrier, NeedsCarrierMigration, KeepTensorUnchanged, Axiom 6.1.1, etc.) that raw LLM lacks.

- fixtures/productive/productive_v1.json: 15 cases, 75 evaluation slots
- pse_groq_agent.py: PSE_AUDIT_CONSTRAINTS, PSE_RECOVERY_CONSTRAINTS, PSE_GATE_TRACE_CONSTRAINTS, build_raw/pse_prompt_productive, run_case_productive, PRODUCTIVE_SCHEMAS; call_groq gains max_tokens param
- pse_fullstack_runner.py: productive layer in STACK, run_case_productive branch in run_fixture, imports updated

https://claude.ai/code/session_01K5AN3s9TnGo1Az4jYwagtw
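For reviewers unfamiliar with this scoring style, the following is a minimal sketch of required_elements substring matching. It is not the code in pse_groq_agent.py: the function name `score_required_elements`, the returned dict layout, and the `case_sensitive` flag are assumptions made for illustration.

```python
def score_required_elements(response, required_elements, case_sensitive=True):
    """Score a generated response by how many required substrings it contains.

    Hypothetical helper: the real run_case_productive may weight or
    normalise differently.
    """
    text = response if case_sensitive else response.lower()
    hits, misses = [], []
    for element in required_elements:
        needle = element if case_sensitive else element.lower()
        (hits if needle in text else misses).append(element)
    # Guard against an empty requirement list to avoid division by zero.
    score = len(hits) / len(required_elements) if required_elements else 0.0
    return {"score": score, "hits": hits, "misses": misses}


# Example response and requirement list, invented for this sketch.
response = "DeterminismViolation detected at G_trace; action: Recondense."
required = ["DeterminismViolation", "G_trace", "Recondense", "MigrateCarrier"]

result = score_required_elements(response, required)
print(result["score"], result["misses"])
```

Because matching is on exact substrings, a response that describes the right behaviour in different words scores zero for that element, which is why the formal vocabulary supplied by the PSE constraints matters here.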
No description provided.