Introduce reason chain feature and JSON validation #28

Merged
RCFans merged 5 commits into main from release/reason-chain
Apr 9, 2026

Conversation

@RCFans

@RCFans RCFans commented Apr 9, 2026

Add a new reason chain feature with models and validation logic. Introduce JSON validation for input data and resolve linting issues. Update the project version to reflect these changes.

Copilot AI review requested due to automatic review settings April 9, 2026 11:10
@RCFans RCFans added the enhancement New feature or request label Apr 9, 2026

Copilot AI left a comment

Pull request overview

Adds a deterministic “reason chain” artifact (and optional view-model graph) to the A/B/C deterministic simulation pipeline, plus more resilient JSON parsing for LLM outputs and basic artifact validation.

Changes:

  • Generate and persist reason_chain.json (and optional reason_chain_view_model.json) during deterministic simulate/compare; attach linked evidence refs to scenario results.
  • Add LLM prompt template + optional --config/--debug/--workshop-ui-mode CLI flags for enrichment/debugging.
  • Introduce validators and expand test coverage (unit/integration/contract) for the new artifacts and parsing.
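
To illustrate what "more resilient JSON parsing for LLM outputs" can involve, here is a minimal sketch that pulls the first JSON object out of a reply that may wrap it in a code fence or surround it with prose. The function name `extract_json_object` is hypothetical; the PR's actual parser in src/omen/analysis/actor/insight.py may work differently.

```python
import json
import re
from typing import Any, Optional

# Built indirectly so this sketch can itself sit inside a fenced block.
_TICKS = "`" * 3
_FENCE_RE = re.compile(_TICKS + r"(?:json)?\s*(.*?)" + _TICKS, re.DOTALL | re.IGNORECASE)


def extract_json_object(raw: str) -> Optional[dict[str, Any]]:
    """Return the first JSON object found in an LLM reply, or None.

    Hypothetical helper, not the PR's implementation.
    """
    # Prefer fenced payloads, then fall back to scanning the whole reply.
    candidates = [match.group(1) for match in _FENCE_RE.finditer(raw)]
    candidates.append(raw)
    decoder = json.JSONDecoder()
    for text in candidates:
        start = text.find("{")
        while start != -1:
            try:
                value, _end = decoder.raw_decode(text, start)
            except json.JSONDecodeError:
                # Not valid JSON from this brace; try the next one.
                start = text.find("{", start + 1)
                continue
            if isinstance(value, dict):
                return value
            start = text.find("{", start + 1)
    return None
```

Scanning brace by brace with `raw_decode` tolerates leading chatter and trailing text without resorting to greedy regular expressions over nested objects.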

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| tests/unit/test_reason_chain_step_ids.py | Unit coverage for step-id helpers and step-list validation. |
| tests/unit/test_reason_chain_order.py | Unit coverage for reasoning-order validation. |
| tests/unit/test_insight_json_parser.py | Unit coverage for JSON extraction/sanitization behavior. |
| tests/unit/test_blocking_reference_linkage.py | Unit coverage for blocking linkage helper. |
| tests/integration/test_simulate_reason_chain_integration.py | Ensures deterministic simulate/compare write reason-chain artifact with expected structure. |
| tests/integration/test_reason_chain_evidence_refs.py | Ensures scenario results include evidence refs linked to reason steps. |
| tests/integration/test_actor_linked_paths_trace.py | Regression-style integration test for deterministic simulate output stability. |
| tests/contract/test_reason_chain_view_model_contract.py | Contract validation for workshop view-model artifact. |
| tests/contract/test_reason_chain_contract.py | Contract validation for reason-chain artifact and blocking linkage. |
| tests/contract/test_actor_derivation_baseline_contract.py | Contract baseline for actor-derivation artifact builder. |
| src/omen/scenario/models.py | Adds Pydantic models for reason-chain step/blocking structures. |
| src/omen/ingest/validators/scenario.py | Adds artifact-level validators for reason-chain and view-model outputs. |
| src/omen/ingest/synthesizer/prompts/registry.py | Adds prompt-version token helper for scenario reason-chain prompt. |
| src/omen/cli/main.py | Adds CLI flags for workshop view-model emission, config-driven enrichment, and debug logging. |
| src/omen/cli/case.py | Core wiring: builds reason-chain artifacts, optional LLM enrichment, evidence linkage, and optional view-model. |
| src/omen/analysis/actor/report_writer.py | Adds file writers for reason-chain artifacts. |
| src/omen/analysis/actor/insight.py | Improves JSON extraction and adds LLM invocation utilities for scenario reason-chain enrichment. |
| src/omen/analysis/actor/derivation.py | Adds deterministic reasoning-order list to strategic freedom conditions. |
| src/omen/analysis/actor/derivation_trace.py | Implements reason-chain construction, evidence ref linkage, and view-model graph builder. |
| README.zh.md | Updates Chinese README structure and quickstart guidance. |
| README.md | Updates English README structure and quickstart guidance. |
| pyproject.toml | Bumps project version to 0.1.5. |
| config/prompts/base.yaml | Registers and defines the new scenario_reason_chain_prompt template. |

Comment thread src/omen/cli/case.py Outdated
Comment on lines +218 to +221
if isinstance(llm_chain.get("steps"), list) and llm_chain.get("steps"):
    chain["steps"] = llm_chain.get("steps")
if isinstance(llm_chain.get("conclusions"), dict) and llm_chain.get("conclusions"):
    chain["conclusions"] = llm_chain.get("conclusions")

Copilot AI Apr 9, 2026

The LLM override block currently replaces chain["steps"] and chain["conclusions"] when present, but the preceding comment says the LLM path is for “intermediate reasoning detail only” and that the deterministic core chain remains local for replay stability. To keep deterministic replay stable (and avoid schema drift), either (a) restrict the merge to intermediate only, or (b) validate the LLM payload against the expected reason-chain schema (step ids/types + conclusions buckets) and only then allow replacing steps/conclusions.

Suggested change
-if isinstance(llm_chain.get("steps"), list) and llm_chain.get("steps"):
-    chain["steps"] = llm_chain.get("steps")
-if isinstance(llm_chain.get("conclusions"), dict) and llm_chain.get("conclusions"):
-    chain["conclusions"] = llm_chain.get("conclusions")

Copilot uses AI. Check for mistakes.
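
For option (b) in the comment above on the LLM override block, a gate in front of the merge could look roughly like the sketch below. The step-id pattern, the bucket names, and both function names are assumptions for illustration; the real schema lives in the PR's models and validators.

```python
import re
from typing import Any

# Assumed id format (step_<int> or step_<int>.<int>) and assumed bucket names.
STEP_ID_RE = re.compile(r"step_\d+(?:\.\d+)?")
CONCLUSION_BUCKETS = ("required", "warning", "blocking")


def llm_chain_is_mergeable(llm_chain: dict[str, Any]) -> bool:
    """Return True only if steps and conclusions look schema-conformant."""
    steps = llm_chain.get("steps")
    if not isinstance(steps, list) or not steps:
        return False
    for step in steps:
        if not isinstance(step, dict):
            return False
        if not STEP_ID_RE.fullmatch(str(step.get("step_id") or "")):
            return False
    conclusions = llm_chain.get("conclusions")
    if not isinstance(conclusions, dict):
        return False
    return all(isinstance(conclusions.get(name), list) for name in CONCLUSION_BUCKETS)


def merge_llm_chain(chain: dict[str, Any], llm_chain: dict[str, Any]) -> dict[str, Any]:
    """Always take intermediate detail; replace the core only after validation."""
    merged = dict(chain)
    if "intermediate" in llm_chain:
        merged["intermediate"] = llm_chain["intermediate"]
    if llm_chain_is_mergeable(llm_chain):
        merged["steps"] = llm_chain["steps"]
        merged["conclusions"] = llm_chain["conclusions"]
    return merged
```

With this shape, a malformed LLM payload degrades to the deterministic local chain instead of overwriting it, which keeps replay output stable.
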
Comment thread src/omen/cli/case.py
Comment on lines +207 to +214
llm_payload = try_generate_scenario_reason_chain_via_llm(
    scenario_json=(planned_scenarios or {}).get(scenario_key) or {},
    actor_profile_json={"actor_profile_ref": actor_profile_ref},
    planning_query_json={},
    situation_markdown="",
    config_path=config_path,
    debug_output_path=debug_output_path,
    scenario_key=scenario_key,

Copilot AI Apr 9, 2026

try_generate_scenario_reason_chain_via_llm is currently called with scenario_json=(planned_scenarios or {}).get(scenario_key) or {}; when planned_scenarios is missing or does not contain the key, the LLM receives {} even though scene (the deterministic scenario_ontology used to build the local chain) is available. Pass scene (or the same payload used for build_scenario_reason_chain) so the LLM enrichment has the required scenario context.
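
The fallback described above can be sketched as a small selection helper. `pick_scenario_payload` is a hypothetical name; only `planned_scenarios`, `scenario_key`, and `scene` come from the review comment.

```python
from collections.abc import Mapping
from typing import Any, Optional


def pick_scenario_payload(
    planned_scenarios: Optional[Mapping[str, Any]],
    scenario_key: str,
    scene: Mapping[str, Any],
) -> dict[str, Any]:
    """Prefer the planned scenario entry; otherwise reuse the deterministic scene."""
    planned = (planned_scenarios or {}).get(scenario_key)
    if isinstance(planned, Mapping) and planned:
        return dict(planned)
    # Fall back to the payload the local chain was built from, never to {}.
    return dict(scene)
```

The call site would then pass `scenario_json=pick_scenario_payload(planned_scenarios, scenario_key, scene)`, so the LLM always receives the same scenario context as the deterministic builder.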

Comment thread src/omen/cli/case.py
Comment on lines +242 to +246
for result in scenario_results:
    key = str(result.get("scenario_key") or "")
    reason_chain = chain_by_key.get(key, {})
    result["evidence_refs"] = build_linked_evidence_refs(reason_chain)

Copilot AI Apr 9, 2026

result["evidence_refs"] is now set to the output of build_linked_evidence_refs, which is a list of dict records. This conflicts with the existing DeterministicScenarioResult.evidence_refs: list[str] model (src/omen/types.py) and also means confidence_level is computed earlier using an empty evidence_refs list and never recomputed after linkage. Update the deterministic run schema (and any validation) to reflect the new evidence_refs structure, and recompute confidence/missing-evidence reasons after populating evidence_refs.
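
A rough sketch of the two fixes requested above: widen the evidence_refs type to records, and recompute the dependent fields once linkage has run. `ScenarioResult` is a hypothetical stand-in for DeterministicScenarioResult, and the `step_id` key, the `missing_evidence_reasons` field name, and the confidence thresholds are all assumptions, not the project's actual rules.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ScenarioResult:
    """Hypothetical stand-in for DeterministicScenarioResult."""

    scenario_key: str
    # Records instead of plain strings, matching the new linkage output.
    evidence_refs: list[dict[str, Any]] = field(default_factory=list)
    confidence_level: str = "low"
    missing_evidence_reasons: list[str] = field(default_factory=list)


def attach_evidence(result: ScenarioResult, refs: list[dict[str, Any]]) -> ScenarioResult:
    """Store linked refs, then recompute the fields that depend on them."""
    result.evidence_refs = [ref for ref in refs if isinstance(ref, dict)]
    linked = [ref for ref in result.evidence_refs if ref.get("step_id")]
    # Assumed thresholds, for illustration only.
    if len(linked) >= 2:
        result.confidence_level = "high"
    elif linked:
        result.confidence_level = "medium"
    else:
        result.confidence_level = "low"
    result.missing_evidence_reasons = (
        [] if linked else ["no evidence linked to reason steps"]
    )
    return result
```

Doing the rescoring inside the same function that assigns evidence_refs makes it hard to update one without the other.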

Comment on lines +23 to +24
def is_hierarchical_step_id(step_id: str) -> bool:
    return bool(str(step_id or "").strip())

Copilot AI Apr 9, 2026

is_hierarchical_step_id currently returns true for any non-empty string, so it does not actually validate a hierarchical step id format despite its name (and despite build_hierarchical_step_id producing step_<major>.<minor>). Tighten this to a concrete allowed pattern (e.g., step_<int> / step_<int>.<int> and any explicit special-case ids you support) or rename the function to match the behavior.
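
A stricter check along the lines suggested above might look like this. It assumes the only valid forms are step_<int> and step_<int>.<int>; any special-case ids the project supports would need to be added explicitly.

```python
import re

# Assumed grammar: "step_" + major, optionally "." + minor (e.g. step_2, step_2.1).
_STEP_ID_RE = re.compile(r"step_[0-9]+(?:\.[0-9]+)?")


def is_hierarchical_step_id(step_id: object) -> bool:
    """Return True only for ids matching the assumed hierarchical pattern."""
    if not isinstance(step_id, str):
        return False
    return _STEP_ID_RE.fullmatch(step_id.strip()) is not None
```

Using `fullmatch` rather than `match` rejects trailing junk such as a third level or stray suffix.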

def validate_reason_chain_step_ids(steps: list[dict[str, Any]]) -> bool:
    if not isinstance(steps, list) or not steps:
        return False
    return all(is_hierarchical_step_id(str(item.get("step_id") or "")) for item in steps)

Copilot AI Apr 9, 2026

validate_reason_chain_step_ids assumes every element in steps is a dict and will raise AttributeError on non-dict items (item.get(...)). If this function is meant to validate external/LLM-provided steps, it should defensively handle non-dict entries (e.g., isinstance(item, dict) checks) and return False instead of raising.

Suggested change
-    return all(is_hierarchical_step_id(str(item.get("step_id") or "")) for item in steps)
+    for item in steps:
+        if not isinstance(item, dict):
+            return False
+        if not is_hierarchical_step_id(str(item.get("step_id") or "")):
+            return False
+    return True

Comment on lines +329 to +336
conclusions = reason_chain.get("conclusions") or {}
for item in list(conclusions.get("blocking") or []):
    if not isinstance(item, dict):
        raise ValueError("blocking conclusion must be an object")
    if not list(item.get("activation_step_ids") or []):
        raise ValueError("blocking conclusion missing activation_step_ids")
    if not list(item.get("reason_step_ids") or []):
        raise ValueError("blocking conclusion missing reason_step_ids")

Copilot AI Apr 9, 2026

For blocking conclusions, the validator only checks that activation_step_ids / reason_step_ids are non-empty lists, but it will accept lists containing empty strings (or whitespace). Validate that each list contains at least one non-empty, stripped id string (and ideally that referenced ids exist in reason_chain.steps).
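
One way to tighten the validator as described above is to strip ids before counting them and to cross-check them against the chain's own steps. The helper names below are hypothetical, and the sketch assumes each step carries its id under `step_id`.

```python
from typing import Any


def _clean_ids(raw: Any) -> list[str]:
    """Keep only non-empty, stripped string ids."""
    if not isinstance(raw, list):
        return []
    return [item.strip() for item in raw if isinstance(item, str) and item.strip()]


def validate_blocking_links(reason_chain: dict[str, Any]) -> None:
    """Raise ValueError if a blocking conclusion has empty or dangling step refs."""
    known_ids = {
        str(step.get("step_id")).strip()
        for step in reason_chain.get("steps") or []
        if isinstance(step, dict) and step.get("step_id")
    }
    conclusions = reason_chain.get("conclusions") or {}
    for item in list(conclusions.get("blocking") or []):
        if not isinstance(item, dict):
            raise ValueError("blocking conclusion must be an object")
        for field_name in ("activation_step_ids", "reason_step_ids"):
            ids = _clean_ids(item.get(field_name))
            if not ids:
                raise ValueError(f"blocking conclusion missing {field_name}")
            unknown = [step_id for step_id in ids if step_id not in known_ids]
            if unknown:
                raise ValueError(f"{field_name} references unknown steps: {unknown}")
```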

Comment thread config/prompts/base.yaml
Comment on lines +733 to +741
  "strategic_freedom": {
    "required": [],
    "warning": [],
    "blocking": []
  },
  "gap_summary": {
    "dimensions": [],
    "blocking": []
  }

Copilot AI Apr 9, 2026

The prompt’s final reason_chain.conclusions structure is defined as {strategic_freedom: {required/warning/blocking}, gap_summary: ...}, but the runtime code that consumes reason chains (e.g., evidence linkage/view model) expects conclusions to have top-level required/warning/blocking lists of claim objects with text + step id refs. Align the prompt output schema with what the code consumes, or update the merge/consumption logic accordingly; otherwise enabling --config will likely break evidence linkage.
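
If the consumption side is adjusted rather than the prompt, a small normalizer could accept both shapes. This is a sketch under stated assumptions: `normalize_conclusions` is a hypothetical name, and claim objects are assumed to carry `text` and `step_ids` as described in the comment above.

```python
from typing import Any

BUCKETS = ("required", "warning", "blocking")


def normalize_conclusions(conclusions: dict[str, Any]) -> dict[str, list[dict[str, Any]]]:
    """Flatten a nested strategic_freedom block into top-level claim lists."""
    source = conclusions.get("strategic_freedom")
    if not isinstance(source, dict):
        # Already in the top-level shape (or empty).
        source = conclusions
    normalized: dict[str, list[dict[str, Any]]] = {}
    for bucket in BUCKETS:
        raw = source.get(bucket)
        claims: list[dict[str, Any]] = []
        for claim in raw if isinstance(raw, list) else []:
            if isinstance(claim, str):
                # Bare strings become claim objects with no step refs.
                claims.append({"text": claim, "step_ids": []})
            elif isinstance(claim, dict):
                claims.append(
                    {
                        "text": str(claim.get("text") or ""),
                        "step_ids": list(claim.get("step_ids") or []),
                    }
                )
        normalized[bucket] = claims
    return normalized
```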

Suggested change
-  "strategic_freedom": {
-    "required": [],
-    "warning": [],
-    "blocking": []
-  },
-  "gap_summary": {
-    "dimensions": [],
-    "blocking": []
-  }
+  "required": [
+    {
+      "text": "",
+      "step_ids": []
+    }
+  ],
+  "warning": [
+    {
+      "text": "",
+      "step_ids": []
+    }
+  ],
+  "blocking": [
+    {
+      "text": "",
+      "step_ids": []
+    }
+  ]
+},
+"gap_summary": {
+  "dimensions": [],
+  "blocking": []
Comment on lines +5 to +21
  assert blocking_has_activation_links(
    {
      "text": "Insufficient migration confidence blocks move",
      "activation_step_ids": ["step_2.1"],
      "reason_step_ids": ["step_5.1"],
    }
  )


def test_blocking_linkage_rejects_missing_ref_group() -> None:
  assert not blocking_has_activation_links(
    {
      "text": "Missing activation path",
      "activation_step_ids": [],
      "reason_step_ids": ["step_5.1"],
    }
  )

Copilot AI Apr 9, 2026

This new test file uses 2-space indentation inside functions (e.g., the assert blocks), which will violate Ruff/pycodestyle indentation rules and is inconsistent with existing tests. Reformat to 4-space indentation (or run the repo formatter) to keep linting green.

Suggested change (the same lines, re-indented to 4 spaces)
    assert blocking_has_activation_links(
        {
            "text": "Insufficient migration confidence blocks move",
            "activation_step_ids": ["step_2.1"],
            "reason_step_ids": ["step_5.1"],
        }
    )


def test_blocking_linkage_rejects_missing_ref_group() -> None:
    assert not blocking_has_activation_links(
        {
            "text": "Missing activation path",
            "activation_step_ids": [],
            "reason_step_ids": ["step_5.1"],
        }
    )

Comment on lines +9 to +77
  payload = {
    "pack_id": "strategic_actor_nokia_v1",
    "pack_version": "1.0.0",
    "derived_from_situation_id": "nokia-elop-2010",
    "ontology_version": "scenario_ontology_v1",
    "planning_query_ref": "traces/planning_query.json",
    "prior_snapshot_ref": "traces/prior_snapshot.json",
    "scenarios": [
      {
        "scenario_key": "A",
        "title": "A",
        "goal": "gA",
        "target": "tA",
        "objective": "oA",
        "variables": [{"name": "x", "type": "categorical"}],
        "constraints": ["cA"],
        "tradeoff_pressure": ["tA"],
        "resistance_assumptions": {
          "structural_conflict": 0.8,
          "resource_reallocation_drag": 0.7,
          "cultural_misalignment": 0.6,
          "veto_node_intensity": 0.7,
          "aggregate_resistance": 0.7,
          "assumption_rationale": ["rA"],
        },
        "modeling_notes": ["nA"],
      },
      {
        "scenario_key": "B",
        "title": "B",
        "goal": "gB",
        "target": "tB",
        "objective": "oB",
        "variables": [{"name": "x", "type": "categorical"}],
        "constraints": ["cB"],
        "tradeoff_pressure": ["tB"],
        "resistance_assumptions": {
          "structural_conflict": 0.5,
          "resource_reallocation_drag": 0.5,
          "cultural_misalignment": 0.5,
          "veto_node_intensity": 0.4,
          "aggregate_resistance": 0.475,
          "assumption_rationale": ["rB"],
        },
        "modeling_notes": ["nB"],
      },
      {
        "scenario_key": "C",
        "title": "C",
        "goal": "gC",
        "target": "tC",
        "objective": "oC",
        "variables": [{"name": "x", "type": "categorical"}],
        "constraints": ["cC"],
        "tradeoff_pressure": ["tC"],
        "resistance_assumptions": {
          "structural_conflict": 0.4,
          "resource_reallocation_drag": 0.4,
          "cultural_misalignment": 0.5,
          "veto_node_intensity": 0.3,
          "aggregate_resistance": 0.4,
          "assumption_rationale": ["rC"],
        },
        "modeling_notes": ["nC"],
      },
    ],
  }
  path.parent.mkdir(parents=True, exist_ok=True)
  path.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")

Copilot AI Apr 9, 2026

This new integration test uses 2-space indentation throughout, which is inconsistent with the rest of the test suite and likely to fail Ruff indentation checks. Please reformat to standard 4-space indentation.

Suggested change (the same lines, re-indented to 4 spaces)
    payload = {
        "pack_id": "strategic_actor_nokia_v1",
        "pack_version": "1.0.0",
        "derived_from_situation_id": "nokia-elop-2010",
        "ontology_version": "scenario_ontology_v1",
        "planning_query_ref": "traces/planning_query.json",
        "prior_snapshot_ref": "traces/prior_snapshot.json",
        "scenarios": [
            {
                "scenario_key": "A",
                "title": "A",
                "goal": "gA",
                "target": "tA",
                "objective": "oA",
                "variables": [{"name": "x", "type": "categorical"}],
                "constraints": ["cA"],
                "tradeoff_pressure": ["tA"],
                "resistance_assumptions": {
                    "structural_conflict": 0.8,
                    "resource_reallocation_drag": 0.7,
                    "cultural_misalignment": 0.6,
                    "veto_node_intensity": 0.7,
                    "aggregate_resistance": 0.7,
                    "assumption_rationale": ["rA"],
                },
                "modeling_notes": ["nA"],
            },
            {
                "scenario_key": "B",
                "title": "B",
                "goal": "gB",
                "target": "tB",
                "objective": "oB",
                "variables": [{"name": "x", "type": "categorical"}],
                "constraints": ["cB"],
                "tradeoff_pressure": ["tB"],
                "resistance_assumptions": {
                    "structural_conflict": 0.5,
                    "resource_reallocation_drag": 0.5,
                    "cultural_misalignment": 0.5,
                    "veto_node_intensity": 0.4,
                    "aggregate_resistance": 0.475,
                    "assumption_rationale": ["rB"],
                },
                "modeling_notes": ["nB"],
            },
            {
                "scenario_key": "C",
                "title": "C",
                "goal": "gC",
                "target": "tC",
                "objective": "oC",
                "variables": [{"name": "x", "type": "categorical"}],
                "constraints": ["cC"],
                "tradeoff_pressure": ["tC"],
                "resistance_assumptions": {
                    "structural_conflict": 0.4,
                    "resource_reallocation_drag": 0.4,
                    "cultural_misalignment": 0.5,
                    "veto_node_intensity": 0.3,
                    "aggregate_resistance": 0.4,
                    "assumption_rationale": ["rC"],
                },
                "modeling_notes": ["nC"],
            },
        ],
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")

Comment on lines +5 to +41
  payload = build_actor_derivation_artifact(
    run_id="det-123456",
    actor_profile_ref="actor_profile_v1",
    scenario_pack_ref="strategic_actor_nokia_v1",
    scenario_derivations=[
      {
        "scenario_key": "A",
        "actor_derivation": {"decision_style": "offense_breakthrough"},
        "selected_dimensions": {"selected_dimension_keys": ["ecosystem_control"]},
        "strategic_freedom_score": 0.73,
      },
      {
        "scenario_key": "B",
        "actor_derivation": {"decision_style": "defense_resilience"},
        "selected_dimensions": {"selected_dimension_keys": ["execution_velocity"]},
        "strategic_freedom_score": 0.51,
      },
      {
        "scenario_key": "C",
        "actor_derivation": {"decision_style": "confrontation_competition"},
        "selected_dimensions": {"selected_dimension_keys": ["execution_velocity"]},
        "strategic_freedom_score": 0.49,
      },
    ],
  )

  assert payload["artifact_type"] == "actor_derivation"
  assert payload["version"] == "actor_derivation_v1"
  assert payload["run_id"] == "det-123456"
  assert payload["actor_profile_ref"] == "actor_profile_v1"
  assert payload["scenario_pack_ref"] == "strategic_actor_nokia_v1"

  rows = list(payload.get("scenario_derivations") or [])
  assert [row.get("scenario_key") for row in rows] == ["A", "B", "C"]
  assert all("actor_derivation" in row for row in rows)
  assert all("selected_dimensions" in row for row in rows)
  assert all(isinstance(row.get("strategic_freedom_score"), float) for row in rows)

Copilot AI Apr 9, 2026

This new contract test uses 2-space indentation throughout, which is inconsistent with the rest of the test suite and likely to fail Ruff indentation checks. Please reformat to standard 4-space indentation.

Suggested change (the same lines, re-indented to 4 spaces)
    payload = build_actor_derivation_artifact(
        run_id="det-123456",
        actor_profile_ref="actor_profile_v1",
        scenario_pack_ref="strategic_actor_nokia_v1",
        scenario_derivations=[
            {
                "scenario_key": "A",
                "actor_derivation": {"decision_style": "offense_breakthrough"},
                "selected_dimensions": {"selected_dimension_keys": ["ecosystem_control"]},
                "strategic_freedom_score": 0.73,
            },
            {
                "scenario_key": "B",
                "actor_derivation": {"decision_style": "defense_resilience"},
                "selected_dimensions": {"selected_dimension_keys": ["execution_velocity"]},
                "strategic_freedom_score": 0.51,
            },
            {
                "scenario_key": "C",
                "actor_derivation": {"decision_style": "confrontation_competition"},
                "selected_dimensions": {"selected_dimension_keys": ["execution_velocity"]},
                "strategic_freedom_score": 0.49,
            },
        ],
    )

    assert payload["artifact_type"] == "actor_derivation"
    assert payload["version"] == "actor_derivation_v1"
    assert payload["run_id"] == "det-123456"
    assert payload["actor_profile_ref"] == "actor_profile_v1"
    assert payload["scenario_pack_ref"] == "strategic_actor_nokia_v1"

    rows = list(payload.get("scenario_derivations") or [])
    assert [row.get("scenario_key") for row in rows] == ["A", "B", "C"]
    assert all("actor_derivation" in row for row in rows)
    assert all("selected_dimensions" in row for row in rows)
    assert all(isinstance(row.get("strategic_freedom_score"), float) for row in rows)

@codecov

codecov Bot commented Apr 9, 2026

@RCFans RCFans merged commit 6be99b0 into main Apr 9, 2026
9 checks passed
@RCFans RCFans deleted the release/reason-chain branch April 9, 2026 12:48