Skip to content

Rectify: Formatter Raw/Derived Field Duplication#502

Merged
Trecek merged 10 commits intointegrationfrom
bug-duplicate-ingredient-row-in-recipe-table-due-to-redundan/499
Mar 25, 2026
Merged

Rectify: Formatter Raw/Derived Field Duplication#502
Trecek merged 10 commits intointegrationfrom
bug-duplicate-ingredient-row-in-recipe-table-due-to-redundan/499

Conversation

@Trecek
Copy link
Copy Markdown
Collaborator

@Trecek Trecek commented Mar 25, 2026

Summary

LoadRecipeResult carries both content (raw YAML text including the ingredients: block) and ingredients_table (a GFM table derived from those same ingredients). The PostToolUse formatter (_fmt_recipe_body()) rendered both unconditionally, causing duplicate ingredient display. This PR installs the missing structural signal: a _LOAD_RECIPE_DERIVED_FROM mapping that makes the derivation relationship machine-readable, augments the field-coverage test to validate it, and drives _fmt_recipe_body() to automatically strip derived-field content from source fields. It also adds a pipeline-internal-not-hidden semantic rule that enforces hidden: true on ingredients whose descriptions reveal automation-only use, and updates bundled recipe YAMLs accordingly.

Architecture Impact

Data Lineage Diagram

%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart LR
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;

    subgraph Source ["Data Origin"]
        YAML["recipe.yaml<br/>━━━━━━━━━━<br/>ingredients: block<br/>steps:<br/>kitchen_rules:<br/>(hidden: true markers)"]
    end

    subgraph Producer ["● Producer: recipe/_api.py"]
        PARSE["load_and_validate()<br/>━━━━━━━━━━<br/>parse + validate<br/>find_recipe_by_name()"]
        RECIPE_OBJ["Recipe dataclass<br/>━━━━━━━━━━<br/>typed ingredient list"]
        MERGE["● _merge_sub_recipe()<br/>━━━━━━━━━━<br/>skips hidden=True<br/>sub-recipe ingredients"]
        BUILD["build_ingredient_rows()<br/>━━━━━━━━━━<br/>filters hidden=True<br/>sorts by priority"]
        RAWCONTENT["content: str<br/>━━━━━━━━━━<br/>full raw YAML text<br/>(source form)"]
        INGTABLE["ingredients_table: str<br/>━━━━━━━━━━<br/>GFM markdown table<br/>(derived display form)"]
    end

    subgraph Boundary ["● TypedDict Boundary"]
        LOADRESULT["● LoadRecipeResult<br/>━━━━━━━━━━<br/>content + ingredients_table<br/>+ suggestions + valid"]
    end

    subgraph Rule ["★ Semantic Rule: rules_inputs.py"]
        PINH["★ pipeline-internal-not-hidden<br/>━━━━━━━━━━<br/>warns when ingredient lacks<br/>hidden: true but description<br/>signals automation-only use"]
        SUGGESTIONS["suggestions: list<br/>━━━━━━━━━━<br/>WARNING findings<br/>flow into LoadRecipeResult"]
    end

    subgraph Contract ["★ Derivation Contract: pretty_output.py"]
        DERIVEDMAP["★ ● _LOAD_RECIPE_DERIVED_FROM<br/>━━━━━━━━━━<br/>{'ingredients_table': 'content'}<br/>machine-readable source map"]
        STRIPFN["★ _strip_yaml_ingredients_block()<br/>━━━━━━━━━━<br/>removes ingredients: YAML block<br/>when derived table present"]
    end

    subgraph Formatter ["● Formatter: _fmt_recipe_body()"]
        FMTBODY["● _fmt_recipe_body()<br/>━━━━━━━━━━<br/>consults derivation map<br/>strips source block<br/>when derived field present"]
        RECIPESEC["--- RECIPE ---<br/>━━━━━━━━━━<br/>YAML minus ingredients: block<br/>(steps, kitchen_rules only)"]
        TABLESEC["--- INGREDIENTS TABLE ---<br/>━━━━━━━━━━<br/>sole display of ingredients<br/>sorted + filtered"]
    end

    AGENT(["LLM Agent<br/>━━━━━━━━━━<br/>sees each ingredient once"])

    YAML -->|"read_text()"| PARSE
    PARSE -->|"raw text"| RAWCONTENT
    PARSE -->|"_parse_recipe()"| RECIPE_OBJ
    RECIPE_OBJ -->|"sub-recipe merge"| MERGE
    MERGE -->|"hidden filtered"| BUILD
    BUILD -->|"format_ingredients_table()"| INGTABLE
    RAWCONTENT --> LOADRESULT
    INGTABLE --> LOADRESULT
    RECIPE_OBJ -->|"validate ingredients"| PINH
    PINH -->|"RuleFinding WARNING"| SUGGESTIONS
    SUGGESTIONS --> LOADRESULT
    LOADRESULT -->|"PostToolUse hook stdin"| FMTBODY
    DERIVEDMAP -->|"consulted at render"| FMTBODY
    FMTBODY -->|"strip content"| STRIPFN
    STRIPFN -->|"stripped YAML"| RECIPESEC
    FMTBODY -->|"pass through"| TABLESEC
    RECIPESEC --> AGENT
    TABLESEC --> AGENT

    class YAML cli;
    class PARSE,BUILD handler;
    class RECIPE_OBJ,LOADRESULT stateNode;
    class RAWCONTENT,INGTABLE stateNode;
    class MERGE handler;
    class PINH detector;
    class SUGGESTIONS phase;
    class DERIVEDMAP,STRIPFN newComponent;
    class FMTBODY handler;
    class RECIPESEC,TABLESEC output;
    class AGENT terminal;
Loading

Process Flow Diagram

%%{init: {'flowchart': {'nodeSpacing': 40, 'rankSpacing': 50, 'curve': 'basis'}}}%%
flowchart TB
    classDef terminal fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;

    START(["PostToolUse Hook<br/>━━━━━━━━━━<br/>stdin: JSON event"])

    subgraph Dispatch ["Hook Dispatch"]
        RESOLVE["_resolve_payload()<br/>━━━━━━━━━━<br/>unwrap MCP envelope<br/>→ DictPayload"]
        ROUTE{"tool_name?<br/>━━━━━━━━━━<br/>short name match"}
        FMTLR["_fmt_load_recipe()<br/>━━━━━━━━━━<br/>wraps _fmt_recipe_body()"]
    end

    subgraph DerivCheck ["● Derivation-Aware Content Render"]
        GETFIELDS["get content, ingredients_table<br/>━━━━━━━━━━<br/>from data dict"]
        LOOPDERIV{"★ for each entry in<br/>_LOAD_RECIPE_DERIVED_FROM<br/>━━━━━━━━━━<br/>source_field == 'content'?"}
        HASDERIVED{"★ data.get(derived_field)<br/>━━━━━━━━━━<br/>derived value present?"}
        CALLSTRIP["★ _strip_yaml_ingredients_block()<br/>━━━━━━━━━━<br/>strip source block"]
        DISPLAYCONTENT["display_content<br/>━━━━━━━━━━<br/>original or stripped YAML"]
    end

    subgraph StripSM ["★ _strip_yaml_ingredients_block() — Line State Machine"]
        ITERLINES["iterate YAML lines<br/>━━━━━━━━━━<br/>splitlines(keepends=True)"]
        CHKINGKEY{"line starts with<br/>'ingredients:'?"}
        SETFLAG["in_ingredients = True<br/>━━━━━━━━━━<br/>skip this line"]
        CHKINDENT{"in_ingredients AND<br/>line non-empty AND<br/>not line[0].isspace()?"}
        CLEARFLAG["in_ingredients = False<br/>━━━━━━━━━━<br/>next top-level key"]
        KEEPLINE["append line to result"]
        SKIPLINE["skip line<br/>━━━━━━━━━━<br/>indented child of ingredients:"]
        JOINRESULT["''.join(result)<br/>━━━━━━━━━━<br/>return stripped YAML"]
    end

    subgraph RenderOut ["Output Render"]
        APPENDRECIPE["append --- RECIPE --- block<br/>━━━━━━━━━━<br/>display_content (stripped)"]
        HASINGTABLE{"ing_table present?"}
        APPENDTABLE["append --- INGREDIENTS TABLE ---<br/>━━━━━━━━━━<br/>GFM table pass-through"]
        HASERRORS{"errors in suggestions?"}
        APPENDERRORS["append N finding(s)<br/>━━━━━━━━━━<br/>error/warning count"]
    end

    subgraph RuleFlow ["★ pipeline-internal-not-hidden Rule (validation time)"]
        ITERINGR["iterate recipe.ingredients<br/>━━━━━━━━━━<br/>ValidationContext"]
        CHKHIDDEN{"ing.hidden == True?"}
        CHKPATTERN{"★ _PIPELINE_INTERNAL_PATTERNS<br/>━━━━━━━━━━<br/>.search(desc) matches?"}
        EMITFINDING["emit RuleFinding<br/>━━━━━━━━━━<br/>severity=WARNING<br/>rule=pipeline-internal-not-hidden"]
        SKIPINGR["skip (already correct)"]
    end

    HOOKOUT(["Hook stdout<br/>━━━━━━━━━━<br/>updatedMCPToolOutput<br/>Markdown-KV string"])

    START --> RESOLVE
    RESOLVE --> ROUTE
    ROUTE -->|"load_recipe / open_kitchen"| FMTLR
    ROUTE -->|"other tools"| HOOKOUT

    FMTLR --> GETFIELDS
    GETFIELDS --> LOOPDERIV
    LOOPDERIV -->|"source is 'content'"| HASDERIVED
    LOOPDERIV -->|"no more entries"| DISPLAYCONTENT
    HASDERIVED -->|"YES: table present"| CALLSTRIP
    HASDERIVED -->|"NO: table absent"| DISPLAYCONTENT
    CALLSTRIP --> JOINRESULT
    JOINRESULT --> DISPLAYCONTENT

    CALLSTRIP --> ITERLINES
    ITERLINES --> CHKINGKEY
    CHKINGKEY -->|"YES"| SETFLAG
    CHKINGKEY -->|"NO"| CHKINDENT
    SETFLAG --> ITERLINES
    CHKINDENT -->|"YES: top-level key"| CLEARFLAG
    CHKINDENT -->|"NO: still indented"| SKIPLINE
    CLEARFLAG --> KEEPLINE
    SKIPLINE --> ITERLINES
    KEEPLINE --> ITERLINES

    DISPLAYCONTENT --> APPENDRECIPE
    APPENDRECIPE --> HASINGTABLE
    HASINGTABLE -->|"YES"| APPENDTABLE
    HASINGTABLE -->|"NO"| HASERRORS
    APPENDTABLE --> HASERRORS
    HASERRORS -->|"YES"| APPENDERRORS
    HASERRORS -->|"NO"| HOOKOUT
    APPENDERRORS --> HOOKOUT

    ITERINGR --> CHKHIDDEN
    CHKHIDDEN -->|"YES"| SKIPINGR
    CHKHIDDEN -->|"NO"| CHKPATTERN
    CHKPATTERN -->|"MATCH"| EMITFINDING
    CHKPATTERN -->|"NO MATCH"| SKIPINGR
    EMITFINDING --> ITERINGR
    SKIPINGR --> ITERINGR

    class START,HOOKOUT terminal;
    class RESOLVE,FMTLR,GETFIELDS,APPENDRECIPE,APPENDTABLE,APPENDERRORS handler;
    class ROUTE,HASINGTABLE,HASERRORS stateNode;
    class LOOPDERIV,HASDERIVED,CALLSTRIP,DISPLAYCONTENT newComponent;
    class ITERLINES,CHKINGKEY,SETFLAG,CHKINDENT,CLEARFLAG,KEEPLINE,SKIPLINE,JOINRESULT newComponent;
    class ITERINGR,CHKPATTERN,EMITFINDING detector;
    class CHKHIDDEN,SKIPINGR phase;
Loading

Closes #499

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/remediation-20260324-221121-429358/.autoskillit/temp/rectify/rectify_duplicate-ingredient-row_2026-03-24_222100_part_a.md

🤖 Generated with Claude Code via AutoSkillit

Trecek and others added 7 commits March 24, 2026 22:41
Adds _LOAD_RECIPE_DERIVED_FROM derivation map, _strip_yaml_ingredients_block()
helper, and updates _fmt_recipe_body() to consult the map — preventing duplicate
ingredient display when both content and ingredients_table are present in a
load_recipe or open_kitchen response.

Adds REALISTIC_RECIPE_YAML fixture constant and nine new tests covering the
deduplication behaviour, the strip helper edge cases, the derivation map coverage
contract, and the open_kitchen tool path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds three test files as the CI gate for Part B:
- test_rules_pipeline_internal.py: unit tests for the new semantic rule
- test_bundled_recipe_hidden_policy.py: parameterized tests asserting no
  pipeline-internal ingredients in bundled recipes lack hidden: true
- test_merge_sub_recipe_hidden.py: tests that _merge_sub_recipe() correctly
  skips hidden sub-recipe ingredients per its documented contract

These tests are expected to fail until the rule and YAML fixes are applied.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a new WARNING-severity semantic rule that detects recipe ingredients
whose description signals pipeline-internal use (patterns: 'Set to', 'Set by',
'Set when', 'Used by', 'Passed by', 'already claimed', etc.) but that are
missing hidden: true. This makes the visibility policy self-documenting in
the validation system and fires during any recipe authoring workflow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds hidden: true to all pipeline-internal ingredients across four bundled
recipes. These ingredients are set programmatically by upstream automation
(process-issues, implementation-groups dispatcher, batch orchestrator) and
should not appear in the agent's interactive ingredients table.

Affected ingredients per recipe:
- implementation: upfront_claimed, run_mode, defer_cleanup, registry_path
- remediation: upfront_claimed, run_mode, defer_cleanup, registry_path
- implementation-groups: upfront_claimed, defer_cleanup, registry_path
- merge-prs: defer_cleanup, registry_path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The function's docstring stated it would merge 'non-hidden' sub-recipe
ingredients, but the code had no such filter. Adding the hidden guard
makes the implementation match the documented contract.

This has no behavioral impact on current bundled recipes (the sprint-prefix
sub-recipe has no hidden ingredients), but prevents a future correctness
gap if a sub-recipe introduces hidden ingredients intended to be internal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator Author

@Trecek Trecek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AutoSkillit PR Review — Verdict: changes_requested

Comment thread tests/infra/test_pretty_output.py
Comment thread tests/infra/test_pretty_output.py
Comment thread tests/infra/test_pretty_output.py
"""
recipe_path = pkg_root() / "recipes" / f"{recipe_name}.yaml"
recipe = load_recipe(recipe_path)
all_findings = run_semantic_rules(recipe)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] tests: run_semantic_rules(recipe) runs all rules; if pipeline-internal-not-hidden were silently deregistered, violations == [] still passes vacuously. The test_pipeline_internal_not_hidden_rule_is_registered test in test_rules_pipeline_internal.py provides the needed guard, but consider whether this parameterized test should also assert the rule is in the registry to be self-contained.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid observation — flagged for design decision. The dedicated guard test_pipeline_internal_not_hidden_rule_is_registered in test_rules_pipeline_internal.py already guards against silent deregistration. Whether this parameterized test should also inline that guard is a test-coupling philosophy question (single-responsibility vs. self-containment) that requires a human decision on the codebase's testing conventions.

Comment thread src/autoskillit/hooks/pretty_output.py Outdated
Comment thread src/autoskillit/recipe/rules_inputs.py Outdated
}
)

# Maps derived-display field name → source field name in LoadRecipeResult.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[warning] slop: The 8-line "HOW TO USE" block comment above _LOAD_RECIPE_DERIVED_FROM is developer-documentation-level prose better placed in a module docstring or CONTRIBUTING guide. The existing augmented-field-coverage test already enforces the contract mechanically; the comment over-explains what the test makes self-enforcing.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Investigated — this is intentional. The 'HOW TO USE' block provides authoring guidance: it tells future developers when to add an entry to the map and what decision criterion to apply. The coverage test (test_fmt_load_recipe_derivation_map_coverage) only validates consistency of existing declarations against _FMT_LOAD_RECIPE_RENDERED — it does not enforce the decision logic for adding new entries. The comment and the test serve distinct purposes; the comment is not redundant.

Trecek and others added 3 commits March 24, 2026 23:41
… test_pretty_output

Added `assert "--- INGREDIENTS TABLE" in result` before each
`result.split("--- INGREDIENTS TABLE")` call in the deduplication
tests. Without the guard, a silent split (separator absent) would
yield the full string as `recipe_section`, causing the deduplication
assertions to pass vacuously.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_INTERNAL_PATTERN

Aligns with established codebase naming convention: all other single
compiled re.Pattern constants use the singular form (_LINKED_ISSUE_PATTERN,
_FM_PATTERN, _OUTCOME_PATTERN, _FAILED_STEP_PATTERN). The plural name was
misleading since the constant holds a single re.Pattern object.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_DERIVED_FROM

The loop over the map hard-coded `if source_field == "content"`,
making the generic dict abstraction misleading — any future entry
with a non-content source_field would be silently ignored.

Renamed to `_LOAD_RECIPE_CONTENT_DERIVED_FROM` to restrict the map
to content-only entries and documented the restriction. The loop now
iterates keys only, removing the redundant `source_field == "content"`
guard that was a symptom of the misleading abstraction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant