
feat(extract): distinguish anchor_value_blank from anchor_not_found (P1-4)#29

Open: dev360 wants to merge 1 commit into main from feat/p1-4-anchor-value-blank

dev360 commented May 21, 2026

Summary

Closes P1-4. An anchored nullable: true field used to be completely silent when its label was present on the sheet but the adjacent value cell was empty. In the report, that was indistinguishable from the case where the label itself was missing, even though the two call for very different operator responses (fix the file vs. update the template).

This PR splits the cases via a new informational code and a new context key.

Behavior

  • Label not found at all → anchor_not_found (existing code), now tagged with ctx.label_was: "absent".
  • Label found, value cell empty, nullable: true field → new anchor_value_blank informational code, tagged with ctx.label_was: "present". It fires even though nullable: true would otherwise have suppressed the error, because the operator asked to be notified of the slot.
  • Label found, value empty, non-nullable field → existing missing_required code, additionally tagged with ctx.label_was: "present" so consumers can distinguish "value required and missing" from "label and value both missing" without inferring it from co-occurring errors.
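
The three cases above can be sketched as a small decision function. This is a minimal illustration, not the code in this PR: the function name classify_anchor, its parameters, and the shape of RowExtractError beyond its label_was field are assumptions made for the example. Only the error codes and the label_was values come from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowExtractError:
    # Hypothetical shape: the PR only confirms the class name and `label_was`.
    code: str
    field_name: str
    label_was: Optional[str] = None


def classify_anchor(
    field_name: str,
    label_found: bool,
    value: Optional[str],
    nullable: bool,
) -> Optional[RowExtractError]:
    """Return the error for an anchored field, or None when a value was read."""
    if not label_found:
        # The label itself is missing from the sheet.
        return RowExtractError("anchor_not_found", field_name, "absent")
    if value is None or value.strip() == "":
        # The label is there but the adjacent cell is empty.
        code = "anchor_value_blank" if nullable else "missing_required"
        return RowExtractError(code, field_name, "present")
    return None


# Fictitious field name and values, for illustration only.
blank = classify_anchor("example_total", label_found=True, value="", nullable=True)
print(blank.code, blank.label_was)
```

Treating whitespace-only cells as empty is a choice made for this sketch; the PR does not say how the extractor defines an empty cell.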

RowExtractError carries a new label_was field that the validator lifts into Error.ctx.
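
A rough sketch of what lifting label_was into Error.ctx could look like, together with a consumer that branches on it. The names lift and suggested_action, the field layout of both classes, and the mapping from label_was to an operator response are assumptions for illustration. The mapping follows the order in the summary ("fix the file vs. update the template") but is not confirmed by the PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class RowExtractError:
    # Hypothetical shape: only the class name and `label_was` come from the PR.
    code: str
    field_name: str
    label_was: Optional[str] = None


@dataclass
class Error:
    # Hypothetical shape: the PR only confirms that Error has a `ctx` mapping.
    code: str
    ctx: Dict[str, Any] = field(default_factory=dict)


def lift(err: RowExtractError) -> Error:
    """Copy the extractor-level error into a report-level Error."""
    ctx: Dict[str, Any] = {"field": err.field_name}
    if err.label_was is not None:
        ctx["label_was"] = err.label_was
    return Error(code=err.code, ctx=ctx)


def suggested_action(error: Error) -> str:
    """Assumed triage: a blank value points at the file, a missing label at the template."""
    label_was = error.ctx.get("label_was")
    if label_was == "present":
        return "fix the file"
    if label_was == "absent":
        return "update the template"
    return "inspect manually"


# Fictitious field name, for illustration only.
report_error = lift(RowExtractError("missing_required", "example_total", "present"))
print(report_error.ctx, suggested_action(report_error))
```

With label_was in ctx, a consumer can make this decision from a single error rather than correlating missing_required with a co-occurring anchor_not_found.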

Test plan

  • uv run pytest tests/test_field_scan_gaps.py -q — P1-4 graduates.
  • uv run pytest -q — 105 passed, 36 xfailed.
  • uvx --from 'ruff==0.6.9' ruff check . and ruff format --check . — clean.

All fixtures and copy use only fictitious values per CLAUDE.md.

🤖 Generated with Claude Code

An anchored ``nullable: true`` field used to be completely silent when
its label was present on the sheet but the adjacent value cell was
empty — operationally indistinguishable from the case where the label
itself was missing. The two cases require different operator
responses (fix the file vs. update the template), and the report had
no way to tell them apart.

Anchored extraction now distinguishes the two:

- Label not found at all → ``anchor_not_found`` (existing code), now
  tagged with ``ctx.label_was: "absent"``.
- Label found, adjacent value cell empty, field ``nullable: true`` →
  new ``anchor_value_blank`` informational code with
  ``ctx.label_was: "present"``. Fires regardless of whether
  ``nullable: true`` would otherwise have suppressed the error,
  precisely because the operator asked to be notified of the slot.
- Label found, value empty, field non-nullable → existing
  ``missing_required`` code, additionally tagged with
  ``ctx.label_was: "present"`` so consumers can distinguish "value
  required and missing" from "label and value both missing" without
  inferring it from co-occurring errors.

``RowExtractError`` carries a new ``label_was`` field that the
validator lifts into ``Error.ctx``. README error taxonomy and the
validator's row-message map are updated with the new code.
Graduates the P1-4 xfail test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dev360 force-pushed the feat/p1-4-anchor-value-blank branch from 550b04c to fa04b45 on May 22, 2026 13:44