Skip to content

feat(t2): Phase 2 legal_representatives extraction for CZ#137

Merged
petterlindstrom79 merged 2 commits into
mainfrom
feat/phase-2-extraction-cz
May 18, 2026
Merged

feat(t2): Phase 2 legal_representatives extraction for CZ#137
petterlindstrom79 merged 2 commits into
mainfrom
feat/phase-2-extraction-cz

Conversation

@petterlindstrom79
Copy link
Copy Markdown
Member

Summary

Adds legal_representatives[] extraction to cz-company-data.ts via the ARES Veřejný rejstřík (VR view). Flips tier_2_available: true on records with active statutory body members. Cost-free (free public ARES endpoint, no auth).

Sibling of #136 (NO). Together: +2 binding-ready T2 in this Phase 2 sprint.

Audit phase

  1. Files read: cz-company-data.ts, italian-company-stakeholders.ts + norwegian-company-data.ts (sibling extraction patterns). Live probe against ARES ekonomicke-subjekty-vr/00177041.
  2. Current state: Handler called only the BE (basic entity) view which exposes company-level fields but no person records. Set tier_2_available: false with a follow-up-task reason — false claim on a record where the field IS extractable.
  3. Proposed change: Parallel-fetch BE + VR views. Map zaznamy[primary].statutarniOrgany[].clenoveOrganu[] to canonical legal_representatives[] shape. Extract zpusobJednani (manner of acting) as signing_authority. Flip tier_2_available based on populated representatives.
  4. Implications: Additive — existing fields preserved. VR fetch is opportunistic (404 → empty array, no hard fail) since sole traders / associations / foreign branches lack a public-register record.
  5. Worse than proposed: No — currently false-claims tier_2 unavailable.

Scope decisions

  • Includes: Statutární orgán (představenstvo / jednatel) members + Prokura (registered procurators).
  • Excludes: ostatniOrgany (other bodies — supervisory boards / dozorčí rada). Per Czech commercial law, supervisory bodies do not legally represent the company.
  • Filter: Drops historic register rows (datumVymazu set) and ended memberships (clenstvi.zanikClenstvi set).

Smoke test

npx tsx apps/api/scripts/capture-tier-fixtures.ts --slug cz-company-data:

  • Škoda Auto a.s. (00177041): 7 active representatives extracted (6 představenstvo members + 1 prokura). Fixture written with legal_representatives redacted by the PII scrubber.

Important: cross-PR coupling on PII scrubber

apps/api/scripts/capture-tier-fixtures.ts adds legal_representatives to PII_ARRAY_FIELDS — the same edit is in #136. Either PR landing first satisfies the dependency; the second commit no-ops the addition. First capture before this fix accidentally produced real-PII fixture (7 Skoda board members with full DOBs + nationality). Caught + deleted pre-commit; the fix is now baked in here.

CI gates

  • tsc --noEmit: clean for changed files (one pre-existing unrelated error in italian-company-stakeholders.ts:54).
  • check-tier-coverage.mjs: 11 findings, 0 not in allowlist. CZ added to tier-coverage-allowlist.txt for the pre-existing alias-key drift from PR feat(evidence-tier): labeling sweep across 31 company-data handlers #131 (legal_name / primary_registration_id / etc. not in manifest output_schema — separate cleanup).
  • check-fetch-timeout-coverage.mjs: clean (fetchVrByIco uses AbortSignal.timeout(10000)).
  • check-manifest-guaranteed-consistency.mjs: clean.

Files

  • apps/api/src/capabilities/cz-company-data.ts — adds VR-view types + fetchVrByIco + shapeRepresentatives; wires parallel BE+VR fetch.
  • manifests/cz-company-data.yaml — declares new fields (legal_representatives, total_legal_representatives, signing_authority, tier_2_available, tier_2_available_reason) in schema + reliability.
  • apps/api/scripts/capture-tier-fixtures.ts — adds legal_representatives to PII_ARRAY_FIELDS (same edit as feat(t2): Phase 2 legal_representatives extraction for NO #136).
  • apps/api/scripts/tier-coverage-allowlist.txt — allowlists CZ for pre-existing alias-key drift.
  • apps/api/tests/fixtures/tier-coverage/cz-company-data.json — re-captured with new fields.
  • apps/api/coverage-matrix/cz-company-data__cz__company-registry.yaml — bumps tier_2_coverage: 4/5 → 5/5.

Test plan

  • Live smoke against Škoda Auto via capture-tier-fixtures — 7 reps extracted
  • Typecheck (clean for changed files)
  • Tier-coverage gate (0 new findings)
  • Fetch-timeout coverage gate (clean)
  • Manifest-guaranteed-consistency gate (clean)
  • PII verified redacted in committed fixture

🤖 Generated with Claude Code

petterlindstrom79 and others added 2 commits May 18, 2026 16:44
Adds extraction of legal_representatives[] from ARES Veřejný rejstřík
(VR view) to cz-company-data.ts. Flips tier_2_available: true on records
with active statutory body members. Cost-free (free public ARES endpoint).

Canonical shape: { type, name, role, role_code, role_group, date_of_birth,
nationality, start_date } — extends the NO/IT-stakeholders shape with
nationality (statniObcanstvi) and start_date (vznikFunkce/vznikClenstvi).
Also surfaces signing_authority (způsob jednání) from the same record.

Scope:
- Includes Statutární orgán members (představenstvo / jednatel) + Prokura
- Excludes supervisory bodies (ostatniOrgany, e.g. dozorčí rada) — not
  legal representatives by Czech commercial law
- Filters out historic entries (datumVymazu present) and ended memberships
  (zanikClenstvi present)
- VR view fetch is opportunistic (404 → empty array, no failure) since
  sole traders / associations / foreign branches lack a VR record

Live verified:
- Škoda Auto a.s. (00177041): 7 active representatives extracted (6 board
  members + 1 procurist); fixture written with PII scrubbed

Also adds legal_representatives to capture-tier-fixtures PII_ARRAY_FIELDS
so future captures across all handlers stay PII-redacted (parallel of the
same edit on feat/phase-2-extraction-no — both branches need it).

Lifts binding-ready T2 by +1. Phase 2 of cost-free coverage sprint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Picks up the 2026-05-18 last_verified shift on cz-company-data from
the previous commit. CI gate `coverage-matrix:check` requires
COVERAGE.md to be in sync with the per-row YAML files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant