audit(data): container provenance + UN/LOCODE anchor verification #36
Merged
Provenance-only update across two datasets. Per the discipline rule "no
field changes without primary-source backing" and a sandbox HTTPS allowlist
that did not include UNECE / Maersk / ISO / Wikipedia on this audit pass,
NO field values were mutated; every record is flagged verified: false until
a future audit pass with HTTPS access can re-fetch and diff.
Containers (lib/calculations/container-capacity.ts):
- Added isoCode field on all 10 sea-container records (22G1, 42G1, 45G1,
L5G1, 22R1, 45R1, 22U1, 42U1, 22P1, 42P1).
- Added per-record provenance { sources, auditedAt, verified,
decisionRationale } following the ULD / airline-codes pattern.
- Surfaced new fields on /api/containers (snake_case at the boundary).
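The camelCase-in-code, snake_case-at-the-boundary convention described above could look roughly like this. It is a minimal sketch, not the code in lib/calculations/container-capacity.ts: only `isoCode` and the four provenance field names come from this PR; the `slug` field, the shape of a source item, the nesting under `provenance`, the `toApiContainer` helper, and the example values are assumptions for illustration.

```typescript
// Hypothetical source item shape; the PR does not spell out its fields.
interface ProvenanceSource {
  title: string;
  url: string;
}

// Field names taken from the PR text: sources, auditedAt, verified, decisionRationale.
interface Provenance {
  sources: ProvenanceSource[];
  auditedAt: string;
  verified: boolean;
  decisionRationale: string;
}

// `slug` is an assumed identifier field; `isoCode` is the field added by this PR.
interface ContainerRecord {
  slug: string;
  isoCode: string;
  provenance: Provenance;
}

// Hypothetical boundary mapper: internal camelCase becomes snake_case in the API payload.
function toApiContainer(record: ContainerRecord) {
  return {
    slug: record.slug,
    iso_code: record.isoCode,
    provenance: {
      sources: record.provenance.sources,
      audited_at: record.provenance.auditedAt,
      verified: record.provenance.verified,
      decision_rationale: record.provenance.decisionRationale,
    },
  };
}

// Illustrative record only; the source entry is a placeholder, not a real citation.
const exampleContainer: ContainerRecord = {
  slug: "20ft-standard",
  isoCode: "22G1",
  provenance: {
    sources: [{ title: "placeholder source", url: "https://example.invalid/placeholder" }],
    auditedAt: "2026-05-15",
    verified: false,
    decisionRationale: "Provenance-only pass; values not re-fetched.",
  },
};

const apiContainer = toApiContainer(exampleContainer);
```

Keeping the mapper in one place means the internal type can stay idiomatic TypeScript while the casing lint on the route files only has to inspect the response shape.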
UN/LOCODE (lib/data/unlocode-anchor-provenance.json — new file):
- 30 LHR-weighted anchor codes (GBLHR, GBMAN, GBLGW, GBSTN, DEFRA, NLAMS,
FRCDG, BEBRU, USJFK, USLAX, USORD, SGSIN, HKHKG, JPNRT, KRICN, CNPVG,
AEDXB, QADOH, TRIST, ETADD, KENBO, ZAJNB, BRGRU, MXMEX, CAYYZ, AUSYD,
INDEL, INBOM, MYKUL, THBKK).
- Per-record sources (UNECE country page + Wikipedia + IATA finder where
applicable), audited_at, verified, decision_rationale.
- Merged at lookupByCode in lib/calculations/unlocode.ts so the API surfaces
provenance only on anchor records (other 116K records unchanged).
- 11MB main file untouched.
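The merge-at-lookup approach keeps the large main file untouched and attaches provenance only when a code appears in the anchor file. A minimal sketch of that idea follows, assuming both datasets are already loaded into maps keyed by code; the record shapes, the in-memory maps, the placeholder names, and the non-anchor code `XXZZZ` are hypothetical, and this is not the actual body of `lookupByCode` in lib/calculations/unlocode.ts.

```typescript
// Provenance block as stored in the anchor file (snake_case, per the PR text).
interface AnchorProvenance {
  sources: string[]; // assumed to be a list of source labels or URLs
  audited_at: string;
  verified: boolean;
  decision_rationale: string;
}

// Assumed minimal shape of a main-file record.
interface LocodeRecord {
  code: string;
  name: string;
  provenance?: AnchorProvenance;
}

// Stand-in for the 11MB main file; names are placeholders, not UN/LOCODE data.
const mainRecords = new Map<string, LocodeRecord>([
  ["GBLHR", { code: "GBLHR", name: "placeholder name A" }],
  ["XXZZZ", { code: "XXZZZ", name: "placeholder name B" }],
]);

// Stand-in for unlocode-anchor-provenance.json; only anchor codes appear here.
const anchorProvenance = new Map<string, AnchorProvenance>([
  [
    "GBLHR",
    {
      sources: ["placeholder source"],
      audited_at: "2026-05-15",
      verified: false,
      decision_rationale: "Provenance-only pass; values not re-fetched.",
    },
  ],
]);

// Sketch of the lookup: non-anchor records are returned as-is, with no provenance key.
function lookupByCode(code: string): LocodeRecord | undefined {
  const key = code.trim().toUpperCase();
  const base = mainRecords.get(key);
  if (base === undefined) return undefined;
  const provenance = anchorProvenance.get(key);
  return provenance === undefined ? base : { ...base, provenance };
}
```

Because the merged object is built with a spread, the stand-in main-file record is never mutated, which matches the "main file untouched" constraint in spirit.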
Audit docs:
- docs/audit/containers-unlocode-verification-2026-05-15.md — anchor map,
sandbox-constraint disclosure, "how to flip verified: true" runbook.
- docs/audit/containers-unlocode-completeness-2026-05-15.md — per-field
populated counts: 100% on engineering containers; UN/LOCODE main file
has 100% name + status, 99.2% functions, 87.9% subdivision, 80% coords,
10.9% ascii_name, 0.6% IATA (sparse-by-design fields explained).
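Per-field populated percentages like the ones quoted above can be computed with a short pass over the records. The following is a minimal sketch, assuming a field counts as populated when it is neither `null`, `undefined`, nor an empty string; the `populatedPercent` helper and the sample rows are hypothetical and do not reproduce how the completeness report was generated.

```typescript
type Row = Record<string, unknown>;

// Share of rows (as a percentage, one decimal place) where `field` is populated.
function populatedPercent(rows: Row[], field: string): number {
  if (rows.length === 0) return 0;
  const populated = rows.filter((row) => {
    const value = row[field];
    return value !== null && value !== undefined && value !== "";
  }).length;
  return Math.round((populated / rows.length) * 1000) / 10;
}

// Tiny illustrative dataset; values are placeholders, not UN/LOCODE data.
const sampleRows: Row[] = [
  { name: "A", iata: "AAA", coords: "x" },
  { name: "B", iata: "", coords: "y" },
  { name: "C", iata: null, coords: "z" },
  { name: "D" },
];

const completeness = {
  name: populatedPercent(sampleRows, "name"),
  iata: populatedPercent(sampleRows, "iata"),
  coords: populatedPercent(sampleRows, "coords"),
};
```

Whether empty strings should count as populated is a reporting decision; a sparse-by-design field such as IATA will score low under either definition.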
CHANGELOG + lib/changelog-data.ts updated. tsc clean. next build green.
/changelog renders the new May 15 Data Update entry alongside the
existing May 15 Bug Fix entry.
This was referenced May 15, 2026
SoapyRED added a commit that referenced this pull request on May 16, 2026:
* audit(data): vehicles + customs duty provenance

PART A: Vehicles

All 18 records in lib/data/vehicles-ref.json carry per-record `provenance` (≥2 sources each: EU Directive 96/53/EC, UK gov.uk weights & licence categories, 49 CFR Part 393 / FMCSA 393.5 for US trailers, plus manufacturer published specs from Schmitz Cargobull, Krone, Faymonville, DAF, Mercedes-Benz, VW, Ford, Wabash National) plus `auditedAt: "2026-05-16"`, `verified: false`, and a record-specific `decisionRationale` citing the binding regulation and the manufacturer match.

lib/calculations/vehicle-ref.ts gained a `VehicleProvenance` interface (snake_case `accessed_at` on source items, camelCase `auditedAt` / `decisionRationale` matching the container-capacity.ts pattern from PR #36).

app/api/vehicles/route.ts now surfaces the provenance block snake-cased: `provenance.{sources, audited_at, verified, decision_rationale}`. Additive; no breaking change for existing callers.

PART B: Customs duty

lib/calculations/duty.ts gained a top-of-file methodology docstring + a structured `DUTY_METHODOLOGY` constant: the four-step formula (CIF → duty → VAT → totals), Trade Tariff measure-type resolution (103 third-country, 142 preferential, 305 VAT, 695 anti-dumping, 277 restrictions), 8 HMRC source URLs, `audited_at: "2026-05-16"`, `verified: false`, and decision rationale.

app/api/duty/route.ts gained a GET handler:

GET /api/duty?methodology=true → DUTY_METHODOLOGY

Any other GET shape returns 400 with the api-docs link. POST is unchanged.

lib/data/duty-sample-fixture.json anchors 10 high-volume HS headings (0901 coffee, 2204 wine, 6110 jerseys, 8471 computers, 8703 motor cars, 3004 medicaments, 7113 jewellery, 4011 tyres, 9504 game consoles, 0207 poultry) to expected description substrings + common origins + canonical Trade Tariff URLs. Data only this sprint; smoke assertions are a follow-up chore.

DISCIPLINE

Sandbox HTTPS allowlist did not include eur-lex.europa.eu, gov.uk, trade-tariff.service.gov.uk, the manufacturer domains, ecfr.gov, fmcsa.dot.gov, unece.org, or en.wikipedia.org on this audit pass. Per the rule "no field changes without primary-source backing", NO field values were mutated. Every record + the methodology carry `verified: false` until a future audit pass with HTTPS access can re-fetch and diff.

DOCS

- docs/audit/vehicles-customs-completeness-2026-05-16.md: per-field gap report (18 vehicles, methodology, fixture).
- docs/audit/vehicles-customs-verification-2026-05-16.md: what was verified, what was deferred, runbook for flipping verified: false → verified: true once HTTPS access is restored.

FAULT 5: CHANGELOG.md + lib/changelog-data.ts entries added for 2026-05-16. No new public page, no sitemap entry needed, no MCP surface affected. Existing /api/vehicles + /api/duty routes still audit-wrapped (lint:audit confirms). lint:api-casing confirms 38 route files clean, snake_case throughout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(audit): correct vehicle record count - 17 not 18

Miscounted the source dataset; the live API meta.total flagged it (returned 17). Updates four files:

- docs/audit/vehicles-customs-completeness-2026-05-16.md: total count, per-category breakdown (9 artic / 5 rigid / 3 van), EU/US split (15 / 2), eur-lex citation count (10 records, not 8).
- docs/audit/vehicles-customs-verification-2026-05-16.md: "all 17 records" + the blocked-domains table count.
- CHANGELOG.md and lib/changelog-data.ts entries: "17" not "18".

No code change; no field-value change; provenance count unchanged (17/17 = 100% as already documented).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: SoapyRED <soapyred@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
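The four-step formula named in PART B (CIF → duty → VAT → totals) can be sketched as follows. This is a minimal sketch under stated assumptions: CIF is taken as goods value plus freight plus insurance, duty is charged on CIF, and VAT is charged on CIF plus duty. The function name, field names, rounding rule, and example rates are hypothetical and are not taken from lib/calculations/duty.ts or `DUTY_METHODOLOGY`.

```typescript
interface DutyInput {
  goodsValue: number; // value of the goods
  freight: number;    // transport cost to the border
  insurance: number;  // insurance cost
  dutyRate: number;   // e.g. 0.1 for 10% (illustrative, not a tariff lookup)
  vatRate: number;    // e.g. 0.2 for 20% (illustrative)
}

interface DutyBreakdown {
  cif: number;
  duty: number;
  vat: number;
  totalTaxes: number;
  landedCost: number;
}

// Round to two decimal places to avoid floating-point residue in currency values.
const roundToPence = (amount: number): number => Math.round(amount * 100) / 100;

function estimateDuty(input: DutyInput): DutyBreakdown {
  // Step 1: CIF (cost + insurance + freight), assumed here to be the customs value.
  const cif = roundToPence(input.goodsValue + input.freight + input.insurance);
  // Step 2: duty on the customs value.
  const duty = roundToPence(cif * input.dutyRate);
  // Step 3: VAT on customs value plus duty.
  const vat = roundToPence((cif + duty) * input.vatRate);
  // Step 4: totals.
  const totalTaxes = roundToPence(duty + vat);
  const landedCost = roundToPence(cif + totalTaxes);
  return { cif, duty, vat, totalTaxes, landedCost };
}

const dutyExample = estimateDuty({
  goodsValue: 1000,
  freight: 150,
  insurance: 50,
  dutyRate: 0.1,
  vatRate: 0.2,
});
```

In the real calculator the rates would come from Trade Tariff measure resolution rather than being passed in by hand; the sketch only fixes the order of the arithmetic.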
Summary

Sprint containers-and-unlocode-data-integrity-C: provenance-only update across two datasets. Per the discipline rule "no field changes without primary-source backing" and a sandbox HTTPS allowlist that did not include UNECE / Maersk / ISO / Wikipedia on this audit pass, NO field values were mutated; every record is flagged `verified: false` until a future audit pass with HTTPS access can re-fetch and diff.

What's in

Containers (lib/calculations/container-capacity.ts)
- `isoCode` field added to all 10 sea-container records: 22G1, 42G1, 45G1, L5G1, 22R1, 45R1, 22U1, 42U1, 22P1, 42P1.
- `provenance: { sources, auditedAt, verified, decisionRationale }` on all 10, following the ULD / airline-codes pattern.
- New fields surfaced on `/api/containers` (snake_case at the boundary).
- `20ft-flat-rack` MGW 30 480 kg in-data vs sprint anchor table 34 000 kg. Maersk publishes both for ISO 22P1; existing 30 480 preserved this pass.

UN/LOCODE (lib/data/unlocode-anchor-provenance.json, new file)
- Per-record `sources` (UNECE country page + Wikipedia + IATA finder where applicable), `audited_at`, `verified`, `decision_rationale`.
- Merged at `lookupByCode` in lib/calculations/unlocode.ts so the API surfaces provenance only on anchor codes; the other 116 099 records pass through unchanged.

Docs
- docs/audit/containers-unlocode-verification-2026-05-15.md: anchor map for both datasets, sandbox-constraint disclosure, runbook for flipping `verified: true` on a future fetch-enabled audit pass.
- docs/audit/containers-unlocode-completeness-2026-05-15.md: per-field populated counts. 100% on engineering containers. UN/LOCODE main file has 100% name + status, 99.2% functions, 87.9% subdivision, 80% coords, 10.9% ascii_name, 0.6% IATA (sparse-by-design fields explained).

Sandbox constraint disclosure

WebFetch returned 403 on every primary source attempted (ISO, UNECE, Wikipedia, Maersk, Hapag-Lloyd). Direct curl returned `Host not in allowlist`. User confirmed via `AskUserQuestion`: provenance-only update with `verified: false`, no value mutations, file-storage = separate anchor-provenance JSON with merge-on-API.

FAULT 5 checklist
- […] `iso_code`, `sources`, `audited_at`, `verified`, `decision_rationale` to the container + UN/LOCODE response schemas in a follow-up; this PR ships the runtime fields, openapi.json refresh deferred since no new endpoint
- […] `next build` static HTML; both May 15 entries present)
- `withAuditRest` on new API routes: N/A (existing routes only)
- `generateMetadata` on new public pages: N/A

Test plan
- `npx tsc --noEmit`: clean
- `npx next build`: green
- `/changelog` static HTML contains the new May 15 Data Update entry
- `node -e` […] `20ft-standard`, `40ft-high-cube`, `20ft-reefer`) + 3 UN/LOCODE codes (GBLHR, USJFK, SGSIN): confirm `iso_code` / `sources` / `audited_at` / `verified` / `decision_rationale` present
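The last test-plan item, confirming the new fields are present on sampled records, could be automated with a small checker. The sketch below assumes the four provenance fields are nested under a `provenance` key in the API payload (as described for /api/vehicles in the follow-up commit) and that `iso_code` sits at the top level of container records; the helper name and the sample payloads are hypothetical, and this is not the `node -e` command used in the PR.

```typescript
const PROVENANCE_KEYS = ["sources", "audited_at", "verified", "decision_rationale"] as const;

// Returns the list of expected fields that are absent from one API record.
function missingProvenanceFields(
  record: Record<string, unknown>,
  expectIsoCode: boolean,
): string[] {
  const missing: string[] = [];
  if (expectIsoCode && typeof record["iso_code"] !== "string") {
    missing.push("iso_code");
  }
  const provenance = record["provenance"];
  if (typeof provenance !== "object" || provenance === null) {
    // No provenance block at all: every provenance key is missing.
    return [...missing, ...PROVENANCE_KEYS.map((key) => `provenance.${key}`)];
  }
  const block = provenance as Record<string, unknown>;
  for (const key of PROVENANCE_KEYS) {
    if (!(key in block)) missing.push(`provenance.${key}`);
  }
  return missing;
}

// Illustrative payloads; a real check would read them from the API responses.
const completeContainer = {
  iso_code: "22G1",
  provenance: { sources: [], audited_at: "2026-05-15", verified: false, decision_rationale: "n/a" },
};
const incompleteLocode = {
  code: "GBLHR",
  provenance: { sources: [], audited_at: "2026-05-15", decision_rationale: "n/a" },
};

const containerGaps = missingProvenanceFields(completeContainer, true);
const locodeGaps = missingProvenanceFields(incompleteLocode, false);
```

An empty array means the record passes; anything else names exactly which field to chase.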
Generated by Claude Code