audit(data): container provenance + UN/LOCODE anchor verification #36

Merged
SoapyRED merged 1 commit into main from audit/containers-unlocode-provenance on May 15, 2026

@SoapyRED
Owner

Summary

Sprint containers-and-unlocode-data-integrity-C: a provenance-only update across two datasets. The discipline rule is "no field changes without primary-source backing", and the sandbox HTTPS allowlist did not include UNECE, Maersk, ISO, or Wikipedia on this audit pass, so no primary source could be fetched. As a result, NO field values were mutated; every record is flagged verified: false until a future audit pass with HTTPS access can re-fetch and diff.

What's in

Containers (lib/calculations/container-capacity.ts)

  • isoCode field added to all 10 sea-container records: 22G1, 42G1, 45G1, L5G1, 22R1, 45R1, 22U1, 42U1, 22P1, 42P1.
  • provenance: { sources, auditedAt, verified, decisionRationale } on all 10, following the ULD / airline-codes pattern.
  • New fields surfaced on /api/containers (snake_case at the boundary).
  • 9/10 records match the sprint anchor table within tolerance (±100–300 kg tare). One discrepancy is flagged in the audit doc: 20ft-flat-rack has a maximum gross weight (MGW) of 30 480 kg in the data versus 34 000 kg in the sprint anchor table. Maersk publishes both figures for ISO 22P1, so the existing 30 480 kg is preserved this pass.
  • Tank container 22T1 NOT added — platform doesn't currently support tank-loading calculations.
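
The record shape described in this list might look roughly like the sketch below. Only the field names (isoCode, provenance, sources, auditedAt, verified, decisionRationale and their snake_case counterparts) come from this PR. The `SourceRef` shape, the `slug` field, the `toApiContainer` helper, the nesting under `provenance`, and all sample values are assumptions made for illustration.

```typescript
// Hypothetical source entry; the real shape in container-capacity.ts may differ.
interface SourceRef {
  name: string;
  url: string;
}

// Field names below are the ones listed in this PR.
interface Provenance {
  sources: SourceRef[];
  auditedAt: string; // date of the audit pass, e.g. "2026-05-15"
  verified: boolean; // stays false until a source has been re-fetched and diffed
  decisionRationale: string;
}

interface ContainerRecord {
  slug: string; // hypothetical identifier field
  isoCode: string; // ISO 6346 size/type code
  provenance: Provenance;
}

// Map the internal camelCase record to a snake_case API payload.
// Nesting the block under `provenance` is an assumption.
function toApiContainer(record: ContainerRecord) {
  const { sources, auditedAt, verified, decisionRationale } = record.provenance;
  return {
    slug: record.slug,
    iso_code: record.isoCode,
    provenance: {
      sources,
      audited_at: auditedAt,
      verified,
      decision_rationale: decisionRationale,
    },
  };
}

// Illustrative record; the URL and rationale text are placeholders.
const sampleContainer: ContainerRecord = {
  slug: "20ft-standard",
  isoCode: "22G1",
  provenance: {
    sources: [{ name: "Example carrier spec sheet", url: "https://example.com/spec" }],
    auditedAt: "2026-05-15",
    verified: false,
    decisionRationale: "Existing values preserved; no primary source reachable this pass.",
  },
};

const apiContainer = toApiContainer(sampleContainer);
```

Keeping the conversion in one function means the internal records can stay camelCase while every route that returns them stays snake_case.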

UN/LOCODE (lib/data/unlocode-anchor-provenance.json — new file)

  • 30 LHR-weighted anchor codes (full set from the brief).
  • Per-record sources (UNECE country page + Wikipedia + IATA finder where applicable), audited_at, verified, decision_rationale.
  • Merged at lookupByCode in lib/calculations/unlocode.ts so the API surfaces provenance only on anchor codes — the other 116 099 records pass through unchanged.
  • 11MB main file untouched.
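
The merge-on-lookup approach described in this list can be sketched as follows. The function name `lookupByCode` and the provenance field names come from this PR; the `Map`-based storage, the record shape, and the sample entries are hypothetical stand-ins for the real 11MB file and the anchor JSON.

```typescript
interface LocodeRecord {
  code: string;
  name: string;
}

interface AnchorProvenance {
  sources: string[];
  audited_at: string;
  verified: boolean;
  decision_rationale: string;
}

// Stand-in for the main dataset, which is never modified.
const mainRecords = new Map<string, LocodeRecord>([
  ["GBLHR", { code: "GBLHR", name: "London Heathrow" }],
  ["NLRTM", { code: "NLRTM", name: "Rotterdam" }],
]);

// Stand-in for the separate anchor-provenance file (placeholder source URL).
const anchorProvenance = new Map<string, AnchorProvenance>([
  [
    "GBLHR",
    {
      sources: ["https://example.com/unece-gb"],
      audited_at: "2026-05-15",
      verified: false,
      decision_rationale: "Anchor code; values unchanged pending a fetch-enabled audit.",
    },
  ],
]);

// Look up a code and attach provenance only when the code is an anchor.
function lookupByCode(
  code: string
): (LocodeRecord & { provenance?: AnchorProvenance }) | undefined {
  const key = code.trim().toUpperCase();
  const record = mainRecords.get(key);
  if (!record) return undefined;
  const provenance = anchorProvenance.get(key);
  return provenance ? { ...record, provenance } : record;
}
```

Because non-anchor records are returned as-is, callers that never look at `provenance` see no change in the payload.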

Docs

  • docs/audit/containers-unlocode-verification-2026-05-15.md — anchor map for both datasets, sandbox-constraint disclosure, runbook for flipping verified: true on a future fetch-enabled audit pass.
  • docs/audit/containers-unlocode-completeness-2026-05-15.md — per-field populated counts. 100% on engineering containers. UN/LOCODE main file has 100% name + status, 99.2% functions, 87.9% subdivision, 80% coords, 10.9% ascii_name, 0.6% IATA (sparse-by-design fields explained).
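
Per-field populated counts of the kind reported in the completeness doc could be computed with something like the sketch below. The helper names, the definition of "populated" (non-null, non-empty string, non-empty array), and the sample rows are assumptions; the audit doc may count differently.

```typescript
type Row = Record<string, unknown>;

// Treat null, undefined, blank strings and empty arrays as unpopulated.
function isPopulated(value: unknown): boolean {
  if (value === null || value === undefined) return false;
  if (typeof value === "string") return value.trim() !== "";
  if (Array.isArray(value)) return value.length > 0;
  return true;
}

// Return the percentage of rows (one decimal place) that populate each field.
function fieldCompleteness(rows: Row[], fields: string[]): Record<string, number> {
  const result: Record<string, number> = {};
  for (const field of fields) {
    const populated = rows.filter((row) => isPopulated(row[field])).length;
    result[field] =
      rows.length === 0 ? 0 : Math.round((populated / rows.length) * 1000) / 10;
  }
  return result;
}

// Illustrative rows only; not taken from the real dataset.
const sampleRows: Row[] = [
  { name: "A", coords: "5128N 00027W", iata: "LHR" },
  { name: "B", coords: "5155N 00429E", iata: "" },
  { name: "C", coords: "", iata: null },
  { name: "D", coords: "0116N 10350E" },
];

const completeness = fieldCompleteness(sampleRows, ["name", "coords", "iata"]);
// completeness is { name: 100, coords: 75, iata: 25 }
```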

Sandbox constraint disclosure

WebFetch returned 403 on every primary source attempted (ISO, UNECE, Wikipedia, Maersk, Hapag-Lloyd). Direct curl returned Host not in allowlist. User confirmed via AskUserQuestion: provenance-only update with verified: false, no value mutations, file-storage = separate anchor-provenance JSON with merge-on-API.

FAULT 5 checklist

  • siteStats.ts — N/A (no displayed-number change)
  • app/sitemap.ts — N/A (no new URLs)
  • public/openapi.json — DEFERRED. This PR ships the runtime fields only; a follow-up should add iso_code, sources, audited_at, verified, decision_rationale to the container + UN/LOCODE response schemas. The refresh was deferred because no new endpoint is introduced.
  • /api-docs page — same as above
  • nav dropdown — N/A
  • homepage tool grid — N/A
  • CHANGELOG.md — YES (May 15 Data Update entry)
  • lib/changelog-data.ts — YES (mirrored, newest-first)
  • /changelog page renders new entry — YES (verified in next build static HTML; both May 15 entries present)
  • MCP server tool registration — N/A (no new tool)
  • footer — N/A
  • GitHub README in freightutils-mcp — N/A
  • npm package version — N/A
  • Postman collection — N/A
  • 200-word page minimum — N/A
  • withAuditRest on new API routes — N/A (existing routes only)
  • generateMetadata on new public pages — N/A
  • IndexNow ping — N/A (no new URLs)

Test plan

  • npx tsc --noEmit — clean
  • npx next build — green
  • /changelog static HTML contains the new May 15 Data Update entry
  • Anchor file readable + 30 records validated via node -e
  • Vercel preview build (auto)
  • Smoke test against preview
  • Post-merge prod-curl: 3 container types (e.g. 20ft-standard, 40ft-high-cube, 20ft-reefer) + 3 UN/LOCODE codes (GBLHR, USJFK, SGSIN) — confirm iso_code / sources / audited_at / verified / decision_rationale present
  • Sentry-quiet 10-path 5xx sweep
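
The post-merge field check in this plan could be automated with a small helper like the one below. It is a sketch only: `collectKeys` and `missingKeys` are hypothetical names, the sample payload is invented, and the search is deliberately depth-agnostic because this PR does not state whether the fields sit at the top level or inside a nested block.

```typescript
// Collect every object key that appears anywhere in a JSON-like value.
function collectKeys(value: unknown, found: Set<string> = new Set<string>()): Set<string> {
  if (Array.isArray(value)) {
    for (const item of value) collectKeys(item, found);
  } else if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    for (const key of Object.keys(obj)) {
      found.add(key);
      collectKeys(obj[key], found);
    }
  }
  return found;
}

// Report which expected keys never appear in the payload.
function missingKeys(payload: unknown, expected: string[]): string[] {
  const found = collectKeys(payload);
  return expected.filter((key) => !found.has(key));
}

// Keys named in the test plan.
const EXPECTED_CONTAINER_KEYS = [
  "iso_code",
  "sources",
  "audited_at",
  "verified",
  "decision_rationale",
];

// Invented payload standing in for a parsed API response.
const samplePayload = {
  data: [
    {
      slug: "20ft-standard",
      iso_code: "22G1",
      provenance: {
        sources: [],
        audited_at: "2026-05-15",
        verified: false,
        decision_rationale: "placeholder",
      },
    },
  ],
};

const stillMissing = missingKeys(samplePayload, EXPECTED_CONTAINER_KEYS);
// stillMissing is [] for the sample payload
```

In a real smoke test the payload would come from the parsed JSON of each production response, and a non-empty result would fail the check.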

🤖 Generated with Claude Code



Provenance-only update across two datasets. Per the discipline rule "no
field changes without primary-source backing" and a sandbox HTTPS allowlist
that did not include UNECE / Maersk / ISO / Wikipedia on this audit pass,
NO field values were mutated; every record is flagged verified: false until
a future audit pass with HTTPS access can re-fetch and diff.

Containers (lib/calculations/container-capacity.ts):
- Added isoCode field on all 10 sea-container records (22G1, 42G1, 45G1,
  L5G1, 22R1, 45R1, 22U1, 42U1, 22P1, 42P1).
- Added per-record provenance { sources, auditedAt, verified,
  decisionRationale } following the ULD / airline-codes pattern.
- Surfaced new fields on /api/containers (snake_case at the boundary).

UN/LOCODE (lib/data/unlocode-anchor-provenance.json — new file):
- 30 LHR-weighted anchor codes (GBLHR, GBMAN, GBLGW, GBSTN, DEFRA, NLAMS,
  FRCDG, BEBRU, USJFK, USLAX, USORD, SGSIN, HKHKG, JPNRT, KRICN, CNPVG,
  AEDXB, QADOH, TRIST, ETADD, KENBO, ZAJNB, BRGRU, MXMEX, CAYYZ, AUSYD,
  INDEL, INBOM, MYKUL, THBKK).
- Per-record sources (UNECE country page + Wikipedia + IATA finder where
  applicable), audited_at, verified, decision_rationale.
- Merged at lookupByCode in lib/calculations/unlocode.ts so the API surfaces
  provenance only on anchor records (other 116K records unchanged).
- 11MB main file untouched.

Audit docs:
- docs/audit/containers-unlocode-verification-2026-05-15.md — anchor map,
  sandbox-constraint disclosure, "how to flip verified: true" runbook.
- docs/audit/containers-unlocode-completeness-2026-05-15.md — per-field
  populated counts: 100% on engineering containers; UN/LOCODE main file
  has 100% name + status, 99.2% functions, 87.9% subdivision, 80% coords,
  10.9% ascii_name, 0.6% IATA (sparse-by-design fields explained).

CHANGELOG + lib/changelog-data.ts updated. tsc clean. next build green.
/changelog renders the new May 15 Data Update entry alongside the
existing May 15 Bug Fix entry.
@vercel

vercel Bot commented May 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| freighttools | Ready | Preview, Comment | May 15, 2026 6:55pm |

@SoapyRED SoapyRED marked this pull request as ready for review May 15, 2026 18:57
@SoapyRED SoapyRED merged commit d7c1751 into main May 15, 2026
2 checks passed
SoapyRED added a commit that referenced this pull request May 16, 2026
* audit(data): vehicles + customs duty provenance

PART A — Vehicles

All 18 records in lib/data/vehicles-ref.json carry per-record
`provenance` (≥2 sources each: EU Directive 96/53/EC, UK gov.uk
weights & licence categories, 49 CFR Part 393 / FMCSA 393.5 for US
trailers, plus manufacturer published specs from Schmitz Cargobull,
Krone, Faymonville, DAF, Mercedes-Benz, VW, Ford, Wabash National)
plus `auditedAt: "2026-05-16"`, `verified: false`, and a record-
specific `decisionRationale` citing the binding regulation and the
manufacturer match.

lib/calculations/vehicle-ref.ts gained a `VehicleProvenance`
interface (snake_case `accessed_at` on source items, camelCase
`auditedAt` / `decisionRationale` matching the container-capacity.ts
pattern from PR #36).

app/api/vehicles/route.ts now surfaces the provenance block snake-
cased: `provenance.{sources, audited_at, verified, decision_rationale}`.
Additive — no breaking change for existing callers.

PART B — Customs duty

lib/calculations/duty.ts gained a top-of-file methodology docstring +
a structured `DUTY_METHODOLOGY` constant: the four-step formula
(CIF → duty → VAT → totals), Trade Tariff measure-type resolution
(103 third-country, 142 preferential, 305 VAT, 695 anti-dumping,
277 restrictions), 8 HMRC source URLs, `audited_at: "2026-05-16"`,
`verified: false`, and decision rationale.
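
A worked example of the four-step sequence (CIF → duty → VAT → totals) is sketched below. It assumes an ad valorem duty charged on the CIF value and import VAT charged on CIF plus duty; the actual DUTY_METHODOLOGY constant may include further charges or rounding rules. The function name, field names, and rates are illustrative, not real tariff data.

```typescript
interface DutyInput {
  goodsValue: number; // cost of the goods
  insurance: number;
  freight: number;
  dutyRate: number; // fraction, e.g. 0.10 for 10%
  vatRate: number; // fraction, e.g. 0.20 for 20%
}

interface DutyBreakdown {
  cif: number;
  duty: number;
  vat: number;
  totalTaxes: number;
  landedCost: number;
}

// Round to two decimal places (currency).
const round2 = (n: number): number => Math.round(n * 100) / 100;

function estimateDuty(input: DutyInput): DutyBreakdown {
  // Step 1: CIF = cost + insurance + freight.
  const cif = round2(input.goodsValue + input.insurance + input.freight);
  // Step 2: duty on the CIF value (assumed ad valorem).
  const duty = round2(cif * input.dutyRate);
  // Step 3: VAT on CIF plus duty (assumed VAT base).
  const vat = round2((cif + duty) * input.vatRate);
  // Step 4: totals.
  const totalTaxes = round2(duty + vat);
  const landedCost = round2(cif + totalTaxes);
  return { cif, duty, vat, totalTaxes, landedCost };
}

const dutyExample = estimateDuty({
  goodsValue: 1000,
  insurance: 20,
  freight: 180,
  dutyRate: 0.1,
  vatRate: 0.2,
});
// cif 1200, duty 120, vat 264, totalTaxes 384, landedCost 1584
```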

app/api/duty/route.ts gained a GET handler:
  GET /api/duty?methodology=true → DUTY_METHODOLOGY
Any other GET shape returns 400 with the api-docs link. POST is
unchanged.
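
The GET behaviour described above can be sketched without any framework as a pure function over the request URL. This is not the route's actual code: `handleDutyGet`, the `HandlerResult` shape, the error text, and the `/api-docs` path used for the docs link are assumptions, and the `DUTY_METHODOLOGY` object here is a two-field placeholder for the real constant.

```typescript
interface HandlerResult {
  status: number;
  body: unknown;
}

// Placeholder; the real constant also holds the formula, measure types and sources.
const DUTY_METHODOLOGY = {
  audited_at: "2026-05-16",
  verified: false,
};

// Return the methodology for ?methodology=true, otherwise a 400 with a docs pointer.
function handleDutyGet(requestUrl: string): HandlerResult {
  const params = new URL(requestUrl).searchParams;
  if (params.get("methodology") === "true") {
    return { status: 200, body: DUTY_METHODOLOGY };
  }
  return {
    status: 400,
    body: {
      error: "Unsupported GET request; use POST or ?methodology=true",
      docs: "/api-docs", // assumed location of the api-docs link
    },
  };
}

const okResult = handleDutyGet("https://example.com/api/duty?methodology=true");
const badResult = handleDutyGet("https://example.com/api/duty");
// okResult.status is 200, badResult.status is 400
```

Rejecting every other GET shape keeps the calculation itself POST-only, so existing callers are unaffected.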

lib/data/duty-sample-fixture.json anchors 10 high-volume HS headings
(0901 coffee, 2204 wine, 6110 jerseys, 8471 computers, 8703 motor
cars, 3004 medicaments, 7113 jewellery, 4011 tyres, 9504 game
consoles, 0207 poultry) to expected description substrings + common
origins + canonical Trade Tariff URLs. Data only this sprint — smoke
assertions are a follow-up chore.

DISCIPLINE

Sandbox HTTPS allowlist did not include eur-lex.europa.eu, gov.uk,
trade-tariff.service.gov.uk, the manufacturer domains, ecfr.gov,
fmcsa.dot.gov, unece.org, or en.wikipedia.org on this audit pass.
Per the rule "no field changes without primary-source backing",
NO field values were mutated. Every record + the methodology carry
`verified: false` until a future audit pass with HTTPS access can
re-fetch and diff.

DOCS

- docs/audit/vehicles-customs-completeness-2026-05-16.md — per-field
  gap report (18 vehicles, methodology, fixture).
- docs/audit/vehicles-customs-verification-2026-05-16.md — what was
  verified, what was deferred, runbook for flipping verified: false
  → verified: true once HTTPS access is restored.

FAULT 5 — CHANGELOG.md + lib/changelog-data.ts entries added for
2026-05-16. No new public page, no sitemap entry needed, no MCP
surface affected. Existing /api/vehicles + /api/duty routes still
audit-wrapped (lint:audit confirms). lint:api-casing confirms 38
route files clean — snake_case throughout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(audit): correct vehicle record count — 17 not 18

Miscounted the source dataset; the live API meta.total flagged it
(returned 17). Updates four files:

- docs/audit/vehicles-customs-completeness-2026-05-16.md —
  total count, per-category breakdown (9 artic / 5 rigid / 3 van),
  EU/US split (15 / 2), eur-lex citation count (10 records, not 8).
- docs/audit/vehicles-customs-verification-2026-05-16.md —
  "all 17 records" + the blocked-domains table count.
- CHANGELOG.md and lib/changelog-data.ts entries — "17" not "18".

No code change; no field-value change; provenance count unchanged
(17/17 = 100% as already documented).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: SoapyRED <soapyred@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>