feat(profile)!: generalize --validate / --strict / --no-projection flags #3915
Merged
The profile command's CLI grew up around DCAT-US v3 and the original
flag names (`--validate-dcat`, `--strict-dcat`, `--no-dcat`) baked
DCAT into the surface. With dcat-ap-v3, croissant, and geoconnex
profiles now shipping — the last two of which are schema.org-rooted
JSON-LD, not DCAT at all — the names are misleading: you'd type
`--validate-dcat` to validate a Croissant document.
Renamed (clean break, no aliasing; profile is a new command):
  --validate-dcat → --validate
  --strict-dcat   → --strict
  --no-dcat       → --no-projection

Output JSON keys:
  dcat            → projection
  dcat_warnings   → projection_warnings
  dcat_discovered → kept (the discovered payload really is always
                    DCAT-shaped, parsed from HTTP Link:
                    rel=describedBy markup, regardless of which
                    profile is active)

Kept as-is because they genuinely are DCAT-specific:
  --dcat-legacy-license, --no-dcat-discovery, --dcat-discovery-timeout

Internal: flag_* field names, final_dcat_has_field → final_projection_has_field,
strict-error message format, test fn names, and all `/dcat/...` JSON
pointers in test fixtures + integration tests follow the rename.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
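Because the rename is a clean break with no aliases, scripts that still pass the old flag names have to be updated by hand. A minimal sketch of that mapping follows; `migrate_flag` and `migrate_args` are hypothetical helpers written for illustration and are not part of qsv.

```rust
/// Map a legacy DCAT-prefixed `profile` flag to its generalized name.
/// Flags that stay DCAT-specific (e.g. `--no-dcat-discovery`) and anything
/// unrecognized pass through unchanged.
/// NOTE: hypothetical helper, not part of qsv.
fn migrate_flag(arg: &str) -> &str {
    match arg {
        "--validate-dcat" => "--validate",
        "--strict-dcat" => "--strict",
        "--no-dcat" => "--no-projection",
        other => other,
    }
}

/// Rewrite a whole argument list, e.g. one captured from an older script.
fn migrate_args(args: &[&str]) -> Vec<String> {
    args.iter().map(|a| migrate_flag(a).to_string()).collect()
}
```

The match is on the exact flag string, so `--no-dcat-discovery` is not mistaken for `--no-dcat`.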
…orev 2549)

The prior rename PR moved the output JSON key from `dcat` to `projection` but left two internal address spaces unmigrated:

1. `discovery_merge` synthesized candidate paths as `/dcat/<key>` when checking which top-level keys (and per-distribution fields) were marked forced by the user's `dataset_info` overrides. With users now writing `/projection/...` JSON pointers per the documented contract, a forced nested path like `/projection/dcat:contactPoint/vcard:fn` no longer blocked the discovered parent object from being merged first, so discovered sibling fields could leak into the final projection even though the `/dcat/...` legacy form would have protected that subtree (Medium, src/cmd/profile.rs:431).
2. YAML `field_mappings:` targets across all four bundled profiles (dcat-us-v3, dcat-ap-v3, croissant, geoconnex; 109 lines total) still used `/dcat/...` form. `translate_ckan_ptr` returned these verbatim, so package-side `force: true` overrides via `apply_force_overrides` wrote into a stale `/dcat/...` top-level subtree instead of `/projection/...`, AND the same paths fed to `discovery_merge` never matched the new `/projection/...` candidates either.

Both are fixed by migrating the internal namespace to `/projection/`:

* discovery_merge: param + struct field `forced_dcat_paths` → `forced_paths`; candidate prefix `/dcat/<key>` → `/projection/<key>`; `dist_prefix` likewise; all in-module unit tests updated.
* context.rs: struct field `Analysis::forced_dcat_paths` → `forced_paths`; doc comments + force-path tests updated.
* All four bundled YAMLs: `target: /dcat/...` → `target: /projection/...`.
* profile.rs `--initial-context` help text: "final DCAT block" → "final projection block" (Low finding); regenerated docs/help/profile.md.

Verified: 59/59 profile integration tests + 161/161 profile module unit tests pass; clippy clean; nightly fmt applied.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
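To illustrate the first finding, here is a minimal sketch of the kind of prefix check that force-protection relies on. It is not the actual `discovery_merge` code: `candidate_for` and `is_protected` are hypothetical names, and JSON Pointer escaping (`~0`, `~1`) is ignored for brevity.

```rust
/// Build the candidate pointer for a top-level projection key.
/// NOTE: hypothetical helper; the real discovery_merge internals may differ.
fn candidate_for(key: &str) -> String {
    format!("/projection/{key}")
}

/// A discovered value at `candidate` must not be merged when the user forced
/// that exact pointer or any pointer nested beneath it.
fn is_protected(candidate: &str, forced_paths: &[&str]) -> bool {
    forced_paths.iter().any(|forced| {
        *forced == candidate
            || forced
                .strip_prefix(candidate)
                // Require a segment boundary so "/projection/a" does not
                // match "/projection/ab".
                .map_or(false, |rest| rest.starts_with('/'))
    })
}
```

Under this sketch, a forced `/projection/dcat:contactPoint/vcard:fn` protects the `/projection/dcat:contactPoint` candidate, whereas the same pointer written under `/dcat/...` does not match it. That mirrors the mismatch the commit describes, only in the opposite direction: there the candidates used `/dcat/<key>` and the user pointers used `/projection/...`.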
…pace (roborev 2551)

The init-context guide linked from `--initial-context --help` still documented force-override targets under the legacy `/dcat/...` root in two places after the runtime migration:

1. The CKAN→target mapping table at lines 107-113 listed entries like `/dcat/dct:title`. Users following the table would write force pointers that miss discovery-merge protection and land in a stale top-level `dcat` subtree instead of `/projection/...`.
2. The catalog-mode paragraph at line 165 used `/dcat/dct:title` to illustrate that dataset_info overrides target the inner Dataset.

Updated both to `/projection/...` form, renamed the table column header "DCAT pointer" → "Projection pointer", and replaced the "CKAN slots without a DCAT counterpart" wording with "without a projection counterpart" so the prose matches the cross-profile naming the rest of the change uses.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
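Users who copied pointers from the old table can rewrite them mechanically. The sketch below assumes the only change needed is swapping the root segment; `migrate_pointer` is a hypothetical helper, not a qsv function.

```rust
/// Rewrite a legacy `/dcat/...` force pointer to the `/projection/...` root.
/// Pointers under any other root are returned unchanged.
/// NOTE: hypothetical helper for illustration only.
fn migrate_pointer(ptr: &str) -> String {
    match ptr.strip_prefix("/dcat") {
        // Only rewrite when "/dcat" is a whole first segment, so a root
        // such as "/dcat_discovered" is left alone.
        Some(rest) if rest.is_empty() || rest.starts_with('/') => {
            format!("/projection{rest}")
        }
        _ => ptr.to_string(),
    }
}
```

Prefixed keys deeper in the pointer, such as `dcat:contactPoint`, are untouched because only the leading segment is inspected.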
Pull request overview
This PR generalizes the profile command’s DCAT-specific flag names, output keys, and pointer conventions to a profile-agnostic “projection” surface, aligning the command with DCAT-US, DCAT-AP, Croissant, and Geoconnex profiles.
Changes:
- Renames profile flags and output keys from DCAT-specific names to generalized projection terminology.
- Updates projection pointer mappings, tests, fixtures, and generated help documentation.
- Preserves genuinely DCAT-specific discovery and legacy-license terminology.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/cmd/profile.rs | Renames CLI flags, output keys, strict messages, and projection warning handling. |
| src/cmd/profile/context.rs | Renames forced path fields and updates projection pointer comments/tests. |
| src/cmd/profile/discovery_merge.rs | Updates forced-path merge protection to use /projection/.... |
| src/cmd/profile/profile_spec.rs | Updates field-mapping documentation for projection targets. |
| src/cmd/profile/projection.rs | Updates projection warning documentation. |
| src/cmd/profile/spec.rs | Updates validator warning documentation. |
| src/cmd/profile/dcat_validate.rs | Updates validation flag/warning documentation. |
| resources/profiles/*.yaml | Rewrites embedded profile field mappings from /dcat/... to /projection/.... |
| resources/profiles/README.md | Updates validator flag documentation. |
| resources/dcat-us-v3/README.md | Updates validation/strict flag names. |
| tests/test_profile.rs | Updates integration assertions, flag usage, and test names for projection terminology. |
| tests/resources/profile/dcat-init-context.* | Updates initial-context examples and fixture pointers. |
| docs/help/profile.md | Regenerates profile command help with generalized flags and keys. |
| Cargo.toml | Updates Geoconnex feature comment to the new validation flag. |
Three Copilot review comments on PR #3915:

1. `--strict` help text overstated the behavior: it said "any validation finding", but `run_profile_validation` RFC4180 warnings are appended regardless of `--strict`, and external-validator `Info`-severity findings are explicitly filtered out of the abort path. Tightened to "JSON Schema violations or non-Info external-validator findings (Required/Recommended severities)" and called out that RFC4180 structural failures are always warnings regardless of --strict.
2. `FieldMapping` doc-comment example in profile_spec.rs still said Croissant maps to "top-level keys", but the bundled Croissant profile now uses `/projection/name`, `/projection/description`, etc. Updated the example so custom-profile authors don't copy an invalid target shape; clarified that all bundled profiles share the `/projection/...` root regardless of their JSON-LD vocabulary.
3. dcat-init-context.README catalog-mode section claimed that pointer paths like `/projection/dct:title` target the inner Dataset under `--catalog`. They don't: `apply_pointer_overrides` and `apply_force_overrides` both run on the full output after the Catalog envelope is wrapped, so that pointer writes to the Catalog object. Documented the actual behaviour: inner-Dataset overrides need `/projection/dcat:dataset/0/...` (or `/projection/schema:dataset/0/...` for schema.org-rooted profiles); discovery-merge force-protection still works on the pre-wrap Dataset, so the protection guarantee is unchanged.

Verified: 59/59 profile integration tests + 161/161 profile module unit tests pass; clippy clean; nightly fmt applied; help docs regenerated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
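Points 1 and 3 above can be sketched as two small functions. This is an illustration under stated assumptions, not the qsv implementation: the `Finding` and `Severity` types and both function names are hypothetical, and only the severity names (Required, Recommended, Info) and the pointer shapes come from the commit message.

```rust
/// Severity of an external-validator finding (names taken from the commit message).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    Required,
    Recommended,
    Info,
}

/// Kinds of validation findings. NOTE: hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Finding {
    SchemaViolation,
    External(Severity),
    Rfc4180,
}

/// Under `--strict`, JSON Schema violations and non-Info external findings
/// abort; RFC4180 structural failures are always just warnings.
fn aborts_under_strict(strict: bool, finding: Finding) -> bool {
    strict
        && match finding {
            Finding::SchemaViolation => true,
            Finding::External(sev) => sev != Severity::Info,
            Finding::Rfc4180 => false,
        }
}

/// Under `--catalog`, turn a dataset-level pointer into one that targets the
/// first inner Dataset. `dataset_key` is "dcat:dataset" or "schema:dataset".
/// Returns None for pointers that are not under `/projection/`.
fn inner_dataset_pointer(ptr: &str, dataset_key: &str) -> Option<String> {
    let rest = ptr.strip_prefix("/projection/")?;
    Some(format!("/projection/{dataset_key}/0/{rest}"))
}
```

For example, `inner_dataset_pointer("/projection/dct:title", "dcat:dataset")` yields `/projection/dcat:dataset/0/dct:title`, the form the corrected README describes for reaching the inner Dataset.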
The `--strict` help text ended with a continuation line that began with `--strict.` (the trailing token of "regardless of\n--strict."), which `qsv --generate-help-md`'s table extractor mistook for a new option entry. The generated docs/help/profile.md grew a phantom `| --strict. | flag | |` row right after the real `--strict` row, documenting a non-existent flag.

Reworded the wrap so the continuation reads "appended as warnings, regardless of this flag.", so no continuation line now starts with an option-looking token. Regenerated profile.md confirms the fake row is gone and the real description is complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
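A guard against this class of wrapping bug could look like the sketch below. It assumes, without having checked the real extractor, that any continuation line whose first non-blank characters are `--` is at risk of being read as an option entry; `option_like_continuations` is a hypothetical helper.

```rust
/// Return the continuation lines (every line after the first) of one option's
/// help text that begin with an option-looking `--` token.
/// NOTE: hypothetical lint; the heuristic used by the real table extractor
/// in `qsv --generate-help-md` is an assumption here.
fn option_like_continuations(help: &str) -> Vec<&str> {
    help.lines()
        .skip(1) // the first line legitimately starts with the option name
        .map(str::trim_start)
        .filter(|line| line.starts_with("--"))
        .collect()
}
```

An empty result means no continuation line starts with an option-looking token, which is the property the reworded help text restores.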
Summary
- The `profile` command's CLI grew up around DCAT-US v3, and the original flag names baked DCAT into the surface (`--validate-dcat`, `--strict-dcat`, `--no-dcat`). With `dcat-ap-v3`, `croissant`, and `geoconnex` profiles now shipping (the last two of which are schema.org-rooted JSON-LD, not DCAT at all), the names became misleading: you'd type `--validate-dcat` to validate a Croissant document.
- Clean break, no aliasing: `profile` is a new command without an established user base.
- Output JSON keys: `dcat` → `projection`, `dcat_warnings` → `projection_warnings`. `dcat_discovered` stays because the discovered payload is always DCAT-shaped (parsed from HTTP `Link: rel=describedBy` markup) regardless of the active profile.

Renamed

| Old | New |
|---|---|
| `--validate-dcat` | `--validate` |
| `--strict-dcat` | `--strict` |
| `--no-dcat` | `--no-projection` |
| `dcat` | `projection` |
| `dcat_warnings` | `projection_warnings` |

Kept (genuinely DCAT-specific)

- `--dcat-legacy-license`: DCAT v1.1 backwards-compat for `dct:license` placement
- `--no-dcat-discovery`: HTTP `Link: rel=describedBy` DCAT-markup discovery
- `--dcat-discovery-timeout`: same
- `dcat_discovered`: always DCAT-shaped payload

Internal updates

- `Args` field names: `flag_validate_dcat` / `flag_strict_dcat` / `flag_no_dcat` → `flag_validate` / `flag_strict` / `flag_no_projection`
- `final_dcat_has_field()` → `final_projection_has_field()`
- Strict-error message format: `qsv profile --strict-dcat:` → `qsv profile --strict:`
- Test fn names: `validate_dcat_*` / `strict_dcat_*` → `validate_*` / `strict_*`
- `/dcat/...` JSON pointers in test fixtures + integration tests → `/projection/...`
- Regenerated `docs/help/profile.md` via `qsv --generate-help-md`

Test plan

- `cargo build --locked --bin qsv -F all_features`: clean
- `cargo build --locked --bin qsvlite -F lite`: clean (pre-existing unrelated `extract_sql_sample` warning)
- `cargo build --locked --bin qsvmcp -F qsvmcp`: clean
- `cargo build --locked --bin qsvdp -F datapusher_plus`: clean (pre-existing unrelated warnings)
- `cargo clippy --bin qsv -F all_features`: clean
- `cargo +nightly fmt`: applied
- `cargo test -F all_features --test tests test_profile::`: 59/59 pass
- `cargo test -F all_features --bin qsv cmd::profile`: 161/161 pass
- `qsv --generate-help-md`: 73/73 regenerated
- `qsv --update-mcp-skills`: re-ran

🤖 Generated with Claude Code