-
Notifications
You must be signed in to change notification settings - Fork 5
Switch from pykwalify to StrictYAML #922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Warning Rate limit exceeded@spoorcc has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 44 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (9)
WalkthroughMigrate manifest validation from PyKwalify to StrictYAML: add a StrictYAML schema module, update validation to load with strictyaml and enforce cross-item uniqueness, remove the old YAML schema file, rename a resource helper, update tests/features, and replace pykwalify with strictyaml in dependencies. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant FS as Filesystem
participant Validator as validate(path)
participant SY as StrictYAML (loader/schema)
participant Uniq as _ensure_unique
participant Reporter as Error reporting
User->>Validator: validate(manifest_path)
Validator->>FS: read(manifest_path)
FS-->>Validator: manifest_text
Validator->>SY: load(manifest_text, MANIFEST_SCHEMA)
alt YAML parse error
SY-->>Reporter: raise StrictYAMLError (with line/col)
Reporter-->>User: RuntimeError(error message + location)
else parsed
SY-->>Validator: manifest_dict
Validator->>Uniq: check remotes.name, projects.name, projects.dst
alt duplicates found
Uniq-->>Reporter: raise RuntimeError(Duplicate ... )
Reporter-->>User: RuntimeError(duplicate message)
else success
Validator-->>User: None (validation passed)
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Follow-up for #903 see what i prefer more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI Agents
In @CHANGELOG.rst:
- Line 5: Update the changelog entry line that currently reads "* Switch from
pykwalify to StrictYAML (#0)" to reference the correct PR number by replacing
"#0" with "#922" so the line becomes "* Switch from pykwalify to StrictYAML
(#922)".
In @dfetch/manifest/schema.py:
- Line 36: The schema references an undefined symbol `Number` causing a
NameError; change the field mapping that uses `Number` to use the defined
`NUMBER` constant (defined as `NUMBER = Int() | Float()`) so the `"version"`
schema entry references `NUMBER` instead of `Number`, ensuring the runtime uses
the correct validator.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHANGELOG.rstdfetch/manifest/schema.pydfetch/manifest/validate.pydfetch/resources/__init__.pydfetch/resources/schema.yamlfeatures/updated-project-has-dependencies.featurepyproject.tomltests/test_resources.py
💤 Files with no reviewable changes (1)
- dfetch/resources/schema.yaml
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-01-01T11:05:43.761Z
Learnt from: ben-edna
Repo: dfetch-org/dfetch PR: 898
File: script/package.py:201-205
Timestamp: 2026-01-01T11:05:43.761Z
Learning: In the dfetch project's script/package.py, prefer explicit failures ("crash and burn") over silently handling missing configuration values, ensuring that builds fail if required metadata (e.g., URLs in pyproject.toml) is not properly configured.
Applied to files:
features/updated-project-has-dependencies.feature
📚 Learning: 2025-12-25T22:46:21.776Z
Learnt from: spoorcc
Repo: dfetch-org/dfetch PR: 861
File: dfetch/resources/__init__.py:12-17
Timestamp: 2025-12-25T22:46:21.776Z
Learning: In Nuitka standalone builds for dfetch, ensure resources are referenced from the top-level 'resources' directory (not the package path 'dfetch.resources') to avoid clashes with the executable named 'dfetch'. Specifically, in code under dfetch/resources/, when __compiled__ is detected, use the top-level 'resources' path as the resource locator (as already done in dfetch/resources/__init__.py with the 'resources' string). This pattern should apply to all Py files within dfetch/resources/ to prevent name collisions in standalone binaries.
Applied to files:
dfetch/resources/__init__.py
🧬 Code graph analysis (2)
dfetch/manifest/validate.py (1)
dfetch/manifest/manifest.py (2)
projects(266-268)remotes(287-289)
tests/test_resources.py (1)
dfetch/resources/__init__.py (1)
template_path(20-22)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: build / build (macos-latest)
- GitHub Check: build / build (windows-latest)
- GitHub Check: build / build (ubuntu-latest)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: run / run (ubuntu-latest, 3.10)
- GitHub Check: run / test-cygwin
- GitHub Check: DevContainer Build & Test
🔇 Additional comments (7)
tests/test_resources.py (1)
11-25: LGTM!The test functions have been correctly updated to use the new
template_path()API. The test logic remains sound and properly validates the context manager behavior.features/updated-project-has-dependencies.feature (1)
93-97: LGTM!The enhanced error format provides better diagnostics for manifest validation failures. The multi-line output with content display, line position indicator, and specific error message ("found arbitrary text") improves the user experience when debugging invalid manifests.
dfetch/resources/__init__.py (1)
20-22: LGTM!The API has been correctly updated from
schema_path()totemplate_path(), referencingtemplate.yamlinstead ofschema.yaml. The function signature and implementation pattern remain consistent.pyproject.toml (1)
44-44: strictyaml 1.7.3 is valid with no known security vulnerabilities.dfetch/manifest/validate.py (2)
14-21: LGTM — clever duplicate detection.The set-based one-liner on line 18 is idiomatic, though the side-effect in a comprehension can surprise readers. An alternative using a simple loop might be clearer, but this works correctly.
24-50: Validation logic looks solid.The error handling correctly distinguishes schema violations (
YAMLValidationError) from parsing issues (StrictYAMLError). The uniqueness enforcement fornameanddstfields properly replaces the pykwalify constraints.Minor note: on line 45,
manifest.get("remotes", []) or []is slightly redundant—if the key is missing,getalready returns[]. Theor []only guards against an explicitNonevalue, which shouldn't occur if schema validation passed. Not worth changing, just noting for clarity.dfetch/manifest/schema.py (1)
7-30: Schema definitions look correct.
REMOTE_SCHEMAandPROJECT_SCHEMAproperly define required and optional fields with appropriate types. TheEnum(["git", "svn"])constraint forvcsandSeq(Str())forignoreare well-formed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
dfetch/manifest/validate.py (1)
14-21: The duplicate-detection logic is clever but obscure.The set comprehension on line 18 uses a side-effect trick (
seen.add(v)returnsNone, making theorclause always add and return falsy). While functional, this pattern can confuse readers.🔎 More readable alternative
def _ensure_unique(seq: list[dict[str, Any]], key: str, context: str) -> None: """Ensure values for `key` are unique within a sequence of dicts.""" values = [item.get(key) for item in seq if key in item] - seen = set() - dups = {v for v in values if (v in seen) or seen.add(v)} # type: ignore + seen: set[Any] = set() + dups: set[Any] = set() + for v in values: + if v in seen: + dups.add(v) + else: + seen.add(v) if dups: dup_list = ", ".join(sorted(map(str, dups))) raise RuntimeError(f"Duplicate {context}.{key} value(s): {dup_list}")
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
CHANGELOG.rstdfetch/manifest/schema.pydfetch/manifest/validate.pydfetch/resources/__init__.pydfetch/resources/schema.yamlfeatures/updated-project-has-dependencies.featurepyproject.tomltests/test_resources.py
💤 Files with no reviewable changes (1)
- dfetch/resources/schema.yaml
🚧 Files skipped from review as they are similar to previous changes (4)
- dfetch/resources/init.py
- features/updated-project-has-dependencies.feature
- pyproject.toml
- tests/test_resources.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: build / build (windows-latest)
- GitHub Check: build / build (macos-latest)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: run / run (ubuntu-latest, 3.10)
- GitHub Check: run / run (macos-latest, 3.9)
- GitHub Check: run / run (macos-latest, 3.13)
- GitHub Check: run / run (ubuntu-latest, 3.9)
- GitHub Check: run / run (windows-latest, 3.14)
- GitHub Check: run / run (macos-latest, 3.10)
- GitHub Check: run / run (ubuntu-latest, 3.12)
- GitHub Check: run / run (windows-latest, 3.13)
- GitHub Check: run / run (windows-latest, 3.9)
- GitHub Check: run / run (windows-latest, 3.12)
- GitHub Check: run / run (windows-latest, 3.10)
- GitHub Check: run / run (ubuntu-latest, 3.11)
- GitHub Check: run / run (windows-latest, 3.11)
- GitHub Check: run / test-cygwin
- GitHub Check: DevContainer Build & Test
🔇 Additional comments (5)
CHANGELOG.rst (1)
6-6: Verify the issue/PR reference #36.The reference
#36appears unusually low for an unreleased feature in version 0.12.0. Please confirm this is the correct issue number, or update it to reference the appropriate issue/PR.dfetch/manifest/schema.py (2)
1-42: Schema definition looks correct.The StrictYAML schema is well-structured with appropriate use of
Optional()for non-required fields, proper type validators, and a sensibleEnumconstraint for the VCS field.
38-38: Verify ifprojectsshould be required or optional.The schema marks
projectsas required (noOptional()wrapper), meaning manifests without any projects will fail validation. If empty manifests should be valid (e.g., for initialization), consider wrapping it withOptional():- "projects": Seq(PROJECT_SCHEMA), + Optional("projects"): Seq(PROJECT_SCHEMA),If requiring at least one project is intentional, this is fine as-is.
dfetch/manifest/validate.py (2)
44-45: Syncprojectshandling with schema optionality.If
projectsis required in the schema (as currently defined), line 44's direct access is correct. However, if you makeprojectsoptional in the schema (per the earlier suggestion), you'll need to update this line to use.get("projects", [])similar toremotes.Also, the
or []on line 45 is defensive againstremotes: nullin YAML—which StrictYAML'sSeq()validator would reject anyway, making it redundant. Consider removing it for clarity, or keep it if you prefer the extra safety.
24-50: Validation logic is well-structured.The migration to StrictYAML is clean:
- Proper exception hierarchy handling (
YAMLValidationErrorbeforeStrictYAMLError)- Appropriate use of exception chaining with
from err- Clear uniqueness constraint enforcement
- Reasonable logging suppression for the strictyaml module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI Agents
In @dfetch/manifest/validate.py:
- Line 51: In validate.py update the inline comment for the remotes declaration
(the line with "remotes: list[dict[str, Any]] = manifest.get(\"remotes\", []) or
[]") to fix the typo: change "option" to "optional" so the comment reads that
the field is optional.
🧹 Nitpick comments (2)
dfetch/manifest/validate.py (1)
12-21: Consider a more readable duplicate detection approach.The current implementation works correctly but uses a set comprehension with side effects (
seen.add(v)) that relies onset.add()returningNone. This pattern can be confusing for maintainers.🔎 Alternative implementation for clarity
def _ensure_unique(seq: list[dict[str, Any]], key: str, context: str) -> None: """Ensure values for `key` are unique within a sequence of dicts.""" values = [item.get(key) for item in seq if key in item] - seen = set() - dups = {v for v in values if (v in seen) or seen.add(v)} # type: ignore + from collections import Counter + counts = Counter(values) + dups = {v for v, count in counts.items() if count > 1} if dups: dup_list = ", ".join(sorted(map(str, dups))) raise RuntimeError( f"Schema validation failed:\nDuplicate {context}.{key} value(s): {dup_list}" )dfetch/manifest/schema.py (1)
19-21: Consider validating mutual exclusivity of branch/tag/revision.The
branch,tag, andrevisionfields are typically mutually exclusive in version control workflows. While StrictYAML schemas don't support cross-field constraints, consider adding runtime validation invalidate.pyto ensure only one of these fields is specified per project.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
CHANGELOG.rstdfetch/manifest/schema.pydfetch/manifest/validate.pydfetch/resources/__init__.pydfetch/resources/schema.yamlfeatures/updated-project-has-dependencies.featurefeatures/validate-manifest.featurepyproject.tomltests/test_resources.py
💤 Files with no reviewable changes (1)
- dfetch/resources/schema.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- dfetch/resources/init.py
- CHANGELOG.rst
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-01T11:05:43.761Z
Learnt from: ben-edna
Repo: dfetch-org/dfetch PR: 898
File: script/package.py:201-205
Timestamp: 2026-01-01T11:05:43.761Z
Learning: In the dfetch project's script/package.py, prefer explicit failures ("crash and burn") over silently handling missing configuration values, ensuring that builds fail if required metadata (e.g., URLs in pyproject.toml) is not properly configured.
Applied to files:
features/updated-project-has-dependencies.feature
🧬 Code graph analysis (2)
tests/test_resources.py (1)
dfetch/resources/__init__.py (1)
template_path(20-22)
dfetch/manifest/validate.py (2)
dfetch/manifest/manifest.py (3)
path(251-253)projects(266-268)remotes(287-289)dfetch/project/metadata.py (1)
path(128-136)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: Publish Python distribution 📦 to TestPyPI
- GitHub Check: build / build (windows-latest)
- GitHub Check: build / build (ubuntu-latest)
- GitHub Check: build / build (macos-latest)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: run / run (windows-latest, 3.14)
- GitHub Check: run / run (windows-latest, 3.11)
- GitHub Check: run / run (windows-latest, 3.13)
- GitHub Check: run / run (macos-latest, 3.10)
- GitHub Check: run / run (windows-latest, 3.10)
- GitHub Check: run / run (windows-latest, 3.12)
- GitHub Check: run / run (windows-latest, 3.9)
- GitHub Check: run / run (ubuntu-latest, 3.9)
- GitHub Check: run / run (macos-latest, 3.9)
- GitHub Check: run / test-cygwin
- GitHub Check: test / test
- GitHub Check: DevContainer Build & Test
🔇 Additional comments (11)
features/updated-project-has-dependencies.feature (1)
93-97: LGTM: Enhanced error diagnostics.The updated error format provides better context with multi-line diagnostics including the invalid content, line position, and a clear error description. This aligns with StrictYAML's error reporting style and improves the user experience.
features/validate-manifest.feature (2)
47-52: LGTM: Updated error format aligns with StrictYAML.The multi-line diagnostic format with line position indicators matches StrictYAML's error reporting style and provides clearer feedback to users.
54-72: LGTM: Good test coverage for duplicate detection.The new test scenario validates the duplicate project name detection feature introduced in the validation logic. This is valuable coverage for the new
_ensure_uniquefunctionality.tests/test_resources.py (1)
11-25: LGTM: Clean API rename from schema_path to template_path.The test updates correctly reflect the renamed resource API. The functionality remains unchanged, only the naming has been updated for clarity.
dfetch/manifest/validate.py (2)
24-46: LGTM: Clean StrictYAML integration with proper error handling.The validation function correctly loads the manifest with StrictYAML and handles both
YAMLValidationErrorandStrictYAMLErrorexceptions. The error message formatting provides clear, multi-line diagnostics that improve the user experience.
48-55: No changes needed—MANIFEST_SCHEMA correctly enforces required keys.The schema defines
"manifest"and"projects"as required fields (not wrapped inOptional()), while"remotes"is markedOptional(). The code's direct key access and.get()usage align with the schema constraints, making the implementation safe.pyproject.toml (1)
44-44: strictyaml==1.7.3 is valid and free from known vulnerabilities.Version 1.7.3 (released 2023-03-10) is the latest stable release available on PyPI and has no known CVEs or security advisories.
dfetch/manifest/schema.py (4)
1-5: LGTM!The imports and
NUMBERdefinition are correct. The previously flaggedNameError(usingNumberinstead ofNUMBER) has been resolved.
7-13: LGTM!The
REMOTE_SCHEMAdefinition correctly specifies required fields (name,url-base) and the optionaldefaultfield with appropriate types.
26-26: No issues found. Thevcsenum is correctly restricted to["git", "svn"], which are the only version control systems supported in the codebase.
38-38: No change needed —projectsis correctly required.The
projectsfield is intentionally required, not optional. The schema code is consistent with the documentation, which explicitly listsprojectsin the required fields section. The validation code also treats it as required by directly accessingmanifest["projects"]without optional handling. There is no inconsistency.Likely an incorrect or invalid review comment.
Summary by CodeRabbit
Bug Fixes
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.