feat: auto-generate Pydantic models from JSON schema & add CI + linting #13
This file is used to automatically generate the Pydantic models
All changes are only lint related
Changes here are lint related + updating some of the logic to match the newly updated v2-rc1 schema. No additional computation logic was added.
Lint updates exclusively
Pull request overview
This PR modernizes the project’s GTFS Diff v2 output handling by switching to schema-driven (auto-generated) Pydantic v2 models, aligning engine output with the v2-rc1 schema, and adding CI + Ruff-based lint/format tooling to keep models and style consistent.
Changes:
- Add `scripts/generate_models.sh` + `schema.conf` to generate and pin `src/gtfs_diff/models.py` from the upstream JSON Schema.
- Update engine + tests to match the updated schema structure (notably moving per-file counts into `file_diffs[].stats`).
- Add Ruff configuration/scripts and a path-filtered GitHub Actions workflow (tests, lint, models freshness).
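To illustrate the structural change, here is a minimal sketch of per-file counts living under each file diff's `stats` object, with totals derived from those stats. It uses plain dataclasses as dependency-free stand-ins for the generated Pydantic models, and the count field names (`rows_added`, `rows_deleted`, `rows_modified`) and the `total_changes` helper are assumptions for illustration, not the actual v2-rc1 schema fields.

```python
from dataclasses import dataclass, field


@dataclass
class FileStats:
    # Hypothetical count fields; the real names come from the v2-rc1 schema.
    rows_added: int = 0
    rows_deleted: int = 0
    rows_modified: int = 0


@dataclass
class FileDiff:
    file_name: str
    # Per-file counts are nested here rather than in a summary section.
    stats: FileStats = field(default_factory=FileStats)


def total_changes(file_diffs: list[FileDiff]) -> int:
    """Sum every per-file count to get a feed-wide total (hypothetical helper)."""
    return sum(
        d.stats.rows_added + d.stats.rows_deleted + d.stats.rows_modified
        for d in file_diffs
    )


diffs = [
    FileDiff("stops.txt", FileStats(rows_added=2, rows_modified=1)),
    FileDiff("routes.txt", FileStats(rows_deleted=3)),
]
print(total_changes(diffs))
```

Deriving totals from the per-file stats keeps a single source of truth, so the summary cannot drift from the file-level numbers.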
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/gtfs_diff/models.py` | Replace hand-written models with auto-generated schema-derived Pydantic v2 models. |
| `scripts/generate_models.sh` | Add a generator script to fetch schema + run datamodel-codegen and post-process the output. |
| `schema.conf` | Pin the schema version used for generation (v2-rc1). |
| `src/gtfs_diff/engine.py` | Align output to new schema (introduce FileStats, compute totals from stats, read schema version). |
| `src/gtfs_diff/cli.py` | Ruff-driven formatting and CLI output formatting tweaks. |
| `src/gtfs_diff/gtfs_definitions.py` | Formatting-only changes and wrapped docstring. |
| `tests/test_engine.py` | Update assertions to use `file_diffs[].stats` and apply formatting. |
| `tests/test_models.py` | Update/extend model tests for new schema fields (FileStats, files_not_compared_count, etc.). |
| `tests/test_cli.py` | Formatting updates for fixture zips and imports. |
| `tests/conftest.py` | Minor whitespace cleanup. |
| `scripts/lint.sh` / `scripts/lint-fix.sh` | Add lint/format check and auto-fix scripts via Ruff. |
| `pyproject.toml` | Add Ruff + datamodel-code-generator (dev deps) and configure Ruff. |
| `.github/workflows/ci.yml` | Add CI jobs for lint, tests (matrix), and model freshness enforcement. |
Comments suppressed due to low confidence (1)
src/gtfs_diff/engine.py:480
- This comment is now inaccurate: per-file counts are no longer surfaced in summary.files (FileSummary only contains file_name/status). The counts are now in file_diffs[].stats.
# Determine row-changes output based on cap.
# cap=0 means "summary counts only" — row_changes is omitted from the output
# entirely (serialized as null, excluded by exclude_none=True in the CLI).
# True counts are still computed and surfaced in summary.files.
base_header_set = set(base_headers)
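
The cap behavior described in that code comment can be sketched as follows. This is an illustration under assumptions, not the engine's code: `select_row_changes` and `drop_none` are hypothetical helpers, the treatment of a positive cap as truncation and of `None` as "no cap" is assumed, and `drop_none` is a shallow stand-in for serializing with `exclude_none=True`.

```python
from typing import Optional


def select_row_changes(
    changes: list[dict], cap: Optional[int]
) -> Optional[list[dict]]:
    """Decide what goes into row_changes for a given cap (hypothetical)."""
    if cap == 0:
        # Summary counts only: signal omission with None.
        return None
    if cap is None:
        # Assumed meaning: no cap, keep every change.
        return list(changes)
    # Assumed meaning: keep at most `cap` changes.
    return changes[:cap]


def drop_none(payload: dict) -> dict:
    """Shallow stand-in for exclude_none=True: drop top-level None values."""
    return {k: v for k, v in payload.items() if v is not None}


changes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]

# True counts are computed from the full list regardless of the cap.
summary_only = drop_none(
    {"stats": {"total": len(changes)}, "row_changes": select_row_changes(changes, 0)}
)
capped = drop_none(
    {"stats": {"total": len(changes)}, "row_changes": select_row_changes(changes, 2)}
)
print(summary_only)
print(capped)
```

With `cap=0` the `row_changes` key disappears from the output while the count in `stats` still reflects all three changes.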
Summary:
Automates diff output model generation and adds CI infrastructure.
- `generate_models.sh` script to auto-generate Pydantic v2 models from the GTFS Diff JSON Schema using `datamodel-code-generator`
- `schema.conf` to pin the schema version (v2-rc1)
- `lint.sh` / `lint-fix.sh` scripts

Closes #8, closes #9