Fix CI/CD pipeline: comprehensive error handling and versioning #154
## Changes:
1. **reusable-enforce-standards.yml**: Fixed "Remove prohibited files" step
- Added proper error handling for 404 responses when files don't exist
- Uses helper function to safely delete files
- Won't fail when .github/workflows directory is missing
2. **reusable-release.yml**: Fixed version management
- Replaced pycalver (not configured) with inline CalVer generation
- Uses YYYY.MM.BUILD_NUMBER format matching existing convention
- Properly finds and updates __version__ in package __init__.py
3. **ci.yml**: Fixed job dependencies
- Removed enforce from sync dependencies (runs in parallel now)
- enforce failures no longer block sync/release/docs pipeline
4. **reusable-docs.yml**: Improved robustness
- Better handling of missing docs directories
- Proper gh-pages branch creation
- Added .nojekyll file
- Better error handling throughout
Pull request overview
This PR fixes critical issues in the CI/CD pipeline, focusing on error handling and proper versioning implementation. The changes transition from the broken pycalver approach to inline CalVer generation matching the project's established convention (YYYY.MM.BUILD with non-zero-padded months), improve error handling in the enforce standards workflow to gracefully handle missing files, and decouple the enforce job from the release pipeline to prevent blocking.
Key changes:
- Replaced pycalver dependency with inline CalVer generation in release workflow
- Added comprehensive error handling for file deletion operations in enforce standards
- Restructured CI workflow dependencies to run enforce in parallel rather than blocking releases
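
The inline CalVer generation mentioned above could look roughly like the following sketch. It assumes the build number comes from a counter such as `GITHUB_RUN_NUMBER`; the function name `make_calver` is hypothetical and the workflow's actual step may differ.

```shell
# Hypothetical helper: build a YYYY.M.BUILD version string.
# The month is printed without zero padding, as the overview describes.
make_calver() {
  local year="$1" month="$2" build="$3"
  # ${month#0} drops a single leading zero ("03" -> "3", "12" stays "12").
  printf '%d.%d.%d\n' "$year" "${month#0}" "$build"
}

# Assumption: GITHUB_RUN_NUMBER serves as the build counter; 0 is used locally.
VERSION="$(make_calver "$(date -u +%Y)" "$(date -u +%m)" "${GITHUB_RUN_NUMBER:-0}")"
echo "version=$VERSION"
```

Passing the date parts in as arguments keeps the formatting logic checkable without depending on the current date.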
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| .github/workflows/reusable-release.yml | Implements inline CalVer versioning (YYYY.MM.BUILD) to replace broken pycalver, adds version update step that finds and modifies __init__.py |
| .github/workflows/reusable-enforce-standards.yml | Adds delete_file() helper function with proper 404 handling and error suppression for safe file deletion |
| .github/workflows/reusable-docs.yml | Improves robustness with better path handling, fallback HTML generation, .nojekyll file creation, and comprehensive error handling |
| .github/workflows/ci.yml | Decouples enforce job from sync/release dependency chain, allowing it to run in parallel without blocking releases |

    pip install build twine pycalver
    PKG_DIR="packages/${{ inputs.package }}"
    VERSION="${{ steps.version.outputs.version }}"

The `find` command may fail silently if `$PKG_DIR/src` doesn't exist, causing the script to continue with an empty `INIT_FILE`. Add an explicit check that the source directory exists before running `find`, or modify the error check on lines 46-49 to also verify the directory exists and provide a more specific error message.

    + # Check that source directory exists
    + if [ ! -d "$PKG_DIR/src" ]; then
    +   echo "Error: Source directory $PKG_DIR/src does not exist."
    +   exit 1
    + fi
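
A minimal sketch of how the directory check and the version update could fit together is shown below. The function name `update_version` and the choice to take the first `__init__.py` that defines `__version__` are assumptions for illustration, not the workflow's actual code.

```shell
# Hypothetical helper: set __version__ in the first matching __init__.py
# found under <pkg_dir>/src.
update_version() {
  local pkg_dir="$1" version="$2" init_file
  # Fail early with a specific message if the source directory is missing.
  if [ ! -d "$pkg_dir/src" ]; then
    echo "Error: Source directory $pkg_dir/src does not exist." >&2
    return 1
  fi
  # Pick the first __init__.py that already defines __version__.
  init_file="$(find "$pkg_dir/src" -name '__init__.py' \
    -exec grep -l '^__version__' {} + | head -n 1)"
  if [ -z "$init_file" ]; then
    echo "Error: no __init__.py with __version__ under $pkg_dir/src." >&2
    return 1
  fi
  # -i.bak is accepted by both GNU and BSD sed; the backup is removed afterwards.
  sed -i.bak "s/^__version__ = .*/__version__ = \"$version\"/" "$init_file"
  rm -f "$init_file.bak"
  echo "Updated $init_file to $version"
}
```

Returning a non-zero status instead of calling `exit` lets the caller decide whether a missing directory should abort the whole step.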

    fi

    # Clear and copy new docs
    rm -rf gh-pages/* gh-pages/.* 2>/dev/null || true

The pattern `gh-pages/.*` will match and attempt to delete the `.git` directory, which could corrupt the repository. Exclude `.git` explicitly: `find gh-pages -mindepth 1 -maxdepth 1 ! -name '.git' -exec rm -rf {} + 2>/dev/null || true`

    - rm -rf gh-pages/* gh-pages/.* 2>/dev/null || true
    + find gh-pages -mindepth 1 -maxdepth 1 ! -name '.git' -exec rm -rf {} + 2>/dev/null || true
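
The suggested `find` invocation can be tried out in a scratch directory with a small wrapper like the one below. The function name `clear_worktree` is hypothetical; `-mindepth` and `-maxdepth` are extensions available in GNU and BSD `find`.

```shell
# Hypothetical helper: empty a checkout directory but keep its .git metadata.
clear_worktree() {
  local dir="$1"
  # Nothing to clear if the directory does not exist.
  [ -d "$dir" ] || return 0
  # Depth limits restrict matches to direct children; .git is skipped by name.
  find "$dir" -mindepth 1 -maxdepth 1 ! -name '.git' -exec rm -rf {} +
}
```

Because only direct children are matched, hidden files such as `.nojekyll` are removed along with regular files, while `.git` and everything under it stay in place.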

    # Get file SHA, suppress all errors
    SHA=$(gh api "repos/$REPO/contents/$FILE_PATH" --jq '.sha' 2>/dev/null) || true

The function makes two API calls against the same endpoint: one to get the SHA (line 148) and one to delete the file (line 151). Consider storing the full response from the first call, extracting the SHA from it, and using it to verify the file exists, reducing API calls by half.

    - # Get file SHA, suppress all errors
    - SHA=$(gh api "repos/$REPO/contents/$FILE_PATH" --jq '.sha' 2>/dev/null) || true
    + # Get file metadata (including SHA), suppress all errors
    + FILE_JSON=$(gh api "repos/$REPO/contents/$FILE_PATH" 2>/dev/null) || true
    + SHA=$(echo "$FILE_JSON" | jq -r '.sha') || true
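
One possible shape for such a `delete_file` helper is sketched below. It is not the workflow's actual function: it assumes `REPO` holds an `owner/name` slug, and it differs from the suggestion above by using the exit status of the lookup to detect a missing file (which also treats other lookup failures as "missing"). The commit message text is made up for illustration.

```shell
# Sketch of a delete helper; REPO must be set to "owner/name" by the caller.
delete_file() {
  local file_path="$1" sha
  # One lookup: a non-zero exit (for example HTTP 404) is treated as "file absent".
  if ! sha=$(gh api "repos/$REPO/contents/$file_path" --jq '.sha' 2>/dev/null); then
    echo "skip: $file_path not found"
    return 0
  fi
  # Guard against an empty or null SHA before attempting the delete.
  if [ -z "$sha" ] || [ "$sha" = "null" ]; then
    echo "skip: no SHA for $file_path"
    return 0
  fi
  # The contents API requires both a commit message and the blob SHA.
  gh api --method DELETE "repos/$REPO/contents/$file_path" \
    -f message="Remove prohibited file: $file_path" \
    -f sha="$sha" >/dev/null
}

# Example call (left commented out because it needs an authenticated gh):
#   REPO="owner/name" delete_file ".github/dependabot.yml"
```

The lookup and the delete remain separate requests, since the delete needs the SHA returned by the lookup; the saving comes from not repeating the lookup.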
.github/workflows/reusable-docs.yml
Outdated

    else
      git commit -m "📚 Update docs for ${{ inputs.package }}"
      git push -u origin gh-pages --force
      echo "✅ Docs deployed to https://$(echo $REPO | cut -d'/' -f1).github.io/$(echo $REPO | cut -d'/' -f2)/"

Using `cut` to parse the repo URL is fragile. Use bash parameter expansion instead: `echo "✅ Docs deployed to https://${REPO%%/*}.github.io/${REPO##*/}/"`

    - echo "✅ Docs deployed to https://$(echo $REPO | cut -d'/' -f1).github.io/$(echo $REPO | cut -d'/' -f2)/"
    + echo "✅ Docs deployed to https://${REPO%%/*}.github.io/${REPO##*/}/"
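
As a small illustration of the parameter expansion in the suggestion, the sketch below wraps it in a function. The name `pages_url` and the sample slug are hypothetical; the input is assumed to be in `owner/name` form.

```shell
# Hypothetical helper: derive the GitHub Pages URL from an "owner/name" slug.
pages_url() {
  local repo="$1"
  # %%/* removes everything from the first slash onward (leaves the owner);
  # ##*/ removes everything up to the last slash (leaves the repository name).
  printf 'https://%s.github.io/%s/\n' "${repo%%/*}" "${repo##*/}"
}

pages_url "octo-org/demo-repo"   # prints https://octo-org.github.io/demo-repo/
```

Both expansions run inside the shell itself, so no `echo | cut` subprocesses are spawned.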
.github/workflows/reusable-docs.yml
Outdated

    uv pip install sphinx sphinx-rtd-theme myst-parser --system

    # Install package for API docs
    uv pip install -e packages/extended-data-types --system

The hardcoded path packages/extended-data-types should use the $PKG_DIR variable for the current package (defined on line 32) or construct it properly if this is intentionally installing a dependency. If this is meant to be a different package, add a comment explaining why.
## Key Changes:
1. **ci.yml**: Added `version` job that runs pycalver ONCE at start
- Generates single CalVer version (YYYYMM.BUILD)
- Uploads versioned source as artifact
- All downstream jobs use this unified version
2. **reusable-sync.yml**: Downloads versioned artifact, syncs to public repos
- Commits show unified version in message
3. **reusable-release.yml**: Uses passed version, not its own
- No more individual version generation
- All packages release with same version
4. **reusable-docs.yml**: Uses unified version in docs
5. **pyproject.toml**: Fixed vendor-connectors path in pycalver config
- Was: cloud_connectors (wrong)
- Now: vendor_connectors (correct)
## Architecture:
```
pycalver bump (once)
↓
Upload versioned-source artifact
↓
sync → release → docs (all use same version)
```
Every CI run = same version for all packages that release.
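
One way a single `version` job could hand its result to downstream jobs is through a step output written to the `GITHUB_OUTPUT` file. The sketch below assumes the version string has already been produced (for instance by pycalver); the function name `export_version` and the loose format check are illustrative only.

```shell
# Hypothetical helper: record an already-computed version as a step output.
export_version() {
  local version="$1" out="${GITHUB_OUTPUT:-/dev/stdout}"
  # Loose sanity check for the YYYYMM.BUILD shape: six digits, a dot, then a digit.
  case "$version" in
    [0-9][0-9][0-9][0-9][0-9][0-9].[0-9]*) ;;
    *) echo "Error: unexpected version format: $version" >&2; return 1 ;;
  esac
  # GitHub Actions reads key=value lines appended to the file named by GITHUB_OUTPUT.
  echo "version=$version" >> "$out"
}

# Example call inside a workflow step:
#   export_version "202511.42"
```

Downstream jobs would then read the value from the `version` job's outputs rather than generating their own.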
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* fix(agentic-control): correct Anthropic model IDs and document model fetching
  - Fixed invalid model names (claude-4-opus → claude-sonnet-4-5-20250929)
  - Added documentation on how to fetch latest models from Anthropic API
  - Haiku 4.5 has structured output issues - use Sonnet 4.5 for triage
  - Updated README with model selection table
  - Recovered agent bc-e8225222 analysis (14 completed, 8 outstanding tasks)
  - Updated memory-bank with recovery context

  To get latest models:

      curl -s "https://api.anthropic.com/v1/models" \
        -H "x-api-key: $ANTHROPIC_API_KEY" \
        -H "anthropic-version: 2023-06-01"

* fix: address Amazon Q review feedback on model IDs
  - Fixed recovery report model ID: claude-sonnet-4-20250514 → claude-sonnet-4-5-20250929
  - Clarified wrong example: Opus 4.5 date is 20251101, not 20250514
  - Updated correct example to show Sonnet 4.5 as DEFAULT
* docs: formal agent handoff for vault-secret-sync and cluster-ops

  ## Spawned Agents
  1. **vault-secret-sync Agent** (bc-d68dcb7c-9938-45e3-afb4-3551a92a052e)
  - Repository: jbcom/vault-secret-sync
  - Mission: Complete CI, publish Docker/Helm, merge PR #1
  2. **cluster-ops Agent** (bc-a92c71bd-21d9-4955-8015-ac89eb5fdd8c)
  - Repository: fsc-platform/cluster-ops
  - Mission: Complete PR #154 integration

  ## Work Completed Before Handoff
  - Added Doppler store implementation
  - Added AWS Identity Center store for account discovery
  - Added CI/CD workflows
  - Updated Helm charts for jbcom registry
  - Addressed 23 AI review threads

  ## Next Steps
  Agents will:
  - Fix remaining CI issues
  - Publish Docker/Helm artifacts
  - Address new AI feedback
  - Prepare PRs for human review
* docs: update handoff - cluster-ops requires local agent

  Cursor Cloud agents cannot access fsc-platform/cluster-ops (permission issue). vault-secret-sync agent (bc-d68dcb7c) is running successfully. cluster-ops PR will need local agent or manual handling.
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* Checkpoint before follow-up message Co-authored-by: jon <jon@jonbogaty.com>
* refactor: consolidate fleet tooling to agentic-control
  - Remove cursor-fleet package (deprecated, replaced by agentic-control)
  - Update all .ruler files to reference agentic-control commands
  - Update process-compose.yml to use agentic-control
  - Add fsc-platform org to agentic.config.json (uses GITHUB_FSC_TOKEN)
  - Regenerate all agent instruction files via ruler apply
  - Update documentation to reflect new tooling

  BREAKING: cursor-fleet CLI commands replaced by agentic CLI:
  - cursor-fleet list -> agentic fleet list
  - cursor-fleet spawn -> agentic fleet spawn
  - cursor-fleet analyze -> agentic triage analyze
  - cursor-fleet review -> agentic triage review
* Update dependencies and add ruler package Co-authored-by: jon <jon@jonbogaty.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* docs: add secrets infrastructure unification tracker

  Tracks all PRs and issues for secrets infrastructure consolidation:
  - data-platform-secrets-syncing: Greenfielded, 2 proposal PRs
  - terraform-aws-secretsmanager: Deprecation PR #52
  - terraform-modules: Cleanup issues #225-229, PR #226
  - cluster-ops: Deployment PR #154

  Decision pending from FSC department heads.
* fix: address Gemini review feedback
  - Renamed 'Issue' column to 'Issue / PR' for consistency
  - Removed 'PR' prefix from #226 for visual uniformity
  - Improved ASCII diagram with box-drawing characters for clarity
* docs: improve ASCII diagram clarity per review feedback

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
…
* Fix CI/CD pipeline: robust error handling and proper versioning
* Fix YAML syntax error in docs workflow - heredoc breaking parser
* Unified CalVer: ONE version for ALL packages via pycalver

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Note
Decouples `enforce` and makes it non-blocking, hardens docs deployment, and switches releases to inline CalVer with automatic version update.
… (`.github/workflows/ci.yml`):
- `sync` no longer waits for `enforce` (runs in parallel).
- `enforce` marked non-blocking; now runs after `matrix`, `test`, and `lint`.

… (`reusable-release.yml`):
- … (`YYYY.M.BUILD`) and update `__version__` in package via `sed`.
- … `--skip-existing`.

… (`reusable-docs.yml`):
- … `PKG_DIR`; ensure `docs/_build`.
- … (`conf.py`); otherwise generate placeholder HTML.
- … `index.html`, add `.nojekyll`, commit only on changes, print deployed URL.

… (`reusable-enforce-standards.yml`):
- … `delete_file` helper with graceful error handling; only delete workflows if directory exists.
- … (`.github/CODEOWNERS`, `dependabot.yml`, `repo-standards.yml`); improved summary output.

Written by Cursor Bugbot for commit b282ce6.
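
The docs hardening summarized above (placeholder HTML plus `.nojekyll`) might be sketched as follows. The function name `prepare_site` and the placeholder markup are assumptions, not the content of `reusable-docs.yml`.

```shell
# Hypothetical helper: make sure a build directory is deployable to GitHub Pages.
prepare_site() {
  local build_dir="$1" package="$2"
  mkdir -p "$build_dir"
  # Only write a placeholder when the docs build produced no index.html.
  if [ ! -f "$build_dir/index.html" ]; then
    printf '<!DOCTYPE html>\n<html><body><h1>%s</h1><p>Documentation is not available yet.</p></body></html>\n' \
      "$package" > "$build_dir/index.html"
  fi
  # .nojekyll tells GitHub Pages to skip Jekyll processing, so directories
  # whose names start with an underscore (such as _static) are served.
  touch "$build_dir/.nojekyll"
}
```

The placeholder is written with `printf` rather than a heredoc, which keeps the snippet easy to embed in a YAML `run:` block.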