-
Notifications
You must be signed in to change notification settings - Fork 119
Description
Summary
Port the comprehensive automated documentation freshness checking infrastructure implemented in Azure-Samples/azure-nvidia-robotics-reference-architecture to hve-core for broader reuse across Microsoft projects. The system detects and tracks stale markdown files based on ms.date frontmatter timestamps using PowerShell validation scripts, GitHub Actions workflows, and automated issue creation.
Context
Successfully implemented and merged in Azure-Samples/azure-nvidia-robotics-reference-architecture#448 to address issue #164. The implementation completed 57 commits with comprehensive testing, creating a production-ready solution with PowerShell validation scripts, Pester test coverage, GitHub Actions workflows, and automated issue tracking.
Workflow test results: Successfully validated in workflow run 22880060875, creating 10 GitHub issues for stale documentation files.
Implementation Overview
The architecture uses a dual-orchestrator pattern separating weekly comprehensive scans (hard-fail mode with GitHub notifications) from PR-scoped validation (soft-fail warnings). Artifact-based data flow connects the validation workflow to the issue automation workflow, creating GitHub issues for each stale file with idempotent duplicate detection via automation markers.
Key Components
-
PowerShell Validation Script (
Invoke-MsDateFreshnessCheck.ps1)- Parses YAML frontmatter from markdown files
- Calculates age in days since
ms.datetimestamp - Generates dual outputs: JSON artifact + markdown summary
- Supports changed-files-only mode via Git merge-base comparison
- Configurable staleness threshold (default: 90 days)
-
GitHub Actions Workflows
msdate-freshness-check.yml— Reusable workflow with configurable threshold, soft-fail mode, and changed-files-only filteringweekly-validation.yml— Scheduled Monday 9 AM UTC scans with hard-fail mode triggering notificationspr-validation.yml— Immediate developer feedback during code review without blocking mergescreate-stale-docs-issues.yml— Issue automation creating/updating GitHub issues using automation markers
-
Pester Test Suite
- File discovery and exclusion patterns
- YAML frontmatter parsing with date format validation
- Report generation (JSON + markdown output formats)
- Integration tests covering full processing pipeline
- CI annotation verification using mock patterns
- Environment isolation with conditional execution based on PowerShell-Yaml availability
Execution Contexts
- Weekly scans: Scheduled for Monday 9 AM UTC, checks all markdown files, uses hard-fail mode to trigger GitHub email notifications when stale files exceed threshold
- PR checks: Runs automatically on pull requests, checks only modified files, uses soft-fail mode to warn without blocking merges
- Manual dispatch: Available via GitHub Actions UI workflow_dispatch on both orchestrator workflows
Technical Learnings
PowerShell Parameter Binding Issues (30+ commits of debugging)
The commit history documents extensive iterative debugging of PowerShell 7.4.13 array parameter binding issues in GitHub Actions:
- Root cause: PowerShell 7.4.13 bug on ubuntu-latest runners with
[string[]]array defaults - Approaches tested: hashtable splatting, array splatting, string conversion, Invoke-Expression, direct parameter binding
- Final solution: Moved parameter defaults outside param block to runtime assignment with
[AllowNull()]and[AllowEmptyCollection()]attributes
This debugging pattern demonstrates environment-specific edge cases invisible in local development, highlighting the value of CI-based validation.
Files to Port
From PR #448 (9 files changed, 1185 additions):
Core Implementation
scripts/linting/Invoke-MsDateFreshnessCheck.ps1— PowerShell validation scriptscripts/tests/linting/Invoke-MsDateFreshnessCheck.Tests.ps1— Pester test suite.github/workflows/msdate-freshness-check.yml— Reusable workflow.github/workflows/weekly-validation.yml— Weekly scheduled scan orchestrator.github/workflows/pr-validation.yml— PR validation orchestrator (integration point).github/workflows/create-stale-docs-issues.yml— Issue automation workflow
Documentation
docs/contributing/documentation-maintenance.md— Updated with implementation details and remediation workflow
Supporting Configuration
.markdownlint-cli2.jsonc— Exclusion oflogs/**directory from linting.cspell/general-technical.txt— Addedmsdateto custom dictionary
Acceptance Criteria for hve-core Port
- All PowerShell scripts ported to hve-core with repository-agnostic paths
- Pester test suite passing in hve-core CI environment
- GitHub Actions workflows generalized for reuse across Microsoft projects
- Documentation created explaining integration steps for consuming repositories
- Issue automation workflow templates provided with clear customization points
- PowerShell parameter binding workarounds documented for future maintenance
- Validation results artifact retention configurable (default: 30 days)
- Staleness threshold configurable per repository (default: 90 days)
- Test coverage matches or exceeds original implementation (comprehensive Pester suite)
- Example integration provided (similar to hve-core instruction file patterns)
Validation Metrics from Original Implementation
- Files checked: 63 markdown files
- Stale files detected: 10 (8 test fixtures at 419 days, 2 READMEs at 96 days)
- JSON report size: 9.1K
- Markdown summary size: 958 bytes
- Issues created: 10 GitHub issues (chore(devcontainer): pin base image version and align Node.js to v20 #513-docs(readme): update Quick Start to reflect collection-based extension and installer options #522) with automation markers
- Test coverage: Comprehensive Pester suite with integration tests
- Linting validation: 82 markdown files, 327 spell-checked files, 0 errors
Documentation Impact
- No documentation changes needed
- Documentation should be created explaining:
- Integration steps for consuming repositories
- Workflow customization parameters
- PowerShell environment requirements
- Issue automation marker conventions
- Troubleshooting PowerShell parameter binding issues
Priority Justification
This infrastructure improves compliance with the OpenSSF documentation_current criterion through automated staleness detection. The manual approach of relying on contributor memory is error-prone. Successfully tested implementation with production validation results.
References
- Original Issue: feat(workflows): add automated ms.date freshness checking for documentation Azure-Samples/azure-nvidia-robotics-reference-architecture#164
- Implementation PR: feat(build): add automated ms.date freshness checking Azure-Samples/azure-nvidia-robotics-reference-architecture#448
- Test Workflow Run: https://github.com/Azure-Samples/azure-nvidia-robotics-reference-architecture/actions/runs/22880060875/job/66380653143?pr=448
- Created Issues: #513-#522
Additional Notes
The porting effort should focus on generalizing repository-specific paths and configuration while preserving the dual-orchestrator pattern and artifact-based data flow. The PowerShell parameter binding workarounds are critical for ubuntu-latest GitHub Actions runners and should be documented for future maintenance.
Issue automation markers (<!-- automation:stale-docs -->) enable idempotent duplicate detection across workflow runs. This pattern could be valuable for other hve-core automation scenarios.