CLI tool for documentation quality - validate code snippets, detect broken links, auto-fix issues, and integrate with CI/CD.
- Code Snippet Validation - Validate code examples against actual source code using tree-sitter
- Link Checking - Internal files, external URLs, GitHub repos, anchors
- Auto-fixing - Outdated snippets, missing extensions, anchor typos, case issues
- Smart Caching - SQLite-based with 24h TTL, batch operations
- CODEOWNERS Support - Group issues by team, create PRs per owner
- CI/CD Ready - JSON/Markdown output, GitHub annotations, exit codes
# Via pip (recommended)
pip install clean-docs # Core features
pip install 'clean-docs[snippets]' # + Code snippet validation
pip install 'clean-docs[semantic]' # + AI-powered analysis
pip install 'clean-docs[snippets,semantic]' # All features
# Or via curl installer
curl -fsSL https://raw.githubusercontent.com/Algiras/clean-docs/main/install.sh | bash

# Check setup
clean-docs doctor
# Scan documentation for broken links
clean-docs scan ./docs
# Validate code snippets against source
clean-docs validate-snippets ./docs --code-dir ./src
# Auto-fix issues
clean-docs scan ./docs --fix --yes

# Basic scan
clean-docs scan ./docs
# Fast mode (internal links only)
clean-docs scan ./docs --internal-only
# With options
clean-docs scan ./docs \
--verbose \
--timeout 30 \
--retry 3 \
--fail-fast

# Console (default)
clean-docs scan ./docs
# JSON
clean-docs scan ./docs --format json
# Markdown report
clean-docs scan ./docs --format markdown --output report.md
# GitHub Actions annotations
clean-docs scan ./docs --github-annotations

# Preview fixes
clean-docs scan ./docs --fix --dry-run
# Interactive (prompts for each fix)
clean-docs scan ./docs --fix
# Auto-fix all
clean-docs scan ./docs --fix --yes

For monorepos, group issues by team and create separate PRs:
# View ownership
clean-docs owners ./docs/api.md
# Scan grouped by owner
clean-docs scan . --group-by-owner
# Create PRs per team
clean-docs fix-prs . --codeowners CODEOWNERS
# Only for specific team
clean-docs fix-prs . --only-owner @myteam/docs

Validate that code examples in documentation match actual source code:
# Install with snippet validation support
pip install 'clean-docs[snippets]'
# Validate snippets against source code
clean-docs validate-snippets ./docs --code-dir ./src
# Preview what would be fixed
clean-docs validate-snippets README.md --fix --dry-run
# Auto-fix outdated snippets
clean-docs validate-snippets ./docs --fix
# Adjust similarity threshold (default: 0.8)
clean-docs validate-snippets . --threshold 0.7
# Output as JSON for CI
clean-docs validate-snippets . --format json

Supported languages: Java, Python, Scala, TypeScript, JavaScript, Go, Rust, Bazel
How it works:
- Extracts code blocks from markdown files
- Parses source code using tree-sitter to index symbols
- Matches snippets to source using file hints, symbol names, and code similarity
- Reports outdated examples with diffs and suggested fixes
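The matching step can be illustrated with a minimal sketch. It is not the tool's implementation: it swaps tree-sitter symbol indexing for Python's standard `difflib`, and the names `similarity`, `find_best_match`, and `classify` are hypothetical. It only shows how a similarity threshold (0.8 by default, per `--threshold`) could separate an up-to-date snippet, an outdated one, and one with no source match.

```python
from difflib import SequenceMatcher


def _normalize(text: str) -> str:
    # Strip each line so indentation changes do not count as drift.
    return "\n".join(line.strip() for line in text.strip().splitlines())


def similarity(snippet: str, source: str) -> float:
    """Return a 0..1 similarity ratio between two pieces of code."""
    return SequenceMatcher(None, _normalize(snippet), _normalize(source)).ratio()


def find_best_match(snippet: str, sources: dict, threshold: float = 0.8):
    """Return (name, score) for the closest source at or above threshold, else None."""
    scored = ((name, similarity(snippet, code)) for name, code in sources.items())
    best = max(scored, key=lambda pair: pair[1], default=None)
    if best is None or best[1] < threshold:
        return None
    return best


def classify(snippet: str, sources: dict, threshold: float = 0.8) -> str:
    """Label a snippet as 'up-to-date', 'outdated', or 'no-match' (hypothetical labels)."""
    match = find_best_match(snippet, sources, threshold)
    if match is None:
        return "no-match"
    return "up-to-date" if match[1] == 1.0 else "outdated"
```

Lowering the threshold makes the matcher more willing to pair a drifted snippet with its source, at the cost of more false matches.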
Find orphaned docs and missing documentation using embeddings:
# Install with semantic support
pip install 'clean-docs[semantic]'
# Find docs with no related code
clean-docs semantic . --orphaned
# Find code without documentation
clean-docs semantic . --missing
# Both with custom threshold
clean-docs semantic . --orphaned --missing --threshold 0.6
# Specify directories
clean-docs semantic . --docs ./docs --code ./src

# View stats
clean-docs cache --stats
# Show broken links
clean-docs cache --broken
# Clear expired
clean-docs cache --cleanup
# Clear all
clean-docs cache --clear

Create .clean-docs.yaml:
links:
  timeout: 10 # HTTP timeout (seconds)
  concurrency: 20 # Parallel checks
  ignore_patterns:
    - "localhost"
    - "127.0.0.1"
    - "example.com"
cache:
  ttl_hours: 24

name: Docs Check
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install and check
        run: |
          pip install -e .
          clean-docs scan . --github-annotations --internal-only
      - name: Report on failure
        if: failure()
        run: |
          clean-docs scan . --format markdown >> $GITHUB_STEP_SUMMARY || true

| Code | Meaning |
|---|---|
| 0 | All checks passed |
| 1 | Issues found (broken links, outdated snippets) |
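
A wrapper script can branch on these exit codes. The sketch below is illustrative only: `describe_exit_code` and `run_scan` are hypothetical helper names, and treating any code other than 0 or 1 as an unexpected failure is an assumption, since the table lists only those two values.

```python
import subprocess

# Meanings taken from the exit code table above.
EXIT_MEANINGS = {
    0: "All checks passed",
    1: "Issues found (broken links, outdated snippets)",
}


def describe_exit_code(code: int) -> str:
    """Translate an exit code into a human-readable message."""
    # Assumption: anything outside the documented codes is an unexpected failure.
    return EXIT_MEANINGS.get(code, f"Unexpected exit code {code}")


def run_scan(path: str = "./docs") -> int:
    """Run `clean-docs scan` on path and return its exit code.

    Requires clean-docs to be installed and on PATH.
    """
    result = subprocess.run(
        ["clean-docs", "scan", path, "--internal-only"],
        check=False,  # inspect the code ourselves instead of raising
    )
    print(describe_exit_code(result.returncode))
    return result.returncode
```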
| Type | Example |
|---|---|
| Internal | ./file.md, ../docs/guide.md |
| Anchors | #section, ./file.md#anchor |
| External | https://example.com |
| GitHub | github.com/user/repo/blob/main/file.md |
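
As a rough illustration of how the four link types in the table could be told apart, here is a minimal sketch. The function name `classify_link` and the order of the checks are assumptions, not the tool's actual logic.

```python
from urllib.parse import urlparse

GITHUB_HOSTS = {"github.com", "www.github.com"}


def classify_link(href: str) -> str:
    """Classify a link as 'github', 'external', 'anchor', or 'internal'."""
    if href.startswith(("http://", "https://")):
        host = urlparse(href).netloc.lower()
        return "github" if host in GITHUB_HOSTS else "external"
    # The table shows GitHub links without a scheme, so accept that form too.
    if href.startswith("github.com/"):
        return "github"
    # Covers both same-page anchors (#section) and file anchors (./file.md#anchor).
    if "#" in href:
        return "anchor"
    return "internal"
```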
| Fixable | Example |
|---|---|
| Outdated code snippets | Updates examples to match current source |
| Missing extension | ./file → ./file.md |
| Anchor normalization | #My-Section → #my-section |
| Case sensitivity | ./File.md → ./file.md |
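
The three link fixes in the table can be sketched as small string transformations. This is a simplified illustration under stated assumptions: the helper names and the `known_files` set are hypothetical, anchor normalization is reduced to lowercasing (the only change the example shows), and the real tool may apply further rules.

```python
from pathlib import PurePosixPath


def normalize_anchor(anchor: str) -> str:
    """Lowercase an anchor, e.g. '#My-Section' becomes '#my-section'."""
    return anchor.lower()


def add_missing_extension(link: str, known_files: set) -> str:
    """Append '.md' when the link has no extension and the result exists."""
    if PurePosixPath(link).suffix:
        return link
    candidate = link + ".md"
    return candidate if candidate in known_files else link


def fix_case(link: str, known_files: set) -> str:
    """Replace a link with the known file that matches it case-insensitively."""
    by_lower = {name.lower(): name for name in known_files}
    return by_lower.get(link.lower(), link)
```

Each helper returns the link unchanged when no safe fix is found, which mirrors the split between fixable issues and those left for manual review.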
Manual review needed:
- External 404s
- Deleted files with no redirect
- Semantic anchor changes
- Code snippets with no source match
Clean Docs is available as an Agent Skill for AI coding assistants like Claude Code, Cursor, and others.
# Install the skill (example for skills-compatible agents)
npx skills add Algiras/clean-docs

The skill enables AI agents to check documentation quality, find broken links, and validate code snippets automatically.
See skills/clean-docs/SKILL.md for the skill definition.
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Run specific test
pytest tests/test_clean_docs.py::TestCache -v

MIT License - see LICENSE.
