Merged
60 changes: 21 additions & 39 deletions .github/prompts/docs-self-healing.md
@@ -7,72 +7,56 @@ Your job is to process each PR merged into `strapi/strapi` in the last 24 hours

- `$STRAPI_SOURCE` — local checkout of `strapi/strapi` (read-only, for diffs)
- `$DOC_REPO` — local checkout of `strapi/documentation` (read + write, for creating PRs)
- `$FILTERED_PRS` — JSON array of pre-filtered PRs (chores, CI, deps, tests already excluded by the workflow)
- GitHub CLI (`gh`) is authenticated via `GH_TOKEN`
- Model: `claude-sonnet-4-6` (set in the workflow YAML; optimized for cost on batch automation)

## Step 1 — Identify merged PRs (last 24 hours)
## Step 1 — Read the pre-filtered PR list

Use the GitHub API to list PRs merged into `develop` in the last 24 hours:
The workflow has already fetched and filtered merged PRs. The list is in `$FILTERED_PRS` as a JSON array:

```bash
gh api repos/strapi/strapi/pulls \
--jq '[.[] | select(.merged_at != null and .base.ref == "develop") | {number, title, body, merged_at, html_url}]' \
-f state=closed \
-f sort=updated \
-f direction=desc \
-f per_page=50
```
```json
[{"number": 12345, "title": "feat: add feature X", "html_url": "https://github.com/strapi/strapi/pull/12345"}, ...]
```

Filter to only those whose `merged_at` is within the last 24 hours.
Parse this list. **Do NOT re-fetch PRs from the GitHub API** — the workflow already did that.

**Rate limit:** Process a maximum of 1 PR per run (testing mode). If more qualify, log the skipped ones
to stdout and they will be picked up on the next run.

## Step 2 — Check idempotency (per PR)

Before processing each PR, check whether a doc PR already exists for it by searching
the bodies of all PRs (open, closed, or merged) that carry the `auto-doc-healing` label:

```bash
gh pr list --repo strapi/documentation --label auto-doc-healing --state all \
--json body --jq '.[].body' | grep -q "strapi/strapi/pull/<NUMBER>"
```
## Step 2 — Read pre-fetched PR context (per PR)

If a match is found, skip the PR entirely. This ensures the workflow is idempotent
and recovers gracefully from partial failures.
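
The `grep -q` check above is a plain substring match, so a search for `strapi/strapi/pull/123` would also hit a body that links to `pull/1234`. A minimal sketch of a boundary-aware variant, assuming the doc PR bodies are available as plain text; the function name `has_doc_pr` is hypothetical and not part of the workflow:

```bash
# Hypothetical helper: succeed only if $1 (text of existing doc PR bodies)
# links to strapi/strapi PR number $2, and not to a longer number that
# merely shares the same leading digits.
has_doc_pr() {
  local bodies="$1" number="$2"
  # The PR number must be followed by a non-digit or by the end of the line.
  printf '%s\n' "$bodies" | grep -Eq "strapi/strapi/pull/${number}([^0-9]|\$)"
}

bodies='Source: https://github.com/strapi/strapi/pull/12345'
if has_doc_pr "$bodies" 1234; then
  echo "doc PR exists for #1234"
else
  echo "no doc PR yet for #1234"
fi
```

Anchoring the number this way keeps the skip decision tied to one exact PR; whether the extra strictness matters depends on how close together the PR numbers in a given batch are.
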
The workflow has already fetched the body and diff for each PR. Read them from:

## Step 3 — Get PR context (per PR)
- `/tmp/pr-<NUMBER>-body.txt` — PR description
- `/tmp/pr-<NUMBER>.diff` — full diff

For each PR to process, fetch the description and the diff:

```bash
gh api repos/strapi/strapi/pulls/<NUMBER> --jq '.body' > /tmp/pr-<NUMBER>-body.txt
gh api repos/strapi/strapi/pulls/<NUMBER>.diff > /tmp/pr-<NUMBER>.diff
```
**Do NOT fetch these from the GitHub API** — they are already on disk.

**Diff size threshold:** If the diff exceeds 3000 lines, skip this PR and log:
"PR #<NUMBER> skipped — diff too large (X lines), flag for manual /autodoc".

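The size threshold can be checked with nothing more than `wc -l`. The sketch below is an illustration under assumptions: the function name `check_diff_size` and its argument order are hypothetical, and the log message only approximates the wording quoted above.

```bash
# Hypothetical helper: print a skip message and return 1 when the diff file
# has more lines than the limit (default 3000); return 0 otherwise.
check_diff_size() {
  local number="$1" diff_file="$2" limit="${3:-3000}"
  local lines
  lines=$(wc -l < "$diff_file")
  lines=$((lines))  # strip the padding some wc implementations add
  if [ "$lines" -gt "$limit" ]; then
    echo "PR #${number} skipped - diff too large (${lines} lines), flag for manual /autodoc"
    return 1
  fi
  return 0
}
```

A caller could write `check_diff_size "$NUMBER" "/tmp/pr-${NUMBER}.diff" || continue` inside a loop over PR numbers, so oversized diffs are logged and skipped without stopping the run.
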
## Step 4 — Run the Router (per PR)
## Step 3 — Run the Router (per PR)

**Read these files once at the start of the run** (not per PR):
- Router prompt: `$DOC_REPO/agents/prompts/router.md`
- Sidebars: `$DOC_REPO/docusaurus/sidebars.js`
- Page index: `$DOC_REPO/docusaurus/static/llms.txt`

Then, for each PR, apply the Router logic using:
- PR title and description from Step 3
- The diff from Step 3
- PR title and description from Step 2
- The diff from Step 2

The Router will produce a YAML `targets` block.

**Skip the PR if:**
- The Router finds no targets
- The Router sets `ask_user` (log the question to stdout for manual handling)
- The PR is purely: tests, dependency bumps, internal refactors, chore commits, CI changes,
translations, typo fixes in code comments

## Step 5 — Run the documentation pipeline (per PR with targets)
Note: chores, CI, deps, tests, and translations are already filtered out by the workflow
before Claude runs. The Router only sees PRs that passed the pre-filter.
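
The note above refers to the workflow's title-based pre-filter. As a rough illustration, the title patterns used by the `jq` filter in `docs-self-healing.yml` can be expressed in plain bash with `grep -Ei`. The function name `title_passes_prefilter` is hypothetical, and the sketch assumes case-insensitive matching, as in the `jq` call.

```bash
# Hypothetical helper: return 0 if a PR title should reach the Router,
# 1 if the title-based pre-filter would drop it.
title_passes_prefilter() {
  local title="$1"
  # Patterns mirror the workflow's jq filter; all are matched case-insensitively.
  local patterns=(
    '^chore'
    '^test[(:[:space:]]'
    '^docs:'
    '^security: package'
    'translation[s]?$'
    'typo'
    '^(remove|update) (yarn|readme)'
  )
  local pattern
  for pattern in "${patterns[@]}"; do
    if printf '%s\n' "$title" | grep -Eiq -- "$pattern"; then
      return 1  # excluded: never sent to the Router
    fi
  done
  return 0
}

for t in 'feat: add feature X' 'chore(deps): bump lodash' 'Update README'; do
  if title_passes_prefilter "$t"; then echo "keep: $t"; else echo "drop: $t"; fi
done
```

Because the `typo` pattern is unanchored, it also drops titles that merely contain those letters, such as `feat: improve typography`; that holds for the sketch and for any filter using the same pattern.
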

## Step 4 — Run the documentation pipeline (per PR with targets)

For each PR where the Router identified targets, run the Create/Update Mode pipeline.

@@ -111,7 +95,7 @@ Authoring guides are small and target-specific — read them per target, not upf
**Templates:** For `create_page` targets, load the relevant template from `$DOC_REPO/agents/templates/`
based on the Router's `doc_type`.

## Step 6 — Create branch and draft PR (per PR)
## Step 5 — Create branch and draft PR (per PR)

After the Drafter has produced output for all targets:

@@ -151,7 +135,7 @@ git clean -fd
git reset --hard origin/main
```

## Step 7 — Write run summary
## Step 6 — Write run summary

After processing all PRs (or if none qualify), write a JSON summary to `/tmp/self-healing-summary.json`:

@@ -164,9 +148,6 @@ After processing all PRs (or if none qualify), write a JSON summary to `/tmp/sel
{"number": 12346, "title": "Fix typo in test", "reason": "Router: no doc update needed"},
{"number": 12347, "title": "Massive refactor", "reason": "Diff too large (4200 lines)"}
],
"already_processed": [
{"number": 12340, "title": "Update middleware", "reason": "Existing PR found with auto-doc-healing label"}
],
"errors": [
{"number": 12348, "title": "Add plugin Y", "error": "Drafter failed after retry"}
]
@@ -186,3 +167,4 @@ After processing all PRs (or if none qualify), write a JSON summary to `/tmp/sel
- **Max 5 PRs per run** — log skipped PRs to stdout
- **Max 3000 lines per diff** — skip and log oversized diffs
- **Never modify workflow files, configuration files, or sidebars.js**
- **NEVER run any write operation on strapi/strapi** — no issues, no comments, no PRs, no pushes, no API calls that modify state. The GH_TOKEN has write access but this workflow ONLY writes to strapi/documentation. Read-only access to strapi/strapi (diffs, PR bodies) is the only permitted use.
133 changes: 109 additions & 24 deletions .github/workflows/docs-self-healing.yml
@@ -29,25 +29,127 @@ jobs:
token: ${{ secrets.PAT_TOKEN_PIWI }}
fetch-depth: 50

- name: Check for merged PRs in last 24 hours
- name: List and filter merged PRs in last 24 hours
id: check-prs
env:
GH_TOKEN: ${{ secrets.PAT_TOKEN_PIWI }}
run: |
echo "Checking for PRs merged into strapi/strapi develop in the last 24h..."
SINCE=$(date -u -d '24 hours ago' '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null || date -u -v-24H '+%Y-%m-%dT%H:%M:%SZ')
PR_COUNT=$(gh api search/issues \

# Fetch all merged PRs with metadata
gh api search/issues \
--method GET \
-f q="repo:strapi/strapi is:pr is:merged base:develop merged:>=$SINCE" \
-f per_page=50 \
--jq '.total_count')
echo "Found $PR_COUNT merged PRs in the last 24h"
if [ "$PR_COUNT" -eq 0 ]; then
--jq '.items | [.[] | {number, title, html_url: .pull_request.html_url}]' > /tmp/all-prs.json

TOTAL=$(jq 'length' /tmp/all-prs.json)
echo "Found $TOTAL merged PRs in the last 24h"

# Filter out PRs that never need documentation.
# Based on analysis of 100+ merged PR titles in strapi/strapi.
#
# EXCLUDED: chore(*), test(*), docs:, security: package,
# *translation(s), *typo, Remove/Update yarn/README
#
# KEPT (Router decides): feat, fix, enhancement, and anything else
jq '[.[] | select(
(.title | test("^chore"; "i") | not) and
(.title | test("^test[:(\\s]"; "i") | not) and
(.title | test("^docs:"; "i") | not) and
(.title | test("^security: package"; "i") | not) and
(.title | test("translation[s]?$"; "i") | not) and
(.title | test("typo"; "i") | not) and
(.title | test("^(Remove|Update) (yarn|README)"; "i") | not)
)]' /tmp/all-prs.json > /tmp/filtered-prs.json

FILTERED=$(jq 'length' /tmp/filtered-prs.json)
EXCLUDED=$((TOTAL - FILTERED))
echo "After title filtering: $FILTERED candidates ($EXCLUDED excluded as chore/CI/deps/test)"

# Check idempotency: remove PRs that already have a doc PR
EXISTING_BODIES=$(gh pr list --repo strapi/documentation --label auto-doc-healing --state all \
--json body --jq '.[].body' 2>/dev/null || echo "")

jq --arg bodies "$EXISTING_BODIES" '[.[] | select(
($bodies | contains("strapi/strapi/pull/" + (.number | tostring))) | not
)]' /tmp/filtered-prs.json > /tmp/new-prs.json

ALREADY=$((FILTERED - $(jq 'length' /tmp/new-prs.json)))
FINAL=$(jq 'length' /tmp/new-prs.json)

if [ "$ALREADY" -gt 0 ]; then
echo "Idempotency: $ALREADY PRs already have a doc PR, skipped"
fi

echo "Final candidates for Claude: $FINAL"

# Set outputs based on final filtered list
if [ "$FINAL" -eq 0 ]; then
echo "has_prs=false" >> $GITHUB_OUTPUT
else
echo "has_prs=true" >> $GITHUB_OUTPUT
fi

# Pass the final list to Claude
{
echo "pr_list<<PR_LIST_EOF"
cat /tmp/new-prs.json
echo "PR_LIST_EOF"
} >> $GITHUB_OUTPUT

# Log detailed pre-filtering results in summary
echo "### Pre-filtering ($TOTAL merged PRs in 24h)" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY

# List excluded PRs
if [ "$EXCLUDED" -gt 0 ]; then
echo "**Excluded by title filter ($EXCLUDED):**" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
jq -r --argjson filtered "$(cat /tmp/filtered-prs.json)" \
'[.[] | select(.number as $n | $filtered | map(.number) | index($n) | not)]
| .[] | "- ~~#\(.number) — \(.title)~~"' /tmp/all-prs.json >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
fi

# List already processed PRs
if [ "$ALREADY" -gt 0 ]; then
echo "**Already processed ($ALREADY):**" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
jq -r --argjson newprs "$(cat /tmp/new-prs.json)" \
'[.[] | select(.number as $n | $newprs | map(.number) | index($n) | not)]
| .[] | "- #\(.number) — \(.title) *(doc PR already exists)*"' /tmp/filtered-prs.json >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
fi

# List candidates sent to Claude
if [ "$FINAL" -gt 0 ]; then
echo "**Sent to Claude ($FINAL):**" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
jq -r '.[] | "- **#\(.number) — \(.title)**"' /tmp/new-prs.json >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
fi

- name: Pre-fetch PR diffs and bodies
if: steps.check-prs.outputs.has_prs == 'true'
env:
GH_TOKEN: ${{ secrets.PAT_TOKEN_PIWI }}
run: |
for NUMBER in $(jq -r '.[].number' /tmp/new-prs.json); do
echo "Fetching body and diff for PR #$NUMBER..."
gh api "repos/strapi/strapi/pulls/$NUMBER" --jq '.body' > "/tmp/pr-${NUMBER}-body.txt" 2>/dev/null || echo "" > "/tmp/pr-${NUMBER}-body.txt"
gh api "repos/strapi/strapi/pulls/$NUMBER.diff" > "/tmp/pr-${NUMBER}.diff" 2>/dev/null || echo "" > "/tmp/pr-${NUMBER}.diff"

DIFF_LINES=$(wc -l < "/tmp/pr-${NUMBER}.diff")
echo " PR #$NUMBER: body $(wc -c < "/tmp/pr-${NUMBER}-body.txt") bytes, diff $DIFF_LINES lines"

# Flag oversized diffs
if [ "$DIFF_LINES" -gt 3000 ]; then
echo " ⚠️ PR #$NUMBER diff exceeds 3000 lines, will be skipped by Claude"
fi
done

- name: Load prompt from file
if: steps.check-prs.outputs.has_prs == 'true'
id: load-prompt
@@ -73,15 +175,7 @@ jobs:
STRAPI_SOURCE: ${{ github.workspace }}/.strapi-source
DOC_REPO: ${{ github.workspace }}
GH_TOKEN: ${{ secrets.PAT_TOKEN_PIWI }}

- name: Save Claude log as artifact
if: always() && steps.check-prs.outputs.has_prs == 'true'
uses: actions/upload-artifact@v4
with:
name: self-healing-log-${{ github.run_number }}
path: /tmp/self-healing-summary.json
retention-days: 30
if-no-files-found: warn
FILTERED_PRS: ${{ steps.check-prs.outputs.pr_list }}

- name: Summary
if: always()
@@ -121,15 +215,6 @@ jobs:
echo "" >> $GITHUB_STEP_SUMMARY
fi

# Already processed
ALREADY=$(jq -r '.already_processed | length' "$SUMMARY_FILE")
if [ "$ALREADY" -gt 0 ]; then
echo "### Already processed ($ALREADY)" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
jq -r '.already_processed[] | "- strapi/strapi#\(.number) — \(.title): \(.reason)"' "$SUMMARY_FILE" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
fi

# Errors
ERRORS=$(jq -r '.errors | length' "$SUMMARY_FILE")
if [ "$ERRORS" -gt 0 ]; then
@@ -140,6 +225,6 @@ jobs:
fi

# Totals
TOTAL=$((PROCESSED + SKIPPED + ALREADY + ERRORS))
TOTAL=$((PROCESSED + SKIPPED + ERRORS))
echo "---" >> $GITHUB_STEP_SUMMARY
echo "**Total PRs scanned:** $TOTAL | **Doc PRs created:** $PROCESSED | **Skipped:** $SKIPPED | **Errors:** $ERRORS" >> $GITHUB_STEP_SUMMARY