
Improve social review: critique loop, writing guide, and workflow fixes#18498

Merged
adamgordonbell merged 10 commits into master from improve/social-review-workflow
Apr 14, 2026

Conversation

Contributor

@adamgordonbell adamgordonbell commented Apr 13, 2026

Summary

Overhauls the social media review skill and supporting CI workflows based on issues observed in #18493, where social copy took 3 round trips to get right.

Skill restructure

The single social-media-review.md file is now a directory-based skill with separated concerns:

  • SKILL.md — thin orchestrator with explicit goal, principles, and a 6-step flow
  • references/critique-rubric.md — standalone rubric for the critique sub-agent with borderline judgment examples, soft-fail threshold pairs, and gap-only failure cases
  • references/writing-guide.md — craft guide with setup/gap/pointer structure, cross-platform adaptation example, and diverse bad-copy examples

New behavior: draft + critique loop on FAIL

When posts fail or platforms are missing, the reviewer drafts suggested replacement copy, verifies character counts programmatically, validates it via a critique sub-agent with a clean context, and iterates through up to 2 revisions. It then posts the suggested copy with a CTA for the author to apply it.
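The bounded draft-and-critique loop described above can be sketched in Python. This is a hypothetical illustration, not the skill's actual implementation: `draft_copy`, `critique`, and `post_suggestion` are stand-in callables, and in the real skill the critique runs in a sub-agent with a clean context rather than a local function.

```python
# Sketch of the bounded draft + critique loop. All callables here are
# hypothetical stand-ins for the skill's actual steps.

MAX_REVISIONS = 2  # 3 critiques total: the initial draft plus 2 revisions


def review_loop(platforms, draft_copy, critique, post_suggestion):
    """Draft copy per platform, critique it, and revise failing drafts
    at most MAX_REVISIONS times before posting the best attempt."""
    drafts = {p: draft_copy(p) for p in platforms}
    for attempt in range(MAX_REVISIONS + 1):
        verdicts = {p: critique(p, drafts[p]) for p in platforms}
        failing = [p for p, v in verdicts.items() if v != "PASS"]
        if not failing:
            break
        if attempt == MAX_REVISIONS:
            break  # cap reached: post the best attempt rather than loop forever
        for p in failing:
            drafts[p] = draft_copy(p)  # revise only the failing platforms
    post_suggestion(drafts)
    return drafts
```

With a critique that always fails, the loop performs exactly three drafts per platform and then stops, which is the behavior the cap is meant to guarantee.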

Workflow changes

  • claude-social-review.yml: Added workflow_dispatch for manual triggering. Added Agent and Bash(python3:*) to allowed tools
  • claude.yml: Swapped GITHUB_TOKEN for PULUMI_BOT_TOKEN (via ESC) so Claude pushes trigger downstream workflows

Rubric additions

  • No-summaries and LLM-speak soft-fail heuristics
  • Character limits as hard-fail with URL overhead (X: 255, LinkedIn: 2950, Bluesky: 300)
  • Bluesky structure rule
  • Programmatic character count verification
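The hard-fail limits listed above lend themselves to a mechanical check. A minimal sketch (the per-platform budgets are the body limits after URL overhead stated in this PR; the function name is illustrative, not part of the skill):

```python
# Body-character budgets after URL overhead, per the rubric additions
# (X: 255, LinkedIn: 2950, Bluesky: 300).
BODY_LIMITS = {"x": 255, "linkedin": 2950, "bluesky": 300}


def check_length(platform, body):
    """Return (ok, count, limit) for a draft's body text."""
    limit = BODY_LIMITS[platform]
    count = len(body)
    return count <= limit, count, limit
```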

Test plan

  • Comparison post (Add blog post: AI coding agent orchestration frameworks compared #18493): critique loop caught LinkedIn over-revealing, revised successfully
  • Product feature post (Neo Plan Mode): rubric caught press-release originals, writer produced problem-first copy
  • Verify claude.yml ESC secrets resolve (PULUMI_BOT_TOKEN)
  • Test manual workflow_dispatch trigger
  • Test that Claude pushes trigger social review workflow

… re-trigger

- Add soft fail heuristics for article summaries and LLM-speak patterns
- On FAIL or missing platforms, reviewer now drafts suggested replacement
  copy validated via critique sub-agent loop
- Add Agent tool to social review workflow for sub-agent validation
- Use PULUMI_BOT_TOKEN in claude.yml so pushes trigger downstream workflows

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

claude Bot commented Apr 13, 2026

Docs Review

PR: Improve social review: rubric additions, suggested copy, and re-trigger
Files reviewed: .claude/commands/social-media-review.md, .github/workflows/claude-social-review.yml, .github/workflows/claude.yml

Overall this is a well-structured change that closes a clear loop (review → suggest → apply → re-validate). A few items to address or verify:


Issues

1. [Medium] Unbounded critique loop could run away in CI
social-media-review.md, "Drafting suggested copy" step 4 (line 108):

"If the sub-agent returns FAIL, revise and re-validate. Repeat until all platforms pass."

There's no iteration cap. If the LLM writes copy that consistently triggers its own soft-fail heuristics (easy to do with the new LLM-pattern detector), this loop runs until the CI job times out, burning tokens the whole way. Consider adding a maximum (e.g., "up to 3 attempts; if still failing, post the best attempt with a note that it didn't fully pass self-review").

2. [Medium] Missing environment key on claude.yml job — verify OIDC compatibility
claude.yml uses pulumi/esc-action@v1 with OIDC auth (lines 33-35, 94-99) but the job has no environment: key. The social review workflow (claude-social-review.yml line 18) sets environment: production for the same ESC configuration. GitHub includes the environment claim in the OIDC token only when environment is set — if the ESC OIDC trust policy checks this claim, the secret fetch will fail silently in claude.yml but work in the social review workflow. Please verify the ESC trust policy accepts tokens without an environment claim, or add environment: production (or the appropriate value) to the claude job.
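If the ESC trust policy does require the environment claim, the fix is a one-line addition to the job in claude.yml. A sketch (job name and runner are illustrative):

```yaml
jobs:
  claude:
    runs-on: ubuntu-latest
    # Makes GitHub include the environment claim in the OIDC token,
    # matching what claude-social-review.yml already does.
    environment: production
```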

3. [Low] ESC secrets fetched before access check
claude.yml lines 33-37: The "Fetch secrets from ESC" step runs before the "Check repository write access" step. While the Claude action is gated on the access check (line 74), ESC credentials are fetched on every invocation — including unauthorized ones. Consider either:

  • Moving the ESC step after the access check, or
  • Adding if: steps.check-access.outputs.has_write_access == 'true' to the ESC step (which would require reordering steps)

This is low-priority since no secrets leak to untrusted code, but it avoids unnecessary ESC calls.
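One way to implement the suggested gating, sketched as a workflow fragment (step names and the `check-access` id are illustrative and would need to match the actual workflow):

```yaml
steps:
  - name: Check repository write access
    id: check-access
    # ... existing access check ...

  - name: Fetch secrets from ESC
    # Skip the ESC call entirely for unauthorized invocations.
    if: steps.check-access.outputs.has_write_access == 'true'
    uses: pulumi/esc-action@v1
    # ... existing ESC configuration ...
```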

4. [Low] "Keep guidance short" constraint may confuse the reviewer
social-media-review.md line 129: The constraints still say "Keep guidance short and general (no line edits)." The new drafting process produces full replacement copy, which isn't "guidance" per se, but the instruction is ambiguous. Consider clarifying, e.g., "Keep guidance notes short and general — suggested copy is provided separately under its own heading."


Observations (non-blocking)

  • Soft-fail heuristic length (lines 57-58): The LLM-characteristic writing patterns bullet is quite dense. Consider splitting into sub-bullets for readability (e.g., separate "em dash chains," "staccato fragments," and "constructed parallelism" into their own items).

  • PASS/FAIL reorder: Moving PASS before FAIL in the output format section is a nice touch — simpler case first.

  • BUILD-AND-DEPLOY.md: Checked — it references claude.yml and claude-social-review.yml at a high level. The current descriptions remain accurate; no update needed for these operational changes.

  • Agent in allowed tools (claude-social-review.yml line 94): Correctly added to support the new sub-agent critique loop.


To request further reviews or fixes, mention @claude in a comment.

Collaborator

pulumi-bot commented Apr 13, 2026

- Add workflow_dispatch to social review workflow with PR number input
- Refactor workflow to resolve PR info from either event type
- Make social-media-review skill user-invocable for local testing
- Add local context: reads blog post directly, outputs to terminal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move CTA to CI context only, clean up drafting section to be
context-neutral, make local flow self-contained.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add platform character limits as hard fail rule (X: 255, LinkedIn: 2950,
  Bluesky: 300), accounting for URL overhead per platform
- Require python3 character count verification in drafting process
- Add Bash(python3:*) to CI workflow allowed tools for counting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d examples

Restructure the social-media-review skill from a single file into a
directory-based skill with separated concerns:

- SKILL.md: thin orchestrator with goal, principles, and flow
- references/critique-rubric.md: standalone rubric for the critique
  sub-agent with borderline judgment examples, soft-fail threshold
  pairs, and gap-only failure cases
- references/writing-guide.md: craft guide with setup/gap/pointer
  structure, cross-platform adaptation example, and diverse bad-copy
  examples

Key changes to the rubric:
- Add Bluesky structure rule
- Require programmatic character count verification
- Add curiosity gap section with self-contained test
- Diverse examples across post types (migration, product, tutorial,
  intern story) — no examples from the same blog post

Key changes to the workflow:
- Update SKILL.md path reference
- Critique loop capped at 3 critiques / 2 revisions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adamgordonbell adamgordonbell changed the title from "Improve social review: rubric additions, suggested copy, and re-trigger" to "Improve social review: critique loop, writing guide, and workflow fixes" Apr 13, 2026
The ESC action uses OIDC auth which may require the environment claim
in the token. The social review workflow already sets this; claude.yml
was missing it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critique rubric:
- Inline bad/good example pairs for every soft-fail heuristic
- Add voice section (I vs we vs naming external companies)
- Fix: character limits clarified as body limits with URL overhead
- Fix: Bluesky structure "should" → "must" under hard fail
- Fix: weak-opening example isolated (no longer multi-trigger)
- Fix: LLM patterns use consistent FAIL/PASS labels

Writing guide:
- Inline bad/good pairs for every craft topic (setup, gap, pointer,
  voice, specificity, announcement tone, only-Pulumi, LLM-speak)
- Fix: specificity good example no longer closes the gap
- Fix: LinkedIn section explicitly warns against using extra space
  for article information rather than story
- Examples differentiated from rubric (different wording/scenarios)

SKILL.md:
- Output format: "keep it short, scannable, 30 seconds"
- Flow step 3: check paragraph structure before moving on
- Flow step 4: verify ALL drafts (skipping verification wastes revision rounds)
- Remove examples.md references

CI workflow:
- Reinforce output format in the prompt itself

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rubric: add complete bad/good example pairs for every soft-fail heuristic,
voice variety (named author, stat-first, product-as-subject, third-party),
missing-pointer and curiosity-gap-closed sections with threshold calibration.

Writing guide: add setup/gap/pointer structure, pointer style variety,
named-author voice, cross-platform adaptation examples, factual accuracy
rule ("stick to what the post says"), and LLM-speak avoidance patterns.
- Remove backtick markdown from example social copy (hard-fail rule)
- Add backtick to rubric's banned markdown list
- Replace duplicated Fraser/policy-analyze example with Boris Schlosser/CI secrets
- Replace duplicated SST third-party example with Simen Olsen/Bjerk
- Replace duplicated Neo announcement example with Visual Import
- Swap fabricated CloudFormation migration example for real CDKTF deprecation post
- Swap fabricated HCL benchmark example for real token-efficiency post
- Add "no gap" Good example (Bun runtime) to gap section
- Add "stick to what the post says" factual accuracy section
- Soften gap language: not every post needs one
- Change LLM-speak pointer from "setup takes 15 minutes" to vary pointers
- Use Neo Plan Mode for "we" voice example (team effort)
@adamgordonbell
Contributor Author

@claude review

Contributor

claude Bot commented Apr 14, 2026

Claude finished @adamgordonbell's task in 2m 15s.


Code Review

  • Read changed files
  • Review SKILL.md
  • Review critique-rubric.md
  • Review writing-guide.md
  • Review workflow changes
  • Post final review

Overall

This is a high-quality restructure. The directory-based layout cleanly separates orchestration from rubric from craft guide. The examples throughout are diverse and concrete, and the previous reviewer's main concerns are addressed: the critique loop now has an explicit cap (up to 2 revisions, 3 total critiques) and environment: production is set in claude.yml.


Issues

[Low] critique-rubric.md: "Structure technically valid but still hard to read" has no example section

The summary bullet at line 40 lists this as one of the nine soft-fail heuristics — but it's the only one without a dedicated ### section and FAIL/PASS pair. Every other item in the list has detailed examples to guide the sub-agent. Without examples, a critique sub-agent has no ground truth for what "structure technically valid but hard to read" actually means.

The LLM-patterns section partially covers this (staccato fragments produce valid structure that's hard to read), but that's already its own heuristic. If this was intentional — meant to serve as a catch-all covered by the other sections — it might be cleaner to remove it from the summary rather than leave it dangling. If it points at something distinct (e.g., long undivided blocks, cramped sentence packing), it needs an example.

[Low] ESC step ordering in claude.yml (noted in previous review, not yet addressed)

The Fetch secrets from ESC step runs unconditionally before Check repository write access. ESC credentials are consumed on every trigger, including unauthorized ones. No secrets leak to untrusted code, but it's an unnecessary ESC call. Adding if: steps.check-access.outputs.has_write_access == 'true' to the ESC step would require reordering, but would be cleaner. Low priority — not blocking.

[Very Low] Triple-single-quote edge case in character count command

Both SKILL.md (step 4) and critique-rubric.md use:

python3 -c "print(len('''<copy>'''))"

If copy ever contains ''', the shell command breaks. Using """ instead of ''' would be more robust. Extremely unlikely to matter in practice.
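A quoting-proof alternative (a sketch, not what the skill currently specifies) is to read the copy from stdin instead of embedding it in the command line, e.g. via a heredoc: `python3 count_chars.py <<'EOF' ... EOF`. The script name and helper are hypothetical:

```python
# count_chars.py (hypothetical) — counts characters of whatever arrives
# on stdin, so the copy never needs shell or Python quoting.
import sys


def count_chars(text: str) -> int:
    # Strip one trailing newline (heredocs append one) before counting.
    if text.endswith("\n"):
        text = text[:-1]
    return len(text)


if __name__ == "__main__":
    print(count_chars(sys.stdin.read()))
```

Because the copy arrives as raw bytes on stdin, embedded `'''`, `"""`, backticks, and `$` are all counted literally with no escaping.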


Observations (non-blocking)

  • Writing guide: the cross-platform adaptation example is the standout piece. The X/LinkedIn/Bluesky trio for the same 150-Terraform-module story correctly demonstrates that more room means more story depth, not more article facts. This is the right mental model and worth having.

  • "Stick to what the post says" section: prevents the obvious failure mode of the LLM embellishing with invented numbers. The "Common mistakes" list is concrete and actionable.

  • Pointer variation guidance: "Vary the style across platforms. If X uses 'Here's how,' LinkedIn and Bluesky should use a different form." This is easy to overlook and good to make explicit.

  • "Curiosity gap closed" vs. "Missing pointer" as separate heuristics: these are distinct and easy to conflate. Having them as separate items with separate examples prevents the critique sub-agent from treating them as the same failure.

  • workflow_dispatch inputs: the Resolve PR info step correctly falls back to github.event.inputs.pr_number for manual runs, and the concurrency group uses || github.event.inputs.pr_number too. Clean.

  • Allowed tools in CI prompt: Agent and Bash(python3:*) are now explicitly allowed — required for the critique loop to function.


Summary

One real gap (structure heuristic has no example) and two low-priority workflow issues. The skill content itself is solid — the rubric is thorough, the examples are well-chosen, and the loop behavior is now correctly bounded.

@adamgordonbell adamgordonbell marked this pull request as ready for review April 14, 2026 12:25
@adamgordonbell adamgordonbell merged commit 99f1c24 into master Apr 14, 2026
8 checks passed
@adamgordonbell adamgordonbell deleted the improve/social-review-workflow branch April 14, 2026 12:26
Contributor

claude Bot commented Apr 14, 2026

Docs Review

Summary

This PR restructures the social media review skill from a single file into a directory-based skill with separated concerns (orchestrator, critique rubric, writing guide), adds a draft + critique loop for FAIL cases, and updates two GitHub Actions workflows. The skill content is well-written with strong examples. The workflow changes follow established repo patterns but warrant human review due to expanded permissions.

Skill files (SKILL.md, critique-rubric.md, writing-guide.md)

These are agent instruction files and exempt from STYLE-GUIDE.md formatting rules per AGENTS.md. The content is high quality — the examples are concrete, the rubric is well-calibrated with borderline cases, and the writing guide teaches by contrast (bad vs. good). No issues found.

One behavior change to note: the skill changed from user-invocable: false to user-invocable: true (SKILL.md:2), meaning users can now invoke /social-media-review directly. This seems intentional given the added argument: field but isn't called out in the PR description.

Workflow changes — flag for human review

.github/workflows/claude.yml — PULUMI_BOT_TOKEN swap

  • Lines 34–37: Adds ESC secret fetching step.
  • Line 81: Replaces the default GITHUB_TOKEN with PULUMI_BOT_TOKEN for the Claude Code action. The comment explains the intent (pushes trigger downstream workflows like social review), and PULUMI_BOT_TOKEN is already used this way across many other workflows in the repo (customer-managed-workflow-agent-cli.yml, pulumi-cli.yml, etc.). The pattern is established, but this does expand what actions Claude's pushes can trigger — worth confirming this is the desired scope.
  • Line 21: Adds environment: production (required for ESC OIDC). Consistent with claude-social-review.yml and other workflows.
  • Lines 96–99: Workflow-level env: block for ESC config. Currently only one job exists, but if jobs are added later they'd inherit these ESC settings. This matches the pattern in claude-social-review.yml and other workflows.

.github/workflows/claude-social-review.yml — workflow_dispatch + expanded tools

  • Lines 8–12: Adds workflow_dispatch input for manual triggering. The pr_number input is type: string, which accepts non-numeric values, but gh pr view would simply fail on invalid input so this is low risk.
  • Lines 32–48: New Resolve PR info step that handles both event types. Uses GH_TOKEN and --repo flag so it works before checkout. Logic is clean.
  • Line 131: claude_args now includes Agent and Bash(python3:*). These are needed for the critique sub-agent loop (step 5 in SKILL.md) and programmatic character count verification (step 4). The Bash(python3:*) permission is scoped to python3 commands only, which limits the surface area.
  • Line 124: Updated skill path reference from .claude/commands/social-media-review.md to .claude/commands/social-media-review/SKILL.md. Correct.
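The low-risk `pr_number` input noted above could be hardened with an explicit guard step. A hypothetical sketch (not in the PR; input description and step name are illustrative):

```yaml
on:
  workflow_dispatch:
    inputs:
      pr_number:
        description: "PR number to review"
        required: true
        type: string

jobs:
  social-review:
    steps:
      - name: Validate pr_number
        run: |
          # Fail fast on empty or non-numeric input before calling gh.
          case "${{ inputs.pr_number }}" in
            ''|*[!0-9]*) echo "pr_number must be numeric" >&2; exit 1 ;;
          esac
```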

No issues requiring changes

The skill content is clean, the workflow changes follow established patterns, and the examples are well-crafted. The main thing for a human reviewer to confirm is the intent behind swapping to PULUMI_BOT_TOKEN in claude.yml and the expanded tool permissions in claude-social-review.yml.


Mention @claude if you'd like additional reviews or fixes.

