apps/cli: generate style.css and page content section-by-section by youknowriad · Pull Request #3199 · Automattic/studio

youknowriad · 2026-04-22T21:07:49Z

Related issues

Related to: none (exploratory perf work from an interactive session)

How AI was used in this PR

The system-prompt changes in this PR were designed and written by Claude together with the PR author. Both the analysis that motivated the change (comparing several landing-page build sessions and profiling where the silent generation gaps lived) and the final prompt wording were produced through that pairing. The numbers cited below are real measurements from recorded session JSONL files, not estimates. Reviewers should focus on whether the prompt wording is clear, correct, and consistent with the rest of the system prompt — and whether the workflow described is actually the one we want the agent to follow.

Proposed Changes

Measured across five "build me a landing page" sessions, the dominant contributors to user-perceived latency in the AI agent's flow were two single-turn generations:

style.css: 60–90s producing 14–20KB of CSS in one Write call.
Page post_content: 40–50s producing a full multi-section page in one wp_cli post create/update --post_content=<huge> call.

In both cases the agent is working, but from the user's point of view it's a long stretch of dead air.

Rather than add new MCP tools, this PR leans on the existing Write/Edit/wp_cli toolset and adds two workflow rules to the system prompt:

Step 3a: style.css section-by-section. The agent first writes a small skeleton (:root { ... } + anchor comments like /* === hero === */), then fills one anchor per Edit call. Each edit is bounded to 300–2000 bytes.
Step 4a: page content section-by-section in a file. The agent creates the page empty, builds its block markup in <theme>/page-content.html via an anchor-comment skeleton plus one Edit per section, and applies it with a single wp_cli post update --post_content-file=<path> at the end.

Both rules explicitly forbid wrapping sections in core/html, which a prior experiment regressed on.
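
The skeleton-then-fill pattern from Step 3a can be sketched in a few lines of TypeScript. This is only an illustration of the text transformation, assuming the /* === <concern> === */ anchor format quoted in this description; buildSkeleton and fillAnchor are hypothetical names, and in the real flow the agent performs these steps through its Write and Edit tools rather than through code like this.

```typescript
// Hypothetical helpers illustrating the skeleton-then-fill pattern.
// They are not part of the Studio CLI.

/** Build a small stylesheet skeleton: a :root block plus one anchor comment per section. */
function buildSkeleton(rootBlock: string, sections: string[]): string {
  const anchors = sections.map((name) => `/* === ${name} === */`).join("\n\n");
  return `${rootBlock.trim()}\n\n${anchors}\n`;
}

/** Fill one anchor: keep the anchor comment and place the section's CSS right after it. */
function fillAnchor(css: string, section: string, body: string): string {
  const anchor = `/* === ${section} === */`;
  const first = css.indexOf(anchor);
  if (first === -1) {
    throw new Error(`anchor not found: ${section}`);
  }
  if (css.indexOf(anchor, first + anchor.length) !== -1) {
    throw new Error(`anchor is not unique: ${section}`);
  }
  const before = css.slice(0, first);
  const after = css.slice(first + anchor.length);
  return `${before}${anchor}\n${body.trim()}${after}`;
}

// One small write for the skeleton, then one small edit per section.
let stylesheet = buildSkeleton(":root { --accent: #0a7; }", ["hero", "footer"]);
stylesheet = fillAnchor(stylesheet, "hero", ".hero { padding: 4rem 2rem; }");
stylesheet = fillAnchor(stylesheet, "footer", ".footer { padding: 2rem; }");
```

The sketch rejects a missing or duplicated anchor so that a bad edit fails loudly instead of landing in the wrong place.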

Also bumps the default maxTurns from 50 to 75 in startAiAgent. Session A in the measurements hit the 50 ceiling and was cut off mid-refinement; the section-by-section pattern adds more turns by design, so 75 gives headroom without being visibly different on runs that complete quickly.

Expected impact

Based on the measurements:

Longest silent gap per file drops from ~70s (whole style.css) to ~10–15s (per section).
Longest silent gap per page drops from ~45s (whole page markup) to ~10–15s (per section).
Total wall time may increase slightly (~10–20% on comparable content) due to the per-turn overhead, but the perceived experience is meaningfully better.
The agent's file artifacts (style.css, page-content.html) remain on disk as durable records that can be re-edited or re-applied without regenerating.
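
The "longest silent gap" figures above can be derived from event timestamps. A minimal sketch, assuming each line of a session JSONL file is a JSON object with an ISO 8601 timestamp field; the actual recording schema is not shown in this PR, so the field name is an assumption, and the sample data is made up for illustration.

```typescript
/**
 * Longest gap, in seconds, between consecutive events in a JSONL recording.
 * Assumes each non-empty line is a JSON object with an ISO 8601 `timestamp`
 * field (hypothetical field name). Lines without a parseable timestamp are skipped.
 */
function longestGapSeconds(jsonl: string): number {
  const times = jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => Date.parse((JSON.parse(line) as { timestamp?: string }).timestamp ?? ""))
    .filter((t) => !Number.isNaN(t))
    .sort((a, b) => a - b);

  let longest = 0;
  for (let i = 1; i < times.length; i++) {
    longest = Math.max(longest, (times[i] - times[i - 1]) / 1000);
  }
  return longest;
}

// Illustrative data only: gaps of 12s, 70s and 8s.
const sample = [
  '{"timestamp":"2026-04-22T21:00:00Z"}',
  '{"timestamp":"2026-04-22T21:00:12Z"}',
  '{"timestamp":"2026-04-22T21:01:22Z"}',
  '{"timestamp":"2026-04-22T21:01:30Z"}',
].join("\n");

console.log(longestGapSeconds(sample)); // 70
```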

Not in this PR

No new MCP tools (intentional — an earlier experiment added an append_page_section tool and found the prompt-only path is cleaner and has better failure semantics).
No changes to the model, thinking configuration, or streaming. Those remain possible follow-ups if further wins are desired.

Testing Instructions

There are no new code paths — the behavioral change is entirely in the AI agent's system prompt. To verify:

Build the CLI: npm run cli:build.
Run an AI agent build flow: node apps/cli/dist/cli/main.mjs ai (or through the desktop app), and ask it to build a landing page for a product.
Observe the tool-use timeline:
- The agent should Write a small (< 2KB) skeleton for style.css first, then do 6–10 Edit calls against /* === <concern> === */ anchors to fill the stylesheet.
- The agent should create the page with empty --post_content="", Write a small skeleton for <theme>/page-content.html with one anchor comment per section, do 5–10 Edit calls to fill it, then apply once with wp_cli post update <id> --post_content-file=<absolute path>.
- No section should be wrapped in a core/html block.
Confirm the final site looks correct and the design quality matches prior builds.
Session recordings under ~/Library/Application Support/Studio/sessions/ (macOS) will show the per-section Edit calls with small individual durations instead of one long silent generation.
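
The core/html check in the list above can be automated. A minimal sketch, assuming the generated markup is available as a string; countCustomHtmlBlocks is a hypothetical helper, not part of the CLI. It looks for the opening comment delimiter that WordPress serializes for the Custom HTML block (<!-- wp:html -->).

```typescript
/**
 * Count opening delimiters of the Custom HTML block (core/html) in serialized
 * block markup. Closing delimiters (<!-- /wp:html -->) are not counted.
 * Hypothetical helper for manual verification.
 */
function countCustomHtmlBlocks(markup: string): number {
  // Matches "<!-- wp:html" or "<!-- wp:core/html" followed by whitespace or "/",
  // so other block names that merely start with "html" are not counted.
  const opener = /<!--\s*wp:(?:core\/)?html(?=[\s\/])/g;
  const matches = markup.match(opener);
  return matches === null ? 0 : matches.length;
}

const groupSection =
  '<!-- wp:group -->\n<div class="wp-block-group"></div>\n<!-- /wp:group -->';
const htmlSection =
  "<!-- wp:html -->\n<section>Hero</section>\n<!-- /wp:html -->";

console.log(countCustomHtmlBlocks(groupSection)); // 0
console.log(countCustomHtmlBlocks(htmlSection)); // 1
```

A non-zero count on page-content.html would indicate that the agent fell back to wrapping sections in core/html.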

If the agent ignores the new guidance and reverts to single-Write / single-post_content calls, the prompt may need to be tightened — note that behavior and let the PR author know.

Pre-merge Checklist

Have you checked for TypeScript, React or other console errors? (npm run typecheck — clean)
Tests pass? (npm test -- apps/cli/ai — 105 tests passing)
Manual verification on at least one AI agent build run (see Testing Instructions)

🤖 Generated with Claude Code

…on-by-section

Measured across several "build me a landing page" sessions, the biggest silent gaps in the agent's flow were two single-turn generations: the theme's style.css (60-90s producing 15-20KB in one Write) and the page's post_content (40-50s producing a full multi-section page markup in one wp_cli call). These blocks of silence are the dominant contributor to the user's sense that "the agent is stuck", even when it's working.

Rather than add new tools, lean on the existing Write/Edit/wp_cli flow and add two workflow rules to the system prompt:

- Step 3a: write style.css as a small :root + anchor-comment skeleton, then fill one `/* === <concern> === */` anchor per Edit call.
- Step 4a: create the page empty, build its block markup in `<theme>/page-content.html` via an anchor skeleton + one Edit per section, and apply once at the end with `wp_cli post update --post_content-file=<path>`.

Both rules explicitly forbid wrapping sections in `core/html`, which a prior run regressed on when given a similar tool-based workflow.

Also bump the default maxTurns from 50 to 75: the first measured session hit the 50 ceiling and was cut off mid-refinement; the section-by-section pattern adds more turns, so 75 gives headroom without being visibly different on runs that complete quickly.

wpmobilebot · 2026-04-22T21:29:56Z

📊 Performance Test Results

Comparing 7414966 vs trunk

app-size

Metric	trunk	`7414966`	Diff	Change
App Size (Mac)	1491.75 MB	1491.75 MB	+0.00 MB	⚪ 0.0%

site-editor

Metric	trunk	`7414966`	Diff	Change
load	1894 ms	1553 ms	341 ms	🟢 -18.0%

site-startup

Metric	trunk	`7414966`	Diff	Change
siteCreation	8121 ms	8108 ms	13 ms	⚪ 0.0%
siteStartup	4955 ms	4952 ms	3 ms	⚪ 0.0%

Results are median values from multiple test runs.

Legend: 🟢 Improvement (faster) | 🔴 Regression (slower) | ⚪ No change (<50ms diff)

…dence approach

Combines the complementary halves of the two approaches to the "silent mega-turn" problem:

- From #3198 (Derek): the broad "one Write/Edit per turn" rule and the small-turn rule for the turn immediately after site_create. Without these, the agent packs an entire theme scaffold into one turn.
- From this branch (section-by-section anchors): the concrete skeleton patterns for style.css (`/* === <concern> === */`) and page content (an anchor skeleton + `--post_content-file=`). Without these, the "skeleton-first" guidance alone was ignored for style.css: session 8 still wrote 12KB in a single Write (46.6s gap).

Replaces the earlier verbose step 3a/4a blocks with a single compact "Working cadence" section after step 6 (roughly half the line count) while retaining the concrete anchor patterns. Also adds a short post-site_create reminder to the site-spec skill at the point where the rule matters most.

Session 9 with the combined rules is in progress: longest content-generation gap measured so far is 18.3s (a 3.5KB style.css section fill). The 30s perceptual threshold is no longer breached on the generation side.

epeicher

Thanks for improving this @youknowriad! I have tested the same prompt against this branch and against trunk, and I have found the following:

In this branch, request durations are lower: the maximum duration is 22 seconds, versus 97 seconds in trunk.
The quality seems to be slightly better in trunk; you can see the generated preview sites:
- This branch: https://rarandalopez-qnmid-studio.wp.build
- Trunk: https://rarandalopez-hielv-studio.wp.build/
The difference is mainly in animations, but it is not a big difference.
In this branch, I was asked about the page style even though I provided it in the prompt. That doesn't happen in trunk.
Both versions reached max turns and asked to Continue.
Both versions took the same amount of time to complete (~15 mins for both).
This branch produces fewer HTML blocks; the result is better on this branch compared to trunk.

All in all, I think this improves the issues with the long-running queries in the backend, so I'm approving this; we can continue iterating.

sejas

I tested trunk with these changes. I was not able to get the timeout. I tried it with different prompts similar to "Create a new site about AI development. Create a new beautiful theme in one shot." I finished building one website; the other times I stopped testing after the first answer.

It looked like it was stuck for ~1 minute on the first call after creating a site and getting site info, when it was creating the style.css file, but it continued without any issue and was much more responsive with smaller edits to the same file.

It reached the turns limit much faster, in 7 minutes of work.

After accepting it once, I didn't hit the turns limit anymore during my work session.

Regarding the quality of the website, it’s hard to assess whether it was affected. The site looks nice and works properly. It stopped after building the home page and left the rest of the pages empty. It took another prompt for me to generate them.

https://antoniosejas-svhop-studio.wp.build/

[Screenshots: Home, About, Blog, Contact]

epeicher approved these changes Apr 23, 2026

youknowriad merged commit 9204a0a into trunk Apr 23, 2026
10 checks passed

youknowriad deleted the claude/flamboyant-cartwright-bce19a branch April 23, 2026 10:13

wojtekn assigned epeicher and youknowriad and unassigned epeicher Apr 23, 2026

dereksmart mentioned this pull request Apr 23, 2026

apps/cli: build WP sites incrementally #3198

Closed

5 tasks

youknowriad mentioned this pull request Apr 23, 2026

apps/cli: two-phase AI build workflow with blockify skill #3207

Closed

4 tasks

sejas reviewed Apr 23, 2026

youknowriad mentioned this pull request Apr 24, 2026

apps/cli: fix wp_cli post content apply failing silently in WASM #3241

Merged

4 tasks

youknowriad merged 2 commits into trunk from claude/flamboyant-cartwright-bce19a
