Drive pull-reprint with a single reprint pull by epeicher · Pull Request #3768 · Automattic/studio

epeicher · 2026-06-10T18:09:15Z

Related issues

Required: pull: fix the flattened runtime root and delta re-pull after deferred files WordPress/reprint#268 (pull's flatten_to -> flat_document_root bridge plus the re-pull filter-guard fix that this flow depends on).
Recommended: Truncate the skipped-download list instead of deleting it WordPress/reprint#264 (skipped-list ENOENT fix) in the bundled reprint.

How AI was used in this PR

Claude Code measured the single-fork memory profile against a live WP.com Atomic site (26k files / ~570 MB), implemented the migration and tests, and verified the full flow end-to-end (fresh pull plus delta re-pull: remote edit -> re-pull -> local content / auto-login checks). Each change was reviewed and re-tested live.

Proposed Changes

Re-running studio pull-reprint on a completed pull previously printed "pulled successfully" and exited without contacting the remote, so remote edits, inserts, and deletions never reached the local site. With this PR a re-run performs a delta re-pull: files re-sync incrementally and the database is refreshed so existing-row edits and deletions propagate.

Studio now drives reprint with a single reprint pull instead of orchestrating the pipeline itself. One call runs preflight -> files-pull -> db-pull -> db-apply -> flat-docroot -> apply-runtime in one PHP-WASM fork, and reprint resets its own state for a delta re-pull via prepare_repull(). This deletes the former Studio orchestration: the five per-sub-command functions, the nine-stage machine, and the --abort delta-reset wiring (about 340 lines net). The single fork's memory_limit is raised to 1024M because the file-index high-water-mark now persists across phases (measured peak ~510M on the test site).

Fixes that remain part of making re-pulls real:

Auto-login keeps working after a re-pull. The database refresh rebuilds the DB from the remote dump, wiping the local admin user and studio_admin_username option that /studio-auto-login depends on. The pull re-applies the stored credentials over the running site's admin API, and a connection failure on that request doubles as a health check that catches a stale daemon "online" state and falls back to starting the server.
Database URLs can't drift to a dead port. If the Studio site record was re-created with a different id/port, the re-pull adopts the live site record before the pull so the URL rewrite (--new-site-url) targets the right port.
Site starts no longer crash on stale runtime mounts. Transient reprint state files mounted for the remote-uploads proxy are dropped when loading the persisted start-options.json (previously ENOENT crashed the start).

Bundle the reprint build via REPRINT_VERSION in scripts/download-wp-server-files.ts; it must include WordPress/reprint#268.

Testing Instructions

Build the reprint phar from a branch that includes pull: fix the flattened runtime root and delta re-pull after deferred files WordPress/reprint#268 (and Release 1.0.4 #264): run composer build:phar on the reprint repo.
On this repo, place reprint.phar at wp-files/reprint/reprint.phar and run npm run cli:build.
Pull a site: node apps/cli/dist/cli/main.mjs pull-reprint --verbose --url <site> and let it complete. It should invoke a single reprint pull (visible with --verbose) and start a working site.
Edit an existing post on the remote site, then re-run the same command. It should report "Updating ... (delta sync)" and the local post should show the edit.
Open the local site's WP Admin via the printed auto-login URL and confirm the edit from step 4.
Stop and start the site (studio site stop / site start). No ENOENT mount errors.

Pre-merge Checklist

Have you checked for TypeScript, React or other console errors?

Re-running `studio pull-reprint` on a completed pull previously printed "pulled successfully" and exited without contacting the remote, so remote edits never reached the local site. A re-run now resets the stage machine and re-executes every phase; reprint makes each phase incremental (files delta sync, full DB re-download + re-apply with the idempotent dump). Details: - Persist a hasCompletedOnce flag in pull.json so the non-empty site-directory guard is skipped on re-pulls (the directory holds the previous pull's output) and drop the cached preflight so connectivity and the secret-rotation retry path are re-verified. - Re-apply the stored admin credentials over the running site's admin API after the database refresh: db-apply rebuilds the DB from the remote dump, wiping the local admin user and the studio_admin_username option that /studio-auto-login depends on. The server only re-applies credentials at startup, and a re-pull deliberately skips restarting a running server. A connection failure on that request doubles as a health check — if the daemon's "online" view is stale, fall through to a server start. - Reconcile pull.json with the live site record in ensurePort() before the DB stages: if the site record was re-created with a different id/port, db-apply would rewrite the database URLs to a port nothing serves. - Drop /tmp/reprint/* mounts with missing host paths when loading the persisted start-options.json: reprint state files mounted for the remote-uploads proxy are transient, and mounting a missing path crashed server starts with ENOENT. - Fix shouldRestartFilesSyncIndex() comparing against the legacy 'files-sync' command name; reprint canonicalizes to 'files-pull' when saving state, so the check was dead code. Requires the reprint-side refresh support (WordPress/reprint#257) in the bundled reprint.phar.

wpmobilebot · 2026-06-15T10:16:12Z

📊 Performance Test Results

Comparing dc16064 vs trunk

app-size

Metric	trunk	`dc16064`	Diff	Change
App Size (Mac)	2357.83 MB	2357.79 MB	0.04 MB	⚪ 0.0%

site-editor

Metric	trunk	`dc16064`	Diff	Change
load	1077 ms	1726 ms	+649 ms	🔴 60.3%

site-startup

Metric	trunk	`dc16064`	Diff	Change
siteCreation	6493 ms	8532 ms	+2039 ms	🔴 31.4%
siteStartup	6965 ms	3882 ms	3083 ms	🟢 -44.3%

Results are median values from multiple test runs.

Legend: 🟢 Improvement (faster) | 🔴 Regression (slower) | ⚪ No change (<50ms diff)

…ta-resync # Conflicts: # apps/cli/commands/pull-reprint.ts

…print-delta-resync

…ls reconnect

A delta re-pull re-runs files-sync/db-sync/db-apply against a state dir where the previous pull left them "complete". Re-running as-is either throws (files-sync rejects a --filter change while a sync still looks in progress) or silently no-ops, so the database never refreshes. Issue `<command> --abort` before each on a re-pull to reset the sub-command state so every phase re-runs: files-sync keeps the local index and deltas, db-sync re-downloads, db-apply re-applies. Gated on hasCompletedOnce (the persisted "this pull finished once" flag) rather than the local isRepull, because the first re-pull attempt already reset the stage to "initialized"; a retry must still abort. Each command is aborted only while its stage is still pending, so a mid-pull resume never wipes a phase that already re-ran. Runs after preflight, which files-sync/db-sync --abort require. This removes the dependency on reprint's sub-commands auto-refreshing, so the re-pull works against a stock reprint build.

…ta-resync

Copilot

Pull request overview

This PR changes the studio pull-reprint orchestration so re-running a previously completed pull performs a real delta re-pull (re-entering reprint phases, clearing “complete” subcommand state via --abort, and re-running DB refresh), rather than exiting early without syncing remote changes.

Changes:

Reset completed pull sessions back to initialized on re-run, clear cached preflight, and abort completed reprint subcommands so files re-sync incrementally and DB refresh re-applies.
Improve resilience around reprint state/version differences and stale runtime mounts (preventing ENOENT crashes on server start).
Add Vitest coverage for delta re-pull reset behavior, exporter re-enable on preflight retry, and admin-credentials re-apply behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
apps/cli/lib/wordpress-server-manager.ts	Filters out stale `/tmp/reprint/*` mounts from persisted start options to avoid ENOENT crashes.
apps/cli/lib/pull/reprint-state.ts	Accepts both `files-sync` and canonicalized `files-pull` when interpreting reprint state.
apps/cli/commands/pull-reprint.ts	Implements delta re-pull flow (stage reset, preflight refresh, `--abort` state clearing), updates port adoption logic, and re-applies admin credentials when reusing a running server.
apps/cli/commands/tests/pull-reprint.test.ts	Adds tests covering delta re-pull behavior and recovery paths.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

On a re-pull where the daemon process is still up, the site-started "already running" branch skipped updateSiteLatestCliPid and updateSiteAutoStart; both only ran on the start branch. The running-status check requires site.latestCliPid to equal the live process pid, so a stale or missing pid (e.g. after ensurePort adopts a re-created site record) made the site report as stopped. Mirror the start branch and `studio site start`'s already-running path: refresh latestCliPid when online and keep autoStart enabled.

adamziel · 2026-06-18T12:05:24Z

+	const isRepull = studioMetadata.stage === 'completed';
+	if ( isRepull ) {
+		studioMetadata.stage = 'initialized';
+		studioMetadata.hasCompletedOnce = true;


Would it be helpful if reprint exposed this information? Such as php reprint.php import-metadata | jq '.hasCompletedOnce'? It's already stored there so that would be less state and less opportunity for one of these values to go out of sync.

Yes, I think that would be better as reprint "knows better" about the pull, and reducing the state to the minimum seems a good option.

adamziel · 2026-06-18T12:07:06Z

+
+		// Re-verify connectivity (and give the secret-rotation retry path
+		// a chance to run) instead of trusting the cached preflight from
+		// the original pull, which may be days old.


Did a cached preflight come up in your testing? In what way did it cause trouble?

I had errors for sites that were pulled some time before that were "missing multipart boundary" due to HTML being returned. The way to reproduce that is to pull a site, then wait for 60 minutes (or reduce the sliding window), and then pull again

Oh, so this is an issue with the rotating token we use on WP.com. And it seems like that token is saved in preflight.json, which is why deleting that file helps. 👍

adamziel · 2026-06-18T12:12:26Z

+			// Rotating the secret does not bump `reprint_exporter_enabled`, so
+			// re-open the gate explicitly; otherwise the retry hits the same
+			// closed window and ?reprint-api falls through to an HTML page.
+			await enableReprintExporter(


good catch!

adamziel · 2026-06-18T12:15:11Z

The code change mostly makes sense, thank you @epeicher! I'm only confused by one thing: If we're moving to reliance on pull, shouldn't we be able to remove all the manual tool calls, e.g. applyDownloadedDatabase()? pull covers all that ground in a way we don't need to manage in Studio.

epeicher · 2026-06-18T14:07:29Z

The code change mostly makes sense, thank you @epeicher! I'm only confused by one thing: If we're moving to reliance on pull, shouldn't we be able to remove all the manual tool calls, e.g. applyDownloadedDatabase()? pull covers all that ground in a way we don't need to manage in Studio.

As part of this PR, I have not moved to relying on pull, but to relying on <sub-command> --abort to clear the completed state first. I could iterate to rely completely on pull (I know we discussed that, but I kept the changes minimal), let me try that and update the PR.

Replace the per-sub-command orchestration (downloadEssentialSiteFiles, refreshFlattenedSiteDirectory, downloadRemoteDatabase, applyDownloadedDatabase, generateRuntimeConfiguration) and the clearCompletedSubcommandState/--abort delta-reset with one `reprint pull` call. reprint owns the pipeline ordering (files-pull -> db-pull -> db-apply -> flat-docroot -> apply-runtime) and resets its own state for a delta re-pull via prepare_repull(), so the Studio-side --abort wiring and per-phase stage gating go away. - runFullPull() issues the single pull with the same sqlite geometry the old db-apply used (target sqlite under the raw content dir) plus --flatten-to, --runtime=playground-cli, --start-runtime=none and --output-dir, mounting the site + runtime dirs up front. - ensurePort moves before the pull so --new-site-url is available. - Collapse the stage machine from 9 stages to 5 (initialized -> pulled -> site-registered -> site-started -> completed). - Bump the PHP-WASM memory_limit to 1024M: the single long-lived fork holds the file-index high-water-mark across phases. Requires reprint's flatten_to->flat_document_root bridge and the re-pull filter-guard fix (WordPress/reprint#268).

Drop the historic 'replaces the former clearCompletedSubcommandState' framing from the pull comment and the runFullPull docblock; describe what the single-pull flow does now.

epeicher · 2026-06-18T17:19:28Z

The latest changes 0b27bdf, use a composite pull command instead of using multiple sub-commands as suggested. After some testing, this is working fine and the code is greatly reduced 🙌

The changes require WordPress/reprint#268 to be applied and generate a reprint.phar from it.

adamziel

I haven't tested it, but the code change looks sound 👍

epeicher · 2026-06-19T16:22:19Z

After merging of WordPress/reprint#268, these changes have been tested and work as expected. I have been able to pull both initial and incremental changes from a remote. Merging this.

github-actions Bot assigned epeicher Jun 10, 2026

Fix prettier formatting of the re-pull directory guard

19388cd

epeicher force-pushed the cli/pull-reprint-delta-resync branch from 05d8026 to 19388cd Compare June 11, 2026 15:22

epeicher marked this pull request as ready for review June 15, 2026 09:37

Merge branch 'trunk' into cli/pull-reprint-delta-resync

2451a0e

epeicher requested review from a team and fredrikekelund June 15, 2026 09:55

Merge branch 'trunk' into cli/pull-reprint-delta-resync

83ea363

epeicher added 3 commits June 15, 2026 13:02

Merge remote-tracking branch 'origin/trunk' into cli/pull-reprint-del…

cbc3a14

…ta-resync # Conflicts: # apps/cli/commands/pull-reprint.ts

Merge branch 'trunk' of github.com:Automattic/studio into cli/pull-re…

d9896ec

…print-delta-resync

Re-enable reprint exporter on preflight retry so cached-secret re-pul…

9bd3e1f

…ls reconnect

epeicher mentioned this pull request Jun 18, 2026

Mark --filter=skipped-earlier runs complete WordPress/reprint#265

Closed

epeicher added 2 commits June 18, 2026 11:47

Merge remote-tracking branch 'origin/trunk' into cli/pull-reprint-del…

d559b00

…ta-resync

epeicher requested a review from adamziel June 18, 2026 10:34

epeicher mentioned this pull request Jun 18, 2026

Refresh on re-run for completed pull sub-commands WordPress/reprint#260

Closed

epeicher requested a review from Copilot June 18, 2026 10:52

Copilot started reviewing on behalf of epeicher June 18, 2026 10:53 View session

Copilot AI reviewed Jun 18, 2026

View reviewed changes

Comment thread apps/cli/commands/pull-reprint.ts

adamziel reviewed Jun 18, 2026

View reviewed changes

epeicher changed the title ~~Make pull-reprint re-runs perform a delta re-pull~~ Drive pull-reprint with a single reprint pull Jun 18, 2026

Describe current behavior in pull-reprint comments

dc16064

Drop the historic 'replaces the former clearCompletedSubcommandState' framing from the pull comment and the runFullPull docblock; describe what the single-pull flow does now.

adamziel approved these changes Jun 19, 2026

View reviewed changes

epeicher merged commit 850fe2f into trunk Jun 19, 2026
13 checks passed

epeicher deleted the cli/pull-reprint-delta-resync branch June 19, 2026 16:22

Conversation

epeicher commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Related issues

How AI was used in this PR

Proposed Changes

Testing Instructions

Pre-merge Checklist

Uh oh!

wpmobilebot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📊 Performance Test Results

app-size

site-editor

site-startup

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

adamziel Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

epeicher Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

adamziel Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

epeicher Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

adamziel Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

adamziel Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

adamziel commented Jun 18, 2026

Uh oh!

epeicher commented Jun 18, 2026

Uh oh!

epeicher commented Jun 18, 2026

Uh oh!

adamziel left a comment

Choose a reason for hiding this comment

Uh oh!

epeicher commented Jun 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

epeicher commented Jun 10, 2026 •

edited

Loading

wpmobilebot commented Jun 15, 2026 •

edited

Loading