feat: support multi-repo orchestration in polyrepo workspaces #516

Draft
loopyd wants to merge 26 commits into HenryLach:main from loopyd:feat/polyrepo

loopyd commented Apr 22, 2026

Summary

Add end-to-end polyrepo orchestration support so a single /orch run can plan, execute, merge, resume, and integrate work that spans multiple repositories in one workspace.

Type of Change

  • feat (new behavior)
  • fix (bug fix)
  • docs (documentation only)
  • refactor (no behavior change)
  • test (test-only changes)
  • chore (maintenance)

Changes

Added the supporting planner runtime, persistence, settings, supervisor, resume, and documentation updates needed to run polyrepo batches end-to-end. Improved submodule resolution in worktrees, improved the TUI UX for the merge and plan workflows, and added comprehensive test coverage to reduce fragility.

Validation

  • cd extensions && npx vitest run
  • Manual smoke test performed (if applicable)
  • taskplane doctor run (if config/CLI affected)
  • Focused validation run: cd extensions && node --test tests/engine-worker-thread.test.ts tests/merge-panel-widget.test.ts tests/orch-plan-widget.test.ts

Documentation

  • Docs updated for user-facing changes
  • Reference docs updated (commands/config/task format), if applicable
  • CHANGELOG.md updated (for release-relevant changes)

Checklist

  • Changes are scoped and focused
  • No secrets/private data introduced
  • Cross-links and paths in docs are valid
  • Breaking changes are clearly called out

Related Issues

Progress for #51

loopyd added 6 commits April 21, 2026 09:25
Consolidate workspace sync helpers into workspace.ts, centralize workspace sync types and messages, and add regression coverage for blocking and UI states.
Move blocking-summary helpers into messages.ts so the blocking regression test does not have to load the full extension module.
Render multiline /orch-plan results into a persistent widget and add focused coverage for widget line formatting.
@loopyd loopyd requested a review from HenryLach as a code owner April 22, 2026 04:25
@loopyd loopyd marked this pull request as draft April 22, 2026 06:46

loopyd commented Apr 22, 2026

Converted to draft while additional work is done. Regressions came up in testing and are now being addressed.

loopyd commented Apr 22, 2026

New work on this PR for #51

What this adds

  • Hardens polyrepo task routing and repo attribution across discovery, wave planning, execution, persistence, and resume flows.

  • Tightens cross-repo transactional merge handling so rollback-safe-stop behavior survives stale or degraded persisted metadata.

  • Extends user-visible/operator-visible coverage so paused resume outcomes and recovery warnings are visible in summary output and through the orch_resume tool surface.

  • Adds targeted integration coverage for segment frontier behavior, atomic merge rollback, resume safe-stop, degraded recovery, and resume alert rehydration.

Behavioral changes

  • Discovery/routing now preserves repo intent more explicitly, including repo-scoped file scope handling and downstream execution attribution.
  • Resume now forces rollback safe-stop when merge retry metadata indicates rollback failure, even if persisted state is stale or partially degraded.
  • Resume recovery degrades more safely when transaction files are missing but repo-level merge results survive.
  • Workspace-mode resume keeps repo-scoped attribution during degraded recovery instead of collapsing to a generic merge failure.
  • Batch summary output now preserves rollback safe-stop reasons and persistence warnings in the generated summary artifact.
  • The registered orch_resume tool is now covered behaviorally: it initializes through the real extension registration path, returns the async launch acknowledgement, propagates force, and rejects duplicate resume while the batch is still launching.

@loopyd loopyd mentioned this pull request Apr 22, 2026
2 tasks

loopyd commented Apr 22, 2026

Rolled Issue #517 into this PR, with additional tightening.

Runtime checkpointing is now protected against unsafe submodule state. A task that appears to succeed will no longer checkpoint if it leaves a submodule dirty or pointing at a local-only commit that is not reachable from a preferred remote. Instead of silently committing a risky superproject gitlink, the checkpoint step now fails the task explicitly and preserves the unsafe state for inspection.
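
For reviewers, here is a minimal sketch of the checkpoint guard described above. The `SubmoduleState` shape and the `evaluateCheckpoint` name are hypothetical and are not taken from the Taskplane source; only the rule itself (do not checkpoint when a submodule is dirty or points at a local-only commit) comes from this comment.

```typescript
// Hypothetical shape: one entry per submodule inspected after a task finishes.
interface SubmoduleState {
  path: string;
  dirty: boolean; // uncommitted changes inside the submodule
  headReachableOnRemote: boolean; // false when HEAD is a local-only commit
}

interface CheckpointDecision {
  ok: boolean;
  reasons: string[];
}

// Refuse to checkpoint if any submodule is dirty or points at a local-only commit.
function evaluateCheckpoint(states: SubmoduleState[]): CheckpointDecision {
  const reasons: string[] = [];
  for (const s of states) {
    if (s.dirty) reasons.push(`${s.path}: uncommitted changes`);
    if (!s.headReachableOnRemote) reasons.push(`${s.path}: commit not reachable from remote`);
  }
  return { ok: reasons.length === 0, reasons };
}
```

Returning the reasons rather than throwing keeps the unsafe state available for inspection, which is the behavior described above.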

Additional merge gap closures

Merge now has a defense-in-depth backstop for submodule refs. After a successful lane merge, Taskplane validates merged gitlinks and blocks branch advancement if a submodule commit would be unreachable for downstream clones. When that happens, it reuses the existing rollback path to reset back to the captured pre-merge head, records rollback status in the transaction record, and emits recovery guidance if rollback itself cannot complete cleanly.
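
The merge backstop can be pictured roughly as follows. This is a sketch under assumptions: `MergeRepo`, `TransactionRecord`, and `mergeWithGitlinkBackstop` are illustrative names, and a real implementation would shell out to git rather than use an injected facade.

```typescript
// Hypothetical repo facade; a real one would run git commands.
interface MergeRepo {
  head(): string;
  mergeLane(lane: string): void;
  resetHard(sha: string): void;
  findUnreachableGitlinks(): string[]; // submodule paths whose commit is unpublished
}

type MergeStatus = "merged" | "rolled-back" | "rollback-failed";

interface TransactionRecord {
  lane: string;
  preMergeHead: string;
  status: MergeStatus;
  unreachable: string[];
  guidance?: string;
}

function mergeWithGitlinkBackstop(repo: MergeRepo, lane: string): TransactionRecord {
  const preMergeHead = repo.head(); // captured first so rollback has a target
  repo.mergeLane(lane);
  const unreachable = repo.findUnreachableGitlinks();
  if (unreachable.length === 0) {
    return { lane, preMergeHead, status: "merged", unreachable };
  }
  try {
    repo.resetHard(preMergeHead); // reuse the rollback path
    return { lane, preMergeHead, status: "rolled-back", unreachable };
  } catch (err) {
    return {
      lane,
      preMergeHead,
      status: "rollback-failed",
      unreachable,
      guidance: `Manually reset to ${preMergeHead} before retrying the merge.`,
    };
  }
}
```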

Additional polyrepo pause/resume gap closures

Pause/resume persistence now carries the richer segment metadata needed for repo-scoped execution. Explicit segment DAG metadata and step-to-segment mappings are now serialized into persisted task records, schema-validated on load, and restored into reconstructed task stubs so resumed segment-scoped work retains the same repo ordering and step filtering without needing rediscovery.

Resume recovery is now more resilient to degraded persisted state. If task-level segmentIds are missing but persisted segment records still exist, resume can reconstruct ordered segment IDs from those records, rebuild the segment frontier, restore the next active segment, expand continuation rounds correctly, and compute the resume point from recovered segment state instead of trusting stale task-level status.
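
A simplified sketch of the degraded-state recovery follows. The record shapes, the `order` field, and the status values are assumptions made for illustration; the persisted schema in this PR may differ.

```typescript
// Hypothetical persisted shapes; field names are illustrative only.
interface PersistedSegment {
  segmentId: string;
  taskId: string;
  order: number; // assumed ordering key within a task
  status: "pending" | "running" | "succeeded" | "failed";
}

interface PersistedTask {
  taskId: string;
  segmentIds?: string[]; // may be missing in degraded state
}

// Prefer task-level IDs; otherwise rebuild them from surviving segment records.
function recoverSegmentIds(task: PersistedTask, segments: PersistedSegment[]): string[] {
  if (task.segmentIds && task.segmentIds.length > 0) return task.segmentIds;
  return segments
    .filter((s) => s.taskId === task.taskId)
    .sort((a, b) => a.order - b.order)
    .map((s) => s.segmentId);
}

// The next active segment is the first one that has not succeeded.
function nextActiveSegment(orderedIds: string[], segments: PersistedSegment[]): string | null {
  for (const id of orderedIds) {
    const record = segments.filter((s) => s.segmentId === id)[0];
    if (!record || record.status !== "succeeded") return id;
  }
  return null;
}
```

The resume point is derived from segment records here rather than from task-level status, which is the property the paragraph above describes.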

Expansion of test coverage

Test coverage was expanded to prove both the positive and negative paths. The commit adds a new post-execution submodule integration suite, extends merge rollback integration coverage for published versus unpublished submodule gitlinks, adds source-contract assertions that merge-time unreachable-gitlink validation is wired into the transactional rollback seam, and adds resume regressions for segment metadata round-tripping plus degraded-persistence frontier and resume-point recovery.

Minor housekeeping: tmp was added to the ignore rules because the behavioral tests produce it (the merge, pause, resume, and gitlink behavioral tests require partially mocked directories populated with synthetic test data).

loopyd commented Apr 22, 2026

Live testing in a fork of a production repo now begins. This repo has several submodules in various directories that require the cross-repo logic this PR adds. I will report back when these tests complete. If they succeed, this PR will be moved out of draft for review.

loopyd added 7 commits April 22, 2026 05:44
…heck

During checkpointing, the parent repo stages a gitlink change which
bleeds into every shared-worktree submodule as 'M <other-submodule-path>'
in git status --porcelain output. These are transient index-level
artifacts from advancing subproject pointers, not actual code changes
inside those submodules.

Previously only .pi/tasks/ artifact paths were filtered. Now also filter
any status lines that point to known submodule paths, treating them as
expected transient state during checkpointing rather than unsafe dirty work.
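
The filtering described in this commit could look roughly like the sketch below. The helper names are hypothetical, and it assumes untrimmed `git status --porcelain` (v1) lines, where two status characters and a space precede the path.

```typescript
// Hypothetical helper: extract the path from a porcelain v1 status line.
function statusLinePath(line: string): string {
  const rest = line.slice(3); // skip "XY " status prefix
  const arrow = rest.indexOf(" -> "); // renames are reported as "old -> new"
  const path = arrow >= 0 ? rest.slice(arrow + 4) : rest;
  return path.replace(/^"|"$/g, "").replace(/\/+$/, "");
}

// Drop task-artifact paths and lines that point at known submodule paths.
function filterTransientStatusLines(lines: string[], submodulePaths: string[]): string[] {
  const known = submodulePaths.map((p) => p.replace(/\/+$/, ""));
  return lines.filter((line) => {
    const path = statusLinePath(line);
    if (path.indexOf(".pi/tasks/") === 0) return false; // existing artifact filter
    if (known.indexOf(path) !== -1) return false; // transient gitlink bleed
    return true;
  });
}
```

Only exact submodule paths are dropped, so a genuinely modified file inside the current submodule still counts as dirty work.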
…heck

In merge worktrees, submodules may have stale local refs from the
branch's original checkout. Before checking if a gitlink commit is
reachable from origin (via ls-remote and merge-base), ensure remotes
are fresh with 'git fetch --all'. This prevents false-positive merge
failures when submodule commits exist on origin but weren't fetched
into the merge worktree's local refs.
- Clear .DONE files from all 180+ completed tasks
- Uncheck all STATUS.md checkboxes, reset status to Pending
- Remove .reviews directories (review artifacts)
- Reset Review Counter and Iteration counters to 0
- Update CONTEXT.md to reflect clean state
Replace isCommitReachableOnRemote with checkSubmoduleCommitReachable
that handles forked submodules in merge worktrees:

1. Fast path: direct ls-remote match against origin's HEAD (the tip of
   the remote's default branch, whatever that branch is named)
2. Named ref tips via ls-remote + merge-base ancestry check
3. Local tracking refs as fallback after fetch

Also fetch remotes before any reachability checks in both
detectUnreachableGitlinks and detectUnsafeSubmoduleStates so that
local refs are fresh even in stale merge worktrees.
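
The three tiers above can be sketched as follows. This is not the actual `checkSubmoduleCommitReachable` implementation: the `GitProbe` interface is a hypothetical stand-in for wrappers around `git ls-remote`, `git merge-base --is-ancestor`, and the local remote-tracking refs, injected so the ordering logic can be read on its own.

```typescript
interface RemoteRef {
  name: string;
  sha: string;
}

// Hypothetical probe; a real one would invoke git in the submodule directory.
interface GitProbe {
  lsRemote(): RemoteRef[]; // refs advertised by origin
  isAncestor(commit: string, tip: string): boolean; // merge-base ancestry check
  localTrackingTips(): string[]; // tips of refs/remotes/origin/* after fetch
}

type ReachableVia = "remote-head" | "remote-ref" | "local-tracking" | "none";

function checkReachableSketch(sha: string, git: GitProbe): ReachableVia {
  const refs = git.lsRemote();
  // 1. Fast path: the commit is exactly the remote HEAD.
  for (const ref of refs) {
    if (ref.name === "HEAD" && ref.sha === sha) return "remote-head";
  }
  // 2. Named ref tips: exact match or ancestor of an advertised tip.
  for (const ref of refs) {
    if (ref.name === "HEAD") continue;
    if (ref.sha === sha || git.isAncestor(sha, ref.sha)) return "remote-ref";
  }
  // 3. Fallback: local tracking refs refreshed by the preceding fetch.
  for (const tip of git.localTrackingTips()) {
    if (tip === sha || git.isAncestor(sha, tip)) return "local-tracking";
  }
  return "none";
}
```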
…tignore respect

Layer 1 (isArtifactStatusLine): Segment matching catches nested __pycache__/
paths like scripts/__pycache__/ in bof3-disk, plus *.pyc, node_modules, build dirs.

Layer 2 (filterGitIgnoredStatusLines): Uses 'git check-ignore --no-index' to
respect ALL .gitignore rules at every level in submodule trees — recursive.

Pipeline: filterGitIgnoredStatusLines → isArtifactStatusLine → checkpoint decision
This handles bof3-disk's scripts/__pycache__/ even though it has no __pycache__
in its root .gitignore.
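
A minimal sketch of the two-layer pipeline, under stated assumptions: the artifact lists are illustrative (drawn from names mentioned in these commit messages), `isGitIgnored` is an injected stand-in for the `git check-ignore` layer, and the function names are hypothetical rather than the PR's own.

```typescript
// Directory names and suffixes treated as transient artifacts (illustrative list).
const ARTIFACT_DIRS = ["__pycache__", ".pytest_cache", ".mypy_cache", "node_modules", "build", "dist"];
const ARTIFACT_SUFFIXES = [".pyc", ".egg-info"];

// Layer 1 stand-in: match whole path segments so nested directories are caught.
function isArtifactPath(path: string): boolean {
  return path.split("/").some((segment) => {
    if (ARTIFACT_DIRS.indexOf(segment) !== -1) return true;
    return ARTIFACT_SUFFIXES.some((suffix) => segment.slice(-suffix.length) === suffix);
  });
}

// Pipeline: drop ignored paths, then drop artifacts; what remains blocks the checkpoint.
function blockingPaths(paths: string[], isGitIgnored: (path: string) => boolean): string[] {
  return paths.filter((p) => !isGitIgnored(p)).filter((p) => !isArtifactPath(p));
}
```

Matching whole segments rather than substrings means a directory such as `rebuild/` is not mistaken for `build/`.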
loopyd added 5 commits April 22, 2026 14:38
When a worker completes a task that was already done by a prior batch run,
STATUS.md shows all checkboxes checked but no new progress is made. The
onPrematureExit callback previously escalated this as 'no progress' → stall
timeout → failure after 3 iterations.

Added check: if totalChecked >= totalSteps across all steps, let the worker
exit normally. This fixes tasks like DCMP-004 and INV-002 that were completed
by prior batch runs.
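
The added check amounts to something like the sketch below. `StepProgress` and `classifyWorkerExit` are hypothetical names, and the guard against an empty step list is an assumption of this sketch rather than something stated in the commit.

```typescript
// Hypothetical per-step checkbox counts parsed from STATUS.md.
interface StepProgress {
  checked: number;
  total: number;
}

type ExitClassification = "complete" | "no-progress";

// Classify a worker exit that produced no new progress in this iteration.
function classifyWorkerExit(steps: StepProgress[]): ExitClassification {
  let totalChecked = 0;
  let totalSteps = 0;
  for (const step of steps) {
    totalChecked += step.checked;
    totalSteps += step.total;
  }
  // Assumption: zero of zero is not treated as done.
  if (totalSteps > 0 && totalChecked >= totalSteps) return "complete";
  return "no-progress";
}
```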
The earlier edit accidentally inserted garbage text (}$/i.test(sha)),)
into the middle of a .filter() call, causing ParseError on load.
Removed the 7-line corrupt block that included duplicate code.
…rch_resume calls

After batch reset, check phase and wave state before resuming.
Do not call orch_resume() repeatedly if the batch is already in a valid
executing state with running tasks.

Changes:
- Added Step 1: Check current batch state before resetting
- Added Step 6: Resume batch with correct state logic
- Updated Notes: Key fix for redundant resume calls
- Add state-aware resume logic to prevent redundant orch_resume calls
- Fix artifact status line filtering for clean segment matching
- Improve submodule dirty state filtering
- Update tsconfig and execution routing
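
The state-aware resume rule can be sketched as a small predicate. The `BatchSnapshot` shape and `shouldCallResume` name are hypothetical; only the "executing" phase with running tasks comes from the commit text.

```typescript
// Hypothetical snapshot of persisted batch state.
interface BatchSnapshot {
  phase: string;
  runningTasks: number;
}

// Skip the resume call when the batch is already executing with live tasks.
function shouldCallResume(batch: BatchSnapshot): boolean {
  const alreadyRunning = batch.phase === "executing" && batch.runningTasks > 0;
  return !alreadyRunning;
}
```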
Add robust fallback for checkSubmoduleCommitReachable when run from
merge worktrees. The issue was that merge-base could fail due to stale
local refs even when the commit exists on origin, causing false-positive
gitlink validation failures.

Fix adds two fallback checks:
1. cat-file -e to verify the commit object exists locally
2. Re-check ls-remote HEAD as a last resort if merge-base fails

This resolves the persistent 'Post-merge submodule gitlink validation
failed in lane 2' error that was occurring in consecutive clean-state
batches.

Related: TP-037 (submodule gitlink bugfix loop)
loopyd added 4 commits April 22, 2026 19:27
…e aggressively

When merge-base fails due to stale local refs in merge worktrees,
check if the commit appears anywhere in ls-remote output (even if
not exactly matched to HEAD). This handles cases where the commit
is on a branch/tag tip but not at HEAD.
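
A small sketch of the "appears anywhere in ls-remote output" check. It relies on the `git ls-remote` output format of one `<sha>` and `<ref>` pair per line, separated by a tab; the function name is hypothetical.

```typescript
// Returns true when `sha` is the tip of any ref listed in `git ls-remote` output.
function remoteTipsContain(lsRemoteOutput: string, sha: string): boolean {
  const wanted = sha.trim().toLowerCase();
  if (wanted.length === 0) return false;
  return lsRemoteOutput.split("\n").some((line) => {
    const tab = line.indexOf("\t");
    const tipSha = (tab >= 0 ? line.slice(0, tab) : line).trim().toLowerCase();
    return tipSha === wanted;
  });
}
```

This only proves reachability when the commit is exactly a branch or tag tip. A commit buried in history still needs the ancestry check.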
Add filtering for __pycache__, .pytest_cache, .mypy_cache, node_modules,
build/, dist/, .pyc, and .egg-info paths. These are transient Python
build artifacts that should not block checkpointing when detected as
uncommitted changes in submodules.
When checking if paths are ignored by .gitignore from within a submodule,
also check the superproject's root .gitignore. This ensures that patterns
like __pycache__/ defined in the root are properly applied to paths within
submodules, preventing false positives during unsafe submodule detection.
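
The two-level lookup might be sketched as below. `IgnoreCheck` is a hypothetical injected predicate standing in for whatever ignore query is run in each repository, and the function and parameter names are illustrative.

```typescript
// Hypothetical predicate: true when `relPath` is ignored by the rules of the repo at `repoRoot`.
type IgnoreCheck = (repoRoot: string, relPath: string) => boolean;

// Check the submodule's own rules first, then the superproject's rules.
function isIgnoredAtAnyLevel(
  superRoot: string,
  submodulePath: string,
  pathInSubmodule: string,
  check: IgnoreCheck,
): boolean {
  const submoduleRoot = `${superRoot}/${submodulePath}`;
  if (check(submoduleRoot, pathInSubmodule)) return true;
  // Re-express the path relative to the superproject before asking again.
  return check(superRoot, `${submodulePath}/${pathInSubmodule}`);
}
```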

References:
- https://git-scm.com/docs/gitignore
- GitHub docs on .gitignore resolution in submodules

loopyd commented Apr 23, 2026

Needs work. I ran out of time/AI quota to work on this, so I am going to pass on it for now.
