two-stage-terraform-pipeline#27

Merged
jameswillis99 merged 13 commits into master from two-stage-terraform on Nov 3, 2025

Conversation

@jameswillis99
Collaborator

No description provided.

@jameswillis99 jameswillis99 merged commit cb9070f into master Nov 3, 2025
1 check passed
@jameswillis99 jameswillis99 deleted the two-stage-terraform branch November 3, 2025 12:02
field123 added a commit that referenced this pull request Feb 27, 2026
…ved, 21 new tests

- Fix #22: resolveComponentUuid exact matching (matchEntityName instead of .includes)
- Fix #19: API client session state leak (clearSessionState on project.set)
- Fix #27: undo stack bounded to MAX_UNDO_DEPTH=50 (oldest dropped on overflow)
- Fix #24: MODEL_PRICING versioned IDs (claude-sonnet-4, claude-haiku-4, claude-opus-4)
- Fix #26: reporter test coverage (saveReport, printSummary, loadOverrides, saveOverride)

21 new tests across 4 files. All 1402 unit tests pass. Build and typecheck clean.
field123 added a commit that referenced this pull request Feb 27, 2026
…ng (#150)

* chore: plans

* chore: reprioritize plan — dev host variant sync is now P10

Renumbered P10-P14: Dev Host Variant Sync (was P14) promoted to
P10 top priority. Eval runner robustness, grader quality, scenario
coverage, and infrastructure shifted to P11-P14.

* feat: add dev host variant sync (P10) — automatic code component variant discovery

The Plasmic persisted project bundle does not contain code component variant
data (e.g., "Selected", "Disabled" states). Studio gets this by connecting
to the dev host via iframe, but the MCP has no browser. This adds an HTTP-based
sync that fetches variant metadata from a running dev host on project.set and
project.refresh.

New package: @elasticpath/plasmic-registry
- Reads globalThis.__PlasmicComponentRegistry (same global @plasmicapp/host writes to)
- Serializes component metadata, stripping functions and React elements
- Zero runtime dependencies, works in Node.js and browser
- 21 unit tests (serialize + read-registry)
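
A minimal sketch of the "strip functions and React elements" step, for illustration only. It assumes React elements can be recognized by their `$$typeof` marker; `serializeMeta` and `isReactElement` are hypothetical names, not the package's actual exports, and the sketch does not handle cyclic objects.

```typescript
// JSON-safe value type used for the serialized output.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Assumption: React elements are plain objects tagged with a `$$typeof` key.
function isReactElement(value: unknown): boolean {
  return typeof value === "object" && value !== null && "$$typeof" in value;
}

// Hypothetical helper: returns a JSON-safe copy, dropping anything that
// cannot cross an HTTP boundary (functions, symbols, React elements).
function serializeMeta(value: unknown): Json | undefined {
  if (value === null) return null;
  if (typeof value === "string" || typeof value === "number" || typeof value === "boolean") {
    return value;
  }
  if (isReactElement(value)) return undefined;
  if (Array.isArray(value)) {
    return value.map((item) => serializeMeta(item)).filter((v): v is Json => v !== undefined);
  }
  if (typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      const serialized = serializeMeta(child);
      if (serialized !== undefined) out[key] = serialized;
    }
    return out;
  }
  // Functions, symbols, undefined, and bigint are dropped.
  return undefined;
}
```

Dropping unsupported values instead of throwing keeps one badly registered component from breaking the whole response.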

Dev host API route: plasmicpkgs-dev/app/api/plasmic-registry
- Server-compatible registration file (plasmic-init-server.ts)
- GET /api/plasmic-registry returns full serialized component metadata

MCP sync module: packages/plasmic-mcp/src/devhost-sync.ts
- fetchDevHostRegistry(): HTTP fetch with 5s timeout, non-fatal on failure
- syncVariantMetadata(): populates codeComponentMeta.variants on matching CCs
- ensureVariantObjects(): creates Variant objects on wrapper components
- syncFromDevHost(): orchestrator called from project.set and project.refresh
- Flexible $dev suffix matching for component names
- 22 unit tests covering fetch, sync, variant creation, and full flow

Session/types extended with hostUrl, devHostSynced, syncedVariantComponents.

* feat: complete P11 eval scenario coverage expansion (70 → 135 scenarios)

Add 65 new eval scenarios covering all 103 MCP actions (previously 38/103).
Coverage now ~98% across all 8 STRAP domains:

- P11.1: 11 node scenarios (remove, move, clone, reorder, rich-text, attrs, visibility, image, detach-mixin, add/remove-animation)
- P11.2: 16 design scenarios (remove/duplicate-token, list/update/remove-mixin, list/update/remove-animation, list/create/update/remove-theme, set-active-theme, list/rename/remove-asset)
- P11.3: 12 data scenarios (update/remove-query, list/create/update/remove-data-token, list/create/update/remove-split, get-code-meta, list-functions)
- P11.4: 10 component scenarios (delete, convert-to-page, convert-to-component, list/add/update/remove-prop, list/add/remove-state)
- P11.5: 8 variant scenarios (list/create/add/remove-global-group, rename-global, update-screen, rename, remove)
- P11.6: 5 inspect scenarios (subtree, export, style-properties, preview-url, page-meta)
- P11.7: 2 interaction scenarios (update, remove)
- P11.8: 1 project scenario (set)
- P11.9: INDEX.md regenerated (66 simple + 50 medium + 19 complex)

* feat: complete P12 eval runner robustness — 8 bug fixes preventing hangs, data loss, and misleading results

P12.1: Tool execution timeout — onToolCall wrapped in Promise.race with
remaining wall-clock countdown. Hanging Plasmic API calls no longer block
the entire eval process indefinitely.
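
The timeout pattern described in P12.1 can be sketched roughly as follows. `withTimeout` is a hypothetical helper name and the error message is illustrative; only the use of `Promise.race` against a remaining-time budget comes from the commit message.

```typescript
// Hypothetical helper: rejects if `work` does not settle within `remainingMs`.
async function withTimeout<T>(work: Promise<T>, remainingMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`tool call timed out after ${remainingMs}ms`)),
      remainingMs,
    );
  });
  try {
    // Whichever promise settles first wins the race.
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so a finished call does not keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Note that `Promise.race` does not cancel the losing promise; the hanging call keeps running in the background, it just no longer blocks the caller.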

P12.2: saveReport fallback — writeFileSync wrapped in try/catch with
stderr JSON dump. Disk-full or permission errors no longer silently lose
all eval results.
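
As an illustration of the P12.2 fallback, a sketch under the assumption that the report is JSON-serializable. `saveReportSafely` is a hypothetical name and the real `saveReport` signature may differ.

```typescript
import { writeFileSync } from "node:fs";

// Hypothetical helper: returns true when the report reached disk,
// false when it was dumped to stderr instead.
function saveReportSafely(path: string, report: unknown): boolean {
  const json = JSON.stringify(report, null, 2);
  try {
    writeFileSync(path, json, "utf8");
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    // Fallback: emit the full report on stderr so the results are not lost.
    console.error(`Could not write ${path}: ${reason}`);
    console.error(json);
    return false;
  }
}
```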

P12.3: Visual capture wall-clock cap — capture() wrapped in 30s
Promise.race. Inner logic extracted to captureInner() so the timeout
covers navigation + screenshot + mobile capture end-to-end.

P12.4: InMemoryTransport server cleanup — server.close() called before
nulling in close(). MCP server and transport resources no longer leak
across scenarios.

P12.5: MAX_TURNS exhaustion flag — new maxTurnsExhausted boolean on
ConversationResult. Runner checks it alongside timedOut/incomplete. The
25-turn limit exit no longer silently looks like success.

P12.6: Playwright tracing unbounded growth — new resetTracing() method
stops and restarts tracing after each successful capture. Trace buffer
no longer grows for the entire eval run.

P12.7: console.error suppression leak — dynamic imports and server
creation wrapped in try/finally. console.error is always restored even
when initialization throws.
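
The restore-in-`finally` idea from P12.7 looks roughly like this. The sketch is synchronous for brevity; the real code wraps dynamic imports and server creation, so an actual version would presumably `await` inside the `try`. `withSilencedConsoleError` is a hypothetical name.

```typescript
// Hypothetical helper: silences console.error while `init` runs and
// restores it afterwards, even when `init` throws.
function withSilencedConsoleError<T>(init: () => T): T {
  const original = console.error;
  console.error = () => {};
  try {
    return init();
  } finally {
    console.error = original;
  }
}
```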

P12.8: desktopPath on screenshot failure — desktopPath initialized as
null in the nav-failure path, only set after successful screenshot write.
The LLM judge no longer reads nonexistent files.

5 new tests: tool call timeout, MAX_TURNS exhaustion (3 cases), runner
maxTurnsExhausted handling. Total: 1317 unit + 137 integration = 1454.

* feat: complete P13 eval grader quality fixes — 9 bugs preventing reliable eval results

P13.1: Property grader coerces style values to String() before .toLowerCase() — numeric values (e.g., lineHeight: 1.5) no longer throw TypeError.

P13.2: Existence grader uses exact name matching by default (params.exact !== false) across all entity types (component, page, node, token, variant, mixin). Prevents false positives where "Card" would incorrectly match "CreditCard".
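
A small sketch of the exact-by-default rule. `ExistenceParams` and `nameMatches` are hypothetical; only the `params.exact !== false` default comes from the commit message.

```typescript
// Hypothetical parameter shape for an existence check.
interface ExistenceParams {
  name: string;
  exact?: boolean;
}

function nameMatches(candidate: string, params: ExistenceParams): boolean {
  // Exact unless the scenario explicitly opts out with `exact: false`.
  const exact = params.exact !== false;
  return exact ? candidate === params.name : candidate.includes(params.name);
}
```

With this rule, "CreditCard" no longer satisfies a check for "Card" unless the scenario sets `exact: false`.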

P13.3: Existence grader searches only the relevant list — entityType "page" searches data.pages only, "component" searches data.components only. Prevents a component named "Contact" from satisfying a page existence check.

P13.4: Tool-params grader uses exact string matching by default. Pass substring: true for cases where substring matching is intentional. Prevents "red" from matching "bordered".

P13.5: LLM judge wraps readFileSync in try/catch — race conditions or permission errors no longer crash the judge.

P13.6: LLM judge score regex uses \d+ instead of \d — "SCORE: 10" is now correctly rejected instead of being parsed as score 1.
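
To illustrate why `\d+` matters here, a sketch that assumes a 1 to 5 score scale (the actual range is not stated in this PR). `parseScore` is a hypothetical name.

```typescript
// Hypothetical parser: returns the score, or null when it is missing
// or outside the assumed 1..maxScore range.
function parseScore(response: string, maxScore: number = 5): number | null {
  // `\d+` captures the whole number, so "10" is read as 10, not as 1.
  const match = /SCORE:\s*(\d+)/.exec(response);
  if (!match) return null;
  const score = Number(match[1]);
  return score >= 1 && score <= maxScore ? score : null;
}
```

With a single `\d`, the same input "SCORE: 10" would capture only "1" and pass the range check as a valid low score.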

P13.7: Review-flags always adds low-quality when qualityScore <= 2, independently of judge-disagrees. Before this fix, success=true + score=1 would get only judge-disagrees but never low-quality.

P13.8: loadPreviousReport validates Array.isArray(report.scenarios) — a report with missing or non-array scenarios no longer crashes applyReviewFlags.

P13.9: Data grader validates name, queryType, and event — not just count. A scenario asking for "add a REST query named fetchUsers" now validates the specific query exists.

New eval-llm-judge.test.ts (16 tests) for parseJudgeResponse and formatTranscriptForJudge. Total: 1356 unit tests across 29 suites.

* feat: complete P14 eval infrastructure improvements — 7 items for reliable eval runs

P14.1: Partial re-run report merging — skipped-as-passed results from previous
report are merged into newly-run results, giving accurate full-suite success rates.

P14.2: Dirty-tree detection — getGitSha() appends "-dirty" when working tree has
uncommitted changes, preventing incorrect skip logic on dirty trees.

P14.3: Scenario content hashing — SHA256 hash of scenario content stored in reports.
Resume/skip compares hashes so modified scenarios are re-run, not skipped.
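
A minimal sketch of content hashing for skip decisions, using Node's built-in `crypto` module. `hashScenario` and `shouldSkip` are hypothetical names, and the real runner likely combines the hash with other checks such as the git SHA.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the scenario text, hex-encoded.
function hashScenario(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Skip only when a previous hash exists and the content is unchanged.
function shouldSkip(previousHash: string | undefined, content: string): boolean {
  return previousHash !== undefined && previousHash === hashScenario(content);
}
```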

P14.4: CLI argument validation — --tier validated against known values with clear
error; unrecognized flags produce warnings instead of being silently ignored.

P14.5: Scenario validator now loads both mock and integration scenarios, closing the
gap where integration-only scenarios were never validated by eval:validate.

P14.6: Regression detection flag — "regression" review flag fires when a scenario
was passing in the previous run but now fails.

P14.7: High-retry-count flag — "high-retries" review flag fires when a passing
scenario required >3 error-retry cycles, surfacing fragile scenarios.
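
The two flags from P14.6 and P14.7 could be computed along these lines. The `ScenarioOutcome` shape and the `infraReviewFlags` name are assumptions for illustration; only the flag names and the ">3 retries" threshold come from the commit message.

```typescript
// Hypothetical per-scenario result shape.
interface ScenarioOutcome {
  success: boolean;
  errorRetries: number;
}

const HIGH_RETRY_THRESHOLD = 3;

function infraReviewFlags(current: ScenarioOutcome, previous?: ScenarioOutcome): string[] {
  const flags: string[] = [];
  // Passing before, failing now.
  if (previous?.success === true && !current.success) flags.push("regression");
  // Passing, but only after more than three error-retry cycles.
  if (current.success && current.errorRetries > HIGH_RETRY_THRESHOLD) flags.push("high-retries");
  return flags;
}
```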

25 new tests added across 3 test files. All 1381 unit tests pass.

* feat: complete P15 remaining bug fixes and test gaps — 5 issues resolved, 21 new tests

- Fix #22: resolveComponentUuid exact matching (matchEntityName instead of .includes)
- Fix #19: API client session state leak (clearSessionState on project.set)
- Fix #27: undo stack bounded to MAX_UNDO_DEPTH=50 (oldest dropped on overflow)
- Fix #24: MODEL_PRICING versioned IDs (claude-sonnet-4, claude-haiku-4, claude-opus-4)
- Fix #26: reporter test coverage (saveReport, printSummary, loadOverrides, saveOverride)

21 new tests across 4 files. All 1402 unit tests pass. Build and typecheck clean.
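
For the bounded undo stack (#27), a sketch of the drop-oldest behavior. `BoundedUndoStack` is a hypothetical class; only `MAX_UNDO_DEPTH=50` and the overflow policy come from the commit message.

```typescript
const MAX_UNDO_DEPTH = 50;

// Hypothetical container: keeps at most MAX_UNDO_DEPTH entries.
class BoundedUndoStack<T> {
  private readonly entries: T[] = [];

  push(entry: T): void {
    this.entries.push(entry);
    // On overflow, drop the oldest entry (front of the array).
    if (this.entries.length > MAX_UNDO_DEPTH) this.entries.shift();
  }

  pop(): T | undefined {
    return this.entries.pop();
  }

  get depth(): number {
    return this.entries.length;
  }
}
```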

* feat: add devhost-sync integration tests and fix typeTag detection bug

findWrapperComponents() checked _type but real WAB model instances use
a typeTag getter. Changed to typeTag ?? _type fallback pattern (matching
edit-tools.ts convention). Added 10 integration tests against real WAB
model classes verifying syncVariantMetadata, ensureVariantObjects,
listVariants, and resolveVariant work with MobX-observed instances.
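
The `typeTag ?? _type` fallback can be illustrated as below. `ModelNode`, `getTypeName`, and the fake class are stand-ins for this sketch and are not the real WAB model types.

```typescript
// Assumption: real model instances expose `typeTag` via a getter,
// while plain fixture objects carry a `_type` field.
interface ModelNode {
  typeTag?: string;
  _type?: string;
}

function getTypeName(node: ModelNode): string | undefined {
  // Prefer the getter; fall back to the plain field.
  return node.typeTag ?? node._type;
}

// Stand-in for a class-based model instance.
class FakeTplComponent {
  get typeTag(): string {
    return "TplComponent";
  }
}
```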

* feat: add cycle guard to node-resolver and .env.example for eval system

flattenWithPaths now tracks visited UUIDs with a Set and enforces a
MAX_TREE_DEPTH=200 limit, preventing infinite recursion on malformed
or corrupted WAB models (#23). 5 new tests cover self-referencing nodes,
multi-node cycles, diamond graphs, and UUID-less node traversal.
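
A sketch of a visited-set plus depth-limit guard. The `TreeNode` shape and path format are assumptions, and this version emits a node reachable by two routes only once, which may differ from how the real `flattenWithPaths` treats diamond graphs.

```typescript
const MAX_TREE_DEPTH = 200;

// Hypothetical node shape; `uuid` may be absent.
interface TreeNode {
  uuid?: string;
  name: string;
  children?: TreeNode[];
}

function flattenWithPaths(root: TreeNode): string[] {
  const paths: string[] = [];
  const visited = new Set<string>();

  const walk = (node: TreeNode, path: string, depth: number): void => {
    // The depth limit bounds recursion for nodes that have no UUID to track.
    if (depth > MAX_TREE_DEPTH) return;
    if (node.uuid !== undefined) {
      if (visited.has(node.uuid)) return; // already seen: cycle or shared node
      visited.add(node.uuid);
    }
    paths.push(path);
    for (const child of node.children ?? []) {
      walk(child, `${path}/${child.name}`, depth + 1);
    }
  };

  walk(root, root.name, 0);
  return paths;
}
```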

Created evals/.env.example documenting all required and optional
environment variables for the eval system (#25), improving onboarding
for new developers who previously had to read cli.ts source code.

All 1407 unit + 147 integration tests pass.

* feat: fix dev host wiring and add missing test coverage (P18)

Fix @elasticpath/plasmic-registry not in Yarn workspaces — the API route
import would fail at runtime with module-not-found. Add package to root
workspaces and plasmicpkgs-dev dependencies.

Add fetchDevHostRegistry() timeout (AbortError) unit test. Add 6 API
route handler tests in plasmicpkgs-dev covering response shape, variant
data, serialization safety, empty registry, and error handling.

* feat: add dev host sync docs to skill files and fix server registration gap (P19)

Spec-vs-implementation audit found skill docs had zero mention of dev host
sync, preventing users from discovering CC variant styling prerequisites.
Also fixed plasmic-init-server.ts missing registerShopify that client had.

- plasmic.md: project.set/refresh docs + new Dev Host Variant Sync section
- plasmic-edit.md: variant workflow guidance for CC variant troubleshooting
- plasmic-inspect.md: variant.list note about dev host sync requirement
- plasmic-init-server.ts: add registerShopify matching client registration
- README.md: expand project.refresh re-sync behavior documentation

* feat: add 62 server handler tests covering 48 previously-untested actions (P20)

Server handler test coverage was at ~37% (161/223 tests covering
~55/103 actions). Added 62 new tests across all 8 STRAP domains:
component props/states (9 actions), node (7 actions), design (17 actions),
data (9 actions), interaction (4 actions), variant globals (5 actions).

Also added 47 missing edit-tools mock declarations, devhost-sync mock
module, and fixed syncFromDevHost mock to return proper SyncResult shape.
All 1617 tests pass across 31 suites.

* feat: harden error handling across MCP server (P21)

Replace err:any with err:unknown type-narrowing in 16 catch blocks
across server.ts, api-client.ts, save-manager.ts, and edit-tools.ts.
Non-Error thrown values now produce readable messages instead of
"undefined". Add undo-manager rollback when save fails after applying
in-memory undo. Fix session recovery after failed reload in
create-page/create/clone handlers. Distinguish ENOENT from JSON
parse errors in auth file reading. Add process-level catch on main().
6 new tests. All 1623 tests pass.
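
The `err: unknown` narrowing pattern might look like this. `describeError` is a hypothetical helper name, not one taken from the codebase.

```typescript
// Hypothetical helper: turns any thrown value into a readable message.
function describeError(err: unknown): string {
  if (err instanceof Error) return err.message;
  if (typeof err === "string") return err;
  try {
    // Plain objects such as { code: 42 } become readable JSON.
    return JSON.stringify(err) ?? String(err);
  } catch {
    // JSON.stringify can throw on cyclic values or BigInt.
    return String(err);
  }
}
```

With `err: any`, reading `err.message` on a thrown string or plain object yields `undefined`; narrowing from `unknown` forces each case to be handled explicitly.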

* feat: add CC variant test coverage and fix spec naming (P22)

Spec audit found 3 gaps between the dev host variant sync spec and
implementation. Fixed spec field name (syncedComponents →
syncedVariantComponents) to match code. Added 2 unit tests verifying
the full updateStyles → resolveVariant → CC variant path by key name
and display name. Added 2 integration tests using synthetic wrapper
components to exercise ensureVariantObjects creation and idempotency
where the fixture lacks TplComponent-rooted wrappers. All 1478 unit +
149 integration tests pass.

* feat: harden server with session preservation, null guards, validation, and security (P23)

Session state preservation: create-page/create/clone model reload now calls
syncFromDevHost and passes hostUrl/devHostSynced/syncedVariantComponents to
setSession, preventing silent loss of dev host variant data after component
creation.

Null-safe revision handling: project.undo and project.end-batch now use
optional chaining on result.save?.revisionNum, preventing TypeError when
save result is null.

Input validation: Five update handlers (update-mixin, update-animation,
update-data-token, update-split, interaction.update) now require at least
one updatable field. component.extract now validates non-empty name.

Security: inspect.export sanitizes componentUuid to [a-zA-Z0-9_-] before
constructing temp file path, preventing path traversal.
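
A sketch of the allow-list sanitization. The character class comes from the commit message; `sanitizeId`, `exportPath`, and the `plasmic-export-` file name are hypothetical.

```typescript
import { tmpdir } from "node:os";
import { join } from "node:path";

// Keep only characters from the allow-list [a-zA-Z0-9_-].
function sanitizeId(id: string): string {
  return id.replace(/[^a-zA-Z0-9_-]/g, "");
}

// Hypothetical temp-file naming; the real file name may differ.
function exportPath(componentUuid: string): string {
  return join(tmpdir(), `plasmic-export-${sanitizeId(componentUuid)}.json`);
}
```

Because `.` and `/` are outside the allow-list, an input such as `../../etc/passwd` collapses to `etcpasswd` and cannot escape the temp directory.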

Response consistency: component.clone error uses JSON format, convert-to-page
and convert-to-component include message field, variant.list wraps result with
componentUuid/componentName, data.list-queries uses standard error format.

16 new tests. All 1494 unit + 149 integration tests pass.

* feat: code quality and safety hardening across MCP server (P24)

Fix 13 bugs found via comprehensive audit: removeToken only replacing
first token occurrence, isAncestorOf missing slot traversal allowing
cycles, setImage crash on empty vsettings and CSS injection via
unescaped URLs, dead code unreachable non-serializable detection,
missing parameter validation in node.add/set-image/update-token,
missing requireSession in end-batch/undo, obsolete tool name
references across 6 files, deriveLayoutType ignoring reverse flex
directions, and devhost-sync crash on malformed variant data.
12 new tests. All 1655 tests pass.

* feat: rename to @elasticpath/plasmic-mcp-registry with full five-registry support (P25)

Renames packages/plasmic-registry to packages/plasmic-mcp-registry and
adds readers for all five Plasmic globalThis registries: components,
contexts, functions, tokens, and traits.

New serializers (serializeContextMeta, serializeFunctionMeta) strip
non-serializable fields via JSON roundtrip. New readers (getContextRegistry,
getFunctionRegistry, getTokenRegistry, getTraitRegistry) follow the same
defensive pattern as the existing component reader. getFullRegistry()
returns all five in one call.

Adds withPlasmicRegistry() Next.js config wrapper that auto-detects
Plasmic packages and adds them to serverExternalPackages to prevent
RSC boundary errors.

Updates plasmicpkgs-dev route to call getFullRegistry() and return the
full { components, contexts, functions, tokens, traits } response shape.

75 tests in plasmic-mcp-registry (54 new), 6 in plasmicpkgs-dev (updated),
1655 in plasmic-mcp (all passing).

* feat: add FullRegistryResponse parsing, TTL cache, and typeTag bug fix (P26)

Parse all five Plasmic registries (components, contexts, functions, tokens,
traits) in fetchDevHostRegistry with backward compatibility for old endpoints.

Add in-memory cache with 60s default TTL (configurable via
PLASMIC_REGISTRY_CACHE_TTL_MS) to avoid redundant fetches across 5 call
sites. Cache is cleared explicitly on project.refresh via clearRegistryCache().
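
A single-entry TTL cache along these lines would cover the behavior described. `TtlCache` is a hypothetical class with an injectable clock for testing; reading `PLASMIC_REGISTRY_CACHE_TTL_MS` is left to the caller in this sketch.

```typescript
// Hypothetical single-entry cache with a time-to-live.
class TtlCache<T> {
  private entry: { value: T; expiresAt: number } | null = null;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number = 60000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): T | undefined {
    // Expired or empty entries are treated as a cache miss.
    if (this.entry === null || this.now() >= this.entry.expiresAt) return undefined;
    return this.entry.value;
  }

  set(value: T): void {
    this.entry = { value, expiresAt: this.now() + this.ttlMs };
  }

  // Explicit invalidation, e.g. on a refresh action.
  clear(): void {
    this.entry = null;
  }
}
```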

Fix getCodeComponentVariantMetas to use typeTag ?? _type pattern, matching
devhost-sync.ts, so variant resolution works on real WAB model instances.

Wrap plasmicpkgs-dev/next.config.js with withPlasmicRegistry() to auto-detect
and externalize Plasmic packages for serverExternalPackages.

* feat: enrich MCP tools with dev host registry data (P27)

Store full registry data (contexts, functions, tokens, traits) in the
session after each dev host sync, making it available to MCP tool
handlers. Previously the data was fetched but discarded after variant
sync.

Enrichments:
- design.list-tokens: includes devHostTokens from registered packages
- data.list-functions: includes devHostFunctions from registered packages
- project.set/refresh: includes devHostRegistry summary with counts
  for contexts, functions, tokens, and traits

Also marks P5 defensive JSON handling as complete (already implemented).

* feat: apply registry defaultStyles and parentComponentName validation on node.add (P28)

Fix registryData being silently dropped from session after component.create-page,
component.create, and component.clone operations. All three handlers now preserve
registryData in their setSession calls, matching the existing correct pattern in
project.set and project.refresh.

When adding code component instances via node.add, the MCP server now looks up the
component in session.registryData to apply defaultStyles (e.g. width, padding) and
validate parentComponentName constraints. Warnings are surfaced non-fatally in the
JSON response. This enables the AI model to create properly styled component
instances and receive guidance about component placement rules.

* feat: populate slot defaultValue from registry and fix registerShopify asymmetry (P29)

When plasmicElementToTpl creates a TplComponent, it now iterates registry
props for slot-type entries with defaultValue. For each slot without
explicit content, defaultValue PlasmicElement trees are recursively
converted to TplNodes and wired as Arg+RenderExpr in the base variant
setting. This ensures components like Button or Card render with
meaningful placeholder content out of the box.

Also fixes missing registerShopify(PLASMIC) call in plasmic-init-client.tsx
that was imported but never invoked, causing Shopify components to appear
in the registry API but not in the canvas host.

5 new tests for slot defaultValue population covering: basic population,
named slots, explicit children priority, missing model slots, and
non-slot prop handling.

* fix: resolve TypeScript strict type errors in devHostTokens/devHostFunctions enrichment (P30)

Replace Record<string, unknown> type widening with spread pattern in
design.list-tokens and data.list-functions handlers. The interface
ListCustomFunctionsResult lacks an index signature, causing tsc
--noEmit to fail. Both handlers now build enriched results via
object spread, preserving type safety while allowing optional
devHost* fields.
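
The spread pattern can be sketched as follows. The interface bodies here are invented for illustration (the real `ListCustomFunctionsResult` is not shown in this PR), and `enrichWithDevHost` is a hypothetical name.

```typescript
// Hypothetical shapes; note there is no index signature on the result type.
interface ListCustomFunctionsResult {
  functions: string[];
}

interface DevHostFunction {
  name: string;
}

type EnrichedResult = ListCustomFunctionsResult & { devHostFunctions?: DevHostFunction[] };

function enrichWithDevHost(
  result: ListCustomFunctionsResult,
  devHostFunctions?: DevHostFunction[],
): EnrichedResult {
  // Spread builds a new, correctly typed object instead of widening
  // `result` to Record<string, unknown> and assigning into it.
  return { ...result, ...(devHostFunctions ? { devHostFunctions } : {}) };
}
```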

* feat: type-safe registryData and PLASMIC_DEV_HOST_URL env var fallback (P31)

Replace `session.registryData?: any` with `FullRegistryData | null` via
strongly-typed interfaces (RegistryComponent, RegistryContext, RegistryFunction,
RegistryToken, RegistryTrait) that mirror canonical types from the registry
package without adding a runtime dependency. This eliminates cascading `as any`
casts in server.ts, edit-tools.ts, and devhost-sync.ts.

Add PLASMIC_DEV_HOST_URL environment variable as fallback when the Plasmic
project has no configured hostUrl, completing the spec requirement for
env-based dev host configuration. Project settings always take priority.
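
The priority order described above amounts to something like this sketch. `resolveHostUrl` is a hypothetical helper, and the environment is passed in explicitly so the function stays easy to test.

```typescript
// Hypothetical resolver: project setting first, then the env var, else null.
function resolveHostUrl(
  projectHostUrl: string | null | undefined,
  env: Record<string, string | undefined>,
): string | null {
  const fromProject = projectHostUrl?.trim();
  if (fromProject) return fromProject;
  const fromEnv = env["PLASMIC_DEV_HOST_URL"]?.trim();
  return fromEnv ? fromEnv : null;
}
```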

* fix: eliminate all as-any casts in server.ts and edit-tools.ts (P32)

Remove 18 as-any casts across the two largest source files:
- server.ts: 5 redundant parameter casts removed, toggle validation added
  for variant.create-global-group
- edit-tools.ts: 6 PlasmicElement union property casts removed (TS
  narrows correctly), EventHandler cast narrowed from any to typed,
  5 readonly array includes patterns fixed

Zero as-any casts remain in server.ts and edit-tools.ts.
1701 tests passing, typecheck and build clean.

* feat: surface all five registry data types to Claude (P33/P34)

P33: data.get-code-meta now enriches response with devHostMeta containing
the full component registration (typed props, variants, defaultStyles,
parentComponentName) when a matching registry component is found. Uses
$dev suffix-aware name matching.

P34: project.get-meta now includes devHostContexts (global context
providers with props and globalActions) and devHostTraits when registry
data is available. Previously these were synced and counted but never
surfaced in any tool response.

Also removes the last as-any cast in production code (model-loader.ts
narrowToSite) — zero as-any across all production source files.

1710 tests passing (31 suites), build and typecheck clean.

* fix: restore plasmicpkgs-dev consumer integration for plasmic-mcp-registry

- Restore plasmic-register.ts with EP commerce registration (registerElasticPath)
- Restore plasmic-init-client.tsx to use @/plasmic-register side-effect import
- Remove redundant plasmic-init-server.ts (plasmic-register.ts serves both)
- Fix package.json: file: link for registry, restore EP dep and --port 3001
- Update route.ts to use @/plasmic-register + getFullRegistry()
- Add @/ path alias to vitest.config.ts for Next.js-style imports
- Update test mock from plasmic-init-server to plasmic-register
- Update PROMPT_plan.md for registry package scope

* feat: add withRegistryCapture and fix webpack externals for SSR

- Add capture.ts with withRegistryCapture() that wraps PLASMIC loader
  to also call @plasmicapp/host registration functions on the server,
  populating globalThis registries for the MCP API route
- Split webpack externals from serverExternalPackages patterns:
  @plasmicapp/host and @plasmicapp/query use serverExternalPackages
  only (preserves shared React for SSR), while @plasmicpkgs/* and
  @elasticpath/plasmic-* use both (webpack externals needed for
  monorepo packages where serverExternalPackages has no effect)
- Export registerAllPackages() from plasmic-register.ts so the API
  route can re-register with captured PLASMIC without polluting the
  shared registration file
- Add @plasmicapp/host as peer dependency of plasmic-mcp-registry

* chore: clean up ralph specs and implementation plan

* chore: generalise ralph prompt files to project-wide paths
field123 added a commit that referenced this pull request Mar 27, 2026
…lobal-contexts

Add inspect.list-global-contexts action that reads site.globalContexts
(matching Studio's LeftProjectSettingsPanel read pattern). For each
TplComponent in the array, returns component name, UUID, source package,
and configured prop values extracted from vsettings[0].args.

Also adds globalContextCount and globalContexts summary to project.get-meta.

Closes the last remaining MCP gap (#27).
field123 added a commit that referenced this pull request Mar 27, 2026
…lobal-contexts (#199)

Add inspect.list-global-contexts action that reads site.globalContexts
(matching Studio's LeftProjectSettingsPanel read pattern). For each
TplComponent in the array, returns component name, UUID, source package,
and configured prop values extracted from vsettings[0].args.

Also adds globalContextCount and globalContexts summary to project.get-meta.

Closes the last remaining MCP gap (#27).

2 participants