Introduce "template" as a VM state#229

Open
sjmiller609 wants to merge 6 commits into main from hypeship/template-as-state

sjmiller609 (Collaborator) commented May 13, 2026

Summary

Models "Template" as a state in the instance state machine. A Standby instance is auto-promoted to Template the first time it's forked from a snapshot. ForkCount is bumped on each subsequent fork and decremented when a fork is deleted. Templates cannot wake while ForkCount > 0; un-promote (Template -> Standby) and delete (Template -> Stopped) are both refused until forks drain. RestoreInstance on a Template with zero forks transparently un-promotes and then restores.

Three new fields live on StoredMetadata: IsTemplate, ForkCount, ForkOfTemplate. A fourth field HotPagesPath is reserved for the future UFFD prefetch path and is cleared on un-promote.

State derivation in query.go returns StateTemplate when there's a snapshot, no socket, and IsTemplate=true. Template is a derived state alongside Standby/Stopped — no extra files on disk, no separate index.
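A minimal sketch of the derivation rule described above, not the actual code in query.go. The `State` type, the boolean inputs, and the Running/Standby/Stopped branches are assumptions for illustration; the PR only confirms the Template branch (snapshot present, no socket, IsTemplate=true).

```go
package main

// State is a hypothetical stand-in for the instance state type.
type State string

const (
	StateRunning  State = "Running"
	StateStandby  State = "Standby"
	StateTemplate State = "Template"
	StateStopped  State = "Stopped"
)

// StoredMetadata carries only the field this sketch needs.
type StoredMetadata struct {
	IsTemplate bool
}

// deriveState computes the state from what exists on disk plus the
// IsTemplate flag. Nothing extra is persisted for Template itself.
func deriveState(meta StoredMetadata, hasSnapshot, hasSocket bool) State {
	switch {
	case hasSocket:
		// Assumption: a live socket means the VM is running.
		return StateRunning
	case hasSnapshot && meta.IsTemplate:
		// Snapshot, no socket, IsTemplate=true: the rule from this PR.
		return StateTemplate
	case hasSnapshot:
		// Assumption: snapshot without the flag is plain Standby.
		return StateStandby
	default:
		// Assumption: no snapshot and no socket is Stopped.
		return StateStopped
	}
}
```

Because the state is derived rather than stored, flipping IsTemplate is the only write needed to move between Standby and Template in this sketch.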

The running-fork flow (from_running=true) skips promotion: it restores the source back to Running afterward, and a template can't wake.

Fork delete decrements the parent's ForkCount under the parent's lock after the fork's data is gone. Worst case on partial failure is drift (parent count higher than reality), fixed by a future reconciliation pass.

Test plan

  • go build ./... clean
  • Existing instance tests pass (go test ./lib/instances/...)
  • New unit tests in templates_test.go cover: deriveState returns Template, Restore refused on Template with forks, Restore demotes Template with zero forks, Delete refused on Template with forks, Delete-fork decrements parent ForkCount
  • State-transition tests cover the new Standby<->Template and Template->Stopped edges
  • End-to-end fork-from-Template flow (not exercised in unit tests; needs real snapshot/VMM)

Note

Medium Risk
Adds a new instance lifecycle state and new API endpoints that change fork/restore/delete behavior; mistakes could block restores/deletes or mis-handle template fork tracking.

Overview
Introduces a new Template instance state (derived from a standby snapshot) and extends the state machine to allow Standby -> Template and Template -> Standby/Stopped while disallowing waking templates directly.
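The edges listed in this overview can be sketched as a small transition table. This is an illustration only: the `allowed` map and `canTransition` helper are hypothetical names, and the Standby -> Running edge is assumed to be the pre-existing wake path rather than something this PR adds.

```go
package main

// State is a hypothetical stand-in for the instance state type.
type State string

const (
	StateRunning  State = "Running"
	StateStandby  State = "Standby"
	StateTemplate State = "Template"
	StateStopped  State = "Stopped"
)

// allowed lists only the edges relevant to templates. The real state
// machine has more states and edges than shown here.
var allowed = map[State][]State{
	// Standby -> Running is the assumed existing wake edge;
	// Standby -> Template is new in this PR.
	StateStandby: {StateRunning, StateTemplate},
	// A Template can be demoted or deleted, but never woken directly.
	StateTemplate: {StateStandby, StateStopped},
}

// canTransition reports whether the table contains the edge from -> to.
func canTransition(from, to State) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}
```

Leaving Template -> Running out of the table is what makes "demote first, then restore" the only way to wake a template in this sketch.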

Adds lifecycle operations to explicitly manage templates: new API endpoints POST /instances/{id}/promote-template and POST /instances/{id}/demote-template, plus instances.Manager methods PromoteToTemplate/DemoteTemplate with state/"live fork" guards.

Updates forking semantics to allow forks from templates (treating template/standby as snapshot-based), record parent linkage via ForkOfTemplate, and ensure forks do not inherit template-only metadata; deletion and demotion now refuse when a template has live forks, enforced via a new countTemplateForks scan and covered by new templates_test.go.
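A rough sketch of the fork metadata rule described above, assuming ForkOfTemplate is a string holding the parent's ID. The `ID` field and the `forkMetadata` helper are hypothetical, and treating HotPagesPath as template-only is an assumption based on it being cleared on un-promote.

```go
package main

// StoredMetadata carries only the fields this sketch needs.
type StoredMetadata struct {
	ID             string // hypothetical identifier field
	IsTemplate     bool
	HotPagesPath   string
	ForkOfTemplate string // assumed to hold the parent template's ID
}

// forkMetadata builds the metadata for a fork of src. Template-only
// fields are reset so the fork starts life as an ordinary instance, and
// the parent linkage is recorded only when src is already a Template.
func forkMetadata(src StoredMetadata, forkID string) StoredMetadata {
	fork := src
	fork.ID = forkID

	// Do not inherit template-only metadata from the source.
	fork.IsTemplate = false
	fork.HotPagesPath = ""
	fork.ForkOfTemplate = ""

	if src.IsTemplate {
		fork.ForkOfTemplate = src.ID
	}
	return fork
}
```

Forking a plain Standby source leaves ForkOfTemplate empty, so such forks never block a later demote or delete.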

Reviewed by Cursor Bugbot for commit 0d0ab12. Bugbot is set up for automated code reviews on this repo.

sjmiller609 and others added 2 commits May 13, 2026 00:43
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds StateTemplate to the instance state machine. A Standby instance is
auto-promoted to Template the first time it's forked from a snapshot,
and ForkCount is bumped on each subsequent fork. Templates can't wake
while ForkCount > 0; un-promote (Template -> Standby) and delete
(Template -> Stopped) are both refused until forks drain.

Fork bookkeeping lives on StoredMetadata (IsTemplate, ForkCount,
ForkOfTemplate, plus a reserved HotPagesPath for the prefetch path).
Deleting a fork decrements the parent template's ForkCount under the
parent's lock; deletion of the fork's own data has already happened, so
worst case is refcount drift that a future reconciliation pass fixes.

The running-fork flow keeps skipping promotion: it restores the source
back to Running afterward, and a template can't wake.

github-actions Bot commented May 13, 2026

✱ Stainless preview builds for hypeman

This PR will update the hypeman SDKs with the following commit message.

feat: Model template as an instance state instead of a separate registry

Edit this comment to update it. It will appear in the SDK's changelogs.

hypeman-typescript

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅ build ✅ lint ❗ test ✅

npm install https://pkg.stainless.com/s/hypeman-typescript/98a12ef59b07e556650e6d9fabaaf7cac04acaea/dist.tar.gz

hypeman-go

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅ build ✅ lint ✅ test ✅

go get github.com/stainless-sdks/hypeman-go@7e4dccadb0ce4cf062288caf68a3ecb8c5625520

hypeman-openapi

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅


Last updated: 2026-05-13 19:45:15 UTC

Comment thread lib/instances/delete.go Outdated
sjmiller609 and others added 2 commits May 13, 2026 17:15
Drops the persisted ForkCount field from StoredMetadata and the
decrement bookkeeping in DeleteInstance. Live forks of a template are
now counted by scanning metadata for ForkOfTemplate matches via a new
countTemplateForks helper. The fork-of-template field itself remains
the single source of truth, so there's no drift to reconcile.

Template promotion on fork only flips IsTemplate when not already set;
deletion of a template still refuses when forks exist, but the count
is computed from disk rather than read from a denormalized field.

Previously ForkInstance auto-promoted a Standby source to Template the
first time it was forked from a snapshot, and RestoreInstance auto-demoted
a Template before waking it. That implicit lifecycle blurred the rules: a
Standby and a "Standby that has been forked once" behaved differently,
and callers had to know that restoring a Template was a two-step
operation under the hood.

Replace it with explicit PromoteToTemplate / DemoteTemplate manager
methods (and matching POST /instances/{id}/promote-template and
/demote-template endpoints). Promotion is now Standby -> Template only;
demotion is Template -> Standby only and refuses while live forks
reference the template. ForkInstance only records the parent linkage if
the source is already a Template, and RestoreInstance no longer
auto-demotes — callers must demote first.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment thread cmd/api/api/instances.go
Comment thread lib/instances/manager.go
Comment thread lib/instances/query.go
sjmiller609 changed the title from "Model template as an instance state instead of a separate registry" to "Introduce "template" as a VM state" May 13, 2026
@sjmiller609 sjmiller609 requested a review from hiroTamada May 13, 2026 19:30
sjmiller609 and others added 2 commits May 13, 2026 15:30
Silently continuing past an unreadable metadata file could undercount
forks of a template, allowing DemoteTemplate or DeleteInstance to free
a template whose pages are still mapped by a live fork.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@sjmiller609 sjmiller609 marked this pull request as ready for review May 13, 2026 19:30
@firetiger-agent

Firetiger deploy monitoring skipped

This PR didn't match the auto-monitor filter configured on your GitHub connection:

Any PR that changes the kernel API. Monitor changes to API endpoints (packages/api/cmd/api/) and Temporal workflows (packages/api/lib/temporal) in the kernel repo

Reason: PR modifies instance state machine logic in packages/api/lib/instances, not the API endpoints (packages/api/cmd/api/) or Temporal workflows (packages/api/lib/temporal) specified in the filter.

To monitor this PR anyway, reply with @firetiger monitor this.


@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread lib/instances/query.go