Deployment tracking, replicated payload delivery, and rollback #641

@kriszyp

This plan becomes the body of a new issue in HarperFast/harper. Once approved, we'll pause the four in-flight PRs (#530, #531, #536, harper-pro #146) and rebuild the deploy_component story around a replicated hdb_deployment system table that doubles as the payload-delivery channel.


Context

The original deploy_component effort split into four PRs solving four problems:

| In-flight PR | What it does |
| --- | --- |
| #530 | Multipart-streaming upload from CLI → origin (no 2 GB Buffer cap) |
| #531 | SSE-based progress events from server back to CLI |
| #536 | Staging the upload to a temp file on origin so peers can also receive it |
| harper-pro #146 | Direct-HTTPS peer relay because the WS operation transport has a frame cap |

The user's two insights collapse this story into a single coherent design:

  1. The deploy_component handler should write a persistent record from the very first byte — so Studio (or any client) can observe and audit deploys independent of the CLI, errors are queryable after the fact, and rollback has a target to roll back to.
  2. If we're already storing the payload as a blob attribute on that record, we shouldn't need a separate staging file or a separate peer-relay channel. Harper's replication (replication/replicationConnection.ts) already supports streamed, chunked, back-pressured blob transfer (the BLOB_CHUNK = 146 message type, sender at lines 1840–1896, receiver at lines 788–858). The 100 MB cap is per chunk, not per blob, so multi-GB payloads work today. harper-pro #146's direct-HTTPS relay was solving a problem Harper had already solved elsewhere.

So this issue isn't "tracking on top of the deploy stack" — it's the new spine, and the staging/relay PRs go away.


Design

1. New replicated system table: hdb_deployment

| Attribute | Type | Notes |
| --- | --- | --- |
| deployment_id | string (UUID v4) | Hash attribute. Returned synchronously to the caller. |
| project | string | Component name. Indexed. |
| package_identifier | string \| null | Result of derivePackageIdentifier() (Application.ts:446); null for raw-payload deploys. |
| payload_hash | string \| null | SHA-256 of the tarball, computed once on origin during ingest. |
| payload_size | number \| null | Uncompressed bytes (precomputed by getPackagedDirectorySize). |
| payload_blob | Blob \| null | The tarball itself. Always populated on initial create so peers can replicate it. Pruned per-node after success per §3. |
| status | enum | pending → extracting → installing → loading → replicating → restarting → success \| failed \| rolled_back |
| phase | string | Current phase label, mirrors ProgressEmitter phase events. |
| event_log | object[] | Bounded JSON array (~200 entries) of {t, event, data}. Includes phase transitions and install summary; raw stdout/stderr aggregated. |
| peer_results | object[] | [{node, status, error?, started_at, completed_at}] populated after replication settles. |
| origin_node | string | Hostname of the coordinating node. |
| restart_mode | string \| null | immediate / rolling / null. |
| started_at, completed_at | timestamp | Lifecycle bookends. |
| user | string | hdb_user.username of the requester. |
| restorable | boolean | Computed: payload_blob != null \|\| package_identifier resolvable now. |
| rollback_of | string \| null | Points at the prior deployment_id this row restored, when applicable. |
| error | object \| null | On failure: {message, code, phase, stack?}. |
| __createdtime__, __updatedtime__ | auto | Standard LMDB metadata. |

Secondary indexes: project, status, started_at, payload_hash. Registered via the standard pattern in utility/hdbTerms.ts (SYSTEM_TABLE_NAMES) with an upgrade directive in upgrade/directivesManager.ts.
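As a reading aid, the attribute table can be written out as a TypeScript shape. This is a sketch, not Harper's schema definition: the type names, the `newPendingRecord` and `isTerminal` helpers, and the choice of epoch milliseconds for timestamps are assumptions. `payload_blob` and the computed `restorable` flag are left out because their runtime types depend on Harper's Blob API.

```typescript
// Hypothetical shape for one hdb_deployment row. Field names follow the
// attribute table; type names, helpers, and epoch-ms timestamps are assumptions.
type DeploymentStatus =
  | "pending" | "extracting" | "installing" | "loading"
  | "replicating" | "restarting" | "success" | "failed" | "rolled_back";

interface DeploymentEvent { t: number; event: string; data?: unknown }

interface PeerResult {
  node: string;
  status: string;
  error?: string;
  started_at: number;
  completed_at: number;
}

interface DeploymentRecord {
  deployment_id: string;
  project: string;
  package_identifier: string | null;
  payload_hash: string | null;
  payload_size: number | null;
  status: DeploymentStatus;
  phase: string;
  event_log: DeploymentEvent[];
  peer_results: PeerResult[];
  origin_node: string;
  restart_mode: "immediate" | "rolling" | null;
  started_at: number;
  completed_at: number | null;
  user: string;
  rollback_of: string | null;
  error: { message: string; code: string; phase: string; stack?: string } | null;
}

const TERMINAL_STATUSES = new Set<DeploymentStatus>(["success", "failed", "rolled_back"]);

// True once a deploy can no longer change state.
function isTerminal(status: DeploymentStatus): boolean {
  return TERMINAL_STATUSES.has(status);
}

// Builds the row as it would look at the first byte of an upload.
function newPendingRecord(
  deployment_id: string,
  project: string,
  user: string,
  origin_node: string,
  now: number,
): DeploymentRecord {
  return {
    deployment_id, project, user, origin_node,
    package_identifier: null,
    payload_hash: null,
    payload_size: null,
    status: "pending",
    phase: "pending",
    event_log: [],
    peer_results: [],
    restart_mode: null,
    started_at: now,
    completed_at: null,
    rollback_of: null,
    error: null,
  };
}
```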

2. Lifecycle and how delivery works now

CLI deploy_component
  │  multipart upload (busboy parser from #530 — kept)
  ▼
Origin Fastify handler
  │  start writing row: status=pending, deployment_id=<new UUID>
  │  pipe multipart file part directly into row.payload_blob.write()
  │  emit SSE event `deployment_id` so the CLI/Studio can subscribe
  │  ...as bytes flow, ProgressEmitter emits `upload` events
  │  when the file part ends:
  │    sha256 finalized → row.payload_hash set
  │    row committed       → replication ships row + chunked BLOB_CHUNKs to peers
  ▼
Origin operations.js deployComponent
  │  status=extracting → application = new Application({payload: row.payload_blob.stream()})
  │  status=installing → installApplication runs, emitter forwards lines
  │  status=loading
  │  status=replicating → replicateOperation({operation: deploy_component, deployment_id})
  │                          ↓
  │                       Peer ops handler receives operation
  │                          ↓
  │                       Peer reads `hdb_deployment[deployment_id]` (replicated row)
  │                       Peer reads row.payload_blob.stream() — awaits chunks if still arriving
  │                       Peer extracts, installs, loads, reports back
  │  recorder.recordPeers(result.replicated) → peer_results array
  │  status=restarting → restart_mode applied
  ▼
status=success (or status=failed in catch with error{})
  │  always in a finally: recorder closes the row, ends SSE
  ▼
storageReclamation tick (per-node, async)
  │  prunes payload_blob on rows where size > local threshold (§3)

The ProgressEmitter from #531 stays, but it now has two subscribers: the SSE writer (live to CLI) and the DeploymentRecorder (persistent). The recorder coalesces writes — every phase transition flushes immediately; install stdout/stderr aggregates every ~500 ms or 50 lines to avoid write storms.
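The coalescing rule can be sketched as follows. `CoalescingRecorder`, its method names, and the `persist` callback are hypothetical stand-ins for the DeploymentRecorder and its row write; only the thresholds (immediate flush on phase transitions, 50 lines or ~500 ms for install output, ~200 log entries) come from the plan. The clock is injected so the behavior is deterministic.

```typescript
// Sketch only: class name, method names, and the persist callback are hypothetical.
interface LoggedEvent {
  t: number;
  event: string;
  data?: unknown;
}

class CoalescingRecorder {
  readonly eventLog: LoggedEvent[] = [];
  private pendingLines: string[] = [];
  private lastFlush: number;
  private readonly now: () => number;
  private readonly persist: (log: readonly LoggedEvent[]) => void;
  private readonly maxEntries = 200; // bound on event_log
  private readonly maxLines = 50;
  private readonly maxDelayMs = 500;

  constructor(now: () => number, persist: (log: readonly LoggedEvent[]) => void) {
    this.now = now;
    this.persist = persist;
    this.lastFlush = now();
  }

  // Phase transitions are written through immediately.
  phase(name: string): void {
    this.drainLines();
    this.append({ t: this.now(), event: "phase", data: name });
    this.flush();
  }

  // Install output is buffered until 50 lines or ~500 ms have accumulated.
  // The elapsed-time check only runs when a line arrives; a real recorder
  // would also need a timer to flush a buffer that has gone quiet.
  line(text: string): void {
    this.pendingLines.push(text);
    const overdue = this.now() - this.lastFlush >= this.maxDelayMs;
    if (overdue || this.pendingLines.length >= this.maxLines) {
      this.drainLines();
      this.flush();
    }
  }

  // Collapses buffered lines into a single event_log entry.
  private drainLines(): void {
    if (this.pendingLines.length === 0) return;
    this.append({ t: this.now(), event: "output", data: { lines: this.pendingLines } });
    this.pendingLines = [];
  }

  private append(entry: LoggedEvent): void {
    this.eventLog.push(entry);
    // Drop the oldest entries once the bound is exceeded.
    while (this.eventLog.length > this.maxEntries) this.eventLog.shift();
  }

  private flush(): void {
    this.persist(this.eventLog);
    this.lastFlush = this.now();
  }
}
```

With this shape, a burst of 50 install lines costs one row write rather than 50, while a phase change is never delayed behind buffered output.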

No more _stagedPayloadPath. No more relayDeployToNode. The replicated row + blob does both jobs.

3. Payload retention: per-node, disk-pressure driven

Per the user's direction:

  • The blob is always stored at create time (every node, no size limit) — this is required for delivery.
  • After a deploy reaches a terminal state (success, failed, rolled_back), each node independently considers pruning its local copy of payload_blob.
  • Pruning hooks into the existing server/storageReclamation.ts onStorageReclamation() mechanism. This is already non-replicating — a node drops its local blob without that drop propagating, which is exactly the semantics we want: different nodes may keep different deploys' payloads based on local disk pressure.
  • The reclamation handler ranks candidates by (size DESC, age DESC, success_age DESC) and drops blobs above a soft local cap (default 50 MiB, configurable as deployments.localBlobSoftCap) first; smaller blobs are kept longer.
  • No advertised "we always keep N" guarantee — a payload is restorable if (it still has a local blob on the node serving the rollback) OR (package_identifier is still resolvable). Studio can show "payload available on M of N nodes" by querying each peer.
  • No payloadRetentionCount knob (the previous draft proposed one — the disk-pressure model supersedes it).

A delete_deployment_payload operation remains, for explicit admin pruning.
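One way to read the ranking rule is sketched below. The `PruneCandidate` shape and function names are hypothetical, the caller is assumed to pass only terminal-state rows that still hold a local blob, and the third sort key (success_age) is omitted because the plan does not define it.

```typescript
// Sketch of the per-node pruning order; names and shapes are hypothetical.
interface PruneCandidate {
  deployment_id: string;
  blobBytes: number;
  completedAt: number; // epoch ms; smaller means older
}

const MIB = 1024 * 1024;

// Largest blobs first; among equal sizes, the oldest deploy first.
function rankForPruning(candidates: readonly PruneCandidate[]): PruneCandidate[] {
  return [...candidates].sort(
    (a, b) => b.blobBytes - a.blobBytes || a.completedAt - b.completedAt,
  );
}

// Returns the deployment_ids whose local blob exceeds the soft cap, in the
// order they would be dropped. Blobs at or below the cap are kept.
function selectForPruning(
  candidates: readonly PruneCandidate[],
  softCapBytes: number = 50 * MIB,
): string[] {
  return rankForPruning(candidates)
    .filter((c) => c.blobBytes > softCapBytes)
    .map((c) => c.deployment_id);
}
```

Because the handler only nulls the local blob, two nodes running this against different disk states can legitimately end up holding different payloads.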

4. Rollback semantics

deploy_component {rollback_from: deployment_id} (rollback folds into deploy_component, see §5):

  1. Loads the row from any node. If restorable === false cluster-wide → 4xx.
  2. Constructs a synthetic deploy_component request internally:
    • Blob path: re-reads payload_blob.stream() from whichever node has it (replication will pull if missing locally, since the deploy_component handler runs on the node it was called on and that node already replicated to its peers when the original deploy ran — see §6 caveat).
    • Reference path: re-resolves package_identifier.
  3. Creates a new row, rollback_of = <source deployment_id>, runs the deploy lifecycle as normal.

Rollbacks are first-class deploys — they replicate, they show up in list_deployments, they themselves can be rolled back later.
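The choice between the two source paths can be sketched as a small pure function. The names here are hypothetical, and `canResolve` stands in for whatever check decides that a package_identifier is still resolvable; the plan does not specify that check.

```typescript
// Sketch of rollback source selection; type and function names are hypothetical.
type RollbackSource =
  | { kind: "blob" }
  | { kind: "reference"; package_identifier: string };

interface RollbackSourceRow {
  deployment_id: string;
  package_identifier: string | null;
}

// Prefers the stored blob, falls back to re-resolving the package reference,
// and returns null when neither is available (the "not restorable" case).
function chooseRollbackSource(
  row: RollbackSourceRow,
  hasLocalBlob: boolean,
  canResolve: (packageIdentifier: string) => boolean,
): RollbackSource | null {
  if (hasLocalBlob) return { kind: "blob" };
  if (row.package_identifier !== null && canResolve(row.package_identifier)) {
    return { kind: "reference", package_identifier: row.package_identifier };
  }
  return null;
}
```

A null result corresponds to the 4xx in step 1; any other result feeds the synthetic deploy in step 2.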

5. Operations API

All gated by the existing permission model: super_user OR the operation named in the role's allowed operations list (confirmed by user — no new role needed). Five operations total; rollback folds into deploy_component, and subscribe folds into get_deployment via content negotiation.

| Operation | Args | Behavior |
| --- | --- | --- |
| deploy_component | existing args + optional rollback_from?: deployment_id | When rollback_from is set, skip the multipart upload: source the payload from the referenced row's payload_blob (or re-resolve its package_identifier if the blob is pruned locally). Run through the standard lifecycle. New row gets rollback_of = rollback_from. SSE behavior unchanged. |
| list_deployments | {project?, status?, since?, until?, limit?, offset?} | Returns {deployments, total}, newest first. |
| get_deployment | {deployment_id} | Content-negotiated. Accept: application/json (default) → full record. Accept: text/event-stream → SSE: replays event_log on connect, then tails live ProgressEmitter events until terminal state, then done. For already-terminal deploys the replay closes immediately. Added to SSE_PROGRESS_OPERATIONS so the existing branch in server/serverHelpers/serverHandlers.js handles the switch. |
| get_deployment_payload | {deployment_id} | Streams the tarball from the local row's payload_blob. 404 if pruned locally; client retries another node. |
| delete_deployment_payload | {deployment_id} | Nulls the local blob. Does not replicate the deletion; it is per-node, like the auto-pruner. |

subscribe_deployment and rollback_deployment are intentionally absent.
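The get_deployment content negotiation and replay behavior might look roughly like this. The function names are hypothetical, q-values in the Accept header are ignored for brevity, and the payload of the final done event is an assumption; only the `event:` / `data:` framing is standard SSE.

```typescript
// Sketch of get_deployment negotiation and replay; function names are hypothetical.
type ResponseMode = "json" | "sse";

interface ReplayEvent { t: number; event: string; data?: unknown }

// JSON is the default; SSE is chosen only when the client lists text/event-stream.
// Simplification: q-values are ignored.
function negotiateMode(accept: string | undefined): ResponseMode {
  if (!accept) return "json";
  const types = accept
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase());
  return types.includes("text/event-stream") ? "sse" : "json";
}

// One SSE frame: an event name, a JSON data line, and a blank-line terminator.
function formatSse(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Frames sent on connect. For an already-terminal deploy the replay is
// followed directly by done; otherwise the caller keeps tailing live events.
function replayFrames(eventLog: readonly ReplayEvent[], terminal: boolean): string[] {
  const frames = eventLog.map((e) => formatSse(e.event, { t: e.t, data: e.data }));
  if (terminal) frames.push(formatSse("done", {}));
  return frames;
}
```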

6. Race condition: peer reads before blob arrives

When the origin calls replicateOperation(deploy_component, {deployment_id}), two things race to each peer: the operation message and the row+blob chunks. The peer's handler must await row.payload_blob.stream() (or .bytes()) — the existing Blob API already blocks on incomplete file writes (resources/blob.ts bytes() / stream()).

Harper's replication already has idle-based timeout for in-flight blobs: a BLOB_CHUNK stream is destroyed after ~60 s of no chunks arriving (replication/replicationConnection.ts:2255-2260). That's the correct shape — a 2 GB blob over a slow link should not time out as long as bytes are flowing.

We surface that existing knob as deployments.peerReceiveIdleTimeoutMs (default 60 s). When the underlying blob stream is destroyed by the idle watchdog, row.payload_blob.stream() rejects, the peer records {status: 'failed', error: 'blob stream idle timeout'} in peer_results, and origin marks the deploy failed if the blob never finished arriving on any peer. No new wall-clock timeout is layered on top.
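The difference between an idle timeout and a wall-clock timeout can be illustrated with a small model. This is not Harper's watchdog: the `IdleWatchdog` class and `peerTimeoutResult` helper are hypothetical, and time is passed in explicitly so the example is deterministic.

```typescript
// Model of an idle-based timeout: the deadline moves forward with every chunk.
// Class and function names are hypothetical.
class IdleWatchdog {
  private lastChunkAt: number;
  private readonly idleTimeoutMs: number;

  constructor(startedAt: number, idleTimeoutMs: number = 60_000) {
    this.lastChunkAt = startedAt;
    this.idleTimeoutMs = idleTimeoutMs;
  }

  // Called for every BLOB_CHUNK that arrives.
  onChunk(now: number): void {
    this.lastChunkAt = now;
  }

  // Expired only when no chunk has arrived for the full idle window,
  // regardless of how long the transfer has run in total.
  isExpired(now: number): boolean {
    return now - this.lastChunkAt >= this.idleTimeoutMs;
  }
}

// The peer_results entry a peer would record after an idle timeout.
function peerTimeoutResult(node: string, startedAt: number, now: number) {
  return {
    node,
    status: "failed",
    error: "blob stream idle timeout",
    started_at: startedAt,
    completed_at: now,
  };
}
```

A transfer that delivers one chunk every 30 s for ten minutes never trips this watchdog, while a single 60 s stall does.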


Implementation slices

The four existing PRs collapse into three new slices. We pause #530, #531, #536, harper-pro #146 (move to draft, leave a comment linking to the new issue). The CLI-side work in #530 and #531 (multipart-builder, sseConsumer, deployRenderer) survives largely unchanged; the server-side rebuild happens here.

Slice A — Foundation: schema + blob-backed multipart receive

  • Add hdb_deployment to SYSTEM_TABLE_NAMES, upgrade directive to create on existing installs.
  • Server-side multipart handler (server/serverHelpers/multipartParser.ts, kept from #530) is rewired to write the file part directly into a new hdb_deployment row's payload_blob attribute, computing sha256 alongside.
  • Add _deploymentId to req. Return the ID as the first SSE event so the CLI prints it and so Studio can subscribe.
  • deployComponent reads the payload from row.payload_blob.stream() instead of from the request body. (Single code path; no _stagedPayloadPath branch.)
  • list_deployments, get_deployment operations.
  • Tests: deploy succeeds end-to-end on a single node; row populated with hash, size, status=success.
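Computing the hash "alongside" the write means each chunk is hashed as it passes through, so the tarball is never buffered or re-read. The sketch below uses Node's `crypto.createHash`; the `IngestHasher` class and its `sink` callback are hypothetical stand-ins for the multipart file part and the blob writer.

```typescript
import { createHash } from "node:crypto";

// Sketch: hash each chunk on its way to the blob writer. The class name and
// the sink callback are hypothetical; only createHash is a real Node API.
class IngestHasher {
  private readonly hash = createHash("sha256");
  private readonly sink: (chunk: Uint8Array) => void;
  private bytes = 0;

  constructor(sink: (chunk: Uint8Array) => void) {
    this.sink = sink;
  }

  // Hashes, counts, and forwards a chunk in a single pass.
  write(chunk: Uint8Array): void {
    this.hash.update(chunk);
    this.bytes += chunk.byteLength;
    this.sink(chunk);
  }

  // Call once when the file part ends; the digest becomes payload_hash.
  finish(): { payload_hash: string; bytesReceived: number } {
    return { payload_hash: this.hash.digest("hex"), bytesReceived: this.bytes };
  }
}
```

Note that `bytesReceived` counts tarball bytes on the wire; it is not the same value as payload_size, which the plan defines as uncompressed bytes.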

Slice B — Lifecycle, recorder, replicated peer deploys

  • DeploymentRecorder class subscribes to ProgressEmitter, writes lifecycle transitions + bounded event_log.
  • Peer-side deployComponent handler reads hdb_deployment[deployment_id] and streams payload from row.payload_blob. replicateOperation keeps the same signature; the operation body carries just {deployment_id} plus existing flags.
  • Origin gathers peer_results from replicateOperation return.
  • get_deployment becomes content-negotiated: add it to SSE_PROGRESS_OPERATIONS; on Accept: text/event-stream replay event_log then tail emitter events.
  • delete_deployment_payload operation.
  • 3-node cluster integration test: deploy from A, observe row+blob replication, observe peer_results, query history from B, attach get_deployment SSE from C mid-deploy and confirm replay+tail.

Slice C — Rollback, reclamation, retention

  • Hook into onStorageReclamation (server/storageReclamation.ts) with a handler that prunes payload_blob on terminal-state rows; ranks by size + age.
  • Config: deployments.localBlobSoftCap (default 50 MiB), deployments.peerReceiveIdleTimeoutMs (default 60 s, surfacing the existing blob-stream idle threshold).
  • Extend deploy_component to accept rollback_from?: deployment_id. Two source paths (blob, reference). New row written with rollback_of set.
  • restorable computed on read (joins local blob presence + package-identifier resolvability check, lazy).
  • Tests: roll back a small (blob-backed) deploy via deploy_component {rollback_from}; roll back a large (pruned, package-identifier-only) deploy; refuse a rollback when neither source is available; reclamation drops large blobs first under simulated disk pressure.

The CLI-side rendering work (deployRenderer.ts, sseConsumer.ts, multipart-builder, getPackagedDirectorySize) lifts from #530/#531 essentially unchanged and lands within Slice A.


Critical files / integration points


Open items to call out in the issue

  1. Closing the existing PRs: #531, #536, and harper-pro #146 close in favor of Slice B. #530's CLI-side bits get cherry-picked into Slice A; its server-side multipart handler is rewired (not replaced).
  2. Authorization on payload reads: get_deployment_payload for a previous deploy could expose source code to a user with list_deployments permission. Should we gate it behind a stricter read_deployment_payload op? (Plan defaults to: same permission as deploy_component.)
  3. Peer-receive idle timeout default: 60 s mirrors Harper's current blob-stream idle threshold; verify that's still adequate for cross-region replicas on a 1 GB payload.
  4. get_deployment SSE for already-terminal deploys: replay event_log + close immediately. Studio benefits from fetching the full event timeline of historical deploys via the same channel as live ones.
  5. deploy_component {rollback_from} arg validation: if rollback_from is set, the multipart body MUST be absent. Rejecting both-supplied early prevents a class of confusing failures.
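The early rejection in item 5 could be as small as the check below. The request shape and function name are hypothetical; a request with neither a multipart body nor rollback_from is passed through, on the assumption that the existing deploy_component arguments (for example a package reference) cover that case.

```typescript
// Sketch of the rollback_from / multipart exclusivity check; names are hypothetical.
interface DeployRequestShape {
  rollback_from?: string;
  hasMultipartBody: boolean;
}

type DeployValidation =
  | { ok: true; source: "rollback" | "upload" | "existing-args" }
  | { ok: false; message: string };

function validateDeployRequest(req: DeployRequestShape): DeployValidation {
  const wantsRollback = req.rollback_from !== undefined;
  if (wantsRollback && req.hasMultipartBody) {
    return { ok: false, message: "rollback_from and a multipart payload are mutually exclusive" };
  }
  if (wantsRollback) return { ok: true, source: "rollback" };
  if (req.hasMultipartBody) return { ok: true, source: "upload" };
  return { ok: true, source: "existing-args" };
}
```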

Verification

  • Slice A: deploy a small component locally; list_deployments shows the row with status=success, payload_hash matches, payload_blob retrievable via get_deployment_payload.
  • Slice B: 3-node cluster; deploy from node A; peer_results populated for B and C; while a slow deploy is in flight, a second client attached via get_deployment with Accept: text/event-stream receives the same events the CLI receives, then receives a final done. Kill the CLI mid-deploy; row reaches a terminal state regardless.
  • Slice C: roll back a blob-backed deploy, hash matches the original; roll back an npm: deploy after blob pruning, package re-resolves; simulate disk pressure (fill the data dir, trigger runReclamationHandlers), confirm the largest, oldest blobs go first and smaller ones remain.
  • End-to-end: deploy a 2 GB tarball to a 3-node cluster; observe BLOB_CHUNK traffic in replication logs; confirm peers complete extraction after the blob finishes streaming; confirm localBlobSoftCap triggers post-success pruning on every peer.

Should (partially) address #564
