
deploying-to-harper-fabric: Document .env deployment behavior, env namespacing, verification, and static file caching #20

@jcohen-hdb


Context

We deployed a Harper Fabric app to a two-node cluster ~40 times during development. The deployment rule covers the basics (env vars, harperdb deploy_component, GitHub Actions), but several production-critical behaviors aren't documented. Some of these cost us hours of debugging.

Caveat: These are patterns we developed for our specific deployment. Harper may have better built-in solutions for some of these — if so, the skill should document them.


1. .env deploys WITH the component — dev config can leak to prod

What we hit: Harper deploys the .env file alongside the component code. If you develop locally with AUTH_REQUIRED=false and run npm run deploy, production gets auth disabled. We shipped an unauthenticated prod build before catching this.

What we did: Created a .env.prod file with production settings and wrote a deploy script that backs up .env, copies .env.prod over it before deploying, then restores the original on exit (even on failure):

{
  "deploy": "bash -c 'cp .env .env.backup && cp .env.prod .env && trap \"cp .env.backup .env && rm -f .env.backup\" EXIT && dotenv -- npm run deploy:component'"
}

What the skill should do: Warn that .env is deployed with the component. Either document the swap pattern, or document the recommended way to manage dev vs prod configuration. The current deploy rule shows a simple dotenv -- npm run deploy:component which will deploy whatever's in .env — including dev settings.
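
A complementary safeguard is a pre-deploy check that reads the file about to be deployed and refuses to continue if it contains known dev-only values. The sketch below is only an illustration, not a Harper feature: `parseEnv`, `findDevSettings`, and the `DEV_ONLY` rule table are hypothetical names, and the single rule shown is the `AUTH_REQUIRED=false` case described above.

```javascript
// Hypothetical rule table: key -> value that must never reach production.
const DEV_ONLY = { AUTH_REQUIRED: "false" };

// Parse simple KEY=VALUE lines, skipping blank lines and # comments.
// Assumption: no multi-line values and no `export` prefixes.
function parseEnv(text) {
  const vars = Object.create(null);
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    vars[key] = value;
  }
  return vars;
}

// Return the keys whose values match a dev-only rule (case-insensitive).
function findDevSettings(envText, rules = DEV_ONLY) {
  const vars = parseEnv(envText);
  return Object.keys(rules).filter(
    (key) => key in vars && vars[key].toLowerCase() === rules[key]
  );
}
```

In a deploy script this could run against the contents of .env right before the deploy step and exit non-zero when the returned list is not empty.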


2. process.env is shared across all components on the same node

What we hit: Two apps on the same cluster both set SESSION_COOKIE_NAME in their .env files. Whichever component loaded last won, and both apps ended up using the same session cookie name. Users were getting silently logged out of one app when logging into the other.

What we did: Prefixed ALL env vars with an app-specific namespace (e.g., MYAPP_) and wrote the config loader to check for the prefixed version first, falling back to unprefixed:

sessionCookieName: envStr("MYAPP_SESSION_COOKIE_NAME", envStr("SESSION_COOKIE_NAME", "myapp_session"))

What the skill should do: Warn that process.env is shared across components and recommend namespacing. This isn't obvious — most frameworks give each app its own process. A single sentence like "If your Fabric cluster runs multiple components, prefix your env vars to avoid collisions" would have saved us a confusing debugging session.
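
For reference, here is a minimal sketch of how such a loader could look. The real `envStr` helper is not shown in this issue, so the implementation below is an assumption; `namespacedEnv` and the optional `env` parameter (added so the lookup can be exercised without touching the real environment) are hypothetical.

```javascript
// Assumed implementation of envStr: returns the variable's value,
// or the fallback when the variable is unset or empty.
function envStr(name, fallback, env = process.env) {
  const value = env[name];
  return value === undefined || value === "" ? fallback : value;
}

// Hypothetical wrapper: prefixed name first, then unprefixed, then the default.
function namespacedEnv(prefix, name, fallback, env = process.env) {
  return envStr(prefix + name, envStr(name, fallback, env), env);
}

const config = {
  sessionCookieName: namespacedEnv("MYAPP_", "SESSION_COOKIE_NAME", "myapp_session"),
};
```

Note that the unprefixed fallback can still pick up a value set by another component on the same node, so it is probably best treated as a migration aid and dropped once every variable is prefixed.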


3. Deployment verification: replicated: [] means only one node updated

What we hit: After deploying, we noticed one of our two nodes was serving stale code. The deploy output had said replicated: [] (empty array) but we didn't know to check for that.

What we did: Now we always verify the deploy output includes replicated: [{ node: "..." }] with at least one entry per replica node.

What the skill should do: Add a "Verify your deployment" section showing what successful output looks like for single-node and multi-node clusters, and what replicated: [] means.
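
Until that section exists, the check can be automated. The sketch below assumes the deploy result has been parsed into an object shaped like the output described above (`{ replicated: [{ node: "..." }] }`); `missingReplicas` and the node names are hypothetical, and the exact response format should be confirmed against Harper's documentation.

```javascript
// Assumed result shape: { replicated: [{ node: "node-2" }, ...] }
// Returns the expected replica nodes that do not appear in the result.
function missingReplicas(deployResult, expectedNodes) {
  const replicated = Array.isArray(deployResult?.replicated)
    ? deployResult.replicated
    : [];
  const seen = new Set(replicated.map((entry) => entry?.node));
  return expectedNodes.filter((name) => !seen.has(name));
}

// Example: an empty `replicated` array means the replica was not updated.
const missing = missingReplicas({ replicated: [] }, ["node-2"]);
if (missing.length > 0) {
  console.error(`Deploy did not replicate to: ${missing.join(", ")}`);
}
```

A CI job could fail the build whenever the returned list is not empty, rather than relying on someone reading the deploy log.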


4. Static files are cached in memory — changes require full restart

What we hit: After deploying updated JS/CSS/HTML, the app kept serving the old versions. We spent over an hour trying different sync approaches before discovering that Harper caches static files in memory.

What we did: For local dev: pkill -9 -f harperdb && npm run dev after any static file change. For prod: hard-refresh browser after deploy (Cmd+Shift+R).

What the skill should do: Add this to serving-web-content.md and/or the deployment rule. Even a one-liner — "Harper caches static files in memory. After changing HTML/CSS/JS, restart the server locally or hard-refresh the browser after deploying." — would prevent a lot of confusion.


5. Never run cluster management commands on Fabric-managed clusters

What we hit: When replication seemed broken, we tried running cluster_status and add_node API operations to diagnose. This broke replication for ALL databases on the cluster, not just our app's.

What we did: Stopped running cluster management commands entirely. If replication looks broken, we escalate to the Fabric team instead.

What the skill should do: Add a clear warning: "On Fabric-managed clusters, do NOT run add_node, update_node, remove_node, or cluster_status operations. These are managed by Fabric. Running them manually can break replication across all databases on the cluster."
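
Teams that script operations API calls could also enforce this in code. The following is a rough sketch that assumes requests are sent as JSON bodies with an `operation` field; `FABRIC_MANAGED_OPERATIONS` and `assertAllowedOperation` are hypothetical names, and the blocked list is just the four operations named in the warning above.

```javascript
// Operations that should be left to Fabric, per the warning above.
const FABRIC_MANAGED_OPERATIONS = new Set([
  "add_node",
  "update_node",
  "remove_node",
  "cluster_status",
]);

// Throws before a request body naming a blocked operation is sent;
// otherwise returns the body unchanged so it can be passed along.
function assertAllowedOperation(body) {
  if (FABRIC_MANAGED_OPERATIONS.has(body.operation)) {
    throw new Error(
      `Refusing to run "${body.operation}": cluster management is handled by Fabric`
    );
  }
  return body;
}
```

Calling this in the one place where requests are built keeps an ad-hoc diagnostic script from reaching the cluster with one of these operations.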
