v3.0.2
Patch Changes
-
77b0f22: Fix JS `Date` binding in raw `sql` fragments. It caused runtime failures on the `postgres-js` and `neon-serverless` drivers, which (unlike `node-postgres`) don't natively encode `Date` in positional params when drizzle hasn't propagated column type info.

Three sites affected:

- `reconcile.ts`: `${runs.updatedAt} < ${olderThan}` rewritten via drizzle's typed `lt(col, date)` so the column's `timestamptz` encoder runs. The `EXISTS` subqueries (no JS values, only `NOW()`) stay raw.
- `queries.ts`: the cursor tuple compare `(createdAt, id) < (...)` casts the JS `Date` param to `::timestamptz` in SQL. A tuple compare can't go through `lt`, so the cast is the cheapest correct fix.
- `adapters/graphile/index.ts`: `add_job(... run_at => ${opts.runAt} ...)` is cast to `::timestamptz` for the same reason. This affected every delayed enqueue (sleeps, retries, the `delay` start opt).
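The cast approach used at the last two sites can be illustrated with a small, dependency-free sketch. Everything here is hypothetical: `rawSql`, `Cast`, and `asTimestamptz` are stand-ins written for this example and are not drizzle's API or the engine's actual helper. The idea is only that a `Date` is serialized to an ISO-8601 string and its placeholder gets an explicit `::timestamptz`, so the driver never has to guess how to encode it.

```typescript
// Hypothetical sketch: none of these names come from drizzle or the engine.
interface Fragment {
  text: string; // SQL with $1, $2, ... placeholders
  params: unknown[]; // values sent alongside the text
}

// Marker wrapper meaning "bind this value, then append an explicit SQL cast".
class Cast {
  constructor(readonly value: unknown, readonly sqlType: string) {}
}

// Serialize the Date ourselves and tag it for a ::timestamptz cast,
// so the result does not depend on driver-specific Date encoding.
function asTimestamptz(date: Date): Cast {
  return new Cast(date.toISOString(), "timestamptz");
}

// Minimal tagged template: numbers the placeholders and applies casts.
function rawSql(strings: TemplateStringsArray, ...values: unknown[]): Fragment {
  const params: unknown[] = [];
  let text = strings[0] ?? "";
  values.forEach((v, i) => {
    if (v instanceof Cast) {
      params.push(v.value);
      text += `$${params.length}::${v.sqlType}`;
    } else {
      params.push(v);
      text += `$${params.length}`;
    }
    text += strings[i + 1] ?? "";
  });
  return { text, params };
}

// Cursor tuple compare, in the spirit of the queries.ts site.
const cursor = { createdAt: new Date("2024-01-02T03:04:05.000Z"), id: "run_42" };
const fragment = rawSql`(created_at, id) < (${asTimestamptz(cursor.createdAt)}, ${cursor.id})`;
```

With these inputs, `fragment.text` is `(created_at, id) < ($1::timestamptz, $2)` and `fragment.params` is `["2024-01-02T03:04:05.000Z", "run_42"]`. Only a string and an explicit cast reach the driver.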
Consumers using `postgres-js` or `neon-serverless` no longer need to spin up a separate `node-postgres` handle for the engine's pool.

A single `ts(date)` helper in `src/util/sql-params.ts` centralizes the cast: every `Date` param in a raw `sql` fragment goes through it. That makes the cast easier to grep for and easier to extend (uuid, bigint, etc.) if the next driver-portability footgun shows up.

-
fcc8f99: Three more boot-time footgun warnings + structural cleanup.
Warnings (operator-tunable defaults that silently bite under load):

- `flow.config.unbounded_step_timeout`: no `defaultStepTimeoutMs` set; a hung step pins a worker slot indefinitely. Set `defaultStepTimeoutMs` (or pass `StepOpts.timeoutMs` on every step).
- `flow.config.no_retention`: no `retention` configured; `workflow.events` and terminal `workflow.runs` grow forever. Set `EngineOpts.retention` or run your own prune cron.
- (already shipped last patch) `flow.config.stuck_shorter_than_step_timeout`: the reconciler would resurrect a still-running step.
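As a rough illustration of how these three checks relate, here is a minimal sketch. The option names and warning codes are taken from the notes above. The `collectBootWarnings` function, the `EngineOptsSketch` type, and the exact conditions are assumptions made for this example, not the engine's implementation. The shape of `retention` is not described in this changelog, so it is typed as `unknown`.

```typescript
// Hypothetical sketch: function, type, and exact conditions are assumptions.
interface EngineOptsSketch {
  defaultStepTimeoutMs?: number;
  runningStuckMs?: number;
  retention?: unknown; // shape not specified in the changelog
}

// The notes state a 10 minute default for runningStuckMs.
const DEFAULT_RUNNING_STUCK_MS = 10 * 60_000;

function collectBootWarnings(opts: EngineOptsSketch): string[] {
  const warnings: string[] = [];
  const stuckMs = opts.runningStuckMs ?? DEFAULT_RUNNING_STUCK_MS;

  // No step timeout at all: a hung step holds its worker slot forever.
  if (opts.defaultStepTimeoutMs === undefined) {
    warnings.push("flow.config.unbounded_step_timeout");
  }
  // No retention: event and terminal-run tables grow without bound.
  if (opts.retention === undefined) {
    warnings.push("flow.config.no_retention");
  }
  // Stuck threshold below the step timeout: a live step can look crashed.
  if (opts.defaultStepTimeoutMs !== undefined && stuckMs < opts.defaultStepTimeoutMs) {
    warnings.push("flow.config.stuck_shorter_than_step_timeout");
  }
  return warnings;
}

const emptyConfig = collectBootWarnings({});
const mismatched = collectBootWarnings({
  runningStuckMs: 60_000,
  defaultStepTimeoutMs: 30 * 60_000,
  retention: {},
});
```

Here `emptyConfig` contains the `unbounded_step_timeout` and `no_retention` codes, while `mismatched` contains only `flow.config.stuck_shorter_than_step_timeout`.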
Stderr fallback for warnings. When `EngineOpts.logger` isn't provided, the engine now uses a logger that pipes `warn`/`error` to `process.stderr` (`debug`/`info` stay silent). Previously the default was a full noop, so boot validators warned into the void. Consumers who genuinely want silence still get it by passing their own no-op logger.

Internal restructure. Extracted `src/engine/internal-crons.ts` (reconciler + retention cron builders) and `src/engine/loggers.ts` (fallback + console presets). `engine.ts` went from 464 to 413 lines; `createEngine` reads more linearly. Default magic numbers are consolidated into named constants, using the existing `toMs("1m")` / `toMs("10m")` duration helpers for self-documenting time values.

-
645bc2a: Boot validator + docs for restart behavior.
Validator: warns at engine boot when `runningStuckMs < defaultStepTimeoutMs`. The mismatch produces a real bug class: a step running between the two bounds is indistinguishable from a crashed process, so the reconciler resurrects it and you get two concurrent attempts of the same run.

```ts
createEngine({
  runningStuckMs: 60_000, // 1 min
  defaultStepTimeoutMs: 30 * 60_000, // 30 min ← BAD: step can outlive stuck threshold
});
// warns: flow.config.stuck_shorter_than_step_timeout
```
Docs: new "Restart behavior" section in `docs/guide.md` covers:

- What survives a restart (`running` runs → reconciler; `sleeping` runs → graphile; `awaiting_signal` runs → DB rows + NOTIFY; idempotency keys; cron advisory locks).
- What doesn't (`handle.result`/`handle.wait` in-process Promise waiters die on crash; caller must retry).
- At-least-once step semantics: make external calls idempotent.
- Crash-recovery latency = `runningStuckMs` (default 10 min); tune lower for tighter recovery, but respect the new validator.
- Multi-instance / rolling deploy safety (`FOR UPDATE SKIP LOCKED` + cross-instance `NOTIFY`).
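To make the at-least-once point concrete, here is one common way to keep an external call idempotent when a step may run twice after a crash. This is a sketch under assumptions: `FakePaymentProvider` and `chargeStep` are hypothetical names, and the in-memory provider stands in for any external service that accepts an idempotency key. None of it is engine API.

```typescript
// Hypothetical sketch: an in-memory stand-in for an external service
// that deduplicates requests by idempotency key.
class FakePaymentProvider {
  private readonly seen = new Map<string, string>();
  chargeCount = 0;

  // A repeated key returns the original receipt instead of charging again.
  charge(idempotencyKey: string, amountCents: number): string {
    const previous = this.seen.get(idempotencyKey);
    if (previous !== undefined) return previous;
    this.chargeCount += 1;
    const receipt = `receipt-${this.chargeCount}-${amountCents}`;
    this.seen.set(idempotencyKey, receipt);
    return receipt;
  }
}

// Derive the key from stable identifiers (run id + step name) so that
// a re-executed step sends the same key as the first attempt.
function chargeStep(
  provider: FakePaymentProvider,
  runId: string,
  stepName: string,
  amountCents: number,
): string {
  return provider.charge(`${runId}:${stepName}`, amountCents);
}

const provider = new FakePaymentProvider();
const firstAttempt = chargeStep(provider, "run_1", "charge-card", 500);
// Simulated re-execution of the same step after a crash and recovery.
const secondAttempt = chargeStep(provider, "run_1", "charge-card", 500);
```

Both attempts return `receipt-1-500` and `provider.chargeCount` stays at 1, so the duplicate execution has no extra side effect. The key must come from values that survive a restart; a random key generated inside the step would defeat the deduplication.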