From bbcf19ecdaf3ac295887b9c3cccf425d63549fe7 Mon Sep 17 00:00:00 2001
From: Pierre Brisorgueil
Date: Sat, 2 May 2026 20:24:30 +0200
Subject: [PATCH 1/4] docs(billing/crons): add jitter & sharding guidance to README

---
 modules/billing/crons/README.md | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/modules/billing/crons/README.md b/modules/billing/crons/README.md
index 45599aa50..7250ae9df 100644
--- a/modules/billing/crons/README.md
+++ b/modules/billing/crons/README.md
@@ -53,6 +53,34 @@ spec:
 
 Repeat the manifest for `billing.extrasExpiration.js` and `billing.dunningSweep.js`, adjusting `name` and `schedule`.
 
+## Jitter & sharding
+
+Devkit-shipped crons run on identical UTC schedules across all consumer deployments. To avoid thundering-herd against a shared DB or external API:
+
+### Recommended pattern — startup jitter
+
+Wrap the cron handler in a `setTimeout` of 0–N seconds (N = jitter window) computed at process start, persisted across restarts:
+
+```js
+// illustrative — replace cron.schedule with your scheduler API
+const jitterMs = Math.floor(Math.random() * 60_000); // 0–60s window
+cron.schedule('0 2 * * 1', async () => {
+  await new Promise(r => setTimeout(r, jitterMs));
+  await BillingResetService.resetAllDue();
+});
+```
+
+### When to shard
+
+If your tenant count > 10k OR the operation touches a single table that doesn't tolerate concurrent writes well:
+- Shard by `organizationId` modulo N (e.g. 8 shards, each at a different hour offset: `0 2-9 * * 1`)
+- Or use a per-tenant queue with worker pool
+
+### Constraints
+
+- Don't jitter more than the operation's idempotency window — if reset is idempotent within 1h, jitter ≤ 30min. Beyond that, late-running jobs miss their window.
+- Don't jitter critical SLA-bound jobs (alerts, notifications) — jitter undermines time-sensitivity.
+
 ## Dependency: meterMode flag
 
 All scripts check `config.billing.meterMode` at startup. Downstream projects must set this flag to `true` in their project config to activate billing crons. The devkit default is `false` — all crons are no-ops until explicitly enabled.

From 6c194dfe42826cc3292db11a4fcbc2c78e36d4ae Mon Sep 17 00:00:00 2001
From: Pierre Brisorgueil
Date: Sat, 2 May 2026 20:32:04 +0200
Subject: [PATCH 2/4] docs(billing/crons): fix jitter & sharding guidance accuracy
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Remove misleading "persisted across restarts" prose — K8s CronJob scripts are invoked once per execution, so jitter is naturally per-invocation
- Replace cron.schedule() snippet (no-cron-dep model) with a self-executing top-level await delay pattern matching the actual entrypoint structure
- Add optional stable per-pod jitter derivation from HOSTNAME hash
- Add SHARD_INDEX/SHARD_TOTAL env-var example in CronJob manifest + script filter snippet to complete sharding implementation guidance
---
 modules/billing/crons/README.md | 38 +++++++++++++++++++++++++++------
 1 file changed, 32 insertions(+), 6 deletions(-)

diff --git a/modules/billing/crons/README.md b/modules/billing/crons/README.md
index 7250ae9df..e1ac08ca2 100644
--- a/modules/billing/crons/README.md
+++ b/modules/billing/crons/README.md
@@ -59,23 +59,49 @@ Devkit-shipped crons run on identical UTC schedules across all consumer deployme
 
 ### Recommended pattern — startup jitter
 
-Wrap the cron handler in a `setTimeout` of 0–N seconds (N = jitter window) computed at process start, persisted across restarts:
+These scripts are invoked once per CronJob execution and exit immediately after. Add a random delay at the top of your entrypoint to spread load across deployments:
 
 ```js
-// illustrative — replace cron.schedule with your scheduler API
+// Add at the top of your cron entrypoint (before the main logic)
+// Jitter is re-randomized on each CronJob invocation — this is intentional for K8s CronJobs.
+// For a stable per-pod offset, derive from process.env.HOSTNAME instead (see note below).
 const jitterMs = Math.floor(Math.random() * 60_000); // 0–60s window
-cron.schedule('0 2 * * 1', async () => {
-  await new Promise(r => setTimeout(r, jitterMs));
-  await BillingResetService.resetAllDue();
-});
+await new Promise(r => setTimeout(r, jitterMs));
+await BillingResetService.resetAllDue();
 ```
 
+> **Stable per-pod jitter (optional):** If you want the same pod to always fire at the same offset within the window, derive jitter from the pod hostname instead of `Math.random()`:
+> ```js
+> const seed = process.env.HOSTNAME ?? 'default';
+> const hash = [...seed].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0);
+> const jitterMs = Math.abs(hash) % 60_000;
+> ```
+
 ### When to shard
 
 If your tenant count > 10k OR the operation touches a single table that doesn't tolerate concurrent writes well:
 - Shard by `organizationId` modulo N (e.g. 8 shards, each at a different hour offset: `0 2-9 * * 1`)
 - Or use a per-tenant queue with worker pool
 
+To implement shard-based filtering, pass a `SHARD_INDEX` and `SHARD_TOTAL` env vars in the CronJob manifest:
+
+```yaml
+env:
+  - name: SHARD_INDEX
+    value: "0"  # 0..N-1
+  - name: SHARD_TOTAL
+    value: "8"
+```
+
+Then filter in the script:
+
+```js
+const shardIndex = parseInt(process.env.SHARD_INDEX ?? '0', 10);
+const shardTotal = parseInt(process.env.SHARD_TOTAL ?? '1', 10);
+// Only process orgs whose ID hashes to this shard
+const orgs = await Org.find({ $where: `this._id % ${shardTotal} === ${shardIndex}` });
+```
+
 ### Constraints
 
 - Don't jitter more than the operation's idempotency window — if reset is idempotent within 1h, jitter ≤ 30min. Beyond that, late-running jobs miss their window.
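
The jitter constraint in this patch (a 1h idempotency window implies jitter of at most 30min) could be enforced in code rather than left to convention. The following is a minimal sketch, not part of the patch: `pickJitterMs` and its option names are hypothetical, and the "half of the idempotency window" cap is an assumption generalized from the single 1h/30min example in the README text. It reuses the hostname-style string hash from the stable-jitter note when a `seed` is given, and falls back to a random draw otherwise.

```javascript
// Hypothetical helper (not from the patch): choose a jitter delay in milliseconds
// that never exceeds half of the operation's idempotency window.
// The 1/2 factor is an assumption taken from the "1h window, jitter <= 30min" example.
function pickJitterMs({ requestedWindowMs, idempotencyWindowMs, seed, random = Math.random }) {
  const capMs = Math.floor(idempotencyWindowMs / 2);
  const windowMs = Math.max(0, Math.min(requestedWindowMs, capMs));
  if (windowMs === 0) return 0; // nothing to spread over

  if (seed !== undefined) {
    // Stable offset: same 31-multiplier string hash as the README note,
    // truncated to a signed 32-bit integer by `| 0`.
    const hash = [...String(seed)].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0);
    return Math.abs(hash) % windowMs;
  }

  // Per-invocation offset: uniform in [0, windowMs).
  return Math.floor(random() * windowMs);
}
```

In an entrypoint this might be used as `await new Promise(r => setTimeout(r, pickJitterMs({ requestedWindowMs: 60_000, idempotencyWindowMs: 3_600_000 })))`, optionally passing `seed: process.env.HOSTNAME` for a stable per-pod offset.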

From d8a1b921bc2f787b7bf12169df35d1bf25fac525 Mon Sep 17 00:00:00 2001
From: Pierre Brisorgueil
Date: Sat, 2 May 2026 20:34:11 +0200
Subject: [PATCH 3/4] docs(billing/crons): replace $where shard filter with safe client-side hash
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

$where executes JS in MongoDB and is deprecated/disabled in newer versions. Replace with a client-side hash on _id strings — same O(n) complexity, no server-side JS dependency.
---
 modules/billing/crons/README.md | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/modules/billing/crons/README.md b/modules/billing/crons/README.md
index e1ac08ca2..61bb88e78 100644
--- a/modules/billing/crons/README.md
+++ b/modules/billing/crons/README.md
@@ -93,13 +93,18 @@ env:
     value: "8"
 ```
 
-Then filter in the script:
+Then filter in the script by hashing a stable field (e.g. the string representation of `_id`) against the shard count:
 
 ```js
 const shardIndex = parseInt(process.env.SHARD_INDEX ?? '0', 10);
 const shardTotal = parseInt(process.env.SHARD_TOTAL ?? '1', 10);
-// Only process orgs whose ID hashes to this shard
-const orgs = await Org.find({ $where: `this._id % ${shardTotal} === ${shardIndex}` });
+// Only process orgs assigned to this shard (stable hash on _id string)
+const allOrgs = await Org.find({}, '_id').lean();
+const orgs = allOrgs.filter(o => {
+  const id = o._id.toString();
+  const hash = [...id].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0);
+  return Math.abs(hash) % shardTotal === shardIndex;
+});
 ```
 
 ### Constraints

From 2013d78ed9d19c83a8746297a40314bc62c786ed Mon Sep 17 00:00:00 2001
From: Pierre Brisorgueil
Date: Sat, 2 May 2026 20:39:14 +0200
Subject: [PATCH 4/4] docs(billing/crons): fix CJS compatibility + variable shadowing + scalability note

- Wrap jitter snippet in async IIFE (CJS entrypoints don't support top-level await)
- Rename stable-jitter variables (hostHash/stableJitterMs) to avoid const redeclaration if both snippets appear in the same file
- Add scalability caveat on in-memory shard filter; point toward server-side $mod for high tenant counts
---
 modules/billing/crons/README.md | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/modules/billing/crons/README.md b/modules/billing/crons/README.md
index 61bb88e78..833443ccf 100644
--- a/modules/billing/crons/README.md
+++ b/modules/billing/crons/README.md
@@ -62,19 +62,21 @@ Devkit-shipped crons run on identical UTC schedules across all consumer deployme
 These scripts are invoked once per CronJob execution and exit immediately after. Add a random delay at the top of your entrypoint to spread load across deployments:
 
 ```js
-// Add at the top of your cron entrypoint (before the main logic)
+// Wrap in an async IIFE — cron entrypoints are CommonJS, so top-level await is not available.
 // Jitter is re-randomized on each CronJob invocation — this is intentional for K8s CronJobs.
 // For a stable per-pod offset, derive from process.env.HOSTNAME instead (see note below).
-const jitterMs = Math.floor(Math.random() * 60_000); // 0–60s window
-await new Promise(r => setTimeout(r, jitterMs));
-await BillingResetService.resetAllDue();
+(async () => {
+  const jitterMs = Math.floor(Math.random() * 60_000); // 0–60s window
+  await new Promise(r => setTimeout(r, jitterMs));
+  await BillingResetService.resetAllDue();
+})();
 ```
 
-> **Stable per-pod jitter (optional):** If you want the same pod to always fire at the same offset within the window, derive jitter from the pod hostname instead of `Math.random()`:
+> **Stable per-pod jitter (optional):** If you want the same pod to always fire at the same offset within the window, derive jitter from the pod hostname instead of `Math.random()`. Use a distinct variable name to avoid shadowing if both snippets appear in the same file:
 > ```js
 > const seed = process.env.HOSTNAME ?? 'default';
-> const hash = [...seed].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0);
-> const jitterMs = Math.abs(hash) % 60_000;
+> const hostHash = [...seed].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0);
+> const stableJitterMs = Math.abs(hostHash) % 60_000;
 > ```
 
 ### When to shard
@@ -98,12 +100,14 @@ Then filter in the script by hashing a stable field (e.g. the string representat
 ```js
 const shardIndex = parseInt(process.env.SHARD_INDEX ?? '0', 10);
 const shardTotal = parseInt(process.env.SHARD_TOTAL ?? '1', 10);
-// Only process orgs assigned to this shard (stable hash on _id string)
+// Only process orgs assigned to this shard (stable hash on _id string).
+// Note: this loads all _id values into memory. For very large tenant counts,
+// prefer a server-side filter (e.g. MongoDB $expr + $mod on a numeric shard key).
 const allOrgs = await Org.find({}, '_id').lean();
 const orgs = allOrgs.filter(o => {
   const id = o._id.toString();
-  const hash = [...id].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0);
-  return Math.abs(hash) % shardTotal === shardIndex;
+  const h = [...id].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0);
+  return Math.abs(h) % shardTotal === shardIndex;
 });
 ```
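
The shard filter in the final hunk can be factored into pure functions, which makes the assignment testable without a database and lets the env vars be validated before any work starts. This is a sketch under stated assumptions, not the patch author's code: `hashString`, `shardOf`, `parseShardEnv`, and `shardFilter` are hypothetical names, and `shardFilter` assumes each document carries a non-negative integer field called `shardKey` (not present in the original README). It uses MongoDB's `$mod` query operator, whose form is `{ field: { $mod: [divisor, remainder] } }`, as one possible shape for the server-side alternative the patch comment points toward.

```javascript
// Same string hash as the README snippet: 31-multiplier, truncated to int32 by `| 0`.
function hashString(s) {
  return [...s].reduce((a, c) => ((a << 5) - a + c.charCodeAt(0)) | 0, 0);
}

// Map an ID (e.g. an ObjectId's string form) to a shard in [0, shardTotal).
function shardOf(id, shardTotal) {
  return Math.abs(hashString(String(id))) % shardTotal;
}

// Hypothetical helper: parse SHARD_INDEX / SHARD_TOTAL and fail fast on bad values,
// instead of silently processing nothing when parseInt yields NaN.
function parseShardEnv(env) {
  const shardIndex = Number.parseInt(env.SHARD_INDEX ?? '0', 10);
  const shardTotal = Number.parseInt(env.SHARD_TOTAL ?? '1', 10);
  if (!Number.isInteger(shardTotal) || shardTotal < 1) {
    throw new Error(`SHARD_TOTAL must be a positive integer, got "${env.SHARD_TOTAL}"`);
  }
  if (!Number.isInteger(shardIndex) || shardIndex < 0 || shardIndex >= shardTotal) {
    throw new Error(`SHARD_INDEX must be in 0..${shardTotal - 1}, got "${env.SHARD_INDEX}"`);
  }
  return { shardIndex, shardTotal };
}

// Server-side alternative. ASSUMPTION: documents store a non-negative integer
// `shardKey` field; the returned object is a plain MongoDB query filter.
function shardFilter(shardIndex, shardTotal) {
  return { shardKey: { $mod: [shardTotal, shardIndex] } };
}
```

With these helpers, the in-memory variant reduces to `allOrgs.filter(o => shardOf(o._id, shardTotal) === shardIndex)`, while a deployment that maintains a `shardKey` field could pass `shardFilter(shardIndex, shardTotal)` to the query instead of loading every `_id`.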