Webhooks Part 2#2618
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughAdds a complete webhook subsystem: DB schema/migrations, WAL-driven event generation, cached webhook model, multi-threaded processor and sender (DoH + SSRF checks), JWT payloads/signing, public payload route, client SDK + admin/platform integration, dashboard UI, sandbox receivers, tests, and supporting tooling/config updates. ChangesBackend webhook core & infra
Client SDK, admin/platform integration, dashboard UI, sandbox
Tooling, flags, config, tests, formatting, hooks
sequenceDiagram
participant Dash as Dashboard/Admin UI
participant Server as Instant API
participant DB as Postgres
participant Invalidator as Invalidator
participant Processor as Webhook Processor
participant Sender as Webhook Sender
participant Client as Webhook Receiver
Dash->>Server: Create/Update/Delete webhook via dash API
Server->>DB: INSERT/UPDATE webhooks
DB->>Invalidator: WAL entries for app data changes
Invalidator->>Processor: notify WebhookEvents (local or gRPC)
Processor->>DB: claim_webhook_events() to atomically claim work
Processor->>Sender: send-webhook (sign, DoH resolve, POST)
Sender->>Client: POST payload (Instant-Signature + JWT)
Client->>Sender: Response (status/body)
Sender->>Processor: return WebhookAttempt result
Processor->>DB: record attempt & update webhook_events status
Estimated code review effort 🎯 5 (Critical) | ⏱️ ~120+ minutes Possibly related PRs
Suggested reviewers
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/resources/migrations/107_add_webhooks.down.sql`:
- Around line 1-2: The DROP FUNCTION statements for explain_claim_webhook_events
and claim_webhook_events are missing argument type signatures; update them to
include the exact parameter type lists from their corresponding CREATE FUNCTIONs
(omit parameter names and DEFAULT clauses) — e.g., replace "drop function
explain_claim_webhook_events;" and "drop function claim_webhook_events;" with
DROP FUNCTION explain_claim_webhook_events(<param_types>); and DROP FUNCTION
claim_webhook_events(<param_types>); so PostgreSQL can unambiguously identify
the overloads (alternatively use IF EXISTS with signatures if preferred).
In `@server/resources/migrations/107_add_webhooks.up.sql`:
- Around line 83-88: The migration introduces a function named
claim_webhook_events but the caller still invokes claim_webhook_payloads; to
avoid breaking runtime callers either add a compatibility wrapper function named
claim_webhook_payloads that forwards to claim_webhook_events (preserving the
same signature: p_machine_id, p_max_apps, p_max_per_app and returning setof
webhook_events) or rename/update the Clojure caller in
server/src/instant/model/webhook.clj to call claim_webhook_events; implement the
wrapper in the same migration so existing callers continue to work until the
client code is updated.
- Around line 100-127: The claim query currently only filters by status =
'pending' so rows with a future retry are being reclaimed; update both the
next_app CTE and the per-app lateral subquery (locked) in the webhook_events
claim SQL to add "next_attempt_after <= now()" alongside "status = 'pending'",
and make the identical change in the explain_claim_webhook_events query so the
retry backoff is honored when selecting and explaining claimed webhook_events.
In `@server/src/instant/model/webhook.clj`:
- Around line 72-78: The 2-arity variant of get-by-app-id-and-webhook-id! drops
the caller's conn and calls get-by-app-id-and-webhook-id!* without it, causing
transactional/write callers to fall back to the read pool; update the non-read
branch to call (get-by-app-id-and-webhook-id!* conn params) so the provided conn
is forwarded to get-by-app-id-and-webhook-id!* (ensure the function signatures
align and the conn is used consistently).
- Around line 42-59: The cache webhook-with-etypes-cache holds :etypes derived
from attrs and can become stale when attrs change; update the cache eviction
logic (the code that currently calls evict-webhook-from-cache for "webhooks" WAL
rows) to also handle WAL rows for "attrs": on an attrs change, find affected
webhooks (those in the same app that reference the changed attr id via their
id-attr-ids) and call evict-webhook-from-cache for each {:app-id app-id
:webhook-id webhook-id}, or at minimum evict all entries for the app when attrs
change; use the existing evict-webhook-from-cache helper and the webhook/query
identifiers (webhook-with-etypes-cache, evict-webhook-from-cache,
get-by-app-id-and-webhook-id-q) to locate and implement this behavior.
In `@server/src/instant/system_catalog.clj`:
- Line 63: You changed the shortcodes used by encode-system-uuid for keys like
"content-type" and "userInfo", which will change generated UUIDs and break
resolution of existing persisted IDs; either revert those shortcode changes or
implement backward-compat handling: keep the previous shortcode values in the
mapping (as legacy aliases) and add logic in encode-system-uuid and the
lookup/resolution path to accept/translate legacy shortcodes (or run a migration
to reassign persisted entities to the new UUIDs). Update the mapping used by
encode-system-uuid to contain both current and legacy shortcode entries and
ensure any functions that resolve system UUIDs check aliases so old IDs remain
resolvable.
In `@server/src/instant/webhook_jwt.clj`:
- Around line 84-85: The validation error handlers currently pass the full JWT
(token-string) into ex/throw-validation-err! which can leak credentials; update
the branches that call ex/throw-validation-err! (the checks around .verify
parsed-jwt tink-verifier and the similar checks at lines ~100-101) to omit or
redact token-string from the exception data—e.g., pass nil or a fixed
placeholder like "<redacted>" instead of token-string in the invalid value
position so the thrown validation error no longer contains the raw bearer token.
- Around line 82-85: SignedJWT/parse can throw java.text.ParseException for
malformed compact JWTs which currently surfaces as an internal error; wrap the
call to SignedJWT/parse in a try/catch that catches ParseException and calls
ex/throw-validation-err! with the same shape used for invalid JWTs (e.g., :jwt
token-string [{:message "Invalid JWT."}]), then proceed to verify the parsed JWT
with .verify using tink-verifier as before; update the code around
SignedJWT/parse / .verify to ensure all malformed/invalid tokens are converted
to the validation error via ex/throw-validation-err!.
In `@server/test/instant/webhook_jwt_test.clj`:
- Around line 59-75: The test webhook-payload-jwt-rejects-tampered-token
currently mints a token with specific claims but calls
webhook-jwt/verify-webhook-payload-jwt with fresh random app-id/webhook-id which
lets claim-mismatch hide signature verification regressions; change the
let-binding to capture the original claims map (e.g. bind claims (hash-map
:app-id ... :webhook-id ... :isn ...)), use webhook-jwt/webhook-payload-jwt on
that claims binding to produce token, and pass the same claims binding into
webhook-jwt/verify-webhook-payload-jwt when verifying the tampered token so the
failure can only come from signature verification.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b796237c-4ce8-43bf-b61a-281f0d6b24e6
📒 Files selected for processing (15)
server/deps.ednserver/resources/migrations/107_add_webhooks.down.sqlserver/resources/migrations/107_add_webhooks.up.sqlserver/src/instant/isn.cljserver/src/instant/jdbc/cache_evict.cljserver/src/instant/jdbc/sql.cljserver/src/instant/jdbc/wal.cljserver/src/instant/model/webhook.cljserver/src/instant/system_catalog.cljserver/src/instant/util/crypt.cljserver/src/instant/webhook_jwt.cljserver/src/instant/webhook_routes.cljserver/test/instant/model/webhook_test.cljserver/test/instant/webhook_jwt_test.cljserver/test/instant/webhook_routes_test.clj
| "codeHash" "codehash" | ||
| "content-disposition" "cdisp" | ||
| "content-type" "c-type" | ||
| "content-type" "ctype" |
There was a problem hiding this comment.
I updated the encoder to throw if we use an unsupported character. This was using the unsupported character -. Removing the char will return the same value because we silently dropped it before (c-type became ctype before we encoded, so this is just doing the same thing). (same for userInfo below).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/src/instant/webhook_jwt.clj`:
- Around line 76-77: The JWT verifier currently passes nil for required claims
so exp is optional; update the DefaultJWTClaimsVerifier construction to pass the
required-claims HashSet (defined as required-claims which contains
JWTClaimNames/EXPIRATION_TIME) instead of nil so the exp claim is enforced by
DefaultJWTClaimsVerifier; locate the place where DefaultJWTClaimsVerifier is
invoked and replace the nil required-claims argument with required-claims.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a3290586-c434-42a5-9f72-3f38b34d680c
📒 Files selected for processing (5)
server/deps.ednserver/resources/migrations/107_add_webhooks.down.sqlserver/src/instant/model/webhook.cljserver/src/instant/webhook_jwt.cljserver/test/instant/webhook_jwt_test.clj
✅ Files skipped from review due to trivial changes (2)
- server/deps.edn
- server/resources/migrations/107_add_webhooks.down.sql
🚧 Files skipped from review as they are similar to previous changes (2)
- server/test/instant/webhook_jwt_test.clj
- server/src/instant/model/webhook.clj
|
View Vercel preview at instant-www-js-webhooks-2-jsv.vercel.app. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/src/instant/model/webhook.clj`:
- Around line 48-56: Remove the stray debug call to tool/def-locals inside
add-attr-listener: the function add-attr-listener currently invokes
(tool/def-locals) which references an unrequired namespace and will cause a
runtime error; delete that call so the compute lambda only builds/returns the
ConcurrentHashMap/newKeySet, adds the map {:app-id (:app_id webhook) :webhook-id
(:id webhook)} and returns the set. Ensure no other references to
tool/def-locals remain in add-attr-listener.
In `@server/src/instant/util/token.clj`:
- Around line 51-53: The current coercion treats any string starting with "eyJ"
as a JWT; replace the (string/starts-with? s "eyJ") heuristic with a strict
compact-JWT shape check (three base64url segments separated by two dots). In the
token coercion code where JWTString is constructed, validate that s matches a
regex like "^[A-Za-z0-9_-]+\\.[A-Za-z0-9_-]+\\.[A-Za-z0-9_-]+$" (or split on '.'
and ensure there are exactly 3 non-empty segments) before returning (JWTString.
s); otherwise fall back to the non-JWT path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 37eb2e68-7122-4bc1-9afa-f1e337b66396
📒 Files selected for processing (8)
server/deps.ednserver/resources/migrations/107_add_webhooks.down.sqlserver/src/instant/jdbc/cache_evict.cljserver/src/instant/model/webhook.cljserver/src/instant/util/token.cljserver/src/instant/webhook_jwt.cljserver/src/instant/webhook_routes.cljserver/test/instant/webhook_jwt_test.clj
✅ Files skipped from review due to trivial changes (3)
- server/deps.edn
- server/resources/migrations/107_add_webhooks.down.sql
- server/src/instant/webhook_routes.clj
🚧 Files skipped from review as they are similar to previous changes (3)
- server/src/instant/jdbc/cache_evict.clj
- server/test/instant/webhook_jwt_test.clj
- server/src/instant/webhook_jwt.clj
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@server/src/instant/model/webhook.clj`:
- Around line 172-176: The code currently calls attr-model/get-by-app-id to
fetch the live attrs and then uses uuids->labels for a historical wal-record,
causing label drift; change the flow to resolve labels using the schema snapshot
at the wal-record's isn (not the live schema) by adding or using a function like
attr-model/get-by-app-id-at-isn (or loading the persisted label mapping tied to
wal-record/isn) and pass that snapshot into topics/extract-entities-before /
uuids->labels so historical events serialize with stable keys for retries and
pull-based delivery.
In `@server/src/instant/webhook_routes.clj`:
- Around line 49-56: The get-payload handler currently returns sensitive webhook
snapshots via response/ok without cache headers; modify get-payload so the final
Ring response includes the header "Cache-Control" set to "no-store" (i.e., wrap
or assoc the response returned by response/ok to add headers). Update the return
in get-payload (which builds {:data data :idempotency-key ...}) to include the
Cache-Control header while keeping use of req->app-id-and-webhook-authed! and
webhook-model/payload-idempotency-key unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 06b2102a-bb03-45d2-a01a-e603700e272f
📒 Files selected for processing (8)
server/deps.ednserver/resources/migrations/107_add_webhooks.down.sqlserver/src/instant/jdbc/cache_evict.cljserver/src/instant/model/webhook.cljserver/src/instant/util/token.cljserver/src/instant/webhook_jwt.cljserver/src/instant/webhook_routes.cljserver/test/instant/webhook_jwt_test.clj
✅ Files skipped from review due to trivial changes (2)
- server/deps.edn
- server/test/instant/webhook_jwt_test.clj
🚧 Files skipped from review as they are similar to previous changes (2)
- server/src/instant/jdbc/cache_evict.clj
- server/resources/migrations/107_add_webhooks.down.sql
There was a problem hiding this comment.
🧹 Nitpick comments (1)
server/src/instant/webhook_routes.clj (1)
58-61: ⚡ Quick winPreserve existing headers when adding cache directives.
Line 58 currently replaces the full headers map. Prefer merging so future/pre-set headers aren’t accidentally dropped.
Suggested patch
- (assoc :headers {"Cache-Control" "no-store, private" - "Pragma" "no-cache" - "Expires" "0" - "Vary" "Authorization"})))) + (update :headers merge {"Cache-Control" "no-store, private" + "Pragma" "no-cache" + "Expires" "0" + "Vary" "Authorization"}))))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@server/src/instant/webhook_routes.clj` around lines 58 - 61, The current use of (assoc :headers {...}) in the webhook response replaces any pre-existing headers; change it to merge the cache directives into the existing headers instead (e.g., use update or merge so you combine the existing :headers map with {"Cache-Control" "no-store, private", "Pragma" "no-cache", "Expires" "0", "Vary" "Authorization"}). Locate the assoc :headers call in the webhook response builder within webhook_routes.clj and replace the direct assoc with a merge/update that preserves existing header keys while adding/overriding only the cache-related entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@server/src/instant/webhook_routes.clj`:
- Around line 58-61: The current use of (assoc :headers {...}) in the webhook
response replaces any pre-existing headers; change it to merge the cache
directives into the existing headers instead (e.g., use update or merge so you
combine the existing :headers map with {"Cache-Control" "no-store, private",
"Pragma" "no-cache", "Expires" "0", "Vary" "Authorization"}). Locate the assoc
:headers call in the webhook response builder within webhook_routes.clj and
replace the direct assoc with a merge/update that preserves existing header keys
while adding/overriding only the cache-related entries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f9ac3c2e-ad37-4981-b8c3-94885365621e
📒 Files selected for processing (2)
server/src/instant/util/crypt.cljserver/src/instant/webhook_routes.clj
|
|
||
| ;; Map of attr-id -> webhook-ids, used to invalidate the webhook | ||
| ;; cache if the underlying attrs change. | ||
| (def ^{:tag 'ConcurrentHashMap} attr-listeners (ConcurrentHashMap.)) |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (26)
client/www/pages/dash/index.tsx (1)
269-269: 💤 Low valueConsider consolidating the duplicate
useFlag('webhooks')calls.Both
Dashboard(line 269) andDashboardContent(line 956) independently read the same feature flag. While this provides defense-in-depth, you could passwebhooksEnabledas a prop toDashboardContentto avoid the duplicate hook call and make the data flow more explicit.Also applies to: 956-956
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/www/pages/dash/index.tsx` at line 269, Dashboard currently calls useFlag('webhooks') in both Dashboard and DashboardContent; remove the duplicate hook by reading the flag once in Dashboard (const webhooksEnabled = useFlag('webhooks')) and pass webhooksEnabled as a prop to DashboardContent, update DashboardContent's props interface/parameter to accept webhooksEnabled, and replace the internal useFlag('webhooks') call inside DashboardContent with the passed prop to make the data flow explicit and avoid duplicate hooks.server/test/instant/webhook_processor_test.clj (1)
55-59: 💤 Low valueConsider cleaning up the
webhooksrow in thefinallyblock.The test cleans up
webhook_events(lines 56-59) but leaves thewebhooksrow inserted at line 23. Ifwith-empty-appdoesn't automatically clean up test data, you may want to add:(finally (sql/do-execute! (aurora/conn-pool :write) ["delete from webhook_events where webhook_id = ? and isn = ? and partition_bucket = ?" webhook-id isn partition-bucket]) (sql/do-execute! (aurora/conn-pool :write) ["delete from webhooks where id = ?" webhook-id]))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/test/instant/webhook_processor_test.clj` around lines 55 - 59, Add cleanup of the inserted webhooks row in the same finally block that deletes webhook_events: after the existing sql/do-execute! call for webhook_events, call sql/do-execute! with (aurora/conn-pool :write) and a parameterized delete like ["delete from webhooks where id = ?" webhook-id] so the webhooks row created earlier (inserted around with-empty-app / webhook-id) is removed as well.server/src/instant/nippy.clj (1)
278-294: 💤 Low valueConsider whether
WebhookEventsserialization should usenippy/with-cache.The
WalRecordencoder (line 222) wraps serialization innippy/with-cacheto deduplicate repeated values. Ifevent-primary-keysinWebhookEventscontain many duplicate values (e.g., samewebhook_idacross multiple events), adding caching could reduce serialized size. If not, the current implementation is fine.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/nippy.clj` around lines 278 - 294, The WebhookEvents custom serializer currently freezes each primary key field without caching; if event-primary-keys often contain repeated values (e.g., same webhook_id/isn), wrap the serialization in nippy/with-cache similar to the WalRecord encoder so repeated values are deduplicated during freeze. Modify the nippy/extend-freeze for WebhookEvents to call nippy/with-cache around the body that calls nippy/freeze-to-out! and the per-key writes (write-uuid, write-isn, nippy/freeze-to-out! for partition_bucket) so identical values are cached, leaving extend-thaw unchanged unless you add complementary cache-aware thawing logic.client/packages/webhooks/package.json (1)
56-56: ⚡ Quick winConsider upgrading Vitest to the latest stable version.
Vitest 0.21.0 is from 2022. The current stable release is 2.x or later, which includes significant improvements and better TypeScript 5.x support. Upgrading would ensure compatibility with modern tooling and benefit from performance improvements.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/package.json` at line 56, The package.json currently pins Vitest at "vitest": "^0.21.0" which is very old; update that dependency entry to a current stable Vitest 2.x (or the latest stable release) in client/packages/webhooks/package.json, then run your package manager (npm/yarn/pnpm) to update the lockfile and install, run the test suite/CI to catch breaking changes, and adjust any Vitest config or TypeScript settings (vite.config/test, vitest.config, tsconfig) to address API or TypeScript 5.x compatibility issues introduced by the major upgrade.server/resources/migrations/109_add_webhooks.up.sql (4)
81-104: 💤 Low valueTrigger only fires on INSERT — UPDATE of
webhook_id/app_idis silently allowed.
check_refs_insert_triggerisbefore insert, so a strayUPDATE webhook_events SET webhook_id = …(orapp_id) could land on a non-existent webhook without the FK enforcement that real FK constraints would give you. Today nothing in the codebase appears to do that, but if you ever migrate rows or backfill, this guard won't catch it. Consider adding a correspondingBEFORE UPDATE OF webhook_id, app_idtrigger using the same function, or document the invariant.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/resources/migrations/109_add_webhooks.up.sql` around lines 81 - 104, The trigger only fires on INSERT, so update-only changes to webhook_id or app_id can bypass the check; add a BEFORE UPDATE trigger that fires when webhook_id or app_id are changed and calls the same check function (check_webhook_events_refs), e.g. create trigger check_refs_update_trigger before update of webhook_id, app_id on webhook_events for each row execute function check_webhook_events_refs(); ensure the new trigger name is unique and mirrors the insert trigger behavior so updates are validated the same way.
177-255: 💤 Low value
explain_claim_webhook_eventsduplicates the claim SQL; risk of drift.The SQL body is copied verbatim from
claim_webhook_events. Any future tuning (predicate, ordering, locking) needs to be applied in two places or the EXPLAIN output will diverge from production. Consider generating both from a single text fragment (e.g., afunction_formathelper), or at minimum adding a comment in both functions pointing at the other so reviewers remember to update both.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/resources/migrations/109_add_webhooks.up.sql` around lines 177 - 255, The explain_claim_webhook_events function duplicates the SQL from claim_webhook_events which risks drift; extract the shared SQL into a single helper (e.g., a new function get_claim_webhook_events_sql() or function_format helper) that returns the claim UPDATE ... RETURNING query as text, then have claim_webhook_events EXECUTE that helper's text and have explain_claim_webhook_events EXECUTE 'EXPLAIN ... ' || helper_text (or wrap helper text appropriately) so both use the same source; if you cannot refactor now, add clear cross-reference comments in both explain_claim_webhook_events and claim_webhook_events pointing at the other to remind reviewers to update both.
118-173: 💤 Low valueLong-pending events outside the 2-bucket window become unreachable.
buckets := array[current_bucket, (current_bucket + 12) % 13]only scans the current and immediately previous 30-day bucket, so apendingrow that sat unhandled for more than ~60 days (e.g., due to a sustained outage of the processor) lives in a partition the claim function will never look at, even after recovery.free_stuck_events!(kicker) and 13-bucket retention should mask this in normal operation, but if there's no other path that scans all buckets you may silently drop those events when the next truncation cycle reuses the partition. Worth either documenting this 60-day bound as the operational SLO for pending freshness, or havingfree_stuck_events!widen the bucket scan.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/resources/migrations/109_add_webhooks.up.sql` around lines 118 - 173, The claim_webhook_events function only scans two buckets via buckets := array[current_bucket, (current_bucket + 12) % 13], which leaves pending events older than ~60 days unreachable; update the logic in claim_webhook_events to build a full scan range (e.g., generate an array of all 13 bucket ids in rotation starting at current_bucket) and use that array instead of the two-element buckets variable so the FOR loop and the WITH queries (next_app / locked) can consider every partition, or alternatively document the 60-day operational SLO and ensure free_stuck_events! is adjusted to widen its bucket scan accordingly.
110-110: 💤 Low valueIndex column ordering for the retry-claim path.
The retry claim presumably filters by
next_attempt_after <= now() AND partition_bucket = ANY(...) AND status = 'error'. The partial index(next_attempt_after, partition_bucket, status) where status = 'error'makesstatusredundant (already implied by the WHERE), so the trailing column is dead weight. Postgres can still use the index efficiently for a range scan onnext_attempt_afterwith a secondary filter onpartition_bucket, but if the retry-claim function looks more like… where partition_bucket = ANY(buckets) AND next_attempt_after <= now() ORDER BY next_attempt_after, swapping to(partition_bucket, next_attempt_after)may be more selective. Worth verifying withexplain_claim_webhook_events-style EXPLAIN once retry claiming is in.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/resources/migrations/109_add_webhooks.up.sql` at line 110, The partial index on webhook_events currently lists columns as (next_attempt_after, partition_bucket, status) but the WHERE clause already restricts status = 'error' making the trailing status column redundant; replace the index with one keyed as (partition_bucket, next_attempt_after) and keep the same WHERE status = 'error' to better support queries that filter by partition_bucket and order/scan by next_attempt_after. Update the CREATE INDEX statement in 109_add_webhooks.up.sql to drop the redundant status column and swap column order, then validate the change with an EXPLAIN of the retry-claim path (use your explain_claim_webhook_events-style EXPLAIN) to confirm the planner uses the new index efficiently.client/packages/webhooks/src/index.ts (7)
1031-1037: 💤 Low valueHandler lookup:
handlers?.[etype]?.[action]returns theWebhookHandlerFncorrectly, but consider guarding whenetypeis a prototype name.
record.etypecomes from the server payload; if it ever contained"__proto__"or"constructor",handlers?.[etype]would resolve againstObject.prototypeand could return a non-function (e.g.Object) that's then called as a handler. Today the server controlsetype(it's from your schema), so this is a theoretical concern, but aObject.hasOwn(handlers, etype)check is cheap insurance.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 1031 - 1037, The handler lookup may accidentally resolve prototype properties if record.etype is "__proto__" or "constructor"; update the loop that builds handler from handlers/etype/action (the code referencing payload.data, record, etype, action, handlers, and WebhookHandlerFn) to first check Object.hasOwn(handlers, etype) (or Object.hasOwn(handlers?.[etype], action) as appropriate) before indexing into handlers[etype], and only fall back to handlers?.[etype]?.$default or handlers?.$default when the etype key is an own property; this guards against prototype pollution returning non-function values being called as handlers.
529-539: ⚡ Quick win
defaultJsonFetchonly treatsstatus === 200as success.If any of the manager endpoints (
create,enable,disable,resendEvent) ever return201 Createdor204 No Content, this throws anInstantAPIErrorfor what should be a successful response. The current server endpoints presumably return 200, but typical REST conventions for POSTs are 201/204 — worth either:
- accepting any 2xx (
res.ok), or- documenting that all Instant management endpoints return exactly 200.
For 204,
res.json()will also throw (empty body), so this needs to be handled if any endpoint can return 204.♻️ Suggested change
- if (res.status === 200) { - return res.json(); + if (res.ok) { + if (res.status === 204) return undefined; + return res.json(); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 529 - 539, defaultJsonFetch currently treats only status === 200 as success and calls res.json(), causing errors for 2xx responses like 201 or 204; update defaultJsonFetch to treat any res.ok (any 2xx) as success and handle 204 No Content specially by returning a sensible empty value (e.g., null or an empty object) instead of calling res.json(), while still delegating non-2xx responses to jsonReject; reference the defaultJsonFetch function and jsonReject helper when making this change.
1017-1046: 💤 Low value
processPayloadruns handlers concurrently; rejected handlers don't cancel siblings.
Promise.allrejects on the first rejection, but already-started handler promises keep executing in the background — their side effects (DB writes, external API calls) will still complete afterprocessRequesthas already returned an error to Instant (which will then retry the whole event). Document this behavior clearly so users design their handlers idempotently against duplicate runs from retry, or considerPromise.allSettledplus an aggregate error for surfacing all failures.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 1017 - 1046, processPayload currently starts all handler promises and awaits Promise.all, which causes already-started handlers to keep running after the first rejection and can hide other failures; update processPayload to use Promise.allSettled on the collected handler promises, then inspect results and if any settled with status "rejected" throw a single aggregated Error (or a custom AggregateError) that includes all rejection reasons so processRequest can surface all failures; reference the function names processPayload and processRequest and the local variable results (handler promise array) when making this change.
494-503: 💤 Low value
validateTaccepts arbitrarily future-dated signatures.
age = receivedAt - t; onlyage > toleranceis checked, so a signature witht = now + 10_000_000(clock skew, server bug, or hostile signer) yields negativeageand passes. Signing coverstso an attacker can't fabricate this, but defense-in-depth (and easier debugging when clocks drift) suggests also rejectingage < -tolerance.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 494 - 503, The validateT function currently only rejects signatures older than tolerance but allows arbitrarily future-dated t; update validateT to also reject when age < -tolerance (i.e., t is more than tolerance seconds in the future). When that condition occurs throw an InstantError (same class used now) with a clear message like "Webhook signature is too far in the future" and include the same context object (tolerance, receivedAt, t) so debugging includes the offending values; keep the existing old-age check and semantics otherwise.
441-447: 💤 Low value
hexToUint8Arraysilently mishandles odd-length input.For an odd-length hex string (e.g.
"abc"),Math.ceil(3/2) = 2, and the last byte is parsed from a single char ('c'→0x0c, not0xc0). For legitimate hex signatures (always even length) this never fires, but if a malformed/garbledv1value reaches this function the parse will produce a different-but-not-obviously-wrong byte sequence and signature verification will silently fail with the generic "did not validate" error. Reject odd-length input up front to make this case diagnosable.♻️ Suggested change
function hexToUint8Array(hexString: string): Uint8Array<ArrayBuffer> { + if (hexString.length % 2 !== 0) { + throw new InstantError('Signature hex string has odd length', { + length: hexString.length, + }); + } const bytes = new Uint8Array(Math.ceil(hexString.length / 2));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 441 - 447, The hexToUint8Array function currently parses odd-length hex strings into incorrect bytes; add an upfront validation in hexToUint8Array to reject odd-length input (e.g., if (hexString.length % 2 !== 0) throw new Error("Invalid hex string: odd length")) so malformed/garbled v1 values fail fast and produce a clear error instead of silently producing an incorrect byte sequence; keep the rest of the parsing logic in hexToUint8Array unchanged.
1244-1257: 💤 Low valueRe-serializing parsed JSON is best-effort and can silently fail signature verification.
JSON.stringify(parsedObject)is not guaranteed to reproduce the exact byte sequence the server signed — key order, numeric formatting, and whitespace can all differ. For Instant's minimal{ payloadUrl, token }body this typically round-trips, but if the server ever adds a field or changes formatting (or a middleware mutates the parsed object), every webhook will silently fail validation. The fallback error message at Line 1300-1303 is a great UX rescue, but consider also logging aconsole.warnthe first timebodyWasReserializedis hit so users notice the misconfiguration before delivery fails.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 1244 - 1257, The code currently re-serializes parsed JSON into rawBody and sets bodyWasReserialized, which can silently break signature verification; update the branch that sets bodyWasReserialized to also emit a one-time warning so users notice misconfiguration: add a module-level flag (e.g., reserializeWarningLogged) and inside the block where rawBody = JSON.stringify(bodyValue); bodyWasReserialized = true; call console.warn (only if the flag is false) describing that the request body was already parsed and re-serialization may alter bytes used for signature verification, then set the flag to true; keep the existing InstantError throw-on-failure behavior unchanged so failures still surface.
449-492: ⚡ Quick win
parseSignatureHeaderis brittle to whitespace in header values.
"t=123, kid=abc"splits to[" kid", "abc"], so" kid"won't hit the'kid'case and the function will reportkidas missing. The Instant server currently emits the header without spaces, but HTTP intermediaries occasionally normalize header values, and this kind of failure mode produces a misleading "Invalid Instant-Signature header" error. Either trim each part, or use a regex that allows optional whitespace.♻️ Suggested change
- for (const part of h.split(',')) { - const [k, v] = part.split('='); + for (const rawPart of h.split(',')) { + const part = rawPart.trim(); + const eqIdx = part.indexOf('='); + if (eqIdx < 0) continue; + const k = part.slice(0, eqIdx).trim(); + const v = part.slice(eqIdx + 1).trim(); switch (k) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 449 - 492, The parseSignatureHeader function is brittle to whitespace: when header parts include spaces (e.g., "t=123, kid=abc") the key string won't match and kid/v1 are reported missing; fix parseSignatureHeader by trimming each part and trimming the key and value after splitting on '=' (or use a regex that allows optional whitespace) so keys 't', 'kid', and 'v1' are recognized even with surrounding spaces before validating and potentially throwing InstantError with header h and missingKeys; update the parsing loop that sets t, kid, v1 accordingly.server/src/instant/webhook_sender.clj (5)
95-95: 💤 Low valueTop-level
def client (make-client)runs at namespace load.This creates the OkHttpClient (executor + virtual threads + DoH bootstrap) the moment the namespace is required, before any application-level initialization runs. If
make-clientever depends on config (e.g., proxy settings, smokescreen flags) that's not yet available at load time, this becomes hard to test/override. Tests already usewith-redefsonvalidate-urlto bypass it, but the underlyingclientcan't be swapped per-test. Consider wrapping in adelay/defonceso initialization is deferred to first use.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/webhook_sender.clj` at line 95, The top-level def client invokes make-client at namespace load which eagerly constructs the OkHttpClient (executor/virtual threads/DoH) before app-level config/tests can modify settings; change client to be lazily initialized (e.g., wrap make-client in a delay or use defonce with a thunk) so make-client runs on first deref/use instead of require; update call sites that currently reference client to deref it (or call the accessor) and keep the symbol names client and make-client so tests can with-redefs or swap the delayed/once value as needed.
141-179: 💤 Low value
send-webhookdoesn't guard againstHttpUrl/parsereturningnull.If a row's
sink.urlis ever invalid (post-validation corruption, manual DB edit, future code path that skipsvalidate-url),parsed-urlisnulland(.url builder null)throws inside the builder beforeensure-safe-host!runs. The outercatch Exceptiondoes fall through tomake-exception-attemptwith["unknown" "Unknown error."], so the row is recorded as a generic failure instead of a clearly-classified"dns"/"validation"error — which makes operational triage harder. Worth adding an explicit null check that returns a typedWebhookAttemptwitherror-type "url"or similar.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/webhook_sender.clj` around lines 141 - 179, The send-webhook function must guard against HttpUrl/parse returning nil: after computing parsed-url in send-webhook, add an explicit nil check before using it in the Request$Builder and before calling ensure-safe-host!; if parsed-url is nil return a WebhookAttempt (created like the existing success/error branch) representing an immediate failure with start timestamp, zero duration, success? false, a suitable HTTP code (e.g. 0 or nil as used elsewhere), a concise response-text, and pass error-type "url" and an explanatory error-message via the same path used by make-exception-attempt so callers and telemetry (and functions like ensure-safe-host!, make-exception-attempt, and tracing) can classify it as a URL/validation error rather than an unknown exception.
48-59: 💤 Low value
invokeAnysemantics worth confirming: only the winner's result is used; losers' errors are silently dropped.
ExecutorService.invokeAnyreturns the result of the first successfully completed task and cancels the rest. If Cloudflare returns[badIP, goodIP]and finishes first, you use Cloudflare's result and never compare to Google's. Combined with the post-resolutionbad-ip?filter that's still safe against SSRF, but operationally if Cloudflare is consistently slow you'll silently lose its visibility (and vice versa). Confirm whether you want this race-for-first behavior vs. e.g. preferring a fixed primary with fallback.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/webhook_sender.clj` around lines 48 - 59, race-dns-resolve currently uses ExecutorService.invokeAny which returns only the first successful task and silently drops others; decide whether you want "race-to-first" or "primary-with-fallback" semantics and change the implementation accordingly: if you want a strict race keep invokeAny but add a comment and tracing to document that only the fastest resolver wins (and retain the bad-ip? post-filter), otherwise replace invokeAny with a deterministic approach such as submitting the primary resolver first and falling back to others on failure, or use .invokeAll to run all lookups, collect each task's result/errors, and then select based on your preference (e.g., prefer primary tag or prefer most recent/good IP) while preserving tracer/add-data! for winner and recording errors for losers.
65-78: 💤 Low valueAvoid relying on undocumented mutability of DnsOverHttps return value.
The
okhttp3.Dnsinterface declareslookup()as returningList<InetAddress>, notArrayList. OkHttp's current implementation may return a mutableArrayList, but that's an implementation detail. If a future OkHttp release switches toList.copyOf()or an immutable wrapper,Collection/.removeIfwill throwUnsupportedOperationException. Defensively create a mutable copy:♻️ Suggested change
(reify Dns (lookup [_ hostname] - (let [ips (race-dns-resolve executor dns-clients hostname)] - ;; Mutates in place (dns resolver returns an ArrayList) - (Collection/.removeIf ips bad-ip?) - ips))))) + (let [ips (race-dns-resolve executor dns-clients hostname) + filtered (java.util.ArrayList. ^Collection ips)] + (Collection/.removeIf filtered bad-ip?) + filtered)))))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/webhook_sender.clj` around lines 65 - 78, The lookup implementation in make-dns-resolver mutates the List returned by race-dns-resolve which may be immutable; instead, create a new mutable copy of the ips list before calling Collection/.removeIf—i.e., after calling race-dns-resolve in lookup, construct a mutable list from that result, apply removeIf with bad-ip? against the mutable copy, and return that copy; reference make-dns-resolver, the lookup method and race-dns-resolve/bad-ip? when making the change.
181-192: ⚡ Quick winRemove the dead
try/catcharoundHttpUrl/parseand explicitly check fornull.OkHttp's
HttpUrl.parse(String)returnsnullfor malformed input—it never throws an exception. Thetry/catchat lines 182–185 is dead code that doesn't catch anything. When an invalid URL is passed,nullflows intoensure-safe-host!at line 187, causing aNullPointerExceptionwhen calling(.host nil). This NPE is then caught by the outer(catch Exception _)at line 191–192, surfacing as"Could not resolve URL."— which is misleading because the URL was never resolvable, it was malformed.Check for
nilexplicitly afterHttpUrl/parseso the validation error accurately reflects the cause:♻️ Suggested change
(defn validate-url [^String url] - (let [url (try - (HttpUrl/parse url) - (catch Exception _ - (ex/throw-validation-err! :webhook {:url url} [{:message "Invalid URL."}])))] + (let [parsed (HttpUrl/parse url) + _ (when (nil? parsed) + (ex/throw-validation-err! :webhook {:url url} [{:message "Invalid URL."}])) + url parsed] (try (ensure-safe-host! url) - (when (empty? (.lookup (.dns client) - (HttpUrl/.host url))) + (when (empty? (.lookup (.dns client) (HttpUrl/.host url))) (throw (Exception. "Could not resolve URL."))) (catch Exception _ (ex/throw-validation-err! :webhook {:url url} [{:message "Could not resolve URL."}])))))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/webhook_sender.clj` around lines 181 - 192, In validate-url, remove the dead try/catch around HttpUrl/parse and instead call HttpUrl/parse once, check for nil explicitly and call ex/throw-validation-err! :webhook {:url url} [{:message "Invalid URL."}] when parse returns nil; if non-nil, pass the parsed HttpUrl to ensure-safe-host! and continue the DNS resolution check using (.host parsed) so a malformed URL yields "Invalid URL." and only unresolved but well-formed hosts hit the "Could not resolve URL." validation error.server/src/instant/webhook_processor.clj (4)
67-81: 💤 Low value
should-repeat?returnsnilwheneventsis empty / no limit hit; consider returningfalse.
when-letyieldsnilon emptyevents, and the innerwhenalso yieldsnilwhen neither limit is reached. The caller ((when (should-repeat? …) ::repeat)) treats both as falsey, so this is functionally correct but the predicate name implies a boolean. Either rename to make the tri-state explicit or normalize toboolean.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/webhook_processor.clj` around lines 67 - 81, The predicate should-repeat? currently returns nil when events is empty or no limit is hit; update should-repeat? to always return a boolean (true/false) instead of nil by normalizing the result: ensure the loop returns true when a limit is hit and false otherwise (e.g., replace the outer when-let with an if-let or wrap the final result with boolean), targeting the should-repeat? function and its local symbols events, max-apps, max-per-app, acc, next-count, next-acc so callers receive an explicit false instead of nil.
91-108: 💤 Low valueMagic numbers
max-apps/max-per-apphardcoded inwork!.The retry path uses
max-events = 100. These limits aren't exposed viaflags(unlike worker counts), so operational tuning requires a redeploy. Consider promoting them todefs neardefault-worker-count(or flag-driven) so they're discoverable and adjustable consistently with worker counts.
25-33: 💤 Low value
notify-eventsdiscardsevent-primary-keyswhile the queue rebroadcast still relies on(.offer q 0)fromnotify-ready.The current dispatch is fine for now (workers re-poll the DB), but the 1-arg version silently drops events when
@processisnil(e.g., during startup races beforestart-globalhas run). Oncestart-globalis called the kicker will catch up after ~2 minutes (10 in dev), so this is bounded but worth either:
- documenting the startup-latency window in the docstring, or
- enqueuing into a small in-memory buffer that
startdrains on boot.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/webhook_processor.clj` around lines 25 - 33, notify-events currently drops event-primary-keys when `@process` is nil (startup races); fix by buffering keys in a small in-memory atom and draining them on start: create an atom (e.g., pending-events), have the 1-arg notify-events push incoming event-primary-keys into pending-events if `@process` is nil, and in start-global (or the function that initializes process) drain pending-events and call notify-events with the collected keys or invoke the existing notify-ready logic so those events are processed; reference symbols: notify-events, event-primary-keys, process, pending-events (new), start-global, notify-ready.
110-119: 💤 Low valueUnbounded
notify-readyecho can pile up but is bounded by queue capacity — verify intent.
(.offer q 0)is non-blocking and silently drops once the queue (capacity =worker-count) is full. That's the intended throttle, but instart-retry-workerthe double(notify-retry)(notify-retry)after a repeat means a single repeating worker can saturate the queue and starve others' wake-up signals on a busy retry path. Confirm this ramp-up behavior is the desired contention model.Also applies to: 139-151
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/webhook_processor.clj` around lines 110 - 119, The retry notification can duplicate and overflow the ArrayBlockingQueue because (.offer q 0) is non-blocking and silently drops when full; in start-worker and start-retry-worker (functions start-worker, start-retry-worker) avoid calling notify-ready/notify-retry twice on a single repeat — either change the repeat path to call the notifier exactly once, or switch to a bounded blocking call (e.g., use put semantics or retry-on-fail until enqueued) or check the offer return value and only retry notification when appropriate; update the loop in start-worker and the retry loop in start-retry-worker to de-duplicate notifications so a single repeating worker cannot saturate the queue and starve others.server/test/instant/model/webhook_test.clj (2)
601-601: 💤 Low value
partition-bucket-for-time (Instant/now)can flake at bucket boundaries.If a test happens to start within seconds of a 30-day bucket rollover, the bucket computed here can differ from a subsequent
Instant/nowcapture inside production code (e.g., insiderecord-attempt!). Probability is small but non-zero. Pinning to a fixedInstant(e.g., reusedInstant/nowcaptured once and passed through) would make this deterministic.Also applies to: 662-662
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/test/instant/model/webhook_test.clj` at line 601, The test calls history/partition-bucket-for-time with Instant/now twice causing flakiness at 30-day bucket boundaries; capture Instant/now once at test start (e.g., bind to a local symbol like fixed-now) and use that fixed Instant when calling history/partition-bucket-for-time and when passing time into code under test (e.g., record-attempt!), and update the other occurrence noted (around the other occurrence at the same file region) to reuse the same fixed Instant so bucket computations are deterministic.
614-642: 💤 Low valueManual
update … set status = 'processing'couples this test to retry-claim internals.The comment correctly notes this simulates retry-claim re-locking, but if the production retry path ever changes which columns it touches (e.g., starts also writing
machine_idornext_attempt_after), this loop will silently pass even though the real code path is broken. Consider invoking the actual claim function (webhook-model/claim-retry-events!) here so the test exercises the production locking semantics, or document the invariant being asserted (status transitions only).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/test/instant/model/webhook_test.clj` around lines 614 - 642, The test currently uses a manual SQL update (sql/do-execute!) to re-lock the row which couples the test to retry-claim internals; replace that manual update with a call to the real claim function (webhook-model/claim-retry-events!) so the test exercises production locking semantics — invoke webhook-model/claim-retry-events! with the same connection/context and machine-id/partition-bucket/isn/webhook-id used in the loop after webhook/record-attempt! and before the next status check, and keep the existing assertions against status and next_attempt_after (or document the invariant if you prefer to assert only transitions) so the test fails if the production claim behavior changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@client/www/components/components/ui/item.tsx`:
- Around line 54-72: Item is missing the ARIA role for list children; update the
Item component so the rendered element (Comp) has role="listitem" by default
while still allowing callers to override it via props (i.e., if props.role is
provided, use that, otherwise set role to 'listitem'). Modify the return in
function Item (where Comp is Slot or 'div') to pass the role prop through
(role={props.role ?? 'listitem'}) so both direct divs and asChild Slot receive
the proper ARIA role matching ItemGroup's role="list".
In `@server/src/instant/dash/routes.clj`:
- Around line 1721-1736: The cursor currently stores created-at as toEpochMilli
which drops sub-millisecond precision; update encode-events-cursor and
decode-events-cursor to serialize the full Instant (seconds and nanoseconds)
instead of a single epoch milli. In encode-events-cursor, get the Instant from
created-at, write .getEpochSecond as a long and .getNano as an int into the
ByteBuffer (allocate +12 for 8+4 bytes plus isn bytes) before the isn bytes, and
use the same Base64 encode logic. In decode-events-cursor, read the epochSecond
(long) and nano (int) from the buffer and reconstruct the Instant with
Instant/ofEpochSecond(epochSecond, nano) before returning :created-at, and keep
isn reconstruction via isn/<-bytes unchanged. Ensure buffer sizes and .get/.put
usages match (long then int then byte[]).
In `@server/src/instant/smokescreen.clj`:
- Around line 38-44: has-ipv6-embedding? currently misses IPv4-mapped IPv6
addresses (::ffff:0:0/96), so update the function to detect that mapping as well
by adding a check for the IPv4-mapped prefix (e.g. include (.contains
ipv4-mapped-prefix ip)) or call the library helper (if present) such as
isIPv4Mapped on the IPAddress; modify the conjunction in has-ipv6-embedding? to
include this new check alongside nat64-well-known-prefix, six->four-prefix, and
teredo-prefix so ::ffff:127.0.0.1 and similar addresses are treated as embedded
IPv4.
In `@server/src/instant/util/test.clj`:
- Around line 373-390: The startup timeout check that derefs (:started-promise
opts#) must be moved out of the let bindings and into the try block so the
finally always runs on timeout; after creating worker# (future (wal/start-worker
opts#)) and pumper# (future ... reading from wal-chan#), perform the existing (=
::timeout (deref (:started-promise opts#) 10000 ::timeout)) check inside the try
(before ~@body), throwing the same ex-info and capturing {:worker-error (when
(future-done? worker#) (try `@worker`# (catch Throwable t# t#)))} as before; leave
inv/stop, `@worker`# and `@pumper`# in the finally unchanged.
In `@server/src/instant/webhook_processor.clj`:
- Around line 235-243: Multiple flag-listeners for :webhook-worker-count and
:webhook-retry-worker-count can concurrently call restart, causing interleaved
stop-global/start-global and orphaned workers; fix by serializing restart calls
or coalescing the listeners. Update the _watch-worker-count-change setup (the
flags/add-flag-listener callbacks) so they do not call restart concurrently:
either combine both listeners into a single watcher that reads both flags and
calls restart once when either changes, or wrap the restart path
(stop-global/start-global and any mutations of the process atom) with a locking
guard to ensure only one restart runs at a time; ensure webhook-processor still
initializes with safe default worker counts and that the watcher enforces the
final :webhook-worker-count and :webhook-retry-worker-count after flags are
initialized.
- Around line 184-206: In shutdown (the shutdown fn) you must avoid the race
where chime's kicker thread can call notify-ready/notify-retry between (.clear
q) and offering stop-signal; change the order so you still (reset! shutdown?
true) first, then immediately (.close kicker) to prevent the kicker callback
from running, and only after closing the kicker perform (.clear q) and (.offer q
stop-signal ...) and the same for retry-q; keep the existing future deref/cancel
logic for workers and retry-workers and then (.shutdownNow executor) — this
ensures no new items are pushed after you clear and insert stop-signals.
---
Nitpick comments:
In `@client/packages/webhooks/package.json`:
- Line 56: The package.json currently pins Vitest at "vitest": "^0.21.0" which
is very old; update that dependency entry to a current stable Vitest 2.x (or the
latest stable release) in client/packages/webhooks/package.json, then run your
package manager (npm/yarn/pnpm) to update the lockfile and install, run the test
suite/CI to catch breaking changes, and adjust any Vitest config or TypeScript
settings (vite.config/test, vitest.config, tsconfig) to address API or
TypeScript 5.x compatibility issues introduced by the major upgrade.
In `@client/packages/webhooks/src/index.ts`:
- Around line 1031-1037: The handler lookup may accidentally resolve prototype
properties if record.etype is "__proto__" or "constructor"; update the loop that
builds handler from handlers/etype/action (the code referencing payload.data,
record, etype, action, handlers, and WebhookHandlerFn) to first check
Object.hasOwn(handlers, etype) (or Object.hasOwn(handlers?.[etype], action) as
appropriate) before indexing into handlers[etype], and only fall back to
handlers?.[etype]?.$default or handlers?.$default when the etype key is an own
property; this guards against prototype pollution returning non-function values
being called as handlers.
- Around line 529-539: defaultJsonFetch currently treats only status === 200 as
success and calls res.json(), causing errors for 2xx responses like 201 or 204;
update defaultJsonFetch to treat any res.ok (any 2xx) as success and handle 204
No Content specially by returning a sensible empty value (e.g., null or an empty
object) instead of calling res.json(), while still delegating non-2xx responses
to jsonReject; reference the defaultJsonFetch function and jsonReject helper
when making this change.
- Around line 1017-1046: processPayload currently starts all handler promises
and awaits Promise.all, which causes already-started handlers to keep running
after the first rejection and can hide other failures; update processPayload to
use Promise.allSettled on the collected handler promises, then inspect results
and if any settled with status "rejected" throw a single aggregated Error (or a
custom AggregateError) that includes all rejection reasons so processRequest can
surface all failures; reference the function names processPayload and
processRequest and the local variable results (handler promise array) when
making this change.
- Around line 494-503: The validateT function currently only rejects signatures
older than tolerance but allows arbitrarily future-dated t; update validateT to
also reject when age < -tolerance (i.e., t is more than tolerance seconds in the
future). When that condition occurs throw an InstantError (same class used now)
with a clear message like "Webhook signature is too far in the future" and
include the same context object (tolerance, receivedAt, t) so debugging includes
the offending values; keep the existing old-age check and semantics otherwise.
- Around line 441-447: The hexToUint8Array function currently parses odd-length
hex strings into incorrect bytes; add an upfront validation in hexToUint8Array
to reject odd-length input (e.g., if (hexString.length % 2 !== 0) throw new
Error("Invalid hex string: odd length")) so malformed/garbled v1 values fail
fast and produce a clear error instead of silently producing an incorrect byte
sequence; keep the rest of the parsing logic in hexToUint8Array unchanged.
- Around line 1244-1257: The code currently re-serializes parsed JSON into
rawBody and sets bodyWasReserialized, which can silently break signature
verification; update the branch that sets bodyWasReserialized to also emit a
one-time warning so users notice misconfiguration: add a module-level flag
(e.g., reserializeWarningLogged) and inside the block where rawBody =
JSON.stringify(bodyValue); bodyWasReserialized = true; call console.warn (only
if the flag is false) describing that the request body was already parsed and
re-serialization may alter bytes used for signature verification, then set the
flag to true; keep the existing InstantError throw-on-failure behavior unchanged
so failures still surface.
- Around line 449-492: The parseSignatureHeader function is brittle to
whitespace: when header parts include spaces (e.g., "t=123, kid=abc") the key
string won't match and kid/v1 are reported missing; fix parseSignatureHeader by
trimming each part and trimming the key and value after splitting on '=' (or use
a regex that allows optional whitespace) so keys 't', 'kid', and 'v1' are
recognized even with surrounding spaces before validating and potentially
throwing InstantError with header h and missingKeys; update the parsing loop
that sets t, kid, v1 accordingly.
In `@client/www/pages/dash/index.tsx`:
- Line 269: Dashboard currently calls useFlag('webhooks') in both Dashboard and
DashboardContent; remove the duplicate hook by reading the flag once in
Dashboard (const webhooksEnabled = useFlag('webhooks')) and pass webhooksEnabled
as a prop to DashboardContent, update DashboardContent's props
interface/parameter to accept webhooksEnabled, and replace the internal
useFlag('webhooks') call inside DashboardContent with the passed prop to make
the data flow explicit and avoid duplicate hooks.
In `@server/resources/migrations/109_add_webhooks.up.sql`:
- Around line 81-104: The trigger only fires on INSERT, so update-only changes
to webhook_id or app_id can bypass the check; add a BEFORE UPDATE trigger that
fires when webhook_id or app_id are changed and calls the same check function
(check_webhook_events_refs), e.g. create trigger check_refs_update_trigger
before update of webhook_id, app_id on webhook_events for each row execute
function check_webhook_events_refs(); ensure the new trigger name is unique and
mirrors the insert trigger behavior so updates are validated the same way.
- Around line 177-255: The explain_claim_webhook_events function duplicates the
SQL from claim_webhook_events which risks drift; extract the shared SQL into a
single helper (e.g., a new function get_claim_webhook_events_sql() or
function_format helper) that returns the claim UPDATE ... RETURNING query as
text, then have claim_webhook_events EXECUTE that helper's text and have
explain_claim_webhook_events EXECUTE 'EXPLAIN ... ' || helper_text (or wrap
helper text appropriately) so both use the same source; if you cannot refactor
now, add clear cross-reference comments in both explain_claim_webhook_events and
claim_webhook_events pointing at the other to remind reviewers to update both.
- Around line 118-173: The claim_webhook_events function only scans two buckets
via buckets := array[current_bucket, (current_bucket + 12) % 13], which leaves
pending events older than ~60 days unreachable; update the logic in
claim_webhook_events to build a full scan range (e.g., generate an array of all
13 bucket ids in rotation starting at current_bucket) and use that array instead
of the two-element buckets variable so the FOR loop and the WITH queries
(next_app / locked) can consider every partition, or alternatively document the
60-day operational SLO and ensure free_stuck_events! is adjusted to widen its
bucket scan accordingly.
- Line 110: The partial index on webhook_events currently lists columns as
(next_attempt_after, partition_bucket, status) but the WHERE clause already
restricts status = 'error' making the trailing status column redundant; replace
the index with one keyed as (partition_bucket, next_attempt_after) and keep the
same WHERE status = 'error' to better support queries that filter by
partition_bucket and order/scan by next_attempt_after. Update the CREATE INDEX
statement in 109_add_webhooks.up.sql to drop the redundant status column and
swap column order, then validate the change with an EXPLAIN of the retry-claim
path (use your explain_claim_webhook_events-style EXPLAIN) to confirm the
planner uses the new index efficiently.
In `@server/src/instant/nippy.clj`:
- Around line 278-294: The WebhookEvents custom serializer currently freezes
each primary key field without caching; if event-primary-keys often contain
repeated values (e.g., same webhook_id/isn), wrap the serialization in
nippy/with-cache similar to the WalRecord encoder so repeated values are
deduplicated during freeze. Modify the nippy/extend-freeze for WebhookEvents to
call nippy/with-cache around the body that calls nippy/freeze-to-out! and the
per-key writes (write-uuid, write-isn, nippy/freeze-to-out! for
partition_bucket) so identical values are cached, leaving extend-thaw unchanged
unless you add complementary cache-aware thawing logic.
In `@server/src/instant/webhook_processor.clj`:
- Around line 67-81: The predicate should-repeat? currently returns nil when
events is empty or no limit is hit; update should-repeat? to always return a
boolean (true/false) instead of nil by normalizing the result: ensure the loop
returns true when a limit is hit and false otherwise (e.g., replace the outer
when-let with an if-let or wrap the final result with boolean), targeting the
should-repeat? function and its local symbols events, max-apps, max-per-app,
acc, next-count, next-acc so callers receive an explicit false instead of nil.
- Around line 25-33: notify-events currently drops event-primary-keys when
`@process` is nil (startup races); fix by buffering keys in a small in-memory atom
and draining them on start: create an atom (e.g., pending-events), have the
1-arg notify-events push incoming event-primary-keys into pending-events if
`@process` is nil, and in start-global (or the function that initializes process)
drain pending-events and call notify-events with the collected keys or invoke
the existing notify-ready logic so those events are processed; reference
symbols: notify-events, event-primary-keys, process, pending-events (new),
start-global, notify-ready.
- Around line 110-119: The retry notification can duplicate and overflow the
ArrayBlockingQueue because (.offer q 0) is non-blocking and silently drops when
full; in start-worker and start-retry-worker (functions start-worker,
start-retry-worker) avoid calling notify-ready/notify-retry twice on a single
repeat — either change the repeat path to call the notifier exactly once, or
switch to a bounded blocking call (e.g., use put semantics or retry-on-fail
until enqueued) or check the offer return value and only retry notification when
appropriate; update the loop in start-worker and the retry loop in
start-retry-worker to de-duplicate notifications so a single repeating worker
cannot saturate the queue and starve others.
In `@server/src/instant/webhook_sender.clj`:
- Line 95: The top-level def client invokes make-client at namespace load which
eagerly constructs the OkHttpClient (executor/virtual threads/DoH) before
app-level config/tests can modify settings; change client to be lazily
initialized (e.g., wrap make-client in a delay or use defonce with a thunk) so
make-client runs on first deref/use instead of require; update call sites that
currently reference client to deref it (or call the accessor) and keep the
symbol names client and make-client so tests can with-redefs or swap the
delayed/once value as needed.
- Around line 141-179: The send-webhook function must guard against
HttpUrl/parse returning nil: after computing parsed-url in send-webhook, add an
explicit nil check before using it in the Request$Builder and before calling
ensure-safe-host!; if parsed-url is nil return a WebhookAttempt (created like
the existing success/error branch) representing an immediate failure with start
timestamp, zero duration, success? false, a suitable HTTP code (e.g. 0 or nil as
used elsewhere), a concise response-text, and pass error-type "url" and an
explanatory error-message via the same path used by make-exception-attempt so
callers and telemetry (and functions like ensure-safe-host!,
make-exception-attempt, and tracing) can classify it as a URL/validation error
rather than an unknown exception.
- Around line 48-59: race-dns-resolve currently uses ExecutorService.invokeAny
which returns only the first successful task and silently drops others; decide
whether you want "race-to-first" or "primary-with-fallback" semantics and change
the implementation accordingly: if you want a strict race keep invokeAny but add
a comment and tracing to document that only the fastest resolver wins (and
retain the bad-ip? post-filter), otherwise replace invokeAny with a
deterministic approach such as submitting the primary resolver first and falling
back to others on failure, or use .invokeAll to run all lookups, collect each
task's result/errors, and then select based on your preference (e.g., prefer
primary tag or prefer most recent/good IP) while preserving tracer/add-data! for
winner and recording errors for losers.
- Around line 65-78: The lookup implementation in make-dns-resolver mutates the
List returned by race-dns-resolve which may be immutable; instead, create a new
mutable copy of the ips list before calling Collection/.removeIf—i.e., after
calling race-dns-resolve in lookup, construct a mutable list from that result,
apply removeIf with bad-ip? against the mutable copy, and return that copy;
reference make-dns-resolver, the lookup method and race-dns-resolve/bad-ip? when
making the change.
- Around line 181-192: In validate-url, remove the dead try/catch around
HttpUrl/parse and instead call HttpUrl/parse once, check for nil explicitly and
call ex/throw-validation-err! :webhook {:url url} [{:message "Invalid URL."}]
when parse returns nil; if non-nil, pass the parsed HttpUrl to ensure-safe-host!
and continue the DNS resolution check using (.host parsed) so a malformed URL
yields "Invalid URL." and only unresolved but well-formed hosts hit the "Could
not resolve URL." validation error.
In `@server/test/instant/model/webhook_test.clj`:
- Line 601: The test calls history/partition-bucket-for-time with Instant/now
twice causing flakiness at 30-day bucket boundaries; capture Instant/now once at
test start (e.g., bind to a local symbol like fixed-now) and use that fixed
Instant when calling history/partition-bucket-for-time and when passing time
into code under test (e.g., record-attempt!), and update the other occurrence
noted (around the other occurrence at the same file region) to reuse the same
fixed Instant so bucket computations are deterministic.
- Around line 614-642: The test currently uses a manual SQL update
(sql/do-execute!) to re-lock the row which couples the test to retry-claim
internals; replace that manual update with a call to the real claim function
(webhook-model/claim-retry-events!) so the test exercises production locking
semantics — invoke webhook-model/claim-retry-events! with the same
connection/context and machine-id/partition-bucket/isn/webhook-id used in the
loop after webhook/record-attempt! and before the next status check, and keep
the existing assertions against status and next_attempt_after (or document the
invariant if you prefer to assert only transitions) so the test fails if the
production claim behavior changes.
In `@server/test/instant/webhook_processor_test.clj`:
- Around line 55-59: Add cleanup of the inserted webhooks row in the same
finally block that deletes webhook_events: after the existing sql/do-execute!
call for webhook_events, call sql/do-execute! with (aurora/conn-pool :write) and
a parameterized delete like ["delete from webhooks where id = ?" webhook-id] so
the webhooks row created earlier (inserted around with-empty-app / webhook-id)
is removed as well.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 166ae66f-f9a5-404a-8216-91748c53312a
⛔ Files ignored due to path filters (1)
client/pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (69)
client/README.mdclient/package.jsonclient/packages/admin/package.jsonclient/packages/admin/src/index.tsclient/packages/platform/package.jsonclient/packages/platform/src/api.tsclient/packages/platform/src/index.tsclient/packages/webhooks/README.mdclient/packages/webhooks/__tests__/src/helpers.test.tsclient/packages/webhooks/__tests__/src/processPayload.test.tsclient/packages/webhooks/__tests__/src/validate.test.tsclient/packages/webhooks/package.jsonclient/packages/webhooks/src/__types__/typeUtils.tsclient/packages/webhooks/src/__types__/typesTests.tsclient/packages/webhooks/src/index.tsclient/packages/webhooks/tsconfig.dev.jsonclient/packages/webhooks/tsconfig.jsonclient/packages/webhooks/tsconfig.test.jsonclient/sandbox/react-nextjs/app/api/webhooks/route.tsclient/sandbox/react-nextjs/pages/api/webhooks-pages.tsclient/sandbox/react-nextjs/pages/play/webhooks.tsxclient/scripts/publish_packages.cljclient/www/components/components/ui/card.tsxclient/www/components/components/ui/collapsible.tsxclient/www/components/components/ui/item.tsxclient/www/components/components/ui/tabs.tsxclient/www/components/dash/WebhookEvents.tsxclient/www/components/dash/Webhooks.tsxclient/www/components/icons/WebhookIcon.tsxclient/www/lib/flags.tsclient/www/lib/types.tsclient/www/package.jsonclient/www/pages/dash/index.tsxclient/www/pages/dash/new.tsxserver/.clj-kondo/config.ednserver/.clj-kondo/hooks/with_test_replication_slot.cljserver/cljfmt.ednserver/deps.ednserver/resources/migrations/109_add_webhooks.down.sqlserver/resources/migrations/109_add_webhooks.up.sqlserver/src/instant/config.cljserver/src/instant/core.cljserver/src/instant/dash/routes.cljserver/src/instant/flags.cljserver/src/instant/grpc.cljserver/src/instant/isn.cljserver/src/instant/jdbc/sql.cljserver/src/instant/jdbc/wal.cljserver/src/instant/jdbc/wal_entry.cljserver/src/instant/model/app.cljserver/src/instant/model/history.cljserver/src/instant/model/webhook.cljserver/src/instant/nippy.cljserver/src/instant/rate_limit.cljserver/src/instant/reactive/invalidator.cljserver/src/instant/smokescreen.cljserver/src/instant/util/json.cljserver/src/instant/util/test.cljserver/src/instant/webhook_processor.cljserver/src/instant/webhook_routes.cljserver/src/instant/webhook_sender.cljserver/src/tool.cljserver/test/instant/jdbc/sql_test.cljserver/test/instant/model/webhook_test.cljserver/test/instant/nippy_test.cljserver/test/instant/smokescreen_test.cljserver/test/instant/webhook_jwt_test.cljserver/test/instant/webhook_processor_test.cljserver/test/instant/webhook_sender_test.clj
✅ Files skipped from review due to trivial changes (13)
- client/packages/platform/package.json
- server/resources/migrations/109_add_webhooks.down.sql
- server/src/instant/grpc.clj
- client/www/pages/dash/new.tsx
- client/packages/webhooks/tests/src/processPayload.test.ts
- client/packages/webhooks/tests/src/helpers.test.ts
- server/cljfmt.edn
- client/README.md
- client/packages/admin/package.json
- client/packages/webhooks/src/types/typeUtils.ts
- server/src/instant/rate_limit.clj
- client/www/components/components/ui/collapsible.tsx
- server/test/instant/smokescreen_test.clj
🚧 Files skipped from review as they are similar to previous changes (2)
- server/src/instant/webhook_routes.clj
- server/src/instant/model/webhook.clj
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Nitpick comments (2)
server/src/instant/util/json.clj (1)
13-14: ⚡ Quick winConsider documenting the purpose and usage of the raw ObjectMapper.
This file introduces a raw Jackson
ObjectMapperalongside existing Cheshire-based JSON operations. The raw instance won't use the custom encoders defined in this file (e.g., forInstant,NullValue, protobuf types), which could lead to confusion or inconsistent JSON output if used for serializing application data.Based on the PR context (JWT operations with nimbus-jose-jwt), this is likely intended for parsing external JWT payloads rather than serializing application data, which would make the custom encoder divergence a non-issue. However, adding a docstring clarifying the intended use case would help prevent misuse and explain why two JSON serialization paths exist.
Additionally, note that while
ObjectMapperread operations are thread-safe with a fixed configuration, any future modifications to this shared instance (e.g.,registerModule,configure) would not be thread-safe.📝 Suggested documentation
+(comment + "Raw Jackson ObjectMapper for JWT/nimbus-jose-jwt operations. + Does not use custom Cheshire encoders defined in this namespace. + Do not modify this instance after creation (not thread-safe). + For general JSON serialization, use ->json/<-json (Cheshire-based).") (def ^{:tag ObjectMapper} object-mapper (ObjectMapper.))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/util/json.clj` around lines 13 - 14, The raw ObjectMapper var (object-mapper) is undocumented and can be confused with the Cheshire-based encoders (e.g., Instant, NullValue, protobuf encoders); add a docstring above the object-mapper explaining its intended purpose (use for external JWT/JOSE parsing with nimbus-jose-jwt, not for application-wide serialization), warn that it does not apply the custom encoders defined in this file, and note that altering this shared ObjectMapper (registerModule/configure) is not thread-safe—recommend creating a dedicated ObjectMapper when you need custom modules or mutable configuration instead of mutating the shared object-mapper.server/src/instant/config.clj (1)
401-405: ⚡ Quick winHandle unknown lock keys explicitly.
casecurrently fails implicitly for unsupported keys. Adding an explicit default branch gives clearer runtime diagnostics when a new lock namespace is introduced.Proposed patch
(defn pg-lock-ns "Creates a unique namespace for a pg advisory lock." [k] - (int (case k - :webhook 1))) + (int (case k + :webhook 1 + (throw (ex-info "Unknown pg advisory lock namespace" + {:key k})))))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/config.clj` around lines 401 - 405, The pg-lock-ns function uses a case on k that currently falls through for unknown keys; update pg-lock-ns to add an explicit default branch in the case expression that throws or logs a clear error (e.g., IllegalArgumentException or ex-info) including the unrecognized k value so callers get explicit diagnostics when a new lock namespace is passed; refer to the pg-lock-ns function and its case form when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@client/packages/webhooks/src/index.ts`:
- Around line 535-538: The current success check in defaultJsonFetch only
accepts res.status === 200, which treats 201/202/204 as errors; update the
conditional in defaultJsonFetch to treat any 2xx status as success (e.g.,
res.status >= 200 && res.status < 300), and handle empty-body responses like 204
by returning an empty object or suitable empty value instead of calling
res.json(); leave the error path using jsonReject((x) => Promise.reject(x), res)
unchanged for non-2xx statuses.
- Around line 494-503: The validateT function currently only rejects timestamps
that are too old; update it to also reject malformed/non-numeric t values and
signatures that are too far in the future. Inside validateT, parse t into a
number and if Number.isNaN(parsed) throw an InstantError indicating an invalid
timestamp (include t and receivedAt); then calculate age/currentSeconds and if
age > tolerance OR parsed - currentSeconds > tolerance (i.e. far-future
timestamp) throw the existing InstantError('Webhook signature is too old' or a
unified 'timestamp out of range') with context (tolerance, receivedAt, t).
Ensure you reference validateT and InstantError when making the change.
In `@client/www/components/dash/WebhookEvents.tsx`:
- Around line 342-351: The disclosure buttons currently toggle state via setOpen
and use the open variable but lack accessibility attributes; update the button
elements (the one wrapping ChevronRightIcon and AttemptOneLiner and the similar
button at the other location) to include aria-expanded={open} and aria-controls
pointing to the id of the collapsible region (add an id on the element that
renders the expanded content, e.g., collapse-panel-{attempt.id} or similar
unique id), ensuring the button references that id via aria-controls so screen
readers can determine which region is toggled.
- Around line 104-111: The polling loop in schedule fires mutate() and
immediately schedules the next timeout, allowing overlapping in-flight requests;
change schedule to wait for the current mutate to complete before scheduling the
next iteration (e.g., make schedule async and await mutate() or chain
mutate().then(...)), preserve the cancelled check around both the mutate call
and before scheduling, and continue using the existing symbols (schedule,
mutate, timeoutId, cancelled, intervalsMs, i) so slow responses don't stack
concurrent requests and errors are handled gracefully.
In `@client/www/pages/dash/index.tsx`:
- Around line 561-562: The tab list filters out the 'webhooks' tab when
webhooksEnabled is false, but the URL/query may still contain t=webhooks causing
an empty pane; detect when the selected tab value is 'webhooks' while
webhooksEnabled is false and normalize it to a safe fallback (e.g., 'home' or
the first visible tab) before rendering/selection. Update the tab selection
logic that reads the query param (where you check t.id and use webhooksEnabled)
to check for this case and replace the invalid selection with the fallback, and
apply the same normalization where selection is derived (also address the
similar logic at the other occurrence around lines 994-995).
In `@server/src/instant/isn.clj`:
- Around line 112-114: The <-bytes function assumes a 12-byte payload; add an
explicit length guard at the top of <-bytes that checks (alength ba) == 12 and
throws a clear IllegalArgumentException (or returns a deterministic error value)
when the length is invalid, including the actual length and expected 12 in the
message so callers can handle the failure deterministically; keep the rest of
the function (ByteBuffer/wrap, ->ISN, .getInt, .getLong,
LogSequenceNumber/valueOf) unchanged.
In `@server/src/instant/model/webhook.clj`:
- Around line 756-764: The current uuids->labels function only retains attrs
when (:value-type attr) is :blob, which drops scalar fields; update the
reduce-kv body so that for each attr-id you parse-uuid, call
attr-model/seek-by-id, and if attr exists always assoc! the label
(attr-model/fwd-label attr) to the value v (only special-case nil/missing attr
by skipping); in short, remove the strict :blob-only branch so scalar fields are
preserved in uuids->labels while still resolving labels via
attr-model/seek-by-id and attr-model/fwd-label.
- Around line 466-484: The UPDATE in requeue-q doesn't return the updated row,
so sql/execute-one! in requeue! returns an update-count map instead of the
event; change requeue-q to include a RETURNING clause (e.g. add a :returning
[:*] or equivalent "RETURNING *" in the uhsql/preformat map) so that requeue!
(and sql/execute-one!) returns the requeued event row rather than just the
update count.
In `@server/src/instant/webhook_sender.clj`:
- Around line 181-192: validate-url currently lets HttpUrl/parse return nil (it
doesn't throw) and then later NPEs and returns the wrong error payload because
the local url shadows the input string; fix by parsing into a new symbol (e.g.
parsed-url = HttpUrl/parse input-url), immediately check if parsed-url is nil
and call ex/throw-validation-err! :webhook with {:url input-url} and the
"Invalid URL." message, then proceed to call ensure-safe-host! and the DNS
lookup against parsed-url (use HttpUrl/.host parsed-url), and in the
DNS/ensure-safe-host! catch ensure you pass the original input string in the
error map (not the parsed-url) when calling ex/throw-validation-err!.
- Around line 73-78: Dns.lookup currently lets non-UnknownHost exceptions from
race-dns-resolve (which can throw ExecutionException) and the case where all
addresses are filtered out by (Collection/.removeIf ips bad-ip?) bubble up or
return an empty list; change it to catch ExecutionException (and any other
non-InterruptedException/RuntimeException that should be classified as DNS
failure) from race-dns-resolve and rethrow an UnknownHostException with the
original cause/message, and after filtering if ips is empty throw an
UnknownHostException (including context about all addresses being filtered)
instead of returning an empty list so classify-throwable and OkHttp treat these
as DNS failures; reference Dns.lookup, race-dns-resolve, Collection/.removeIf,
and bad-ip? when locating the code to apply the changes.
---
Nitpick comments:
In `@server/src/instant/config.clj`:
- Around line 401-405: The pg-lock-ns function uses a case on k that currently
falls through for unknown keys; update pg-lock-ns to add an explicit default
branch in the case expression that throws or logs a clear error (e.g.,
IllegalArgumentException or ex-info) including the unrecognized k value so
callers get explicit diagnostics when a new lock namespace is passed; refer to
the pg-lock-ns function and its case form when making this change.
In `@server/src/instant/util/json.clj`:
- Around line 13-14: The raw ObjectMapper var (object-mapper) is undocumented
and can be confused with the Cheshire-based encoders (e.g., Instant, NullValue,
protobuf encoders); add a docstring above the object-mapper explaining its
intended purpose (use for external JWT/JOSE parsing with nimbus-jose-jwt, not
for application-wide serialization), warn that it does not apply the custom
encoders defined in this file, and note that altering this shared ObjectMapper
(registerModule/configure) is not thread-safe—recommend creating a dedicated
ObjectMapper when you need custom modules or mutable configuration instead of
mutating the shared object-mapper.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 3e0dd62c-388a-446b-bd6b-d08b2550c1ac
⛔ Files ignored due to path filters (1)
client/pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (69)
client/README.mdclient/package.jsonclient/packages/admin/package.jsonclient/packages/admin/src/index.tsclient/packages/platform/package.jsonclient/packages/platform/src/api.tsclient/packages/platform/src/index.tsclient/packages/webhooks/README.mdclient/packages/webhooks/__tests__/src/helpers.test.tsclient/packages/webhooks/__tests__/src/processPayload.test.tsclient/packages/webhooks/__tests__/src/validate.test.tsclient/packages/webhooks/package.jsonclient/packages/webhooks/src/__types__/typeUtils.tsclient/packages/webhooks/src/__types__/typesTests.tsclient/packages/webhooks/src/index.tsclient/packages/webhooks/tsconfig.dev.jsonclient/packages/webhooks/tsconfig.jsonclient/packages/webhooks/tsconfig.test.jsonclient/sandbox/react-nextjs/app/api/webhooks/route.tsclient/sandbox/react-nextjs/pages/api/webhooks-pages.tsclient/sandbox/react-nextjs/pages/play/webhooks.tsxclient/scripts/publish_packages.cljclient/www/components/components/ui/card.tsxclient/www/components/components/ui/collapsible.tsxclient/www/components/components/ui/item.tsxclient/www/components/components/ui/tabs.tsxclient/www/components/dash/WebhookEvents.tsxclient/www/components/dash/Webhooks.tsxclient/www/components/icons/WebhookIcon.tsxclient/www/lib/flags.tsclient/www/lib/types.tsclient/www/package.jsonclient/www/pages/dash/index.tsxclient/www/pages/dash/new.tsxserver/.clj-kondo/config.ednserver/.clj-kondo/hooks/with_test_replication_slot.cljserver/cljfmt.ednserver/deps.ednserver/resources/migrations/109_add_webhooks.down.sqlserver/resources/migrations/109_add_webhooks.up.sqlserver/src/instant/config.cljserver/src/instant/core.cljserver/src/instant/dash/routes.cljserver/src/instant/flags.cljserver/src/instant/grpc.cljserver/src/instant/isn.cljserver/src/instant/jdbc/sql.cljserver/src/instant/jdbc/wal.cljserver/src/instant/jdbc/wal_entry.cljserver/src/instant/model/app.cljserver/src/instant/model/history.cljserver/src/instant/model/webhook.cljserver/src/instant/nippy.cljserver/src/instant/rate_limit.cljserver/src/instant/reactive/invalidator.cljserver/src/instant/smokescreen.cljserver/src/instant/util/json.cljserver/src/instant/util/test.cljserver/src/instant/webhook_processor.cljserver/src/instant/webhook_routes.cljserver/src/instant/webhook_sender.cljserver/src/tool.cljserver/test/instant/jdbc/sql_test.cljserver/test/instant/model/webhook_test.cljserver/test/instant/nippy_test.cljserver/test/instant/smokescreen_test.cljserver/test/instant/webhook_jwt_test.cljserver/test/instant/webhook_processor_test.cljserver/test/instant/webhook_sender_test.clj
✅ Files skipped from review due to trivial changes (9)
- client/packages/webhooks/tsconfig.dev.json
- client/packages/webhooks/tsconfig.json
- client/packages/webhooks/src/types/typeUtils.ts
- client/packages/webhooks/tsconfig.test.json
- client/packages/admin/package.json
- server/src/instant/grpc.clj
- client/www/package.json
- server/src/instant/rate_limit.clj
- server/resources/migrations/109_add_webhooks.down.sql
🚧 Files skipped from review as they are similar to previous changes (1)
- server/src/instant/webhook_routes.clj
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (2)
client/packages/webhooks/src/index.ts (2)
535-538:⚠️ Potential issue | 🟠 Major | ⚡ Quick winTreat all 2xx responses as success in
defaultJsonFetch.
status === 200incorrectly rejects valid success responses like201,202, and204.✅ Suggested fix
- if (res.status === 200) { - return res.json(); + if (res.ok) { + if (res.status === 204) return null; + return res.json(); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 535 - 538, The current check in defaultJsonFetch only treats status === 200 as success and thus rejects other 2xx responses; update the logic to treat any 2xx as success (use res.ok or check res.status >= 200 && res.status < 300) and additionally handle 204 No Content (don’t call res.json() for 204—return null/undefined or an empty value) before falling back to jsonReject((x) => Promise.reject(x), res); refer to defaultJsonFetch, jsonReject and res in your changes.
494-503:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject malformed and far-future signature timestamps.
Current logic only rejects old timestamps. Non-numeric and future
tvalues should also be rejected explicitly.🔐 Suggested fix
function validateT(receivedAt: Date, t: string, tolerance: number): void { - const age = Math.floor(receivedAt.getTime() / 1000) - parseInt(t, 10); + const ts = Number.parseInt(t, 10); + if (!Number.isFinite(ts)) { + throw new InstantError('Invalid signature timestamp', { t, receivedAt }); + } + const age = Math.floor(receivedAt.getTime() / 1000) - ts; + if (age < -tolerance) { + throw new InstantError('Webhook signature timestamp is in the future', { + tolerance, + receivedAt, + t, + }); + } if (age > tolerance) { throw new InstantError('Webhook signature is too old', { tolerance, receivedAt, t, }); } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@client/packages/webhooks/src/index.ts` around lines 494 - 503, The validateT function currently only rejects old timestamps; update validateT to (1) parse t into an integer and throw an InstantError for non-numeric values (use parseInt and isNaN, include t and receivedAt in the error meta), and (2) reject far-future timestamps by computing nowSeconds = Math.floor(receivedAt.getTime() / 1000) and throwing an InstantError when the timestamp is greater than nowSeconds + tolerance (include tolerance, receivedAt, t in the error meta). Keep the existing old-timestamp check (age > tolerance) but replace its age calculation with the parsed integer value so both non-numeric and future checks use the same parsed t.
🧹 Nitpick comments (3)
server/src/instant/config.clj (1)
401-405: ⚡ Quick winConsider adding an explicit default case for clearer error messages.
The
caseexpression currently throws a genericIllegalArgumentExceptionif an unsupported key is passed. Adding an explicit default clause would provide a more descriptive error message and make the function's contract clearer.♻️ Suggested refinement
(defn pg-lock-ns "Creates a unique namespace for a pg advisory lock." [k] (int (case k - :webhook 1))) + :webhook 1 + (throw (ex-info (str "Unknown lock namespace: " k) + {:supported-namespaces [:webhook]})))))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/config.clj` around lines 401 - 405, The pg-lock-ns function uses a case on k that currently only handles :webhook, which results in a generic IllegalArgumentException for other keys; update the case in pg-lock-ns to include an explicit default branch that throws a descriptive exception (or uses ex-info) naming the invalid key and allowed values (e.g., ":webhook") so callers get a clear error message when an unsupported key is passed.server/src/instant/model/webhook.clj (1)
72-74: 💤 Low valueWeakly-consistent iteration with concurrent self-removal.
evict-webhooks-for-attr-iditerates theConcurrentHashMap/newKeySetreturned fromattr-listeners, and eachevict-webhook-from-cachecall triggers the Caffeine:on-removehook →remove-attr-listener, which mutates the same set we're iterating. The iterator is weakly consistent so this won't throw, but may yield duplicate visits or skip newly added entries during the loop. Snapshotting before iterating eliminates the ambiguity:♻️ Suggested refactor
(defn evict-webhooks-for-attr-id [attr-id] - (doseq [params (.get attr-listeners attr-id)] + (doseq [params (some-> (.get attr-listeners attr-id) vec)] (evict-webhook-from-cache params)))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/src/instant/model/webhook.clj` around lines 72 - 74, evict-webhooks-for-attr-id is iterating the live ConcurrentHashMap/newKeySet returned by attr-listeners while evict-webhook-from-cache triggers the Caffeine :on-remove hook (remove-attr-listener) which mutates that same set; take a snapshot of the listener set before iterating to avoid weakly-consistent duplicates/skips — e.g. obtain a stable sequence via (vec ...) or (into [] ...) of (.get attr-listeners attr-id) and then doseq over that snapshot so evict-webhook-from-cache (and remove-attr-listener) can safely mutate the original set during iteration.server/test/instant/webhook_sender_test.clj (1)
47-110: ⚖️ Poor tradeoffExternal network dependency in tests.
dns-resolver-filters-bad-ipsandsend-webhook-delivers-valid-signaturerely on live DoH resolution to Cloudflare/Google for*.nip.iolookups. These will fail in sandboxed/CI environments without internet egress and tie test reliability to third-party services. Consider gating these behind a network-required test selector or stubbingwebhook-sender/race-dns-resolveso they can run hermetically.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/test/instant/webhook_sender_test.clj` around lines 47 - 110, The tests dns-resolver-filters-bad-ips (and related send-webhook-delivers-valid-signature) depend on external DoH lookups; update them to avoid live network calls by either 1) gating the tests behind a network-required flag (e.g., wrap deftest in a check for an ENV var or test selector and skip when not present) or 2) stub/mock webhook-sender/race-dns-resolve in the test to return deterministic DNS results for the nip.io hostnames so the tests run hermetically; locate references to the tests named dns-resolver-filters-bad-ips and the function webhook-sender/race-dns-resolve to implement the gating or the stub.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@client/packages/webhooks/README.md`:
- Around line 114-136: The example imports Webhooks but then uses an undefined
db variable (used in db.webhooks.helpers(), db.webhooks.processRequest), which
will break copy/paste; update the snippet to show how to obtain or instantiate
db (or access webhooks) so the example is self-contained — for example,
demonstrate creating/obtaining a Webhooks-enabled client and assign it to db (or
replace db.webhooks.* calls with the appropriate Webhooks instance methods),
ensuring the symbols typedHandlers, combineHandlers, handlers, and
processRequest refer to that shown instance.
In `@client/www/components/dash/WebhookEvents.tsx`:
- Around line 85-86: The URL-building uses raw isn in path segments (e.g., the
url const in WebhookEvents.tsx that concatenates config.apiURI, app.id,
webhook.id and isn), which breaks when ISNs contain reserved characters; wrap
isn with encodeURIComponent(...) wherever it's interpolated into a path (replace
`${isn}` with `${encodeURIComponent(isn)}`) and apply the same change to the
other URL constructions in this file that use isn in path segments (the other
similar occurrences referenced in the comment).
In `@server/src/instant/model/history.clj`:
- Around line 237-250: The webhook join must only consider rows actually
inserted into history: modify the :history-inserts CTE to include a RETURNING
clause that returns the columns needed for webhook matching (at least isn,
app-id, topics, partition-bucket or the same columns used from :inputs), then
change the final SELECT/JOIN to join against the :history-inserts CTE
(referencing the CTE name :history-inserts) instead of :inputs so webhook
fan-out only happens for newly-inserted history rows; update any column aliases
used in the SELECT (e.g., :isn, :app-id, :topics) to reference the returned
columns from :history-inserts.
In `@server/src/instant/webhook_processor.clj`:
- Around line 98-103: The code reifies java.util.concurrent.Callable in the
creation of tasks (used with .invokeAll producing futs) but never imports
Callable, causing a compile error; add an import for
java.util.concurrent.Callable (or reference it fully) in the namespace
declaration so the reify Callable forms at the task creation sites (the reify
wrapping handle-event! and Timestamp/.toInstant) compile successfully.
---
Duplicate comments:
In `@client/packages/webhooks/src/index.ts`:
- Around line 535-538: The current check in defaultJsonFetch only treats status
=== 200 as success and thus rejects other 2xx responses; update the logic to
treat any 2xx as success (use res.ok or check res.status >= 200 && res.status <
300) and additionally handle 204 No Content (don’t call res.json() for
204—return null/undefined or an empty value) before falling back to
jsonReject((x) => Promise.reject(x), res); refer to defaultJsonFetch, jsonReject
and res in your changes.
- Around line 494-503: The validateT function currently only rejects old
timestamps; update validateT to (1) parse t into an integer and throw an
InstantError for non-numeric values (use parseInt and isNaN, include t and
receivedAt in the error meta), and (2) reject far-future timestamps by computing
nowSeconds = Math.floor(receivedAt.getTime() / 1000) and throwing an
InstantError when the timestamp is greater than nowSeconds + tolerance (include
tolerance, receivedAt, t in the error meta). Keep the existing old-timestamp
check (age > tolerance) but replace its age calculation with the parsed integer
value so both non-numeric and future checks use the same parsed t.
---
Nitpick comments:
In `@server/src/instant/config.clj`:
- Around line 401-405: The pg-lock-ns function uses a case on k that currently
only handles :webhook, which results in a generic IllegalArgumentException for
other keys; update the case in pg-lock-ns to include an explicit default branch
that throws a descriptive exception (or uses ex-info) naming the invalid key and
allowed values (e.g., ":webhook") so callers get a clear error message when an
unsupported key is passed.
In `@server/src/instant/model/webhook.clj`:
- Around line 72-74: evict-webhooks-for-attr-id is iterating the live
ConcurrentHashMap/newKeySet returned by attr-listeners while
evict-webhook-from-cache triggers the Caffeine :on-remove hook
(remove-attr-listener) which mutates that same set; take a snapshot of the
listener set before iterating to avoid weakly-consistent duplicates/skips — e.g.
obtain a stable sequence via (vec ...) or (into [] ...) of (.get attr-listeners
attr-id) and then doseq over that snapshot so evict-webhook-from-cache (and
remove-attr-listener) can safely mutate the original set during iteration.
In `@server/test/instant/webhook_sender_test.clj`:
- Around line 47-110: The tests dns-resolver-filters-bad-ips (and related
send-webhook-delivers-valid-signature) depend on external DoH lookups; update
them to avoid live network calls by either 1) gating the tests behind a
network-required flag (e.g., wrap deftest in a check for an ENV var or test
selector and skip when not present) or 2) stub/mock
webhook-sender/race-dns-resolve in the test to return deterministic DNS results
for the nip.io hostnames so the tests run hermetically; locate references to the
tests named dns-resolver-filters-bad-ips and the function
webhook-sender/race-dns-resolve to implement the gating or the stub.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 0742ded3-723f-4809-a19f-0b573a928ecb
⛔ Files ignored due to path filters (1)
client/pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (69)
client/README.mdclient/package.jsonclient/packages/admin/package.jsonclient/packages/admin/src/index.tsclient/packages/platform/package.jsonclient/packages/platform/src/api.tsclient/packages/platform/src/index.tsclient/packages/webhooks/README.mdclient/packages/webhooks/__tests__/src/helpers.test.tsclient/packages/webhooks/__tests__/src/processPayload.test.tsclient/packages/webhooks/__tests__/src/validate.test.tsclient/packages/webhooks/package.jsonclient/packages/webhooks/src/__types__/typeUtils.tsclient/packages/webhooks/src/__types__/typesTests.tsclient/packages/webhooks/src/index.tsclient/packages/webhooks/tsconfig.dev.jsonclient/packages/webhooks/tsconfig.jsonclient/packages/webhooks/tsconfig.test.jsonclient/sandbox/react-nextjs/app/api/webhooks/route.tsclient/sandbox/react-nextjs/pages/api/webhooks-pages.tsclient/sandbox/react-nextjs/pages/play/webhooks.tsxclient/scripts/publish_packages.cljclient/www/components/components/ui/card.tsxclient/www/components/components/ui/collapsible.tsxclient/www/components/components/ui/item.tsxclient/www/components/components/ui/tabs.tsxclient/www/components/dash/WebhookEvents.tsxclient/www/components/dash/Webhooks.tsxclient/www/components/icons/WebhookIcon.tsxclient/www/lib/flags.tsclient/www/lib/types.tsclient/www/package.jsonclient/www/pages/dash/index.tsxclient/www/pages/dash/new.tsxserver/.clj-kondo/config.ednserver/.clj-kondo/hooks/with_test_replication_slot.cljserver/cljfmt.ednserver/deps.ednserver/resources/migrations/109_add_webhooks.down.sqlserver/resources/migrations/109_add_webhooks.up.sqlserver/src/instant/config.cljserver/src/instant/core.cljserver/src/instant/dash/routes.cljserver/src/instant/flags.cljserver/src/instant/grpc.cljserver/src/instant/isn.cljserver/src/instant/jdbc/sql.cljserver/src/instant/jdbc/wal.cljserver/src/instant/jdbc/wal_entry.cljserver/src/instant/model/app.cljserver/src/instant/model/history.cljserver/src/instant/model/webhook.cljserver/src/instant/nippy.cljserver/src/instant/rate_limit.cljserver/src/instant/reactive/invalidator.cljserver/src/instant/smokescreen.cljserver/src/instant/util/json.cljserver/src/instant/util/test.cljserver/src/instant/webhook_processor.cljserver/src/instant/webhook_routes.cljserver/src/instant/webhook_sender.cljserver/src/tool.cljserver/test/instant/jdbc/sql_test.cljserver/test/instant/model/webhook_test.cljserver/test/instant/nippy_test.cljserver/test/instant/smokescreen_test.cljserver/test/instant/webhook_jwt_test.cljserver/test/instant/webhook_processor_test.cljserver/test/instant/webhook_sender_test.clj
✅ Files skipped from review due to trivial changes (9)
- server/.clj-kondo/config.edn
- client/packages/webhooks/tsconfig.test.json
- client/www/pages/dash/new.tsx
- client/www/components/components/ui/item.tsx
- client/packages/platform/package.json
- client/README.md
- client/packages/webhooks/tsconfig.dev.json
- server/src/instant/rate_limit.clj
- client/packages/webhooks/src/types/typeUtils.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- server/src/instant/isn.clj
- server/src/instant/webhook_routes.clj
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
server/test/instant/model/webhook_test.clj (1)
409-420: ⚡ Quick winBound the future joins in the max-webhooks race test.
(mapv deref tasks)can hang the whole suite ifwebhook/create!ever deadlocks or blocks under contention. Adding a timeout turns that into a clear failure instead of wedging CI.💡 Proposed fix
tasks (mapv (fn [_] (future (try `@start-promise` [:ok (webhook/create! (params))] (catch Exception e [:err e])))) (range 20)) _ (deliver start-promise true) - results (mapv deref tasks) - {oks :ok errs :err} (group-by first results)] - (is (= 10 (count oks))) - (is (= 10 (count errs))) - (doseq [[_ e] errs] - (is (re-find #"may not have more than 10 active webhooks" - (.getMessage ^Exception e)))) - ;; Disabled webhooks don't count against the limit. - (let [{disabled-id :id} (sql/select-one (aurora/conn-pool :read) - ["select id from webhooks where app_id = ? limit 1" (:id app)])] - (webhook/disable! {:app-id (:id app) - :webhook-id disabled-id - :reason "test"}) - (is (webhook/create! (params))))))))) + results (mapv #(deref % 5000 ::timeout) tasks)] + (is (not-any? #{::timeout} results) "webhook/create! futures timed out") + (let [{oks :ok errs :err} (group-by first results)] + (is (= 10 (count oks))) + (is (= 10 (count errs))) + (doseq [[_ e] errs] + (is (re-find #"may not have more than 10 active webhooks" + (.getMessage ^Exception e)))) + ;; Disabled webhooks don't count against the limit. + (let [{disabled-id :id} (sql/select-one (aurora/conn-pool :read) + ["select id from webhooks where app_id = ? limit 1" (:id app)])] + (webhook/disable! {:app-id (:id app) + :webhook-id disabled-id + :reason "test"}) + (is (webhook/create! (params))))))))))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/test/instant/model/webhook_test.clj` around lines 409 - 420, The test currently blocks on (mapv deref tasks) which can hang if webhook/create! deadlocks; change the join to use timed deref on each future (use deref with a reasonable timeout and a sentinel/failure value) when collecting results into results so a stuck future fails the test instead of hanging the suite, and update the assertion handling around {oks :ok errs :err} to treat the sentinel as an error; refer to start-promise, tasks, webhook/create!, results and the mapv deref tasks site when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/test/instant/model/webhook_test.clj`:
- Around line 440-467: The test leaves two active webhooks with the same URL and
"create" action so the select-one that finds a webhook id (disabled-id) can pick
the wrong row; update the SQL lookup used before calling webhook/disable! to
also match the original etypes (e.g. require etypes to include 'users' or equal
array['users']) so it deterministically selects the webhook created by params,
then pass that id to webhook/disable!; specifically modify the sql/select-one
query (and its parameter list) used to set disabled-id so it filters on etypes
(in addition to app_id, sink->>'url' and actions) before calling
webhook/disable!.
---
Nitpick comments:
In `@server/test/instant/model/webhook_test.clj`:
- Around line 409-420: The test currently blocks on (mapv deref tasks) which can
hang if webhook/create! deadlocks; change the join to use timed deref on each
future (use deref with a reasonable timeout and a sentinel/failure value) when
collecting results into results so a stuck future fails the test instead of
hanging the suite, and update the assertion handling around {oks :ok errs :err}
to treat the sentinel as an error; refer to start-promise, tasks,
webhook/create!, results and the mapv deref tasks site when making this change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7c571830-1d55-4813-aad0-81770cfbfe4e
⛔ Files ignored due to path filters (1)
client/pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (3)
client/scripts/publish_packages.cljclient/www/components/dash/WebhookEvents.tsxserver/test/instant/model/webhook_test.clj
🚧 Files skipped from review as they are similar to previous changes (2)
- client/scripts/publish_packages.clj
- client/www/components/dash/WebhookEvents.tsx
See #2626 for an explanation of the webhook processing infra
See #2637 for an explanation of the Dashboard
See #2646 for an explanation of the SDK
This sets up infra for fetching webhook payloads.
It includes an endpoint for fetching webhook payloads and the code for generating and validating a JWT that allows you to fetch the payloads. When we POST the webhook event to the user's URL, we'll only send a very small payload with the
isn,webhook-id,app-id, and ajwttoken. Then the receiver will fetch the changes for that isn from the server using thejwt(I put the reasoning for this lower down).The JWT is signed by our webhook signing key. I added a new dependency on
nimbus-jose-jwtfor creating and verifying the jwts. We have auth0's jwt library, but it doesn't support our key type.The
webhookstable allows you to subscribe to a set of etypes and a set of actions. If you wanted to send a welcome email for users, then you might create a webhook foretypes=["$users"]andactions=["create"]. This PR only creates the webhook table to use for illustration and testing. There's no way for a user to create a webhook yet.Next up is the code that tails the invalidator and creates webhook events, then the code that delivers the webhooks, followed by the UI for managing webhooks.
Why make the client fetch the payload instead of sending it in the webhook?
Even though it adds a little bit of latency, it should be more efficient overall and more resilient in the case of failures.
If we're posting a payload to the server, we can't send it compressed because we don't know what the server accepts. With our approach, we send a very small uncompressed payload, and then cloudfront can compress with
gziporbrwhennodefetches the payload from the server.We can't make any guarantees about the size of the payload and many servers set an incoming body limit. With this approach, we sidestep that issue.
If an endpoint is temporarily down we don't have to spend any cycles constructing the payload until the endpoint comes back up. The user might also stop caring about the webhooks, but still return a 200 from the endoint--we'll save a bunch of cycles and bandwidth b/c the payloads will just go unfetched.
For the future, we'll already be halfway towards implementing pull-based workflows. It will also be easier to implement other sinks, like SQS, which put limits on the payload size.
If the developer uses the admin sdk, it will all be transparent to the user. Not fully-developed yet, but it will look something like this: