Webhooks Backend (Part 3) by dwwoelfel · Pull Request #2626 · instantdb/instant

dwwoelfel · 2026-05-04T22:46:59Z

See #2637 for an explanation of the Dashboard
See #2646 for an explanation of the SDK

This PR completes the backend machinery for webhooks. Next up after this will be admin sdk and the UI and the routes for creating/modifying webhooks, and viewing/retrying webhook events.

The main pieces are:

A hook in the invalidator that creates webhook events after we store wal records in the history table
A process that manages pulling webhook events from the queue and processing them
An http client that prevents server-side request forgery

Invalidator hook

When we call push-history!, we'll now return all of the webhooks that match the bloom filter (the webhook has a filter on attr-ids, and the wal-record has a filter on attr-ids--we check if they match).

Some of the returned webhooks will be false positives, so we do another filter that checks the triples changes to make sure our webhook matches.

If the webhook matches, we insert a webhook_event into the webhook_events table. Then we pick a random machine to notify that there is something new in the queue.

Webhook processor

We start a process with 15 workers (10 for webhooks and 5 for retries) that listens for notifications that a webhook_event is ready. When it gets a notification, the worker will fetch up to 100 records from the queue and process them all in parallel.

Right now we do the traditional skip locked fetch from pg. We may run into some scaling issues with that, so I've set it up so that we can switch to a system where we assign each event to a single machine. Then we just have to do a targeted update on a single row. I didn't make that the default because you still have to have a similar claim flow for events that get dropped, so this keeps things a lot simpler.

It also periodically checks for new work, stuck events, and events that need to be retried in the background.

After we send (or fail to send) the webhook, we will record an attempt on the webhook_events row. If the attempt fails, then we'll add it to the retry queue (by setting the status to 'error' and setting next_attempt_after). We'll do an exponential backoff (1m, 5m, 30m, 2h, 5h, 10h, and then every 12 hours for up to 11 attempts) until we give up after 3 days.

HTTP Client

We use Okhttp for our http client. I chose it because we already have it as a dependency through posthot-server and opentelemetry (I had to bump the opentelemetry version), it is supposed to be good at handling transitory failure, it supports connection pooling, and it supports custom DNS resolution.

We want to prevent Server-Side Request Forgery (SSRF), where an attacker uses our own network to attack our internal systems. We use a dns resolver that uses dns-over-https (checking both google and cloudflare in parallel and taking the first result). That lets us bypass the aws dns resolver, which will resolve internal services.

We have an ip filter in instant.smokescreen that prevents us from resolving dangerous IPs, like 127.0.0.1. I took the filters from Stripe's smokescreen (they use it to send webhooks, fly.io also has a nice writeup on the problem.).

Limitations

We only allow an app to have up to 100 webhooks--since we have to the processing sequentially in the singleton invalidator, we want to keep the processing to a reasonable level.

We don't support webhooks on link/unlink. I think we could probably support this later by adding link and unlink to actions and having a set of ref_attr_ids that we match against.

coderabbitai · 2026-05-04T22:47:06Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 106174d7-3d70-4151-b698-c1cc73f5168d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds a complete webhook processing and delivery system, including event creation from WAL streams, HTTP sending with cryptographic signing, SSRF protection, rate limiting per app, database composite type support for webhook metadata, and comprehensive test infrastructure for webhook scenarios.

Changes

Webhook Processing and Delivery System

Layer / File(s)	Summary
Data Shapes & Types `server/src/instant/grpc.clj`, `server/src/instant/webhook_sender.clj`	Introduces `WebhookEvents` record for batching event primary keys and `WebhookAttempt` record capturing per-attempt telemetry (timing, status, response, error details).
Database Composite & Array Support `server/src/instant/jdbc/sql.clj`, `server/src/instant/jdbc/wal_entry.clj`	Adds parsing and serialization of PostgreSQL webhook_attempt and webhook_event_status arrays via new composite field parsing, token unquoting, and webhook-specific type handlers; adds print-method formatters for WalColumn and WalEntry for debugging.
SSRF & IP Filtering `server/src/instant/smokescreen.clj`, `server/src/instant/flags.clj`	New SSRF-protection module with IP classification (CGNAT, IPv6-embedding, unicast, link-local) and a whitelist-based bad-IP predicate; extends flags to parse per-item IPs for smokescreen whitelist and expose via `smokescreen-whitelist-ips()`.
HTTP Client & Signing `server/src/instant/webhook_sender.clj`	Builds OkHttp client with DNS-over-HTTPS resolvers racing Cloudflare/Google, SSRF-filtered IP results, and 20-second timeout; implements `sign-webhook` for Ed25519 timestamped signatures and `send-webhook` for signed POST delivery with retry-friendly response capture.
Webhook Model & Rate Limiting `server/src/instant/model/webhook.clj`, `server/src/instant/config.clj`	Implements webhook CRUD (create/enable/disable), URL validation, per-app rate limiting via PostgreSQL advisory locks, and data extraction paths (`webhook-data-for-wal-record`, `webhook-data-for-isn`) for event generation; adds `pg-lock-ns` for lock key generation.
Async Event Processing `server/src/instant/webhook_processor.clj`	New concurrent worker pool (regular and retry) backed by `ArrayBlockingQueue` and virtual-thread executor; handles event claiming, per-event webhook sending, retry scheduling, stuck-event kicking, and dynamic reconfiguration via flag watches.
Event Creation & WAL Integration `server/src/instant/model/history.clj`, `server/src/instant/reactive/invalidator.clj`	Refactors history partitioning to use modular bucket helpers; updates batch push to new WITH/UNNEST structure; wires `create-events!` and `notify-webhook-events` callback into invalidator's save-history process.
Server Lifecycle & Wiring `server/src/instant/core.clj`	Integrates webhook processor global start/stop into server startup/shutdown sequence with tracer spans and log initialization.
Serialization `server/src/instant/nippy.clj`	Adds Nippy extend-freeze/thaw (tag 16) for WebhookEvents, serializing event primary keys and reconstructing via `instant.grpc/->WebhookEvents`.
SQL Tooling & Formatting `server/src/tool.clj`	Adds webhook-attempt detection, composite string serialization, and `->pg-webhook-attempt-array` array formatter for unsafe SQL inlining; enables time measurement by default.
Testing Infrastructure `server/.clj-kondo/config.edn`, `server/.clj-kondo/hooks/with_test_replication_slot.clj`, `server/src/instant/util/test.clj`	New clj-kondo hook for test macro analysis; `with-test-replication-slot` macro to spin up temporary WAL replication slots and collect WalRecord events for tests; extended attribute metadata to support checked-data-type.
Test Coverage `server/test/instant/{webhook_sender,webhook_processor,smokescreen,nippy}_test.clj`, `server/test/instant/jdbc/sql_test.clj`, `server/test/instant/model/webhook_test.clj`	Comprehensive test suites for signing/verification, DNS filtering, webhook delivery, IP blocking, event serialization, composite parsing, rate limiting, and cache invalidation.
Dependencies `server/deps.edn`	Upgrades OpenTelemetry (1.44.1 → 1.61.0) and renames exporter artifact; adds OkHttp 5.3.2 (okhttp-jvm, okhttp-dnsoverhttps, mockwebserver3) and ipaddress 5.6.2 for HTTP and IP handling.

Sequence Diagram

sequenceDiagram
    participant WAL as WAL Stream
    participant Inv as Invalidator
    participant HM as History Model
    participant WM as Webhook Model
    participant WP as Webhook Processor
    participant WS as Webhook Sender
    participant HTTP as Remote Webhook
    
    WAL->>Inv: flush LSN
    Inv->>HM: push-batch!
    HM->>HM: create history entries
    HM->>WM: create-events!
    WM->>WM: compute event data<br/>(topics, idempotency key)
    WM->>WP: notify-events
    WP->>WP: enqueue to blocking queue
    
    rect rgb(100, 150, 200, 0.5)
        Note over WP: Worker Loop
        WP->>WP: claim-events!
        WP->>WS: send-webhook
        WS->>WS: DNS resolve (race DoH)
        WS->>WS: SSRF filter IP
        WS->>WS: sign payload
        WS->>HTTP: POST with signature
        HTTP-->>WS: response
        WS-->>WP: WebhookAttempt
        WP->>WM: record-attempt!
    end
    
    rect rgb(200, 100, 150, 0.5)
        Note over WP: Retry Path
        WP->>WP: should-repeat?<br/>(attempts < max)
        WP->>WM: schedule next retry
    end

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly Related PRs

instantdb/instant#2613: Modifies webhook signing config, keyset registration, and JWKS/crypto routing that enable signature verification in this PR's webhook delivery path.

Suggested Reviewers

stopachka
nezaj

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Webhooks Backend (Part 3)' clearly summarizes the main change: completing the backend machinery for webhooks, making it directly related and specific to the changeset.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description check	✅ Passed	The PR description clearly explains the main pieces of the webhook backend implementation: the invalidator hook, webhook processor, and HTTP client with SSRF protection.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch webhooks-3

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…ks-3

coderabbitai

Actionable comments posted: 13

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/src/instant/core.clj`:
- Around line 356-357: The webhook processor is being started before flags are
initialized so it reads default :webhook-worker-count and
:webhook-retry-worker-count; call flags-impl/init (or otherwise initialize
flags) before invoking webhook-processor/start-global (inside with-log-init) in
-main so the processor picks up configured flag values rather than defaults—move
or add flags-impl/init prior to the (with-log-init :webhook-processor
(webhook-processor/start-global)) call.

In `@server/src/instant/flags.clj`:
- Around line 50-56: parse-ips-flag currently calls InetAddress/getByName which
performs DNS resolution; change it to only accept strict IP literals by
validating each input with inet.ipaddr.IPAddressString.isValid() (add
(inet.ipaddr IPAddressString) to :import), filter inputs with that check, log
and discard invalid entries, and then convert the validated literals to
InetAddress (or use InetAddress/getByAddress from the parsed bytes) so no DNS
lookups occur; update the function parse-ips-flag and its internal anonymous fn
to use IPAddressString validation instead of blindly calling
InetAddress/getByName.

In `@server/src/instant/model/webhook.clj`:
- Around line 176-190: The create! path currently builds :id-attr-ids and topics
using only each etype's "id" attribute, but webhook-matches? and the history
prefilter check for the actual changed attribute ids; update the create! logic
so it collects and persists the attr ids for all relevant attrs for each etype
(not just the "id" attr) before computing topics. Locate the id-attr-ids/topics
construction in create! (the reduce over etypes that calls
attr-model/seek-by-fwd-ident-name and history/bloom-bit) and change it to map
each etype to the full set of attribute ids (look up attr ids from attrs using
attr-model functions) and then compute topics by OR-ing bloom-bit for each of
those attr ids so :id_attr_ids and topics reflect the real changed fields used
by webhook-matches?.
- Around line 224-239: The update in enable-q inside the enable! flow sets
:status to the literal "enable" which is inconsistent with the rest of the code
that expects enabled webhooks to use "active"; update the uhsql/preformat for
enable-q so the :set map writes the enum value "active" (cast to :webhook_status
the same way other queries do) instead of "enable" so enable! will mark rows as
active and be visible to active-only queries; adjust enable-q (and ensure
enable! uses that updated enable-q) to reference the correct enum value.
- Around line 459-484: The current logic sets status based on :success? and
attempt_count but treats a 410 response as "error" (retryable) before calling
disable!, which leaves a retryable row; change the status computation in the let
binding in webhook.clj (the same form that builds params and calls
record-attempt!) to treat HTTP 410 as terminal: add a check like (when (= 410
(:status-code attempt)) "failed") (or place (= 410 (:status-code attempt))
branch in the cond ahead of the retry branch) so status becomes "failed" for
410s and therefore next-attempt-after remains nil (the next-attempt-after
expression already only sets a value when status = "error"); keep the subsequent
disable! call as-is so disabled webhooks are recorded as terminal attempts and
won't be retried by handle-event!.

In `@server/src/instant/smokescreen.clj`:
- Around line 31-37: Add explicit test coverage for IPv4-mapped IPv6 addresses
by extending the bad-ip? test cases to include examples like "::ffff:127.0.0.1",
"::ffff:10.0.0.1", "::ffff:192.168.1.1" (should be bad) and "::ffff:8.8.8.8"
(should be allowed), and update tests to assert these behaviors; optionally
harden has-ipv6-embedding? by adding an explicit prefix check for the
IPv4-mapped range (::ffff:0:0/96) alongside nat64-well-known-prefix,
six->four-prefix and teredo-prefix so IPv4-mapped embedding is detected even if
.isLocal semantics change.

In `@server/src/instant/util/test.clj`:
- Around line 373-386: The test currently does a blocking deref
@(:started-promise opts#) which can hang if wal/start-worker throws before
delivering that promise; replace that line with a race that surfaces worker
failures immediately: after creating worker# (future (wal/start-worker opts#)),
spawn a small watcher/future that (1) waits on (:started-promise opts#) and (2)
watches worker# for exceptions (catching/reading the worker future's thrown
exception), then deliver whichever completes first into a local promise and
deref that; if the watcher yields an exception rethrow it so the test fails and
finally block runs (refer to worker#, (:started-promise opts#),
wal/start-worker, pumper# and inv/stop to locate where to change the code).

In `@server/src/instant/webhook_processor.clj`:
- Around line 76-90: The work! function currently spawns one ua/vfuture per
event (via ua/vfuture in the mapv over events), allowing unbounded outbound
concurrency; change it to enforce a bounded concurrency equal to the configured
worker count (or a new concurrency param) so each queue token only processes up
to that many concurrent handle-event! calls. Locate work!,
webhook-model/claim-events!, ua/vfuture, max-apps/max-per-app and replace the
unbounded mapv of ua/vfuture with a bounded executor/semaphore or a
limited-parallelism map (e.g., a fixed thread-pool, semaphore-wrapped
handle-event! calls, or a utility like pmap with a concurrency limit) so deref
waits on at most N concurrent tasks; preserve should-repeat? behavior and error
handling.
- Around line 49-58: The code returns early for non-active webhooks and never
clears the claim, causing free-stuck-events! to recycle them; fix by handling
the non-active branch after webhook-model/get-by-app-id-and-webhook-id! so you
still clear the claim with webhook-model/record-attempt! (or call whatever
terminal/failed-recording helper exists) with a terminal attempt indicating the
webhook was inactive (include identifying fields like idempotency-key, event,
and a status/reason such as "webhook_inactive"), rather than only calling
webhook-model/record-attempt! inside the active branch; update the logic around
webhook-sender/send-webhook and webhook-model/record-attempt! to always record
an attempt for claimed events even when (:status webhook) is not "active".
- Around line 133-152: Validate worker-count and retry-worker-count after
reading flags and before constructing ArrayBlockingQueue: if either value is <=
0, avoid creating an ArrayBlockingQueue or calling
start-worker/start-retry-worker (e.g., set q/retry-q to nil and
workers/retry-workers to empty vectors or skip their creation) so start/restart
degrades gracefully instead of throwing IllegalArgumentException; reference the
symbols worker-count, retry-worker-count, ArrayBlockingQueue, start-worker, and
start-retry-worker when making the guard and ensure the tracer/with-span!
attributes still reflect the configured counts.

In `@server/src/instant/webhook_sender.clj`:
- Around line 10-21: The namespace is missing an import for
java.util.concurrent.Callable which causes compilation to fail where you use
(reify Callable ...) in webhook_sender.clj; add Callable to the (:import ...)
list alongside ExecutorService and TimeUnit so that the reify Callable
expression resolves (look for the existing imports block and include
java.util.concurrent Callable).

In `@server/src/tool.clj`:
- Around line 319-323: The webhook composite branches build SQL string literals
by wrapping the composite text from webhook-attempt->composite-str and
->pg-webhook-attempt-array in single quotes but do not escape internal
apostrophes, which breaks unsafe-sql-format-query for values like "can't
connect"; fix by escaping single quotes (replace "'" with "''") on the composite
string before formatting (i.e., call a small helper like escape-sql-literal or
apply (.replace composite-str "'" "''")) when producing the
"'%s'::webhook_attempt" and "'%s'::webhook_attempt[]" values so
webhook-attempt->composite-str and ->pg-webhook-attempt-array outputs are safe
to wrap.

In `@server/test/instant/webhook_sender_test.clj`:
- Around line 47-74: The tests (dns-resolver-filters-bad-ips and the other cases
mentioned) call webhook-sender/send-webhook and rely on real DNS via
race-dns-resolve/Dns, which makes CI flaky; replace that external dependency in
the tests by stubbing/mocking race-dns-resolve (or the Dns implementation used
by webhook-sender) with with-redefs or a test fixture to return deterministic
results for the hostnames used (e.g., return resolved private/metadata IPs or
throw a resolution error as needed), so assertions only depend on local test
state and not live DoH/DNS; update the tests at the referenced blocks (including
the other ranges 76-109 and 111-136) to use the stubbed resolver.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e0493f55-bb0a-4fad-9c05-3a78831c85c0

📥 Commits

Reviewing files that changed from the base of the PR and between cca7a5e and 917dcb6.

📒 Files selected for processing (26)

server/.clj-kondo/config.edn
server/.clj-kondo/hooks/with_test_replication_slot.clj
server/deps.edn
server/src/instant/config.clj
server/src/instant/core.clj
server/src/instant/flags.clj
server/src/instant/grpc.clj
server/src/instant/jdbc/sql.clj
server/src/instant/jdbc/wal.clj
server/src/instant/jdbc/wal_entry.clj
server/src/instant/model/history.clj
server/src/instant/model/webhook.clj
server/src/instant/nippy.clj
server/src/instant/rate_limit.clj
server/src/instant/reactive/invalidator.clj
server/src/instant/smokescreen.clj
server/src/instant/util/json.clj
server/src/instant/util/test.clj
server/src/instant/webhook_processor.clj
server/src/instant/webhook_sender.clj
server/src/tool.clj
server/test/instant/jdbc/sql_test.clj
server/test/instant/model/webhook_test.clj
server/test/instant/nippy_test.clj
server/test/instant/smokescreen_test.clj
server/test/instant/webhook_sender_test.clj

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (2)

server/src/instant/util/test.clj (1)

387-390: ⚡ Quick win

@worker# and @pumper# in finally have no timeout and can hang indefinitely.

If inv/stop fails to close wal-chan# (e.g., due to a stuck network/replication connection), the pumper's a/<!! loop never sees nil and @pumper# blocks forever. Similarly, @worker# blocks if the worker future doesn't terminate after signalling. A timed deref with a diagnostic throw would prevent silent test-suite hangs.

🛡️ Proposed fix: add timeouts to both derefs

       (finally
         (inv/stop opts#)
-        `@worker`#
-        `@pumper`#))))
+        (when (= ::timeout (deref worker# 15000 ::timeout))
+          (future-cancel worker#)
+          (throw (ex-info "WAL worker did not shut down within 15s" {})))
+        (a/close! wal-chan#)  ; ensure pumper exits even if inv/stop missed it
+        (when (= ::timeout (deref pumper# 5000 ::timeout))
+          (throw (ex-info "WAL pumper did not shut down within 5s" {})))))))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/instant/util/test.clj` around lines 387 - 390, The finally block
currently blocks indefinitely on `@worker`# and `@pumper`#; change these to timed
derefs (deref with timeout or @ with timeout) and throw a diagnostic exception
if the timeout elapses so tests fail rather than hang: after calling inv/stop
(which may fail to close wal-chan#), replace the plain derefs of worker# and
pumper# with a timed-deref pattern (e.g., deref worker# TIMEOUT or alts/timeout)
and if nil/timeout is returned throw an informative exception mentioning worker#
or pumper# and wal-chan#/inv/stop to aid debugging. Ensure the same timeout
policy is applied to both worker# and pumper#.

server/test/instant/smokescreen_test.clj (1)

16-97: ⚡ Quick win

Add coverage for the whitelist bypass.

bad-ip? now has a new first branch for flags/smokescreen-whitelist-ips, but this suite never asserts that a normally blocked address becomes allowed when whitelisted. A small regression test here would lock in the InetAddress equality path the new flag parser depends on.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/test/instant/smokescreen_test.clj` around lines 16 - 97, Add a test
that verifies the new whitelist flag bypass in bad-ip?: create a case in the
smokescreen_test.clj that sets flags/smokescreen-whitelist-ips to include an
address normally blocked (e.g. "127.0.0.1" or "10.0.0.1"), then assert that
(bad-ip? <that-address>) returns false when the flag is enabled; ensure the test
exercises the InetAddress equality path the new parser uses by supplying the
exact string form the parser will convert to an InetAddress and toggling the
flag context around the call to bad-ip?.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/src/instant/model/webhook.clj`:
- Around line 459-488: The 410-disable branch must only run if the
`record-attempt!` update actually succeeded; change the code so that after
calling `sql/execute!` with `record-attempt-q` (the `res` variable from the
`record-attempt!` call) you check that the DB update affected the expected row
(i.e. `res` indicates a successful/positive update count) before invoking
`disable!`; keep the existing `gone?` check but add a guard like `when (and
gone? record-update-succeeded?)` so stale workers that failed to win the row
lock cannot disable the webhook.
- Around line 215-241: After updating a webhook in disable! and enable! you must
evict the cached row that get-by-app-id-and-webhook-id! stores so readers don't
see stale :status/:disabled_reason; for disable! call the cache eviction logic
immediately after the sql/do-execute! in the two-arg arity, and for enable!
perform the eviction after the with-transaction completes (i.e. after
sql/do-execute! and after the transaction block) using the same cache key
construction that get-by-app-id-and-webhook-id! uses (app-id and webhook-id) so
the cache entry is removed and subsequent reads fetch fresh data.

In `@server/src/instant/webhook_processor.clj`:
- Around line 89-95: The batch deref is being aborted when any future throws, so
change the per-event future creation (the mapv producing futs that calls
ua/vfuture with handle-event!) to wrap the handle-event! invocation in a
try/catch inside the future: catch exceptions, log or record the error and
return a safe failure marker (so deref never throws), ensuring the rest of the
futures complete and the subsequent should-repeat? check still runs; apply the
same pattern to the other similar block referenced by work-retry! so failures
are handled inside each future rather than at mapv deref.
- Around line 208-229: Protect lifecycle transitions by adding a shared lock and
guards: make stop-global, start-global and restart acquire a single mutex/lock
(e.g., an atom used as a guard) so stop/start cannot interleave, and change the
flag listeners inside _watch-worker-count-change to first check that `@process` is
non-nil and then call restart under that same lock (no-op if process is nil).
Also ensure the webhook-processor is created with sensible defaults in
server/src/instant/core.clj so listeners only reconfigure an already-initialized
processor after flags load; reference the start-global, stop-global, restart,
_watch-worker-count-change and the process atom when making these changes.

---

Nitpick comments:
In `@server/src/instant/util/test.clj`:
- Around line 387-390: The finally block currently blocks indefinitely on
`@worker`# and `@pumper`#; change these to timed derefs (deref with timeout or @
with timeout) and throw a diagnostic exception if the timeout elapses so tests
fail rather than hang: after calling inv/stop (which may fail to close
wal-chan#), replace the plain derefs of worker# and pumper# with a timed-deref
pattern (e.g., deref worker# TIMEOUT or alts/timeout) and if nil/timeout is
returned throw an informative exception mentioning worker# or pumper# and
wal-chan#/inv/stop to aid debugging. Ensure the same timeout policy is applied
to both worker# and pumper#.

In `@server/test/instant/smokescreen_test.clj`:
- Around line 16-97: Add a test that verifies the new whitelist flag bypass in
bad-ip?: create a case in the smokescreen_test.clj that sets
flags/smokescreen-whitelist-ips to include an address normally blocked (e.g.
"127.0.0.1" or "10.0.0.1"), then assert that (bad-ip? <that-address>) returns
false when the flag is enabled; ensure the test exercises the InetAddress
equality path the new parser uses by supplying the exact string form the parser
will convert to an InetAddress and toggling the flag context around the call to
bad-ip?.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1d767c29-f62f-4f2d-964a-d81985fb9e3b

📥 Commits

Reviewing files that changed from the base of the PR and between 917dcb6 and 7cbbd84.

📒 Files selected for processing (11)

server/src/instant/flags.clj
server/src/instant/model/webhook.clj
server/src/instant/smokescreen.clj
server/src/instant/util/test.clj
server/src/instant/webhook_processor.clj
server/src/instant/webhook_sender.clj
server/src/tool.clj
server/test/instant/model/webhook_test.clj
server/test/instant/smokescreen_test.clj
server/test/instant/webhook_processor_test.clj
server/test/instant/webhook_sender_test.clj

🚧 Files skipped from review as they are similar to previous changes (1)

server/src/instant/webhook_sender.clj

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

server/src/instant/model/webhook.clj (1)
183-197: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

create! still stores the wrong attr ids for update matching.

This persists only each etype’s "id" attr, but the new webhook-matches?-test proves a normal field update matches [name-aid] and not [id-aid]. A webhook created here for ["update"] will therefore miss ordinary non-id updates. Either persist all relevant attr ids for the selected etypes, or change the matcher/prefilter contract so both sides key off the same attr set.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/instant/model/webhook.clj` around lines 183 - 197, The webhook
create! currently builds id-attr-ids by taking only each etype’s "id" attribute
via attr-model/seek-by-fwd-ident-name, which causes update webhooks to miss
ordinary-field updates; change the create! logic that defines id-attr-ids (and
the subsequent topics) to collect and persist all relevant attribute ids for the
selected etypes (e.g., filter attrs from attr-model/get-by-app-id by etype and
collect their :id values) so the stored attr-id set matches what the
matcher/prefilter expects; ensure the same attribute-id set is used when
computing topics (history/bloom-bit) and when storing the webhook so matching
uses the full field-id list rather than only the "id" attr.

🧹 Nitpick comments (1)

server/src/instant/webhook_processor.clj (1)

35-44: 💤 Low value

Consider closing the JSON generator in a try/finally block.

If webhook-jwt/webhook-payload-jwt throws before .close gen, the generator remains unclosed. While ByteArrayOutputStream holds no OS resources (so there's no real leak), the partial write means an incomplete byte array would be returned — though in practice the exception propagates, preventing use of the corrupted buffer.

♻️ Suggested fix

 (defn webhook-body [{:keys [app_id webhook_id isn] :as _event}]
   (let [out (ByteArrayOutputStream.)
         gen (.createGenerator json/object-mapper out)
         payload-url (str config/server-origin "/webhooks/payload/" app_id "/" webhook_id "/" isn)]
-    (.writeStartObject gen)
-    (.writeStringField gen "payloadUrl" payload-url)
-    (.writeStringField gen "token" (webhook-jwt/webhook-payload-jwt app_id webhook_id isn))
-    (.writeEndObject gen)
-    (.close gen)
-    (.toByteArray out)))
+    (try
+      (.writeStartObject gen)
+      (.writeStringField gen "payloadUrl" payload-url)
+      (.writeStringField gen "token" (webhook-jwt/webhook-payload-jwt app_id webhook_id isn))
+      (.writeEndObject gen)
+      (.close gen)
+      (.toByteArray out)
+      (finally
+        (.close gen)))))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/instant/webhook_processor.clj` around lines 35 - 44, Wrap creation
and use of the JSON generator in a try/finally inside webhook-body: create gen
via (.createGenerator json/object-mapper out), perform the write calls and
compute the resulting (.toByteArray out) inside the try, and in the finally
ensure (.close gen) is always invoked (so webhook-jwt/webhook-payload-jwt
exceptions won’t leave gen unclosed); ByteArrayOutputStream (out) need not be
closed. Use the existing symbols webhook-body, gen, out and
webhook-jwt/webhook-payload-jwt to locate the code.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/src/instant/model/webhook.clj`:
- Line 494: Remove the debug hook tool/def-locals from the record-attempt!
function to avoid defining globals and leaking payload/state across concurrent
deliveries; locate the record-attempt! definition in webhook.clj, delete the
(tool/def-locals) invocation (and any related debug-only code tied to that
hook), and ensure the function only uses local bindings or returns values
without mutating namespace-level state so concurrent requests no longer retain
the last event/attempt payload.
- Around line 343-345: The predicate passed to ucoll/seek is using the keyword
lookup `:=` instead of equality, causing incorrect matches; update the vmemoize
get-wal-record predicate from `#(:= isn (:isn %))` to use an equality check `#(=
isn (:isn %))` so that get-wal-record (used in the get-wal-record binding and by
ucoll/seek over wal-records) returns the correct WAL record for a given isn.

---

Duplicate comments:
In `@server/src/instant/model/webhook.clj`:
- Around line 183-197: The webhook create! currently builds id-attr-ids by
taking only each etype’s "id" attribute via attr-model/seek-by-fwd-ident-name,
which causes update webhooks to miss ordinary-field updates; change the create!
logic that defines id-attr-ids (and the subsequent topics) to collect and
persist all relevant attribute ids for the selected etypes (e.g., filter attrs
from attr-model/get-by-app-id by etype and collect their :id values) so the
stored attr-id set matches what the matcher/prefilter expects; ensure the same
attribute-id set is used when computing topics (history/bloom-bit) and when
storing the webhook so matching uses the full field-id list rather than only the
"id" attr.

---

Nitpick comments:
In `@server/src/instant/webhook_processor.clj`:
- Around line 35-44: Wrap creation and use of the JSON generator in a
try/finally inside webhook-body: create gen via (.createGenerator
json/object-mapper out), perform the write calls and compute the resulting
(.toByteArray out) inside the try, and in the finally ensure (.close gen) is
always invoked (so webhook-jwt/webhook-payload-jwt exceptions won’t leave gen
unclosed); ByteArrayOutputStream (out) need not be closed. Use the existing
symbols webhook-body, gen, out and webhook-jwt/webhook-payload-jwt to locate the
code.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 77aa1700-2454-4025-ba8b-f0e69e3fbf51

📥 Commits

Reviewing files that changed from the base of the PR and between 917dcb6 and 251ce06.

📒 Files selected for processing (11)

server/src/instant/flags.clj
server/src/instant/model/webhook.clj
server/src/instant/smokescreen.clj
server/src/instant/util/test.clj
server/src/instant/webhook_processor.clj
server/src/instant/webhook_sender.clj
server/src/tool.clj
server/test/instant/model/webhook_test.clj
server/test/instant/smokescreen_test.clj
server/test/instant/webhook_processor_test.clj
server/test/instant/webhook_sender_test.clj

🚧 Files skipped from review as they are similar to previous changes (1)

server/src/instant/webhook_sender.clj

coderabbitai

🧹 Nitpick comments (2)

server/src/instant/model/webhook.clj (2)
62-67: ⚡ Quick win

with-cache-invalidation silently breaks for multi-form bodies.

~@body is spliced directly into the inner let bindings vector. With a single body form (today's only callers in disable!/enable!) this works, but as soon as someone writes (with-cache-invalidation k a-side-effect (do-thing)) the macro expands to (let [res# a-side-effect (do-thing)] ...) — a malformed bindings vector that won't compile. Wrap the body in do to make the macro robust to future callers.
♻️ Proposed fix
 (defmacro with-cache-invalidation [cache-key & body]
   `(let [cache-key# ~cache-key]
      (evict-webhook-from-cache cache-key#)
-     (let [res# ~@body]
+     (let [res# (do ~@body)]
        (evict-webhook-from-cache cache-key#)
        res#)))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/instant/model/webhook.clj` around lines 62 - 67, The macro
with-cache-invalidation improperly splices ~@body into the inner let bindings,
which breaks for multi-form bodies; fix it by wrapping the body forms in a do so
the inner let creates a single expression for res# (i.e. bind res# to (do
~@body)), leaving the evict-webhook-from-cache(cache-key#) calls unchanged and
preserving existing behavior in disable! / enable! callers.
155-159: 💤 Low value

Dispatch key missing namespace qualifier.

Every other sql/select* / sql/execute* call in this namespace passes a ::xxx! namespaced keyword as the dispatch tag (e.g. ::take-webhook-count-lock! on Line 138, ::create! on Line 205). This one is :check-webhook-limit! (single colon), so traces/metrics keyed off the dispatch tag will land in a different bucket. Looks like a typo.
✏️ Proposed fix
   (let [{:keys [webhook_count]}
-        (sql/select-one :check-webhook-limit!
+        (sql/select-one ::check-webhook-limit!
                         conn
                         (uhsql/formatp check-webhook-limit-q
                                        {:app-id app-id}))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/src/instant/model/webhook.clj` around lines 155 - 159, The dispatch
tag for the sql/select-one call is missing the namespace qualifier causing
traces to be categorized separately; update the dispatch keyword used in the
sql/select-one invocation that references check-webhook-limit-q to use the
namespaced keyword ::check-webhook-limit! (matching the style of
::take-webhook-count-lock! and ::create!) so the trace/metric dispatch bucket is
consistent with other sql/* calls.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@server/src/instant/model/webhook.clj`:
- Around line 62-67: The macro with-cache-invalidation improperly splices ~@body
into the inner let bindings, which breaks for multi-form bodies; fix it by
wrapping the body forms in a do so the inner let creates a single expression for
res# (i.e. bind res# to (do ~@body)), leaving the
evict-webhook-from-cache(cache-key#) calls unchanged and preserving existing
behavior in disable! / enable! callers.
- Around line 155-159: The dispatch tag for the sql/select-one call is missing
the namespace qualifier causing traces to be categorized separately; update the
dispatch keyword used in the sql/select-one invocation that references
check-webhook-limit-q to use the namespaced keyword ::check-webhook-limit!
(matching the style of ::take-webhook-count-lock! and ::create!) so the
trace/metric dispatch bucket is consistent with other sql/* calls.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f680eb1c-c1f1-4dd9-b0ca-a868e110203a

📥 Commits

Reviewing files that changed from the base of the PR and between 251ce06 and 2b75c79.

📒 Files selected for processing (2)

server/src/instant/model/webhook.clj
server/test/instant/model/webhook_test.clj

stopachka

Beautiful work!

…ks-3

github-actions · 2026-05-08T21:17:40Z

View Vercel preview at instant-www-js-webhooks-3-jsv.vercel.app.

…ks-3

dwwoelfel added 2 commits May 4, 2026 15:45

WIP on sending webhooks

2ab6e45

remove debug code

542ced5

dwwoelfel added 11 commits May 4, 2026 15:51

fix lint

01c2b2d

more missing changes

156ecb9

lint

c95c8f8

cleaner composite type parsing

6717e1a

truncate attempt-at

3760312

cleanup and tests

d7a6538

fix retries

1e6e35d

missing test util change

625acd5

remove debug

75a1d23

fix tests

1311297

Merge branch 'webhooks-2' of github.com:instantdb/instant into webhoo…

917dcb6

…ks-3

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

lots of feedback, better ssrf prevention

7cbbd84

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

Comment thread server/src/instant/model/webhook.clj Outdated

Comment thread server/src/instant/model/webhook.clj

Comment thread server/src/instant/webhook_processor.clj Outdated

Comment thread server/src/instant/webhook_processor.clj

implement feedback

251ce06

dwwoelfel changed the title ~~[WIP] Webhooks Backend (Part 3)~~ Webhooks Backend (Part 3) May 6, 2026

dwwoelfel marked this pull request as ready for review May 6, 2026 00:34

remove outdated TODO

7335b90

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

Comment thread server/src/instant/model/webhook.clj Outdated

Comment thread server/src/instant/model/webhook.clj Outdated

dwwoelfel added 2 commits May 5, 2026 20:07

fix = typo

2b75c79

test smokescreen override

47c454f

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

dwwoelfel added 2 commits May 6, 2026 08:01

feedback

a39211a

track false positives

63eab8f

stopachka approved these changes May 6, 2026

View reviewed changes

dwwoelfel added 2 commits May 6, 2026 16:29

camelCase for idempotencyKey

8147172

Merge branch 'webhooks-2' of github.com:instantdb/instant into webhoo…

887e4df

…ks-3

dwwoelfel temporarily deployed to Preview May 8, 2026 21:11 — with GitHub Actions Inactive

Merge branch 'webhooks-2' of github.com:instantdb/instant into webhoo…

5052a07

…ks-3

dwwoelfel temporarily deployed to Preview May 12, 2026 21:35 — with GitHub Actions Inactive

dwwoelfel mentioned this pull request May 13, 2026

Webhooks Part 2 #2618

Merged

Webhooks UI on dash (behind a flag) (#2637)

ac4c93d

dwwoelfel temporarily deployed to Preview May 13, 2026 20:54 — with GitHub Actions Inactive

dwwoelfel deployed to Preview May 13, 2026 20:55 — with GitHub Actions View deployment

dwwoelfel merged commit c8ad4fb into webhooks-2 May 13, 2026
33 checks passed

dwwoelfel deleted the webhooks-3 branch May 13, 2026 21:01

coderabbitai Bot mentioned this pull request May 14, 2026

Rename etype to namespace for user-facing webhooks code #2673

Merged

Conversation

dwwoelfel commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Limitations

Uh oh!

coderabbitai Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram

Estimated Code Review Effort

Possibly Related PRs

Suggested Reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

stopachka left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 8, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

dwwoelfel commented May 4, 2026 •

edited

Loading

coderabbitai Bot commented May 4, 2026 •

edited

Loading