Skip to content

[codex] Add Hyperping cron healthchecks#2324

Merged
riderx merged 1 commit into
mainfrom
codex/hyperping-cron-healthchecks
May 23, 2026
Merged

[codex] Add Hyperping cron healthchecks#2324
riderx merged 1 commit into
mainfrom
codex/hyperping-cron-healthchecks

Conversation

@riderx
Copy link
Copy Markdown
Member

@riderx riderx commented May 21, 2026

Summary (AI generated)

  • Added optional healthcheck_url support on public.cron_tasks.
  • Rows with healthcheck_url use a small separate queue-consumer starter that passes the URL to queue_consumer; existing process_function_queue calls stay unchanged.
  • queue_consumer calls the URL with GET only after processQueue succeeds and the queue is empty.
  • No URL validation, no run table, no healthcheck tracking state.

Motivation (AI generated)

Cron healthchecks should be reported by the consumer after the actual worker work finishes, not by SQL when work is only started.

Business Impact (AI generated)

This lets Capgo wire external cron healthchecks to background queue work without false success reports from failed worker batches.

Test Plan (AI generated)

  • bunx eslint supabase/functions/_backend/triggers/queue_consumer.ts tests/queue-consumer-message-shape.unit.test.ts tests/cron-healthchecks.test.ts tests/security-definer-execute-hardening.test.ts
  • bunx sqlfluff lint --dialect postgres supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
  • bun run supabase:db:reset
  • bun run supabase:with-env -- bunx vitest run tests/cron-healthchecks.test.ts tests/security-definer-execute-hardening.test.ts tests/queue-consumer-message-shape.unit.test.ts
  • bunx vitest run tests/queue-consumer-message-shape.unit.test.ts
  • bun lint
  • bun typecheck
  • git diff --check

Generated with AI

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds cron_tasks.healthcheck_url, DB helpers process_function_queue(...) (single and array overloads), replaces process_all_cron_tasks() with an advisory-lock scheduler that forwards healthcheck URLs, updates the queue consumer to report results and optionally ping healthcheck URLs, and adds tests and security hardening.

Changes

Cron Hyperping healthchecks

Layer / File(s) Summary
Healthcheck URL column and per-queue helper
supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
Adds healthcheck_url to public.cron_tasks and implements process_function_queue(queue_name, batch_size, healthcheck_url) that validates batch size, counts pgmq queue rows, and POSTs to the queue consumer sync endpoint; sets ownership and service_role execute only.
Multi-queue overload and scheduler replacement
supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
Adds process_function_queue(queue_names text[], batch_size, healthcheck_url) that loops queues and converts per-queue errors to warnings; replaces process_all_cron_tasks() with a non-blocking advisory-lock scheduler that evaluates cron rules and forwards healthcheck_url (defaulting batch size when set).
Queue consumer: result shape, read handling, and healthcheck helpers
supabase/functions/_backend/triggers/queue_consumer.ts
Introduces QueueProcessResult, changes readQueue to return `Message[]
Integration tests: cron healthchecks
tests/cron-healthchecks.test.ts
Adds a DB-backed test that inserts a cron_tasks row with healthcheck_url and asserts the value is stored (transactionally rolled back).
Unit tests for maybePingCronHealthcheck
tests/queue-consumer-message-shape.unit.test.ts
Adds unit tests that stub DB queue length and assert maybePingCronHealthcheck only calls the healthcheck URL when worker succeeded and queue is empty; covers other suppression cases.
Security-definer execute hardening
tests/security-definer-execute-hardening.test.ts
Extends SERVICE_ONLY_PROCS in tests to include process_all_cron_tasks() and both process_function_queue overloads so anon/authenticated cannot execute them directly.

Sequence Diagram(s)

sequenceDiagram
  participant Scheduler as process_all_cron_tasks
  participant DB as postgres (pgmq.metrics / cron_tasks)
  participant QueueConsumer as /functions/v1/triggers/queue_consumer/sync
  participant Hyperping as hc.hyperping.io

  Scheduler->>DB: SELECT enabled cron_tasks (includes healthcheck_url)
  Scheduler->>QueueConsumer: POST /functions/v1/triggers/queue_consumer/sync (queue_name, batch_size, healthcheck_url)
  QueueConsumer->>DB: SELECT queue_length FROM pgmq.metrics
  QueueConsumer->>Hyperping: GET healthcheck_url (when process success && queue_length == 0)
  Hyperping-->>QueueConsumer: HTTP 200/timeout/error
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

codex

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.27% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title '[codex] Add Hyperping cron healthchecks' accurately and concisely summarizes the main change: adding cron healthcheck support for Hyperping monitoring.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description includes a summary, motivation, business impact, and detailed test plan with all steps executed, though it lacks explicit documentation of changes and manual testing details.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codspeed-hq
Copy link
Copy Markdown
Contributor

codspeed-hq Bot commented May 21, 2026

Merging this PR will not alter performance

✅ 43 untouched benchmarks
⏩ 2 skipped benchmarks1


Comparing codex/hyperping-cron-healthchecks (b07069a) with main (9318f92)

Open in CodSpeed

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@riderx riderx force-pushed the codex/hyperping-cron-healthchecks branch from 333c78a to 0480af1 Compare May 21, 2026 21:41
@riderx riderx marked this pull request as ready for review May 21, 2026 22:03
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@coderabbitai coderabbitai Bot added the codex label May 21, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql`:
- Around line 206-211: The function public.process_all_cron_tasks() lacks an
explicit ACL revocation; add a REVOKE ALL ON FUNCTION
public.process_all_cron_tasks() FROM PUBLIC; (matching the pattern used for
report_cron_success_healthcheck) immediately after setting the OWNER to postgres
so the function cannot be called by PUBLIC and to follow the repo's permission
guidelines.
- Around line 166-168: The dynamic EXECUTE in the CASE branch for task.task_type
= 'function' is vulnerable because it concatenates task.target directly; change
this to use pg_catalog.format with identifier placeholders (e.g., format('SELECT
%I.%I()', task.target_schema, task.target_function)) or validate/parse
task.target against an allowed function-call regex and use format('%s', ...)
with %I/%L as appropriate; update the logic that reads cron_tasks.target to
either split into target_schema/target_function or to validate the string before
calling EXECUTE so the EXECUTE no longer concatenates raw task.target.

In `@tests/cron-healthchecks.test.ts`:
- Line 31: Replace plain it(...) with it.concurrent(...) for the unit test
titled "keeps healthcheck URLs scoped to Hyperping" and the other test at the
later position flagged in the review so both tests run in parallel; locate the
test declarations by their names (e.g., the it('keeps healthcheck URLs scoped to
Hyperping', ...) call) and update them to it.concurrent('keeps healthcheck URLs
scoped to Hyperping', async () => { ... }) and likewise change the other it(...)
to it.concurrent(...).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5b92eb14-ea95-4929-8c29-98f9aaac81a0

📥 Commits

Reviewing files that changed from the base of the PR and between f1090d7 and 0480af1.

📒 Files selected for processing (3)
  • supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
  • tests/cron-healthchecks.test.ts
  • tests/security-definer-execute-hardening.test.ts

Comment thread supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
Comment thread supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql Outdated
Comment thread tests/cron-healthchecks.test.ts Outdated
@riderx
Copy link
Copy Markdown
Member Author

riderx commented May 21, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@riderx riderx force-pushed the codex/hyperping-cron-healthchecks branch from 295dea9 to 0b4064e Compare May 23, 2026 10:30
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
tests/cron-healthchecks.test.ts (1)

69-71: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid table-wide cron task updates in a parallelized integration test.

Line 70 updates all rows in public.cron_tasks; while the transaction is open this can hold broad row locks and cause cross-test contention/flakiness. Since the assertion is already scoped by task_name, remove this global update or scope it to test-owned fixtures only.

Suggested minimal change
-      await client.query('UPDATE public.cron_tasks SET enabled = false')

As per coding guidelines: tests/**/*.test.ts: “Design all tests for parallel execution across files”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/cron-healthchecks.test.ts` around lines 69 - 71, The test opens a
transaction and runs a table-wide UPDATE via client.query('UPDATE
public.cron_tasks SET enabled = false') which can lock many rows and cause
flakiness; change that to only update the test-owned rows (e.g., add a WHERE
clause filtering by task_name or specific test fixture ids) or remove the global
update entirely and instead update the single task referenced in the assertion
(use client.query with "UPDATE public.cron_tasks SET enabled = false WHERE
task_name = $1", passing the test task_name) so the transaction only touches the
targeted row(s).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@supabase/functions/_backend/triggers/queue_consumer.ts`:
- Around line 911-920: The code currently updates
cron_healthcheck_runs.pinged_at (via the client.query returning pingResult and
setting healthcheckUrl/shouldPing) before calling pingCronHealthcheck(...);
change this so the UPDATE that sets pinged_at only runs after
pingCronHealthcheck completes successfully (i.e., after a successful Hyperping
GET/2xx response), and do not persist pinged_at on timeouts or non-2xx
responses; locate the client.query that returns healthcheck_url (using
pingResult and healthcheckRunId) and instead perform a SELECT to fetch
healthcheck_url (or keep the SELECT but defer the UPDATE) and call
pingCronHealthcheck first, then run the UPDATE statement to set pinged_at =
now() only when pingCronHealthcheck returns success; apply the same change to
the similar block around pingCronHealthcheck usage at the other occurrence
referenced (the block at the later 963-972 region).

In `@supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql`:
- Around line 247-268: The insert into cron_healthcheck_runs (producing
healthcheck_run_id) must reflect only successfully queued /sync dispatches:
change the flow so you iterate FOREACH queue_name IN ARRAY queue_names, attempt
PERFORM public.process_function_queue(queue_name, batch_size, NULL) collecting
successful queue_names (or a success counter), and only after the loop INSERT
INTO public.cron_healthcheck_runs with healthcheck_url, queue_names (or
expected_workers set to the success count) and retrieve healthcheck_run_id;
alternatively, if you keep the current immediate INSERT, update the stored
expected_workers (or decrement it) inside the EXCEPTION WHEN OTHERS branch
referencing healthcheck_run_id and expected_workers so the persisted run matches
actual dispatch successes and will ping correctly.
- Around line 91-104: The current loop in the cron healthcheck helper (using
p_run_id, run_record.queue_names and querying pgmq.q_* to compute
remaining_count/total_count) incorrectly counts all live rows in each queue;
change it to only count messages belonging to this run by using a per-run marker
or launched-jobs table: ensure dispatched messages are tagged with the run id
(or a run_job entry is created for each message), then replace the pgmq.q_*
COUNT query with a filtered COUNT that only includes rows for this run (e.g.,
WHERE run_id = p_run_id or JOIN to the run_jobs table), and update
remaining_count/total_count logic to sum only those scoped rows so completion is
driven by per-run items rather than live queue depth.

In `@tests/queue-consumer-message-shape.unit.test.ts`:
- Around line 289-353: Add a regression test that simulates a failed Hyperping
(non-2xx or rejected fetch) for
__queueConsumerTestUtils__.completeCronHealthcheckRun: create a run via
createHealthcheckRun and db via createHealthcheckDb(run, 0), set fetchImpl to
either a rejected promise or a Response with status 500, call
completeCronHealthcheckRun with the success flag true, then assert that reported
is false, fetchImpl was called once, run.completed_workers is 1,
run.failed_workers is 0, and run.pinged_at remains null so the failure-to-ping
branch is covered.

---

Duplicate comments:
In `@tests/cron-healthchecks.test.ts`:
- Around line 69-71: The test opens a transaction and runs a table-wide UPDATE
via client.query('UPDATE public.cron_tasks SET enabled = false') which can lock
many rows and cause flakiness; change that to only update the test-owned rows
(e.g., add a WHERE clause filtering by task_name or specific test fixture ids)
or remove the global update entirely and instead update the single task
referenced in the assertion (use client.query with "UPDATE public.cron_tasks SET
enabled = false WHERE task_name = $1", passing the test task_name) so the
transaction only touches the targeted row(s).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e8717c12-9468-4aeb-9533-b43d3b5aa659

📥 Commits

Reviewing files that changed from the base of the PR and between d90128c and 295dea9.

📒 Files selected for processing (5)
  • supabase/functions/_backend/triggers/queue_consumer.ts
  • supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
  • tests/cron-healthchecks.test.ts
  • tests/queue-consumer-message-shape.unit.test.ts
  • tests/security-definer-execute-hardening.test.ts

Comment thread supabase/functions/_backend/triggers/queue_consumer.ts Outdated
Comment thread supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql Outdated
Comment thread supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql Outdated
Comment thread tests/queue-consumer-message-shape.unit.test.ts Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (3)
tests/cron-healthchecks.test.ts (1)

65-65: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Switch this test to it.concurrent(...) to satisfy test parallelization rules.

it(...) here violates the repository rule for *.test.ts files.

Suggested diff
-  it('creates a worker-completion healthcheck run without reporting from SQL', async () => {
+  it.concurrent('creates a worker-completion healthcheck run without reporting from SQL', async () => {
#!/bin/bash
# Verify all test declarations in this file are concurrent
rg -n "^\s*it(\.concurrent)?\(" tests/cron-healthchecks.test.ts

As per coding guidelines: tests/**/*.test.ts: “use it.concurrent() instead of it() to run tests in parallel within the same file for faster CI/CD”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/cron-healthchecks.test.ts` at line 65, Replace the synchronous test
declaration it('creates a worker-completion healthcheck run without reporting
from SQL', async () => { ... }) with a concurrent declaration
it.concurrent('creates a worker-completion healthcheck run without reporting
from SQL', async () => { ... }) so the test runs in parallel; update the
invocation for the test case that uses the named string to call
it.concurrent(...) instead of it(...), leaving the async body (and any hooks
inside) unchanged.
supabase/functions/_backend/triggers/queue_consumer.ts (1)

911-920: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't persist pinged_at before the Hyperping GET succeeds.

These lines mark the run as pinged inside the transaction, but the actual pingCronHealthcheck(...) call happens only after commit. If Hyperping times out or returns non-2xx, the row still claims success and later retries are suppressed permanently. Fetch the URL here, perform the GET first, and only then UPDATE ... SET pinged_at = now() on success.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@supabase/functions/_backend/triggers/queue_consumer.ts` around lines 911 -
920, The current code updates pinged_at inside the DB transaction before
actually performing the external GET (see pingResult, healthcheckRunId and
subsequent pingCronHealthcheck(...)), which silences retries even if the GET
fails; change the flow to SELECT or query the healthcheck_url only (no SET
pinged_at) using the existing query or a SELECT for the given healthcheckRunId,
call pingCronHealthcheck(...) and verify a successful 2xx response, and only
after a confirmed success run an UPDATE statement that sets pinged_at = now()
(use the same conditional WHERE id = $1 AND pinged_at IS NULL or RETURNING to
ensure you don’t overwrite racing updates). Ensure failures do not update
pinged_at and that you still handle concurrent attempts safely by checking
pinged_at in the UPDATE RETURNING result.
tests/queue-consumer-message-shape.unit.test.ts (1)

289-353: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Add the failed-Hyperping regression case.

These tests still only cover the success and no-op branches. Add a rejected or non-2xx fetchImpl case that asserts reported === false and run.pinged_at stays null; that's the branch currently broken in completeCronHealthcheckRun.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/queue-consumer-message-shape.unit.test.ts` around lines 289 - 353, Add
a new test in tests/queue-consumer-message-shape.unit.test.ts that covers the
failed-Hyperping regression by calling
__queueConsumerTestUtils__.completeCronHealthcheckRun (use createHealthcheckRun
and createHealthcheckDb to set up the run/db and testHealthcheckRunId) with a
fetchImpl that either rejects or returns a non-2xx Response (e.g., status 500);
assert that reported === false, that fetchImpl was called (since the ping was
attempted), and that run.completed_workers increments but run.pinged_at remains
null (and run.failed_workers unchanged for the success path or incremented
appropriately if you test the failure-path variant).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@supabase/functions/_backend/triggers/queue_consumer.ts`:
- Around line 424-425: The cloudlog call inside the messages === null branch
drops structured context; update the branch that currently calls
cloudlog(`[${queueName}] Queue read failed.`) to use structured logging with the
Hono requestId (use c.get('requestId')) and include queueName and the failure
message, e.g. cloudlog({ requestId: c.get('requestId'), queueName, msg: 'Queue
read failed' }), so it matches surrounding logging patterns and preserves
request correlation.

In `@supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql`:
- Around line 159-163: The current cap of 10 is preventing large runs from
finishing: replace the capped computation of calls_needed (currently using
LEAST(..., 10)) so it is computed as ceiling(queue_size / batch_size)::int
without the hard 10 limit, and remove any corresponding cap on expected_workers
so it can reflect the true number of batches; additionally modify the run
completion logic that currently waits for tracked queue depth to reach zero to
instead track dispatched batches (or a dispatched_batches / remaining_calls
counter) and declare the run complete when dispatched_batches >= calls_needed
(or remaining_calls == 0), updating any uses of calls_needed and
expected_workers accordingly.

---

Duplicate comments:
In `@supabase/functions/_backend/triggers/queue_consumer.ts`:
- Around line 911-920: The current code updates pinged_at inside the DB
transaction before actually performing the external GET (see pingResult,
healthcheckRunId and subsequent pingCronHealthcheck(...)), which silences
retries even if the GET fails; change the flow to SELECT or query the
healthcheck_url only (no SET pinged_at) using the existing query or a SELECT for
the given healthcheckRunId, call pingCronHealthcheck(...) and verify a
successful 2xx response, and only after a confirmed success run an UPDATE
statement that sets pinged_at = now() (use the same conditional WHERE id = $1
AND pinged_at IS NULL or RETURNING to ensure you don’t overwrite racing
updates). Ensure failures do not update pinged_at and that you still handle
concurrent attempts safely by checking pinged_at in the UPDATE RETURNING result.

In `@tests/cron-healthchecks.test.ts`:
- Line 65: Replace the synchronous test declaration it('creates a
worker-completion healthcheck run without reporting from SQL', async () => { ...
}) with a concurrent declaration it.concurrent('creates a worker-completion
healthcheck run without reporting from SQL', async () => { ... }) so the test
runs in parallel; update the invocation for the test case that uses the named
string to call it.concurrent(...) instead of it(...), leaving the async body
(and any hooks inside) unchanged.

In `@tests/queue-consumer-message-shape.unit.test.ts`:
- Around line 289-353: Add a new test in
tests/queue-consumer-message-shape.unit.test.ts that covers the failed-Hyperping
regression by calling __queueConsumerTestUtils__.completeCronHealthcheckRun (use
createHealthcheckRun and createHealthcheckDb to set up the run/db and
testHealthcheckRunId) with a fetchImpl that either rejects or returns a non-2xx
Response (e.g., status 500); assert that reported === false, that fetchImpl was
called (since the ping was attempted), and that run.completed_workers increments
but run.pinged_at remains null (and run.failed_workers unchanged for the success
path or incremented appropriately if you test the failure-path variant).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: abc33135-9751-4727-aaef-6251cd8b71de

📥 Commits

Reviewing files that changed from the base of the PR and between 295dea9 and 0b4064e.

📒 Files selected for processing (5)
  • supabase/functions/_backend/triggers/queue_consumer.ts
  • supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
  • tests/cron-healthchecks.test.ts
  • tests/queue-consumer-message-shape.unit.test.ts
  • tests/security-definer-execute-hardening.test.ts

Comment thread supabase/functions/_backend/triggers/queue_consumer.ts Outdated
Comment thread supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql Outdated
@riderx riderx force-pushed the codex/hyperping-cron-healthchecks branch from 0b4064e to a66f8cb Compare May 23, 2026 11:01
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/cron-healthchecks.test.ts (1)

31-63: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

This test now encodes unrestricted healthcheck URLs.

The assertion on Line 63 verifies that a non-Hyperping URL is accepted. That removes coverage for URL scoping and can mask security regressions.

Please split this into positive/negative coverage (allowed Hyperping URL succeeds, disallowed domain is rejected).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/cron-healthchecks.test.ts` around lines 31 - 63, Update the single test
"stores any healthcheck URL on cron tasks" into two focused cases: one positive
test that inserts a cron task using an allowed Hyperping URL (e.g., a
hyperping.io URL) and asserts the returned healthcheck_url matches that allowed
URL, and one negative test that attempts to insert a cron task with a disallowed
external domain (e.g., example.com) and asserts the insert fails or the DB
rejects it (expecting an error or no returned row). Keep using the same
setup/teardown pattern (client = await pool.connect(), BEGIN/ROLLBACK) and reuse
the taskName variable and the same INSERT query/RETURNING behavior, but for the
negative case assert the failure condition instead of expecting the row to equal
the disallowed URL.
♻️ Duplicate comments (5)
supabase/functions/_backend/triggers/queue_consumer.ts (1)

418-419: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Restore structured context in queue-read failure logs.

Line 419 drops correlation metadata. Keep this log structured with requestId so queue read failures remain traceable.

Suggested fix
   if (messages === null) {
-    cloudlog(`[${queueName}] Queue read failed.`)
+    cloudlog({
+      requestId: c.get('requestId'),
+      message: `[${queueName}] Queue read failed.`,
+      queueName,
+    })
     return {

As per coding guidelines: "All endpoints must receive Hono Context<MiddlewareKeyVariables> object and use c.get('requestId') for structured logging with cloudlog()".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@supabase/functions/_backend/triggers/queue_consumer.ts` around lines 418 -
419, The null-check branch that logs queue read failures (when messages ===
null) currently calls cloudlog(`[${queueName}] Queue read failed.`) and drops
correlation metadata; update that branch to call cloudlog with structured
context including the Hono requestId retrieved via c.get('requestId') (e.g.,
include requestId and queueName in the log payload) so the failure remains
traceable—locate the messages === null check and replace the simple string log
with a structured cloudlog call that includes c.get('requestId') and queueName.
tests/queue-consumer-message-shape.unit.test.ts (1)

217-290: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Add a regression test for failed healthcheck ping.

Current additions cover success/no-op only. Add one case where fetchImpl returns non-2xx (or rejects) and assert reported === false with one attempted call.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/queue-consumer-message-shape.unit.test.ts` around lines 217 - 290, Add
a regression test in tests/queue-consumer-message-shape.unit.test.ts that
exercises __queueConsumerTestUtils__.maybePingCronHealthcheck when the
healthcheck request fails: create a DB with zero pending work via
createHealthcheckDb(0), supply a fetchImpl mock that either returns a Response
with non-2xx status (e.g. 500) or rejects, call maybePingCronHealthcheck with
the same success payload used in the passing test, then assert that the function
returns reported === false and that fetchImpl was called exactly once
(expect(fetchImpl).toHaveBeenCalledTimes(1)); reference maybePingCronHealthcheck
and createHealthcheckDb to locate where to add the case.
supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql (3)

33-37: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The 10-batch cap still truncates large queue runs.

calls_needed is clipped to 10, so queues larger than 10 * batch_size only dispatch the first ten workers. That leaves part of the backlog outside this cron run, which breaks end-to-end completion semantics for healthchecked runs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql` around
lines 33 - 37, The code currently clips calls_needed with LEAST(..., 10),
causing large queues to only dispatch up to 10 batches; remove that fixed cap so
calls_needed equals the computed ceil(queue_size / batch_size) (cast to int)
instead of LEAST(...,10). Update the assignment for calls_needed to compute
pg_catalog.ceil(queue_size / batch_size::double precision)::int (or equivalent)
and remove the trailing , 10 argument so all required batches are scheduled.

94-103: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Best-effort dispatch can falsely report multi-queue success.

When several queues share one healthcheck_url, a failure in any later queue is downgraded to a warning after earlier queues were already dispatched with that same URL. Those successful queues can still complete and ping the healthcheck even though part of the cron task never ran.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql` around
lines 94 - 103, The loop currently downgrades per-queue failures to warnings so
a shared healthcheck_url can be pinged even if some queues failed; change the
logic to record failed queues (e.g., declare an array or boolean like
failed_queues or had_error) inside the EXCEPTION block for the FOR/EACH loop
that calls public.process_function_queue(queue_name, batch_size,
healthcheck_url), keep the RAISE WARNING for context, and after the FOREACH
finishes check the recorder and RAISE EXCEPTION (including the failed queue
names and SQLERRM details) if any failures occurred so the cron job reports
overall failure instead of falsely succeeding.

206-208: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Still executing task.target as raw SQL.

This concatenates a cron-table value directly into a SECURITY DEFINER EXECUTE, so a malformed or hostile target can run arbitrary SQL as postgres. Parse and validate the target first, then build the call with pg_catalog.format() instead of raw concatenation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql` around
lines 206 - 208, The CASE branch handling WHEN 'function' is currently
concatenating task.target into a SECURITY DEFINER EXECUTE, which allows
arbitrary SQL; instead parse and validate task.target (e.g. split into function
name and args, enforce allowed identifier pattern and optional safe argument
types) and then build the statement with pg_catalog.format() to safely quote
identifiers/values before EXECUTE; update the logic in the CASE that references
task.target and the EXECUTE to validate the parsed function name (deny if it
contains unsafe characters or schema changes) and use pg_catalog.format() (e.g.
format with %I for the function identifier and appropriate %L/%s for args) so
the executed SQL is properly escaped and cannot run arbitrary commands as
postgres.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@supabase/functions/_backend/triggers/queue_consumer.ts`:
- Around line 793-813: The healthcheck URL must be validated and constrained
before performing an outbound fetch to avoid SSRF: add URL validation logic that
enforces scheme === 'https' and checks the hostname against an explicit
allowlist before calling maybePingCronHealthcheck (or inside
maybePingCronHealthcheck if easier); reject or return false for
null/invalid/unauthorized URLs. Update callers that pass healthcheckUrl (the
code paths around maybePingCronHealthcheck invocation) to use a
validatedHealthcheckUrl (or pass the original but have maybePingCronHealthcheck
perform the validation) and only call pingCronHealthcheck(fetchImpl) when
validation passes; ensure fetchImpl remains injectable for tests and keep
referencing maybePingCronHealthcheck, pingCronHealthcheck, and the
healthcheckUrl parameter so reviewers can locate the changes.

In `@supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql`:
- Around line 158-162: The schedule fields (current_hour, current_minute,
current_second, current_dow, current_day) are being extracted from NOW(), which
is session-timezone dependent; instead compute a single UTC timestamp (e.g.,
utc_now := timezone('UTC', NOW()) or NOW() AT TIME ZONE 'UTC') and then EXTRACT
the HOUR/MINUTE/SECOND/DOW/DAY from that utc_now, updating the assignments for
current_hour/current_minute/current_second/current_dow/current_day so run_at_*,
run_on_dow and run_on_day are based on UTC.
- Around line 19-20: Change the guard that currently reads "IF batch_size <= 0
THEN RAISE EXCEPTION 'batch_size must be positive';" to explicitly reject NULLs
by using "IF batch_size IS NULL OR batch_size <= 0 THEN RAISE EXCEPTION
'batch_size must be positive';" and apply this same replacement to the second
function that accepts the batch_size parameter (the other occurrence around the
later guard). This ensures NULL input is trapped early for both functions that
validate batch_size.

---

Outside diff comments:
In `@tests/cron-healthchecks.test.ts`:
- Around line 31-63: Update the single test "stores any healthcheck URL on cron
tasks" into two focused cases: one positive test that inserts a cron task using
an allowed Hyperping URL (e.g., a hyperping.io URL) and asserts the returned
healthcheck_url matches that allowed URL, and one negative test that attempts to
insert a cron task with a disallowed external domain (e.g., example.com) and
asserts the insert fails or the DB rejects it (expecting an error or no returned
row). Keep using the same setup/teardown pattern (client = await pool.connect(),
BEGIN/ROLLBACK) and reuse the taskName variable and the same INSERT
query/RETURNING behavior, but for the negative case assert the failure condition
instead of expecting the row to equal the disallowed URL.

---

Duplicate comments:
In `@supabase/functions/_backend/triggers/queue_consumer.ts`:
- Around line 418-419: The null-check branch that logs queue read failures (when
messages === null) currently calls cloudlog(`[${queueName}] Queue read failed.`)
and drops correlation metadata; update that branch to call cloudlog with
structured context including the Hono requestId retrieved via c.get('requestId')
(e.g., include requestId and queueName in the log payload) so the failure
remains traceable—locate the messages === null check and replace the simple
string log with a structured cloudlog call that includes c.get('requestId') and
queueName.

In `@supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql`:
- Around line 33-37: The code currently clips calls_needed with LEAST(..., 10),
causing large queues to only dispatch up to 10 batches; remove that fixed cap so
calls_needed equals the computed ceil(queue_size / batch_size) (cast to int)
instead of LEAST(...,10). Update the assignment for calls_needed to compute
pg_catalog.ceil(queue_size / batch_size::double precision)::int (or equivalent)
and remove the trailing , 10 argument so all required batches are scheduled.
- Around line 94-103: The loop currently downgrades per-queue failures to
warnings so a shared healthcheck_url can be pinged even if some queues failed;
change the logic to record failed queues (e.g., declare an array or boolean like
failed_queues or had_error) inside the EXCEPTION block for the FOR/EACH loop
that calls public.process_function_queue(queue_name, batch_size,
healthcheck_url), keep the RAISE WARNING for context, and after the FOREACH
finishes check the recorder and RAISE EXCEPTION (including the failed queue
names and SQLERRM details) if any failures occurred so the cron job reports
overall failure instead of falsely succeeding.
- Around line 206-208: The CASE branch handling WHEN 'function' is currently
concatenating task.target into a SECURITY DEFINER EXECUTE, which allows
arbitrary SQL; instead parse and validate task.target (e.g. split into function
name and args, enforce allowed identifier pattern and optional safe argument
types) and then build the statement with pg_catalog.format() to safely quote
identifiers/values before EXECUTE; update the logic in the CASE that references
task.target and the EXECUTE to validate the parsed function name (deny if it
contains unsafe characters or schema changes) and use pg_catalog.format() (e.g.
format with %I for the function identifier and appropriate %L/%s for args) so
the executed SQL is properly escaped and cannot run arbitrary commands as
postgres.

In `@tests/queue-consumer-message-shape.unit.test.ts`:
- Around line 217-290: Add a regression test in
tests/queue-consumer-message-shape.unit.test.ts that exercises
__queueConsumerTestUtils__.maybePingCronHealthcheck when the healthcheck request
fails: create a DB with zero pending work via createHealthcheckDb(0), supply a
fetchImpl mock that either returns a Response with non-2xx status (e.g. 500) or
rejects, call maybePingCronHealthcheck with the same success payload used in the
passing test, then assert that the function returns reported === false and that
fetchImpl was called exactly once (expect(fetchImpl).toHaveBeenCalledTimes(1));
reference maybePingCronHealthcheck and createHealthcheckDb to locate where to
add the case.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6b387245-628a-42c0-9bc1-7e5265a91008

📥 Commits

Reviewing files that changed from the base of the PR and between 0b4064e and a66f8cb.

📒 Files selected for processing (5)
  • supabase/functions/_backend/triggers/queue_consumer.ts
  • supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
  • tests/cron-healthchecks.test.ts
  • tests/queue-consumer-message-shape.unit.test.ts
  • tests/security-definer-execute-hardening.test.ts

Comment thread supabase/functions/_backend/triggers/queue_consumer.ts
Comment thread supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql Outdated
Comment thread supabase/migrations/20260521210531_cron_hyperping_healthchecks.sql
@riderx riderx force-pushed the codex/hyperping-cron-healthchecks branch from a66f8cb to 8f668b4 Compare May 23, 2026 11:13
@riderx riderx force-pushed the codex/hyperping-cron-healthchecks branch from 8f668b4 to b07069a Compare May 23, 2026 11:19
@riderx riderx force-pushed the codex/hyperping-cron-healthchecks branch from b07069a to 2fa0622 Compare May 23, 2026 11:24
@sonarqubecloud
Copy link
Copy Markdown

@riderx riderx merged commit fe36317 into main May 23, 2026
41 checks passed
@riderx riderx deleted the codex/hyperping-cron-healthchecks branch May 23, 2026 11:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant