
test: stabilize flaky cli encryption and queue tests#1861

Merged
riderx merged 2 commits into main from
codex/fix-flaky-cli-encryption-tests
Mar 25, 2026

Conversation

@riderx
Member

@riderx riderx commented Mar 25, 2026

Summary (AI generated)

  • serialize CLI SDK key generation and bundle uploads so concurrent tests cannot corrupt cwd-dependent key files
  • force the test upload helper to use zip uploads instead of drifting into flaky TUS behavior
  • remove shared-mock concurrency in the RBAC unit test and add cleanup/retry hardening to the queue sync test

Motivation (AI generated)

The production failures and flakes were coming from test infrastructure rather than product logic. The CLI encryption tests shared global SDK state through process cwd and root-level key generation, while the queue and RBAC tests still had small timing and shared-mock races under Vitest parallelism.
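The serialization idea described above can be sketched as a promise chain plus a cwd guard. This is a minimal illustration, not the code from tests/cli-sdk-utils.ts: the names runSerialized and withCwd are hypothetical, and the sketch assumes the tests run in a process where process.chdir is available.

```typescript
// Hypothetical helper: runs async tasks one at a time, in submission order.
// `tail` always resolves, so one failing task cannot block the tasks queued after it.
let tail: Promise<unknown> = Promise.resolve()

function runSerialized<T>(task: () => Promise<T>): Promise<T> {
  // Start this task only after every previously queued task has settled.
  const result = tail.then(() => task())
  // Swallow the error on the chain only; the caller still sees it via `result`.
  tail = result.catch(() => undefined)
  return result
}

// Hypothetical guard for cwd-dependent work: switch directory, run the task,
// and restore the previous directory even if the task throws.
async function withCwd<T>(dir: string, task: () => Promise<T>): Promise<T> {
  const previous = process.cwd()
  process.chdir(dir)
  try {
    return await task()
  }
  finally {
    process.chdir(previous)
  }
}
```

Combining the two, as in runSerialized(() => withCwd(appDir, doUpload)) with a hypothetical appDir and doUpload, would keep any two cwd-dependent operations from overlapping within one process.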

Business Impact (AI generated)

This reduces false negatives in CI/CD, makes release validation more reliable, and cuts time spent re-running or investigating flaky test failures. More stable test signal lowers the risk of delayed merges and wasted engineering time.

Test Plan (AI generated)

  • bunx eslint tests/cli-sdk-utils.ts tests/bundle-metadata-rbac.unit.test.ts tests/queue_load.test.ts
  • bun run supabase:with-env -- bunx vitest run tests/bundle-metadata-rbac.unit.test.ts tests/queue_load.test.ts tests/cli-new-encryption.test.ts

Generated with AI

Summary by CodeRabbit

  • Tests
    • Made metadata-permission tests run sequentially for more reliable outcomes.
    • Added per-test queue cleanup and a resilient sync helper with retry/backoff for sturdier queue tests.
    • Hardened SDK test utilities: moved key generation to an external subprocess with serialized execution, enforced explicit upload defaults, and stabilized working-directory handling to avoid cross-test interference.

@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Removed concurrent execution from two RBAC tests; changed key generation in test utils to spawn a subprocess and serialized SDK operations with explicit cwd handling; added per-test queue cleanup and a retrying /queue_consumer/sync helper used by queue load tests.
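Moving key generation into a subprocess, as the walkthrough describes, might look roughly like the sketch below. The helper name runInlineScript and its parameters are assumptions for illustration; the actual inline script passed to bun -e is not shown in this PR page, so it is left as a caller-supplied string.

```typescript
import { execFile } from 'node:child_process'
import { promisify } from 'node:util'

const execFileAsync = promisify(execFile)

// Hypothetical helper: runs an inline script in a separate process with its own cwd,
// so the parent test process never has to change its working directory.
// `runtime` defaults to 'bun' to mirror the `bun -e` approach named in the walkthrough.
async function runInlineScript(script: string, cwd: string, runtime = 'bun'): Promise<string> {
  const { stdout } = await execFileAsync(runtime, ['-e', script], { cwd })
  return stdout.trim()
}
```

Because the child process gets its own cwd option, files that the script writes relative to its working directory land in the directory passed in, independent of whatever the parent process is doing concurrently.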

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| RBAC Test Execution: tests/bundle-metadata-rbac.unit.test.ts | Removed .concurrent from two test cases so they run sequentially; assertions and mocked permission behavior unchanged. |
| SDK Test Utilities: tests/cli-sdk-utils.ts | Replaced in-process key generation with a spawned bun -e script via execFile, centralized operation serialization (queue), changed cwd mutation to use chdir(...)/restore, and set uploadBundleSDK defaults (useZip: true, useTus: false). |
| Queue Tests & Helpers: tests/queue_load.test.ts | Added beforeEach cleanup deleting pgmq.q_* and pgmq.a_*; introduced fetchQueueSync(queueName, maxRetries=4) helper that POSTs /queue_consumer/sync, asserts {status: 'ok'} on HTTP 202 and retries with 250ms backoff; updated tests to use the helper. |
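The retrying sync helper summarized above could be approximated as follows. This is a sketch under stated assumptions rather than the fetchQueueSync implementation from the PR: the name syncWithRetry is hypothetical, the delay is treated as a fixed 250ms pause between attempts, and maxRetries is read as the number of retries after the first attempt. The HTTP call itself is injected so the request URL and payload shape, which this page does not show, stay out of the sketch.

```typescript
interface SyncResponse {
  status: number
  body: unknown
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms))

// Hypothetical retry wrapper: succeeds only on HTTP 202 with a { status: 'ok' } body.
async function syncWithRetry(
  postSync: () => Promise<SyncResponse>,
  maxRetries = 4,
  backoffMs = 250,
): Promise<SyncResponse> {
  let lastError: unknown
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const response = await postSync()
      const body = response.body as { status?: string } | null
      if (response.status === 202 && body?.status === 'ok')
        return response
      lastError = new Error(`unexpected sync response: HTTP ${response.status}`)
    }
    catch (error) {
      // Network-level failures are retried the same way as bad responses.
      lastError = error
    }
    if (attempt < maxRetries)
      await sleep(backoffMs)
  }
  throw lastError
}
```

A caller would pass a function that POSTs to /queue_consumer/sync and returns the response status together with the parsed JSON body.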

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

codex

Poem

🐇 In test-time meadows I hop and sing,

Keys spawned outside, queues in a ring.
Sync calls retry with a patient thump,
Sequential tests keep the logs from dump.
Hooray — tidy runs, now pass that bump! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main objective: stabilizing flaky tests in CLI encryption and queue functionality through serialization and retry logic improvements. |
| Description check | ✅ Passed | The description includes a comprehensive summary, motivation, business impact, and test plan with verification steps, though it could be more detailed about specific changes per file. |


Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
tests/queue_load.test.ts (1)

14-24: Consider removing the redundant beforeAll cleanup.

With beforeEach now performing the same cleanup (lines 21-24), the beforeAll block (lines 14-19) is redundant. Every test will start with a clean queue regardless.

♻️ Proposed simplification
-beforeAll(async () => {
-  // Clean up any existing messages in the test queue
-  // Count before cleanup for debugging
-  await pool.query(`DELETE FROM pgmq.q_${queueName}`)
-  await pool.query(`DELETE FROM pgmq.a_${queueName}`)
-})
-
 beforeEach(async () => {
   await pool.query(`DELETE FROM pgmq.q_${queueName}`)
   await pool.query(`DELETE FROM pgmq.a_${queueName}`)
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/queue_load.test.ts` around lines 14 - 24, Remove the redundant
beforeAll cleanup block: since beforeEach already runs await pool.query(`DELETE
FROM pgmq.q_${queueName}`) and await pool.query(`DELETE FROM
pgmq.a_${queueName}`) before every test, delete the entire beforeAll(...) that
performs the same pool.query deletions (references: beforeAll, beforeEach,
pool.query, queueName) so tests rely solely on beforeEach to reset state.
tests/cli-sdk-utils.ts (1)

260-273: Consider making the delay configurable or documenting the source.

The 100ms delay for async SDK config writes is a pragmatic workaround. If this proves insufficient in CI environments with higher latency, consider making it configurable or adding a retry loop that checks if the config file was modified.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cli-sdk-utils.ts` around lines 260 - 273, Replace the fixed 100ms sleep
used to wait for async SDK writes (the anonymous Promise around setTimeout near
CAPACITOR_CONFIG_PATH and configBackup handling) with a configurable wait or a
retry loop: expose a constant or parameter (e.g., SDK_CONFIG_WRITE_TIMEOUT_MS or
a helper waitForConfigWrite function) and implement retries that check the
filesystem (using readFileSync and comparing to expected modified content or
timestamp) until the config file reflects the SDK change or a timeout elapses;
then restore via writeFileSync within the same try/catch as before. This keeps
the existing cleanup logic around configBackup/writeFileSync but makes the delay
robust and configurable for CI.
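One way to read the suggestion above is a polling loop that replaces the fixed sleep. The sketch below is an assumption-laden illustration, not code from the PR: waitForConfigWrite is the name floated in the review comment, while waitForChange, the default timeout of 2000ms, and the 25ms polling interval are hypothetical choices.

```typescript
import { readFileSync } from 'node:fs'

// Hypothetical generic poller: resolves true as soon as `read()` differs from `original`,
// or false if nothing changed before the timeout elapsed.
async function waitForChange(
  read: () => string,
  original: string,
  timeoutMs = 2000,
  intervalMs = 25,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs
  for (;;) {
    if (read() !== original)
      return true
    if (Date.now() >= deadline)
      return false
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs))
  }
}

// File-based wrapper in the spirit of the review suggestion.
function waitForConfigWrite(configPath: string, original: string, timeoutMs?: number): Promise<boolean> {
  return waitForChange(() => readFileSync(configPath, 'utf8'), original, timeoutMs)
}
```

A false return only means no write was observed within the timeout, so the existing restore step would presumably still run in either case.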

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a00b2273-37fd-45f1-a9ab-9b0ee2dffab8

📥 Commits

Reviewing files that changed from the base of the PR and between 9cf7d9a and 9b7e57f.

📒 Files selected for processing (3)
  • tests/bundle-metadata-rbac.unit.test.ts
  • tests/cli-sdk-utils.ts
  • tests/queue_load.test.ts


@riderx riderx merged commit b7715f8 into main Mar 25, 2026
13 of 14 checks passed
@riderx riderx deleted the codex/fix-flaky-cli-encryption-tests branch March 25, 2026 06:23
@coderabbitai coderabbitai bot mentioned this pull request Mar 27, 2026
5 tasks