Add a dead letter queue and re-add more bare-boned metrics #488
Conversation
Walkthrough

Adds a dead-letter queue (DLQ) and tracking table, RPCs for queue and circuit-breaker metrics, DLQ routing and enriched error logging in the GitHub async worker, a metrics function refactor to use two RPCs, a small Upstash concurrency tweak, and a Supabase client version bump.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor Worker as Async Worker
    participant Queue as pgmq q_async_calls
    participant Handler as Message Handler
    participant DLQ as pgmq q_async_calls_dlq
    participant DB as async_worker_dlq_messages
    participant Sentry
    Worker->>Queue: consume message
    Handler->>Handler: process message (circuit/rate-limit checks)
    alt route to DLQ (retry_count ≥ 5 or circuit/rate-limit/error + injection)
        Handler->>DLQ: enqueue envelope
        Handler->>DB: insert DLQ tracking row (envelope, error, retry_count, class_id, debug_id, log_id)
        Handler->>Sentry: capture error with DLQ context
        Note right of DB: archive original message
    else normal retry/requeue or archive
        Handler->>Queue: requeue / delay / archive per logic
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 2
🧹 Nitpick comments (8)
supabase/functions/github-async-worker/index.ts (4)
252-344: DLQ helper is well-scoped; consider surfacing failures to callers.

The DLQ helper cleanly encapsulates the queue send, the tracking-table insert, and the Sentry context. However, all failures are swallowed (logged only) and callers always proceed to archive the original message. That means if both the DLQ send and the tracking-table insert fail, the message is permanently lost even though the DLQ path failed.

If you want stricter guarantees, consider propagating a failure signal (e.g., a boolean/Result or a thrown error) so callers can decide not to archive when DLQ persistence fails, or at least differentiate between "successfully dead-lettered" and "failed to dead-letter."
479-486: Org-level circuit breaker DLQ threshold looks correct.

The guard to send to DLQ when `retry_count >= 5` under an open org-level circuit is consistent with the requeue strategy: you get the initial attempt plus up to five retries before dead-lettering, and the original message is archived in the same branch.

If you find yourself adding more similar blocks, it might be worth extracting a small helper like `if (shouldDeadLetter(envelope)) …` to avoid repeating the same pattern.
1116-1122: Rate-limit branch DLQ cutoff is placed in a sensible position.

Checking `retry_count` and routing to DLQ before further circuit-breaker logic ensures rate-limited messages don't requeue indefinitely. The archive call after `sendToDeadLetterQueue` matches the rest of the worker's pattern.

Same note as above: if this pattern grows further, a shared helper to encapsulate the DLQ + archive decision would reduce duplication.
1214-1220: Fallback error path correctly participates in DLQ policy.

The final error path (where `org` is unknown) now also honors the `retry_count >= 5` rule before requeueing. This ensures all error scenarios eventually converge on the DLQ, not just those with a recognized org/method.

No functional problems; just be aware the same retry logic is now duplicated in three places, so a small helper could make future changes less error-prone.
supabase/functions/metrics/index.ts (1)
37-45: Label escaping helper and circuit-breaker metrics look good; consider a small null-guard.

The `escapeLabel` helper correctly handles backslashes, quotes, and control characters, and the per-breaker metric lines follow Prometheus label syntax. Assuming `scope`, `key`, and `state` are always non-null strings from the RPC, this is solid.

If there's any chance of `scope`/`key`/`state` being `null` or `undefined`, a small defensive cast (e.g., `escapeLabel(cb.scope ?? "unknown")`) would prevent runtime errors.

Also applies to: 63-71
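For reference, a self-contained sketch of such a helper with the null-guard folded in (the PR's actual `escapeLabel` may differ in details):

```typescript
// Prometheus text-format label values must escape backslash, double quote,
// and newline; the CR/tab escapes here are defensive extras. Missing values
// are mapped to "unknown" rather than throwing at scrape time.
// Backslash is replaced first so later escapes are not double-escaped.
function escapeLabel(value: string | null | undefined): string {
  return (value ?? "unknown")
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r")
    .replace(/\t/g, "\\t");
}
```

A caller would then emit e.g. `` `circuit_breaker_state{scope="${escapeLabel(cb.scope)}"}` `` without worrying about nulls or embedded quotes breaking the exposition format.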
supabase/functions/_shared/SupabaseTypes.d.ts (3)
9568-9571: Batch gradebook recalc RPC is okay; consider documenting the expected row shape for `p_rows`.

`enqueue_gradebook_row_recalculation_batch` with `p_rows: Json[]` is flexible but untyped. If this is going to be used from multiple places, it may be worth standardizing a TS interface (in app code) that documents the expected keys in each row so you don't lose type-safety end-to-end.
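As a sketch of what that could look like — the field names below are entirely hypothetical, since the real keys depend on what the SQL function reads from each row:

```typescript
// Hypothetical row shape for the p_rows payload; replace these fields with
// whatever enqueue_gradebook_row_recalculation_batch actually expects.
interface GradebookRecalcRow {
  class_id: number;
  student_id: string;
  gradebook_column_id: number;
}

// Typed wrapper: callers get a compile-time check on row shape before the
// payload is handed to the untyped Json[] RPC argument.
function toRecalcPayload(rows: GradebookRecalcRow[]): unknown[] {
  return rows.map((r) => ({ ...r }));
}
```

A call site would then do `supabase.rpc("enqueue_gradebook_row_recalculation_batch", { p_rows: toRecalcPayload(rows) })` (with whatever cast the generated `Json[]` type requires), keeping the shape documented in one place.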
9998-10005: Gradebook column update-with-recalc RPCs are consistent with existing patterns.

`update_gradebook_column_student_with_recalc` (`p_id`, `p_updates: Json`) and `update_gradebook_column_students_batch_with_recalc` (`p_updates: Json[]`, returns `Json`) follow the established "updates as Json payload" pattern. From a types standpoint this is fine; if you find yourself reusing these payload shapes heavily, you could introduce shared TS types on the client to avoid ad-hoc `Json` everywhere.
10017-10020: `update_gradebook_rows_batch` RPC signature is reasonable for bulk updates.

The `p_batch_updates: Json[]` → `Json` return is in line with the other batch gradebook APIs. No obvious type issues here; just keep the payload contract documented somewhere outside this generated file so callers know exactly what to send and what to expect back.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- `supabase/functions/_shared/SupabaseTypes.d.ts` (9 hunks)
- `supabase/functions/github-async-worker/index.ts` (6 hunks)
- `supabase/functions/metrics/index.ts` (1 hunks)
- `supabase/migrations/20251113141300_create-async-worker-dlq.sql` (1 hunks)
- `utils/supabase/SupabaseTypes.d.ts` (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
supabase/functions/github-async-worker/index.ts (2)
- supabase/functions/_shared/SupabaseTypes.d.ts (2)
  - `Database` (3-10099)
  - `Json` (1-1)
- supabase/functions/_shared/GitHubAsyncTypes.ts (1)
  - `GitHubAsyncEnvelope` (99-107)

utils/supabase/SupabaseTypes.d.ts (1)
- supabase/functions/_shared/SupabaseTypes.d.ts (1)
  - `Json` (1-1)

supabase/functions/_shared/SupabaseTypes.d.ts (2)
- utils/supabase/SupabaseTypes.d.ts (2)
  - `Json` (1-1)
  - `Database` (3-10099)
- utils/supabase/DatabaseTypes.d.ts (1)
  - `Json` (3-3)
⏰ Context from checks skipped due to timeout of 180000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: deploy
🔇 Additional comments (11)
supabase/functions/github-async-worker/index.ts (2)
505-512: Method-level circuit breaker DLQ behavior mirrors org-level logic.

The method-specific circuit breaker correctly reuses the same `retry_count >= 5` threshold and archive flow, with a more specific error message keyed on the method. This keeps behavior consistent between org-scoped and org+method-scoped breakers.

No functional issues spotted here.
1171-1177: Non-rate-limit (org-known) error DLQ handling is consistent.

For non-rate-limit errors where `org` is known, the new `retry_count >= 5` guard sends messages to DLQ and archives them instead of requeuing, which aligns with the rate-limit path and prevents endless retries on persistent failures.

Behavior looks correct and keeps the immediate circuit-breaker semantics intact.
supabase/functions/metrics/index.ts (1)
17-33: RPC-based metrics fetch is straightforward and robust.

The switch to two focused RPCs with immediate error checks (`queueError`/`circuitBreakerError`) simplifies the metrics path and keeps failure modes explicit. Using the first row of `queueSizes` as the source of async/DLQ counts is reasonable given the RPC contract.

No issues from a control-flow perspective.
utils/supabase/SupabaseTypes.d.ts (3)
858-910: DLQ table typing looks consistent with usage patterns

The `async_worker_dlq_messages` table definition (column nullability, use of `Json`, the `github_async_method` enum, plus the `class_id` FK) matches the patterns used elsewhere (e.g. `api_gateway_calls`) and looks correct for DLQ usage. No issues from a typing perspective.

Please just confirm that this matches the actual SQL migration (column names, nullability, and enum type) so codegen and database stay in sync.
4989-5038: `review_assignments.hard_deadline` field shape looks right; check downstream handling

`hard_deadline` is non-nullable on `Row` and optional on `Insert`/`Update`, which is the expected pattern for a NOT NULL column with a default (likely `false`). Typing looks fine and should be backwards compatible for inserts.

I'd recommend:

- Verifying the migration sets an explicit default for `hard_deadline`.
- Double-checking that any UI/API code that reasons about review deadlines now considers this flag where appropriate.
9301-9317: New RPC type signatures are coherent; ensure they match the SQL functions

The added function typings (`check_can_add_to_help_request`, `check_can_remove_from_help_request`, the gradebook `with_recalc` helpers, and the async queue / circuit-breaker metrics RPCs) all have sensible argument names and return types, and they're aligned with existing conventions (e.g. `Returns: undefined` for void, `Json`/`Json[]` for opaque payloads).

Please confirm against the SQL functions that:

- Parameter names and orders match exactly (especially for the batch `Json[]` arguments).
- Result column names in `get_async_queue_sizes` and `get_circuit_breaker_statuses` (`async_queue_size`, `dlq_queue_size`, `is_open`, `key`, etc.) are spelled identically, so row mapping stays type-safe.

Also applies to: 9568-9571, 9599-9615, 9998-10005, 10017-10020
supabase/migrations/20251113141300_create-async-worker-dlq.sql (1)
1-60: DLQ table, indexes, and RLS policies look good.

The comprehensive error-tracking fields, the indexes on frequently queried columns, the foreign key with cascade delete, and the restrictive RLS policies are all solid. The pgmq queue creation with exception handling is also a good pattern.
supabase/functions/_shared/SupabaseTypes.d.ts (4)
858-910: DLQ table schema looks consistent; confirm envelope contents are appropriately sanitized.

The `async_worker_dlq_messages` table shape and enum usage (`github_async_method`) look aligned with the rest of the async/GitHub telemetry. The only thing to double-check is that `envelope` and `last_error_context` don't carry raw secrets (tokens, headers, etc.) before being stored, since the type is a generic `Json` and can't enforce that at compile time.
4988-5038: `hard_deadline` field wiring looks correct; ensure DB nullability/defaults match the type.

Adding `hard_deadline: boolean` on `Row` with optional flags on `Insert`/`Update` matches how other booleans are modeled here. Just make sure the underlying column is `NOT NULL` with a sensible default (probably `false`), otherwise Supabase could hand you `null` at runtime where the type says plain `boolean`.
9301-9317: Help-request permission RPC signatures are clear and consistent.

The new `check_can_add_to_help_request`/`check_can_remove_from_help_request` RPC types look fine. The only thing to keep straight in calling code is that `p_user_id` is an auth/user id while `p_profile_id_to_remove` is a profile id; the type system can't distinguish those, so it's worth being deliberate at call sites.
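If mix-ups become a real risk, one common TypeScript remedy is a branded (nominal) string type — purely a sketch, not something this PR introduces; `UserId`, `ProfileId`, and the wrapper function below are invented names:

```typescript
// Branded string types: structurally still strings, but not assignable to
// each other, so swapping the two id kinds fails at compile time.
type UserId = string & { readonly __brand: "UserId" };
type ProfileId = string & { readonly __brand: "ProfileId" };

const asUserId = (s: string): UserId => s as UserId;
const asProfileId = (s: string): ProfileId => s as ProfileId;

// A hypothetical call-site wrapper that only accepts the right brands.
// (A real wrapper would call the RPC; this one just formats the request.)
function removeFromHelpRequest(p_user_id: UserId, p_profile_id_to_remove: ProfileId): string {
  return `remove ${p_profile_id_to_remove} (requested by ${p_user_id})`;
}
```

With this in place, `removeFromHelpRequest(asProfileId("p1"), asUserId("u1"))` is a compile error, while the runtime representation stays a plain string.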
9599-9615: Async queue and circuit-breaker metrics RPC types look good.

The result shapes for `get_async_queue_sizes` and `get_circuit_breaker_statuses` are straightforward and match the intended use for lightweight metrics RPCs. Just ensure the callers treat these as arrays of rows (as modeled here) and not a single object.
Actionable comments posted: 0
🧹 Nitpick comments (2)
supabase/functions/metrics/index.ts (2)
17-19: Consider parallelizing independent RPC calls.

The two RPC calls are independent and could be executed concurrently using `Promise.all` to reduce latency.

Apply this diff to parallelize the calls:

```diff
-  // Query queue sizes and circuit breaker statuses using RPC functions
-  const { data: queueSizes, error: queueError } = await supabase.rpc("get_async_queue_sizes");
-  const { data: circuitBreakers, error: circuitBreakerError } = await supabase.rpc("get_circuit_breaker_statuses");
+  // Query queue sizes and circuit breaker statuses using RPC functions
+  const [
+    { data: queueSizes, error: queueError },
+    { data: circuitBreakers, error: circuitBreakerError }
+  ] = await Promise.all([
+    supabase.rpc("get_async_queue_sizes"),
+    supabase.rpc("get_circuit_breaker_statuses")
+  ]);
```
47-61: Metrics format looks correct with proper timestamps.

The Prometheus text exposition format is correct, and the timestamp is now properly in milliseconds as required.

Optional: Consider making the version configurable.

The version "1.0.0" is hardcoded on line 49. You might want to make this configurable via an environment variable or import it from a version file for easier maintenance.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `supabase/functions/metrics/index.ts` (1 hunks)
- `supabase/migrations/20251113141300_create-async-worker-dlq.sql` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- supabase/migrations/20251113141300_create-async-worker-dlq.sql
⏰ Context from checks skipped due to timeout of 180000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: deploy
🔇 Additional comments (5)
supabase/functions/metrics/index.ts (5)
21-29: LGTM!

Error handling is correct with appropriate logging before throwing, ensuring errors are captured and reported properly.
35-35: Timestamp units issue resolved.

The previous review correctly identified that the timestamp should be in milliseconds for Prometheus. This has been fixed: `Date.now()` returns milliseconds directly.
37-45: LGTM!

The label escaping function correctly handles all required Prometheus special characters (backslash, quotes, newline, carriage return, tab).
63-72: LGTM!

The circuit breaker metrics generation correctly handles empty arrays, properly escapes label values, and follows the Prometheus gauge format with appropriate timestamp usage.
31-32: No issues found; the code correctly matches the RPC contract.

The `get_async_queue_sizes()` RPC is designed to return exactly one row with aggregated queue counts. Accessing `[0]` is appropriate, and the optional chaining with default values provides proper defensive handling.
Actionable comments posted: 2
🧹 Nitpick comments (4)
supabase/functions/_shared/SupabaseTypes.d.ts (2)
870-922: DLQ table typings look consistent; consider whether `retry_count` should be required on insert

`async_worker_dlq_messages` matches the usual Row/Insert/Update pattern and reuses existing types (`Json`, `github_async_method`, the `class_id` FK) consistently. The only question is whether `Insert["retry_count"]` should be required: if the DB column has a default (e.g., 0 for first insertion), you may want to make it optional in `Insert` like `github_circuit_breakers.trip_count` to keep callers from having to thread an explicit `0` everywhere. If there is no default by design, the current typing is fine.
9585-9603: Arg-less RPC typings work, but the `Args` shape is inconsistent with other no-arg functions

These new RPCs' return types look good and line up with their intended metrics payloads. For `Args`, you've used `Record<PropertyKey, never>`, whereas most other arg-less functions here use `Args: never`. Both can work with `supabase-js`, but mixing conventions is a bit surprising; if these types weren't generated, consider standardizing on whichever form your codegen or client version expects (likely `never`) for easier mental parsing and to avoid subtle `rpc` call-site friction.

supabase/functions/github-async-worker/index.ts (2)
479-486: Clarify and centralize retry threshold semantics used for DLQ routing.

All DLQ routing branches gate on `envelope.retry_count ?? 0 >= 5`, which in combination with `requeueWithDelay` means a message can be processed up to 6 times (initial attempt + 5 requeues) before it is finally dead-lettered.

This is totally reasonable, but it's subtle and duplicated several times (circuit-open pre-checks and all error paths). I'd suggest:

- Extracting a small helper (e.g. `hasExceededRetryThreshold(envelope, limit = 5)`) and reusing it in all locations.
- Double-checking whether the intention is "5 retries" (i.e. 6 total attempts) or "5 attempts total"; if the latter, the comparison should be `> 4` after increment, rather than `>= 5` on the pre-increment value.

This will make the behavior easier to reason about and harder to accidentally diverge across branches.
Also applies to: 505-512, 1133-1139, 1188-1194, 1231-1237
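A sketch of the helper suggested above — the name and the `>= 5` default simply mirror the existing branches; this is illustrative, not the worker's code:

```typescript
// Returns true once a message has been retried `limit` times, i.e. on its
// (limit + 1)-th processing attempt. With limit = 5 this matches the current
// ">= 5 on the pre-increment retry_count" behavior: 1 initial attempt plus
// up to 5 requeues before the message is dead-lettered.
function hasExceededRetryThreshold(
  envelope: { retry_count?: number | null },
  limit = 5
): boolean {
  return (envelope.retry_count ?? 0) >= limit;
}
```

Centralizing the comparison also makes the "5 retries vs. 5 total attempts" question a one-line change instead of a hunt across every DLQ branch.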
528-543: DLQ failure injection is powerful; consider stricter scoping to non-prod environments.

The `INJECT_DLQ_FAILURE` gate plus a 50% random failure on the first attempt will reliably drive messages to the DLQ threshold, but if this env var is ever enabled in production it will:

- Artificially inflate error and circuit-breaker metrics, and
- Potentially trip long-lived breakers.

To reduce blast radius, consider also checking an environment flag for the deployment stage (or a dedicated `INJECT_DLQ_FAILURE_ENV=staging`-style guard), or tightening the condition to only act when `retry_count >= 1` (no random failures on the first attempt) so it doesn't penalize fresh messages.

Example tweak:

```diff
- if (injectDlqFailure === "true" || injectDlqFailure === "1") {
-   if (retryCount >= 1 || Math.random() < 0.5) {
+ if (injectDlqFailure === "true" || injectDlqFailure === "1") {
+   if (retryCount >= 1) {
      const error = new Error(`DLQ test failure injection - simulating processing error (retry ${retryCount})`);
      scope.setTag("dlq_test_injection", "true");
      scope.setContext("dlq_test_injection", { retry_count: retryCount, method: envelope.method });
      throw error;
    }
  }
```

or guard it further with a non-prod env check.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (7)
- `package.json` (1 hunks)
- `supabase/functions/_shared/GitHubSyncHelpers.ts` (1 hunks)
- `supabase/functions/_shared/SupabaseTypes.d.ts` (2 hunks)
- `supabase/functions/github-async-worker/index.ts` (7 hunks)
- `supabase/functions/metrics/index.ts` (1 hunks)
- `supabase/migrations/20251113141300_create-async-worker-dlq.sql` (1 hunks)
- `utils/supabase/SupabaseTypes.d.ts` (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- utils/supabase/SupabaseTypes.d.ts
🧰 Additional context used
🧬 Code graph analysis (3)
supabase/functions/metrics/index.ts (1)
- tests/e2e/TestingUtils.ts (1)
  - `supabase` (12-12)

supabase/functions/github-async-worker/index.ts (2)
- supabase/functions/_shared/SupabaseTypes.d.ts (2)
  - `Database` (3-10118)
  - `Json` (1-1)
- supabase/functions/_shared/GitHubAsyncTypes.ts (1)
  - `GitHubAsyncEnvelope` (99-107)

supabase/functions/_shared/SupabaseTypes.d.ts (1)
- utils/supabase/SupabaseTypes.d.ts (1)
  - `Json` (1-1)
⏰ Context from checks skipped due to timeout of 180000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: deploy
🔇 Additional comments (3)
supabase/functions/_shared/GitHubSyncHelpers.ts (1)
71-73: Clarify the rationale for reducing concurrency limits from 30 to 20.The Upstash-backed rate limiter's
reservoir,maxConcurrent, andreservoirRefreshAmountare all being reduced from 30 to 20. This represents a ~33% reduction in concurrent sync operations, which could impact throughput during bulk sync operations across multiple repositories.While tighter concurrency controls may improve stability and reduce GitHub API rate limit issues (especially with the new DLQ system handling failures), it would be helpful to understand:
- What drove the decision to reduce to 20 specifically?
- Has there been testing or monitoring to validate this won't create bottlenecks?
- Are there metrics in place (perhaps via the new
get_async_queue_sizesRPC mentioned in the AI summary) to monitor queue depth and latency?Consider documenting the rationale in a comment near line 58, and ensure the new metrics endpoints can track if this concurrency limit becomes a bottleneck.
package.json (1)
178-178: Verify Supabase CLI v2.58.5 compatibility with existing scripts and workflows.

The changelog for Supabase CLI v2.56.0 through v2.58.5 reveals no explicit breaking changes in the visible releases: mostly bug fixes and feature additions (JWT issuer support, auth notification templates, type-generation improvements). Type-generation improvements require recent CLI versions (example: supabase-js v2.75.1 + CLI > 2.53.1), which v2.58.5 satisfies.

However, the full changelog from v2.23.4 through v2.55.x could not be fully retrieved. Before merging:

- Test type generation: run the script at lines 21–22 (`npx supabase gen types`) locally to confirm the output matches expectations.
- Verify migrations: ensure existing migrations in `supabase/migrations/` still apply without errors.
- Check DLQ/RPC features: confirm the new DLQ table and RPC definitions work as expected with v2.58.5.
supabase/functions/metrics/index.ts (1)
17-37: Metrics path looks sane; RPC usage and Prometheus formatting are consistent.

The simplified metrics endpoint using `get_async_queue_sizes` and `get_circuit_breaker_statuses` is straightforward, and the Prometheus output now uses millisecond timestamps (`Date.now()`), which aligns with the text exposition format expectations. Label escaping for circuit-breaker dimensions is also a good touch.

The only thing to keep an eye on is that `queueSizes?.[0]` assumes a single-row result from `get_async_queue_sizes()`; that matches the current function definition, but if it ever changes to multiple rows you'll want to aggregate explicitly.

Overall this looks good to ship as-is.
Also applies to: 48-63, 68-75
Actionable comments posted: 0
♻️ Duplicate comments (1)
supabase/migrations/20251113141300_create-async-worker-dlq.sql (1)
61-61: Update the comment to reflect that this RPC only returns queue sizes.

The comment incorrectly states this function returns "async queue sizes and circuit breaker statuses," but `get_async_queue_sizes()` only returns queue sizes. Circuit breaker statuses come from the separate `get_circuit_breaker_statuses()` function below.
🧹 Nitpick comments (2)
supabase/functions/github-async-worker/index.ts (2)
252-450: Consider extracting duplicated Sentry logging into a helper function.

The Sentry context and message logging is repeated identically in four error branches (lines 276-300, 309-333, 364-388, 397-421) and again in the success path (lines 425-447). This duplication makes maintenance harder and increases the risk of inconsistency.

Extract the shared logging into a helper:

```typescript
function logDlqAttempt(
  scope: Sentry.Scope,
  envelope: GitHubAsyncEnvelope,
  meta: { msg_id: number; enqueued_at: string },
  errorMessage: string,
  errorType: string,
  retryCount: number,
  success: boolean
) {
  scope.setTag("dlq", "true");
  scope.setTag("retry_count", String(retryCount));
  scope.setContext("dead_letter_queue", {
    original_msg_id: meta.msg_id,
    method: envelope.method,
    retry_count: retryCount,
    error_message: errorMessage,
    error_type: errorType,
    enqueued_at: meta.enqueued_at,
    class_id: envelope.class_id,
    debug_id: envelope.debug_id,
    log_id: envelope.log_id
  });
  const action = success ? "sent to" : "Failed to send message to";
  Sentry.captureMessage(`${action} dead letter queue after ${retryCount} retries: ${envelope.method}`, {
    level: "error",
    tags: { dlq: "true", method: envelope.method, retry_count: String(retryCount) }
  });
}
```

Then replace all five duplicated blocks with a single call to this helper.
585-603: Consider extracting the DLQ routing pattern into a helper function.

The circuit breaker DLQ routing logic is duplicated at lines 585-603 and 622-640, differing only in the error message. This pattern is repeated two more times in the rate limit and error handlers (lines 1263-1280, 1329-1346, 1383-1400).

A helper function would reduce duplication:

```typescript
async function routeToDlqOrLeaveUnarchived(
  adminSupabase: SupabaseClient<Database>,
  envelope: GitHubAsyncEnvelope,
  meta: { msg_id: number; enqueued_at: string },
  error: Error,
  scope: Sentry.Scope
): Promise<boolean> {
  const dlqSuccess = await sendToDeadLetterQueue(adminSupabase, envelope, meta, error, scope);
  if (dlqSuccess) {
    await archiveMessage(adminSupabase, meta.msg_id, scope);
    return true;
  } else {
    console.error(`Failed to send message ${meta.msg_id} to DLQ, leaving unarchived`);
    scope.setContext("dlq_archive_skipped", { msg_id: meta.msg_id, reason: "DLQ send failed" });
    Sentry.captureMessage(`Message ${meta.msg_id} not archived due to DLQ failure`, { level: "error" });
    return false;
  }
}
```

This would consolidate the five nearly-identical DLQ routing blocks scattered throughout the error handling paths.
Also applies to: 622-640
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `supabase/functions/github-async-worker/index.ts` (7 hunks)
- `supabase/migrations/20251113141300_create-async-worker-dlq.sql` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
supabase/functions/github-async-worker/index.ts (2)
- supabase/functions/_shared/SupabaseTypes.d.ts (2)
  - `Database` (3-10118)
  - `Json` (1-1)
- supabase/functions/_shared/GitHubAsyncTypes.ts (1)
  - `GitHubAsyncEnvelope` (99-107)
⏰ Context from checks skipped due to timeout of 180000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: deploy
🔇 Additional comments (3)
supabase/migrations/20251113141300_create-async-worker-dlq.sql (1)
1-59: LGTM! DLQ infrastructure is well-structured.

The queue, tracking table, indexes, and RLS policies are properly configured. The foreign key cascade behavior and index choices support efficient querying patterns.
supabase/functions/github-async-worker/index.ts (2)
656-673: Consider clarifying the test injection logic or making it more predictable.

The DLQ test injection condition `if (retryCount >= 1 || Math.random() < 0.5)` has mixed behavior:
- First attempt (retryCount = 0): 50% chance to fail
- Subsequent attempts: Always fails
This ensures messages eventually reach the 5-retry DLQ threshold, but the 50% randomness on the first try may be unexpected. Consider either:
- Making it deterministic (always fail when enabled), or
- Adding a comment explaining why the random initial failure is desirable for test coverage.
252-450: Past review concerns have been addressed.

The function now returns a boolean indicating success, and all call sites (lines 589-602, 626-639, 1266-1279, 1332-1346, 1386-1400) properly check the result before archiving. The `INJECT_DLQ_FAILURE` test injection is also protected with environment checks. Well done on implementing the suggested improvements!