Skip to content

fix(elixir-client): Add fast-loop detection#4028

Merged
magnetised merged 2 commits intomainfrom
magnetised/zoqmymlvnmyw
Mar 19, 2026
Merged

fix(elixir-client): Add fast-loop detection#4028
magnetised merged 2 commits intomainfrom
magnetised/zoqmymlvnmyw

Conversation

@magnetised
Copy link
Copy Markdown
Contributor

When a server or CDN bug causes repeated responses at the same offset without progress, the client would hammer the server indefinitely. Port the TypeScript client's fast-loop detection: track non-live requests in a 500ms sliding window, and if 5+ requests occur at the same offset, clear caches on first detection, apply exponential backoff on subsequent detections, and raise an error after 5 consecutive detections.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 19, 2026

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
916 1 915 0
View the top 1 failed test(s) by shortest run time
test/client.test.ts > should fall back to long polling after 3 consecutive short SSE connections
Stack Traces | 0.414s run time
AssertionError: expected 5 to be less than or equal to 4
 ❯ test/client.test.ts:2987:37

To view more test analytics, go to the Test Analytics Dashboard
📋 Got 3 mins? Take this short survey to help us improve Test Analytics.

@claude
Copy link
Copy Markdown

claude bot commented Mar 19, 2026

@/tmp/review_body.txt

@magnetised magnetised force-pushed the magnetised/zoqmymlvnmyw branch 3 times, most recently from a7c037e to 0ee52aa Compare March 19, 2026 15:33
@claude
Copy link
Copy Markdown

claude bot commented Mar 19, 2026

Claude Code Review

Summary

This PR ports the TypeScript client's fast-loop detection to the Elixir client. When a server or CDN bug causes repeated non-progressive responses at the same offset, the client now detects the stuck retry loop, clears caches on first detection, applies exponential backoff on subsequent ones, and raises an error after 5 consecutive detections. The implementation is clean and well-tested.

What's Working Well

  • Algorithm mirrors the TypeScript implementation faithfully - same 500ms window, 5-request threshold, first-detection cache clear, exponential backoff, and 5-detection error ceiling.
  • Good defensive reset coverage - ShapeState.reset/2 (called on 409 must-refetch) now also clears fast-loop state, preventing false detections across shape handle changes.
  • Clear state machine - the {:ok, _} / {:backoff, ms, _} / {:error, message} return shape from check_fast_loop/1 maps cleanly onto the stream dispatch logic.
  • Actionable error message - the error text lists concrete CDN/proxy misconfiguration causes and links to the troubleshooting guide.
  • Good test coverage - unit tests cover all check_fast_loop/1 branches plus reset/2 and clear_fast_loop/1; integration test verifies the full stream raises when stuck.

Issues Found

Critical (Must Fix): None.

Important (Should Fix): None.

Suggestions (Nice to Have):

  1. Use start_link_supervised!/1 for the test Agent

The integration test starts an Agent with Agent.start_link/1. Project convention is to use start_link_supervised!/1 so ExUnit's supervisor handles cleanup on test failure. While the Agent is linked to the test process and will be cleaned up on normal exit, start_link_supervised!/1 is more robust.

  1. Document that stale-retry requests also feed the fast-loop detector

The {:stale_retry, ...} path calls fetch/1 recursively, so stale-retry requests are also counted in the sliding window and can trigger fast-loop detection. This is arguably correct behaviour (repeated stale 409s at the same offset are a CDN loop), but a brief comment near the stale_retry handler would make this interaction explicit for future readers.

  1. Minor: Enum.map(0..3, fn _ -> ... end) in tests could use List.duplicate

Several tests build lists of identical entries via Enum.map(0..3, fn _ -> ... end). List.duplicate(entry, 4) is slightly more expressive of intent, but this is purely stylistic.

Issue Conformance

No linked issue was attached to this PR. The PR description clearly explains the change and references the TypeScript client's matching behaviour, which provides sufficient context. Ideally a GitHub issue would be created to track the underlying bug (undetected fast retry loops) and linked here.

Previous Review Status

This is the first review iteration.


Review iteration: 1 | 2026-03-19

When a server or CDN bug causes repeated responses at the same offset
without progress, the client would hammer the server indefinitely. Port
the TypeScript client's fast-loop detection: track non-live requests in
a 500ms sliding window, and if 5+ requests occur at the same offset,
clear caches on first detection, apply exponential backoff on subsequent
detections, and raise an error after 5 consecutive detections.
@magnetised magnetised force-pushed the magnetised/zoqmymlvnmyw branch from 0ee52aa to 8ec8965 Compare March 19, 2026 15:41
@magnetised magnetised merged commit 9335ae6 into main Mar 19, 2026
24 of 26 checks passed
@magnetised magnetised deleted the magnetised/zoqmymlvnmyw branch March 19, 2026 15:47
@github-actions
Copy link
Copy Markdown
Contributor

This PR has been released! 🚀

The following packages include changes from this PR:

  • @core/elixir-client@0.9.3

Thanks for contributing to Electric!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants