fix(serializer): detect cycles in dict/list/Sequence/__slots__ branches #1657
Open
wtfashwin wants to merge 1 commit into
EventSerializer only guarded the __dict__ branch against cyclic graphs. A cycle through any of the other container branches recursed until the RecursionError was swallowed inside Python's GC, leaving the asyncio loop GIL-starved for minutes. Apply the same id()-keyed `seen` pattern to the four uncovered branches, wrapped in try/finally so a shared object reachable from sibling subtrees (a DAG, not a cycle) is not mis-marked as a cycle. Fixes langfuse#1655
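The guard described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the langfuse implementation: `to_plain` is a hypothetical stand-in for the serializer walk, and the `<cycle:dict>` marker spelling is an assumption about the final format. It shows why the id is discarded in `finally`: `seen` then tracks only the current path, so a shared-but-acyclic object is serialized twice instead of being flagged.

```python
from typing import Any, Optional, Set


def to_plain(obj: Any, seen: Optional[Set[int]] = None) -> Any:
    """Hypothetical stand-in for a serializer walk; not the langfuse code."""
    if seen is None:
        seen = set()
    if isinstance(obj, (dict, list)):
        obj_id = id(obj)
        if obj_id in seen:
            # obj is already on the current path: a genuine cycle.
            return f"<cycle:{type(obj).__name__}>"
        seen.add(obj_id)
        try:
            if isinstance(obj, dict):
                return {str(k): to_plain(v, seen) for k, v in obj.items()}
            return [to_plain(v, seen) for v in obj]
        finally:
            # Leaving this subtree: a sibling may legitimately reach obj again.
            seen.discard(obj_id)
    return obj


shared = {"token": "abc"}
dag = {"left": shared, "right": shared}  # shared, but acyclic

loop: dict = {"name": "root"}
loop["self"] = loop  # genuine self-cycle

dag_result = to_plain(dag)
loop_result = to_plain(loop)
```

Under these assumptions, `dag_result` contains two full copies of `shared`, while `loop_result` carries the marker only at the point where the walk would re-enter `loop`.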
Summary
Fixes #1655.
`EventSerializer.default` only guarded the `__dict__` branch against cyclic object graphs. Cycles through the `dict`, `list`, `Sequence`, or `__slots__` branches recursed until the resulting `RecursionError` was swallowed inside Python's GC machinery, leaving the asyncio event loop GIL-starved for minutes (a real production symptom; repro and stack details are on the issue).

The same `id()`-keyed `seen` pattern is now applied to the four uncovered branches. Each guard is wrapped in `try/finally` with `set.discard`, so a shared object reachable from two sibling subtrees (a DAG, not a cycle) is not mis-marked. The pre-existing `__dict__` branch behaviour and cycle-marker format are unchanged; the change is purely additive.

Marker format on the new branches is `"<cycle:<Type>>"` (as suggested in the issue body); the existing `__dict__` marker stays as bare `type(obj).__name__` to preserve back-compat.

Why this matters
- `RecursionError` is raised inside a GC finalizer and dropped: there is no exception in user code, just a stalled event loop.
- It can be hit through the `@observe` decorator when a logged payload happens to be cyclic (e.g. an OAuth token-refresh log record on FastAPI workers).

Changes
- `langfuse/_utils/serializer.py`: id-based cycle guard added to the `dict`, `list`, `Sequence`, and `__slots__` branches (mirroring the existing `__dict__` guard). The `__dict__` branch's `set.remove` was switched to `set.discard` inside a `try/finally` so an exception mid-walk no longer leaks `seen` state.
- `tests/unit/test_serializer_cycle_detection.py`: 15 new regression tests covering:
  - cycles in each container branch (`dict`, `list`, custom `Sequence`, `__slots__`-only object)
  - cycles spanning objects (`a → b → a`) and mixed-container cycles (`dict → list → dict`)
  - `seen` state handling
  - the pre-existing `__dict__` cycle marker (`type(obj).__name__`) is preserved

Test plan
- `uv run --frozen pytest tests/unit/test_serializer_cycle_detection.py tests/unit/test_serializer.py -v`: 33 / 33 pass
- `uv run --frozen pytest -n auto --dist worksteal tests/unit`: 435 pass, 2 skipped, 0 fail
- `uv run --frozen ruff check .` (changed files): clean
- `uv run --frozen ruff format --check .` (changed files): already formatted
- `uv run --frozen mypy langfuse --no-error-summary`: clean

Greptile Summary
This PR fixes a silent production hang (issue #1655) where cyclic object graphs passed through `@observe` could stall the asyncio event loop: the `RecursionError` was swallowed inside Python's GC, never surfacing to user code. The same `id()`-based cycle guard that already existed for the `__dict__` branch is now applied to the `dict`, `list`, `Sequence`, and `__slots__` branches, each wrapped in `try/finally` + `set.discard` so that shared-but-non-cyclic objects (DAGs) are not incorrectly flagged.

- The four newly guarded branches (`dict`, `list`, `Sequence`, `__slots__`) now detect cycles and return a `"<cycle:Type>"` sentinel instead of recursing; the pre-existing `__dict__` marker format (`type(obj).__name__`) is left unchanged for back-compat.
- The `__dict__` branch is hardened: `set.remove` (which can raise `KeyError` if state is corrupt) is replaced with `set.discard` inside a `try/finally`, matching the new branches.

Confidence Score: 4/5
Safe to merge. Changes are additive cycle guards on four previously unprotected branches, each using the same try/finally + discard pattern so DAGs are not incorrectly flagged, and the test suite is thorough.
One test comment incorrectly describes which branch is exercised for _CycleSequence — the Sequence branch fires before dict, not after — which could mislead future maintainers tracing a regression. The production code itself is correct; this is a documentation gap in the test file only.
The comment at line 203 of tests/unit/test_serializer_cycle_detection.py should be corrected before merge.
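The `set.remove` to `set.discard` hardening mentioned in the summary can be illustrated with a small standalone sketch. `walk` and `surfaced_error` are hypothetical helpers, not code from the PR, and the "corrupt state" is simulated by clearing the set. The sketch suggests why `discard` is the safer cleanup call inside `finally`: a `KeyError` raised there would replace the exception that actually caused the failure.

```python
from typing import Callable, Set


def walk(cleanup: Callable[[Set[int], int], None]) -> None:
    """Hypothetical walk that fails mid-way after `seen` was emptied."""
    seen: Set[int] = {1}
    try:
        seen.clear()  # simulate corrupt or already-cleaned state
        raise ValueError("mid-walk failure")
    finally:
        # With set.remove this raises KeyError and masks the ValueError;
        # with set.discard it is a silent no-op.
        cleanup(seen, 1)


def surfaced_error(cleanup: Callable[[Set[int], int], None]) -> str:
    """Return the name of the exception type the caller ends up seeing."""
    try:
        walk(cleanup)
    except Exception as exc:
        return type(exc).__name__
    return "none"


with_remove = surfaced_error(set.remove)
with_discard = surfaced_error(set.discard)
```

Here `with_remove` is `"KeyError"` while `with_discard` is `"ValueError"`, so only the `discard` variant lets the original failure reach the caller.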
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["EventSerializer.default(obj)"] --> B{Type dispatch}
    B -->|tuple/set/frozenset| C["return list(obj)\nno cycle guard needed"]
    B -->|dict| D{"id in seen?"}
    B -->|list| E{"id in seen?"}
    B -->|Sequence| F{"id in seen?"}
    B -->|has slots| G{"id in seen?"}
    B -->|has dict| H{"id in seen?"}
    D -->|yes| I["return cycle:dict sentinel"]
    D -->|no| J["add id to seen\nrecurse keys+values\nfinally discard id"]
    E -->|yes| K["return cycle:list sentinel"]
    E -->|no| L["add id to seen\nrecurse items\nfinally discard id"]
    F -->|yes| M["return cycle:Sequence sentinel"]
    F -->|no| N["add id to seen\nrecurse items\nfinally discard id"]
    G -->|yes| O["return cycle:SlotType sentinel"]
    G -->|no| P["add id to seen\nrecurse slots dict\nfinally discard id"]
    H -->|yes| Q["return type name\nback-compat marker"]
    H -->|no| R["add id to seen\nrecurse __dict__\nfinally discard id"]
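The "has slots" path of the flowchart might look roughly like the sketch below, applied to a mutual cycle (`a → b → a`). `Node` and `slots_to_plain` are hypothetical names introduced for illustration, the `<cycle:Node>` spelling is an assumption about the marker format, and the real serializer may collect slot values differently.

```python
from typing import Any, Optional, Set


class Node:
    """Hypothetical __slots__-only object, so it has no __dict__."""

    __slots__ = ("name", "peer")

    def __init__(self, name: str) -> None:
        self.name = name
        self.peer: Optional["Node"] = None


def slots_to_plain(obj: Any, seen: Optional[Set[int]] = None) -> Any:
    """Illustrative walk over __slots__ objects with an id()-keyed guard."""
    if seen is None:
        seen = set()
    if hasattr(obj, "__slots__"):
        obj_id = id(obj)
        if obj_id in seen:
            return f"<cycle:{type(obj).__name__}>"
        seen.add(obj_id)
        try:
            # Build a plain dict from the declared slots, recursing into values.
            return {
                slot: slots_to_plain(getattr(obj, slot, None), seen)
                for slot in obj.__slots__
            }
        finally:
            seen.discard(obj_id)
    return obj


a, b = Node("a"), Node("b")
a.peer, b.peer = b, a  # mutual cycle: a -> b -> a

slots_result = slots_to_plain(a)
```

The walk descends from `a` into `b` and stops with a marker at the point where it would re-enter `a`, rather than recursing until the interpreter's recursion limit is hit.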
Reviews (1): Last reviewed commit: "fix(serializer): detect cycles in dict/l..."