fix(serializer): correctly serialize float('-inf') as "-Infinity"#1683

Open
devteamaegis wants to merge 2 commits into langfuse:main from devteamaegis:fix/logic-error-data-corruption

Conversation

devteamaegis commented May 31, 2026

What's broken

EventSerializer._default_inner() in langfuse/_utils/serializer.py converts float('-inf') to the string "Infinity" instead of "-Infinity". The guard math.isinf(obj) returns True for both positive and negative infinity, but the code unconditionally returns "Infinity" without checking the sign. Any LLM output, metric, or metadata field containing negative infinity is silently sent to the Langfuse server with the wrong sign, causing data corruption.

Why it happens

math.isinf() returns True for both float('inf') and float('-inf'), so the sign check was simply missing.
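A quick illustration of the point above, using only the standard library. The variable names are for illustration and do not come from the serializer.

```python
import math

pos, neg = float("inf"), float("-inf")

# math.isinf is True for both infinities, so on its own it cannot
# distinguish positive from negative infinity.
both_infinite = math.isinf(pos) and math.isinf(neg)

# A comparison with zero recovers the sign.
neg_is_negative = neg < 0
pos_is_negative = pos < 0

print(both_infinite, neg_is_negative, pos_is_negative)
```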

Fix

Changed the single return statement to return "-Infinity" if obj < 0 else "Infinity", which is a one-line change in langfuse/_utils/serializer.py.
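A minimal sketch of the sign check in isolation. `serialize_float` is a hypothetical standalone helper written for this description; in the PR the same conditional sits inside `EventSerializer._default_inner()`, whose surrounding code is not reproduced here.

```python
import math


def serialize_float(obj: float):
    """Hypothetical stand-in for the infinity branch of the serializer."""
    if math.isinf(obj):
        # Check the sign explicitly; isinf alone is True for both infinities.
        return "-Infinity" if obj < 0 else "Infinity"
    # Finite floats pass through unchanged in this sketch.
    return obj


print(serialize_float(float("inf")))
print(serialize_float(float("-inf")))
```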

Test

Added test_infinity_floats() in tests/unit/test_serializer.py that asserts both float('inf') encodes to "Infinity" and float('-inf') encodes to "-Infinity".

Fixes #1682

Greptile Summary

This PR fixes a sign bug in EventSerializer._default_inner() where float('-inf') was incorrectly serialized as "Infinity" instead of "-Infinity". The one-line fix is correct and the primary regression test is solid, but the diff also bundles a substantial depth-limiting feature and method restructuring that are not described.

  • -inf fix (serializer.py line 79): return "-Infinity" if obj < 0 else "Infinity" correctly handles both signs; confirmed by the new test_infinity_floats test.
  • Depth-limiting feature (_MAX_DEPTH = 20, _depth counter, _default_inner refactor): guards against deep recursion for dict/slots/BaseModel objects, but items inside tuple/set/frozenset collections bypass the limit because they are returned as list(obj) without recursive default() calls and reach super().encode() with a reset depth context.
  • Test suite expansion: six new tests covering infinity, depth truncation, slot truncation, and dict key serialization — good coverage of the new behaviour.

Confidence Score: 4/5

The core bug fix is correct and safe to merge; the bundled depth-limiting refactor is mostly sound but has a gap for tuple/set/frozenset nesting.

The -inf serialization fix works exactly as intended. The depth-limiting infrastructure is a welcome addition and handles the common cases (dicts, slots, dataclasses, BaseModel). The gap is that items returned from tuple/set/frozenset branches bypass _MAX_DEPTH because they go through super().encode() with a reset depth counter — a structure like ((deep_obj,),) can still recurse without hitting the limit. The scope of the PR is also much larger than the description conveys, making it harder to review the full behavioral surface.

The tuple/set/frozenset branch in langfuse/_utils/serializer.py (around line 139) deserves a closer look; the remaining changes are straightforward.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["encode(obj)"] --> B["_depth = 0\nseen.clear()"]
    B --> C["self.default(obj)\n_depth → 1"]
    C --> D["_default_inner(obj)"]
    D --> E{Primitive?\nstr / float / int\nUUID / datetime / …}
    E -- Yes --> F["Return value\n(before depth check)"]
    E -- No --> G{_depth >= _MAX_DEPTH?}
    G -- Yes --> H["Return '<TypeName>'"]
    G -- No --> I{Complex type?}
    I -- dataclass --> J["asdict(obj)"]
    I -- BaseModel --> K["obj.model_dump()"]
    I -- dict --> L["{ default(k): default(v) }\n_depth +1 per call"]
    I -- list/Sequence --> M["[ default(item) ]\n_depth +1 per call"]
    I -- tuple/set/frozenset --> N["list(obj) ⚠\nitems NOT via default()"]
    I -- __slots__ --> O["{ slot: default(val) }\n_depth +1 per call"]
    I -- __dict__ --> P["circular-ref check\nthen { k: default(v) }"]
    N --> Q["super().encode(raw_list)"]
    Q --> R["JSONEncoder calls default(item)\n_depth RESETS to 0 ⚠"]
    J & K & L & M & O & P --> S["Return serialized form"]
    S --> T["super().encode(result)"]
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
langfuse/_utils/serializer.py:139-140
**Depth limit bypassed for tuple/set/frozenset items**

`list(obj)` is returned as-is — the items are **not** passed through `self.default()`. `super().encode()` then calls `self.default()` on each non-serializable item, but by that point `_depth` has been decremented back to 0. A structure like `((((custom_obj,),),),)` triggers `default()` repeatedly from depth 0, so `_MAX_DEPTH` provides no protection for objects nested exclusively inside these collection types.

### Issue 2 of 2
langfuse/_utils/serializer.py:38-51
**PR scope significantly exceeds the described one-line fix**

The description says this is "a one-line change in `langfuse/_utils/serializer.py`" to fix the `-inf` sign, but the diff includes a new `_MAX_DEPTH` depth-limiting feature, the `default()` / `_default_inner()` refactor, reordering of all the type-dispatch branches, and a changed `__slots__` strategy. These are meaningful behavioral changes (e.g. objects inside tuples/sets now have a different depth-tracking context) that warrant their own description and review focus.

Reviews (1): Last reviewed commit: "test(serializer): add test for float('-i..."

Greptile also left 1 inline comment on this PR.


@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Comment on lines 139 to 140
if isinstance(obj, (tuple, set, frozenset)):
    return list(obj)

P2 Depth limit bypassed for tuple/set/frozenset items

list(obj) is returned as-is — the items are not passed through self.default(). super().encode() then calls self.default() on each non-serializable item, but by that point _depth has been decremented back to 0. A structure like ((((custom_obj,),),),) triggers default() repeatedly from depth 0, so _MAX_DEPTH provides no protection for objects nested exclusively inside these collection types.
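One possible direction, sketched outside the real class: recurse into tuple/set/frozenset items with an explicit depth argument instead of returning `list(obj)` unchanged. `to_jsonable` and `MAX_DEPTH` are hypothetical names, and this standalone function passes depth as a parameter rather than using the `_depth` counter on the encoder, so it is an illustration of the idea rather than a patch for `serializer.py`.

```python
from typing import Any

MAX_DEPTH = 20  # assumed to match the _MAX_DEPTH value mentioned in the review


def to_jsonable(obj: Any, depth: int = 0) -> Any:
    """Convert obj to JSON-friendly values, truncating at MAX_DEPTH."""
    if obj is None or isinstance(obj, (str, int, float, bool)):
        return obj  # primitives are returned before the depth check
    if depth >= MAX_DEPTH:
        return f"<{type(obj).__name__}>"
    if isinstance(obj, dict):
        return {str(k): to_jsonable(v, depth + 1) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set, frozenset)):
        # Recurse into every item so nesting inside tuples and sets
        # counts toward the limit, unlike a bare list(obj).
        return [to_jsonable(item, depth + 1) for item in obj]
    # Anything else is reduced to a type placeholder in this sketch.
    return f"<{type(obj).__name__}>"


# Build a tuple nested 25 levels deep around a string leaf.
nested: Any = "leaf"
for _ in range(25):
    nested = (nested,)

truncated = to_jsonable(nested)
```

With this shape, the 25-level tuple is cut off once the limit is reached, because each level of tuple nesting increments the depth that the next call sees.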

Development

Successfully merging this pull request may close these issues.

BUG: EventSerializer serializes float('-inf') as "Infinity" instead of "-Infinity"

2 participants