
Replace line iter with bytes to lines wrapper#663

Merged
sjmonson merged 2 commits into main from fix/gptoss_streaming
Mar 31, 2026


@sjmonson
Collaborator

Summary

Read the byte stream rather than the line stream and manually split on the double newline \n\n.

Details

Some models (gpt-oss) don't escape their newlines, which causes httpx.iter_lines to split in the wrong spot. Note that this could potentially still be a problem if a model emits a \n\n token, so we may need a more robust solution in the future.
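
A minimal sketch of the idea described above, not the code merged in this PR. The function name `iter_sse_events` and the sample chunks are hypothetical, and the carry-over buffer is an assumption about how a split across chunks would be handled.

```python
from typing import Iterable, Iterator


def iter_sse_events(chunks: Iterable[bytes]) -> Iterator[bytes]:
    # Hypothetical helper: yield raw event blocks delimited by a blank line.
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # An event is complete only once its terminating b"\n\n" has arrived.
        while b"\n\n" in buffer:
            event, buffer = buffer.split(b"\n\n", 1)
            if event:
                yield event
    # Flush whatever is left if the stream closes without a final delimiter.
    if buffer.strip():
        yield buffer


# Made-up stream: the first event contains a raw newline inside the JSON
# string, and the second event is split across two chunks.
chunks = [b'data: {"text": "a\nb"}\n\nda', b'ta: {"text": "c"}\n\n']
events = list(iter_sse_events(chunks))
print(events)
```

Splitting on `b"\n\n"` keeps the single raw newline inside the first event, where a line-based iterator would cut it in two. As noted above, a payload that itself contains `\n\n` would still be split incorrectly.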

Test Plan

See original issue.

Related Issues


  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Signed-off-by: Samuel Monson <smonson@redhat.com>
@sjmonson sjmonson self-assigned this Mar 30, 2026
dbutenhof
dbutenhof previously approved these changes Mar 30, 2026
jaredoconnell
jaredoconnell previously approved these changes Mar 30, 2026
Collaborator

@jaredoconnell jaredoconnell left a comment

This does seem more like a workaround, but if this fixes the bug we can try it.

@sjmonson sjmonson marked this pull request as draft March 31, 2026 18:18
@sjmonson
Collaborator Author

sjmonson commented Mar 31, 2026

So this doesn't fully fix the issue, but I think we should merge it anyway because it fixes a separate issue. We have been over-counting event iterations by counting each newline when we really should have been counting each \n\n split. The actual gpt-oss issue may be unfixable. I thought this was working around a bug in httpx, but in fact the issue is that gpt-oss is emitting invalid JSON. The only way to fix it would be to manually correct the JSON, which is either too computationally expensive or escapes valid JSON as well as invalid.
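
To illustrate why a raw newline makes the payload invalid, here is a small sketch using the standard-library `json` module. The payloads are made up for illustration; the actual gpt-oss output is not shown in this thread, and the project itself uses orjson.

```python
import json

# Hypothetical payloads. In `raw`, the string value holds a literal newline
# character, which JSON does not allow inside a string.
raw = '{"text": "line one\nline two"}'
# In `escaped`, the newline is written as the two characters backslash + n.
escaped = '{"text": "line one\\nline two"}'

try:
    json.loads(raw)
    raw_is_valid = True
except json.JSONDecodeError:
    raw_is_valid = False

parsed = json.loads(escaped)
print(raw_is_valid)          # False
print(repr(parsed["text"]))  # 'line one\nline two'
```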

Signed-off-by: Samuel Monson <smonson@redhat.com>
@sjmonson sjmonson dismissed stale reviews from jaredoconnell and dbutenhof via 804f3c7 March 31, 2026 19:10
@sjmonson sjmonson marked this pull request as ready for review March 31, 2026 19:10
Collaborator

@dbutenhof dbutenhof left a comment

This approach won't break anything worse than before, so if it avoids problems on even a subset of cases, it's a win. I agree: let's go for it.

@sjmonson
Collaborator Author

Next release we can look at https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation for building a better parser.
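
As a rough sketch of what a parser following that specification might look like, the snippet below handles only blank-line dispatch, comment lines, and the `data` field. The function name `parse_sse` is hypothetical, the input is assumed to be already split into lines, and the `event`, `id`, and `retry` fields are ignored.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[str]:
    # Hypothetical, simplified parser: yields the data payload of each event.
    data: list[str] = []
    for line in lines:
        if line == "":
            # A blank line dispatches the accumulated event, if any.
            if data:
                yield "\n".join(data)
                data = []
            continue
        if line.startswith(":"):
            continue  # Lines starting with a colon are comments.
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # Strip a single leading space from the value.
        if field == "data":
            data.append(value)
        # Other fields are ignored in this sketch.
    # An event not terminated by a blank line is discarded.


lines = ['data: {"a": 1}', "", ": keep-alive", "data: first", "data: second", ""]
events = list(parse_sse(lines))
print(events)  # ['{"a": 1}', 'first\nsecond']
```

This would not by itself solve the unescaped-newline problem discussed above, since it still relies on correct line boundaries.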

@sjmonson sjmonson merged commit 22b4dfd into main Mar 31, 2026
18 checks passed
@sjmonson sjmonson deleted the fix/gptoss_streaming branch March 31, 2026 20:10
@sjmonson sjmonson mentioned this pull request Apr 1, 2026
4 tasks
sjmonson added a commit that referenced this pull request Apr 1, 2026
## Summary

Partially reverts #663 to iterating over lines, but keeps the skipping
of blank newlines.

## Details

#663 switched the HTTP backend to iterating over byte strings. The
problem is that it did not handle the case where a line was split over
multiple iterations.
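
A sketch of how such a failure can arise, assuming each received chunk is split independently with no carry-over buffer. The helper `split_per_chunk` and the sample chunks are hypothetical and are not taken from the code in #663.

```python
import json

# Made-up stream: one event whose bytes arrive in two separate chunks.
chunks = [b'data: {"text": "hel', b'lo"}\n\n']


def split_per_chunk(chunks):
    # Hypothetical helper: splits every chunk on its own, so an event that
    # spans two chunks is never reassembled.
    for chunk in chunks:
        for part in chunk.split(b"\n\n"):
            if part:
                yield part


fragments = list(split_per_chunk(chunks))
failures = 0
for fragment in fragments:
    payload = fragment.removeprefix(b"data: ")
    try:
        json.loads(payload)
    except json.JSONDecodeError:
        failures += 1

print(fragments)  # [b'data: {"text": "hel', b'lo"}']
print(failures)   # 2
```

Both fragments fail to decode because each holds only part of the JSON document.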

## Test Plan

Run a benchmark with a known errored request rate (preferably 0) and
ensure that there are no failed requests due to
`orjson.JSONDecodeError: unexpected end of data`.

---

- [x] "I certify that all code in this PR is my own, except as noted
below."

## Use of AI

- [ ] Includes AI-assisted code completion
- [ ] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a
docstring that includes `## WRITTEN BY AI ##`)
@dbutenhof dbutenhof added this to the v0.6.0 milestone Apr 7, 2026
@dbutenhof dbutenhof added the bug Represents a user-visible defect label Apr 7, 2026

Labels

bug Represents a user-visible defect

Projects

None yet

Development

Successfully merging this pull request may close these issues.

orjson.JSONDecodeError (truncated JSON at char 158) when using /v1/completions with openai/gpt-oss-120b

3 participants