Replace line iter with bytes to lines wrapper by sjmonson · Pull Request #663 · vllm-project/guidellm

sjmonson · 2026-03-30T18:25:11Z

Summary

Read the byte stream rather than the line stream and manually split on a double newline (\n\n).

Details

Some models (gpt-oss) don't escape their newlines, which causes httpx.iter_lines to split in the wrong spot. Note that this could potentially still be a problem if a model emits a \n\n token, so we may need a more robust solution in the future.
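To illustrate the difference described above, here is a minimal sketch in plain Python. It does not use httpx or the guidellm code. The payload and the helper names `split_by_line` and `split_by_event` are made up for illustration. It shows a raw newline inside a JSON string breaking line-based splitting but not \n\n-based splitting, and a raw \n\n inside the payload still breaking the latter.

```python
# Hypothetical SSE payload: the JSON string contains a raw (unescaped) newline.
raw = b'data: {"text": "hello\nworld"}\n\ndata: [DONE]\n\n'


def split_by_line(buf: bytes) -> list[bytes]:
    """Line-oriented splitting: every newline ends an item (blank items dropped)."""
    return [line for line in buf.split(b"\n") if line]


def split_by_event(buf: bytes) -> list[bytes]:
    """Event-oriented splitting: only a blank line (double newline) ends an item."""
    return [event for event in buf.split(b"\n\n") if event]


lines = split_by_line(raw)    # the first event is cut in two at the raw newline
events = split_by_event(raw)  # the first event stays in one piece

# The caveat from the description: a raw double newline inside the payload
# still cuts the event in two, even with event-oriented splitting.
raw_double = b'data: {"text": "a\n\nb"}\n\n'
still_broken = split_by_event(raw_double)
```

Keeping the event in one piece does not make the payload valid JSON. It only avoids cutting it at the wrong place.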

Test Plan

See original issue.

Related Issues

Resolves orjson.JSONDecodeError (truncated JSON at char 158) when using /v1/completions with openai/gpt-oss-120b #662

"I certify that all code in this PR is my own, except as noted below."

Use of AI

Includes AI-assisted code completion
Includes code generated by an AI application
Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Signed-off-by: Samuel Monson <smonson@redhat.com>

src/guidellm/backends/openai/http.py

jaredoconnell

This does seem more like a workaround, but if this fixes the bug we can try it.

sjmonson · 2026-03-31T19:04:07Z

So this doesn't fully fix the issue, but I think we should merge it anyway because it fixes a separate issue. We have been over-counting event iterations by counting each newline when we really should have been counting each \n\n split. The actual gpt-oss issue may be unfixable. I thought this was working around a bug in httpx, but in fact the issue is that gpt-oss is emitting invalid JSON. The only way to fix it would be to manually correct the JSON, which is either too computationally expensive or escapes valid JSON as well as invalid.
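A small sketch of the over-counting, using an invented three-event stream and plain Python rather than the actual backend code: iterating line by line yields twice as many items as there are events, because each event is followed by a blank separator line.

```python
# Hypothetical stream of three well-formed events, each ended by a blank line.
stream = b'data: {"n": 1}\n\ndata: {"n": 2}\n\ndata: [DONE]\n\n'

# Line-by-line iteration sees the blank separator lines as items too.
line_items = stream.decode("utf-8").splitlines()

# Splitting on the event delimiter gives one item per event.
event_items = [event for event in stream.split(b"\n\n") if event]

line_iterations = len(line_items)    # counts data lines and blank lines
event_iterations = len(event_items)  # counts events only
```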

Signed-off-by: Samuel Monson <smonson@redhat.com>

dbutenhof

This approach won't break anything worse than before, so if it avoids problems on even a subset of cases, it's a win. I agree: let's go for it.

sjmonson · 2026-03-31T19:24:02Z

Next release we can look at https://html.spec.whatwg.org/multipage/server-sent-events.html#event-stream-interpretation for building a better parser.
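As a rough idea of what a parser following that interpretation algorithm could look like, here is a simplified sketch. It is not guidellm code. The names `ServerSentEvent` and `parse_event_stream` are hypothetical. It works on a complete, already-decoded string and ignores the `id` and `retry` fields, the leading BOM, and incremental decoding.

```python
import re
from dataclasses import dataclass


@dataclass
class ServerSentEvent:
    event: str
    data: str


def parse_event_stream(text: str) -> list[ServerSentEvent]:
    """Simplified event-stream interpretation over a complete string."""
    events: list[ServerSentEvent] = []
    event_type = ""
    data_lines: list[str] = []
    # Lines may end with CRLF, LF, or CR.
    for line in re.split(r"\r\n|\n|\r", text):
        if line == "":
            # A blank line dispatches the pending event, if it has any data.
            if data_lines:
                events.append(
                    ServerSentEvent(event_type or "message", "\n".join(data_lines))
                )
            event_type, data_lines = "", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data_lines.append(value)
        elif field == "event":
            event_type = value
        # Other fields ("id", "retry", unknown names) are ignored in this sketch.
    # An event without a terminating blank line is discarded.
    return events


parsed = parse_event_stream(
    'data: {"n": 1}\n\n: keep-alive\n\nevent: done\ndata: a\ndata: b\n\n'
)
```

Under these rules a continuation line produced by a raw newline in the payload has no recognized field name and is dropped, so a parser of this kind would not by itself recover such payloads.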

## Summary

Partially reverts #663 to iterating over lines, but keeps the skipping of blank lines.

## Details

#663 switched the HTTP backend to iterating over byte strings. The problem is that it did not handle the case where a line was split over multiple iterations.

## Test Plan

Run a benchmark with a known errored request rate (preferably 0) and ensure that there are no failed requests due to `orjson.JSONDecodeError: unexpected end of data`.

---

- [x] "I certify that all code in this PR is my own, except as noted below."

## Use of AI

- [ ] Includes AI-assisted code completion
- [ ] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes `## WRITTEN BY AI ##`)
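The failure mode named in the details (a line split over multiple iterations) can be sketched as follows. This is an illustration under assumptions, not the code from either PR. `iter_events_naive` and `iter_events_buffered` are hypothetical names, and the chunk boundaries are invented. The buffered variant shows one possible way to handle split chunks, whereas #680 itself went back to iterating over lines.

```python
from collections.abc import Iterable, Iterator


def iter_events_naive(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Splits each chunk on its own, so an event cut by a chunk boundary breaks."""
    for chunk in chunks:
        for event in chunk.split(b"\n\n"):
            if event:
                yield event


def iter_events_buffered(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Carries the incomplete tail of each chunk over to the next one."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last delimiter is complete; the tail may not be.
        *complete, buffer = buffer.split(b"\n\n")
        for event in complete:
            if event:
                yield event
    # Whatever remains has no terminating blank line and is dropped here.


# Hypothetical chunking: the first event and its delimiter span three chunks.
chunks = [b'data: {"te', b'xt": "hi"}\n', b'\ndata: [DONE]\n\n']

naive = list(iter_events_naive(chunks))
buffered = list(iter_events_buffered(chunks))
```

Feeding a fragment such as `naive[0]` to a JSON decoder would fail on truncated input, which is consistent with the `unexpected end of data` error mentioned in the test plan.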

Replace line iter with bytes to lines wrapper

e065c12

Signed-off-by: Samuel Monson <smonson@redhat.com>

sjmonson self-assigned this Mar 30, 2026

dbutenhof previously approved these changes Mar 30, 2026

src/guidellm/backends/openai/http.py

jaredoconnell previously approved these changes Mar 30, 2026

sjmonson marked this pull request as draft March 31, 2026 18:18

Skip blank lines

804f3c7

Signed-off-by: Samuel Monson <smonson@redhat.com>

sjmonson dismissed stale reviews from jaredoconnell and dbutenhof via 804f3c7 March 31, 2026 19:10

sjmonson marked this pull request as ready for review March 31, 2026 19:10

sjmonson requested review from dbutenhof and jaredoconnell March 31, 2026 19:10

dbutenhof approved these changes Mar 31, 2026

sjmonson merged commit 22b4dfd into main Mar 31, 2026
18 checks passed

sjmonson deleted the fix/gptoss_streaming branch March 31, 2026 20:10

sjmonson mentioned this pull request Apr 1, 2026

Revert back to iterating over lines #680

Merged

4 tasks

dbutenhof added this to the v0.6.0 milestone Apr 7, 2026

dbutenhof added the bug Represents a user-visible defect label Apr 7, 2026

sjmonson merged 2 commits into main from fix/gptoss_streaming
