feat: streaming support in m serve OpenAI API server by markstur · Pull Request #823 · generative-computing/mellea

markstur · 2026-04-11T01:41:37Z

Misc PR

Type of PR

Bug Fix
New Feature
Documentation
Other

Description

Link to Issue: Fixes m serve OpenAI API streaming support #822

Add OpenAI API-compatible streaming support to the m serve app.

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code was added
Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

github-actions · 2026-04-11T01:41:51Z

The PR description has been updated. Please fill out the template for your PR to be reviewed.

planetf1

Some blocking findings

planetf1

Noticed a few things to tighten up.

Minor, but looking at the tests: the TestStreamingEndpoint tests (e.g. line 166) are marked @pytest.mark.asyncio and declared async def but contain no await. TestClient is synchronous and doesn't need the marker. TestStreamingHelpers is fine.

I suspect they will still work, but doesn't this imply the wrong behaviour?

markstur · 2026-04-15T06:12:23Z

> Noticed a few things to tighten up.
>
> Minor, but looking at the tests: the TestStreamingEndpoint tests (e.g. line 166) are marked @pytest.mark.asyncio and declared async def but contain no await. TestClient is synchronous and doesn't need the marker. TestStreamingHelpers is fine.
>
> I suspect they will still work, but doesn't this imply the wrong behaviour?

fixed

markstur · 2026-04-15T06:18:21Z

Thanks for the review! I addressed all the comments. See my replies where there was a question, but I think everything is covered.

psschwei

couple of things flagged by claude

psschwei

LGTM
Thanks @markstur !
I think since @planetf1 had requested changes, he'll also need to approve (?)

planetf1

Thanks for the PR — streaming support is well-structured and the test coverage is thorough. Two things worth addressing before merge:

planetf1 · 2026-04-16T12:28:22Z

Also opened #873, as I noticed the example imports from the cli package, which I would suggest is inappropriate. It's not new to this PR, though.

markstur · 2026-04-16T18:30:11Z

Thanks for the review! Even the little things are really good for me to look closer at.
Addressed all the comments and rebased.

markstur · 2026-04-16T19:12:32Z

I guess now that another PR merged, I have to get used to saying Assisted-by: IBM Bob
I guess I can do that on squash/merge.

astream() returns deltas (only new fragments), not accumulated text. Update docstring. Fix unused previous_length in streaming.py. Rename vars for clarity. Fix streaming tests to be consistent with the non-accumulating behavior. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
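
To illustrate the delta behaviour described in this commit message, here is a minimal sketch. `fake_astream` and `collect_text` are hypothetical stand-ins written for this example, not the mellea API; the only assumption carried over from the commit is that each item is a new fragment rather than the accumulated text.

```python
import asyncio
from collections.abc import AsyncIterator


async def fake_astream(fragments: list[str]) -> AsyncIterator[str]:
    """Hypothetical delta-style stream: yields only the new text each time."""
    for fragment in fragments:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield fragment


async def collect_text(fragments: list[str]) -> str:
    """Rebuild the full response by concatenating the deltas."""
    parts: list[str] = []
    async for delta in fake_astream(fragments):
        # Each delta is already "new text only", so no slicing by a
        # previous length is needed (slicing would drop characters).
        parts.append(delta)
    return "".join(parts)


full_text = asyncio.run(collect_text(["Hel", "lo, ", "world"]))
```

With a delta stream, a `previous_length` counter has nothing to do, which is consistent with the commit removing it as unused.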

Since stream defaults to False, a regression was introduced where stream=False is now passed to backends where it used to be default. Fix is to only forward stream=True and not add stream=False. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
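
A rough sketch of the fix described above, assuming the server collects request fields into a dict of options before calling the backend. `build_model_options` is a hypothetical name used only for this illustration.

```python
from typing import Any


def build_model_options(stream: bool = False, **extra: Any) -> dict[str, Any]:
    """Hypothetical helper: collect the options to pass on to a backend."""
    options: dict[str, Any] = dict(extra)
    if stream:
        # Forward the flag only when streaming was requested, so the
        # non-streaming path sends exactly what it sent before.
        options["stream"] = True
    return options
```

The point of the design is that a default of `False` on the request model never reaches the backend as an explicit argument.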

Fix bug where we get into loop despite already having computed chunk. Update test which was written to test existing bug -- use wording that makes it current and not referencing the non-existent bug. Make the asserts more precise with expected chunks. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
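
The kind of guard this commit describes might look like the sketch below. `FakeOutput`, `next_delta`, and `iter_content` are hypothetical names invented for the example; the real output object and its methods may differ.

```python
import asyncio
from collections.abc import AsyncIterator


class FakeOutput:
    """Hypothetical stand-in for a model output that may already be complete."""

    def __init__(self, pending: list[str], value: str = "") -> None:
        self._pending = list(pending)
        self.value = value

    def is_computed(self) -> bool:
        return not self._pending

    async def next_delta(self) -> str:
        delta = self._pending.pop(0)
        self.value += delta
        return delta


async def iter_content(output: FakeOutput) -> AsyncIterator[str]:
    if output.is_computed():
        # Nothing left to stream: emit the finished value once and stop,
        # instead of entering a loop that waits for more fragments.
        if output.value:
            yield output.value
        return
    while not output.is_computed():
        yield await output.next_delta()


async def collect(output: FakeOutput) -> list[str]:
    return [chunk async for chunk in iter_content(output)]


already_done = asyncio.run(collect(FakeOutput([], value="cached answer")))
still_streaming = asyncio.run(collect(FakeOutput(["a", "b"])))
```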

planetf1

All looks good! Thanks for addressing (and followup)

markstur requested a review from a team as a code owner April 11, 2026 01:41

markstur requested review from akihikokuroda and avinash2692 April 11, 2026 01:41

github-actions bot added the enhancement New feature or request label Apr 11, 2026

psschwei requested review from psschwei and removed request for akihikokuroda April 11, 2026 02:06

planetf1 requested changes Apr 14, 2026

Comment thread mellea/helpers/openai_compatible_helpers.py Outdated

Comment thread mellea/helpers/openai_compatible_helpers.py Outdated

Comment thread mellea/helpers/openai_compatible_helpers.py Outdated

Comment thread cli/serve/app.py

planetf1 requested changes Apr 14, 2026

Comment thread mellea/helpers/openai_compatible_helpers.py Outdated

Comment thread cli/serve/models.py Outdated

Comment thread docs/examples/m_serve/README.md

Comment thread docs/examples/m_serve/m_serve_example_streaming.py Outdated

psschwei reviewed Apr 14, 2026

Comment thread docs/examples/m_serve/README.md

Comment thread mellea/helpers/openai_compatible_helpers.py Outdated

Comment thread mellea/helpers/openai_compatible_helpers.py Outdated

markstur requested review from jakelorocco and nrfulton as code owners April 15, 2026 05:56

markstur requested a review from planetf1 April 15, 2026 06:18

markstur enabled auto-merge April 15, 2026 06:18

psschwei reviewed Apr 15, 2026

Comment thread cli/serve/streaming.py Outdated

Comment thread mellea/helpers/openai_compatible_helpers.py Outdated

Comment thread cli/serve/models.py Outdated

Comment thread cli/serve/streaming.py Outdated

markstur requested a review from psschwei April 15, 2026 17:59

psschwei approved these changes Apr 15, 2026

planetf1 reviewed Apr 16, 2026

Comment thread cli/serve/streaming.py Outdated

planetf1 reviewed Apr 16, 2026

Comment thread cli/serve/streaming.py

planetf1 reviewed Apr 16, 2026

Comment thread cli/serve/models.py

planetf1 mentioned this pull request Apr 16, 2026

feat: re-export serve types from public mellea.* namespace #873

Open

markstur force-pushed the issue_822 branch from cfb870b to 1533a5d Compare April 16, 2026 18:25

markstur requested a review from planetf1 April 16, 2026 18:30

markstur disabled auto-merge April 16, 2026 19:11

markstur added 23 commits April 16, 2026 16:17

feat: streaming support in m serve OpenAI API server

3a2ad3d

Fixes: generative-computing#822 Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

test: add streaming test file

ca5a9dd

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: docstring fix for streaming

bd40cea

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: make system_fingerprint consistent in m serve

cdc8422

Added streaming support w/ setting system_fingerprint. Make it consistent. We are currently just setting it to None but now it is consistent for future use. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: move stream_chat_completion_chunks out of helpers and into cli

0f798d2

Doesn't belong in library. This is for cli. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: emit [DONE] after error when streaming

dc04ce2

Adds missing yield of the [DONE] that clients will expect. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
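
As a sketch of why the terminator matters: a client reading server-sent events usually stops on `data: [DONE]`, so the generator should emit it on the error path too. `failing_source` and `sse_events` are hypothetical names, and the shape of the error payload is an assumption rather than what this PR emits.

```python
import asyncio
import json
from collections.abc import AsyncIterator


async def failing_source() -> AsyncIterator[str]:
    """Hypothetical token source that fails partway through."""
    yield "partial"
    raise RuntimeError("backend went away")


async def sse_events(source: AsyncIterator[str]) -> AsyncIterator[str]:
    """Frame deltas as server-sent events and always close with [DONE]."""
    try:
        async for delta in source:
            payload = {"choices": [{"index": 0, "delta": {"content": delta}}]}
            yield f"data: {json.dumps(payload)}\n\n"
    except Exception as exc:
        # Report the failure in-band; this error shape is an assumption.
        error = {"error": {"message": str(exc), "type": "server_error"}}
        yield f"data: {json.dumps(error)}\n\n"
    # Sent on both the success and the error path so clients stop reading.
    yield "data: [DONE]\n\n"


async def collect() -> list[str]:
    return [event async for event in sse_events(failing_source())]


events = asyncio.run(collect())
```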

fix: remove misleading asyncio tags in test_serve_streaming.py

7cc02d3

Stuff was marked asyncio and async def when it didn't need to be. Moved the usage tests a little for readability. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: only include usage in stream results when requested

0e72ee3

In m serve, usage was included for backward compatibility but this is a new feature so that's not an issue. Instead the OpenAI spec is what we want to follow. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
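
A small sketch of the opt-in behaviour, assuming the request carries an optional `stream_options` mapping with an `include_usage` flag as in the OpenAI chat completions API. `final_usage_chunk` is a hypothetical helper, not a function from this PR.

```python
from typing import Any, Optional


def final_usage_chunk(
    usage: dict[str, int], stream_options: Optional[dict[str, Any]]
) -> Optional[dict[str, Any]]:
    """Hypothetical helper: build the trailing usage chunk, or None to skip it."""
    include_usage = bool(stream_options and stream_options.get("include_usage"))
    if not include_usage:
        # Default: the client did not ask for usage, so nothing is added.
        return None
    # When requested, usage travels in one extra chunk with empty choices.
    return {"object": "chat.completion.chunk", "choices": [], "usage": usage}


usage = {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}
skipped = final_usage_chunk(usage, None)
included = final_usage_chunk(usage, {"include_usage": True})
```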

fix: add missing code blocks in docs/examples/m_serve/README.md

a2515be

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: add missing dict type details

6eb326c

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: m serve streaming initial and final chunks use content=None

89ad014

Use content=None instead of content="" to be more correct with OpenAI API. Remove unneeded check for "" in the test. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>
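
To show what the change means on the wire, here is a sketch of building the first and last chunks with `content=None`, which serializes to JSON `null` rather than an empty string. `make_chunk` is a hypothetical helper and the chunk is trimmed to the fields relevant here.

```python
import json
from typing import Any, Optional


def make_chunk(
    content: Optional[str],
    role: Optional[str] = None,
    finish_reason: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build a trimmed-down chat.completion.chunk dict."""
    delta: dict[str, Any] = {"content": content}
    if role is not None:
        delta["role"] = role
    choice = {"index": 0, "delta": delta, "finish_reason": finish_reason}
    return {"object": "chat.completion.chunk", "choices": [choice]}


first = make_chunk(None, role="assistant")   # opening chunk carries the role only
last = make_chunk(None, finish_reason="stop")  # closing chunk carries the finish reason
first_json = json.dumps(first)
```

A client that concatenates `delta.content` should therefore skip `None` values instead of assuming every chunk carries a string.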

test: add test to confirm bug when mot is already computed

6459a05

Test-first for... Bug where we get into a loop without checking for this first. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: remove unused import in openai_compatible_helpers.py

e4eb62d

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: fix model comment about stream_options/include_usage default behavior

3b26d33

The default is now to only include usage when it is asked for when streaming. This is consistent with the OpenAI API spec. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: typing fix

c909cdf

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: use pydantic model for m serve stream_options

eeee6ae

Following OpenAI API spec and better for validation. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

test: add validation unit tests for StreamOptions model

f4c4932

Some validation tests including how coercion from 1 or "true" to True is handled. Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: add logging of streaming exception

fdd456f

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: handle empty string content in streaming

dfd16b5

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

fix: FancyLogger is no longer Fancy

52d1866

Signed-off-by: Mark Sturdevant <mark.sturdevant@ibm.com>

markstur force-pushed the issue_822 branch from 1533a5d to 52d1866 Compare April 16, 2026 23:22

planetf1 approved these changes Apr 17, 2026

jakelorocco approved these changes Apr 17, 2026

markstur added this pull request to the merge queue Apr 17, 2026

Merged via the queue into generative-computing:main with commit ec8cc18 Apr 17, 2026
4 checks passed

markstur deleted the issue_822 branch April 17, 2026 19:48

4 participants
