Copilot AI commented Nov 7, 2025

Fix SSE streaming format to comply with specification

  • Understand the current streaming implementation and identify issues
  • Fix SSE format in chat.go to explicitly use proper formatting
  • Fix SSE format in completion.go to explicitly use proper formatting
  • Ensure consistency across all streaming responses
  • Verify changes with format validation tests
  • Run code quality checks (go vet, gofmt)
  • Address code review feedback for error handling
  • Fix finish_reason to use pointer for proper null handling
  • Remove empty content from initial streaming chunks
  • Ensure proper stream termination with finish_reason
  • Improve code consistency
  • Move common finish reasons to constants

Complete Solution

This PR fixes the streaming response issues that were causing OpenAI API client failures:

SSE Format Fixes ✅

  1. Explicit SSE formatting: `data: {json}\n\n` with the proper `%s` format specifier
  2. Error message streaming: errors are now sent as proper JSON payloads in correct SSE format
  3. Robust error handling: explicit `json.Marshal` error checking with a plain-text fallback event

OpenAI Compliance Fixes ✅

  1. Empty finish_reason: changed to `*string` with `omitempty`

    • Streaming chunks: field omitted (null)
    • Final chunk: `"finish_reason": "stop"`
  2. Empty initial chunks: the initial message now sends only the role

    • Before: `{"delta": {"role": "assistant", "content": ""}}`
    • After: `{"delta": {"role": "assistant"}}`
  3. Proper stream termination: the final chunk has an empty delta with finish_reason

    • Before: `{"delta": {"content": ""}, "finish_reason": "stop"}`
    • After: `{"delta": {}, "finish_reason": "stop"}` + `data: [DONE]`

Code Quality Improvements ✅

  1. Constants for finish reasons: Centralized constants prevent typos and improve maintainability
    • FinishReasonStop = "stop"
    • FinishReasonToolCalls = "tool_calls"
    • FinishReasonFunctionCall = "function_call"
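The constants above, together with the "declare variable then take address" pattern the review asked for, could look like this (the `finishReasonPtr` helper is illustrative, not necessarily what the PR ships):

```go
package main

import "fmt"

// Finish reason constants, named as in the PR description.
const (
	FinishReasonStop         = "stop"
	FinishReasonToolCalls    = "tool_calls"
	FinishReasonFunctionCall = "function_call"
)

// finishReasonPtr returns a *string for a finish reason, since Go does
// not allow taking the address of an untyped constant directly.
func finishReasonPtr(reason string) *string {
	r := reason
	return &r
}

func main() {
	fmt.Println(*finishReasonPtr(FinishReasonStop)) // stop
}
```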

Streaming Flow (OpenAI Compliant)

```text
// 1. Initial chunk - only role
data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}

// 2. Content chunks - no finish_reason field
data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}
data: {"choices": [{"index": 0, "delta": {"content": " world"}}]}

// 3. Final chunk - empty delta with finish_reason
data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}
data: [DONE]
```

Files Modified

  • core/http/endpoints/openai/constants.go: New file with finish reason constants
  • core/http/endpoints/openai/chat.go: Streaming fixes + use constants
  • core/http/endpoints/openai/completion.go: Streaming fixes + use constants
  • core/http/endpoints/openai/realtime.go: Consistency updates + use constants
  • core/schema/openai.go: FinishReason schema change

Validation

✅ Passes go vet and gofmt
✅ No security vulnerabilities (CodeQL)
✅ OpenAI API specification compliant
✅ Backward compatible for non-streaming responses

Fixes #7180

Original prompt

This section details the original issue you should resolve

<issue_title>Address streaming responses incongruences</issue_title>
<issue_description>LocalAI version:
3.7.0

Environment, CPU architecture, OS, and Version:

Describe the bug
It seems that the way streaming is implemented in LocalAI breaks some clients, such as sst/opencode#3994; see also #1194

To Reproduce
N/A yet

Expected behavior
Both opencode and other software that consumes streaming tokens should work

Logs

Additional context

Comments on the Issue (you are @copilot in this section)



@netlify

netlify bot commented Nov 7, 2025

Deploy Preview for localai ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | 9c0c1ae |
| 🔍 Latest deploy log | https://app.netlify.com/projects/localai/deploys/690e5fdc1892a8000879ceab |
| 😎 Deploy Preview | https://deploy-preview-7182--localai.netlify.app |
To edit notification comments on pull requests, go to your Netlify project configuration.

Copilot AI and others added 3 commits November 7, 2025 19:14
- Replace json.Encoder with json.Marshal for explicit formatting
- Use explicit \n\n for all SSE messages (instead of relying on implicit newlines)
- Change %v to %s format specifier for proper string formatting
- Fix error message streaming to include proper SSE format
- Ensure consistency between chat.go and completion.go endpoints

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
- Handle json.Marshal errors explicitly in error response paths
- Add fallback simple error message if marshal fails
- Prevents sending 'data: <nil>' on marshal failures
- Addresses code review feedback

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Copilot AI changed the title [WIP] Address incongruences in streaming responses Fix SSE streaming format to comply with specification Nov 7, 2025
Copilot AI requested a review from mudler November 7, 2025 19:24
Copilot finished work on behalf of mudler November 7, 2025 19:24
@mudler
Copy link
Owner

mudler commented Nov 7, 2025

@copilot try to address additionally also these points:

  • Empty finish_reason: LocalAI sends "finish_reason": "" instead of "stop" or null
  • Empty initial chunks: Sends empty content chunks that cause streaming clients to terminate early
  • Missing proper stream termination: Doesn't properly signal completion to OpenCode's processor

- Change FinishReason from string to *string in Choice schema
- Streaming chunks now omit finish_reason (null) instead of empty string
- Final chunks properly set finish_reason to "stop", "tool_calls", etc.
- Remove empty content from initial streaming chunks (only send role)
- Final streaming chunk sends empty delta with finish_reason
- Addresses OpenAI API compliance issues causing client failures

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Copy link
Author

Copilot AI commented Nov 7, 2025

@copilot try to address additionally also these points:

  • Empty finish_reason: LocalAI sends "finish_reason": "" instead of "stop" or null
    ...

All three issues have been addressed in commit deefcb0:

  1. Empty finish_reason: changed the schema to use `*string` with `omitempty` - streaming chunks now omit the field (null) while final chunks properly set it to `"stop"`

  2. Empty initial chunks: the initial message now sends only the role, without a content field - removed `Content: &textContentToReturn`, which was empty

  3. Missing proper stream termination: the final chunk now sends an empty delta `{}` with finish_reason, followed by `data: [DONE]` per the OpenAI spec

The streaming format now matches OpenAI's specification:

  • Initial: `{"delta": {"role": "assistant"}}`
  • Chunks: `{"delta": {"content": "text"}}` (no finish_reason)
  • Final: `{"delta": {}, "finish_reason": "stop"}` + `data: [DONE]`

- Use consistent pattern: declare variable then take address
- Remove inline anonymous function for better readability
- Addresses code review feedback

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
```go
}
log.Error().Msgf("Stream ended with error: %v", err)

stopReason := "stop"
```
mudler (Owner) commented on these lines:

@copilot move common finish reasons to constants

Copilot AI (Author) replied:

Done in commit 9c0c1ae. Created constants.go with:

  • FinishReasonStop = "stop"
  • FinishReasonToolCalls = "tool_calls"
  • FinishReasonFunctionCall = "function_call"

All string literals replaced with these constants across chat.go, completion.go, and realtime.go.

Copilot AI requested a review from mudler November 7, 2025 20:48
Copilot finished work on behalf of mudler November 7, 2025 20:48
- Create constants.go with FinishReasonStop, FinishReasonToolCalls, FinishReasonFunctionCall
- Replace all string literals with constants in chat.go, completion.go, realtime.go
- Improves code maintainability and prevents typos

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Copilot finished work on behalf of mudler November 7, 2025 21:15
Development

Successfully merging this pull request may close these issues.

Address streaming responses incongruences
