Incorrect streaming chat responses when using Azure OpenAI client's chat completions #8084

Closed · 3 of 4 tasks
parthmishra opened this issue Feb 5, 2024 · 3 comments · Fixed by #8107
Labels
feature:st.write_stream, type:enhancement (Requests for feature enhancements or new features)

Comments


parthmishra commented Feb 5, 2024

Checklist

  • I have searched the existing issues for similar issues.
  • I added a very descriptive title to this issue.
  • I have provided sufficient information below to help reproduce this issue.

Summary

When using write_stream() with chat completions from the Azure OpenAI client (not the regular one!), the first ChatCompletionChunk has an empty choices list, because Azure sends extra Responsible AI information before the actual chat completions. When write_stream() receives this chunk, it pre-fills the chat response with the str() of the ChatCompletionChunk.

Reproducible Code Example

import streamlit as st
from openai import AzureOpenAI

client = AzureOpenAI(...)  # requires an active Azure subscription with an OpenAI model deployment

# set up per Streamlit's LLM chat tutorial; only the assistant portion is included below for brevity

with st.chat_message("assistant"):
    stream = client.chat.completions.create(model="gpt-4", messages=messages, stream=True)
    response = st.write_stream(stream)
    st.session_state.messages.append({"role": "assistant", "content": response})

Steps To Reproduce

  1. Create an Azure OpenAI model deployment for any model that supports chat completions
  2. Create a streamed response with the chat completions method
  3. Try to use write_stream() to write the stream
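
To confirm the empty first chunk independently of Streamlit, one can peek at the stream directly. This is an untested sketch reusing the client and messages from the repro above:

stream = client.chat.completions.create(model="gpt-4", messages=messages, stream=True)
first = next(stream)
print(first.choices)  # reportedly [] for Azure OpenAI, non-empty for the regular client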

Expected Behavior

A response chunk should only be written if there is actual content to display in the ChatCompletionChunk.

Current Behavior

[Screenshot (2024-02-05): the chat response pre-filled with the stringified ChatCompletionChunk, whose choices list is empty]

In the screenshot above, you can see that the choices list is empty, which it usually is not for calls made with the regular OpenAI client.

Is this a regression?

  • Yes, this used to work in a previous version.

Debug info

  • Streamlit version: 1.31.0
  • Python version: 3.11.4
  • Operating System: macOS Sonoma 14.3
  • Browser: Safari/Chrome

Additional Information

While this issue is specific to the Azure OpenAI client and should ideally be fixed upstream, I don't think it would hurt if Streamlit did a check in its write API (e.g. if len(chunk.choices) > 0:) to make sure there is actually content to write. A sketch of that check follows.
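
As a minimal sketch, assuming a hypothetical helper inside Streamlit's stream handling (the name _get_chunk_text and its surroundings are illustrative, not Streamlit's actual internals), the guard could look like this:

def _get_chunk_text(chunk) -> str:
    # Azure OpenAI may prepend a ChatCompletionChunk whose choices list
    # is empty (Responsible AI metadata); yield no text for it instead
    # of falling back to str(chunk).
    if len(chunk.choices) > 0 and chunk.choices[0].delta.content is not None:
        return chunk.choices[0].delta.content
    return ""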

I think it's possible to work around this as a user by just swallowing the first chunk of the stream (e.g. next(stream)) before sending it to write_stream(), but I haven't tried it yet. An untested sketch of that idea is below.
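
For illustration, rather than dropping exactly one chunk with next(stream), a small generator can filter out every chunk that carries no text, which also works with the regular OpenAI client. st.write_stream accepts generators, so this slots into the repro code above (untested sketch):

def content_chunks(stream):
    # Skip chunks with an empty choices list (e.g. Azure's leading
    # Responsible AI chunk) and deltas with no content.
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content is not None:
            yield chunk.choices[0].delta.content

with st.chat_message("assistant"):
    stream = client.chat.completions.create(model="gpt-4", messages=messages, stream=True)
    response = st.write_stream(content_chunks(stream))
    st.session_state.messages.append({"role": "assistant", "content": response})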

Same as this issue: microsoft/semantic-kernel#3650

See other examples of guarding against this: AbanteAI/mentat#430

@parthmishra parthmishra added status:needs-triage Has not been triaged by the Streamlit team type:bug Something isn't working labels Feb 5, 2024

github-actions bot commented Feb 5, 2024

If this issue affects you, please react with a 👍 (thumbs up emoji) to the initial post.

Your feedback helps us prioritize which bugs to investigate and address first.


@LukasMasuch LukasMasuch added status:confirmed Bug has been confirmed by the Streamlit team feature:st.write_stream and removed status:needs-triage Has not been triaged by the Streamlit team labels Feb 5, 2024
@LukasMasuch
Collaborator

@parthmishra Thanks for reporting this issue 👍 We haven't really looked into the AzureOpenAI client yet, but that's something we will likely fix for the next release.

@parthmishra
Author

@LukasMasuch thank you for looking into this!

While the issue is specific to AzureOpenAI, I think from a Streamlit perspective it makes sense to only stream a chunk if there is actual content to display in the ChatCompletionChunk, regardless of which client created it.

@LukasMasuch LukasMasuch added type:enhancement Requests for feature enhancements or new features and removed type:bug Something isn't working labels Feb 6, 2024
@streamlit streamlit deleted a comment from github-actions bot Feb 6, 2024
@sfc-gh-jcarroll sfc-gh-jcarroll removed status:confirmed Bug has been confirmed by the Streamlit team priority:P2 labels Feb 6, 2024
LukasMasuch added a commit that referenced this issue Feb 13, 2024
## Describe your changes

This PR adds support for chat streams from AzureOpenAI and implements a
couple of other related follow-up improvements.

## GitHub Issue Link (if applicable)

- Closes #8084

## Testing Plan

- Updated tests

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
zyxue pushed a commit to zyxue/streamlit that referenced this issue Apr 16, 2024 (same description as above)