Stream GPT-4, ChatGPT and GPT-3.5 responses & Deploy Streamlit apps

The code is to demonstrate usage of streaming with GPT-4 API, ChatGPT API and InstructGPT (GPT-3.5.) models & Streamlit-app.

Pre-requisites:

The approach uses only openai and time libraries and re-prints the streams using print(end='', flush=True):

!pip install --upgrade openai
import openai
import time
openai.api_key = user_secrets.get_secret("OPENAI_API_KEY")
startime = time.time()

Disclaimer: The downside of streaming in production usage is the control of appropiate usage policy: https://beta.openai.com/docs/usage-guidelines, which should be reviewed in advance for each application, so I suggest to take a look this policy prior deciding to use streaming.

How to stream GPT-4 API model (gpt-4 or gpt-4-32k & gpt-4-32k-0314) responses?

Run the file streams.ipnyb first part.

### STREAM GPT-4 API RESPONSES
delay_time = 0.01 #  faster
max_response_length = 8000
answer = ''
# ASK QUESTION
prompt = input("Ask a question: ")
start_time = time.time()

response = openai.ChatCompletion.create(
    # GPT-4 API REQQUEST
    model='gpt-4',
    messages=[
        {'role': 'user', 'content': f'{prompt}'}
    ],
    max_tokens=max_response_length,
    temperature=0,
    stream=True,  # this time, we set stream=True
)

for event in response: 
    # STREAM THE ANSWER
    print(answer, end='', flush=True) # Print the response    
    # RETRIEVE THE TEXT FROM THE RESPONSE
    event_time = time.time() - start_time  # CALCULATE TIME DELAY BY THE EVENT
    event_text = event['choices'][0]['delta'] # EVENT DELTA RESPONSE
    answer = event_text.get('content', '') # RETRIEVE CONTENT
    time.sleep(delay_time)

After inserting the user input and pressing enter, you should see the output printed:

How to stream ChatGPT API model (gpt-3.5-turbo) responses?

Run the file streams.ipnyb second part. Add user input and you should see similar to below:

### STREAM CHATGPT API RESPONSES
delay_time = 0.01 #  faster
max_response_length = 200
answer = ''
# ASK QUESTION
prompt = input("Ask a question: ")
start_time = time.time()

response = openai.ChatCompletion.create(
    # CHATPG GPT API REQQUEST
    model='gpt-3.5-turbo',
    messages=[
        {'role': 'user', 'content': f'{prompt}'}
    ],
    max_tokens=max_response_length,
    temperature=0,
    stream=True,  # this time, we set stream=True
)

for event in response: 
    # STREAM THE ANSWER
    print(answer, end='', flush=True) # Print the response    
    # RETRIEVE THE TEXT FROM THE RESPONSE
    event_time = time.time() - start_time  # CALCULATE TIME DELAY BY THE EVENT
    event_text = event['choices'][0]['delta'] # EVENT DELTA RESPONSE
    answer = event_text.get('content', '') # RETRIEVE CONTENT
    time.sleep(delay_time)

How to stream InstructGPT API model (text-davinci-003) responses?

Run the file streams.pnyb third part. Add user input and you should see similar to below:

collected_events = []
completion_text = []
speed = 0.05 #smaller is faster
max_response_length = 200
start_time = time.time()
prompt = input("Ask a question: ")
# Generate Answer
response = openai.Completion.create(
    model='text-davinci-003',
    prompt=prompt,
    max_tokens=max_response_length,
    temperature=0,
    stream=True,  # this time, we set stream=True
)

# Stream Answer
for event in response:
    event_time = time.time() - start_time  # calculate the time delay of the event
    collected_events.append(event)  # save the event response
    event_text = event['choices'][0]['text']  # extract the text
    completion_text += event_text  # append the text
    time.sleep(speed)
    print(f"{event_text}", end="", flush=True)

How to create Streamlit app with OpenAI API?

I add a working "app_streamlit.py"-file, which you can fork to your repository with the "requirements.txt" and deploy it in Streamlit.

In the advanced settings, add the OPENAI_API_KEY-variable using format:

OPENAI_API_KEY = "INSERT HERE YOUR KEY"

Suggestions and improvements

Feel free to fork and further improve the code as per the license. For example you can further improve the ChatML to ensure the flow follows desired "system" rules. I left these empty now to make this basic script very generic. I recommend to check my articles specific to ChatGPT API about streaming responses in Medium related to Streaming, ChatML: guiding prompts with system, assistant and user roles and ChatGPT API introduction tutorial.

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
app_streamlit.py		app_streamlit.py
requirements.txt		requirements.txt
streams.ipynb		streams.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

app_streamlit.py

app_streamlit.py

requirements.txt

requirements.txt

streams.ipynb

streams.ipynb

Repository files navigation

Stream GPT-4, ChatGPT and GPT-3.5 responses & Deploy Streamlit apps

Pre-requisites:

How to stream GPT-4 API model (gpt-4 or gpt-4-32k & gpt-4-32k-0314) responses?

How to stream ChatGPT API model (gpt-3.5-turbo) responses?

How to stream InstructGPT API model (text-davinci-003) responses?

How to create Streamlit app with OpenAI API?

Suggestions and improvements

About

Releases

Packages

Contributors 2

Languages

License

tmgthb/Stream-responses

Folders and files

Latest commit

History

Repository files navigation

Stream GPT-4, ChatGPT and GPT-3.5 responses & Deploy Streamlit apps

Pre-requisites:

How to stream GPT-4 API model (gpt-4 or gpt-4-32k & gpt-4-32k-0314) responses?

How to stream ChatGPT API model (gpt-3.5-turbo) responses?

How to stream InstructGPT API model (text-davinci-003) responses?

How to create Streamlit app with OpenAI API?

Suggestions and improvements

About

Topics

Resources

License

Stars

Watchers

Forks

Languages