##### Copyright 2024 Google LLC.

In [None]:
# @title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Gemini API: Streaming Quickstart

This notebook demonstrates streaming in the Python SDK. By default, the Python SDK returns a response only after the model has finished generating it. With streaming enabled, the SDK instead yields chunks of the response as soon as the model produces them.

**Download this notebook and run it locally (not in Google Colab)**

Google Colab does not handle streaming correctly yet: all the stream chunks are returned together rather than as they are generated. To see the correct behavior, download this notebook and run it locally using Jupyter.

In [None]:
!pip install -U -q google-generativeai # Install the Python SDK

In [None]:
import google.generativeai as genai

You'll need an API key stored in an environment variable to run this notebook. See the [Authentication quickstart](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) for an example.

In [None]:
import os
genai.configure(api_key=os.environ['GOOGLE_API_KEY'])
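
Indexing `os.environ` directly raises a bare `KeyError` when the variable is not set. As a minimal sketch of a friendlier check, the helper below (the name `require_env` is hypothetical and not part of the SDK) looks the variable up and fails with a more descriptive message.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    `require_env` is an illustrative helper, not part of google.generativeai.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "export it before starting Jupyter."
        )
    return value


# Possible usage in this notebook:
# genai.configure(api_key=require_env("GOOGLE_API_KEY"))
```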

## Handle streaming responses

To stream responses, use [`GenerativeModel.generate_content(..., stream=True)`](https://ai.google.dev/api/python/google/generativeai/GenerativeModel#generate_content).

**Note**: This cell runs with a Google Colab runtime, but does not properly show streaming due to implementation details of Colab runtimes.

In [None]:
model = genai.GenerativeModel('gemini-1.5-flash-latest')
# With stream=True, the response is iterated chunk by chunk as the model generates it
response = model.generate_content("Write a cute story about cats.", stream=True)
for chunk in response:
    print(chunk.text)
    print("_"*80)
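
A common follow-up is to show the text as it arrives and also keep the complete result. The sketch below illustrates that accumulation pattern without calling the API: `FakeChunk` and `fake_stream` are hypothetical stand-ins that model only the `.text` attribute used above, so the same loop could be pointed at a real streamed response.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed chunk; only `.text` is modeled."""
    text: str


def fake_stream() -> Iterator[FakeChunk]:
    """Yield a few chunks, mimicking a streamed response (illustrative data)."""
    for piece in ["Once upon a time, ", "a kitten found ", "a ball of yarn."]:
        yield FakeChunk(piece)


def collect_text(chunks: Iterable[FakeChunk]) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        # end="" keeps the chunks on one line; flush shows them immediately
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()  # Finish the line once the stream is exhausted
    return "".join(parts)


full_text = collect_text(fake_stream())
```

Collecting the pieces in a list and joining once at the end avoids repeated string concatenation inside the loop.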

## Handle streaming responses asynchronously

To stream responses asynchronously, use [`GenerativeModel.generate_content_async(..., stream=True)`](https://ai.google.dev/api/python/google/generativeai/GenerativeModel#generate_content_async).

**Note**: These cells do NOT work with a Google Colab runtime, but do work in a local Jupyter notebook.

In [None]:
async for chunk in await model.generate_content_async("Write a cute story about cats.", stream=True):
    if chunk.text:
        print(chunk.text)
    print("_"*80)

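Here's a simple example of two asynchronous functions running concurrently on the same event loop: while `get_response` waits for the next chunk, `something_else` keeps printing.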

In [None]:
import asyncio

async def get_response():
    async for chunk in await model.generate_content_async("Write a cute story about cats.", stream=True):
        if chunk.text:
            print(chunk.text)
        print("_"*80)

async def something_else():
    for i in range(5):
        print("==========not blocked!==========")
        await asyncio.sleep(3)

async def async_demo():
    # Create tasks
    task1 = asyncio.create_task(get_response())
    task2 = asyncio.create_task(something_else())

    # Wait for tasks to complete
    await asyncio.gather(task1, task2)

# Jupyter notebooks handle event loops for you, so await directly
await async_demo()
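
Top-level `await` works in Jupyter because a notebook already has a running event loop. In a plain Python script there is none, so the entry point would go through `asyncio.run` instead. The sketch below shows that variant under stated assumptions: `fake_stream` is a hypothetical async generator standing in for the streamed model response, so the example runs without an API key.

```python
import asyncio


async def fake_stream(pieces, delay=0.01):
    """Hypothetical stand-in for a streamed response: yields pieces with a pause."""
    for piece in pieces:
        await asyncio.sleep(delay)  # Simulates waiting for the next chunk
        yield piece


async def consume(events):
    """Record each chunk as it arrives."""
    async for piece in fake_stream(["a", "b", "c"]):
        events.append(("chunk", piece))


async def heartbeat(events):
    """Record ticks while the stream is still being consumed."""
    for i in range(3):
        events.append(("tick", i))
        await asyncio.sleep(0.01)


async def main():
    events = []
    # gather runs both coroutines concurrently on one event loop
    await asyncio.gather(consume(events), heartbeat(events))
    return events


# In a script, asyncio.run creates the event loop and runs main() to completion
events = asyncio.run(main())
print(events)
```

The exact interleaving of chunks and ticks depends on timing, but each sequence keeps its own order.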