# Basics of SUTRA Streaming with OpenAI Client


<img src="https://play-lh.googleusercontent.com/_O9p4Z4yucA2NLmZBu9mTJCuBwXeT9NcbtrDN6I8gKlkIPRySV0adOmbyipjSj9Gew" width="120">

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zWzkMPyy22J98U4OBZIz_xinwhw8cPV_?usp=sharing)

## Introduction

This beginner-friendly notebook shows you how to use SUTRA's streaming feature with the OpenAI client. With streaming, the response arrives in small chunks (usually a few tokens at a time) that you can display as they are generated, instead of waiting for the complete response.

### Why Use Streaming?

- **Feels faster**: Users see text appearing immediately
- **More interactive**: Creates a more natural conversation experience
- **Better for long responses**: Users can start reading while the rest is being generated
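
To make the "feels faster" point concrete, here is a minimal sketch that measures the time to the first chunk against the time to the full response. It does not call SUTRA: `fake_stream` is a hypothetical stand-in that yields words with an artificial delay, and `measure_latency` is an illustrative helper, not part of any library.

```python
import time

def fake_stream(words, delay=0.01):
    """Hypothetical stand-in for a streaming API: yields one word per delay."""
    for word in words:
        time.sleep(delay)
        yield word + " "

def measure_latency(stream):
    """Return (seconds to first chunk, seconds to last chunk, full text)."""
    start = time.perf_counter()
    first_chunk_at = None
    parts = []
    for chunk in stream:
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter() - start
        parts.append(chunk)
    total = time.perf_counter() - start
    return first_chunk_at, total, "".join(parts)

words = "Streaming shows text while the rest is still being generated".split()
first, total, text = measure_latency(fake_stream(words))
print(f"First chunk after {first:.3f}s, full response after {total:.3f}s")
```

With a real model the gap is usually much larger, because a long answer can take several seconds to finish while the first tokens arrive almost immediately.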

## Get Your API Keys

Before you begin, make sure you have:

1. A SUTRA API key (Get yours at [TWO AI's SUTRA API page](https://www.two.ai/sutra/api))
2. Basic familiarity with Python and Jupyter notebooks

This notebook is designed to run in Google Colab, so no local Python installation is required.

## Setup

Install necessary packages and set up the environment.

In [None]:
# Install the OpenAI library
!pip install -q openai

### Import necessary libraries


In [None]:
# Import necessary libraries
import os
import json
import re
from openai import OpenAI
from google.colab import userdata

### Authentication

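Set up authentication using Colab secrets. Store your key in Colab's Secrets panel (the key icon in the left sidebar) under the name `SUTRA_API_KEY`, enable notebook access for it, and read it with `userdata.get`.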

In [None]:
# Get API key from Colab secrets
api_key = userdata.get("SUTRA_API_KEY")
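
If you run this notebook outside Colab, `google.colab` cannot be imported. The sketch below is one possible fallback, assuming you have exported the key as an environment variable with the same name; `load_api_key` is a hypothetical helper, not part of Colab or the OpenAI library.

```python
import os

def load_api_key(name="SUTRA_API_KEY"):
    """Read the key from Colab secrets if available, else from the environment."""
    try:
        from google.colab import userdata  # only importable inside Colab
        key = userdata.get(name)
    except Exception:
        # Not running in Colab, or the secret is missing or not shared
        # with this notebook.
        key = None
    key = key or os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set in Colab secrets or the environment.")
    return key

# Usage (assumes the key is configured in one of the two places):
# api_key = load_api_key()
```

Raising early with a clear message is easier to debug than an authentication error on the first request.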

### Initialize the client with SUTRA's API endpoint

In [None]:
# Initialize the client
client = OpenAI(
        base_url='https://api.two.ai/v2',
        api_key=api_key
    )

OpenAI client initialized successfully.


## 2. Basic Streaming - Your First Example

Let's create a simple function to stream responses from Sutra.

In [None]:
def stream_response(prompt):
    print("Streaming response...")

    stream = client.chat.completions.create(
        model="sutra-v2",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
        max_tokens=500,
        stream=True
    )

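    full_response = ""
    for chunk in stream:
        # Some OpenAI-compatible servers send chunks with an empty `choices` list,
        # so check it before indexing.
        if chunk.choices and chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            full_response += content
            # flush=True shows each piece immediately instead of buffering it
            print(content, end="", flush=True)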

    print("\n\n--- Done! ---")
    return full_response
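
To see what the loop above is doing without making a network call, the following sketch builds objects shaped like stream chunks and joins their text. `fake_chunk` and `collect_text` are hypothetical helpers, and the chunk shapes are an assumption modeled on OpenAI-style streams; real chunks carry more fields.

```python
from types import SimpleNamespace

def fake_chunk(content=None, with_choice=True):
    """Build an object shaped like a chat-completion stream chunk (illustrative)."""
    if not with_choice:
        return SimpleNamespace(choices=[])
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def collect_text(stream):
    """Join the text deltas of a stream, skipping chunks that carry no text."""
    parts = []
    for chunk in stream:
        if not chunk.choices:      # e.g. a metadata-only chunk
            continue
        content = chunk.choices[0].delta.content
        if content:                # skips both None and ""
            parts.append(content)
    return "".join(parts)

chunks = [
    fake_chunk(""),                # opening chunk with no text yet
    fake_chunk("Hello"),
    fake_chunk(", world"),
    fake_chunk(with_choice=False), # chunk with no choices at all
    fake_chunk(None),              # closing chunk with no text
]
print(collect_text(chunks))
```

Collecting the pieces in a list and joining once at the end is equivalent to the `+=` used above; for short replies the difference is negligible.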

### Now let's try it with a simple question

In [None]:
# Try our streaming function with a simple prompt
prompt = "Explain what artificial intelligence is to a 10-year old."
response = stream_response(prompt)

Streaming response...
Artificial intelligence, or AI, is like a smart robot or computer that can think and learn a bit like humans do. Imagine you have a toy that can listen to you and answer your questions or play games with you. That’s similar to what AI does!

AI can help with many things, like telling you the weather, helping doctors find out what's wrong with patients, or even making video games more fun by acting like real players. It learns from examples and information, so the more it practices, the better it gets at doing tasks, just like you get better at soccer the more you play!

--- Done! ---
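
Because `max_tokens=500` caps the reply, a long answer can stop mid-sentence. In OpenAI-style streams the last chunk reports why generation ended in `finish_reason` (`"stop"` for a natural end, `"length"` when the token limit was hit). The sketch below assumes SUTRA's endpoint follows the same convention, which this notebook does not confirm; `fake_chunk` and `stream_with_finish_reason` are hypothetical helpers.

```python
from types import SimpleNamespace

def fake_chunk(content, finish_reason=None):
    """Build an object shaped like a stream chunk (illustrative only)."""
    delta = SimpleNamespace(content=content)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])

def stream_with_finish_reason(stream):
    """Return (text, finish_reason) for a stream of chunks."""
    parts, finish_reason = [], None
    for chunk in stream:
        if not chunk.choices:
            continue
        choice = chunk.choices[0]
        if choice.delta.content:
            parts.append(choice.delta.content)
        if choice.finish_reason is not None:
            finish_reason = choice.finish_reason
    return "".join(parts), finish_reason

text, reason = stream_with_finish_reason([
    fake_chunk("AI learns from examples. "),
    fake_chunk("The"),
    fake_chunk(None, finish_reason="length"),
])
if reason == "length":
    print("Reply was cut off by max_tokens; consider raising the limit.")
```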


## 3. Having a Conversation

Now let's try having a back-and-forth conversation with streaming responses.

In [None]:
def chat_with_streaming(messages):
    """Have a conversation with streaming responses"""
    print("Streaming response...")

    stream = client.chat.completions.create(
        model="sutra-v2",
        messages=messages,  # Pass the entire conversation history
        temperature=0.7,
        max_tokens=500,
        stream=True
    )

    full_response = ""
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content is not None:
            content = chunk.choices[0].delta.content
            full_response += content
            print(content, end="", flush=True)

    print("\n\n--- Done! ---")
    return full_response

In [None]:
# Start a conversation
conversation = [
    {"role": "user", "content": "Tell me about some popular Indian street foods."}
]

# First message
print("You: Tell me about some popular Indian street foods.")
print("\nSutra: ", end="")
response = chat_with_streaming(conversation)

# Add the response to our conversation history
conversation.append({"role": "assistant", "content": response})

# Second message - follow up question
conversation.append({"role": "user", "content": "Which one of these is your favorite and why?"})

print("\nYou: Which one of these is your favorite and why?")
print("\nSutra: ", end="")
response = chat_with_streaming(conversation)

You: Tell me about some popular Indian street foods.

Sutra: Streaming response...
Indian street food is diverse and reflects the country's rich culinary heritage. Here are some popular options:

1. **Pani Puri**: Also known as Golgappa or Phuchka, this dish consists of hollow, crispy puris filled with a spicy mixture of tamarind water, chickpeas, and potatoes.

2. **Bhel Puri**: A savory snack made from puffed rice, vegetables, and tangy chutneys. It’s often garnished with sev (crispy noodles) and coriander.

3. **Vada Pav**: A popular Mumbai street food, it features a spicy potato fritter (vada) placed in a bun (pav), often served with chutneys and fried green chilies.

4. **Chaat**: This term encompasses various snacks, including Aloo Tikki Chaat (spicy potato patties with yogurt and chutneys) and Dahi Puri (small puris filled with yogurt and spices).

5. **Samosa**: Triangular pastries filled with spiced potatoes, peas, and sometimes meat, usually deep-fried until golden brown. The
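
Since the whole conversation is sent with every request, the history grows with each turn. One common way to keep it bounded is to keep any system messages plus only the most recent turns. The sketch below is an assumption about how you might do this, not something SUTRA requires; `trim_history` is a hypothetical helper.

```python
def trim_history(messages, max_messages=6):
    """Keep system messages plus the most recent `max_messages` other messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    recent = rest[-max_messages:] if max_messages > 0 else []
    return system + recent

history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(10):
    role = "user" if i % 2 == 0 else "assistant"
    history.append({"role": role, "content": f"message {i}"})

trimmed = trim_history(history, max_messages=4)
print([m["content"] for m in trimmed])
```

You could pass `trim_history(conversation)` to `chat_with_streaming` instead of the full list; the trade-off is that the model no longer sees the dropped turns.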

## 4. Simple Word Counter

Let's create a simple example that streams a response and then reports its total word count once streaming finishes.

In [None]:
def simple_stream_with_word_count(prompt):
    """Stream response and display word count"""
    print("Streaming response...")

    stream = client.chat.completions.create(
        model="sutra-v2",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,
        max_tokens=500,
        stream=True
    )

    full_response = ""

    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content is not None:
            content = chunk.choices[0].delta.content
            full_response += content
            print(content, end="", flush=True)

    words = full_response.split()
    print(f"\n\n[Total words: {len(words)}]")
    print("--- Done! ---")
    return full_response

### Try the word counter


In [None]:
# Try the word counter
prompt = "Write a short paragraph about the importance of clean water."
response = simple_stream_with_word_count(prompt)

Streaming response...
Clean water is essential for sustaining life, as it is crucial for human health, agriculture, and ecosystem balance. Access to safe drinking water prevents waterborne diseases, supports bodily functions, and contributes to overall well-being. In agriculture, clean water is vital for irrigation and livestock, ensuring food security and economic stability. Furthermore, healthy aquatic ecosystems rely on clean water to thrive, maintaining biodiversity and supporting various species. The importance of clean water underscores the need for sustainable management practices to protect this precious resource for future generations.

[Total words: 86]
--- Done! ---
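
The function above counts words only after the stream ends. If you want a running count while chunks arrive, you have to handle words that are split across two chunks. The sketch below shows one way to do that; `StreamingWordCounter` is a hypothetical helper, and the chunks are made up for illustration.

```python
class StreamingWordCounter:
    """Count whitespace-separated words across chunks, even when a word is split."""

    def __init__(self):
        self.count = 0
        self._in_word = False  # True while the last character seen was part of a word

    def feed(self, text):
        """Process one chunk and return the running word count."""
        for ch in text:
            if ch.isspace():
                self._in_word = False
            elif not self._in_word:
                # First non-space character after whitespace starts a new word.
                self._in_word = True
                self.count += 1
        return self.count

counter = StreamingWordCounter()
chunks = ["Clean wa", "ter is ", "essential", "."]
for chunk in chunks:
    counter.feed(chunk)
print(counter.count)  # same result as len("".join(chunks).split())
```

Inside a streaming loop you would call `counter.feed(content)` for each chunk and display `counter.count` wherever you like.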


## 5. Handling Errors

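Let's create a simple function that handles errors when streaming: if the streaming request fails, it falls back to a regular (non-streaming) request, and if that also fails, it returns a friendly message.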

In [None]:
def safe_stream(prompt):
    """Stream response with simple fallback on error"""
    try:
        print("Streaming response...")
        stream = client.chat.completions.create(
            model="sutra-v2",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,
            max_tokens=500,
            stream=True
        )

        full_response = ""
        for chunk in stream:
            if chunk.choices and chunk.choices[0].delta.content:
                content = chunk.choices[0].delta.content
                full_response += content
                print(content, end="", flush=True)

        print("\n\n--- Done! ---")
        return full_response

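    except Exception as error:
        # Catch Exception rather than using a bare except, so Ctrl+C still interrupts
        print(f"\nError during streaming ({error}). Falling back...")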
        try:
            response = client.chat.completions.create(
                model="sutra-v2",
                messages=[{"role": "user", "content": prompt}],
                temperature=0.7,
                max_tokens=500
            )
            result = response.choices[0].message.content
            print(result)
            print("\n--- Done (fallback) ---")
            return result
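        except Exception as error:
            # Both attempts failed; report why and return a friendly message
            print(f"Fallback also failed: {error}")
            return "Sorry, something went wrong. Please try again."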
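
Falling back once is a reasonable start. For temporary problems such as a dropped connection, another common pattern is to retry with an increasing delay. The sketch below is a generic illustration; `with_retries` is a hypothetical helper, and `flaky` simulates a call that fails twice before succeeding.

```python
import time

def with_retries(call, attempts=3, base_delay=0.5):
    """Run call(); on failure wait base_delay * 2**n seconds and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return call()
        except Exception as error:  # narrow this to the errors you expect
            last_error = error
            if attempt < attempts - 1:
                time.sleep(base_delay * 2 ** attempt)
    raise last_error

calls = {"n": 0}

def flaky():
    """Simulated request: fails on the first two calls, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"

result = with_retries(flaky, attempts=3, base_delay=0.01)
print(result, "after", calls["n"], "calls")
```

You could wrap a request as `with_retries(lambda: stream_response(prompt))`. Keep in mind that a retry after a partially printed stream will print the beginning of the answer again.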

### Try our error-handling function


In [None]:
# Try our error-handling function
prompt = "What is the Transformer architecture, in simple terms?"
response = safe_stream(prompt)

Streaming response...
Transformer architecture is a type of neural network design primarily used for processing sequential data, such as text. Here are the main components explained simply:

1. **Attention Mechanism**: Instead of processing data in order (like traditional models), transformers use attention to weigh the importance of different words in a sentence, allowing the model to focus on relevant parts of the input regardless of their position.

2. **Self-Attention**: This is a specific type of attention where the model looks at all words in a sentence to understand how they relate to each other. For example, in the sentence "The cat sat on the mat," the model can learn that "cat" and "sat" are related.

3. **Positional Encoding**: Since transformers don't process data sequentially, they need a way to understand the order of words. Positional encodings are added to the input data to provide information about the position of each word in the sequence.

4. **Multi-Head Attention**

## 6. Practical Example: Interactive Story

Let's create a simple interactive story generator that you can try out.

In [None]:
def interactive_story():
    """Simple interactive story generator"""
    print("=== Interactive Story Generator ===\n")
    print("Create a short story with your input.")
    print("You can guide the direction after each paragraph.\n")

    # Ask for the initial story idea
    story_idea = input("What should the story be about? ")

    # Initialize the conversation
    conversation = [
        {"role": "system", "content": "You are a creative storyteller. Write vivid and engaging stories one paragraph at a time."},
        {"role": "user", "content": f"Write the first paragraph of a short story about {story_idea}. Keep it to 3–5 sentences."}
    ]

    # Generate the story in 3 parts
    for i in range(3):
        print(f"\n--- Paragraph {i + 1} ---\n")

        # Generate and display the paragraph
        response = chat_with_streaming(conversation)
        conversation.append({"role": "assistant", "content": response})

        # Ask for user input to guide the next part
        if i < 2:
            direction = input("\nWhat should happen next? ")
            conversation.append({
                "role": "user",
                "content": f"Continue the story with this idea: {direction}. Keep it to one paragraph (3–5 sentences)."
            })

    print("\n=== Story Complete! ===")
    return "Story complete!"

### Try the interactive story generator


In [None]:
interactive_story()

=== Interactive Story Generator ===

Create a short story with your input.
You can guide the direction after each paragraph.

What should the story be about? A lost cat in a big city

--- Paragraph 1 ---

Streaming response...
Whiskers, a fluffy tabby with emerald green eyes, found herself perched on the edge of a bustling sidewalk in downtown Metroville, her heart racing amidst the cacophony of honking cars and chattering pedestrians. Just hours before, she had been curled up in the cozy warmth of her owner’s lap, but a curious flicker of movement had lured her out of the open window, leading to her unplanned adventure. The towering skyscrapers loomed like giants overhead, casting long shadows that seemed to swallow her whole as she took a hesitant step into a world filled with strange smells and unfamiliar sounds. With each cautious pawstep, Whiskers felt both exhilarated and frightened, an explorer in a vast urban jungle, desperately searching for a familiar face in the sea of stran

'Story complete!'
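
Because `interactive_story` calls `input` and the API directly, it is hard to try out without a keyboard and a network connection. The sketch below shows the same loop with both dependencies passed in as functions, so you can run it with scripted answers. `run_story` and its parameters are hypothetical, and the prompts are shortened for illustration.

```python
def run_story(ask, generate, parts=3):
    """Story loop with injected I/O.

    ask(prompt) -> str         supplies user input (for example, the built-in input)
    generate(messages) -> str  returns the next paragraph (for example, chat_with_streaming)
    """
    idea = ask("What should the story be about? ")
    messages = [
        {"role": "system", "content": "You are a creative storyteller."},
        {"role": "user", "content": f"Write the first paragraph of a short story about {idea}."},
    ]
    paragraphs = []
    for i in range(parts):
        paragraph = generate(messages)
        paragraphs.append(paragraph)
        messages.append({"role": "assistant", "content": paragraph})
        if i < parts - 1:
            direction = ask("What should happen next? ")
            messages.append(
                {"role": "user", "content": f"Continue the story with this idea: {direction}."}
            )
    return paragraphs

# Scripted stand-ins so the loop runs without a network call or keyboard input.
answers = iter(["a lost cat", "she meets a pigeon", "she finds her way home"])
story = run_story(
    ask=lambda prompt: next(answers),
    generate=lambda messages: f"(paragraph written from {len(messages)} messages)",
)
print(len(story))
```

In the notebook you would call `run_story(input, chat_with_streaming)` to get the interactive behavior back.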

## 7. Conclusion

Congratulations! You've learned the basics of using Sutra's streaming capabilities with the OpenAI client. Here's what we covered:

1. **Basic setup** - How to connect to Sutra using the OpenAI client
2. **Simple streaming** - Getting responses chunk by chunk as they are generated
3. **Conversations** - Having back-and-forth chats with streaming
4. **Word counting** - A simple example of processing streamed content
5. **Error handling** - Making your code more robust
6. **Interactive stories** - A fun application of streaming


Happy coding!

## 8. Additional Resources

- [Sutra Documentation](https://docs.sutra.ai)
- [OpenAI API Documentation](https://platform.openai.com/docs)
- [OpenAI Streaming Guide](https://platform.openai.com/docs/api-reference/streaming)

For more examples, check out other notebooks in the Sutra Cookbooks repository.