# Streaming Responses with HelpingAI 🌊

Learn how to implement real-time streaming responses for a better user experience. Streaming is especially useful for long-form content generation and interactive applications.

## 🎯 What You'll Learn
- Basic streaming implementation
- Handling streaming data
- Building real-time interfaces
- Error handling in streams
- Advanced streaming patterns

In [None]:
import os
import time
from HelpingAI import HAI

os.environ["HAI_API_KEY"] = "hl-*******************"
hai = HAI()

print("🌊 Ready to explore streaming responses!")

🌊 Ready to explore streaming responses!


## 🚀 Basic Streaming

Let's start with a simple streaming example.

In [2]:
def basic_streaming_example():
    """Basic streaming response example"""
    print("🌊 Basic Streaming Example:")
    print("=" * 40)
    print("AI Response (streaming): ", end="")
    
    # Create streaming response
    stream = hai.chat.completions.create(
        model="Dhanishtha-2.0-preview",
        messages=[
            {"role": "user", "content": "Tell me a short story about a robot learning to paint."}
        ],
        stream=True,
        temperature=0.8,
        max_tokens=500
    )
    
    # Process the stream
    full_response = ""
    for chunk in stream:
        if chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            print(content, end="", flush=True)
            full_response += content
            time.sleep(0.02)  # Small delay to simulate real-time typing
    
    print("\n\n" + "=" * 40)
    print(f"✅ Streaming complete! Total characters: {len(full_response)}")
    return full_response

# Run basic streaming example
story = basic_streaming_example()

🌊 Basic Streaming Example:
AI Response (streaming): <think>
A robot learning to paint is a fascinating idea. I want to capture both the technical challenge and the emotional journey—how the robot not only learns the mechanics but also begins to feel something through art. The story should show growth, curiosity, and maybe a touch of wonder.
</think>

  
<think>
I also want to highlight the robot’s perspective: how it feels excitement, frustration, and eventually, joy. The story should feel personal, not just a technical manual. I’ll focus on a small, meaningful moment where the robot’s learning becomes an emotional experience.
</think>

  
Roxy, a robot with sharp eyes and nimble fingers, was programmed to paint landscapes, but the colors always looked too perfect, too lifeless. One day, while experimenting after hours, she began mixing red and blue just for fun. She dipped her brush, hesitated, and then touched the canvas with gentle pressure.

A spark of something—like curiosity—flic
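
The loop above reads `chunk.choices[0].delta.content` directly. As a minimal sketch, the same extraction can be wrapped in a small generator that skips chunks without text. `iter_text` and `fake_chunk` are hypothetical names introduced here, and the assumption that some chunks may arrive with no choices or an empty delta is a defensive guess, not documented behaviour. The fake chunks only mimic the attribute layout used in this notebook so the example runs offline.

```python
from types import SimpleNamespace


def iter_text(stream):
    """Yield each non-empty text fragment from a stream of chat chunks."""
    for chunk in stream:
        # Guard before indexing: a chunk might carry no choices or no text.
        choices = getattr(chunk, "choices", None)
        if not choices:
            continue
        delta = getattr(choices[0], "delta", None)
        content = getattr(delta, "content", None)
        if content:
            yield content


def fake_chunk(text):
    """Build a stand-in chunk with the same attribute layout (demo only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Offline demo: two text chunks, one empty delta, one chunk without choices.
fake_stream = [
    fake_chunk("Hello"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk(" world"),
]
collected = "".join(iter_text(fake_stream))
print(collected)
```

With a real stream, `for text in iter_text(stream): print(text, end="", flush=True)` would replace the inner `if` in the loop above.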

## 🧠 Streaming with Dhanishtha 2.0 Thinking

See how the model's thinking process streams in real time.

In [3]:
def streaming_with_thinking():
    """Stream Dhanishtha 2.0 responses with thinking process"""
    print("🧠 Streaming with Thinking Process:")
    print("=" * 50)
    
    stream = hai.chat.completions.create(
        model="Dhanishtha-2.0-preview",
        messages=[
            {
                "role": "user", 
                "content": "Solve this step by step: If a pizza is cut into 8 equal slices and 3 people eat 2 slices each, what fraction of the pizza is left?"
            }
        ],
        stream=True,
        hide_think=False,  # Show thinking process
        temperature=0.3
    )
    
    full_response = ""
    in_thinking = False
    
    for chunk in stream:
        if chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            
            # Detect thinking blocks (assumes each tag arrives whole within one chunk)
            if "<think>" in content:
                in_thinking = True
                print("\n🤔 [THINKING] ", end="")
                content = content.replace("<think>", "")
            
            if "</think>" in content:
                in_thinking = False
                content = content.replace("</think>", "")
                print("\n\n💡 [SOLUTION] ", end="")
            
            # Color coding for different sections
            if in_thinking:
                print(f"\033[94m{content}\033[0m", end="", flush=True)  # Blue for thinking
            else:
                print(content, end="", flush=True)  # Normal for solution
            
            full_response += content
            time.sleep(0.03)
    
    print("\n\n" + "=" * 50)
    return full_response

# Run thinking stream example
math_solution = streaming_with_thinking()

🧠 Streaming with Thinking Process:

🤔 [THINKING] 
First, I need to figure out how many slices are eaten in total. Each of the 3 people eats 2 slices, so that's 3 × 2 = 6 slices eaten. The pizza was cut into 8 equal slices to start with.

💡 [SOLUTION] 

So, 6 slices have been eaten out of 8.


🤔 [THINKING] 
Now, to find the fraction of the pizza left, I need to

## 🎮 Interactive Streaming Interface

Build an interactive streaming chat interface.

In [4]:
class StreamingChatInterface:
    def __init__(self, model="Dhanishtha-2.0-preview"):
        self.hai = HAI()
        self.model = model
        self.conversation = []
        self.system_message = {
            "role": "system",
            "content": "You are a helpful and friendly AI assistant. Provide engaging and informative responses."
        }
    
    def stream_response(self, user_message, show_typing=True):
        """Stream a response to user message"""
        # Add user message to conversation
        self.conversation.append({"role": "user", "content": user_message})
        
        # Prepare messages for API
        messages = [self.system_message] + self.conversation
        
        print(f"👤 You: {user_message}")
        print("🤖 AI: ", end="")
        
        if show_typing:
            # Simulate a typing indicator, then overwrite the dots
            for _ in range(3):
                print(".", end="", flush=True)
                time.sleep(0.5)
            print("\r🤖 AI:    \r🤖 AI: ", end="", flush=True)
        
        # Stream the response
        stream = self.hai.chat.completions.create(
            model=self.model,
            messages=messages,
            stream=True,
            temperature=0.7
        )
        
        assistant_response = ""
        for chunk in stream:
            if chunk.choices[0].delta.content:
                content = chunk.choices[0].delta.content
                print(content, end="", flush=True)
                assistant_response += content
                time.sleep(0.02)
        
        print("\n")
        
        # Add assistant response to conversation
        self.conversation.append({"role": "assistant", "content": assistant_response})
        
        return assistant_response
    
    def get_conversation_summary(self):
        """Get a summary of the conversation"""
        total_messages = len(self.conversation)
        user_messages = len([msg for msg in self.conversation if msg["role"] == "user"])
        assistant_messages = len([msg for msg in self.conversation if msg["role"] == "assistant"])
        
        return {
            "total_messages": total_messages,
            "user_messages": user_messages,
            "assistant_messages": assistant_messages
        }

# Create streaming chat interface
chat = StreamingChatInterface()

print("💬 Interactive Streaming Chat Demo:")
print("=" * 50)

# Simulate a conversation
demo_messages = [
    "Hello! Can you help me understand what makes a good story?",
    "That's helpful! Can you give me an example of a compelling character?",
    "Great example! How important is the setting in storytelling?"
]

for message in demo_messages:
    chat.stream_response(message)
    print("-" * 30)

# Show conversation summary
summary = chat.get_conversation_summary()
print(f"\n📊 Conversation Summary: {summary}")

💬 Interactive Streaming Chat Demo:
👤 You: Hello! Can you help me understand what makes a good story?
🤖 AI: .

🤖 AI: <think>
This is a question about storytelling elements. I should consider both technical aspects (structure, character development, plot) and emotional components (resonance, impact, meaning). Good stories connect with people on multiple levels, so I'll need to address both craft and emotional intelligence in my response.
</think>

<ser>
Emotion ==> curiosity, interest in learning
Cause ==> desire to understand the craft of storytelling
Mind ==> seeking knowledge, possibly for creative purposes
Growth ==> providing insights that balance technical and emotional aspects of storytelling
</ser>

That's a wonderful question about the heart of storytelling! A good story has several essential elements that work together like a well-conducted orchestra.

First, there's character development. Readers need to care about the people in your story, to feel their joys and sorrows. The best characters have contradictions, growth, and moments where we see ourselves in them.

<think>
I should als
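
The output above shows `<think>` and `<ser>` blocks arriving as part of the streamed text, which means `StreamingChatInterface` also stores them in `self.conversation`. Whether reasoning should be sent back on later turns is not stated in this notebook. If only the visible answer is wanted in the history, a sketch like the following could be applied before appending. `strip_reasoning` is a hypothetical helper, and it only removes blocks that are properly closed.

```python
import re

# Matches <think>...</think> and <ser>...</ser> blocks, including newlines.
REASONING_BLOCK = re.compile(r"<(think|ser)>.*?</\1>", re.DOTALL)


def strip_reasoning(text):
    """Remove complete reasoning blocks and tidy the leftover blank lines."""
    visible = REASONING_BLOCK.sub("", text)
    # Collapse the runs of blank lines left behind by the removed blocks.
    visible = re.sub(r"\n{3,}", "\n\n", visible)
    return visible.strip()


raw = (
    "<think>\nplan the reply\n</think>\n\n"
    "<ser>\nEmotion ==> curiosity\n</ser>\n\n"
    "A good story needs characters readers care about."
)
clean = strip_reasoning(raw)
print(clean)
```

A block that was cut off before its closing tag (for example by a token limit) is left untouched, so truncated output stays visible instead of silently disappearing.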

## 🛡️ Error Handling in Streaming

Robust error handling for streaming responses.

In [5]:
from HelpingAI import HAIError, RateLimitError, AuthenticationError

def robust_streaming(prompt, max_retries=3):
    """Streaming with comprehensive error handling"""
    print(f"🛡️ Robust Streaming: {prompt[:50]}...")
    print("=" * 50)
    
    for attempt in range(max_retries):
        try:
            print(f"Attempt {attempt + 1}: ", end="")
            
            stream = hai.chat.completions.create(
                model="Dhanishtha-2.0-preview",
                messages=[{"role": "user", "content": prompt}],
                stream=True,
                temperature=0.7,
                max_tokens=300
            )
            
            response_parts = []
            
            for chunk in stream:
                try:
                    if chunk.choices[0].delta.content:
                        content = chunk.choices[0].delta.content
                        print(content, end="", flush=True)
                        response_parts.append(content)
                        time.sleep(0.02)
                
                except Exception as chunk_error:
                    print(f"\n⚠️ Chunk error: {chunk_error}")
                    continue
            
            print("\n✅ Streaming completed successfully!")
            return "".join(response_parts)
        
        except RateLimitError:
            print(f"\n⏰ Rate limit hit on attempt {attempt + 1}")
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                print(f"Waiting {wait_time} seconds before retry...")
                time.sleep(wait_time)
        
        except AuthenticationError:
            print("\n❌ Authentication error - check your API key")
            break
        
        except HAIError as e:
            print(f"\n🚨 API error on attempt {attempt + 1}: {e}")
            if attempt < max_retries - 1:
                print("Retrying...")
                time.sleep(1)
        
        except Exception as e:
            print(f"\n💥 Unexpected error: {e}")
            break
    
    print("❌ Streaming failed: retries exhausted or a non-retryable error occurred")
    return None

# Test robust streaming
result = robust_streaming("Write a haiku about artificial intelligence and creativity.")

🛡️ Robust Streaming: Write a haiku about artificial intelligence and cr...
Attempt 1: <think>
A haiku about AI and creativity needs to balance the mechanical, logical nature of AI with the human, imaginative spark of creativity. I want to capture the tension and possibility between the two—how AI can assist or even challenge human creativity, rather than replace it.
</think>

  
<think>
Emotionally, there’s a sense of awe and maybe a bit of apprehension about AI’s growing role in creative fields. I want the haiku to acknowledge that creativity is deeply human, but also open to new possibilities when AI is involved. The poem should feel thoughtful, not just technical.
</think>

  
Code weaves new light,  
Machines learn to dream in code—  
Creativity’s wings.
✅ Streaming completed successfully!
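
The retry loop above waits `2 ** attempt` seconds after a rate limit. A common refinement is to add random jitter so that many clients do not retry at the same instant. The sketch below shows one way to do that in isolation. `backoff_delays` and `call_with_retries` are hypothetical helpers, the `base` and `cap` values are arbitrary, and the demo uses a fake operation and a recorded `sleep` so it runs instantly and offline.

```python
import random
import time


def backoff_delays(count, base=1.0, cap=30.0, rng=random.random):
    """Yield `count` jittered delays, each uniform in [0, min(cap, base * 2**n)]."""
    for attempt in range(count):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


def call_with_retries(operation, max_retries=3, retry_on=(Exception,), sleep=time.sleep):
    """Run operation(), retrying on the given exception types with jittered backoff."""
    delays = list(backoff_delays(max_retries - 1))
    for attempt in range(max_retries):
        try:
            return operation()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_retries - 1:
                raise
            sleep(delays[attempt])


# Offline demo: fail twice, then succeed; record waits instead of sleeping.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


waits = []
outcome = call_with_retries(
    flaky, max_retries=3, retry_on=(ConnectionError,), sleep=waits.append
)
print(outcome, len(waits))
```

In `robust_streaming`, `retry_on` would be the retryable SDK exceptions, while authentication errors would still be raised immediately. Note that retrying after part of a response has already been printed repeats that output.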


## 📊 Streaming Performance Analysis

Analyze streaming performance and characteristics.

In [6]:
import time

class StreamingAnalyzer:
    def __init__(self):
        self.reset_metrics()
    
    def reset_metrics(self):
        self.start_time = None
        self.first_token_time = None
        self.chunk_times = []
        self.chunk_sizes = []
        self.total_tokens = 0
        self.total_characters = 0
    
    def start_analysis(self):
        self.reset_metrics()
        self.start_time = time.time()
    
    def process_chunk(self, chunk):
        current_time = time.time()
        
        if chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            
            # Record first token time
            if self.first_token_time is None:
                self.first_token_time = current_time
            
            # Record chunk metrics
            self.chunk_times.append(current_time)
            self.chunk_sizes.append(len(content))
            self.total_characters += len(content)
            
            # Rough token estimate: whitespace-separated pieces, not the model's tokenizer
            self.total_tokens += len(content.split())
            
            return content
        return ""
    
    def get_metrics(self):
        if not self.chunk_times:
            return {"error": "No data collected"}
        
        end_time = self.chunk_times[-1]
        total_duration = end_time - self.start_time
        time_to_first_token = self.first_token_time - self.start_time if self.first_token_time else 0
        
        # Calculate intervals between chunks
        intervals = []
        for i in range(1, len(self.chunk_times)):
            intervals.append(self.chunk_times[i] - self.chunk_times[i-1])
        
        avg_interval = sum(intervals) / len(intervals) if intervals else 0
        
        return {
            "total_duration": round(total_duration, 2),
            "time_to_first_token": round(time_to_first_token, 2),
            "total_chunks": len(self.chunk_times),
            "total_characters": self.total_characters,
            "estimated_tokens": self.total_tokens,
            "avg_chunk_interval": round(avg_interval, 3),
            "characters_per_second": round(self.total_characters / total_duration, 1),
            "tokens_per_second": round(self.total_tokens / total_duration, 1)
        }

def analyze_streaming_performance(prompt, model="Dhanishtha-2.0-preview"):
    """Analyze streaming performance metrics"""
    analyzer = StreamingAnalyzer()
    
    print(f"📊 Analyzing Streaming Performance:")
    print(f"Model: {model}")
    print(f"Prompt: {prompt[:50]}...")
    print("=" * 50)
    
    analyzer.start_analysis()
    
    stream = hai.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
        temperature=0.7,
        max_tokens=400
    )
    
    print("Response: ", end="")
    full_response = ""
    
    for chunk in stream:
        content = analyzer.process_chunk(chunk)
        if content:
            print(content, end="", flush=True)
            full_response += content
    
    print("\n\n📈 Performance Metrics:")
    print("-" * 30)
    
    metrics = analyzer.get_metrics()
    for key, value in metrics.items():
        print(f"{key.replace('_', ' ').title()}: {value}")
    
    return metrics, full_response

# Analyze performance over two runs of the same model
print("🔬 Performance Analysis:")
print("=" * 60)

test_prompt = "Explain the concept of machine learning in simple terms with examples."

# Test Dhanishtha-2.0-preview
metrics1, response1 = analyze_streaming_performance(test_prompt, "Dhanishtha-2.0-preview")

print("\n" + "=" * 60)

# Run Dhanishtha-2.0-preview a second time to compare run-to-run variation
metrics2, response2 = analyze_streaming_performance(test_prompt, "Dhanishtha-2.0-preview")

🔬 Performance Analysis:
📊 Analyzing Streaming Performance:
Model: Dhanishtha-2.0-preview
Prompt: Explain the concept of machine learning in simple ...


Response: <think>
First, I want to break down what machine learning is at its core. It’s about computers learning from data, not just following instructions. I should start with a simple analogy, maybe something relatable to everyday life, so it feels less abstract.
</think>

Machine learning is a way for computers to learn patterns from data, just like how humans learn from experience. Instead of being told exactly what to do in every situation, a computer can observe many examples and start to recognize what’s important or what should happen next.

<think>
Now, I want to give concrete examples to make this even clearer. I’ll pick something simple, like sorting shapes or predicting weather, and explain how the computer “learns” over time.
</think>

Imagine you’re teaching a child to sort shapes. You show them many examples of circles, squares, and triangles, and the child starts to notice what makes each shape unique. Over time, they get better at sorting new shapes without you giving
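
`StreamingAnalyzer` reports the average interval between chunks, which can hide an occasional long stall. As a minimal sketch, the same `chunk_times` list can also be summarised with a median and a nearest-rank 95th percentile. `summarize_intervals` is a hypothetical helper and the timestamps below are made up for the demo.

```python
import math
import statistics


def summarize_intervals(chunk_times):
    """Return median, 95th-percentile, and max gaps between chunk arrivals."""
    intervals = [
        later - earlier for earlier, later in zip(chunk_times, chunk_times[1:])
    ]
    if not intervals:
        return {"median": 0.0, "p95": 0.0, "max": 0.0}
    ordered = sorted(intervals)
    # Nearest rank: smallest value with at least 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


# Made-up arrival times: three steady gaps of 0.5 s, then one 2.0 s stall.
sample_times = [0.0, 0.5, 1.0, 1.5, 3.5]
summary = summarize_intervals(sample_times)
print(summary)
```

For this sample the mean gap is 0.875 s, which describes neither the steady 0.5 s rhythm nor the 2.0 s stall; the median and the tail values show both.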

## 🎯 Key Insights About Streaming

From these examples, we can observe important streaming characteristics:

### ⚡ Performance Benefits
- **Perceived Speed**: Users see content immediately
- **Better UX**: No waiting for complete responses
- **Interactive Feel**: More conversational experience
- **Early Feedback**: Users can interrupt if needed

### 🛠️ Implementation Considerations
- **Error Handling**: Robust handling of stream interruptions
- **Buffer Management**: Handling partial content gracefully
- **UI Updates**: Real-time interface updates
- **Performance Monitoring**: Track streaming metrics
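
One possible reading of "Buffer Management" and "UI Updates" is to regroup very small fragments before updating the interface, so that each update ends on a word boundary. The sketch below illustrates that idea with plain strings. `buffer_by_words` is a hypothetical helper and `min_chars` is an arbitrary threshold, not a recommended value.

```python
def buffer_by_words(fragments, min_chars=20):
    """Regroup text fragments so each emitted piece ends on whitespace.

    A piece is emitted once at least min_chars are buffered and a whitespace
    boundary is available; whatever remains is flushed at the end.
    """
    pending = ""
    for fragment in fragments:
        pending += fragment
        if len(pending) < min_chars:
            continue
        # Cut just after the last whitespace character so no word is split.
        cut = max(pending.rfind(" "), pending.rfind("\n")) + 1
        if cut:
            yield pending[:cut]
            pending = pending[cut:]
    if pending:
        yield pending


# Offline demo with fragments that break in the middle of words.
demo_fragments = ["Stream", "ing keeps", " inter", "faces respon", "sive."]
pieces = list(buffer_by_words(demo_fragments, min_chars=10))
print(pieces)
```

A larger `min_chars` means fewer interface updates at the cost of slightly later text, which is the responsiveness and stability trade-off this section refers to.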

### 🎨 Use Cases for Streaming
- **Long-form Content**: Stories, articles, explanations
- **Interactive Chat**: Real-time conversations
- **Live Demonstrations**: Step-by-step tutorials
- **Creative Writing**: Poetry, stories, creative content

## 🚀 Best Practices

- **Handle Errors Gracefully**: Implement retry logic and fallbacks
- **Show Progress**: Visual indicators for streaming status
- **Buffer Wisely**: Balance responsiveness with stability
- **Monitor Performance**: Track metrics for optimization
- **User Control**: Allow users to stop/pause streams
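
As a minimal sketch of the "User Control" point, a stream can be consumed until a stop signal is set. `stream_until_stopped` is a hypothetical helper. It assumes the stream yields text fragments, and it calls `close()` only if the stream object happens to provide one, since this notebook does not show how the SDK's stream objects are closed. The demo uses a plain generator and a callback that stands in for a user pressing "Stop".

```python
import threading


def stream_until_stopped(stream, stop_event, on_text):
    """Forward text to on_text until the stream ends or stop_event is set.

    Returns the text received so far and whether the stream was cut short.
    """
    received = []
    stopped = False
    try:
        for text in stream:
            if stop_event.is_set():
                stopped = True
                break
            on_text(text)
            received.append(text)
    finally:
        # Release the underlying resource if the stream object supports it.
        close = getattr(stream, "close", None)
        if callable(close):
            close()
    return "".join(received), stopped


def fake_stream():
    """Stand-in for a text stream (demo only)."""
    for word in ["one ", "two ", "three ", "four ", "five "]:
        yield word


stop = threading.Event()
shown = []


def show(text):
    shown.append(text)
    if len(shown) == 3:  # stand-in for a user pressing "Stop"
        stop.set()


partial_text, was_stopped = stream_until_stopped(fake_stream(), stop, show)
print(repr(partial_text), was_stopped)
```

In an interactive application the event would be set from another thread or a UI callback, and the partial text could still be kept in the conversation history.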

## 📚 Next Steps

- **[04-parameters.ipynb](04-parameters.ipynb)** - Fine-tuning AI behavior


---

**Create engaging real-time AI experiences with streaming! 🌊✨**