# Streaming Responses in Agentic Platform

## Overview
Streaming responses send data incrementally as it becomes available, rather than waiting for the complete response. This is ideal for AI applications where we want to show progress in real-time.

## Event Types
Our platform streams these events via Server-Sent Events (SSE):
- `text_delta`: A chunk of text content
- `text_done`: Text content is complete
- `thinking`: Agent's internal reasoning
- `tool_call`: Tool being called
- `tool_result`: Result from tool
- `error`: Error information
- `done`: Stream completion

Each event includes a unique ID, conversation ID, timestamp, usage statistics (for LLM responses), and additional metadata.

## Example Flow
```python
# Server streams events like:
{
    "session_id": "123",
    "type": "text_delta",
    "text": "Based on my search"
}
```

## Benefits
Streaming lets users see the response as it is generated. It's not uncommon to see double-digit latency (in seconds) in agentic systems. Streaming intermediate steps keeps users engaged and shows them what's going on, which in effect masks the system's latency.

To get started, let's take a look at our streaming types.

In [None]:
from agentic_platform.core.models.streaming_models import ToolCallEvent, ToolResultEvent, ErrorEvent, DoneEvent, TextDeltaEvent, TextDoneEvent, ThinkingEvent


TextDoneEvent??

In [None]:
TextDeltaEvent??

In [None]:
ToolCallEvent??

Looking at the events above, you can see they all follow a similar structure. Text events have a text field, tool call and tool result events carry the corresponding tool data, and so on. Importantly, each event has a `type` attribute, which makes it easier to piece together agent responses on a frontend.

Next, let's build a simple agent.
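As a rough sketch of how a frontend might use the `type` field, the snippet below folds a list of events into display text and a tool activity log. The events here are plain dicts rather than the platform's models, and the `tool_name`, `result`, and `message` keys are hypothetical placeholders; the actual event models may name these fields differently.

```python
from typing import Any, Dict, List


def assemble_response(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Fold a sequence of stream events into text plus a tool activity log."""
    text_parts: List[str] = []
    tool_log: List[str] = []
    finished = False

    for event in events:
        kind = event.get("type")
        if kind == "text_delta":
            text_parts.append(event["text"])
        elif kind == "tool_call":
            # "tool_name" is a hypothetical field name used for illustration.
            tool_log.append(f"calling {event.get('tool_name', 'unknown')}")
        elif kind == "tool_result":
            # "result" is a hypothetical field name used for illustration.
            tool_log.append(f"result: {event.get('result')}")
        elif kind == "error":
            raise RuntimeError(event.get("message", "stream error"))
        elif kind == "done":
            finished = True
            break
        # Other types (text_done, thinking) are ignored in this sketch.

    return {"text": "".join(text_parts), "tools": tool_log, "finished": finished}


sample_events = [
    {"type": "tool_call", "tool_name": "get_weather"},
    {"type": "tool_result", "result": "Sunny and 70 degrees"},
    {"type": "text_delta", "text": "It is sunny "},
    {"type": "text_delta", "text": "in SF."},
    {"type": "done"},
]
print(assemble_response(sample_events))
```

Dispatching on a single `type` field keeps the consumer independent of whichever agent framework produced the events.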

In [None]:
# Next, let's create our agent.
from pydantic_ai import Agent as PyAIAgent

import nest_asyncio
nest_asyncio.apply()


# Create a basic agent with a specialized system prompt
agent = PyAIAgent(
    'bedrock:us.anthropic.claude-3-5-haiku-20241022-v1:0',
    system_prompt="You are a helpful assistant."
)

# The notebook displays the value of the last expression in the cell (response.output)
ABSTRACTION_QUESTION = "Explain the concept of abstractions in programming in one paragraph."
response = agent.run_sync(ABSTRACTION_QUESTION)
response.output

The agent above is simple and doesn't do much. One thing you might have noticed in all the agents we've built up to this point is that they take a while to return. End users are often impatient and don't want to wait 10-15 seconds for a response. One way to make the latency less noticeable is to stream results back to the user. Most frameworks support streaming.

In the example below, we'll stream the text from the agent and print it as it comes in.

In [None]:
async with agent.run_stream(ABSTRACTION_QUESTION) as result:
    # delta=True yields only the newly generated text on each iteration
    async for message in result.stream_text(delta=True):
        print(message)

With streaming, the latency is much less noticeable. However, what if we want to stream intermediate actions back?

To do that, we'll need to return the structured output as well. Let's add a simple tool to our agent and stream the intermediate tool results back.

In [None]:
def get_weather(location: str) -> str:
    '''Useful for getting the local weather'''
    return f'The weather in {location} is Sunny and 70 degrees.'

In [None]:
agent.tool_plain(get_weather)

In [None]:
from pydantic_core import to_jsonable_python

nodes = []
async with agent.iter('What is the weather in SF?') as result:
    async for message in result:   
        nodes.append(to_jsonable_python(message))

for n in nodes:
    print(n)

Nice, we're now getting intermediate results out of our PydanticAI agent. However, we want to convert these into our own streaming types to decouple the rest of our code from the specific framework. We've created a converter for this, which we'll import below.

In [None]:
from agentic_platform.core.converter.pydanticai_converters import PydanticAIStreamingEventConverter
from agentic_platform.core.models.streaming_models import StreamEvent
from typing import List

for node in nodes:
    events: List[StreamEvent] = PydanticAIStreamingEventConverter.convert_event(node, session_id='abc123')
    if events:
        for event in events:
            print(event)

Perfect. Now that we have our streaming events, it's time to bring it all together. Let's re-run our agent, but this time convert its output into our event types and stream them back. With SSE, each message is sent as a line prefixed with `data: ` and followed by two newlines.
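To make the framing concrete, here is a minimal sketch of building one SSE frame from a payload. The `format_sse` helper is a hypothetical name used for illustration and is not part of the platform; the payload is a plain dict mirroring the example event shown earlier.

```python
import json
from typing import Any, Dict


def format_sse(payload: Dict[str, Any]) -> str:
    """Serialize a payload as one SSE frame: 'data: <json>' plus a blank line."""
    # json.dumps emits no raw newlines by default, so the JSON fits on one data line.
    return f"data: {json.dumps(payload)}\n\n"


frame = format_sse({"session_id": "abc123", "type": "text_delta", "text": "Hello"})
print(repr(frame))
```

The trailing blank line is what tells the client that the frame is complete.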

To stream results back, we'll use `yield`, which lets us send each event as soon as it is produced instead of buffering all the events first.
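As a small illustration of that behavior, independent of the agent, the sketch below uses an async generator to hand values to the consumer one at a time. The names `count_slowly` and `consume` are made up for this example, and the `asyncio.sleep(0)` call stands in for waiting on a model or tool.

```python
import asyncio
from typing import AsyncGenerator, List


async def count_slowly(n: int) -> AsyncGenerator[int, None]:
    """Yield each value as soon as it is ready instead of returning a full list."""
    for i in range(n):
        await asyncio.sleep(0)  # stand-in for waiting on the model or a tool
        yield i


async def consume() -> List[int]:
    seen: List[int] = []
    async for value in count_slowly(3):
        # The producer is paused at its yield until the consumer asks for the next value.
        seen.append(value)
    return seen


received = asyncio.run(consume())
print(received)
```

In a notebook cell, `await consume()` can be used directly in place of `asyncio.run`.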

In [None]:
import nest_asyncio
import json
from fastapi import FastAPI
from pydantic import BaseModel
from pydantic_core import to_jsonable_python
from typing import AsyncGenerator
import uuid

nest_asyncio.apply()

# Import the streaming request model
from agentic_platform.core.models.api_models import AgenticRequestStream

class AgentRequest(BaseModel):
    prompt: str
    conversation_id: str = "default"

async def generate_agent_events(request: AgenticRequestStream) -> AsyncGenerator[str, None]:
    """Generate Server-Sent Events from the agent stream"""

    session_id: str = request.session_id if request.session_id else str(uuid.uuid4())

    async with agent.iter(request.message.text) as result:
        async for message in result:   
            json_message = to_jsonable_python(message)
            events: List[StreamEvent] = PydanticAIStreamingEventConverter.convert_event(json_message, session_id)
            for event in events:
                # model_dump_json already returns a JSON string, so avoid encoding it a second time
                sse_data = f"data: {event.model_dump_json(serialize_as_any=True)}\n\n"
                yield sse_data

In [None]:
async for sse_data in generate_agent_events(AgenticRequestStream.from_text(text="What is the weather in SF?", session_id="abc123")):
    print(sse_data)
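On the receiving side, a client has to split the stream back into frames before decoding them. The following is a simplified sketch, assuming each frame carries a single `data:` line whose content is a JSON object; `iter_sse_payloads` is a hypothetical helper, and a production client would also need to handle multi-line data fields and other SSE fields.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_sse_payloads(chunks: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield decoded JSON payloads from SSE text that may arrive in arbitrary chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # A blank line terminates each SSE frame.
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            for line in frame.split("\n"):
                if line.startswith("data:"):
                    yield json.loads(line[len("data:"):].strip())


# Two frames, deliberately split mid-frame to mimic network chunking.
chunks = [
    'data: {"type": "text_delta", "text": "Sunny "}\n\ndata: {"type": "te',
    'xt_delta", "text": "and 70."}\n\n',
]
text = "".join(p["text"] for p in iter_sse_payloads(chunks) if p["type"] == "text_delta")
print(text)
```

Buffering until the blank-line delimiter matters because network chunks do not necessarily line up with frame boundaries.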