# Streaming and Batching with Chat Models

**Purpose:**  
Illustrate how to use streaming and batching with a chat model backend, and how concurrency controls affect batch execution behavior.


## Core concepts

**Streaming**  
Streaming returns partial model outputs incrementally as tokens or chunks are generated. This reduces perceived latency and enables real-time processing or UI updates before the full response is complete.
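
As a rough, library-independent sketch of this idea, a plain generator can stand in for a streaming backend. `fake_stream` and its per-chunk delay are hypothetical stand-ins, not part of any chat model API. The point is only that the consumer sees the first chunk well before the full response exists.

```python
import time
from typing import Iterator, Optional, Tuple


def fake_stream(text: str, delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model: yields one word at a time."""
    for word in text.split(" "):
        time.sleep(delay)  # simulated per-chunk generation time
        yield word + " "


def consume(stream: Iterator[str]) -> Tuple[str, Optional[float], float]:
    """Collect chunks, recording time to first chunk and total time."""
    start = time.perf_counter()
    first_chunk_at = None
    parts = []
    for chunk in stream:
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter() - start
        parts.append(chunk)  # a UI would render the chunk at this point
    total = time.perf_counter() - start
    return "".join(parts).strip(), first_chunk_at, total


text, first, total = consume(
    fake_stream("parrots have colorful feathers for several reasons")
)
print(f"first chunk after {first:.3f}s, full text after {total:.3f}s")
```

The gap between the two timings is the latency a user no longer has to wait through before seeing output.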

**Batching**  
Batching submits multiple independent prompts through a single call. The runtime schedules and executes them, potentially in parallel, which improves throughput compared with issuing the same requests one at a time. Results are returned in the same order as the inputs.
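
The scheduling idea can be sketched with the standard library alone. Here `fake_model` is a hypothetical stand-in for one model call, and the thread pool is an illustration of parallel dispatch, not a claim about how a particular backend implements `batch`. The sketch shows independent calls overlapping in time while results come back in input order.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for one model call with a fixed latency."""
    time.sleep(0.05)
    return f"answer to: {prompt}"


def run_batch(prompts: List[str]) -> List[str]:
    """Run all prompts in parallel; Executor.map preserves input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        return list(pool.map(fake_model, prompts))


prompts = ["How do airplanes fly?", "What is quantum computing?"]
start = time.perf_counter()
answers = run_batch(prompts)
elapsed = time.perf_counter() - start
print(answers)
print(f"batch finished in {elapsed:.2f}s")
```

Because the calls overlap, the batch should take roughly as long as one call rather than the sum of all calls, though actual timings depend on the machine.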

**Maximum concurrency**  
`max_concurrency` limits how many batch requests may be in-flight at the same time. This is a backpressure mechanism used to control resource usage, avoid provider rate limits, and stabilize latency under load.
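
To make the in-flight limit concrete, the sketch below counts how many simulated calls run at once. Everything in it (`InFlightCounter`, `bounded_batch`, the inner `fake_model`) is a hypothetical illustration built on a thread pool whose size equals the limit. It is meant to show the effect of the cap, not to reproduce any library's internals.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class InFlightCounter:
    """Tracks how many calls are running at once and the peak observed."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self) -> "InFlightCounter":
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, *exc) -> None:
        with self._lock:
            self.current -= 1


def bounded_batch(prompts, max_concurrency):
    """Run prompts with at most `max_concurrency` simulated calls in flight."""
    counter = InFlightCounter()

    def fake_model(prompt):
        # Hypothetical stand-in for a model call.
        with counter:
            time.sleep(0.05)
            return prompt.upper()

    # The pool size is the concurrency cap: extra prompts wait for a free worker.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        results = list(pool.map(fake_model, prompts))
    return results, counter.peak


results, peak = bounded_batch([f"prompt {i}" for i in range(7)], max_concurrency=5)
print(f"{len(results)} results, peak in-flight calls: {peak}")
```

The observed peak never exceeds the configured limit, which is the backpressure property described above.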


In [None]:
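# Environment setup
# Load variables from a local .env file so the Groq client can find its API key.

import os
from dotenv import load_dotenv

load_dotenv()

# ChatGroq reads GROQ_API_KEY from the environment; fail early if it is missing.
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
if not GROQ_API_KEY:
    raise RuntimeError("GROQ_API_KEY is not set; add it to the environment or .env file.")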

In [None]:
# Model initialization

from langchain_groq import ChatGroq

llm = ChatGroq(model="qwen/qwen3-32b")

In [None]:
# Synchronous invocation (baseline)
# Blocks until the full response is available and returns it as a single message.

llm.invoke("Why do parrots have colorful feathers?")



In [None]:
# Streaming example
# The model yields partial output chunks as they are produced.
# The "|" separator makes chunk boundaries visible in the printed output.

for chunk in llm.stream("Why do parrots have colorful feathers?"):
    print(chunk.text, end="|", flush=True)

|<think>|

|First| thought|:| sexual| selection|.| In| many| bird| species|,| males| have| brighter| colors| to| attract| females|.| For| example|,| pe|ac|ocks| have| those| elaborate| tails|.| So| maybe| male| par|rots| have| brighter| feathers| to| show| off| to| females|.| But| not| all| par|rots| are| male|,| right|?| Some| females| have| colorful| feathers| too|.| Maybe| it|'s| mutual| ornament|ation|.| I|'ve| heard| that| in| some| species|,| both| males| and| females| are| colorful|,| which| could| mean| that| both| sexes| choose| mates| based| on| color|.| So| maybe| it|'s| not| just| about| males| attracting| females| but| also| mutual| mate| choice|.

|Another| angle| is| that| the| colors| might| be| a| sign| of| health|.| If| a| par|rot| can| maintain| bright| colors|,| it| might| indicate| that| it|'s| healthy| and| has| good| genes|.| Car|oten|oids|,| which| are| pig|ments| found| in| some| fruits| and| plants|,| might| be| involved| here|.| Par|rots| that| can| get| enou

In [None]:
# Basic batching
# Prompts are submitted as a group; responses come back in the same order as the prompts.

responses = llm.batch([
    "Why do parrots have colorful feathers?",
    "How do airplanes fly?",
    "What is quantum computing?"
])
for response in responses:
    print(response)

content='<think>\nOkay, so the user is asking how airplanes fly. I need to explain the basic principles. Let me start by recalling what I know about aerodynamics. Airplanes generate lift, right? But how exactly?\n\nFirst, there\'s the shape of the wings. The wings are airfoils—curved on top and flat on the bottom. When air flows over them, the air moving over the top has to travel a longer distance, so it goes faster. According to Bernoulli\'s principle, faster-moving air has lower pressure. So the pressure on top is lower than underneath, creating lift. Is that the main reason?\n\nWait, I\'ve heard some people say that\'s an oversimplification. Maybe the angle of attack also plays a role. The angle between the wing and the oncoming air affects how the air flows and the lift generated. If the angle is too steep, you get a stall. So it\'s not just about the wing shape but also how it\'s positioned.\n\nThen there\'s Newton\'s third law. The wings push air downward, and the reaction for

In [None]:
# Batched execution with bounded concurrency
# At most `max_concurrency` requests run in parallel; the remaining prompts wait for a free slot.

llm.batch(
    [
        "Why do parrots have colorful feathers?",
        "How do airplanes fly?",
        "What is quantum computing?",
        "Why do parrots have red beaks?",
        "What is the meaning of life?",
        "Why is the sky blue?",
        "Why are black holes black?",
    ],
    config={"max_concurrency": 5},  # allow at most 5 parallel calls
)

[AIMessage(content="<think>\nOkay, so I need to figure out why parrots have colorful feathers. Let's start by thinking about what I know about parrots. They're birds, right? And they're known for their bright colors—greens, blues, yellows, reds, etc. But why? I remember from biology that animals can have colorful features for various reasons like attracting mates, camouflage, or communication.\n\nFirst, maybe it's about attracting mates. Like, in many bird species, males have brighter colors to show they're healthy and good genes. But wait, parrots are often colorful in both males and females. So maybe it's not just for attracting the opposite sex. Or could it be that both genders use color in some way?\n\nAnother thought: maybe the colors help in communication. Parrots are social animals, so maybe their colors help them recognize each other or signal their mood. For example, some birds can change their color slightly when they're agitated. Do parrots do that?\n\nThen there's the ide
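
As a back-of-the-envelope check on the bounded batch, assume (hypothetically) that every request takes about the same time. Seven prompts with `max_concurrency` set to 5 then need two rounds: five requests start immediately and the remaining two start as slots free up. The helper below, with the illustrative name `estimate_rounds`, sketches that arithmetic. It is not a model of real provider latency, which varies per request.

```python
import math


def estimate_rounds(num_prompts: int, max_concurrency: int) -> int:
    """Rounds needed if all requests took equal time (a simplifying assumption)."""
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return math.ceil(num_prompts / max_concurrency)


rounds = estimate_rounds(num_prompts=7, max_concurrency=5)
print(rounds)  # 2
```

Under the same assumption, raising the limit to 7 would bring the estimate down to a single round, at the cost of more simultaneous load on the provider.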

## Summary

- Streaming emits partial outputs early and continuously, reducing perceived latency and enabling real-time handling.
- Batching improves throughput by grouping independent prompts into a single scheduled operation.
- `max_concurrency` provides backpressure, preventing overload and helping comply with provider rate limits or infrastructure constraints.
