# How to stream runnables

* The LangChain Runnable interface supports both synchronous and asynchronous streaming through the stream and astream methods. Streaming makes an application feel more responsive by emitting intermediate results as they are generated.

# Key Components for Streaming in LangChain
* LLMs and Chat Models: Language models are usually the slowest part of LLM applications, so streaming their output token-by-token can improve responsiveness. LangChain's streaming API lets us process each token immediately rather than waiting for the entire output.

* Streaming Methods:

   a. stream (sync): streams the final output in chunks as they are produced.

   b. astream (async): the asynchronous counterpart, for environments where async processing is possible.
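
The responsiveness gain comes from handling each token before the next one exists. The sketch below is plain Python rather than LangChain code: `fake_token_stream` is a hypothetical stand-in for a streaming model call, and the `events` log exists only to make the ordering visible.

```python
from typing import Iterator

events: list[str] = []


def fake_token_stream(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model call: yields one token at a time."""
    for token in text.split():
        events.append(f"produced {token}")
        yield token


# Each token is consumed as soon as it is produced, not after the full answer is ready.
for token in fake_token_stream("the sky is blue"):
    events.append(f"consumed {token}")

print(events[:4])
# ['produced the', 'consumed the', 'produced sky', 'consumed sky']
```

The interleaved log shows that the consumer sees the first token while the rest of the output has not been generated yet.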

In [7]:
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o-mini")

chunks = []
for chunk in model.stream("what color is the sky?"):
    chunks.append(chunk)
    print(chunk.content, end="|", flush=True)


|The| color| of| the| sky| can| vary| depending| on| the| time| of| day|,| weather| conditions|,| and| atmospheric| factors|.| During| a| clear| day|,| the| sky| typically| appears| blue| due| to| the| scattering| of| sunlight| by| the| atmosphere|.| At| sunrise| and| sunset|,| the| sky| can| display| a| range| of| colors|,| including| orange|,| pink|,| and| red|.| On| cloudy| or| over|cast| days|,| the| sky| may| appear| gray|.| Additionally|,| phenomena| like| rain|bows| can| create| vibrant| colors| in| the| sky| under| certain| conditions|.||
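
The output above starts with a lone `|` and ends with `||`, which suggests that the first and last chunks carry empty content. The collected `chunks` list can be joined back into the full answer. A minimal sketch, assuming only that each chunk exposes a `.content` string; `FakeChunk` is a hypothetical stand-in so the example runs without an API key.

```python
from dataclasses import dataclass


@dataclass
class FakeChunk:
    """Hypothetical stand-in for a streamed message chunk; only .content is modeled."""
    content: str


# Mirrors the beginning of the output above, with empty first and last chunks.
chunks = [
    FakeChunk(""),
    FakeChunk("The"),
    FakeChunk(" color"),
    FakeChunk(" of"),
    FakeChunk(" the"),
    FakeChunk(" sky"),
    FakeChunk(""),
]

# Joining the contents reconstructs the text; empty chunks contribute nothing.
full_text = "".join(chunk.content for chunk in chunks)
non_empty = [chunk.content for chunk in chunks if chunk.content]

print(full_text)  # The color of the sky
```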

In [5]:
# Asynchronous streaming (top-level async for works in a notebook; in a script, run it inside an async function)

chunks = []
async for chunk in model.astream("what color is the sky?"):
    chunks.append(chunk)
    print(chunk.content, end="|", flush=True)

|The| sky| typically| appears| blue| during| the| day| due| to| the| scattering| of| sunlight| by| the| Earth's| atmosphere|.| However|,| it| can| also| appear| in| various| colors| at| different| times|,| such| as| orange| and| pink| during| sunrise| and| sunset|,| gray| when| over|cast|,| and| even| black| at| night|.| The| color| of| the| sky| can| change| based| on| weather| conditions| and| the| presence| of| particles| in| the| atmosphere|.||
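
The cell above relies on the notebook's support for top-level `async for`. In a plain Python script the same loop is a syntax error unless it sits inside an `async def` function that is started with `asyncio.run`. The sketch below shows that shape; `fake_astream` is a hypothetical stand-in for `model.astream(...)` so the example runs offline.

```python
import asyncio
from typing import AsyncIterator


async def fake_astream(text: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for an async streaming call."""
    for token in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield token


async def main() -> list[str]:
    collected = []
    async for token in fake_astream("the sky is blue"):
        collected.append(token)
        print(token, end="|", flush=True)
    return collected


# A script has no running event loop, so one is started explicitly.
tokens = asyncio.run(main())
```

With a real model, only the body of `main` changes: the loop would iterate over `model.astream(...)` instead of the stand-in.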

# Chains with LangChain Expression Language (LCEL)
With the LangChain Expression Language (LCEL), you can compose components such as prompt templates, models, and output parsers into a chain with the | operator. The chain streams its final output in the same way a single model does.

In [8]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
parser = StrOutputParser()
chain = prompt | model | parser

async for chunk in chain.astream({"topic": "parrot"}):
    print(chunk, end="|", flush=True)


|Why| did| the| par|rot| wear| a| rain|coat|?

|Because| it| wanted| to| be| a| poly|-|umbre|lla|!| 🦜|☔|️||
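
One way to picture why a chain can stream end to end is as a pipeline of generators, where each stage transforms chunks as they arrive. This is a conceptual sketch in plain Python, not how LCEL is implemented; `format_prompt`, `fake_model`, and `parse_to_str` are hypothetical stand-ins for the prompt, model, and parser.

```python
from typing import Iterator


def format_prompt(inputs: dict) -> str:
    """Stand-in for the prompt template: runs once, before any token exists."""
    return f"tell me a joke about {inputs['topic']}"


def fake_model(prompt: str) -> Iterator[dict]:
    """Hypothetical model: yields chunk dicts instead of calling an API."""
    for token in ["Why", " did", " the", " parrot", " squawk", "?"]:
        yield {"content": token}


def parse_to_str(chunks: Iterator[dict]) -> Iterator[str]:
    """Stand-in for a string parser: converts each chunk as soon as it arrives."""
    for chunk in chunks:
        yield chunk["content"]


def chain_stream(inputs: dict) -> Iterator[str]:
    # Generators are lazy, so nothing is buffered between stages.
    return parse_to_str(fake_model(format_prompt(inputs)))


pieces = list(chain_stream({"topic": "parrot"}))
print("|".join(pieces))
```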

# Handling JSON Streaming
To work with JSON output while it is still being generated:

* Use JsonOutputParser, which operates on partial JSON and emits the object parsed so far.

* Use generator functions (yield) to process the stream incrementally, handling each partial result as it arrives.

In [1]:
from langchain_core.output_parsers import JsonOutputParser


async def _extract_country_names_streaming(input_stream):
    """Yield each country name once, as soon as it appears in the partial JSON."""
    country_names_so_far = set()
    async for partial in input_stream:
        # Each item is the JSON parsed so far; skip it until a country list exists.
        if not isinstance(partial, dict) or "countries" not in partial:
            continue
        for country in partial["countries"]:
            name = country.get("name")
            if name and name not in country_names_so_far:
                yield name
                country_names_so_far.add(name)


# `model` is the chat model defined in an earlier cell.
chain = model | JsonOutputParser() | _extract_country_names_streaming

async for text in chain.astream("output a list of the countries..."):
    print(text, end="|", flush=True)
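
To make "operating on partial JSON" concrete, the sketch below parses a growing buffer by closing whatever strings, arrays, and objects are still open. It is a simplified illustration using only the standard library, not the implementation behind `JsonOutputParser`; `parse_partial_json` here is a hypothetical helper written for this example.

```python
import json


def parse_partial_json(text: str):
    """Best-effort parse of a JSON prefix; returns None if it cannot be completed."""
    closers = []          # closing brackets still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()
    candidate = text + ('"' if in_string else "") + "".join(reversed(closers))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


# Simulated stream of JSON fragments (made-up split points).
pieces = ['{"countries": [', '{"name": "France"}', ', {"name": "Sp', 'ain"}', "]}"]
buffer = ""
snapshots = []
for piece in pieces:
    buffer += piece
    snapshots.append(parse_partial_json(buffer))

print(snapshots[2])  # {'countries': [{'name': 'France'}, {'name': 'Sp'}]}
```

Note that the third snapshot contains the truncated value `Sp`. Code that consumes partial results has to tolerate values that are still growing.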


# Handling Non-streaming Components
Some components, such as retrievers, do not support streaming natively; they return their complete result as a single chunk. A chain that contains such components can still stream:

* Place the non-streaming components early in the chain.

* Streaming of partial output starts after the last non-streaming step.

Combined with LCEL chains, this keeps complex LLM applications responsive, because users still see the model's output as it is generated.

In [3]:
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

vectorstore = FAISS.from_texts(
    ["harrison worked at kensho", "harrison likes spicy food"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

chunks = [chunk for chunk in retriever.stream("where did harrison work?")]
chunks

[[Document(metadata={}, page_content='harrison worked at kensho'),
  Document(metadata={}, page_content='harrison likes spicy food')]]
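
The result above is a list with exactly one chunk, and that chunk is the full list of documents. A simple way to picture this behavior is a `stream` method that falls back to yielding the complete result once. The class below is a hypothetical plain-Python sketch of that idea, not the retriever's actual code.

```python
from typing import Iterator


class NonStreamingStep:
    """Hypothetical component that can only return its complete result."""

    def invoke(self, query: str) -> list[str]:
        # A real retriever would search a vector store; this returns fixed documents.
        return ["harrison worked at kensho", "harrison likes spicy food"]

    def stream(self, query: str) -> Iterator[list[str]]:
        # No incremental output is available, so the final result is the only chunk.
        yield self.invoke(query)


chunks = list(NonStreamingStep().stream("where did harrison work?"))
print(len(chunks))  # 1
```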

In [7]:
pip install langchain







In [None]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

# Assumes 'retriever' is already defined, for example the FAISS retriever built above.

# Define your model. ChatOpenAI reads the API key from the OPENAI_API_KEY
# environment variable, so the key does not need to appear in the notebook.
model = ChatOpenAI(model="gpt-4o-mini")

# Define the prompt template
prompt = ChatPromptTemplate.from_template("Context: {context}\nQuestion: {question}")

# Construct the retrieval chain: the dict passes the retrieved documents as
# "context" and the original input as "question" into the prompt.
retrieval_chain = (
    {
        "context": retriever.with_config(run_name="Docs"),
        "question": RunnablePassthrough(),
    }
    | prompt
    | model
    | StrOutputParser()
)

In [None]:
for chunk in retrieval_chain.stream(
    "Where did harrison work? " "Write 3 made up sentences about this place."
):
    print(chunk, end="|", flush=True)
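
The retrieval chain begins with a dict whose entries each receive the same input: one entry fetches documents, the other passes the question through unchanged. Only the steps after retrieval can emit partial output. The sketch below imitates that flow in plain Python; `retrieve`, `passthrough`, `run_mapping`, and `fake_model` are hypothetical stand-ins, and the answer tokens are made up.

```python
from typing import Callable, Iterator


def retrieve(question: str) -> list[str]:
    """Hypothetical retriever: returns all documents at once (non-streaming)."""
    return ["harrison worked at kensho", "harrison likes spicy food"]


def passthrough(question: str) -> str:
    """Forwards the input unchanged."""
    return question


def run_mapping(steps: dict[str, Callable], value: str) -> dict:
    """Feed the same input to every entry and collect each result under its key."""
    return {key: step(value) for key, step in steps.items()}


def fake_model(prompt: str) -> Iterator[str]:
    """Hypothetical model: streams a fixed answer token by token."""
    for token in ["Harrison", " worked", " at", " Kensho", "."]:
        yield token


# Non-streaming part: must finish before the prompt can be built.
inputs = run_mapping({"context": retrieve, "question": passthrough}, "Where did harrison work?")
prompt_text = f"Context: {inputs['context']}\nQuestion: {inputs['question']}"

# Streaming part: tokens are printed as they are produced.
answer = ""
for token in fake_model(prompt_text):
    answer += token
    print(token, end="|", flush=True)
```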