#Using Streaming in LangChain: Part 1
Made by: Wilfredo Aaron Sosa Ramos (AI Lab Manager at RealityAI Labs)

In [1]:
!pip install -q langchain langchain-core langchain-community langchain-google-genai faiss-cpu


##1. Setting Up the Environment

In [2]:
import getpass
import os

if not os.environ.get("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for Google GenAI: ")

from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=0)

Enter API key for Google GenAI: ··········


##2. Complex Use Case: Real-Time Binary Tree Processing

###Step 1: LLMs and Chat Models

In [3]:
chunks = []
for chunk in model.stream("Validate the structure of a binary tree where root = 10, left = 5, right = 15."):
    chunks.append(chunk)
    print(chunk.content, end="|", flush=True)

A| binary tree with root = 10, left = 5, and right| = 15 is a valid binary tree structure.  It adheres to the| fundamental property of a binary tree:

* **Each node has at most two children:** The root (10) has two children (5 and 1|5).  Neither 5 nor 15 have children in this example, but they *could* have up to two each if the tree were extended.|

* **Left child is smaller (in a binary *search* tree):**  If we're talking about a binary *search* tree (BST), then this structure is also valid because the left child (5) is smaller| than the root (10), and the right child (15) is larger than the root.  However, if it's just a general binary tree (not a search tree), the values could be in any order.

|Here's a visual representation:

```
    10
   /  \
  5   15
```

Therefore, the structure is valid for both a general binary tree and a binary search tree.
|

In [4]:
# Alternatively, with async streaming (top-level `async for` works in notebooks and IPython, not in a plain script):
chunks = []
async for chunk in model.astream("Validate the structure of a binary tree where root = 10, left = 5, right = 15."):
    chunks.append(chunk)
    print(chunk.content, end="|", flush=True)

print(chunks[0])

A| binary tree with root = 10, left = 5, and right| = 15 is a valid binary search tree (BST).

Here'|s why:

* **Binary Tree Structure:** It adheres to the basic structure of a binary tree.  A node (10) has at most two| children (5 and 15).

* **Binary Search Tree Property:**  A BST requires that for every node:
    * All nodes in its| left subtree have values *less than* the node's value.
    * All nodes in its right subtree have values *greater than* the node's value.

In this case:

* 5 (left child)| is less than 10 (root).
* 15 (right child) is greater than 10 (root).

Therefore, this simple tree satisfies the conditions of both a binary tree and a binary search tree.
||content='A' additional_kwargs={} response_metadata={'safety_ratings': []} id='run-2e4360da-b916-4af2-a654-12e70ab1dd90' usage_metadata={'input_tokens': 26, 'output_tokens': 0, 'total_tokens': 26, 'input_token_details': {'cache_read': 0}}


###Step 2: Using Chains for Binary Tree Validation

In [5]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template("Generate a binary tree structure with the root {root}, left child {left}, and right child {right}.")
parser = StrOutputParser()
chain = prompt | model | parser

async for chunk in chain.astream({"root": 10, "left": 5, "right": 15}):
    print(chunk, end="|", flush=True)

```|python
class Node:
    def __init__(self, data):
|        self.data = data
        self.left = None
        self|.right = None

# Create nodes
root = Node(10)
root.left = Node(5)
root.right = Node(|15)


# (Optional) Function to print the tree (inorder traversal) for verification
def inorder_traversal(node):
    if node:|
        inorder_traversal(node.left)
        print(node.data, end=" ")
        inorder_traversal(node.right)

print("Inorder traversal:")
inorder_traversal(root)  # Output|: 5 10 15
```


This code defines a `Node` class to represent the nodes of the binary tree.  It then creates the root node with a value of 10 and adds the left child| (5) and right child (15).  The `inorder_traversal` function is included to demonstrate how you can traverse and print the tree's contents to verify the structure.  The output of the `inorder_traversal` will be `5 10 15`, confirming the correct structure.||

###Step 3: Working with Input Streams to Process Nested Structures

In [13]:
from langchain_core.output_parsers import JsonOutputParser

# Note: a bug in older versions of LangChain kept JsonOutputParser from streaming results from some models.
chain = model | JsonOutputParser()
async for text in chain.astream(
    "output a list of the countries USA, france and peru and their populations in JSON format. "
    'Use a dict with an outer key of "countries" which contains a list of countries. '
    "Each country should have the key `name` and `population`"
):
    print(text, flush=True)

{'countries': [{}]}
{'countries': [{'name': 'USA', 'population': 339996}]}
{'countries': [{'name': 'USA', 'population': 339996563}, {'name': 'France', 'population': 64626652}]}
{'countries': [{'name': 'USA', 'population': 339996563}, {'name': 'France', 'population': 64626652}, {'name': 'Peru', 'population': 34049588}]}


In [15]:
from langchain_core.output_parsers import JsonOutputParser


# A function that operates on finalized inputs
# rather than on an input stream
def _extract_country_names(inputs):
    """A function that does not operate on input streams, so it breaks streaming."""
    if not isinstance(inputs, dict):
        return ""

    if "countries" not in inputs:
        return ""

    countries = inputs["countries"]

    if not isinstance(countries, list):
        return ""

    country_names = [
        country.get("name") for country in countries if isinstance(country, dict)
    ]
    return country_names


chain = model | JsonOutputParser() | _extract_country_names

async for text in chain.astream(
    "output a list of the countries france, spain and japan and their populations in JSON format. "
    'Use a dict with an outer key of "countries" which contains a list of countries. '
    "Each country should have the key `name` and `population`"
):
    print(text, end="|", flush=True)

['France', 'Spain', 'Japan']|
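
The function above only sees finalized input, so the chain emits a single result once the whole JSON has arrived. One way to keep streaming is an async generator that consumes the partial dicts as they arrive. The following is a minimal sketch, not the notebook's own code: `_extract_country_names_streaming`, `fake_parser_stream`, and `collect_names` are hypothetical names, and the fake stream replays partial dicts like those printed by the earlier `JsonOutputParser` cell instead of calling the model.

```python
import asyncio


async def _extract_country_names_streaming(input_stream):
    """Yield each country name once, as soon as it appears in a partial dict."""
    seen = set()
    async for partial in input_stream:
        if not isinstance(partial, dict):
            continue
        countries = partial.get("countries")
        if not isinstance(countries, list):
            continue
        for country in countries:
            if not isinstance(country, dict):
                continue
            name = country.get("name")
            if name and name not in seen:
                seen.add(name)
                yield name


async def fake_parser_stream():
    """Hypothetical stand-in for the parser: replays growing partial dicts."""
    yield {"countries": [{}]}
    yield {"countries": [{"name": "USA", "population": 339996}]}
    yield {"countries": [{"name": "USA", "population": 339996563},
                         {"name": "France", "population": 64626652}]}
    yield {"countries": [{"name": "USA", "population": 339996563},
                         {"name": "France", "population": 64626652},
                         {"name": "Peru", "population": 34049588}]}


async def collect_names():
    """Drive the generator over the fake stream and gather what it yields."""
    stream = _extract_country_names_streaming(fake_parser_stream())
    return [name async for name in stream]


if __name__ == "__main__":
    # In a notebook, use `await collect_names()` instead of asyncio.run(...).
    print(asyncio.run(collect_names()))  # ['USA', 'France', 'Peru']
```

In a chain, an async generator like this can be piped in place of the plain function, for example `model | JsonOutputParser() | _extract_country_names_streaming`, and consumed with `astream`. One caveat of this sketch: if the parser emits a name while it is still incomplete, the fragment may be yielded as if it were a full name.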

###Step 4: Non-Streaming Components with Binary Tree Information

In [16]:
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_google_genai import GoogleGenerativeAIEmbeddings

# Define a prompt for theoretical binary tree context
template = """Answer the question based only on the following binary tree context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

# Set up a vectorstore retriever with theoretical binary tree information
vectorstore = FAISS.from_texts(
    [
        "A binary tree is a hierarchical data structure in which each node has at most two children, referred to as the left child and the right child.",
        "Binary trees are used in search algorithms and database indexing due to their efficient properties.",
        "In a balanced binary tree, the height difference between left and right subtrees of any node is at most one."
    ],
    embedding=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)
retriever = vectorstore.as_retriever()

# Retrieve and answer questions about binary trees
retrieval_chain = (
    {
        "context": retriever.with_config(run_name="BinaryTreeTheoryDocs"),
        "question": RunnablePassthrough(),
    }
    | prompt
    | model
    | StrOutputParser()
)

for chunk in retrieval_chain.stream(
    "Explain the properties of a balanced binary tree and its use in search algorithms."
):
    print(chunk, end="|", flush=True)

A| balanced binary tree has a maximum height difference of one between the left and right sub|trees of any node.  This property contributes to efficient searching because it keeps the| tree relatively shallow, minimizing the number of levels that need to be traversed.  While the provided text mentions binary trees are used in search algorithms due to their efficient| properties, it doesn't specifically link the *balanced* property to search algorithm efficiency.
|