<center><a href="https://www.nvidia.com/en-us/training/"><img src="https://dli-lms.s3.amazonaws.com/assets/general/DLI_Header_White.png" width="400" height="186" /></a></center>

<br>

# <font color="#76b900">**Notebook 4:** Running State Chains</font>

<br>

In the previous notebook, we introduced some key LangChain Expression Language (LCEL) material regarding runnables. By now, you should be comfortable with both internal and external reasoning, as well as how to develop pipelines that facilitate it! In this notebook, we will make our way towards more advanced paradigms that will allow us to orchestrate more complex dialog management strategies and begin to execute on longer-form document reasoning.
<br>

### **Learning Objectives:**

- Learning how to leverage runnables to orchestrate interesting LLM systems.  
- Understanding how running state chains can be used for dialog management and iterative decision-making.

<br>

### **Questions To Think About:**

- Would there ever be a use for a single-module variant of the running state chain that is not constantly querying the environment for input?
- You may notice that the JSON prediction is actually working pretty well. It might not always work so well depending on the questions and the JSON format complexity. What kinds of issues do you expect to encounter in this regard?
- What kinds of approaches can you think of for completely swapping prompts as part of the running state chain?

<br>

### **Environment Setup:**

In [1]:
## Necessary for Colab, not necessary for course environment
# %pip install -q langchain langchain-nvidia-ai-endpoints gradio

import os
## Assumes NVIDIA_API_KEY is already set in your environment; avoid hardcoding a real key in a notebook
# os.environ["NVIDIA_API_KEY"] = "nvapi-..."

from functools import partial
from rich.console import Console
from rich.style import Style
from rich.theme import Theme

console = Console()
base_style = Style(color="#76B900", bold=True)
pprint = partial(console.print, style=base_style)

In [17]:
from langchain_nvidia_ai_endpoints import ChatNVIDIA
ChatNVIDIA.get_available_models()

[Model(id='mistralai/mistral-nemotron', model_type='chat', client='ChatNVIDIA', endpoint=None, aliases=None, supports_tools=True, supports_structured_output=False, supports_thinking=False, base_model=None),
 Model(id='nvidia/vila', model_type='vlm', client='ChatNVIDIA', endpoint='https://ai.api.nvidia.com/v1/vlm/nvidia/vila', aliases=None, supports_tools=False, supports_structured_output=False, supports_thinking=False, base_model=None),
 Model(id='nvidia/nemotron-4-340b-reward', model_type='chat', client='ChatNVIDIA', endpoint=None, aliases=None, supports_tools=False, supports_structured_output=False, supports_thinking=False, base_model=None),
 Model(id='meta/llama-guard-4-12b', model_type='chat', client='ChatNVIDIA', endpoint=None, aliases=None, supports_tools=False, supports_structured_output=False, supports_thinking=False, base_model=None),
 Model(id='google/gemma-2-27b-it', model_type='chat', client='ChatNVIDIA', endpoint=None, aliases=['ai-gemma-2-27b-it'], supports_tools=False, s

In [2]:
## Useful utility method for printing intermediate states
from langchain_core.runnables import RunnableLambda
from functools import partial

def RPrint(preface="State: "):
    def print_and_return(x, preface=""):
        print(f"{preface}{x}")
        return x
    return RunnableLambda(partial(print_and_return, preface=preface))

def PPrint(preface="State: "):
    def print_and_return(x, preface=""):
        pprint(preface, x)
        return x
    return RunnableLambda(partial(print_and_return, preface=preface))

----

<br>

## **Part 1:** Keeping Variables Flowing

In the previous examples, we were able to implement interesting logic in our standalone chains by **creating**, **mutating**, and **consuming** states. These states were passed around as dictionaries with descriptive keys and useful values, and the values would be used to supply follow-up routines with the info they need to operate!

**Recall the zero-shot classification example from the last notebook:**

In [3]:
%%time
## ^^ This cell is timed, which will print out how long it all took

from langchain_core.runnables import RunnableLambda
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from typing import List, Union
from operator import itemgetter

## Zero-shot classification prompt and chain w/ explicit few-shot prompting
sys_msg = (
    "Choose the student grade given the student performance."
    " Only one word, no explanation.\n[Options : {options}]"
)

zsc_prompt = ChatPromptTemplate.from_template(
    f"{sys_msg}\n\n"
    "[[80/100 in maths, 30/100 in physics, 60/100 in biology, 30/100 in chemistry]][/INST]good</s><s>[INST]"
    "[[{input}]]"
)

## Define your simple instruct_model
instruct_chat = ChatNVIDIA(model="nvidia/llama-3.1-nemotron-nano-8b-v1") # smaller, faster but did not work as well
# instruct_chat = ChatNVIDIA(model="mistralai/mistral-7b-instruct-v0.2")
instruct_llm = instruct_chat | StrOutputParser()
one_word_llm = instruct_chat.bind(stop=[" ", "\n"]) | StrOutputParser()

zsc_chain = zsc_prompt | one_word_llm

## Function that returns just the first word of the output, with an early-stopping bind
def zsc_call(input, options=["bad", "average", "good", "excellent"]):
    return zsc_chain.invoke({"input" : input, "options" : options}).split()[0]

print("-" * 80)
print(zsc_call("60/100 in maths, 10/100 in physics, 10/100 in biology, 10/100 in chemistry"))

print("-" * 80)
print(zsc_call("0/100 in maths, 0/100 in physics, 0/100 in biology, 0/100 in chemistry"))

print("-" * 80)
print(zsc_call("0/100 in maths, 55/100 in physics"))

--------------------------------------------------------------------------------
good
--------------------------------------------------------------------------------
bad
--------------------------------------------------------------------------------
average
CPU times: total: 3.73 s
Wall time: 7.98 s


<br>

This chain makes several design decisions that make it very easy to use, key among them the following:

**We want it to act like a function, so all we want it to do is generate the output and return it.**

This makes the chain extremely natural for inclusion as a module in a larger chain system. For example, the following chain will take a description of a student's performance, classify the grade, and then generate a one-sentence comment based on that grade:



In [5]:
%%time
## ^^ This cell is timed, which will print out how long it all took
gen_prompt = ChatPromptTemplate.from_template(
    "You are a teacher assistant for parents. Make a single sentence comment the following student's grade: {grade}"
)

gen_chain = gen_prompt | instruct_llm

chain = (
    ## -> {"input", "options"}
    {'grade' : zsc_chain}
    | PPrint()
    ## -> {"grade"}
    | gen_chain
    ## -> string
)

input_msg = "0/100 in maths, 55/100 in physics"
options = ["bad", "average", "good", "excellent"]
chain.invoke({"input" : input_msg, "options" : options})

CPU times: total: 78.1 ms
Wall time: 487 ms


"Your student's average grade is satisfactory, indicating a strong academic effort across various subjects."

<br>

However, this becomes a problem when you want to keep information flowing, since we lose the grade and input variables when generating our response. If we wanted to do something with both the output and the input, we'd need a way to make sure that both variables pass through.

Lucky for us, we can use a mapping runnable (i.e. one interpreted from a dictionary or built manually with `RunnableMap`) to pass both variables through by assigning the output of our chain to a single key and letting the other keys propagate as desired. Alternatively, we could use `RunnableAssign`, which merges the state-consuming chain's output into the input dictionary by default.
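
To make the difference concrete, here is a minimal plain-Python sketch of the two behaviors. It does not use LangChain at all: `run_map`, `run_assign`, and `fake_classifier` are hypothetical stand-ins written only for illustration, and the real runnables do considerably more (batching, streaming, async).

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]

def run_map(steps: Dict[str, Callable[[State], Any]], state: State) -> State:
    """Mimics a mapping runnable: the output contains ONLY the listed keys."""
    return {key: fn(state) for key, fn in steps.items()}

def run_assign(steps: Dict[str, Callable[[State], Any]], state: State) -> State:
    """Mimics RunnableAssign: the input keys survive and the new keys are merged in."""
    return {**state, **run_map(steps, state)}

# Hypothetical stand-in for zsc_chain (no LLM call is made here)
fake_classifier = lambda d: "average"

state = {"input": "0/100 in maths, 55/100 in physics", "options": ["bad", "average"]}

mapped = run_map({"grade": fake_classifier}, state)       # "input" and "options" are dropped
assigned = run_assign({"grade": fake_classifier}, state)  # "input" and "options" are kept

print(sorted(mapped))
print(sorted(assigned))
```

With the mapping style you must list every key you want to keep, which is why the manual version needs an explicit entry for `input`. The assign style keeps everything by default.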

With this technique, we can propagate whatever we want through our chain system:

In [7]:
%%time
## ^^ This cell is timed, which will print out how long it all took

from langchain_core.runnables import RunnableBranch, RunnablePassthrough
from langchain_core.runnables.passthrough import RunnableAssign
from functools import partial

big_chain = (
    PPrint()
    ## Manual mapping. Can be useful sometimes and inside branch chains
    | {'input' : lambda d: d.get('input'), 'grade' : zsc_chain}
    | PPrint()
    ## RunnableAssign passing. Better for running state chains by default
    | RunnableAssign({'generation' : gen_chain})
    | PPrint()
    ## Using the input and generation together
    | RunnableAssign({'combination' : (
        ChatPromptTemplate.from_template(
            "Consider the following passages:"
            "\nP1: {input}"
            "\nP2: {generation}"
            "\n\nCombine the ideas from both sentences into one simple one."
        )
        | instruct_llm
    )})
)

output = big_chain.invoke({
    "input" : "I get seasick, so I think I'll pass on the trip",
    "options" : ["car", "boat", "airplane", "bike", "unknown"]
})
pprint("Final Output: ", output)

CPU times: total: 78.1 ms
Wall time: 1.02 s


----

<br>

## **Part 2:** Running State Chain

The example above is just a toy example and, if anything, showcases the drawbacks of chaining many LLM calls together for internal under-the-hood reasoning. However, the ability to keep information flowing through a chain is invaluable for making complex chains that can accumulate useful state information or operate in a multi-pass capacity.

Specifically, a very simple but effective chain is a **Running State Chain** which enforces the following properties:
- A **"running state"** is a dictionary that contains all of the variables that the system cares about.
- A **"branch"** is a chain that can pull in the running state and reduce it to a response.
- A **branch** can only be run inside a **RunnableAssign** scope, and the branch's inputs should come from the **running state**.

> <img src="https://dli-lms.s3.amazonaws.com/assets/s-fx-15-v1/imgs/running_state_chain.png" width=1000px/>
<!-- > <img src="https://drive.google.com/uc?export=view&id=1Oo7AauYGj4dxepNReRG2JezmvQLyqXsN" width=1000px/> -->

You can think of the running state chain abstraction as a functional variant of a Pythonic class with state variables (or attributes) and functions (or methods).
- The chain is like the abstract class that wraps all of the functionality.
- The running state is like the attributes (which should always be accessible).
- The branches are like the class methods (which can pick and choose which attributes to use).
- The `.invoke` or similar process is like the `__call__` method that runs through the branches in order.

**By forcing this paradigm in your chains:**
- You can keep state variables propagating through your chain, allowing your internals to access whatever is necessary and accumulating state values for use later.
- You can also pass the outputs of your chain back through as your inputs, allowing a "while-loop"-style chain that keeps updating and building on your running state.
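
The two properties above can be sketched without any LLM at all. The following is a plain-Python illustration, not LangChain code: `assign`, `invoke`, `count_turn`, and `summarize` are hypothetical names, and each "branch" is just a function that reads the running state and returns one value.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]

def assign(key: str, branch: Callable[[State], Any]) -> Callable[[State], State]:
    """Wrap a branch so its result is merged into the running state under `key`."""
    def step(state: State) -> State:
        return {**state, key: branch(state)}
    return step

def count_turn(state: State) -> int:
    # Reads a value accumulated on earlier passes (defaults to 0 on the first one)
    return state.get("turns", 0) + 1

def summarize(state: State) -> str:
    # Reads both the fresh input and the value written by the previous branch
    return f"{state['turns']} message(s); last: {state['input']}"

chain: List[Callable[[State], State]] = [
    assign("turns", count_turn),
    assign("summary", summarize),
]

def invoke(chain: List[Callable[[State], State]], state: State) -> State:
    """Run the branches in order, threading the running state through each one."""
    for step in chain:
        state = step(state)
    return state

# "While-loop"-style usage: the output state is fed back in with each new input
state: State = {}
for msg in ["hello", "where is my flight?"]:
    state = invoke(chain, {**state, "input": msg})

print(state)
```

Because every branch writes to its own key, later branches and later passes can read anything that was written before them.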

The rest of this notebook will include two exercises that flesh out the running state chain abstraction for two additional use-cases: **Knowledge Bases** and **Database-Querying Chatbots**.

----

<br>

## **Part 3:** Implementing a Knowledge Base with Running State Chain

After understanding the basic structure and principles of a Running State Chain, we can explore how this approach can be extended to manage more complex tasks, particularly in creating dynamic systems that evolve through interaction. This section will focus on implementing a **knowledge base** accumulated using **json-enabled slot filling**:

- **Knowledge Base:** A store of information that's relevant for our LLM to keep track of.
- **JSON-Enabled Slot Filling:** The technique of asking an instruction-tuned model to output a json-style format (which can include a dictionary) with a selection of slots, relying on the LLM to fill these slots with useful and relevant information.
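
As a rough illustration of the slot-filling idea, the sketch below parses a JSON string into a fixed set of slots using only the standard library. The `Slots` dataclass, `fill_slots`, and the `fake_output` string are all hypothetical; no model is called, and the Pydantic-based approach in the rest of this section adds type validation that this sketch lacks.

```python
import json
from dataclasses import dataclass, field, fields
from typing import List

@dataclass
class Slots:
    # Each slot has a default so that missing keys do not break construction
    topic: str = "general"
    action_items: List[str] = field(default_factory=list)

def fill_slots(llm_output: str) -> Slots:
    """Parse a JSON string and keep only the keys that match known slots."""
    data = json.loads(llm_output)
    known = {f.name for f in fields(Slots)}
    return Slots(**{k: v for k, v in data.items() if k in known})

# Hand-written stand-in for what an instruction-tuned model might return
fake_output = '{"topic": "flowers", "action_items": ["buy orchids"], "mood": "happy"}'

slots = fill_slots(fake_output)
print(slots)
```

Note that the unexpected `mood` key is silently dropped here; whether to drop, reject, or log such keys is a design choice.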

<br>

#### **Defining Our Knowledge Base**

To build a responsive and intelligent system, we need a method that not only processes inputs but also retains and updates essential information through the flow of conversation. This is where the combination of LangChain and Pydantic becomes pivotal. [**Pydantic**](https://docs.pydantic.dev/latest/), a popular Python validation library, is instrumental in structuring and validating data models. As one of its features, Pydantic offers structured "model" classes that validate objects (data, classes, themselves, etc.) with simplified syntax and deep rabbit holes of customization options. This framework is used throughout LangChain and comes up as a necessary component for use cases that involve data coercion.

One thing that a "model" is very good for is defining a class with expected arguments and some special ways to validate them! In this course, we won't focus too much on the validation scripts, but those interested can start by checking out the [**Pydantic Validator guide**](https://docs.pydantic.dev/1.10/usage/validators/) (though the topics do get pretty deep pretty fast). For our purposes, we can construct a `BaseModel` class and define some `Field` variables to create a structured **Knowledge Base** like so:

In [10]:
from pydantic import BaseModel, Field
from typing import Dict, Union, Optional
from langserve import RemoteRunnable

# instruct_chat = RemoteRunnable("http://localhost:9012/basic_chat/")
instruct_chat = ChatNVIDIA(model="mistralai/mistral-7b-instruct-v0.2")

class KnowledgeBase(BaseModel):
    ## Fields of the BaseModel, which will be validated/assigned when the knowledge base is constructed
    topic: str = Field('general', description="Current conversation topic")
    user_preferences: Dict[str, Union[str, int]] = Field({}, description="User preferences and choices")
    session_notes: list = Field([], description="Notes on the ongoing session")
    unresolved_queries: list = Field([], description="Unresolved user queries")
    action_items: list = Field([], description="Actionable items identified during the conversation")

print(repr(KnowledgeBase(topic = "Travel")))

KnowledgeBase(topic='Travel', user_preferences={}, session_notes=[], unresolved_queries=[], action_items=[])


<br>

The true strength of this approach lies in the additional LLM-centric functionalities provided by LangChain which we can integrate for our use-cases. One such feature is the `PydanticOutputParser` which enhances the Pydantic objects with capabilities like automatic format instruction generation.

In [16]:
from langchain_core.output_parsers import PydanticOutputParser

instruct_string = PydanticOutputParser(pydantic_object=KnowledgeBase).get_format_instructions()
pprint(instruct_string)

This functionality generates instructions for creating valid inputs to the Knowledge Base, which in turn helps the LLM by providing a concrete one-shot example of the desired output format.

<br>

#### **Runnable Extraction Module**

Knowing that we have this Pydantic object which can be used to generate good LLM instructions, we can make a Runnable that wraps the functionality of our Pydantic class and streamlines the prompting, generating, and updating of the knowledge base:

In [17]:
from langchain_core.runnables.passthrough import RunnableAssign

################################################################################
## Definition of RExtract
def RExtract(pydantic_class, llm, prompt):
    '''
    Runnable Extraction module
    Returns a knowledge dictionary populated by slot-filling extraction
    '''
    parser = PydanticOutputParser(pydantic_object=pydantic_class)
    instruct_merge = RunnableAssign({'format_instructions' : lambda x: parser.get_format_instructions()})
    def preparse(string):
        if '{' not in string: string = '{' + string
        if '}' not in string: string = string + '}'
        string = (string
            .replace("\\_", "_")
            .replace("\n", " ")
            .replace("\\]", "]")  ## escaped backslash avoids an invalid-escape SyntaxWarning
            .replace("\\[", "[")
        )
        print(string)  ## Good for diagnostics
        return string
    return instruct_merge | prompt | llm | preparse | parser

################################################################################
## Practical Use of RExtract

parser_prompt = ChatPromptTemplate.from_template(
    "Update the knowledge base: {format_instructions}. Only use information from the input."
    "\n\nNEW MESSAGE: {input}"
)

extractor = RExtract(KnowledgeBase, instruct_llm, parser_prompt)

knowledge = extractor.invoke({'input' : "I love flowers so much! The orchids are amazing! Can you buy me some?"})
pprint(knowledge)


<br>

Do keep in mind that this process can fail due to the fuzzy nature of LLM prediction, especially with models that are not optimized for instruction-following! For this process, it's important to have a strong instruction-following LLM with extra checks and graceful failure routines. 
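
One simple form of graceful failure is to keep the previous knowledge base whenever the model's output cannot be parsed. The sketch below shows the idea with the standard library only; `safe_update` is a hypothetical helper, not part of LangChain, and a fuller version might also retry the request or log the bad output.

```python
import json
from typing import Any, Dict

def safe_update(old_state: Dict[str, Any], llm_output: str) -> Dict[str, Any]:
    """Merge parsed JSON into the old state, or return the old state unchanged on failure."""
    try:
        new = json.loads(llm_output)
    except json.JSONDecodeError:
        return old_state  # unparseable output: keep what we already know
    if not isinstance(new, dict):
        return old_state  # valid JSON, but not an object with slots
    # Only accept keys the knowledge base already defines
    return {**old_state, **{k: v for k, v in new.items() if k in old_state}}

old = {"topic": "general", "notes": []}

good = safe_update(old, '{"topic": "travel", "extra": 1}')
bad = safe_update(old, "Sure! Here is the JSON you asked for")

print(good)
print(bad)
```

Falling back to the old state means one bad generation costs you one turn of updates rather than the whole conversation history.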

<br>

#### **Dynamic Knowledge Base Updates**

Finally, we can create a system that continually updates the Knowledge Base throughout the conversation. This is done by feeding the current state of the Knowledge Base, along with new user inputs, back into the system for ongoing updates.

The following example system shows both the power of this formulation for filling in details and the limitation of assuming that slot-filling performance will be as good as general response performance:

In [13]:
class KnowledgeBase(BaseModel):
    firstname: str = Field('unknown', description="Chatting user's first name, unknown if unknown")
    lastname: str = Field('unknown', description="Chatting user's last name, unknown if unknown")
    location: str = Field('unknown', description="Where the user is located")
    summary: str = Field('unknown', description="Running summary of conversation. Update this with new input")
    response: str = Field('unknown', description="An ideal response to the user based on their new message")


parser_prompt = ChatPromptTemplate.from_template(
    "You are chatting with a user. The user just responded ('input'). Please update the knowledge base."
    " Record your response in the 'response' tag to continue the conversation."
    " Do not hallucinate any details, and make sure the knowledge base is not redundant."
    " Update the entries frequently to adapt to the conversation flow."
    "\n{format_instructions}"
    "\n\nOLD KNOWLEDGE BASE: {know_base}"
    "\n\nNEW MESSAGE: {input}"
    "\n\nNEW KNOWLEDGE BASE:"
)

## Switch to a more powerful base model
instruct_llm = ChatNVIDIA(model="mistralai/mixtral-8x22b-instruct-v0.1") | StrOutputParser()

extractor = RExtract(KnowledgeBase, instruct_llm, parser_prompt)
info_update = RunnableAssign({'know_base' : extractor})

## Initialize the knowledge base and see what you get
state = {'know_base' : KnowledgeBase()}
state['input'] = "My name is Carmen Sandiego! Guess where I am! Hint: It's somewhere in the United States."
state = info_update.invoke(state)
pprint(state)



{   "firstname": "Carmen",   "lastname": "Sandiego",   "location": "unknown",   "summary": "The user introduced themselves as Carmen Sandiego and asked for a guess on their location within the United States, providing a hint.",   "response": "Welcome, Carmen Sandiego! I'm excited to try and guess your location. Since you mentioned it's somewhere in the United States, I'll start there. Is it by any chance in New York?" }


In [20]:
class KnowledgeBase(BaseModel):
    total_score: str = Field('unknown', description="Student Score")
    weakest_subject: str = Field('unknown', description="Student's lowest performing subject")
    strongest_subject: str = Field('unknown', description="Student's highest performing subject")
    teacher_comment: str = Field('unknown', description="Running summary of student performance")
    response: str = Field('unknown', description="An ideal response to the user based on their new message")


parser_prompt = ChatPromptTemplate.from_template(
    "You are chatting with a user. The user just responded ('input'). Please update the knowledge base."
    " Record your response in the 'response' tag to continue the conversation."
    " Do not hallucinate any details, and make sure the knowledge base is not redundant."
    " Update the entries frequently to adapt to the conversation flow and reflect the student's performance."
    "\n{format_instructions}"
    "\n\nOLD KNOWLEDGE BASE: {know_base}"
    "\n\nNEW MESSAGE: {input}"
    "\n\nNEW KNOWLEDGE BASE:"
)

## Switch to a more powerful base model
instruct_llm = ChatNVIDIA(model="mistralai/mixtral-8x22b-instruct-v0.1") | StrOutputParser()

extractor = RExtract(KnowledgeBase, instruct_llm, parser_prompt)
info_update = RunnableAssign({'know_base' : extractor})

## Initialize the knowledge base and see what you get
state = {'know_base' : KnowledgeBase()}
student_summary = (
    "Semester 1: Maths 80/100, Physics 70/100, Biology 65/100, Chemistry 75/100, English 85/100. "
    "Semester 2: Maths 90/100, Physics 68/100, Biology 78/100, Chemistry 70/100, English 88/100. "
    "Attendance S1: 92%, S2: 95%. Notes: strong improvement in Maths and Biology, slight decline in Physics and Chemistry, good class participation, needs targeted physics practice."
)
state['input'] = student_summary
state = info_update.invoke(state)
pprint(state)



{   "total_score": "unknown",   "weakest_subject": "Physics",   "strongest_subject": "Maths or English",   "teacher_comment": "The student has shown strong improvement in Maths and Biology in Semester 2, however there has been a slight decline in Physics and Chemistry. The student has a good class participation and a high attendance rate. The student could benefit from targeted Physics practice.",   "response": "You've made great progress in Maths and Biology! However, it seems Physics and Chemistry need more attention. Let's plan some targeted practice for Physics to boost your scores. Keep up the good class participation and attendance!" }


In [22]:
state['input'] = "Total Score is the average of all subjects' score. Semester 1: Maths 80/100, Physics 70/100, Biology 65/100, Chemistry 75/100, English 85/100."
state = info_update.invoke(state)
pprint(state)

{"total_score": "77", "weakest_subject": "Physics", "strongest_subject": "Maths or English", "teacher_comment": "The student has shown strong improvement in Maths and Biology in Semester 2, however there has been a slight decline in Physics and Chemistry. The student has a good class participation and a high attendance rate. The student could benefit from targeted Physics practice.", "response": "Thanks for sharing your Semester 1 scores. Your average score across all subjects is 77. Your strongest subjects seem to be Maths and English, with scores of 80 and 85 respectively, while Physics is your weakest with a score of 70. In Semester 2, you've improved in Maths and Biology, but Physics and Chemistry could still use some work. Let's plan some targeted practice for Physics to boost your scores. Keep up the good class participation and attendance!"}


In [23]:
state['input'] = "Yeah, my strongest subject is Maths... How did you know? Which other subjects am I good or bad at?"
state = info_update.invoke(state)
pprint(state)

{   "total_score": "77",   "weakest_subject": "Physics",   "strongest_subject": "Maths or English",   "teacher_comment": "The student has shown strong improvement in Maths and Biology in Semester 2, however there has been a slight decline in Physics and Chemistry. The student has a good class participation and a high attendance rate. The student could benefit from targeted Physics practice.",   "response": "From your Semester 1 scores, I could see that your strongest subjects are Maths and English, with scores of 80 and 85 respectively. Your weakest subject is Physics, with a score of 70. You've been showing improvement in Maths and Biology in Semester 2, but Physics and Chemistry could still use some work. Let's plan some targeted practice for Physics to boost your scores. Keep up the good class participation and attendance!" }


In [25]:
state['input'] = "What are areas I can improve on and how? and what is are my scores for all subjects in the last semester?"
state = info_update.invoke(state)
pprint(state)

{"total_score": "77", "weakest_subject": "Physics", "strongest_subject": "Maths or English", "teacher_comment": "The student has shown strong improvement in Maths and Biology in Semester 2, however there has been a slight decline in Physics and Chemistry. The student has a good class participation and a high attendance rate. The student could benefit from targeted Physics practice.", "response": "You have shown improvement in Maths and Biology. For areas of improvement, consider Physics and Chemistry. In Physics, targeted practice could be beneficial. For Chemistry, revising challenging concepts would be helpful. Consistent practice and revision will strengthen your understanding and improve your scores. Your scores for the last semester are as follows: Maths or English - High, Biology - Improved, Physics - Low, Chemistry - Could be better."}


<br>

This example demonstrates how a running state chain can be effectively utilized to manage a conversation with evolving context and requirements, making it a powerful tool for developing sophisticated interactive systems.

The next sections of this notebook will expand on these concepts by exploring two specific applications: **Document Knowledge Bases** and **Database-Querying Chatbots**.

----

<br>

## **Part 4: [Exercise]** Airline Customer Service Bot

In this exercise, we can expand on the tools we've learned about to implement a simple but effective dialog manager chatbot. For this exercise, we will make an airline support bot that wants to help a client find out about their flight!

Let's create a simple database-like interface to get some customer information from a dictionary!

In [4]:
#######################################################################################
## Function that can be queried for information. Implementation details not important
def get_flight_info(d: dict) -> str:
    """
    Example of a retrieval function which takes a dictionary as key. Resembles SQL DB Query
    """
    req_keys = ['first_name', 'last_name', 'confirmation']
    assert all((key in d) for key in req_keys), f"Expected dictionary with keys {req_keys}, got {d}"

    ## Static dataset. get_key and get_val can be used to work with it, and db is your variable
    keys = req_keys + ["departure", "destination", "departure_time", "arrival_time", "flight_day"]
    values = [
        ["Jane", "Doe", 12345, "San Jose", "New Orleans", "12:30 PM", "9:30 PM", "tomorrow"],
        ["John", "Smith", 54321, "New York", "Los Angeles", "8:00 AM", "11:00 AM", "Sunday"],
        ["Alice", "Johnson", 98765, "Chicago", "Miami", "7:00 PM", "11:00 PM", "next week"],
        ["Bob", "Brown", 56789, "Dallas", "Seattle", "1:00 PM", "4:00 PM", "yesterday"],
    ]
    get_key = lambda d: "|".join([d['first_name'], d['last_name'], str(d['confirmation'])])
    get_val = lambda l: {k:v for k,v in zip(keys, l)}
    db = {get_key(get_val(entry)) : get_val(entry) for entry in values}

    # Search for the matching entry
    data = db.get(get_key(d))
    if not data:
        return (
            f"Based on {req_keys} = {get_key(d)}) from your knowledge base, no info on the user flight was found."
            " This process happens every time new info is learned. If it's important, ask them to confirm this info."
        )
    return (
        f"{data['first_name']} {data['last_name']}'s flight from {data['departure']} to {data['destination']}"
        f" departs at {data['departure_time']} {data['flight_day']} and lands at {data['arrival_time']}."
    )

#######################################################################################
## Usage example. Actually important

print(get_flight_info({"first_name" : "Jane", "last_name" : "Doe", "confirmation" : 12345}))

Jane Doe's flight from San Jose to New Orleans departs at 12:30 PM tomorrow and lands at 9:30 PM.


In [5]:
print(get_flight_info({"first_name" : "Alice", "last_name" : "Johnson", "confirmation" : 98765}))

Alice Johnson's flight from Chicago to Miami departs at 7:00 PM next week and lands at 11:00 PM.


In [6]:
print(get_flight_info({"first_name" : "Bob", "last_name" : "Brown", "confirmation" : 27494}))

Based on ['first_name', 'last_name', 'confirmation'] = Bob|Brown|27494) from your knowledge base, no info on the user flight was found. This process happens every time new info is learned. If it's important, ask them to confirm this info.


<br>

This is a really good interface to bring up because it can reasonably serve two purposes:
- It can be used to provide up-to-date information from an external environment (a database) regarding a user's situation.
- It can also be used as a hard gating mechanism to prevent unauthorized disclosure of sensitive information (since that would be very bad).

If our network had access to this kind of interface, it would be able to query for and retrieve this information on a user's behalf! For example:

In [20]:
external_prompt = ChatPromptTemplate.from_template(
    "You are a SkyFlow chatbot, and you are helping a customer with their issue."
    " Please help them with their question, remembering that your job is to represent SkyFlow airlines."
    " Assume SkyFlow uses industry-average practices regarding arrival times, operations, etc."
    " (This is a trade secret. Do not disclose)."  ## soft reinforcement
    " Please keep your discussion short and sweet if possible. Avoid saying hello unless necessary."
    " The following is some context that may be useful in answering the question."
    "\n\nContext: {context}"
    "\n\nUser: {input}"
)

basic_chain = external_prompt | instruct_llm

basic_chain.invoke({ 
    'input' : 'Can you please tell me when I need to get to the airport?',
    'context' : get_flight_info({"first_name" : "Jane", "last_name" : "Doe", "confirmation" : 12345}),
})


<br>

This is interesting enough, but how do we actually get this system working in the wild? It turns out that we can use the KnowledgeBase formulation from above to supply this kind of information like so:

In [None]:
from pydantic import BaseModel, Field
from typing import Dict, Union

class KnowledgeBase(BaseModel):
    first_name: str = Field('unknown', description="Chatting user's first name, `unknown` if unknown")
    last_name: str = Field('unknown', description="Chatting user's last name, `unknown` if unknown")
    confirmation: int = Field(-1, description="Flight Confirmation Number, `-1` if unknown")
    discussion_summary: str = Field("", description="Summary of discussion so far, including locations, issues, etc.")
    open_problems: list = Field([], description="Topics that have not been resolved yet")
    current_goals: list = Field([], description="Current goal for the agent to address")

def get_key_fn(base: BaseModel) -> dict:
    '''Given a knowledge base, return the key dictionary expected by get_flight_info'''
    return {  ## More automatic options possible, but this is more explicit
        'first_name' : base.first_name,
        'last_name' : base.last_name,
        'confirmation' : base.confirmation,
    }

know_base = KnowledgeBase(first_name = "Jane", last_name = "Doe", confirmation = 12345)

# get_flight_info(get_key_fn(know_base))

get_key = RunnableLambda(get_key_fn)
(get_key | get_flight_info).invoke(know_base)

"Jane Doe's flight from San Jose to New Orleans departs at 12:30 PM tomorrow and lands at 9:30 PM."

<br>

### **Objective:**

You want a user to be able to invoke the following function call organically as part of a dialog exchange:

```python
get_flight_info({"first_name" : "Jane", "last_name" : "Doe", "confirmation" : 12345}) ->
    "Jane Doe's flight from San Jose to New Orleans departs at 12:30 PM tomorrow and lands at 9:30 PM."
```

`RExtract` is provided such that the following knowledge base syntax can be used:
```python
known_info = KnowledgeBase()
extractor = RExtract(KnowledgeBase, InstructLLM(), parser_prompt)
results = extractor.invoke({'info_base' : known_info, 'input' : 'My message'})
known_info = results['info_base']
```
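
To make the update pattern concrete without calling an LLM, here is a minimal sketch in which a regex-based function stands in for `RExtract`. The names `ToyKnowledgeBase` and `toy_extract`, as well as the regular expressions, are hypothetical stand-ins and not part of the course code. The only point is that each turn takes the old knowledge base plus a new message and returns an updated knowledge base that is carried into the next turn.

```python
import re
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ToyKnowledgeBase:
    """Hypothetical, simplified stand-in for the pydantic KnowledgeBase."""
    first_name: str = "unknown"
    last_name: str = "unknown"
    confirmation: int = -1

def toy_extract(state: dict) -> dict:
    """Return a new state whose 'info_base' reflects anything found in 'input'."""
    base, message = state["info_base"], state["input"]
    updates = {}
    name = re.search(r"my name is (\w+) (\w+)", message, flags=re.IGNORECASE)
    if name:
        updates["first_name"], updates["last_name"] = name.group(1), name.group(2)
    conf = re.search(r"confirmation\D*(\d+)", message, flags=re.IGNORECASE)
    if conf:
        updates["confirmation"] = int(conf.group(1))
    ## Fields that were not mentioned keep their previous values
    return {**state, "info_base": replace(base, **updates)}

known_info = ToyKnowledgeBase()
for message in ["How's it going?", "My name is Jane Doe", "My flight confirmation is 12345"]:
    results = toy_extract({"info_base": known_info, "input": message})
    known_info = results["info_base"]  ## Carry the updated base into the next turn

print(known_info)
```

An LLM-backed extractor plays the same role as `toy_extract` here, except that it can also fill in free-form fields such as the discussion summary.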

**Design a chatbot that implements the following features:**
- The bot should start off by making small-talk, possibly helping the user with non-sensitive queries which don't require any private info access.
- When the user starts to ask about things that are database-walled (both practically and legally), tell the user that they need to provide the relevant information.
- When the retrieval is successful, the agent will be able to talk about the database-walled information.

**This can be done with a variety of techniques, including the following:**
- **Prompt Engineering and Context Parsing**, where the overall chat prompt stays roughly the same but the context is manipulated to change agent behavior. For example, a failed database retrieval could be turned into an injection of natural-language instructions for how to resolve the problem, such as *`"Information could not be retrieved with keys {...}. Please ask the user for clarification or help them with known information."`*
- **"Prompt Passing,"** where the active prompts are passed around as state variables and can be overridden by monitoring chains.
- **Branching chains** such as [**`RunnableBranch`**](https://api.python.langchain.com/en/latest/core/runnables/langchain_core.runnables.branch.RunnableBranch.html) or more custom solutions that implement a conditional routing mechanism.
    - In the case of [`RunnableBranch`](https://api.python.langchain.com/en/latest/core/runnables/langchain_core.runnables.branch.RunnableBranch.html), a `switch` syntax of the style:
        ```python
        from langchain.schema.runnable import RunnableBranch
        RunnableBranch(
            ((lambda x: 1 in x), RPrint("Has 1 (didn't check 2): ")),
            ((lambda x: 2 in x), RPrint("Has 2 (not 1 though): ")),
            RPrint("Has neither 1 nor 2: ")
        ).invoke([2, 1, 3]);  ## -> Has 1 (didn't check 2): [2, 1, 3]
        ```
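
As a rough illustration of the context-parsing option, the sketch below turns a failed lookup into natural-language instructions for the agent instead of raising an error or returning nothing. `FLIGHTS` and `retrieve_context` are hypothetical names used only for this example; in the notebook, the lookup itself is handled by `get_flight_info`.

```python
## Hypothetical in-memory "database" used only for this sketch
FLIGHTS = {
    ("Jane", "Doe", 12345): (
        "Jane Doe's flight from San Jose to New Orleans departs at 12:30 PM "
        "tomorrow and lands at 9:30 PM."
    ),
}

def retrieve_context(key: dict) -> str:
    """Return the flight record, or instructions for the agent if the lookup fails."""
    lookup = (key.get("first_name"), key.get("last_name"), key.get("confirmation"))
    if lookup in FLIGHTS:
        return FLIGHTS[lookup]
    ## On failure, the "context" becomes guidance rather than data
    return (
        f"Information could not be retrieved with keys {key}. "
        "Please ask the user for clarification or help them with known information."
    )

hit = retrieve_context({"first_name": "Jane", "last_name": "Doe", "confirmation": 12345})
miss = retrieve_context({"first_name": "Jane", "last_name": "unknown", "confirmation": 12345})
print(hit)
print(miss)
```

Because both outcomes are plain strings, either one can be dropped into the same `{context}` slot of the chat prompt, so the prompt itself does not need to change.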

Some prompts and a gradio loop are provided that might help with the effort, but the agent will currently just hallucinate! Please implement the internal chain to try and retrieve the relevant information. Before trying to implement, look over the default behavior of the model and note how it might hallucinate or forget things.
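
Before wiring up the real components, it may help to see the shape of an internal chain in plain Python. The sketch below is a framework-free approximation, not LangChain's implementation: `assign` and `pipe` are hypothetical helpers that roughly mimic how an assign-style runnable adds keys to a state dictionary and how `|` chains stages, and the two `toy_*` getters are stand-ins for a knowledge-base extractor and a database lookup.

```python
from typing import Callable, Dict

State = Dict[str, object]

def assign(**steps: Callable[[State], object]) -> Callable[[State], State]:
    """Return a stage that copies the state and adds one key per step."""
    def run(state: State) -> State:
        ## Every step sees the incoming state; its result is stored under its key
        return {**state, **{key: fn(state) for key, fn in steps.items()}}
    return run

def pipe(*stages: Callable[[State], State]) -> Callable[[State], State]:
    """Return a stage that feeds the output of each stage into the next."""
    def run(state: State) -> State:
        for stage in stages:
            state = stage(state)
        return state
    return run

## Hypothetical stand-ins for the extractor and the database lookup
def toy_knowbase_getter(state: State) -> dict:
    base = dict(state["know_base"])
    if "12345" in state["input"]:
        base["confirmation"] = 12345
    return base

def toy_database_getter(state: State) -> str:
    conf = state["know_base"]["confirmation"]
    return f"Found booking {conf}" if conf == 12345 else "No booking found"

toy_internal_chain = pipe(
    assign(know_base=toy_knowbase_getter),  ## First update the knowledge base...
    assign(context=toy_database_getter),    ## ...then retrieve using the updated base
)

toy_state = {"know_base": {"confirmation": -1}, "input": "My confirmation is 12345"}
toy_state = toy_internal_chain(toy_state)
print(toy_state["know_base"], "|", toy_state["context"])
```

The ordering matters: the retrieval stage only succeeds because it runs after the stage that writes the updated knowledge base into the state.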

In [None]:
from operator import itemgetter
from langchain.schema.runnable import (
    RunnableBranch,
    RunnableLambda,
    RunnableMap,       ## Wrap an implicit "dictionary" runnable
    RunnablePassthrough,
)
from langchain.schema.runnable.passthrough import RunnableAssign

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import BaseMessage, SystemMessage, ChatMessage, AIMessage

from pydantic import BaseModel, Field
from typing import Iterable, Optional
import gradio as gr

external_prompt = ChatPromptTemplate.from_messages([
    ("system", (
        "You are a chatbot for SkyFlow Airlines, and you are helping a customer with their issue."
        " Please chat with them! Stay concise and clear!"
        " Your running knowledge base is: {know_base}."
        " This is for you only; Do not mention it!"
        " \nUsing that, we retrieved the following: {context}\n"
        " If they provide info and the retrieval fails, ask to confirm their first/last name and confirmation."
        " Do not ask them any other personal info." 
        " If it's not important to know about their flight, do not ask."
        " The checking happens automatically; you cannot check manually."
    )),
    ("assistant", "{output}"),
    ("user", "{input}"),
])

##########################################################################
## Knowledge Base Things

class KnowledgeBase(BaseModel):
    first_name: str = Field('unknown', description="Chatting user's first name, `unknown` if unknown")
    last_name: str = Field('unknown', description="Chatting user's last name, `unknown` if unknown")
    confirmation: Optional[int] = Field(None, description="Flight Confirmation Number, `null` if unknown")
    discussion_summary: str = Field("", description="Summary of discussion so far, including locations, issues, etc.")
    open_problems: str = Field("", description="Topics that have not been resolved yet")
    current_goals: str = Field("", description="Current goal for the agent to address")
    # user_retrieval_info: str = Field("", description="Information retrieved from the database about the user")

parser_prompt = ChatPromptTemplate.from_template(
    "You are a chat assistant representing the airline SkyFlow, and are trying to track info about the conversation."
    " You have just received a message from the user. Please fill in the schema based on the chat."
    "\n\n{format_instructions}"
    "\n\nOLD KNOWLEDGE BASE: {know_base}"
    "\n\nASSISTANT RESPONSE: {output}"
    "\n\nUSER MESSAGE: {input}"
    "\n\nNEW KNOWLEDGE BASE: "
)

## Your goal is to invoke the following through natural conversation
# get_flight_info({"first_name" : "Jane", "last_name" : "Doe", "confirmation" : 12345}) ->
#     "Jane Doe's flight from San Jose to New Orleans departs at 12:30 PM tomorrow and lands at 9:30 PM."

chat_llm = ChatNVIDIA(model="meta/llama-3.3-70b-instruct") | StrOutputParser()
instruct_llm = ChatNVIDIA(model="mistralai/mixtral-8x22b-instruct-v0.1") | StrOutputParser()

external_chain = external_prompt | chat_llm

#####################################################################################
## START TODO: Define the extractor and internal chain to satisfy the objective

## TODO: Make a chain that will populate your knowledge base based on provided context
knowbase_getter = RExtract(KnowledgeBase, instruct_llm, parser_prompt)

## TODO: Make a chain to pull d["know_base"] and outputs a retrieval from db
database_getter = itemgetter('know_base') | get_key | get_flight_info

# RunnableBranch(
#     ((lambda x: 1 in x), RPrint("Has 1 (didn't check 2): ")),
#     ((lambda x: 2 in x), RPrint("Has 2 (not 1 though): ")),
#     RPrint("Has neither 1 nor 2: ")
# ).invoke([2, 1, 3]);

## These components integrate to make your internal chain
internal_chain = (
    RunnableAssign({'know_base' : knowbase_getter})
    | RunnableAssign({'context' : database_getter})
)

## END TODO
#####################################################################################

state = {'know_base' : KnowledgeBase()}

def chat_gen(message, history=[], return_buffer=True):

    ## Pulling in, updating, and printing the state
    global state
    state['input'] = message
    state['history'] = history
    state['output'] = "" if not history else history[-1][1]

    ## Generating the new state from the internal chain
    state = internal_chain.invoke(state)
    print("State after chain run:")
    pprint({k:v for k,v in state.items() if k != "history"})
    
    ## Streaming the results
    buffer = ""
    for token in external_chain.stream(state):
        buffer += token
        yield buffer if return_buffer else token

def queue_fake_streaming_gradio(chat_stream, history=None, max_questions=8):
    history = [] if history is None else history  ## Avoid a shared mutable default list

    ## Mimic of the gradio initialization routine, where a set of starter messages can be printed off
    for human_msg, agent_msg in history:
        if human_msg: print("\n[ Human ]:", human_msg)
        if agent_msg: print("\n[ Agent ]:", agent_msg)

    ## Mimic of the gradio loop with an initial message from the agent.
    for _ in range(max_questions):
        message = input("\n[ Human ]: ")
        print("\n[ Agent ]: ")
        history_entry = [message, ""]
        for token in chat_stream(message, history, return_buffer=False):
            print(token, end='')
            history_entry[1] += token
        history += [history_entry]
        print("\n")

## history is of format [[User response 0, Bot response 0], ...]
chat_history = [[None, "Hello! I'm your SkyFlow agent! How can I help you?"]]

## Simulating the queueing of a streaming gradio interface, using python input
queue_fake_streaming_gradio(
    chat_stream = chat_gen,
    history = chat_history
)


[ Agent ]: Hello! I'm your SkyFlow agent! How can I help you?

[ Agent ]: 




{"first_name": "Jane", "last_name": "unknown", "confirmation": 12345, "discussion_summary": "", "open_problems": "", "current_goals": "", "user_retrieval_info": ""}
State after chain run:


Hi Jane, thanks for providing your confirmation number. I've tried to retrieve your flight information, but unfortunately, I couldn't find any details. Can you please confirm your last name to help me assist you better?


[ Agent ]: 
{   "first_name": "Jane",   "last_name": "Doe",   "confirmation": 12345,   "discussion_summary": "User provided confirmation number 12345, but no information was found. User confirmed last name as Doe.",   "open_problems": "Unable to find flight information for the provided confirmation number.",   "current_goals": "Assist user in retrieving their flight information.",   "user_retrieval_info": "" }
State after chain run:


I've confirmed your last name as Doe. I'll try to retrieve your flight information again. 

I've managed to find your flight. You are flying from San Jose to New Orleans tomorrow. Your flight departs at 12:30 PM and lands at 9:30 PM. Is there anything else I can help you with regarding your flight?


[ Agent ]: 
{"first_name": "Somtoo", "last_name": "Doe", "confirmation": 12345, "discussion_summary": "User provided confirmation number 12345, but no information was found initially. User confirmed last name as Doe. Flight information retrieved: flying from San Jose to New Orleans tomorrow, departing at 12:30 PM and landing at 9:30 PM. User provided correction for first name as Somtoo.", "open_problems": "", "current_goals": "", "user_retrieval_info": "Flight from San Jose to New Orleans tomorrow, departing at 12:30 PM and landing at 9:30 PM"}
State after chain run:


You've corrected your first name to Somtoo. I've updated the information. Just to confirm, your full name is Somtoo Doe, and your confirmation number is 12345, right?


[ Agent ]: 
{   "first_name": "Somtoo",   "last_name": "Doe",   "confirmation": 12345,   "discussion_summary": "User provided confirmation number 12345, but no information was found initially. User confirmed last name as Doe. Flight information retrieved: flying from San Jose to New Orleans tomorrow, departing at 12:30 PM and landing at 9:30 PM. User provided correction for first name as Somtoo. User confirmed full name as Somtoo Doe and confirmation number as 12345.",   "open_problems": "",   "current_goals": "",   "user_retrieval_info": "Flight from San Jose to New Orleans tomorrow, departing at 12:30 PM and landing at 9:30 PM" }
State after chain run:


I've confirmed your details. You're flying from San Jose to New Orleans tomorrow, departing at 12:30 PM and landing at 9:30 PM. Is there anything else I can assist you with regarding your flight?



In [26]:
state = {'know_base' : KnowledgeBase()}

chatbot = gr.Chatbot(value=[[None, "Hello! I'm your SkyFlow agent! How can I help you?"]])
demo = gr.ChatInterface(chat_gen, chatbot=chatbot).queue().launch(debug=True, share=True)


<br>

----

<br>

**NOTE:**
- You may need to explicitly hit the STOP button and try to relaunch your gradio interface if it hangs up after an exception. This is a known Jupyter Notebook environment issue which should not be experienced in dedicated Gradio-running files.
- **Your chat directive is duplicated here for quick access:**
```python
## Your goal is to invoke the following through natural conversation
get_flight_info({
    "first_name" : "Jane",
    "last_name" : "Doe",
    "confirmation" : 12345,
}) -> "Jane Doe's flight from San Jose to New Orleans departs at 12:30 PM tomorrow and lands at 9:30 PM."
```
- **To confirm that your system works, you could try the following dialog or something similar:**
```
> How's it going?
> Can you tell me a bit about skyflow?
> Can you tell me about my flight?
> My name is Jane Doe and my flight confirmation is 12345
> Can you tell me when I should get to my flight?
```
- **Solutions To Exercises Can Be Found In The Solutions Directory.** This is the first exercise with a noted solution; solutions to exercises in later notebooks will also be found there.
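
If typing the test dialog by hand gets tedious, a small helper can replay a fixed script against any generator with the same call shape as `chat_gen`. This is a sketch under assumptions: `run_scripted_dialog` and `echo_stream` are hypothetical names, and `echo_stream` is a trivial stand-in used here so the example runs without an LLM.

```python
def run_scripted_dialog(chat_stream, messages, history=None):
    """Feed each message to chat_stream and collect [user, agent] pairs."""
    history = list(history or [])
    for message in messages:
        reply = ""
        for token in chat_stream(message, history, return_buffer=False):
            reply += token
        history.append([message, reply])
    return history

def echo_stream(message, history, return_buffer=True):
    """Hypothetical stand-in for chat_gen that streams a canned reply."""
    buffer = ""
    for token in ["You said: ", message]:
        buffer += token
        yield buffer if return_buffer else token

script = [
    "How's it going?",
    "Can you tell me a bit about skyflow?",
    "Can you tell me about my flight?",
    "My name is Jane Doe and my flight confirmation is 12345",
    "Can you tell me when I should get to my flight?",
]

transcript = run_scripted_dialog(echo_stream, script)
for user_msg, agent_msg in transcript:
    print("[ Human ]:", user_msg)
    print("[ Agent ]:", agent_msg)
```

With the notebook's own objects, the equivalent call would presumably be `run_scripted_dialog(chat_gen, script, chat_history)` after resetting `state`, since `chat_gen` keeps its knowledge base in a global variable.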

-----

<br>

## **Part 5:** Wrap-Up

The goal of this notebook was to introduce some more advanced LangChain material revolving around the use of knowledge bases and running state chains! The exercise here was pretty involved, so congrats on finishing it!

### <font color="#76b900">**Great Job!**</font>

### **Next Steps:**
1. **[Optional]** Revisit the **"Questions To Think About" Section** at the top of the notebook and think about some possible answers.
