### Implementing the Architecture:

```

            -------------                                               
            |           |                             
            | __start__ |         
            -------------
                  |
                  |  <----- Q1. How are you?,  Q2. What is the current GDP of USA?
                  |
                 \ /
            -------------
            |           |
            | supervisor|  <- The supervisor calls the LLM node for the Q1. How are you? then ends.
            -------------  <- The supervisor calls the RAG node for the Q2. What is the current GDP of USA?.
                /   \
               /     \
              /       \
        LLM CALL    RAG CALL    
             |         | 
             |         |
            \ /       \ /
    -------------     -------------            -------------
    |           |     |            |           |            |
    |  LLM      |     |   RAG      | ------->  |  Vector DB |  <---- Data (Eg.: USA Industry)
    -------------     -------------            --------------
            |               |
            |               |
            \               /
             \             /
             \ /         \ /
              -------------
              |           |
              |  __end__  |
              -------------
    
```
____________

#### Configuring the Model:

In [None]:
from langchain_google_genai import ChatGoogleGenerativeAI
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash")

output = model.invoke("hi")
output

AIMessage(content='Hi there! How can I help you today?', additional_kwargs={}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'model_name': 'gemini-1.5-flash', 'safety_ratings': []}, id='run--6bbb5cbb-3435-4840-9d3d-c9b9545e491f-0', usage_metadata={'input_tokens': 1, 'output_tokens': 11, 'total_tokens': 12, 'input_token_details': {'cache_read': 0}})

In [2]:
output.content

'Hi there! How can I help you today?'

_____________

#### Configuring the embedding model:

In [3]:
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en")

len(embeddings.embed_query("Hi"))


384

_________
#### Taking Data -> Embed -> Store in VDB

In [4]:
from langchain_community.document_loaders import TextLoader, DirectoryLoader
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

##### 1. Loading the data

In [5]:
loader = DirectoryLoader("/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2", glob="./*txt", loader_cls=TextLoader)

In [6]:
docs=loader.load()

In [7]:
docs

[Document(metadata={'source': '/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2/usa.txt'}, page_content="🇺🇸 Overview of the U.S. Economy\nThe United States of America possesses the largest economy in the world in terms of nominal GDP, making it the most powerful economic force globally. It operates under a capitalist mixed economy, where the private sector dominates, but the government plays a significant regulatory and fiscal role. With a population of over 335 million people and a high level of technological advancement, the U.S. economy thrives on a foundation of consumer spending, innovation, global trade, and financial services. It has a highly diversified structure with strong sectors in technology, healthcare, finance, real estate, defense, and agriculture.\n\nU.S. GDP – Size, Composition, and Global Share\nAs of 2024, the United States’ nominal GDP is estimated to be around $28 trillion USD, accounting for approximately 25% of the global economy. It ranks #

In [8]:
docs[0].page_content

"🇺🇸 Overview of the U.S. Economy\nThe United States of America possesses the largest economy in the world in terms of nominal GDP, making it the most powerful economic force globally. It operates under a capitalist mixed economy, where the private sector dominates, but the government plays a significant regulatory and fiscal role. With a population of over 335 million people and a high level of technological advancement, the U.S. economy thrives on a foundation of consumer spending, innovation, global trade, and financial services. It has a highly diversified structure with strong sectors in technology, healthcare, finance, real estate, defense, and agriculture.\n\nU.S. GDP – Size, Composition, and Global Share\nAs of 2024, the United States’ nominal GDP is estimated to be around $28 trillion USD, accounting for approximately 25% of the global economy. It ranks #1 in the world by nominal GDP, far ahead of China (which ranks 2nd). The U.S. GDP per capita is also among the highest, hover

##### 2. Chunking the Data

In [9]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=50)

In [10]:
new_docs = text_splitter.split_documents(documents=docs)

In [11]:
new_docs

[Document(metadata={'source': '/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2/usa.txt'}, page_content='🇺🇸 Overview of the U.S. Economy'),
 Document(metadata={'source': '/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2/usa.txt'}, page_content='The United States of America possesses the largest economy in the world in terms of nominal GDP, making it the most powerful economic force globally. It operates under a capitalist mixed economy,'),
 Document(metadata={'source': '/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2/usa.txt'}, page_content='It operates under a capitalist mixed economy, where the private sector dominates, but the government plays a significant regulatory and fiscal role. With a population of over 335 million people and a'),
 Document(metadata={'source': '/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2/usa.txt'}, page_content='a population of over 335 million people and a high level of

In [12]:
doc_string = [doc.page_content for doc in new_docs]

In [13]:
doc_string

['🇺🇸 Overview of the U.S. Economy',
 'The United States of America possesses the largest economy in the world in terms of nominal GDP, making it the most powerful economic force globally. It operates under a capitalist mixed economy,',
 'It operates under a capitalist mixed economy, where the private sector dominates, but the government plays a significant regulatory and fiscal role. With a population of over 335 million people and a',
 'a population of over 335 million people and a high level of technological advancement, the U.S. economy thrives on a foundation of consumer spending, innovation, global trade, and financial services.',
 'innovation, global trade, and financial services. It has a highly diversified structure with strong sectors in technology, healthcare, finance, real estate, defense, and agriculture.',
 'U.S. GDP – Size, Composition, and Global Share',
 'As of 2024, the United States’ nominal GDP is estimated to be around $28 trillion USD, accounting for approximately 

In [14]:
len(doc_string)

55
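
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping windows in plain Python. It is a simplification, not the splitter's actual algorithm: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs, newlines, and spaces, so its real chunks will differ. The helper name `sliding_chunks` and the sample string are assumptions made for illustration.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last four characters of the previous one, which is the same idea that makes the end of one chunk in `new_docs` reappear at the start of the next.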

##### 3. Storing the embedded data chunks in ChromaDB

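**Note:** If you see telemetry warnings like "Failed to send telemetry event", these are harmless and don't affect functionality. The cells below suppress stdout and stderr while the database is created so that the warnings do not clutter that step. They can still appear on later calls, such as retrieval.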

In [15]:
# Suppress ChromaDB telemetry warnings
import io
import sys
import contextlib

@contextlib.contextmanager
def suppress_stdout_stderr():
    """Context manager to suppress stdout and stderr temporarily"""
    old_stdout, old_stderr = sys.stdout, sys.stderr
    try:
        sys.stdout = sys.stderr = io.StringIO()
        yield
    finally:
        sys.stdout, sys.stderr = old_stdout, old_stderr

In [16]:
# Create ChromaDB while suppressing telemetry output

print("Creating ChromaDB vector database...")
with suppress_stdout_stderr():
    db = Chroma.from_documents(new_docs, embeddings)
    
print("✅ ChromaDB created successfully!")

Creating ChromaDB vector database...
✅ ChromaDB created successfully!


In [17]:
retriever = db.as_retriever(search_kwargs={"k": 3})

In [18]:
retriever.invoke("What is industrial growth of USA?")

Failed to send telemetry event CollectionQueryEvent: capture() takes 1 positional argument but 3 were given


[Document(metadata={'source': '/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2/usa.txt'}, page_content='Looking forward, the U.S. economy is expected to grow at a moderate pace, powered by innovation in AI, green energy, robotics, biotech, and quantum computing. The Biden administration’s Inflation'),
 Document(metadata={'source': '/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2/usa.txt'}, page_content='The U.S. economy remains the engine of global growth, backed by unmatched innovation, financial dominance, and a strong institutional framework. Its $28 trillion GDP and influence over global'),
 Document(metadata={'source': '/Users/sanyuktatuti/Documents/AGENTIC_AI_Krish_Naik/3-LangGraph/data2/usa.txt'}, page_content='The U.S. maintains its GDP growth through strong innovation, entrepreneurship, and investment in R&D. With companies like Apple, Google, Amazon, Microsoft, and Tesla leading global markets, the U.S.')]
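
What the retriever does with `k=3` can be sketched without any vector database: score every stored vector against the query vector and keep the three best. The snippet below is a toy illustration under stated assumptions. The three-dimensional vectors and their labels are invented, and cosine similarity is used as one common choice; the metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]


# Hypothetical embeddings (real ones from BAAI/bge-small-en have 384 dimensions)
toy_store = {
    "GDP growth and innovation": [0.9, 0.1, 0.0],
    "Trade and exports": [0.6, 0.4, 0.1],
    "Weather patterns": [0.0, 0.1, 0.9],
    "Healthcare sector": [0.3, 0.8, 0.2],
}
query = [1.0, 0.2, 0.0]
results = top_k(query, toy_store, k=3)
print(results)
```

The least similar entry ("Weather patterns") is dropped, just as the retriever above returns only the three closest chunks.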

______________________

#### LangGraph Workflow Creation

##### 1. Creating Pydantic Class for Data Input/Output Validation

In [62]:
import operator
from typing import List
from pydantic import BaseModel, Field
from typing import TypedDict, Annotated, Sequence
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser, PydanticOutputParser
from langchain_core.runnables import RunnablePassthrough
from langgraph.graph import StateGraph, START, END



In [20]:
class TopicSelectionParser(BaseModel):
    Topic: str = Field(description="Selected Topic")
    Reasoning: str = Field(description="Reasoning for selecting the topic")

In [22]:
parser = PydanticOutputParser(pydantic_object=TopicSelectionParser)

In [23]:
parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"Topic": {"description": "Selected Topic", "title": "Topic", "type": "string"}, "Reasoning": {"description": "Reasoning for selecting the topic", "title": "Reasoning", "type": "string"}}, "required": ["Topic", "Reasoning"]}\n```'


##### 2. We create ```Agentstate``` to carry the flow of input through different Nodes

In [24]:
class Agentstate(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]

##### 2.1 Understanding the concept of ```Agentstate```:

In [25]:
# Created a dictionary to store the state of the agent

AgentState={}

In [26]:
# Created a key called ```messages``` in the AgentState dictionary
# Assigned an empty list value to the key ```messages```

AgentState["messages"] = []

In [27]:
AgentState

{'messages': []}

In [28]:
# Appending values to the key ```messages``` of the AgentState dictionary

AgentState["messages"].append("Hi, how are you?")

In [29]:
AgentState

{'messages': ['Hi, how are you?']}

In [30]:
AgentState["messages"].append("What are you doing?")

In [31]:
AgentState

{'messages': ['Hi, how are you?', 'What are you doing?']}

In [32]:
AgentState["messages"].append("I hope everything is going well")

In [33]:
AgentState

{'messages': ['Hi, how are you?',
  'What are you doing?',
  'I hope everything is going well']}

In [34]:
# class Agentstate(TypedDict):
#    messages: Annotated[Sequence[BaseMessage], operator.add]

# class Agentstate(TypedDict):
# Agentstate is a dictionary type (TypedDict): a dict whose keys and value types are declared up front

# messages: Annotated[Sequence[BaseMessage], operator.add]
# Here we have a key called messages in the Agentstate dictionary

# Annotated[Sequence[BaseMessage], operator.add] is a type hint with extra metadata attached
# Annotated itself does not validate anything: the first argument is the type, the rest is metadata
# LangGraph reads the operator.add metadata as the reducer used to merge node updates into this key

# Sequence[BaseMessage] means that the value of the key messages should be a sequence of BaseMessage
# BaseMessage is the base class for messages (HumanMessage, AIMessage, ...)
# Note: this notebook passes plain strings, e.g. ['Hi, how are you?', 'What are you doing?', 'I hope everything is going well'],
# which works at runtime because TypedDict hints are not enforced

# operator.add on two lists concatenates them, so the messages a node returns
# are appended to the existing list instead of replacing it

# TypedDict is a class that represents a dictionary with typed keys and values
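
The role of `operator.add` in the annotation can be checked with the standard library alone. The sketch below uses a stand-in class, `DemoState`, with `str` in place of `BaseMessage` so that it runs without LangChain installed; both are assumptions for illustration. It pulls the metadata out of the type hint and applies it the way a reducer would be applied to an existing list and a node's update.

```python
import operator
from typing import Annotated, Sequence, TypedDict, get_type_hints


class DemoState(TypedDict):
    # Same shape as Agentstate, with str standing in for BaseMessage
    messages: Annotated[Sequence[str], operator.add]


# include_extras=True keeps the Annotated metadata instead of stripping it
hints = get_type_hints(DemoState, include_extras=True)
reducer = hints["messages"].__metadata__[0]

current = ["What is the GDP of the USA?"]  # what the state already holds
update = ["USA"]                           # what a node returns
merged = reducer(current, update)          # list + list concatenates

print(reducer is operator.add)
print(merged)
```

Because the lists are concatenated rather than replaced, every node only needs to return the new messages it produced.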

In [35]:
AgentState["messages"][-1]

# returns the last message in the list

'I hope everything is going well'

In [36]:
AgentState["messages"][0]

# returns the first message in the list

'Hi, how are you?'

##### 3. Creating Nodes

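When we create Nodes, each node function receives the graph state as its argument. Here the state is of type Agentstate, and each node returns a dictionary containing the messages it wants to add.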



In [37]:
from langchain_core.prompts import PromptTemplate

In [38]:
def function1(state: Agentstate):
    
    # Supervisor function - understands the context of the question, i.e. whether it is related to the USA or not

    # ["messages"][-1] fetches the last message in the list
    question = state["messages"][-1]

    print("Question: ", question)

    template = """Your task is to classify the given user query into one of the following categories: [USA, Not Related]. Only respond with the category name and nothing else.
    
    User Query: {question}
    
    Please respond in the following JSON format:
    {format_instructions}"""

    prompt = PromptTemplate(
        template=template,  # the classification prompt defined above
        input_variables=["question"],  # filled in when the chain is invoked
        partial_variables={"format_instructions": parser.get_format_instructions()}
    )

    chain = prompt | model | parser

    response = chain.invoke({"question": question})

    print("Parsed response", response)

    return {"messages": [response.Topic]}
    # Here we return only the Topic from the response; "Topic" is a field of the TopicSelectionParser Pydantic class we created earlier.



In [39]:
state={"messages": ["What is today's weather?"]}

function1(state)

Question:  What is today's weather?
Parsed response Topic='Not Related' Reasoning='The query is a general weather question, not specific to the USA.'


{'messages': ['Not Related']}

In [40]:
state={"messages": ["What is the GDP of the USA?"]}

function1(state)

Question:  What is the GDP of the USA?
Parsed response Topic='USA' Reasoning='The query explicitly asks for the GDP of the USA.'


{'messages': ['USA']}

In [41]:
def router(state: Agentstate):

    # Router function - decides whether to route to RAG or LLM based on the last message in the list returned by the function1

    print("-> Router ->")

    last_message=state["messages"][-1]
    print("Last message: ", last_message)

    if "usa" in last_message.lower():
        return "RAG Call"
    else:
        return "LLM Call"
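
One property of the substring check in `router` is worth seeing in isolation: `"usa" in ...` also matches inside longer words. The sketch below compares it with an exact match on the normalized topic. `route_exact` and the sample topics are hypothetical; since the supervisor is asked to return only `USA` or `Not Related`, the difference may never show up in practice.

```python
def route_substring(topic: str) -> str:
    """Same test as the router above: 'usa' anywhere in the text."""
    return "RAG Call" if "usa" in topic.lower() else "LLM Call"


def route_exact(topic: str) -> str:
    """Hypothetical stricter variant: the whole topic must be 'usa'."""
    normalized = topic.strip().lower()
    return "RAG Call" if normalized == "usa" else "LLM Call"


# "Thousand" contains the letters u-s-a, so the substring test fires on it
for topic in ["USA", "Not Related", "Thousand Islands"]:
    print(f"{topic!r}: substring={route_substring(topic)}, exact={route_exact(topic)}")
```

Either variant returns the same branch for the two categories the supervisor is expected to emit.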

In [42]:
def format_docs(docs):
    return "\n\n".join([doc.page_content for doc in docs])

In [43]:
def function2(state: Agentstate):

    # RAG FUNCTION
    
    print("-> RAG Call ->")
    
    # Get the question from state
    question = state["messages"][0]
    print("Question:", question)
    
    try:
        # First get relevant documents using the retriever
        docs = retriever.invoke(str(question))  # Ensure question is a string
        context = format_docs(docs)
        print("Retrieved relevant documents")

        # retriever.invoke(str(question)):
        # retriever is our ChromaDB retriever that we set up earlier with k=3 (meaning it will return the 3 most relevant documents)
        # str(question) converts the question to a string to ensure the retriever gets the right format
        # This function uses the HuggingFace embeddings (BAAI/bge-small-en) to:
        # Convert the question into an embedding vector
        # Find the most similar documents in the vector database
        # Return those documents

        # context = format_docs(docs):
        # docs is a list of Document objects returned by the retriever
        # format_docs() is our helper function that combines all the document contents into a single string
        # It joins the documents with newlines between them so they're readable
                
        # Create the prompt template
        prompt = PromptTemplate(
            template="""You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.

            Question: {question}
            Context: {context}

            Answer:""",
            input_variables=["question", "context"]
        )
        
        # Format the prompt with our inputs
        formatted_prompt = prompt.format(
            question=str(question),  # Ensure question is a string
            context=context
        )
        
        # Get response from the model
        response = model.invoke(formatted_prompt)
        
        return {"messages": [response.content]}
        
    except Exception as e:
        print(f"Error in RAG function: {str(e)}")
        return {"messages": ["I apologize, but I encountered an error while processing your question."]}

In [44]:
def function3(state: Agentstate):
    
    # LLM FUNCTION

    print("-> LLM Call ->")
    
    question = state["messages"][0]

    # Normal LLM Call

    complete_query = "Answer the following question with your knowledge of the real world. Following is the question: " + question
    response = model.invoke(complete_query)
    return {"messages": [response.content]}

##### 4. Creating StateGraphs

In [45]:
from langgraph.graph import StateGraph, END

##### 4.1 Creating Workflow

In [46]:
# Use the TypedDict class 'Agentstate' here, not the demo dictionary 'AgentState' from section 2.1

workflow = StateGraph(Agentstate)

In [47]:
workflow.add_node("Supervisor", function1)


<langgraph.graph.state.StateGraph at 0x33ce8ef90>

In [48]:
workflow.add_node("RAG", function2)

<langgraph.graph.state.StateGraph at 0x33ce8ef90>

In [49]:
workflow.add_node("LLM", function3)

<langgraph.graph.state.StateGraph at 0x33ce8ef90>

In [50]:
workflow.set_entry_point("Supervisor")

<langgraph.graph.state.StateGraph at 0x33ce8ef90>

In [51]:
workflow.add_conditional_edges(
    "Supervisor",
    router,
    {
        "RAG Call": "RAG",
        "LLM Call": "LLM"
    }
)

<langgraph.graph.state.StateGraph at 0x33ce8ef90>

In [52]:
workflow.add_edge("RAG", END)
workflow.add_edge("LLM", END)

<langgraph.graph.state.StateGraph at 0x33ce8ef90>

In [53]:
app = workflow.compile()

In [54]:
state={"messages": ["What is the meaning of Life?"]}

In [55]:
app.invoke(state)

Question:  What is the meaning of Life?
Parsed response Topic='Not Related' Reasoning='The query is a philosophical question and not related to the USA.'
-> Router ->
Last message:  Not Related
-> LLM Call ->


{'messages': ['What is the meaning of Life?',
  'Not Related',
  "There's no single, universally accepted answer to the meaning of life.  The meaning of life is a deeply personal and philosophical question, and different individuals and cultures have vastly different perspectives.\n\nSome common perspectives include:\n\n* **Nihilism:**  The belief that life is inherently without meaning or purpose.\n* **Existentialism:** The belief that individuals create their own meaning and purpose through their choices and actions.  There's no pre-ordained meaning; we must define it ourselves.\n* **Absurdism:** The belief that the search for meaning in a meaningless universe is inherently absurd, but that we should embrace this absurdity and live authentically despite it.\n* **Spiritual and Religious Beliefs:** Many religions offer answers about the meaning of life, often involving serving a higher power, following divine commandments, or achieving enlightenment or salvation.  These meanings often 

In [56]:
state={"messages": ["What is the GDP of the USA?"]}

app.invoke(state)

Question:  What is the GDP of the USA?
Parsed response Topic='USA' Reasoning='The query explicitly asks for the GDP of the USA.'
-> Router ->
Last message:  USA
-> RAG Call ->
Question: What is the GDP of the USA?


Retrieved relevant documents


{'messages': ['What is the GDP of the USA?',
  'USA',
  'The nominal GDP of the USA is approximately $28 trillion USD as of 2024.  This represents about 25% of the global economy.  It holds the #1 ranking worldwide by nominal GDP.']}

In [57]:
state={"messages": ["What is the weather in the USA right now?"]}

app.invoke(state)

Question:  What is the weather in the USA right now?
Parsed response Topic='USA' Reasoning='The query explicitly asks about the weather in the USA.'
-> Router ->
Last message:  USA
-> RAG Call ->
Question: What is the weather in the USA right now?
Retrieved relevant documents


{'messages': ['What is the weather in the USA right now?',
  'USA',
  "I don't know.  The provided text focuses on the US economy, not the current weather."]}

In [58]:
state={"messages": ["Can you tell me the industrial growth of world's most powerful economy?"]}

app.invoke(state)

Question:  Can you tell me the industrial growth of world's most powerful economy?
Parsed response Topic='USA' Reasoning="The query asks about the industrial growth of the world's most powerful economy, which is generally considered to be the USA."
-> Router ->
Last message:  USA
-> RAG Call ->
Question: Can you tell me the industrial growth of world's most powerful economy?
Retrieved relevant documents


Retrying langchain_google_genai.chat_models._chat_with_retry.<locals>._chat_with_retry in 2.0 seconds as it raised InternalServerError: 500 Internal error encountered..


{'messages': ["Can you tell me the industrial growth of world's most powerful economy?",
  'USA',
  "The provided text states the U.S. has a $28 trillion GDP and is the engine of global growth.  It doesn't give specific industrial growth figures, however.  Therefore, I cannot answer your question about the precise industrial growth rate."]}

In [59]:
state={"messages": ["Can you tell me the industrial growth of world's poor economy?"]}

app.invoke(state)

Question:  Can you tell me the industrial growth of world's poor economy?
Parsed response Topic='Not Related' Reasoning="The query asks about the industrial growth of the world's poor economy, which is a global economic issue not specific to the USA."
-> Router ->
Last message:  Not Related
-> LLM Call ->


{'messages': ["Can you tell me the industrial growth of world's poor economy?",
  'Not Related',
  'There\'s no single, easily quantifiable answer to "the industrial growth of the world\'s poor economies."  The term "poor economies" itself is broad and encompasses a vast diversity of nations at different stages of development, facing different challenges, and experiencing varying levels of industrialization.  However, we can make some general observations:\n\n* **Uneven Growth:** Industrial growth in poorer economies is highly uneven. Some countries have experienced significant industrial expansion, often driven by specific sectors like textiles, manufacturing, or resource extraction. Others remain largely agrarian, with limited industrial capacity.  This disparity is influenced by factors like access to capital, infrastructure, technology, education, and political stability.\n\n* **Shifting Manufacturing Centers:**  A significant portion of global manufacturing has shifted from develo

___________

#### Breakdown of the architecture

```
                            Input (user query)
                                |
                                |    "Hi"
                               \ /
                            -------------
       state =  --------->  |           |
    {"messages" :[]}        |Supervisor |   {"messages": ["Hi"]}
 we declare the state as:   |  Agent    |   We use ```Annotated``` just to describe the behaviour of the variable
class Agentstate(TypedDict) -------------
                    <state["messages"][-1]> => Hi   Topic = "Not Related" -|  Parser (PydanticOutputParser)
                                  |                     Reasoning ="..."      -|
                                  |
                                 \ / 
                            -------------
       state =  --------->  |           |
    {"messages" :           |   Router  |   
 ["Hi", "Not Related"]}     |           |   
                            -------------
                <state["messages"][-1]> => Not Related
                               /
                              /
                             /
                        LLM Call  
                            |
                            |
                         <code> <- Final Answer 
                     state = {"messages": ["Hi", "Not Related"]}
                     Model("Hi") ---> response {"messages": ["Hi", "Not Related", "Final Answer"]}  
                     Output {"messages": ["Hi", "Not Related", "Final Answer"]}

                     Thus, result = state["messages"][-1] => Final Answer
```
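
The same walk-through can be traced in plain Python with stub nodes in place of the model and the retriever. Everything prefixed with `fake_` is a hypothetical stand-in, and `apply_update` only imitates how the `operator.add` reducer merges a node's return value into the state; it is a sketch of the flow, not of LangGraph's internals.

```python
import operator


def fake_supervisor(state):
    """Stub classifier: stands in for the LLM + PydanticOutputParser chain."""
    question = state["messages"][-1]
    topic = "USA" if "usa" in question.lower() else "Not Related"
    return {"messages": [topic]}


def fake_llm(state):
    """Stub for the LLM node; answers the first message in the state."""
    return {"messages": [f"Final Answer to: {state['messages'][0]}"]}


def fake_rag(state):
    """Stub for the RAG node."""
    return {"messages": [f"Answer from documents for: {state['messages'][0]}"]}


def apply_update(state, update):
    """Merge a node's output into the state by list concatenation."""
    return {"messages": operator.add(state["messages"], update["messages"])}


def run(question):
    state = {"messages": [question]}
    state = apply_update(state, fake_supervisor(state))
    # Router: look at the last message, which is the supervisor's topic
    branch = fake_rag if "usa" in state["messages"][-1].lower() else fake_llm
    return apply_update(state, branch(state))


result = run("Hi")
print(result["messages"])
print(result["messages"][-1])
```

The final answer is always the last element of `messages`, while index `0` still holds the original question, which is why the RAG and LLM nodes read `state["messages"][0]`.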

________

### Assignment

1. Create a Supervisor node
2. Create a Router
3. Create the worker nodes: \
    3.1 LLM Call (llm node) \
    3.2 RAG (rag node) \
    3.3 Web Crawler (fetch the information in real time from the internet)
4. Create one more node to validate the generated output (explore the validation part)
5. If validation fails, go back to the supervisor, which then decides again what needs to be called next
6. Only once the validation passes, generate the final output.
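
As a starting point for steps 4 to 6, the control flow of "validate, and return to the supervisor on failure" can be sketched in plain Python before wiring it into a graph. All names below are hypothetical stubs, and the `max_attempts` cap is an added assumption so that a question no worker can answer does not loop forever.

```python
def run_with_validation(question, supervisor, workers, validator, max_attempts=3):
    """Ask the supervisor for a worker, validate the answer, retry on failure."""
    tried = []
    for attempt in range(1, max_attempts + 1):
        choice = supervisor(question, tried)
        answer = workers[choice](question)
        if validator(answer):
            return {"answer": answer, "worker": choice, "attempts": attempt}
        tried.append(choice)  # let the supervisor know this worker failed
    return {"answer": None, "worker": None, "attempts": max_attempts}


# Hypothetical stand-ins for the three worker nodes
workers = {
    "llm": lambda q: "",                    # pretend the LLM had no answer
    "rag": lambda q: "about $28 trillion",  # pretend the documents did
    "web": lambda q: "live result",
}


def supervisor(question, tried):
    """Stub policy: pick the first worker that has not failed yet."""
    for name in ("llm", "rag", "web"):
        if name not in tried:
            return name
    return "llm"


def validator(answer):
    """Stub check: accept any non-empty answer."""
    return bool(answer.strip())


result = run_with_validation("What is the GDP of the USA?", supervisor, workers, validator)
print(result)
```

In a graph, the same loop would correspond to a conditional edge from the validation node back to the supervisor, with the attempt counter kept in the state.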

_______