# LangGraph and LangSmith - Agentic RAG Powered by LangChain

In this notebook we'll complete the following tasks:

- 🤝 Breakout Room #1:
  1. Install required libraries
  2. Set Environment Variables
  3. Creating our Tool Belt
  4. Creating Our State
  5. Creating and Compiling A Graph!

- 🤝 Breakout Room #2:
  1. Evaluating the LangGraph Application with LangSmith
  2. Adding Helpfulness Check and "Loop" Limits
  3. LangGraph for the "Patterns" of GenAI

# 🤝 Breakout Room #1

## Part 1: LangGraph - Building Cyclic Applications with LangChain

LangGraph is a tool that leverages LangChain Expression Language (LCEL) to build coordinated multi-actor and stateful applications that include cyclic behaviour.

### Why Cycles?

In essence, we can think of a cycle in our graph as a more robust and customizable loop. It allows us to keep our application agent-forward while still giving the powerful functionality of traditional loops.

Because we use cycles rather than loops, we can also compose rather complex flows through our graph in a much more readable and natural fashion, effectively allowing us to recreate application flowcharts in code in an almost 1-to-1 fashion.

### Why LangGraph?

Beyond the agent-forward approach - we can easily compose and combine traditional "DAG" (directed acyclic graph) chains with powerful cyclic behaviour due to the tight integration with LCEL. This means it's a natural extension to LangChain's core offerings!

## Task 1:  Dependencies

We'll first install all our required libraries.

> NOTE: If you're running this locally - please skip this step.

In [None]:
!pip install -qU langchain langchain_openai langchain-community langgraph arxiv

## Task 2: Environment Variables

We'll want to set our OpenAI API key, our Tavily API key, and our LangSmith environment variables.

In [1]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

In [2]:
os.environ["TAVILY_API_KEY"] = getpass.getpass("TAVILY_API_KEY")

In [3]:
from uuid import uuid4

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = f"AIE5 - LangGraph - {uuid4().hex[0:8]}"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("LangSmith API Key: ")

## Task 3: Creating our Tool Belt

As is usually the case, we'll want to equip our agent with a toolbelt to help answer questions and add external knowledge.

There's a tonne of tools in the [LangChain Community Repo](https://github.com/langchain-ai/langchain/tree/master/libs/community/langchain_community/tools) but we'll stick to a couple just so we can observe the cyclic nature of LangGraph in action!

We'll leverage:

- [Tavily Search Results](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/tools/tavily_search/tool.py)
- [Arxiv](https://github.com/langchain-ai/langchain/tree/master/libs/community/langchain_community/tools/arxiv)

#### 🏗️ Activity #1:

Please add the tools to use into our toolbelt.

> NOTE: Each tool in our toolbelt should be a method.

In [4]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_community.tools.arxiv.tool import ArxivQueryRun

tavily_tool = TavilySearchResults(max_results=5)  # web search via the Tavily API, capped at 5 results

tool_belt = [
    tavily_tool,      # real-time web search for up-to-date information
    ArxivQueryRun(),  # search and retrieve papers from arXiv
]

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b>

The Langchain community tools library is a library of chat models, retrievers, tools / toolkits, document loaders, vector stores, embedding models, and more.

The objective of this notebook is to create an AI agent with a toolbelt to help the agent answer questions and add external knowledge.  

(1) To answer questions: For this, as per the code provided by Chris above, we are using the ArxivQueryRun tool, a retriever component. It enables users to search and retrieve scientific articles from arXiv.org, covering the disciplines of Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance, Statistics, Electrical Engineering, and Economics. It interacts with the arXiv API to fetch articles matching the query, which can be useful for applications requiring access to the latest scientific research.  

(2) To add external knowledge: For this, as per the code provided by Chris above, we are using the TavilySearchResults tool. Tavily's Search API is a search engine tailored specifically for AI agents (LLMs), delivering real-time, accurate, and factual results at speed. It enables users to retrieve up-to-date online information directly within the Langchain framework. It has parameters such as max_results, which indicates the maximum number of search results to be returned, search_depth, which indicates the depth of the search ("basic" or "advanced"), and many more.

</span>
</div>

### Model

Now we can set-up our model! We'll leverage the familiar OpenAI model suite for this example - but it's not *necessary* to use with LangGraph. LangGraph supports all models - though you might not find success with smaller models - as such, they recommend you stick with:

- OpenAI's GPT-3.5 and GPT-4
- Anthropic's Claude
- Google's Gemini

> NOTE: Because we're leveraging the OpenAI function calling API - we'll need to use OpenAI *for this specific example* (or any other service that exposes an OpenAI-style function calling API).

In [5]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o", temperature=0)

Now that we have our model set-up, let's "put on the tool belt", which is to say: We'll bind our LangChain formatted tools to the model in an OpenAI function calling format.

In [6]:
model = model.bind_tools(tool_belt) 
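
For intuition, binding tools amounts to sending the model a JSON description of each tool alongside the messages. The sketch below builds such a description by hand in the OpenAI function-calling layout. The helper name `describe_tool` and the description text are illustrative assumptions, not what `bind_tools` literally emits for the tools above.

```python
import json

def describe_tool(name: str, description: str, parameters: dict) -> dict:
    """Build a tool description in the OpenAI function-calling layout (hypothetical helper)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": parameters,
                "required": list(parameters),  # treat every listed argument as required
            },
        },
    }

# Illustrative wording only; the real tool ships its own description.
arxiv_schema = describe_tool(
    name="arxiv",
    description="Search arXiv for scientific papers.",
    parameters={"query": {"type": "string", "description": "Search query"}},
)
print(json.dumps(arxiv_schema, indent=2))
```

The name, description, and argument schema are all the model sees about a tool, which is why clear descriptions matter when several tools overlap.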

#### ❓ Question #1:

How does the model determine which tool to use?

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b>  

Using bind_tools allows the model to decide whether or not to call a tool based on the input prompt and on each tool's name, description, and argument schema. The model chooses whether to return one tool call, multiple tool calls, or no tool calls at all.  

Some models support a tool_choice parameter that gives us the ability to force the model to call a tool. We can pass the tool name that should be always called (tool_choice="tool_name").   

We can also set tool_choice="any" to force the model to call at least one tool, without specifying which specific tool.

The tool_choice parameter is useful when:
1. We know that a specific tool must be used for the best response.
2. We want to prevent the model from answering based on pre-trained knowledge only and enrich the model response.
3. We want to prioritize specific tools given the objective of the AI agent or overall application being built.

</span>
</div>

## Task 4: Putting the State in Stateful

Earlier we used this phrasing:

`coordinated multi-actor and stateful applications`

So what does that "stateful" mean?

To put it simply - we want to have some kind of object which we can pass around our application that holds information about what the current situation (state) is. Since our system will be constructed of many parts moving in a coordinated fashion - we want to be able to ensure we have some commonly understood idea of that state.

LangGraph leverages a `StateGraph` which uses an `AgentState` object to pass information between the various nodes of the graph.

There are more options than what we'll see below - but this `AgentState` object is a `TypedDict` with the key `messages`, whose value is a list of `BaseMessage` objects that is appended to whenever the state changes.

Let's think about a simple example to help understand exactly what this means (we'll simplify a great deal to try and clearly communicate what state is doing):

1. We initialize our state object:
  - `{"messages" : []}`
2. Our user submits a query to our application.
  - New State: `HumanMessage(#1)`
  - `{"messages" : [HumanMessage(#1)]}`
3. We pass our state object to an Agent node which is able to read the current state. It will use the last `HumanMessage` as input. It gets some kind of output which it will add to the state.
  - New State: `AgentMessage(#1, additional_kwargs {"function_call" : "WebSearchTool"})`
  - `{"messages" : [HumanMessage(#1), AgentMessage(#1, ...)]}`
4. We pass our state object to a "conditional node" (more on this later) which reads the last state to determine if we need to use a tool - which it can determine properly because of our provided object!
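
The walkthrough above can be sketched in plain Python. This is a simplified stand-in, not LangGraph's implementation: `append_messages` and `apply_update` are hypothetical names, messages are plain strings, and the real `add_messages` reducer does more (for example, it can replace a message that has a matching ID).

```python
from typing import Callable

def append_messages(existing: list, new: list) -> list:
    """Reducer: return the old messages followed by the new ones."""
    return existing + new

def apply_update(state: dict, update: dict, reducer: Callable[[list, list], list]) -> dict:
    """Merge a node's partial update into the state using the reducer."""
    return {"messages": reducer(state["messages"], update["messages"])}

# 1. Initialize the state.
state = {"messages": []}
# 2. The user submits a query.
state = apply_update(state, {"messages": ["HumanMessage(#1)"]}, append_messages)
# 3. The agent node reads the state and contributes its own message.
state = apply_update(state, {"messages": ["AgentMessage(#1)"]}, append_messages)
print(state)
```

Each node returns only its new messages; the reducer decides how they are folded into the shared state.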

In [7]:
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages
import operator
from langchain_core.messages import BaseMessage

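class AgentState(TypedDict):
  # add_messages is a reducer: messages returned by a node are appended rather than overwriting the list
  messages: Annotated[list, add_messages]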

## Task 5: It's Graphing Time!

Now that we have state, and we have tools, and we have an LLM - we can finally start making our graph!

Let's take a second to refresh ourselves about what a graph is in this context.

Graphs, also called networks in some circles, are a collection of connected objects.

The objects in question are typically called nodes, or vertices, and the connections are called edges.

Let's look at a simple graph.

![image](https://i.imgur.com/2NFLnIc.png)

Here, we're using the coloured circles to represent the nodes and the yellow lines to represent the edges. In this case, we're looking at a fully connected graph - where each node is connected by an edge to each other node.

If we were to think about nodes in the context of LangGraph - we would think of a function, or an LCEL runnable.

If we were to think about edges in the context of LangGraph - we might think of them as "paths to take" or "where to pass our state object next".

Let's create some nodes and expand on our diagram.

> NOTE: Due to the tight integration with LCEL - we can comfortably create our nodes in an async fashion!

In [8]:
from langgraph.prebuilt import ToolNode

def call_model(state):
  messages = state["messages"]
  response = model.invoke(messages)
  return {"messages" : [response]}

tool_node = ToolNode(tool_belt)

Now we have two total nodes. We have:

- `call_model` is a node that will...well...call the model
- `tool_node` is a node which can call a tool

Let's start adding nodes! We'll update our diagram along the way to keep track of what this looks like!


In [9]:
from langgraph.graph import StateGraph, END

uncompiled_graph = StateGraph(AgentState)

uncompiled_graph.add_node("agent", call_model) # Agent node is the call_model node (it calls the model)
uncompiled_graph.add_node("action", tool_node) # Action node is the tool_node (it calls a tool added to the model toolset)

<langgraph.graph.state.StateGraph at 0x23b7c9166d0>

Let's look at what we have so far:

![image](https://i.imgur.com/md7inqG.png)

Next, we'll add our entrypoint. All our entrypoint does is indicate which node is called first.

In [10]:
uncompiled_graph.set_entry_point("agent") # Adding the entrypoint to the graph

<langgraph.graph.state.StateGraph at 0x23b7c9166d0>

![image](https://i.imgur.com/wNixpJe.png)

Now we want to build a "conditional edge" which will use the output state of a node to determine which path to follow.

We can help conceptualize this by thinking of our conditional edge as a conditional in a flowchart!

Notice how our function simply checks whether the last message contains any `tool_calls`.

Then we create an edge where the origin node is our agent node and our destination node is *either* the action node or the END (finish the graph).

It's important to highlight that `add_conditional_edges` also accepts an optional third parameter (the mapping): a dictionary that maps the possible outputs of our conditional function to destination nodes. We omit it here because `should_continue` already returns a destination directly: either `"action"` (the name of our action node) or `END`.

In [11]:
def should_continue(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  return END

uncompiled_graph.add_conditional_edges(
    "agent",
    should_continue
)

<langgraph.graph.state.StateGraph at 0x23b7c9166d0>

Let's visualize what this looks like.

![image](https://i.imgur.com/8ZNwKI5.png)

Finally, we can add our last edge which will connect our action node to our agent node. This is because we *always* want our action node (which is used to call our tools) to return its output to our agent!

In [12]:
uncompiled_graph.add_edge("action", "agent") # Connecting Action node to Agent node means action's output is returned to the agent

<langgraph.graph.state.StateGraph at 0x23b7c9166d0>

Let's look at the final visualization.

![image](https://i.imgur.com/NWO7usO.png)

All that's left to do now is to compile our workflow - and we're off!

In [13]:
compiled_graph = uncompiled_graph.compile() # Compile the AI agent workflow

#### ❓ Question #2:

Is there any specific limit to how many times we can cycle?

If not, how could we impose a limit to the number of cycles?

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b> 

(1) By default, LangGraph sets a recursion limit of 25 steps before hitting a stop condition. This default limit is in place to prevent potential infinite loops and to manage resource consumption effectively. Source: [LangChain API Documentation](https://api.python.langchain.com/en/latest/_modules/langchain_core/runnables/config.html?utm_source=chatgpt.com)

(2) We can impose a different limit on the number of cycles by defining recursion_limit in the config passed when the graph is invoked. It sets the maximum number of steps allowed for the entire graph execution.  

Beyond the recursion limit, the graph itself places no cap on how many times we can cycle; the only other bound is running out of memory to hold the state object. 

We can also enforce our own limit inside the graph (count the number of iterations and compare it to the desired threshold).

</span>
</div>
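
In LangGraph the limit is supplied per call, for example `compiled_graph.invoke(inputs, config={"recursion_limit": 10})`, and exceeding it raises a `GraphRecursionError`. The idea behind such a guard can be sketched without any library; `StepLimitError` and `run_with_limit` below are hypothetical names used only for illustration.

```python
class StepLimitError(RuntimeError):
    """Raised when the loop exhausts its step budget (hypothetical name)."""

def run_with_limit(step, state, max_steps: int = 25):
    """Apply step(state) -> (new_state, done) until done, or fail after max_steps."""
    for _ in range(max_steps):
        state, done = step(state)
        if done:
            return state
    raise StepLimitError(f"no stop condition reached after {max_steps} steps")

def count_to_three(n):
    """A step function that finishes once the counter reaches 3."""
    n += 1
    return n, n >= 3

def never_done(n):
    """A step function that never signals completion."""
    return n + 1, False

result = run_with_limit(count_to_three, 0)

try:
    run_with_limit(never_done, 0, max_steps=5)
    limit_hit = False
except StepLimitError:
    limit_hit = True

print(result, limit_hit)
```

A hard failure like this is a safety net; a limit checked inside the graph can instead end the run gracefully with whatever answer is available.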

## Using Our Graph

Now that we've created and compiled our graph - we can call it *just as we'd call any other* `Runnable`!

Let's try out a few examples to see how it fares:

In [14]:
from langchain_core.messages import HumanMessage

inputs = {"messages" : [HumanMessage(content="Who is the current captain of the Winnipeg Jets?")]}

async for chunk in compiled_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_GzHzmsZzfwGnhoW5Q79uvy5L', 'function': {'arguments': '{"query":"current captain of the Winnipeg Jets 2023"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 162, 'total_tokens': 189, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_50cad350e4', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-d9fc6db5-2819-4673-8c15-3f4f85f88dad-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'current captain of the Winnipeg Jets 2023'}, 'id': 'call_GzHzmsZzfwGnhoW5Q79uvy5L', 'type': 'tool_call'}], usage_metadata={'input_tokens': 162, 'output_t

Let's look at what happened:

1. Our state object was populated with our request
2. The state object was passed into our entry point (agent node) and the agent node added an `AIMessage` to the state object and passed it along the conditional edge
3. The conditional edge received the state object, found the "tool_calls" `additional_kwarg`, and sent the state object to the action node
4. The action node ran the requested tool, added the tool's result (a `ToolMessage`) to the state object, and passed it along the edge to the agent node
5. The agent node added a response to the state object and passed it along the conditional edge
6. The conditional edge received the state object, could not find the "tool_calls" `additional_kwarg` and passed the state object to END where we see it output in the cell above!
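
The six steps above can be traced with a toy version of the graph. Everything here is a stand-in: `fake_agent` and `fake_action` replace the model and the tool node, messages are plain dictionaries, and the `END` string is only a placeholder for LangGraph's sentinel.

```python
END = "__end__"  # placeholder for LangGraph's END sentinel

def fake_agent(state):
    """First call requests a tool; once a tool result exists, answer directly."""
    if any(m["role"] == "tool" for m in state["messages"]):
        return {"role": "ai", "content": "final answer", "tool_calls": []}
    return {"role": "ai", "content": "", "tool_calls": ["search"]}

def fake_action(state):
    """Pretend to run the requested tool and return its result."""
    return {"role": "tool", "content": "search results", "tool_calls": []}

def should_continue(state):
    """Route to the action node while the last message still asks for a tool."""
    return "action" if state["messages"][-1]["tool_calls"] else END

def run(question):
    state = {"messages": [{"role": "human", "content": question, "tool_calls": []}]}
    visited, node = [], "agent"  # "agent" is the entry point
    while node != END:
        visited.append(node)
        if node == "agent":
            state["messages"].append(fake_agent(state))
            node = should_continue(state)  # conditional edge
        else:
            state["messages"].append(fake_action(state))
            node = "agent"  # the action node always returns to the agent
    return visited, state

visited, final_state = run("Who is the current captain of the Winnipeg Jets?")
print(visited)
```

The visited list shows the cycle: agent, then action, then agent again before the run ends.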

Now let's look at an example that shows multiple tool usage - all with the same flow!

In [15]:
inputs = {"messages" : [HumanMessage(content="Search Arxiv for the QLoRA paper, then search each of the authors to find out their latest Tweet using Tavily!")]}

async for chunk in compiled_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        if node == "action":
          print(f"Tool Used: {values['messages'][0].name}")
        print(values["messages"])

        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_yJjnh7Jr9ftJ7XoTqH3XeM9S', 'function': {'arguments': '{"query":"QLoRA"}', 'name': 'arxiv'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 178, 'total_tokens': 195, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_50cad350e4', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-56edf811-2397-4b64-87d9-d2df6eeb62d6-0', tool_calls=[{'name': 'arxiv', 'args': {'query': 'QLoRA'}, 'id': 'call_yJjnh7Jr9ftJ7XoTqH3XeM9S', 'type': 'tool_call'}], usage_metadata={'input_tokens': 178, 'output_tokens': 17, 'total_tokens': 195, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'a

#### 🏗️ Activity #2:

Please write out the steps the agent took to arrive at the correct answer.

## Part 1: LangSmith Evaluator

### Pre-processing for LangSmith

To do a little bit more preprocessing, let's wrap our LangGraph agent in a simple chain.

In [16]:
def convert_inputs(input_object):
  return {"messages" : [HumanMessage(content=input_object["question"])]}

def parse_output(input_state):
  return input_state["messages"][-1].content

agent_chain = convert_inputs | compiled_graph | parse_output

In [17]:
agent_chain.invoke({"question" : "What is RAG?"})

"RAG stands for Retrieval-Augmented Generation. It is a technique used in natural language processing (NLP) that combines retrieval-based methods with generative models to improve the quality and relevance of generated text. Here's a brief overview of how it works:\n\n1. **Retrieval**: In the first step, the system retrieves relevant documents or pieces of information from a large corpus or database. This is typically done using a retrieval model that identifies the most relevant content based on the input query or context.\n\n2. **Augmentation**: The retrieved information is then used to augment the input to a generative model. This means that the generative model has access to additional context or knowledge that can help it produce more accurate and contextually appropriate responses.\n\n3. **Generation**: Finally, the generative model, often a transformer-based model like GPT, uses the augmented input to generate a response or piece of text. The inclusion of retrieved information h

### Task 1: Creating An Evaluation Dataset

Just as we saw last week, we'll want to create a dataset to test our Agent's ability to answer questions.

In order to do this - we'll want to provide some questions and some answers. Let's look at how we can create such a dataset below.

```python
questions = [
    "What optimizer is used in QLoRA?",
    "What data type was created in the QLoRA paper?",
    "What is a Retrieval Augmented Generation system?",
    "Who authored the QLoRA paper?",
    "What is the most popular deep learning framework?",
    "What significant improvements does the LoRA system make?"
]

answers = [
    {"must_mention" : ["paged", "optimizer"]},
    {"must_mention" : ["NF4", "NormalFloat"]},
    {"must_mention" : ["ground", "context"]},
    {"must_mention" : ["Tim", "Dettmers"]},
    {"must_mention" : ["PyTorch", "TensorFlow"]},
    {"must_mention" : ["reduce", "parameters"]},
]
```

#### 🏗️ Activity #3:

Please create a dataset in the above format with at least 5 questions.

In [18]:
questions = [
    "Can you generate a 500 word summary of the famous research paper 'Attention is All You Need'?",
    "What is the self-attention mechanism mentioned in the Attention is All You Need paper?",
    "What is a videoRAG and how does it differ from RAG?",
    "What are the major types of prompt engineering?",
    "What are the most recent research papers on the topic of tokenization?",
    "What are the key advancements in model distillation for improving the efficiency of large language models?",
    "How does adversarial prompt engineering help in testing and improving the robustness of generative AI models?",
    "How do the techniques of few-shot, zero-shot, and chain-of-thought prompting compare in the field of large language models?",
    "What are the latest advancements in automated prompt optimization using reinforcement learning or genetic algorithms?",
    "What are the most popular use cases requiring multi-modal LLM capabilities?",
]

answers = [
    {"must_mention" : ["self-attention", "transformer"]},
    {"must_mention" : ["Multi-Head attention", "encoder", "decoder"]},
    {"must_mention" : ["LVLM", "visual"]},
    {"must_mention" : ["Zero-Shot Prompting", "Few-Shot Prompting", "Chain-of-Thought (CoT) Prompting", "Instruction-Based Prompting"]},
    {"must_mention" : ["Byte Pair Encoding", "SentencePiece", "compression"]},
    {"must_mention" : ["self-distillation", "Explanation-Guided LLMs Active Distillation", "DistiLLM"]},
    {"must_mention" : ["RLHF", "jailbreaking", "hallucination", "prompt injection"]},
    {"must_mention" : ["accuracy", "computational efficiency", "reasoning"]},
    {"must_mention" : ["PRewrite", "TEMPERA", "GeneticPromptLab", "Promptimizer", "DeepSeek-R1"]},
    {"must_mention" : ["document processing", "personalization"]},
]

Now we can add our dataset to our LangSmith project using the following code which we saw last Thursday!

In [21]:
from langsmith import Client

client = Client()

dataset_name = f"Retrieval Augmented Generation - Evaluation Dataset - {uuid4().hex[0:8]}"

dataset = client.create_dataset(
    dataset_name=dataset_name,
    description="Questions about latest research in Generative AI." # updated for the new set of 10 questions
)

client.create_examples(
    inputs=[{"question" : q} for q in questions],
    outputs=answers,
    dataset_id=dataset.id,
)

#### ❓ Question #3:

How are the correct answers associated with the questions?

> NOTE: Feel free to indicate if this is problematic or not
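
One way to see the association: `create_examples` receives two parallel lists, so question *i* is paired with answer *i* purely by position. The sketch below makes that pairing explicit and fails loudly on a length mismatch. It uses shortened stand-in data, and `pair_examples` is a hypothetical helper, not part of the LangSmith client.

```python
def pair_examples(questions: list, answers: list) -> list:
    """Pair each question with the answer at the same index."""
    if len(questions) != len(answers):
        raise ValueError(
            f"{len(questions)} questions but {len(answers)} answers; pairing by index would be wrong"
        )
    return [
        {"inputs": {"question": q}, "outputs": a}
        for q, a in zip(questions, answers)
    ]

sample_questions = [
    "What is a videoRAG and how does it differ from RAG?",
    "What are the major types of prompt engineering?",
]
sample_answers = [
    {"must_mention": ["LVLM", "visual"]},
    {"must_mention": ["Zero-Shot Prompting", "Few-Shot Prompting"]},
]

pairs = pair_examples(sample_questions, sample_answers)
for pair in pairs:
    print(pair["inputs"]["question"], "->", pair["outputs"]["must_mention"])
```

Positional pairing is fragile: reordering or deleting an entry in one list silently shifts every later answer, so keeping each question and its answer together in one record is usually safer.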

### Task 2: Adding Evaluators

Now we can add a custom evaluator to see if our responses contain the expected information.

We'll be using a fairly naive exact-match process to determine if our response contains specific strings.

In [22]:
from langsmith.evaluation import EvaluationResult, run_evaluator

@run_evaluator
def must_mention(run, example) -> EvaluationResult:
    prediction = run.outputs.get("output") or ""
    required = example.outputs.get("must_mention") or []
    score = all(phrase in prediction for phrase in required)
    return EvaluationResult(key="must_mention", score=score)

#### ❓ Question #4:

What are some ways you could improve this metric as-is?

> NOTE: Alternatively you can suggest where gaps exist in this method.
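
As one possible direction (a sketch, not the notebook's evaluator), the check can be made case-insensitive, tolerant of stray whitespace, and graded by the fraction of phrases found instead of all-or-nothing. The function name `mention_coverage` is hypothetical and it operates on plain strings rather than LangSmith run objects.

```python
def mention_coverage(prediction: str, required: list) -> float:
    """Return the fraction of required phrases found, ignoring case and extra whitespace."""
    if not required:
        return 1.0  # nothing to check counts as fully covered

    def normalize(text: str) -> str:
        # Lowercase and collapse runs of whitespace to single spaces.
        return " ".join(text.lower().split())

    haystack = normalize(prediction)
    hits = sum(1 for phrase in required if normalize(phrase) in haystack)
    return hits / len(required)

full = mention_coverage(
    "The Transformer relies on multi-head   attention in both encoder and decoder.",
    ["Multi-Head attention", "encoder", "decoder"],
)
partial = mention_coverage("This approach uses distillm.", ["DistiLLM ", "RLHF"])
print(full, partial)
```

This still only matches surface strings, so paraphrases and synonyms are missed; an LLM-based or embedding-based grader would be needed to judge meaning.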

### Task 3: Evaluating

All that is left to do is evaluate our agent's response!

In [23]:
experiment_results = client.evaluate(
    agent_chain,
    data=dataset_name,
    evaluators=[must_mention],
    experiment_prefix=f"RAG Pipeline - Evaluation - {uuid4().hex[0:4]}",
    metadata={"version": "1.0.0"},
)

View the evaluation results for experiment: 'RAG Pipeline - Evaluation - edd2-26d5bc2e' at:
https://smith.langchain.com/o/a880a293-74b1-4b82-a94d-e9a3daf7aa11/datasets/a93afba0-9bc3-4c65-8532-da08139e11ea/compare?selectedSessions=b3390ed6-5ae2-45fc-b1a5-51906d7239ff




10it [01:27,  8.73s/it]


In [24]:
experiment_results

| | inputs.question | outputs.output | error | reference.must_mention | feedback.must_mention | execution_time | example_id | id |
|---|---|---|---|---|---|---|---|---|
| 0 | What are the most popular use cases requiring ... | Multi-modal large language models (LLMs) are d... | | [document processing, personalization] | False | 6.854807 | 89b0b05f-97de-4217-a025-6a26164edc44 | bb39a100-be6f-483b-8035-c89dc9b66ebd |
| 1 | What are the latest advancements in automated ... | Here are some of the latest advancements in au... | | [PRewrite, TEMPERA, GeneticPromptLab, Promptim... | False | 13.483537 | 15ae0956-0603-4732-95ad-b5601d39c108 | caf89aad-1488-4386-b7c7-8d6b8289a35a |
| 2 | How do the techniques of few-shot, zero-shot, ... | In the field of large language models (LLMs), ... | | [accuracy, computational efficiency, reasoning] | False | 9.241197 | abdfdca7-03b5-4b0d-b305-209a9bda7d75 | 4c521b72-12b3-445f-ac96-7d1cc932cae6 |
| 3 | How does adversarial prompt engineering help i... | Adversarial prompt engineering is a technique ... | | [RLHF, jailbreaking, hallucination, prompt inj... | False | 4.617192 | ce0fb10a-d9e7-48af-aa0b-f9521f617542 | 20fadc80-3a9b-48cd-a9cb-ef2baa0a1af0 |
| 4 | What are the key advancements in model distill... | Recent advancements in model distillation for ... | | [self-distillation, Explanation-Guided LLMs Ac... | False | 9.362955 | 62bc9c44-a488-461a-bf1d-025ce9143f63 | f4accd44-e3f8-4353-a45b-c6da4ae02db8 |
| 5 | What are the most recent research papers on th... | Here are some of the most recent research pape... | | [Byte Pair Encoding, SentencePiece, compression] | False | 8.294149 | 9caba207-6ec1-45e0-86e5-17154aad1772 | abbd87b6-7d36-4eb0-8147-3b102581a4a4 |
| 6 | What are the major types of prompt engineering? | Prompt engineering is a crucial aspect of work... | | [Zero-Shot Prompting, Few-Shot Prompting, Chai... | False | 6.553997 | acaa9066-71ee-49d4-a5db-f9239d549166 | 75aeabca-ac73-4679-b7b5-c07727b25c0c |
| 7 | What is a videoRAG and how does it differ from... | VideoRAG and RAG (Retrieval-Augmented Generati... | | [LVLM, visual] | True | 8.171186 | fa7d2e21-5afa-4d15-b0a5-0620608f381c | f6e52741-d6f9-4153-8f5e-220d5ff0247e |
| 8 | What is the self-attention mechanism mentioned... | The self-attention mechanism, introduced in th... | | [Multi-Head attention, encoder, decoder] | False | 5.54919 | bc88f666-f35a-4922-a18e-d365f7c7c837 | 67fbc86f-881a-4ee9-bbcd-731cc55de98a |
| 9 | Can you generate a 500 word summary of the fam... | It seems I couldn't retrieve the specific pape... | | [self-attention, transformer] | False | 14.723675 | c38a0a36-8091-4c7c-bbe3-4a24e8ecbf19 | 76e8369c-1d0f-454e-a639-e96ecc7a67dd |


## Part 2: LangGraph with Helpfulness

### Task 3: Adding Helpfulness Check and "Loop" Limits

Now that we've done evaluation - let's see if we can add an extra step where we review the content we've generated to confirm if it fully answers the user's query!

We're going to make a few key adjustments to account for this:

1. We're going to add an artificial limit on how many "loops" the agent can go through - this will help us to avoid the potential situation where we never exit the loop.
2. We'll add to our existing conditional edge to obtain the behaviour we desire.

First, let's define our state again - we can check the length of the state object, so we don't need additional state for this.

In [25]:
class AgentState(TypedDict):
  messages: Annotated[list, add_messages]

Now we can set our graph up! This process will be almost entirely the same - with the inclusion of one additional node/conditional edge!

#### 🏗️ Activity #5:

Please write markdown for the following cells to explain what each is doing.

##### YOUR MARKDOWN HERE

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b> 

The following code snippet is setting up a graph-based structure for an AI agent. 

Creating the State Graph:
The first line creates a new StateGraph object, initialized with the AgentState class and set to the variable 'graph_with_helpfulness_check'.  StateGraph is a LangGraph class representing a graph structure where nodes can modify a shared state (e.g., AgentState).

Adding Nodes to the Graph:  
The next two lines add two nodes to the graph -  
a. An "agent" node that uses call_model function to call the model that processes user messages and generates responses.  
b. An "action" node that uses tool_node (a ToolNode built from the tool belt) to run the tools requested by the model, enabling the agent to perform tasks.  

The agent created has a state (consisting of a message history), and two main components: the AI model and a tool.  
This structure allows the agent to process information through the model, and execute actions defined by its tool(s).

</span>
</div>

In [27]:
graph_with_helpfulness_check = StateGraph(AgentState)

graph_with_helpfulness_check.add_node("agent", call_model)
graph_with_helpfulness_check.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x23b18d22f10>

##### YOUR MARKDOWN HERE

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b> 

The following line of code sets the entry point of the graph to the "agent" node. This means that graph execution starts by running the "agent" node, which ensures that the AI model (represented by "agent") is the first component to process any input on graph invocation.

</span>
</div>

In [29]:
graph_with_helpfulness_check.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x23b18d22f10>

##### YOUR MARKDOWN HERE

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b>  

The function tool_call_or_helpful below intends to determine the next step in the agent's workflow based on the current state.  

1. It checks the last message in the state. If the last message contains tool calls, it returns "action", indicating a tool should be used next.
2. If there are no tool calls, it proceeds to evaluate the helpfulness of the response:
    - It retrieves the first message (initial_query) and the last message (final_response).
    - If the conversation has more than 10 messages (checked by len(state["messages"]) > 10), it returns early to stop the interaction; this is the "loop" limit.
3. To evaluate helpfulness of the response (assigned to the variable helpfulness_response):
    - It defines a prompt template (prompt_template) to assess the helpfulness of the final response compared to the initial query.
    - It defines a model (helpfulness_check_model) using the gpt-4 model to perform this evaluation.
    - It defines an evaluation chain (helpfulness_chain) using the prompt template (prompt_template), the gpt-4 model (helpfulness_check_model), and a string output parser (StrOutputParser).
4. Depending on the helpfulness evaluation (helpfulness_response) generated, either of the following cases happen:
    - If the response contains 'Y' (indicating helpfulness), it returns "end".
    - Otherwise, it returns "continue", suggesting that the conversation should proceed.

</span>
</div>

In [30]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

def tool_call_or_helpful(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  initial_query = state["messages"][0]
  final_response = state["messages"][-1]

  if len(state["messages"]) > 10:
    return "end"  # lowercase so it matches the "end" key of the conditional-edge mapping

  prompt_template = """\
  Given an initial query and a final response, determine if the final response is extremely helpful or not. Please indicate helpfulness with a 'Y' and unhelpfulness as an 'N'.

  Initial Query:
  {initial_query}

  Final Response:
  {final_response}"""

  prompt_template = PromptTemplate.from_template(prompt_template)

  helpfulness_check_model = ChatOpenAI(model="gpt-4")

  helpfulness_chain = prompt_template | helpfulness_check_model | StrOutputParser()

  helpfulness_response = helpfulness_chain.invoke({"initial_query" : initial_query.content, "final_response" : final_response.content})

  if "Y" in helpfulness_response:
    return "end"
  else:
    return "continue"
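
The branching order described above can be exercised without calling OpenAI by swapping the LLM judge for a plain function. The sketch below is a stand-in under stated assumptions: `FakeMessage`, `route`, `always_yes`, and `always_no` are hypothetical names used only for illustration and are not part of the notebook or LangChain. It returns the lowercase "end" label for the message cap so that every return value matches a key in the conditional-edge mapping.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a LangChain message (illustration only)."""
    content: str
    tool_calls: List[dict] = field(default_factory=list)


def route(state: dict, judge: Callable[[str, str], str], max_messages: int = 10) -> str:
    """Mirror the branching order of tool_call_or_helpful with an injected judge."""
    messages = state["messages"]
    last_message = messages[-1]

    # 1. Pending tool calls always win: run the tools first.
    if last_message.tool_calls:
        return "action"

    # 2. Loop limit: stop once the conversation grows past the cap.
    if len(messages) > max_messages:
        return "end"

    # 3. Ask the judge; any 'Y' in its verdict counts as helpful.
    verdict = judge(messages[0].content, last_message.content)
    return "end" if "Y" in verdict else "continue"


def always_yes(query: str, response: str) -> str:
    return "Y"


def always_no(query: str, response: str) -> str:
    return "N"


question = FakeMessage("What is LoRA?")
answer = FakeMessage("LoRA is a low-rank adaptation method.")
tool_request = FakeMessage("", tool_calls=[{"name": "arxiv"}])

print(route({"messages": [question, tool_request]}, always_yes))
print(route({"messages": [question, answer]}, always_yes))
print(route({"messages": [question, answer]}, always_no))
print(route({"messages": [question] + [answer] * 10}, always_no))
```

One design point worth noting: the substring test `"Y" in verdict` accepts any verdict containing a capital Y, so a stricter variant might compare a stripped verdict against "Y" exactly.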

#### 🏗️ Activity #4:

Please write what is happening in our `tool_call_or_helpful` function!

##### YOUR MARKDOWN HERE

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b> 

The following code snippet creates a decision-making structure by adding conditional edges to graph_with_helpfulness_check. It defines how the graph should proceed after the "agent" node is executed.
1. The first argument ("agent") specifies the source node from which these conditional edges originate.
2. The second argument (tool_call_or_helpful) is the function we wrote above to determine which edge to follow. It analyzes the current state and returns a label for the next step.
3. The third argument is a dictionary / mapping of possible outputs of tool_call_or_helpful to the next steps / nodes in the graph.
    - If "continue" is returned, then the graph will proceed to the "agent" node.
    - If "action" is returned, then the graph will proceed to the "action" node.
    - If "end" is returned, then the graph will terminate (END means end of execution).

</span>
</div>

In [31]:
graph_with_helpfulness_check.add_conditional_edges(
    "agent",
    tool_call_or_helpful,
    {
        "continue" : "agent",
        "action" : "action",
        "end" : END
    }
)

<langgraph.graph.state.StateGraph at 0x23b18d22f10>
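
The role of the mapping can be sketched as a plain dictionary lookup. This is a toy model for intuition only, not LangGraph's implementation: `next_node` is a hypothetical helper, and `END` is redefined here as a placeholder string so the snippet runs on its own (in the notebook it is imported from langgraph).

```python
END = "__end__"  # placeholder sentinel for this toy; the notebook imports END from langgraph

# Same keys and targets as the mapping passed to add_conditional_edges.
path_map = {"continue": "agent", "action": "action", "end": END}


def next_node(router_output: str) -> str:
    """Resolve a router label to the next node via the mapping (toy model)."""
    if router_output not in path_map:
        raise KeyError(f"router returned unmapped label: {router_output!r}")
    return path_map[router_output]


for label in ("continue", "action", "end"):
    print(f"{label!r} -> {next_node(label)!r}")
```

In this toy model a label with different casing, such as "END", fails the lookup, which is why the router's return values and the mapping keys need to match exactly.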

##### YOUR MARKDOWN HERE

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b> 

The following line adds a direct, unconditional edge from the "action" node to the "agent" node in graph_with_helpfulness_check.  
We add unconditional edges to a graph with the add_edge function, whereas conditional edges are added with the add_conditional_edges function.
Adding an unconditional edge ensures:
1. Direction: It creates a one-way connection from the "action" node to the "agent" node.
2. Unconditional Transition: Unlike the conditional edges (e.g., "continue"), this transition does not depend on the state; it is always taken.
3. Workflow Implication: After an action is completed, the graph will always proceed back to the "agent" node.
4. Cycle Creation: This edge completes a potential cycle path in the graph: agent -> action -> agent
This setup ensures control is always passed back to the agent after every action.  

</span>
</div>

In [33]:
graph_with_helpfulness_check.add_edge("action", "agent")

<langgraph.graph.state.StateGraph at 0x23b18d22f10>
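
To see how the unconditional edge closes the agent -> action -> agent cycle, here is a small pure-Python walk over the same edges. It is a simulation under assumptions, not LangGraph itself: `walk` is a hypothetical helper, the router's decisions are scripted as a list of labels, `END` is a placeholder string, and "agent" is assumed to be the entry point.

```python
from typing import List

END = "__end__"  # placeholder sentinel for this toy

# Edges as wired in the notebook: one conditional fan-out, one unconditional edge.
conditional_edges = {"agent": {"continue": "agent", "action": "action", "end": END}}
unconditional_edges = {"action": "agent"}


def walk(router_outputs: List[str], entry: str = "agent") -> List[str]:
    """Return the nodes visited for a scripted sequence of router labels."""
    labels = iter(router_outputs)
    visited = []
    node = entry
    while node != END:
        visited.append(node)
        if node in unconditional_edges:
            # No decision needed: "action" always hands control back to "agent".
            node = unconditional_edges[node]
        else:
            # The router's label selects the outgoing edge.
            node = conditional_edges[node][next(labels)]
    return visited


# The agent requests a tool, retries once after the helpfulness check, then ends.
print(walk(["action", "continue", "end"]))
```

Every visit to "action" is followed by "agent", while each visit to "agent" consumes one router decision.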

##### YOUR MARKDOWN HERE

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b> 

The following line defines the LangGraph-based agent. 
1. Compilation (by calling the compile() method) transforms the graph object (graph_with_helpfulness_check) built above into an executable agent.
2. Compilation performs basic checks on the graph's structure, such as making sure no node is left unconnected.
3. It is also the step where runtime options, such as a checkpointer, can be supplied.
4. RESULT: agent_with_helpfulness_check is a fully functional agent that can be used to perform specific tasks.

</span>
</div>

In [34]:
agent_with_helpfulness_check = graph_with_helpfulness_check.compile()

##### YOUR MARKDOWN HERE

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b> 


The following code snippet sets up and executes an asynchronous streaming interaction with the compiled agent allowing us to observe the agent's thought process in real-time.

1. The input is a dictionary whose "messages" list contains one HumanMessage object with the following three questions:
    - Related to machine learning, what is LoRA? 
    - Also, who is Tim Dettmers? 
    - Also, what is Attention?
2. Asynchronous Streaming: This starts an asynchronous stream from the agent.
    - astream() is an asynchronous method that yields updates in real time as the user query is being processed.
    - stream_mode="updates" specifies our preference to receive updates from each node as they occur.
3. Processing Stream Updates: This loop processes each chunk of data received from the stream.
    - Each chunk maps a node name to that node's update; for each one, the loop prints the node's name, its messages, and blank lines as a separator.

</span>
</div>

In [35]:
inputs = {"messages" : [HumanMessage(content="Related to machine learning, what is LoRA? Also, who is Tim Dettmers? Also, what is Attention?")]}

async for chunk in agent_with_helpfulness_check.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_YmB7EKnqfWAbndLX64Qcsycw', 'function': {'arguments': '{"query": "LoRA machine learning"}', 'name': 'arxiv'}, 'type': 'function'}, {'id': 'call_ilP7SGWMRuQTIYNe4YiUS2BQ', 'function': {'arguments': '{"query": "Tim Dettmers"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}, {'id': 'call_awYgjGXVgogNhKhqTqGDjRUJ', 'function': {'arguments': '{"query": "Attention mechanism machine learning"}', 'name': 'arxiv'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 72, 'prompt_tokens': 177, 'total_tokens': 249, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_50cad350e4', 'finish_reason': 'tool_calls', 'logprobs': None},
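
The consumption pattern above can be tried without an API key by using a stand-in stream. The sketch below is illustrative only: `fake_astream` and `collect_updates` are hypothetical names, and the chunk shape `{node: {"messages": [...]}}` is assumed from the loop in the cell above. Notebooks allow a top-level `async for`; in a plain script the same loop has to live inside a coroutine driven by `asyncio.run`.

```python
import asyncio
from typing import AsyncIterator, Dict, List


async def fake_astream() -> AsyncIterator[Dict[str, dict]]:
    """Hypothetical stand-in yielding chunks shaped like {node: {"messages": [...]}}."""
    scripted = [
        ("agent", "tool call requested"),
        ("action", "tool result"),
        ("agent", "final answer"),
    ]
    for node, text in scripted:
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield {node: {"messages": [text]}}


async def collect_updates() -> List[str]:
    """Consume the stream and record which node produced each update."""
    seen = []
    async for chunk in fake_astream():
        for node, values in chunk.items():
            print(f"Receiving update from node: '{node}'")
            print(values["messages"])
            seen.append(node)
    return seen


# In a script, asyncio.run drives the coroutine.
# Inside a notebook, use `await collect_updates()` instead, since a loop is already running.
order = asyncio.run(collect_updates())
```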

### Task 4: LangGraph for the "Patterns" of GenAI

Let's ask our system about the 4 patterns of Generative AI:

1. Prompt Engineering
2. RAG
3. Fine-tuning
4. Agents

<div style="background-color: #E6E6FA; padding: 10px; border-radius: 5px;">
<span style="color: black;">
<b> * **ANSWER:**  </b> 

The following code snippet is designed to run the AI agent (agent_with_helpfulness_check) multiple times in a loop, each time asking about a different concept related to AI and machine learning such as the following:

a) What is prompt engineering and when did it break onto the scene??  
b) What is RAG and when did it break onto the scene??  
c) What is fine-tuning and when did it break onto the scene??  
d) What is LLM-based agents and when did it break onto the scene??

The for loop then iterates over each concept in the list formulating the above 4 questions.

The agent is called with a single question at a time; it processes the question and generates a response.

The content of the last message is finally printed.

</span>
</div>

In [36]:
patterns = ["prompt engineering", "RAG", "fine-tuning", "LLM-based agents"]

In [37]:
for pattern in patterns:
  what_is_string = f"What is {pattern} and when did it break onto the scene??"
  inputs = {"messages" : [HumanMessage(content=what_is_string)]}
  messages = agent_with_helpfulness_check.invoke(inputs)
  print(messages["messages"][-1].content)
  print("\n\n")

**Prompt Engineering Definition:**

Prompt engineering is the process of designing inputs for generative artificial intelligence (AI) models to deliver useful, accurate, and relevant responses. It is used with generative AI models called large language models (LLMs), such as OpenAI’s ChatGPT and Google Gemini. The practice involves creating prompts to guide AI models in generating accurate and relevant responses, often used in customer support, content generation, and data analysis. Skilled prompt engineers design inputs that interact optimally with other inputs in a generative AI tool, aiming to optimize the output and minimize bias in AI outputs. Prompt engineering is an iterative process that involves experimenting and refining prompts to achieve the best results.

**When Prompt Engineering Became Popular:**

Prompt engineering started to gain recognition in the early 2000s, initially emerging from research in natural language processing (NLP) and human-computer interaction (HCI). H