# LangGraph and LangSmith - Agentic RAG Powered by LangChain

In this notebook we'll complete the following tasks:

- 🤝 Breakout Room #1:
  1. Install required libraries
  2. Set Environment Variables
  3. Creating our Tool Belt
  4. Creating Our State
  5. Creating and Compiling A Graph!

- 🤝 Breakout Room #2:
  1. Evaluating the LangGraph Application with LangSmith
  2. Adding Helpfulness Check and "Loop" Limits
  3. LangGraph for the "Patterns" of GenAI

# 🤝 Breakout Room #1

## Part 1: LangGraph - Building Cyclic Applications with LangChain

LangGraph is a tool that leverages LangChain Expression Language to build coordinated, multi-actor, stateful applications that include cyclic behaviour.

### Why Cycles?

In essence, we can think of a cycle in our graph as a more robust and customizable loop. It allows us to keep our application agent-forward while still giving the powerful functionality of traditional loops.

Because we use cycles rather than loops, we can also compose rather complex flows through our graph in a much more readable and natural fashion, effectively recreating application flowcharts in code in an almost 1-to-1 fashion.

### Why LangGraph?

Beyond the agent-forward approach - we can easily compose and combine traditional "DAG" (directed acyclic graph) chains with powerful cyclic behaviour due to the tight integration with LCEL. This means it's a natural extension to LangChain's core offerings!

## Task 1: Dependencies


## Task 2: Environment Variables

We'll want to set both our OpenAI API key and our LangSmith environment variables.

In [1]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

In [2]:
os.environ["TAVILY_API_KEY"] = getpass.getpass("TAVILY_API_KEY")

In [3]:
from uuid import uuid4

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = f"AIE7 - LangGraph - {uuid4().hex[0:8]}"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("LangSmith API Key: ")

## Task 3: Creating our Tool Belt

As is usually the case, we'll want to equip our agent with a toolbelt to help answer questions and add external knowledge.

There's a tonne of tools in the [LangChain Community Repo](https://github.com/langchain-ai/langchain-community/tree/main/libs/community) but we'll stick to a couple just so we can observe the cyclic nature of LangGraph in action!

We'll leverage:

- [Tavily Search Results](https://github.com/langchain-ai/langchain-community/blob/main/libs/community/langchain_community/tools/tavily_search/tool.py)
- [Arxiv](https://github.com/langchain-ai/langchain-community/blob/main/libs/community/langchain_community/tools/arxiv/tool.py)

#### 🏗️ Activity #1:

Please add the tools to use into our toolbelt.

> NOTE: Each tool in our toolbelt should be a method.

In [6]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_community.tools.arxiv.tool import ArxivQueryRun
from langchain_community.tools.wikipedia.tool import WikipediaQueryRun
from langchain_community.utilities.wikipedia import WikipediaAPIWrapper

tavily_tool = TavilySearchResults(max_results=5)
wikipedia_tool = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

tool_belt = [
    tavily_tool,
    ArxivQueryRun(),
    wikipedia_tool,
]

### Model

Now we can set-up our model! We'll leverage the familiar OpenAI model suite for this example - but it's not *necessary* to use with LangGraph. LangGraph supports all models - though you might not find success with smaller models - as such, they recommend you stick with:

- OpenAI's GPT-3.5 and GPT-4
- Anthropic's Claude
- Google's Gemini

> NOTE: Because we're leveraging the OpenAI function calling API - we'll need to use OpenAI *for this specific example* (or any other service that exposes an OpenAI-style function calling API).

In [25]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o", temperature=0) #change here gpt4o

Now that we have our model set-up, let's "put on the tool belt", which is to say: We'll bind our LangChain formatted tools to the model in an OpenAI function calling format.

In [26]:
model = model.bind_tools(tool_belt)

#### ❓ Question #1:

How does the model determine which tool to use?
##### ✅ Answer:
The model knows which tool to use by using the provided information in the context which includes, but is not limited to, instructions in the query/prompt itself and the docstrings in the tool definitions.

## Task 4: Putting the State in Stateful

Earlier we used this phrasing:

`coordinated multi-actor and stateful applications`

So what does that "stateful" mean?

To put it simply - we want to have some kind of object which we can pass around our application that holds information about what the current situation (state) is. Since our system will be constructed of many parts moving in a coordinated fashion - we want to be able to ensure we have some commonly understood idea of that state.

LangGraph leverages a `StateGraph` which uses an `AgentState` object to pass information between the various nodes of the graph.

There are more options than what we'll see below - but this `AgentState` object is one that is stored in a `TypedDict` with the key `messages` and the value is a `Sequence` of `BaseMessages` that will be appended to whenever the state changes.

Let's think about a simple example to help understand exactly what this means (we'll simplify a great deal to try and clearly communicate what state is doing):

1. We initialize our state object:
  - `{"messages" : []}`
2. Our user submits a query to our application.
  - New State: `HumanMessage(#1)`
  - `{"messages" : [HumanMessage(#1)]}`
3. We pass our state object to an Agent node which is able to read the current state. It will use the last `HumanMessage` as input. It gets some kind of output which it will add to the state.
  - New State: `AgentMessage(#1, additional_kwargs {"function_call" : "WebSearchTool"})`
  - `{"messages" : [HumanMessage(#1), AgentMessage(#1, ...)]}`
4. We pass our state object to a "conditional node" (more on this later) which reads the last state to determine if we need to use a tool - which it can determine properly because of our provided object!
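
The steps above can be sketched in plain Python. This is only an illustration of the "append on every update" idea: `Message`, `append_messages`, and `apply_update` are hypothetical names made up for this sketch, and the real `add_messages` reducer in LangGraph does more than this (for example, it also handles message IDs).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Message:
    """Hypothetical stand-in for a LangChain message (not the real class)."""
    role: str      # e.g. "human", "ai", or "tool"
    content: str


def append_messages(current: List[Message], update: List[Message]) -> List[Message]:
    """Simplified reducer: return a new list with the update appended."""
    return current + update


def apply_update(state: Dict[str, List[Message]],
                 update: Dict[str, List[Message]]) -> Dict[str, List[Message]]:
    """Merge a node's partial update into the state using the reducer."""
    return {"messages": append_messages(state["messages"], update["messages"])}


# Step 1: empty state. Steps 2 and 3: each actor contributes one message.
state = {"messages": []}
state = apply_update(state, {"messages": [Message("human", "Who wrote QLoRA?")]})
state = apply_update(state, {"messages": [Message("ai", "(requests a web search)")]})
print([m.role for m in state["messages"]])
```

Each node only returns the new messages; the reducer is what turns those partial updates into a growing history that the next node can read.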

In [27]:
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
  # add_messages is a reducer: node updates are appended to the list instead of overwriting it
  messages: Annotated[list, add_messages]

## Task 5: It's Graphing Time!

Now that we have state, and we have tools, and we have an LLM - we can finally start making our graph!

Let's take a second to refresh ourselves about what a graph is in this context.

Graphs, also called networks in some circles, are a collection of connected objects.

The objects in question are typically called nodes, or vertices, and the connections are called edges.

Let's look at a simple graph.

![image](https://i.imgur.com/2NFLnIc.png)

Here, we're using the coloured circles to represent the nodes and the yellow lines to represent the edges. In this case, we're looking at a fully connected graph - where each node is connected by an edge to each other node.

If we were to think about nodes in the context of LangGraph - we would think of a function, or an LCEL runnable.

If we were to think about edges in the context of LangGraph - we might think of them as "paths to take" or "where to pass our state object next".

Let's create some nodes and expand on our diagram.

> NOTE: Due to the tight integration with LCEL - we can comfortably create our nodes in an async fashion!

In [28]:
from langgraph.prebuilt import ToolNode

def call_model(state):
  messages = state["messages"]
  response = model.invoke(messages)
  return {"messages" : [response]}

tool_node = ToolNode(tool_belt)

Now we have two total nodes. We have:

- `call_model` is a node that will...well...call the model
- `tool_node` is a node which can call a tool
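
Since the note above mentions async nodes, here is a minimal sketch of what an async version of a node function could look like. It uses only `asyncio`; `fake_model` and `call_model_async` are hypothetical names for illustration, and the fake model simply echoes its input instead of calling an LLM.

```python
import asyncio
from typing import Dict, List


async def fake_model(messages: List[str]) -> str:
    """Hypothetical async model: pretends to wait on a network call."""
    await asyncio.sleep(0)
    return f"echo: {messages[-1]}"


async def call_model_async(state: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Async node: read the state, await the model, return a partial update."""
    response = await fake_model(state["messages"])
    return {"messages": [response]}


update = asyncio.run(call_model_async({"messages": ["hello"]}))
print(update)
```

Inside a notebook an event loop is usually already running, so you would likely write `await call_model_async(...)` directly rather than `asyncio.run(...)`.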

Let's start adding nodes! We'll update our diagram along the way to keep track of what this looks like!


In [29]:
from langgraph.graph import StateGraph, END

uncompiled_graph = StateGraph(AgentState)

uncompiled_graph.add_node("agent", call_model)
uncompiled_graph.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x110b33110>

Let's look at what we have so far:

![image](https://i.imgur.com/md7inqG.png)

Next, we'll add our entrypoint. All our entrypoint does is indicate which node is called first.

In [30]:
uncompiled_graph.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x110b33110>

![image](https://i.imgur.com/wNixpJe.png)

Now we want to build a "conditional edge" which will use the output state of a node to determine which path to follow.

We can help conceptualize this by thinking of our conditional edge as a conditional in a flowchart!

Notice how our function simply checks whether the last message contains any `tool_calls`.

Then we create an edge where the origin node is our agent node and our destination node is *either* the action node or the END (finish the graph).

It's important to highlight that `add_conditional_edges` also accepts an optional third parameter: a dictionary (the mapping) from the possible outputs of our conditional function to node names. We omit it here because `should_continue` already returns either `"action"` (the name of our action node) or `END`, so its output can be used as the destination directly.

In [31]:
def should_continue(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  return END

uncompiled_graph.add_conditional_edges(
    "agent",
    should_continue
)

<langgraph.graph.state.StateGraph at 0x110b33110>

Let's visualize what this looks like.

![image](https://i.imgur.com/8ZNwKI5.png)
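
To make the routing idea concrete, a conditional edge can be pictured as a router function plus an optional label-to-node mapping. The sketch below is a plain-Python illustration under that assumption, not LangGraph's implementation: `resolve_next_node`, `should_continue_sketch`, and the `END` string used here are hypothetical stand-ins, and messages are plain dictionaries.

```python
from typing import Callable, Dict, Optional

END = "__end__"  # hypothetical sentinel for this sketch only


def resolve_next_node(
    router: Callable[[dict], str],
    state: dict,
    mapping: Optional[Dict[str, str]] = None,
) -> str:
    """Run the router, then translate its label through the optional mapping."""
    label = router(state)
    if mapping is None:
        return label  # the label is already a node name (or END)
    return mapping[label]  # KeyError if the router returns an unmapped label


def should_continue_sketch(state: dict) -> str:
    """Route to the tools when the last message asks for one."""
    return "action" if state["messages"][-1].get("tool_calls") else END


with_tools = {"messages": [{"content": "", "tool_calls": [{"name": "arxiv"}]}]}
without_tools = {"messages": [{"content": "Done.", "tool_calls": []}]}

print(resolve_next_node(should_continue_sketch, with_tools))
print(resolve_next_node(should_continue_sketch, without_tools))
```

When a mapping is supplied, every label the router can return needs a matching key; a label that is missing from the mapping fails at lookup time.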

Finally, we can add our last edge which will connect our action node to our agent node. This is because we *always* want our action node (which is used to call our tools) to return its output to our agent!

In [32]:
uncompiled_graph.add_edge("action", "agent")

<langgraph.graph.state.StateGraph at 0x110b33110>

Let's look at the final visualization.

![image](https://i.imgur.com/NWO7usO.png)

All that's left to do now is to compile our workflow - and we're off!

In [33]:
simple_agent_graph = uncompiled_graph.compile()

#### ❓ Question #2:

Is there any specific limit to how many times we can cycle?

If not, how could we impose a limit to the number of cycles?
##### ✅ Answer:
Not within the graph we defined: nothing in our nodes or edges stops the cycle. A limit can be imposed in several ways, for example by tracking the number of cycles (or the runtime) and ending the run once a threshold is reached.
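
As a rough sketch of the "count the cycles" idea, the loop below stops after a fixed number of steps. Everything here is hypothetical and independent of LangGraph (`run_with_limit`, `StepLimitExceeded`, and the `done` flag are invented for illustration). LangGraph also exposes a `recursion_limit` setting in the run config, which is worth checking in the documentation for your installed version.

```python
from typing import Callable, Dict


class StepLimitExceeded(RuntimeError):
    """Raised when the loop runs longer than allowed (hypothetical name)."""


def run_with_limit(step: Callable[[Dict], Dict], state: Dict, max_steps: int) -> Dict:
    """Call `step` until it marks the state as done or the limit is hit."""
    for _ in range(max_steps):
        state = step(state)
        if state.get("done"):
            return state
    raise StepLimitExceeded(f"no final answer after {max_steps} steps")


def never_finishes(state: Dict) -> Dict:
    """A step that always asks for another cycle."""
    return {"count": state["count"] + 1, "done": False}


try:
    run_with_limit(never_finishes, {"count": 0}, max_steps=5)
    outcome = "finished"
except StepLimitExceeded:
    outcome = "stopped by limit"
print(outcome)
```

Raising an error makes a runaway loop visible; returning the partial state instead is an equally reasonable choice when a best-effort answer is acceptable.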

## Using Our Graph

Now that we've created and compiled our graph - we can call it *just as we'd call any other* `Runnable`!

Let's try out a few examples to see how it fares:

In [34]:
from langchain_core.messages import HumanMessage

inputs = {"messages" : [HumanMessage(content="Who is the current captain of the Winnipeg Jets?")]}

async for chunk in simple_agent_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_hvCUH5clzqxUygLxHA0BMSoE', 'function': {'arguments': '{"query":"current captain of the Winnipeg Jets 2023"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 220, 'total_tokens': 246, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_a288987b44', 'id': 'chatcmpl-Bt3gwyPsYtWAN9MqGvUNC2X6cneoV', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--53017225-52e4-421e-8c57-555265f4ff14-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'current captain of the Winnipeg Jets 2023'}, 'id': 'call_hvCUH5clzqxUygLxHA0B

Let's look at what happened:

1. Our state object was populated with our request
2. The state object was passed into our entry point (agent node) and the agent node added an `AIMessage` to the state object and passed it along the conditional edge
3. The conditional edge received the state object, found the "tool_calls" `additional_kwarg`, and sent the state object to the action node
4. The action node ran the requested tool, added the tool's result to the state object as a `ToolMessage`, and passed it along the edge to the agent node
5. The agent node added a response to the state object and passed it along the conditional edge
6. The conditional edge received the state object, could not find the "tool_calls" `additional_kwarg` and passed the state object to END where we see it output in the cell above!

Now let's look at an example that shows multiple tool usage - all with the same flow!

In [35]:
inputs = {"messages" : [HumanMessage(content="Search Arxiv for the QLoRA paper, then search each of the authors to find out their latest Tweet using Tavily!")]}

async for chunk in simple_agent_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        if node == "action":
          print(f"Tool Used: {values['messages'][0].name}")
        print(values["messages"])

        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_5F02zl2WgG2HK2NBwXuyTVt7', 'function': {'arguments': '{"query":"QLoRA"}', 'name': 'arxiv'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 16, 'prompt_tokens': 236, 'total_tokens': 252, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_a288987b44', 'id': 'chatcmpl-Bt3io59lFwM7tCwGkZpFFtTSepTQz', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--9a674473-e8ad-42f7-ad85-72b4aa56faa4-0', tool_calls=[{'name': 'arxiv', 'args': {'query': 'QLoRA'}, 'id': 'call_5F02zl2WgG2HK2NBwXuyTVt7', 'type': 'tool_call'}], usage_metadata={'input_tokens': 236, 'output_tokens': 16, 'total_tokens': 252, 'inpu

#### 🏗️ Activity #2:

Please write out the steps the agent took to arrive at the correct answer.
##### ✅ Answer:

1. The request was sent to the agent node, which decided it needed to call the arxiv tool with the query "QLoRA".
2. The action node ran the arxiv tool with that query and returned the results to the agent.
3. The agent node received the results and decided to call the Tavily search tool four times, once per author, to find each author's latest tweet.
4. The action node ran the four Tavily searches.
5. The agent node received the results, decided no more tool calls were necessary, and the graph moved to END.

In [36]:
# testing wikipedia tool
inputs = {"messages" : [HumanMessage(content="Use tavily to search for the name of Tesla's CEO, then search wikipedia to find out and return their birthdate!")]}

async for chunk in simple_agent_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        if node == "action":
          print(f"Tool Used: {values['messages'][0].name}")
        print(values["messages"])

        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_1LFJ4xXBqWgLBfpOIq8wak04', 'function': {'arguments': '{"query": "Tesla CEO 2023"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}, {'id': 'call_I1124xYXKkAJgA3FSxxPmzxS', 'function': {'arguments': '{"query": "Elon Musk"}', 'name': 'wikipedia'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 53, 'prompt_tokens': 235, 'total_tokens': 288, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_a288987b44', 'id': 'chatcmpl-Bt45gpfCHjX9gk4B86YtdzJWF2ss1', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--a072545f-57ab-4f96-a5f0-8c3ea82e5395-0', tool_calls=[{'name': 'tavily_search_re

# 🤝 Breakout Room #2

## Part 1: LangSmith Evaluator

### Pre-processing for LangSmith

To do a little bit more preprocessing, let's wrap our LangGraph agent in a simple chain.

In [37]:
def convert_inputs(input_object):
  return {"messages" : [HumanMessage(content=input_object["question"])]}

def parse_output(input_state):
  return input_state["messages"][-1].content

agent_chain_with_formatting = convert_inputs | simple_agent_graph | parse_output

In [38]:
agent_chain_with_formatting.invoke({"question" : "What is RAG?"})

'Retrieval-augmented generation (RAG) is a technique that enhances large language models (LLMs) by allowing them to retrieve and incorporate new information from specified documents before generating responses. This approach enables LLMs to use domain-specific or updated information that is not available in their pre-existing training data. RAG helps improve the accuracy of LLMs by blending the language model process with a document look-up or web search process, thereby reducing AI hallucinations and the need for retraining with new data. It also allows LLMs to include sources in their responses, providing greater transparency and enabling users to verify the cited information. The term RAG was introduced in a 2020 research paper from Meta.'
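
Conceptually, piping these three pieces together behaves like left-to-right function composition. The sketch below illustrates that idea with plain functions only; `pipe`, `fake_graph`, and the `_sketch` helpers are hypothetical names, and the fake graph returns a canned answer instead of running the agent.

```python
from functools import reduce
from typing import Any, Callable


def pipe(*steps: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose steps left to right, so pipe(f, g)(x) == g(f(x))."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def convert_inputs_sketch(input_object: dict) -> dict:
    """Wrap the raw question in the state shape the graph expects."""
    return {"messages": [input_object["question"]]}


def fake_graph(state: dict) -> dict:
    """Hypothetical stand-in for the compiled graph: appends one answer."""
    return {"messages": state["messages"] + ["RAG combines retrieval with generation."]}


def parse_output_sketch(state: dict) -> str:
    """Return only the final message."""
    return state["messages"][-1]


chain = pipe(convert_inputs_sketch, fake_graph, parse_output_sketch)
print(chain({"question": "What is RAG?"}))
```

The wrapper exists so that the evaluation dataset can use a simple `{"question": ...}` input and receive a plain string back, rather than a full state object.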

### Task 1: Creating An Evaluation Dataset

Just as we saw last week, we'll want to create a dataset to test our Agent's ability to answer questions.

In order to do this - we'll want to provide some questions and some answers. Let's look at how we can create such a dataset below.

```python
questions = [
    "What optimizer is used in QLoRA?",
    "What data type was created in the QLoRA paper?",
    "What is a Retrieval Augmented Generation system?",
    "Who authored the QLoRA paper?",
    "What is the most popular deep learning framework?",
    "What significant improvements does the LoRA system make?"
]

answers = [
    {"must_mention" : ["paged", "optimizer"]},
    {"must_mention" : ["NF4", "NormalFloat"]},
    {"must_mention" : ["ground", "context"]},
    {"must_mention" : ["Tim", "Dettmers"]},
    {"must_mention" : ["PyTorch", "TensorFlow"]},
    {"must_mention" : ["reduce", "parameters"]},
]
```

#### 🏗️ Activity #3:

Please create a dataset in the above format with at least 5 questions.
##### ✅ Answer:

questions = [<br>
    "Who did Elon Musk found Zip2 with?",<br>
    "Where was Elon Musk born?",<br>
    "Who are Elon Musk’s Parents?",<br>
    "Who did Elon Musk endorse for the 2024 presidential race?",<br>
    "What government organization was Elon Musk de facto head of?"<br>
]<br>

answers = [<br>
    {"must_mention" : ["Kimbal", "Greg"]},<br>
    {"must_mention" : ["Africa", "Pretoria"]},<br>
    {"must_mention" : ["Errol", "Maye"]},<br>
    {"must_mention" : ["Republican", "Trump"]},<br>
    {"must_mention" : ["Government", "Efficiency"]}<br>
]

In [39]:
questions = [
    "What optimizer is used in QLoRA?",
    "What data type was created in the QLoRA paper?",
    "What is a Retrieval Augmented Generation system?",
    "Who authored the QLoRA paper?",
    "What is the most popular deep learning framework?",
    "What significant improvements does the LoRA system make?"
]

answers = [
    {"must_mention" : ["paged", "optimizer"]},
    {"must_mention" : ["NF4", "NormalFloat"]},
    {"must_mention" : ["ground", "context"]},
    {"must_mention" : ["Tim", "Dettmers"]},
    {"must_mention" : ["PyTorch", "TensorFlow"]},
    {"must_mention" : ["reduce", "parameters"]},
]

Now we can add our dataset to our LangSmith project using the following code which we saw last Thursday!

In [40]:
from langsmith import Client

client = Client()

dataset_name = f"Retrieval Augmented Generation - Evaluation Dataset - {uuid4().hex[0:8]}"

dataset = client.create_dataset(
    dataset_name=dataset_name,
    description="Questions about the QLoRA Paper to Evaluate RAG over the same paper."
)

client.create_examples(
    inputs=[{"question" : q} for q in questions],
    outputs=answers,
    dataset_id=dataset.id,
)

{'example_ids': ['e8886f66-8640-4caf-9892-0f059191f604',
  '2d265c8f-7d08-468d-b35a-750fa43f3660',
  'ed002aa1-6d84-4568-95e0-a5efb3e0c0f0',
  '4cf9b5fc-b7c0-44b9-8d6c-baccd4a599b3',
  '7d71c2f4-ada8-4e93-b0fe-e935da78d6cc',
  'bc1b3968-9a2c-4b64-91f4-e5a044bfd0a0'],
 'count': 6}

#### ❓ Question #3:

How are the correct answers associated with the questions?

> NOTE: Feel free to indicate if this is problematic or not
##### ✅ Answer:
Structurally, answers are associated with questions by index across the two lists. In terms of content, each answer lists words that we expect, and that the evaluator will require, the model to include in its response to the corresponding question. This is potentially problematic if the test set is ambiguous or poorly designed, but it can also be useful for enforcing certain responses from the model.
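
Because the pairing relies purely on list position, a silent misalignment (one list longer than the other) is easy to miss. A minimal sketch of a guard, using a hypothetical helper `pair_examples` that is not part of LangSmith:

```python
from typing import Dict, List


def pair_examples(questions: List[str], answers: List[Dict]) -> List[Dict]:
    """Pair each question with the answer at the same index, checking lengths first."""
    if len(questions) != len(answers):
        raise ValueError(
            f"{len(questions)} questions but {len(answers)} answers; lists must align"
        )
    return [
        {"inputs": {"question": q}, "outputs": a}
        for q, a in zip(questions, answers)
    ]


sample_questions = ["What optimizer is used in QLoRA?", "Who authored the QLoRA paper?"]
sample_answers = [
    {"must_mention": ["paged", "optimizer"]},
    {"must_mention": ["Tim", "Dettmers"]},
]

examples = pair_examples(sample_questions, sample_answers)
print(examples[1])
```

Keeping each question and its expected answer together in one record, as this helper does, removes the dependence on list order once the data has been built.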

### Task 2: Adding Evaluators

Now we can add a custom evaluator to see if our responses contain the expected information.

We'll be using a fairly naive exact-match process to determine if our response contains specific strings.

In [41]:
from langsmith.evaluation import EvaluationResult, run_evaluator

@run_evaluator
def must_mention(run, example) -> EvaluationResult:
    prediction = run.outputs.get("output") or ""
    required = example.outputs.get("must_mention") or []
    score = all(phrase in prediction for phrase in required) #returns true if all phrases in the required list are in the prediction
    return EvaluationResult(key="must_mention", score=score)

#### ❓ Question #4:

What are some ways you could improve this metric as-is?

> NOTE: Alternatively you can suggest where gaps exist in this method.
##### ✅ Answer:
One way to improve the metric as-is is to make the matching case-insensitive. Another is to replace exact matching with some kind of fuzzy or semantic matching, so that paraphrases of the required phrases are also captured.
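
As a sketch of the first suggestion, the function below does a case-insensitive check and returns a fractional score along with the phrases that were missing. It is plain Python and not wired into LangSmith; `must_mention_score` and its return shape are assumptions made for illustration.

```python
from typing import Dict, List


def must_mention_score(prediction: str, required: List[str]) -> Dict[str, object]:
    """Case-insensitive substring check that also reports which phrases are missing."""
    text = prediction.lower()
    missing = [phrase for phrase in required if phrase.lower() not in text]
    found = len(required) - len(missing)
    # Fraction of required phrases present; an empty requirement list counts as a pass.
    score = found / len(required) if required else 1.0
    return {"score": score, "missing": missing}


result = must_mention_score(
    "QLoRA uses a Paged AdamW optimizer.", ["paged", "optimizer", "NF4"]
)
print(result)
```

A fractional score distinguishes "almost right" from "entirely wrong", which an all-or-nothing boolean cannot do. This is still substring matching, so it does not address paraphrases.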

### Task 3: Evaluating

All that is left to do is evaluate our agent's response!

In [42]:
experiment_results = client.evaluate(
    agent_chain_with_formatting,
    data=dataset_name,
    evaluators=[must_mention],
    experiment_prefix=f"Search Pipeline - Evaluation - {uuid4().hex[0:4]}",
    metadata={"version": "1.0.0"},
)

View the evaluation results for experiment: 'Search Pipeline - Evaluation - a9cf-60b7714b' at:
https://smith.langchain.com/o/b538314a-4903-4172-841d-f92d30cdda3d/datasets/c1bef398-6e45-4381-9b69-a24da44e0362/compare?selectedSessions=9d8ecb24-0da0-4542-9108-ea418b4b5769





In [None]:
experiment_results #not sure what this is supposed to do, so added result output below

<ExperimentResults Search Pipeline - Evaluation - a9cf-60b7714b>


In [None]:
experiment_results._results

[{'run': RunTree(id=8582c9db-e698-458e-8c8f-7b1e6ac92973, name='Target', run_type='chain', dotted_order='20250714T121603006745Z8582c9db-e698-458e-8c8f-7b1e6ac92973'),
  'example': <class 'langsmith.schemas.Example'>(id=2d265c8f-7d08-468d-b35a-750fa43f3660, dataset_id=c1bef398-6e45-4381-9b69-a24da44e0362, link='https://smith.langchain.com/o/b538314a-4903-4172-841d-f92d30cdda3d/datasets/c1bef398-6e45-4381-9b69-a24da44e0362/e/2d265c8f-7d08-468d-b35a-750fa43f3660'),
  'evaluation_results': {'results': [EvaluationResult(key='must_mention', score=True, value=None, comment=None, correction=None, evaluator_info={}, feedback_config=None, source_run_id=UUID('c1e793ea-a25e-484b-9269-52d6dbe8f8ee'), target_run_id=None, extra=None)]}},
 {'run': RunTree(id=4261f78a-2e6e-4180-9f01-97241ae24b7f, name='Target', run_type='chain', dotted_order='20250714T121612181789Z4261f78a-2e6e-4180-9f01-97241ae24b7f'),
  'example': <class 'langsmith.schemas.Example'>(id=4cf9b5fc-b7c0-44b9-8d6c-baccd4a599b3, dataset_id

## Part 2: LangGraph with Helpfulness

### Task 3: Adding Helpfulness Check and "Loop" Limits

Now that we've done evaluation - let's see if we can add an extra step where we review the content we've generated to confirm if it fully answers the user's query!

We're going to make a few key adjustments to account for this:

1. We're going to add an artificial limit on how many "loops" the agent can go through - this will help us to avoid the potential situation where we never exit the loop.
2. We'll add to our existing conditional edge to obtain the behaviour we desire.

First, let's define our state again - we can check the length of the state object, so we don't need additional state for this.

In [61]:
class AgentState(TypedDict):
  messages: Annotated[list, add_messages]

Now we can set our graph up! This process will be almost entirely the same - with the inclusion of one additional node/conditional edge!

#### 🏗️ Activity #5:

Please write markdown for the following cells to explain what each is doing.

##### ✅ Answer:
Here we start to configure our graph by first initializing the graph and then adding an agent node and an action node.

In [64]:
graph_with_helpfulness_check = StateGraph(AgentState)

graph_with_helpfulness_check.add_node("agent", call_model)
graph_with_helpfulness_check.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x110b334d0>

##### ✅ Answer:
Next we provide an entry point to our graph, specifically the agent node. This will tell our graph where to direct the initial user input.

In [65]:
graph_with_helpfulness_check.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x110b334d0>

##### ✅ Answer:
Here we define our conditional edge. (explanation of functionality in activity 4 below)

In [66]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

def tool_call_or_helpful(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  initial_query = state["messages"][0]
  final_response = state["messages"][-1]

  if len(state["messages"]) > 10:
    return "end"  # must match a key in the conditional-edge mapping ("END" is not one)

  prompt_template = """\
  Given an initial query and a final response, determine if the final response is extremely helpful or not. Please indicate helpfulness with a 'Y' and unhelpfulness as an 'N'.

  Initial Query:
  {initial_query}

  Final Response:
  {final_response}"""

  helpfullness_prompt_template = PromptTemplate.from_template(prompt_template)

  helpfulness_check_model = ChatOpenAI(model="gpt-4.1-mini")

  helpfulness_chain = helpfullness_prompt_template | helpfulness_check_model | StrOutputParser()

  helpfulness_response = helpfulness_chain.invoke({"initial_query" : initial_query.content, "final_response" : final_response.content})

  if "Y" in helpfulness_response:
    return "end"
  else:
    return "continue"

#### 🏗️ Activity #4:

Please write what is happening in our `tool_call_or_helpful` function!
##### ✅ Answer:
This conditional edge first checks the last message in our state, similar to the `should_continue` edge. If the last message contains tool calls, it returns `"action"`. Otherwise, it stores the first message (the initial query) and the last message (the final response) in variables; note that `final_response` duplicates the existing `last_message` variable. It then checks the number of messages in the state object and ends the graph if there are more than 10, which imposes a limit on state changes, or loops. Next, a prompt template is composed that asks a model to judge, given the initial query, how helpful the final response is. That prompt is sent to an OpenAI model via LCEL, and the response is checked for a `Y`, indicating that the model judged the response helpful. If a `Y` is present the edge returns `"end"`; otherwise it returns `"continue"`.
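
The order of those checks can be sketched without any LLM by passing the helpfulness judge in as a function. This is a simplified illustration rather than the notebook's function: `route` and `always_unhelpful` are hypothetical names, messages are plain dictionaries, and the judge is a stub.

```python
from typing import Callable, Dict, List


def route(
    messages: List[Dict],
    is_helpful: Callable[[str, str], bool],
    max_messages: int = 10,
) -> str:
    """Return "action", "end", or "continue", checking tool calls, then length, then helpfulness."""
    last = messages[-1]
    if last.get("tool_calls"):
        return "action"      # the model asked for a tool
    if len(messages) > max_messages:
        return "end"         # loop limit reached
    if is_helpful(messages[0]["content"], last["content"]):
        return "end"         # judge accepted the answer
    return "continue"        # send it back to the agent


def always_unhelpful(query: str, response: str) -> bool:
    """Hypothetical judge stub standing in for the LLM call."""
    return False


short_history = [{"content": "What is LoRA?"}, {"content": "A fine-tuning method."}]
long_history = short_history * 6  # 12 messages, over the limit of 10

print(route(short_history, always_unhelpful))
print(route(long_history, always_unhelpful))
```

Injecting the judge this way also makes the routing logic easy to unit test, since the only non-deterministic part (the model call) is replaced by a stub.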

##### ✅ Answer:
Here we add our conditional edge to the graph, starting from the agent node. We also provide a mapping from the labels returned by `tool_call_or_helpful` to the nodes they lead to.

In [67]:
graph_with_helpfulness_check.add_conditional_edges(
    "agent",
    tool_call_or_helpful,
    {
        "continue" : "agent",
        "action" : "action",
        "end" : END
    }
)

<langgraph.graph.state.StateGraph at 0x110b334d0>

##### ✅ Answer:
Here we complete our reasoning action loop by inserting an edge into the graph connecting the action node to the agent node.

In [68]:
graph_with_helpfulness_check.add_edge("action", "agent")

<langgraph.graph.state.StateGraph at 0x110b334d0>

##### ✅ Answer:
Here we compile our graph, which gives us a runnable agent.

In [70]:
agent_with_helpfulness_check = graph_with_helpfulness_check.compile()

##### ✅ Answer:
Here we pass a sample prompt to our graph and output the updates from each node as the graph runs and we traverse through the graph.

In [71]:
inputs = {"messages" : [HumanMessage(content="Related to machine learning, what is LoRA? Also, who is Tim Dettmers? Also, what is Attention?")]}

async for chunk in agent_with_helpfulness_check.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_FSIzT9bxaKSrB3Uh1D0zLJjW', 'function': {'arguments': '{"query": "LoRA machine learning"}', 'name': 'wikipedia'}, 'type': 'function'}, {'id': 'call_L9DOxj2DOuNKlFRdcTtdVbtm', 'function': {'arguments': '{"query": "Tim Dettmers"}', 'name': 'wikipedia'}, 'type': 'function'}, {'id': 'call_lMmjVQqdPJp6dlUwmC2XpC32', 'function': {'arguments': '{"query": "Attention (machine learning)"}', 'name': 'wikipedia'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 65, 'prompt_tokens': 235, 'total_tokens': 300, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_a288987b44', 'id': 'chatcmpl-BtEQ1BWtGK6VVh57fsViUCl0k0JgK', 'service_tier': 'd





Receiving update from node: 'action'
[ToolMessage(content='Page: Fine-tuning (deep learning)\nSummary: In deep learning, fine-tuning is an approach to transfer learning in which the parameters of a pre-trained neural network model are trained on new data. Fine-tuning can be done on the entire neural network, or on only a subset of its layers, in which case the layers that are not being fine-tuned are "frozen" (i.e., not changed during backpropagation). A model may also be augmented with "adapters" that consist of far fewer parameters than the original model, and fine-tuned in a parameter-efficient way by tuning the weights of the adapters and leaving the rest of the model\'s weights frozen.\nFor some architectures, such as convolutional neural networks, it is common to keep the earlier layers (those closest to the input layer) frozen, as they capture lower-level features, while later layers often discern high-level features that can be more related to the task that the model is trained

### Task 4: LangGraph for the "Patterns" of GenAI

Let's ask our system about the 4 patterns of Generative AI:

1. Prompt Engineering
2. RAG
3. Fine-tuning
4. Agents

In [72]:
patterns = ["prompt engineering", "RAG", "fine-tuning", "LLM-based agents"]

In [73]:
for pattern in patterns:
  what_is_string = f"What is {pattern} and when did it break onto the scene??"
  inputs = {"messages" : [HumanMessage(content=what_is_string)]}
  messages = agent_with_helpfulness_check.invoke(inputs)
  print(messages["messages"][-1].content)
  print("\n\n")

**Prompt Engineering** is the process of crafting instructions to produce the best possible output from a generative AI model. It involves structuring natural language text to describe the task an AI should perform, whether it's a query, command, or a detailed statement including context and instructions. This technique is crucial for interacting with models like text-to-text, text-to-image, or text-to-audio, where prompts guide the AI to generate desired outputs.

**History and Emergence:**
Prompt engineering has its roots in the early days of AI, starting with rule-based inputs in the 1950s. It evolved significantly with advancements in machine learning, recurrent neural networks, and deep learning innovations in the 2000s. The development of transformer architectures and GPT models around 2017 marked a significant leap, making prompt engineering a more structured research domain. The release of models like ChatGPT in 2022 further highlighted its importance as a business skill, altho





Fine-tuning in deep learning is an approach to transfer learning where the parameters of a pre-trained neural network model are trained on new data. This can be done on the entire network or just a subset of its layers, with the rest being "frozen" (i.e., not changed during backpropagation). Fine-tuning is typically accomplished via supervised learning, but there are also techniques to fine-tune a model using weak supervision. It can be combined with reinforcement learning from human feedback to produce models like ChatGPT.

Fine-tuning became more prominent with the rise of deep learning and transfer learning techniques. The concept of transfer learning, which fine-tuning is a part of, has been around for a while, but it gained significant attention with the development of large pre-trained models like BERT and GPT in the late 2010s. These models demonstrated the effectiveness of fine-tuning for specific tasks, leading to its widespread adoption in various AI applications.



LLM-base