# LangGraph and LangSmith - Agentic RAG Powered by LangChain

In this notebook we'll complete the following tasks:

- 🤝 Breakout Room #1:
  1. Install required libraries
  2. Set Environment Variables
  3. Creating our Tool Belt
  4. Creating Our State
  5. Creating and Compiling A Graph!

- 🤝 Breakout Room #2:
  1. Evaluating the LangGraph Application with LangSmith
  2. Adding Helpfulness Check and "Loop" Limits
  3. LangGraph for the "Patterns" of GenAI

# 🤝 Breakout Room #1

## Part 1: LangGraph - Building Cyclic Applications with LangChain

LangGraph is a tool that leverages LangChain Expression Language to build coordinated multi-actor and stateful applications that include cyclic behaviour.

### Why Cycles?

In essence, we can think of a cycle in our graph as a more robust and customizable loop. It allows us to keep our application agent-forward while still giving the powerful functionality of traditional loops.

Because we use cycles rather than plain loops, we can also compose rather complex flows through our graph in a much more readable and natural fashion, effectively recreating application flowcharts in code in an almost 1-to-1 fashion.

### Why LangGraph?

Beyond the agent-forward approach - we can easily compose and combine traditional "DAG" (directed acyclic graph) chains with powerful cyclic behaviour due to the tight integration with LCEL. This means it's a natural extension to LangChain's core offerings!

## Task 1:  Dependencies


## Task 2: Environment Variables

We'll want to set both our OpenAI API key and our LangSmith environment variables.

In [1]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

In [2]:
os.environ["TAVILY_API_KEY"] = getpass.getpass("TAVILY_API_KEY")

In [3]:
from uuid import uuid4

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = f"AIE7 - LangGraph - {uuid4().hex[0:8]}"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("LangSmith API Key: ")

## Task 3: Creating our Tool Belt

As is usually the case, we'll want to equip our agent with a toolbelt to help answer questions and add external knowledge.

There's a tonne of tools in the [LangChain Community Repo](https://github.com/langchain-ai/langchain-community/tree/main/libs/community) but we'll stick to a couple just so we can observe the cyclic nature of LangGraph in action!

We'll leverage:

- [Tavily Search Results](https://github.com/langchain-ai/langchain-community/blob/main/libs/community/langchain_community/tools/tavily_search/tool.py)
- [Arxiv](https://github.com/langchain-ai/langchain-community/blob/main/libs/community/langchain_community/tools/arxiv/tool.py)

#### 🏗️ Activity #1:

Please add the tools to use into our toolbelt.

> NOTE: Each tool in our toolbelt should be a method.

In [4]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_community.tools.arxiv.tool import ArxivQueryRun

tavily_tool = TavilySearchResults(max_results=5)

tool_belt = [
    tavily_tool,
    ArxivQueryRun(),
]


### Model

Now we can set-up our model! We'll leverage the familiar OpenAI model suite for this example - but it's not *necessary* to use with LangGraph. LangGraph supports all models - though you might not find success with smaller models - as such, they recommend you stick with:

- OpenAI's GPT-3.5 and GPT-4
- Anthropic's Claude
- Google's Gemini

> NOTE: Because we're leveraging the OpenAI function calling API - we'll need to use OpenAI *for this specific example* (or any other service that exposes an OpenAI-style function calling API).

In [9]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4.1-nano", temperature=0)

Now that we have our model set-up, let's "put on the tool belt", which is to say: We'll bind our LangChain formatted tools to the model in an OpenAI function calling format.

In [10]:
model = model.bind_tools(tool_belt)

#### ❓ Question #1:

How does the model determine which tool to use?

#### Answer to Question 1:
The LLM does the actual routing / decision making. `bind_tools` sends each tool's name, description, and argument schema along with the prompt, and the model chooses a tool (or none) by matching the user's request against those descriptions.

arxiv example: “Search peer‑reviewed and pre‑print papers on arXiv. Use for academic literature queries (e.g., ‘latest transformer paper’).”
(Mentions “paper”, “arXiv”, “PDF”, “academic”, so questions containing those cues tend to trigger the call.)

tavily example: “Do a live internet search and return factual web snippets from trusted sites. Use for general, real‑time questions.”
(Mentions “real‑time”, “internet”, so news‑y or generic queries tend to be routed here.)
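
To make the answer above concrete, here is a minimal sketch of the kind of metadata the model chooses from. It assumes the OpenAI-style function-calling layout; the tool names are the ones that appear in this notebook's traces, but the descriptions are the illustrative ones from the answer (not the tools' real descriptions), and `describe_tools` is a hypothetical helper.

```python
# Hand-written, simplified stand-ins for the tool schemas a bound model receives.
# Descriptions are illustrative only, not the real ones shipped with the tools.
tool_schemas = [
    {
        "type": "function",
        "function": {
            "name": "arxiv",
            "description": "Search papers on arXiv. Use for academic literature queries.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "tavily_search_results_json",
            "description": "Do a live internet search. Use for general, real-time questions.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
]


def describe_tools(schemas):
    """Return the (name, description) pairs the model gets to choose from."""
    return [(s["function"]["name"], s["function"]["description"]) for s in schemas]


for name, description in describe_tools(tool_schemas):
    print(f"{name}: {description}")
```

The model never sees the tool's Python code, only this metadata, which is why clear and distinct descriptions matter for routing.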

## Task 4: Putting the State in Stateful

Earlier we used this phrasing:

`coordinated multi-actor and stateful applications`

So what does that "stateful" mean?

To put it simply - we want to have some kind of object which we can pass around our application that holds information about what the current situation (state) is. Since our system will be constructed of many parts moving in a coordinated fashion - we want to be able to ensure we have some commonly understood idea of that state.

LangGraph leverages a `StateGraph` which uses an `AgentState` object to pass information between the various nodes of the graph.

There are more options than what we'll see below - but this `AgentState` object is one that is stored in a `TypedDict` with the key `messages` and the value is a list of `BaseMessage` objects that will be appended to whenever the state changes.

Let's think about a simple example to help understand exactly what this means (we'll simplify a great deal to try and clearly communicate what state is doing):

1. We initialize our state object:
  - `{"messages" : []}`
2. Our user submits a query to our application.
  - New State: `HumanMessage(#1)`
  - `{"messages" : [HumanMessage(#1)]}`
3. We pass our state object to an Agent node which is able to read the current state. It will use the last `HumanMessage` as input. It gets some kind of output which it will add to the state.
  - New State: `AgentMessage(#1, additional_kwargs {"function_call" : "WebSearchTool"})`
  - `{"messages" : [HumanMessage(#1), AgentMessage(#1, ...)]}`
4. We pass our state object to a "conditional node" (more on this later) which reads the last state to determine if we need to use a tool - which it can determine properly because of our provided object!
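
The walk-through above can be sketched in plain Python. This is a simplified stand-in, not LangGraph itself: it uses `operator.add` as the reducer and plain strings in place of message objects, and `SketchState` and `apply_update` are hypothetical names. The real `add_messages` reducer does more than append (for example, it can replace a message that has a matching ID).

```python
import operator
from typing import Annotated, TypedDict


class SketchState(TypedDict):
    # The second Annotated argument names the reducer used to merge updates.
    messages: Annotated[list, operator.add]


def apply_update(state, update):
    """Merge a node's partial update into the state by appending its messages."""
    return {"messages": operator.add(state["messages"], update["messages"])}


# 1. Initialise the state.
state = {"messages": []}
# 2. The user submits a query.
state = apply_update(state, {"messages": ["HumanMessage(#1)"]})
# 3. The agent node reads the state and adds its output.
state = apply_update(state, {"messages": ["AgentMessage(#1, function_call=WebSearchTool)"]})

print(state["messages"])
```

Each node only returns the new messages; the reducer is what turns those partial updates into a growing history.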

In [11]:
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages
import operator
from langchain_core.messages import BaseMessage

class AgentState(TypedDict):
  messages: Annotated[list, add_messages]

## Task 5: It's Graphing Time!

Now that we have state, and we have tools, and we have an LLM - we can finally start making our graph!

Let's take a second to refresh ourselves about what a graph is in this context.

Graphs, also called networks in some circles, are a collection of connected objects.

The objects in question are typically called nodes, or vertices, and the connections are called edges.

Let's look at a simple graph.

![image](https://i.imgur.com/2NFLnIc.png)

Here, we're using the coloured circles to represent the nodes and the yellow lines to represent the edges. In this case, we're looking at a fully connected graph - where each node is connected by an edge to each other node.

If we were to think about nodes in the context of LangGraph - we would think of a function, or an LCEL runnable.

If we were to think about edges in the context of LangGraph - we might think of them as "paths to take" or "where to pass our state object next".

Let's create some nodes and expand on our diagram.

> NOTE: Due to the tight integration with LCEL - we can comfortably create our nodes in an async fashion!

In [12]:
from langgraph.prebuilt import ToolNode

def call_model(state):
  messages = state["messages"]
  response = model.invoke(messages)
  return {"messages" : [response]}

tool_node = ToolNode(tool_belt)

Now we have two total nodes. We have:

- `call_model` is a node that will...well...call the model
- `tool_node` is a node which can call a tool

Let's start adding nodes! We'll update our diagram along the way to keep track of what this looks like!


In [13]:
from langgraph.graph import StateGraph, END

uncompiled_graph = StateGraph(AgentState)

uncompiled_graph.add_node("agent", call_model)
uncompiled_graph.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x121809940>

Let's look at what we have so far:

![image](https://i.imgur.com/md7inqG.png)

Next, we'll add our entrypoint. All our entrypoint does is indicate which node is called first.

In [14]:
uncompiled_graph.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x121809940>

![image](https://i.imgur.com/wNixpJe.png)

Now we want to build a "conditional edge" which will use the output state of a node to determine which path to follow.

We can help conceptualize this by thinking of our conditional edge as a conditional in a flowchart!

Notice how our function simply checks whether the last message contains any `tool_calls`.

Then we create an edge where the origin node is our agent node and our destination node is *either* the action node or the END (finish the graph).

It's important to highlight that if a dictionary is passed in as the third parameter (the mapping), it should be created with the possible outputs of our conditional function in mind. In this case `should_continue` returns either `"action"` (the name of our action node) or `END` directly, so no mapping is needed.
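
When a mapping is used, the idea can be sketched without LangGraph as a router function plus a dictionary lookup. This is an illustration under assumptions: messages are plain dicts, and `END_SKETCH`, `route`, `path_map`, and `next_node` are hypothetical names standing in for the library's machinery.

```python
END_SKETCH = "__end__"  # hypothetical stand-in for LangGraph's END sentinel


def route(state):
    """Return a route key based on whether the last message requests a tool."""
    last_message = state["messages"][-1]
    return "continue" if last_message.get("tool_calls") else "end"


# Every value `route` can return needs an entry in the mapping.
path_map = {"continue": "action", "end": END_SKETCH}


def next_node(state):
    """Resolve the router's key to a destination node name."""
    return path_map[route(state)]


with_tool_call = {"messages": [{"content": "", "tool_calls": [{"name": "arxiv"}]}]}
without_tool_call = {"messages": [{"content": "Done.", "tool_calls": []}]}

print(next_node(with_tool_call))
print(next_node(without_tool_call))
```

If the router returned a key that is missing from the mapping, the lookup would fail, which is why the mapping has to be written with the router's outputs in mind.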

In [15]:
def should_continue(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  return END

uncompiled_graph.add_conditional_edges(
    "agent",
    should_continue
)

<langgraph.graph.state.StateGraph at 0x121809940>

Let's visualize what this looks like.

![image](https://i.imgur.com/8ZNwKI5.png)

Finally, we can add our last edge which will connect our action node to our agent node. This is because we *always* want our action node (which is used to call our tools) to return its output to our agent!

In [16]:
uncompiled_graph.add_edge("action", "agent")

<langgraph.graph.state.StateGraph at 0x121809940>

Let's look at the final visualization.

![image](https://i.imgur.com/NWO7usO.png)

All that's left to do now is to compile our workflow - and we're off!

In [17]:
simple_agent_graph = uncompiled_graph.compile()

#### ❓ Question #2:

Is there any specific limit to how many times we can cycle?

If not, how could we impose a limit to the number of cycles?

Yes - LangGraph already enforces a hard stop.
Every run carries a `recursion_limit` in its config. <br>

The default is 25 steps - once the flow has taken 25 steps without reaching END, LangGraph raises `GraphRecursionError`. <br>

A ReAct-style agent normally burns two steps per “iteration” (LLM-thinking step + tool step), so the default lets it loop about 12 times before it is stopped. <br>

You can lower or raise that cap with the `recursion_limit` config key when you call the graph, e.g. `config={"recursion_limit": 50}`.
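
The effect of a step cap on an agent/tool cycle can be sketched in plain Python. This is a simplified model, not LangGraph's implementation: `StepLimitError` stands in for `GraphRecursionError`, and `run_cycle` and the `needs_tool` callback are hypothetical.

```python
class StepLimitError(RuntimeError):
    """Hypothetical stand-in for LangGraph's GraphRecursionError."""


def run_cycle(needs_tool, step_limit=25):
    """Alternate agent -> action until the agent stops asking for tools.

    `needs_tool(iteration)` says whether the agent requests a tool on that
    iteration. Returns the number of steps taken before reaching the end.
    """
    steps = 0
    iteration = 0
    node = "agent"
    while node != "end":
        if steps >= step_limit:
            raise StepLimitError(f"no stop condition reached after {step_limit} steps")
        steps += 1
        if node == "agent":
            node = "action" if needs_tool(iteration) else "end"
        else:
            # The action node always hands control back to the agent.
            node = "agent"
            iteration += 1
    return steps


# Two tool rounds, then a final answer: agent, action, agent, action, agent.
print(run_cycle(lambda i: i < 2))
```

An agent that asks for a tool on every turn never reaches the end, so the cap is what stops it.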

## Using Our Graph

Now that we've created and compiled our graph - we can call it *just as we'd call any other* `Runnable`!

Let's try out a few examples to see how it fares:

In [18]:
from langchain_core.messages import HumanMessage

inputs = {"messages" : [HumanMessage(content="Who is the current captain of the Winnipeg Jets?")]}

async for chunk in simple_agent_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_VpltXQYCFh2BOQzGIc74EEKM', 'function': {'arguments': '{"query":"current captain of the Winnipeg Jets"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 162, 'total_tokens': 185, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': None, 'id': 'chatcmpl-BtZZOovewsxXPoC8yOAPTfNM9KR2U', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--97ddc0f3-2e07-4bb8-98c8-5ac5bbea0952-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'current captain of the Winnipeg Jets'}, 'id': 'call_VpltXQYCFh2BOQzGIc74EEKM', 'type': 

Let's look at what happened:

1. Our state object was populated with our request
2. The state object was passed into our entry point (agent node) and the agent node added an `AIMessage` to the state object and passed it along the conditional edge
3. The conditional edge received the state object, found the "tool_calls" `additional_kwarg`, and sent the state object to the action node
4. The action node ran the requested tool, added the result to the state object as a `ToolMessage`, and passed it along the edge to the agent node
5. The agent node added a response to the state object and passed it along the conditional edge
6. The conditional edge received the state object, could not find the "tool_calls" `additional_kwarg` and passed the state object to END where we see it output in the cell above!

Now let's look at an example that shows multiple tool usage - all with the same flow!

In [19]:
inputs = {"messages" : [HumanMessage(content="Search Arxiv for the QLoRA paper, then search each of the authors to find out their latest Tweet using Tavily!")]}

async for chunk in simple_agent_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        if node == "action":
          print(f"Tool Used: {values['messages'][0].name}")
        print(values["messages"])

        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_8afVvBr3URymTMuLDyWNymXJ', 'function': {'arguments': '{"query": "QLoRA"}', 'name': 'arxiv'}, 'type': 'function'}, {'id': 'call_lyOKfefq1OWzAILL76car9tc', 'function': {'arguments': '{"query": "latest Tweet of the first author of QLoRA"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 59, 'prompt_tokens': 178, 'total_tokens': 237, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': None, 'id': 'chatcmpl-BtZajmNib0RRiLfF3ZuwVtB5A2I1o', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--235eb143-935a-4c08-994a-582d091657bb-0', tool_calls=[{'name': 'ar

#### 🏗️ Activity #2:

Please write out the steps the agent took to arrive at the correct answer.

Step 1 / Node Agent:
<br>
Decomposed the multi‑topic prompt into three sub‑questions.
Tool‑planning:
- “LoRA” → academic paper ⇒ arxiv
- “Tim Dettmers” → biographical info ⇒ tavily_search_results_json
- “attention mechanism” → survey papers ⇒ arxiv

Returned a single assistant message containing three function calls (one per sub‑task).
<br><br>

Step 2 / Node Action:
<br>
LangGraph executed the three calls in parallel:
- arxiv("LoRA machine learning") → 3 latest LoRA papers
- tavily_search_results_json("Tim Dettmers") → biography & interviews
- arxiv("Attention mechanism in machine learning") → 3 survey papers

Each result set was inserted into state under its call‑ID.
<br><br>

Step 3 / Node Agent:
<br>
Read all three result objects, distilled key facts, and wrote a multi‑section answer summarising:
- what LoRA is and recent variants (KD‑LoRA, LoRA‑drop)
- who Tim Dettmers is and his role in QLoRA/bits‑and‑bytes
- why attention mechanisms matter and which variants are common.

The message had no further tool calls, so the conditional edge routed to END and the graph halted.



# 🤝 Breakout Room #2

## Part 1: LangSmith Evaluator

### Pre-processing for LangSmith

To do a little bit more preprocessing, let's wrap our LangGraph agent in a simple chain.

In [20]:
def convert_inputs(input_object):
  return {"messages" : [HumanMessage(content=input_object["question"])]}

def parse_output(input_state):
  return input_state["messages"][-1].content

agent_chain_with_formatting = convert_inputs | simple_agent_graph | parse_output

In [21]:
agent_chain_with_formatting.invoke({"question" : "What is RAG?"})

"RAG can refer to different concepts depending on the context. Could you please specify whether you're asking about RAG in the context of project management, machine learning, or another field?"

### Task 1: Creating An Evaluation Dataset

Just as we saw last week, we'll want to create a dataset to test our Agent's ability to answer questions.

In order to do this - we'll want to provide some questions and some answers. Let's look at how we can create such a dataset below.

```python
questions = [
    "What optimizer is used in QLoRA?",
    "What data type was created in the QLoRA paper?",
    "What is a Retrieval Augmented Generation system?",
    "Who authored the QLoRA paper?",
    "What is the most popular deep learning framework?",
    "What significant improvements does the LoRA system make?"
]

answers = [
    {"must_mention" : ["paged", "optimizer"]},
    {"must_mention" : ["NF4", "NormalFloat"]},
    {"must_mention" : ["ground", "context"]},
    {"must_mention" : ["Tim", "Dettmers"]},
    {"must_mention" : ["PyTorch", "TensorFlow"]},
    {"must_mention" : ["reduce", "parameters"]},
]
```

#### 🏗️ Activity #3:

Please create a dataset in the above format with at least 5 questions.

In [22]:
# ✅ Evaluation dataset (minimum 5 Q‑A pairs)

questions = [
    "What new quantization technique does QLoRA introduce to shrink memory usage?",
    "Which component of an RAG pipeline is responsible for fetching external documents?",
    "Name two parameter‑efficient fine‑tuning (PEFT) methods other than LoRA.",
    "In LoRA, what happens to the original pretrained weights during adaptation?",
    "Which library is most commonly used with bitsandbytes to enable 4‑bit training?",
    "What role does a planner node play inside an LLM‑based agent architecture?"
]

answers = [
    {"must_mention": ["NF4", "NormalFloat"]},
    {"must_mention": ["retriever", "search"]},
    {"must_mention": ["prefix‑tuning", "prompt‑tuning"]},
    {"must_mention": ["frozen", "unchanged"]},
    {"must_mention": ["Hugging", "Face"]},
    {"must_mention": ["decide", "steps"]},
]


Now we can add our dataset to our LangSmith project using the following code which we saw last Thursday!

In [23]:
from langsmith import Client

client = Client()

dataset_name = f"Retrieval Augmented Generation - Evaluation Dataset - {uuid4().hex[0:8]}"

dataset = client.create_dataset(
    dataset_name=dataset_name,
    description="Questions about the QLoRA Paper to Evaluate RAG over the same paper."
)

client.create_examples(
    inputs=[{"question" : q} for q in questions],
    outputs=answers,
    dataset_id=dataset.id,
)

{'example_ids': ['13331b73-c53d-470a-94d2-d69b311fa62f',
  '49c61c1c-8204-4114-96ee-579da15b37f8',
  '0ef15b00-0dbc-4029-910b-a5e8c4f43ebe',
  'de2eb49a-4fe5-4e7c-9978-04e6e8e12f36',
  '77cbf15c-5eb4-4498-b746-3213c2171c3d',
  'a3a7f72d-d5cc-4406-afc5-d61eef917b59'],
 'count': 6}

#### ❓ Question #3:

How are the correct answers associated with the questions?

> NOTE: Feel free to indicate if this is problematic or not

The association is purely positional: the first dictionary in inputs ({"question": questions[0]}) is stored together with the first dictionary in outputs (answers[0]), the second with the second, and so on. As long as the two lists are the same length and already in the correct order, the mapping is correct.

This is problematic if you later sort, filter, or extend one list without making the same change to the other: the associations will shift silently.
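
One way to make the positional pairing explicit, and to fail loudly when the lists drift apart, is sketched below. `pair_examples` is a hypothetical helper, and the sample data is taken from the example dataset earlier in this section.

```python
def pair_examples(questions, answers):
    """Pair each question with its answer, refusing mismatched list lengths."""
    if len(questions) != len(answers):
        raise ValueError(f"{len(questions)} questions but {len(answers)} answers")
    return [
        {"inputs": {"question": q}, "outputs": a}
        for q, a in zip(questions, answers)
    ]


sample_questions = [
    "What data type was created in the QLoRA paper?",
    "Who authored the QLoRA paper?",
]
sample_answers = [
    {"must_mention": ["NF4", "NormalFloat"]},
    {"must_mention": ["Tim", "Dettmers"]},
]

examples = pair_examples(sample_questions, sample_answers)
for example in examples:
    print(example["inputs"]["question"], "->", example["outputs"]["must_mention"])
```

Keeping each question and its answer in one record from the start would avoid the problem altogether; the length check only catches the most obvious kind of drift, not reordering.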

### Task 2: Adding Evaluators

Now we can add a custom evaluator to see if our responses contain the expected information.

We'll be using a fairly naive exact-match process to determine if our response contains specific strings.

In [24]:
from langsmith.evaluation import EvaluationResult, run_evaluator

@run_evaluator
def must_mention(run, example) -> EvaluationResult:
    prediction = run.outputs.get("output") or ""
    required = example.outputs.get("must_mention") or []
    score = all(phrase in prediction for phrase in required)
    return EvaluationResult(key="must_mention", score=score)

#### ❓ Question #4:

What are some ways you could improve this metric as-is?

- Exact-string match: normalise both strings (`str.lower()` → remove punctuation → collapse whitespace). Optionally stem/lemmatise with NLTK or spaCy. Allow regex patterns instead of literals, e.g. a pattern starting `\bpaged\s+optim…`.
- Boolean score: return a numeric 0‑1 score, e.g. `len(found) / len(required)`, and include the passed / missing phrases in the result's `comment` for easy debugging.
- Substring pitfalls: use word‑boundary checks (`re.search(r'\bTim\b', …)`) or full‑token comparison after tokenising.
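
A minimal sketch of those improvements as a standalone scoring function, kept separate from LangSmith so it can be tested on its own. `normalise` and `mention_score` are hypothetical names; wiring the result into an `EvaluationResult` is left out.

```python
import re
import string


def normalise(text):
    """Lower-case, replace ASCII punctuation with spaces, collapse whitespace."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return " ".join(text.lower().translate(table).split())


def mention_score(prediction, required):
    """Return (fraction of required phrases found, list of missing phrases)."""
    if not required:
        return 1.0, []
    haystack = normalise(prediction)
    missing = []
    for phrase in required:
        needle = normalise(phrase)
        # Word boundaries stop "Tim" from matching inside "Timing".
        if not re.search(rf"\b{re.escape(needle)}\b", haystack):
            missing.append(phrase)
    return (len(required) - len(missing)) / len(required), missing


print(mention_score("QLoRA introduces NF4 (4-bit NormalFloat).", ["NF4", "NormalFloat"]))
print(mention_score("Timing matters.", ["Tim"]))
```

Returning the missing phrases alongside the score makes it much easier to see why a particular example failed.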

### Task 3: Evaluating

All that is left to do is evaluate our agent's response!

In [25]:
experiment_results = client.evaluate(
    agent_chain_with_formatting,
    data=dataset_name,
    evaluators=[must_mention],
    experiment_prefix=f"Search Pipeline - Evaluation - {uuid4().hex[0:4]}",
    metadata={"version": "1.0.0"},
)

View the evaluation results for experiment: 'Search Pipeline - Evaluation - a60f-107675f2' at:
https://smith.langchain.com/o/a8b64252-5f0f-4f35-a048-c004586e098a/datasets/1cfe60e9-166a-4c82-bea4-e0f6b3524d77/compare?selectedSessions=28ae365e-af5b-401a-8404-80d82774e754





In [26]:
experiment_results

## Part 2: LangGraph with Helpfulness:

### Task 3: Adding Helpfulness Check and "Loop" Limits

Now that we've done evaluation - let's see if we can add an extra step where we review the content we've generated to confirm if it fully answers the user's query!

We're going to make a few key adjustments to account for this:

1. We're going to add an artificial limit on how many "loops" the agent can go through - this will help us to avoid the potential situation where we never exit the loop.
2. We'll add to our existing conditional edge to obtain the behaviour we desire.

First, let's define our state again - we can check the length of the state object, so we don't need additional state for this.

In [27]:
class AgentState(TypedDict):
  messages: Annotated[list, add_messages]

Now we can set our graph up! This process will be almost entirely the same - with the inclusion of one additional node/conditional edge!

#### 🏗️ Activity #5:

Please write markdown for the following cells to explain what each is doing.

Step 1:
Creates a fresh `StateGraph` whose shared state is described by the `AgentState` `TypedDict`.
From this point on, we add nodes (LLM steps, tool calls, routers, etc.) and connect them with edges to define the control‑flow of the agent.


Step 2:
Adds a node labelled "agent".

Whenever the graph engine reaches this node it will execute `call_model`, the LLM call that “thinks”: it either decides which tool to call next or produces the final answer.

The output of `call_model` is merged back into the graph’s state (`AgentState`) so downstream nodes can inspect it.


Step 3:
Adds another node labelled "action" wired to `tool_node`, the runnable that actually performs the chosen tool call (e.g., arxiv or tavily_search_results_json).

After the tool returns, its result is again written to the state so the next “agent” step can consume it, completing the ReAct‑style loop.

In [28]:
graph_with_helpfulness_check = StateGraph(AgentState)

graph_with_helpfulness_check.add_node("agent", call_model)
graph_with_helpfulness_check.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x1220c2d50>

Declare where execution starts.
This line tells LangGraph that the node named "agent" is the entry point of the workflow.

When you later invoke the compiled graph (`graph.invoke(...)`), the runtime will:

1. Initialise an empty (or user‑supplied) `AgentState`.
2. Jump straight to the "agent" node and run `call_model`.

If you don’t set an entry point, LangGraph raises an error at compile time because it doesn’t know which node to execute first.

In [29]:
graph_with_helpfulness_check.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x1220c2d50>

`tool_call_or_helpful` is a router function that:

- Sends the flow to the tool‑execution node when the LLM has requested tools.
- Ends the conversation if it has looped too many times (more than 10 messages in the state).
- Otherwise, asks a separate LLM (`gpt-4.1-mini`) to judge the answer’s helpfulness and either finishes or loops based on that verdict.
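
The three-way decision described above can be sketched as a pure function whose judge is passed in, so the routing logic can be exercised without an API key. This is an illustration, not the notebook's function: messages are plain dicts, and `route_after_agent`, `judge`, and `max_messages` are hypothetical names.

```python
def route_after_agent(messages, judge, max_messages=10):
    """Return "action", "end", or "continue" for the latest agent message.

    `messages` is a list of dicts with "content" and optional "tool_calls".
    `judge(query, response)` is a callable returning "Y" or "N".
    """
    last = messages[-1]
    if last.get("tool_calls"):
        return "action"  # the agent asked for a tool
    if len(messages) > max_messages:
        return "end"  # loop limit reached
    verdict = judge(messages[0]["content"], last["content"])
    return "end" if "Y" in verdict else "continue"


history = [
    {"content": "What is LoRA?"},
    {"content": "LoRA is a parameter-efficient fine-tuning method."},
]

print(route_after_agent(history, judge=lambda query, response: "Y"))
print(route_after_agent(history, judge=lambda query, response: "N"))
```

Injecting the judge also makes it easy to swap the helpfulness model or stub it out in tests.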

In [30]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

def tool_call_or_helpful(state):
  last_message = state["messages"][-1]

  # The agent asked for a tool: route to the action node.
  if last_message.tool_calls:
    return "action"

  initial_query = state["messages"][0]
  final_response = state["messages"][-1]

  # Loop limit: stop once the state holds more than 10 messages.
  # The key must match the conditional-edge mapping, which uses lowercase "end".
  if len(state["messages"]) > 10:
    return "end"

  prompt_template = """\
  Given an initial query and a final response, determine if the final response is extremely helpful or not. Please indicate helpfulness with a 'Y' and unhelpfulness as an 'N'.

  Initial Query:
  {initial_query}

  Final Response:
  {final_response}"""

  helpfulness_prompt_template = PromptTemplate.from_template(prompt_template)

  helpfulness_check_model = ChatOpenAI(model="gpt-4.1-mini")

  helpfulness_chain = helpfulness_prompt_template | helpfulness_check_model | StrOutputParser()

  helpfulness_response = helpfulness_chain.invoke({"initial_query" : initial_query.content, "final_response" : final_response.content})

  # "Y" means the judge found the answer helpful, so we can finish.
  if "Y" in helpfulness_response:
    return "end"
  else:
    return "continue"

#### 🏗️ Activity #4:

Please write what is happening in our `tool_call_or_helpful` function!

After every run of the "agent" node LangGraph will call
tool_call_or_helpful(state).

The router returns one of three route keys:

| Route key | Meaning | Destination node |
| --- | --- | --- |
| `"continue"` | The answer wasn’t helpful - loop back so the agent can try again. | `"agent"` (self‑loop) |
| `"action"` | The agent asked for a tool call - go execute it. | `"action"` |
| `"end"` | The answer is judged helpful (or we hit a safety stop) - finish execution. | `END` (built‑in terminal) |

Internally, add_conditional_edges stores these mappings so the runtime
can switch edges dynamically at run‑time, giving you a single‑file
ReAct‑style loop with a helpfulness fuse:

In [30]:
graph_with_helpfulness_check.add_conditional_edges(
    "agent",
    tool_call_or_helpful,
    {
        "continue" : "agent",
        "action" : "action",
        "end" : END
    }
)

<langgraph.graph.state.StateGraph at 0x7fec04fd3b10>

Adds a connection back from the action node to the agent node, closing the loop.

In [32]:
graph_with_helpfulness_check.add_edge("action", "agent")

Adding an edge to a graph that has already been compiled. This will not be reflected in the compiled graph.


<langgraph.graph.state.StateGraph at 0x1220c2d50>

Freeze the blueprint into a runnable object.
`compile()` validates the graph structure (for example, that an entry point is set and that edges refer to existing nodes), then produces a compiled graph that implements the `Runnable` interface.

In [33]:
agent_with_helpfulness_check = graph_with_helpfulness_check.compile()

| Step | What’s happening |
| --- | --- |
| `agent_with_helpfulness_check.astream(...)` | Runs the graph asynchronously and yields a chunk each time a step completes, which is handy for tracing or live dashboards. |
| `stream_mode="updates"` | Tells LangGraph to yield a **dict** keyed by node name whose value is only the update that node returned (here, the new messages). `stream_mode="values"` yields the full state after each step instead. |
| `async for chunk in ...` | Because the call is asynchronous, other tasks can run while the LLM/tool call is in flight. Each `chunk` corresponds to one completed step. |
| Inner loop | A single `chunk` can contain several nodes if they ran in the same step (possible when parallel edges exist). The code prints: 1) which node fired, 2) the `messages` that node just added. |
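
The difference between streaming per-node updates and streaming the whole state can be sketched in plain Python. This is a simplified model rather than LangGraph's implementation (it ignores details such as how the initial input is reported), and `stream_sketch` is a hypothetical helper.

```python
def stream_sketch(node_updates, stream_mode="updates"):
    """Yield what each mode would report for a list of (node, new_messages) steps."""
    state = {"messages": []}
    for node, new_messages in node_updates:
        # Append the node's output to the accumulated state.
        state = {"messages": state["messages"] + new_messages}
        if stream_mode == "updates":
            # Only what this node just returned, keyed by node name.
            yield {node: {"messages": new_messages}}
        elif stream_mode == "values":
            # The whole accumulated state after this step.
            yield state
        else:
            raise ValueError(f"unsupported mode in this sketch: {stream_mode}")


steps = [
    ("agent", ["AI: call tavily"]),
    ("action", ["Tool: search results"]),
    ("agent", ["AI: final answer"]),
]

for chunk in stream_sketch(steps, stream_mode="updates"):
    print(chunk)
```

With updates, each chunk stays small and names the node that produced it; with values, each chunk is a full snapshot, which is simpler to consume but repeats earlier messages.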


In [34]:
inputs = {"messages" : [HumanMessage(content="Related to machine learning, what is LoRA? Also, who is Tim Dettmers? Also, what is Attention?")]}

async for chunk in agent_with_helpfulness_check.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_btZJOTSU1rjlKbJYEWYBL2Xu', 'function': {'arguments': '{"query": "LoRA machine learning"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}, {'id': 'call_GqrD63oXjxe25R2jxpS4WAY8', 'function': {'arguments': '{"query": "Tim Dettmers"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}, {'id': 'call_PszAJ3ZWyVzoGEDW0UmbR03x', 'function': {'arguments': '{"query": "Attention in machine learning"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 79, 'prompt_tokens': 177, 'total_tokens': 256, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': None, 'id': 'chatcmpl-BtZ

### Task 4: LangGraph for the "Patterns" of GenAI

Let's ask our system about the 4 patterns of Generative AI:

1. Prompt Engineering
2. RAG
3. Fine-tuning
4. Agents

In [35]:
patterns = ["prompt engineering", "RAG", "fine-tuning", "LLM-based agents"]

In [36]:
for pattern in patterns:
  what_is_string = f"What is {pattern} and when did it break onto the scene??"
  inputs = {"messages" : [HumanMessage(content=what_is_string)]}
  messages = agent_with_helpfulness_check.invoke(inputs)
  print(messages["messages"][-1].content)
  print("\n\n")

Prompt engineering is the process of designing and refining prompts to effectively communicate with and elicit desired responses from AI language models like GPT-3 and GPT-4. It involves crafting specific, clear, and contextually appropriate prompts to improve the quality, relevance, and accuracy of the AI's outputs.

Prompt engineering has gained significant prominence with the rise of large language models (LLMs) around 2020-2021. As these models became more capable and widely accessible, users and developers started to recognize the importance of carefully designing prompts to maximize their utility. The practice has since evolved into a specialized skill, often considered a key aspect of working with AI in various applications, from chatbots to content creation and coding assistance.

Would you like more detailed information on its history or its techniques?



RAG, which stands for Retrieval-Augmented Generation, is a technique in natural language processing that combines pre-trai