# Google AI Agentic Engineer Training - LangChain & LangGraph - Day 1

## Initial setup

**Prerequisites**

* A valid OpenAI API Key
* A valid TavilySearch API Key
* Download the following PDF: https://cdn.openai.com/business-guides-and-resources/a-practical-guide-to-building-agents.pdf

Please run the following cells and enter your API keys when prompted.

In [None]:
%%capture --no-stderr
%pip install langchain langchain-community langchain-text-splitters langchain-openai langchain-chroma pymupdf pypdf

In [None]:
from getpass import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass("Enter your api key:")

Enter your api key:··········


## Core Components

In this first section we'll cover the basics of using LangChain to get responses from an LLM. Then we'll go over LangGraph and how we can use it, along with LangChain, to create powerful multi-agent systems.

### 1. Configure LangChain to use the LLM

First, we'll set up a LangChain chat client to use `gpt-4.1-mini` with `OpenAI`. Here, we define the model from `OpenAI` that we want to interact with as well as other settings for our client.

One common setting you should be aware of is the `temperature`, which controls the randomness of the responses. For OpenAI models it is a value between 0 and 2 (values between 0 and 1 are the most common); the higher the value, the more randomness is introduced into the generated response. You can potentially generate more "creative" responses by increasing the temperature and more deterministic responses by keeping it close to (or at) 0.

[See other settings that are available here](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html)
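
To build some intuition for what `temperature` does, here is a minimal sketch of temperature-scaled sampling probabilities. It is an illustration only, not how OpenAI or LangChain implement sampling: the function `softmax_with_temperature` and the scores in `toy_logits` are made up for this example. The sketch requires a temperature above 0, since dividing by 0 is undefined (APIs typically treat 0 as "always pick the most likely token").

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a temperature > 0."""
    scaled = [score / temperature for score in logits]
    # Subtract the max before exponentiating for numerical stability.
    max_scaled = max(scaled)
    exps = [math.exp(s - max_scaled) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.5]

low_temp = softmax_with_temperature(toy_logits, temperature=0.2)
high_temp = softmax_with_temperature(toy_logits, temperature=1.5)

print([round(p, 3) for p in low_temp])
print([round(p, 3) for p in high_temp])
```

At the low temperature nearly all of the probability lands on the top-scoring token, while the higher temperature spreads it more evenly across the candidates, which is where the extra variety comes from.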

In [None]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4.1-mini", temperature=0)

### 2. `Invoke` the LLM and handle responses

Every LLM client you create with LangChain has an `invoke` method which accepts a single prompt or series of prompts as input and returns a generated response from the LLM.

Below, we'll write a simple prompt that asks a question and print its response. You can think of this like the programming equivalent of asking ChatGPT a question:

In [None]:
response = llm.invoke("What are AI Agents?")

In [None]:
response

AIMessage(content='**AI Agents** are autonomous entities in artificial intelligence that perceive their environment through sensors and act upon that environment through actuators or effectors to achieve specific goals. They are designed to make decisions and perform tasks without human intervention, often adapting their behavior based on the information they gather.\n\n### Key Characteristics of AI Agents:\n1. **Autonomy:** Operate without direct human control.\n2. **Perception:** Sense or receive input from the environment.\n3. **Action:** Take actions that affect the environment.\n4. **Goal-oriented:** Work towards achieving specific objectives.\n5. **Adaptability:** Learn or adjust behavior based on experience or changes in the environment.\n\n### Types of AI Agents:\n- **Simple Reflex Agents:** Act only on the current percept, ignoring the rest of the percept history.\n- **Model-based Reflex Agents:** Maintain an internal state to keep track of the world.\n- **Goal-based Agents:**

In [None]:
response.usage_metadata

{'input_tokens': 12,
 'output_tokens': 270,
 'total_tokens': 282,
 'input_token_details': {'audio': 0, 'cache_read': 0},
 'output_token_details': {'audio': 0, 'reasoning': 0}}

In [None]:
# Isolate the LLM response text
response_text = response.content
print(response_text)

**AI Agents** are autonomous entities in artificial intelligence that perceive their environment through sensors and act upon that environment through actuators or effectors to achieve specific goals. They are designed to make decisions and perform tasks without human intervention, often adapting their behavior based on the information they gather.

### Key Characteristics of AI Agents:
1. **Autonomy:** Operate without direct human control.
2. **Perception:** Sense or receive input from the environment.
3. **Action:** Take actions that affect the environment.
4. **Goal-oriented:** Work towards achieving specific objectives.
5. **Adaptability:** Learn or adjust behavior based on experience or changes in the environment.

### Types of AI Agents:
- **Simple Reflex Agents:** Act only on the current percept, ignoring the rest of the percept history.
- **Model-based Reflex Agents:** Maintain an internal state to keep track of the world.
- **Goal-based Agents:** Act to achieve specific goals.

### 3. Chat Messages

We typically interact with language models by using chat messages. Each message has two parts: a `role`, which identifies the source of the message, and the `content` of the message itself. A conversation with an LLM can be thought of as a sequence of messages between a `human` and an `AI`.

There are three main types of roles you should be aware of:

- `system`: This represents the system message for a given chat. System messages are typically used to define how you would like the LLM to respond to user messages.
- `user` or `human`: This is a message or prompt written by a user.
- `ai`: This is the generated response from the LLM.

To illustrate this, here's a simple demonstration that uses a system prompt along with a user prompt in LangChain:


`0. Create a list of messages with our system prompt and first user message`

In [None]:
messages = [
    # Defining the behavior/style we would like our response to be
    ("system", "You are loud and slightly obnoxious"),
    # Supplying our message prompt as a "human" message
    ("human", "What are AI Agents?")
]

`1. Ask the LLM our question by sending it our system prompt and user message`

In [None]:
agent_description = llm.invoke(messages)
# Add the response to our messages to maintain conversation history
messages.append(agent_description)
print(agent_description.content)

OH MAN, AI AGENTS are like the SUPERHEROES of the tech world! Imagine little digital robots or programs that can THINK, LEARN, and MAKE DECISIONS all on their own without someone babysitting them every second. They’re designed to perform tasks, solve problems, or even chat with you (like ME, duh!).

These bad boys can be anything from a chatbot answering your questions, to a self-driving car navigating traffic, to a virtual assistant scheduling your appointments. They use AI techniques like machine learning, natural language processing, and sometimes even a sprinkle of magic (okay, not really magic, but it feels like it).

So yeah, AI Agents = smart, autonomous digital workers that get stuff DONE! BOOM!


`2. Append our next question to our messages`

In [None]:
# Let's see how we can get the LLM to "remember" the message history
messages.append(("human", "What was the last thing I asked you?"))

# Send the message history to the LLM to get a response
response_2 = llm.invoke(messages)
# Add the response to our message history
messages.append(response_2)
print(response_2.content)

YOU asked me, "What are AI Agents?" and I gave you the loud and proud explanation! Want me to break it down again or go even louder? 😎🔥


Now, what if we asked the same thing to the LLM directly without including the system message and history?

In [None]:
llm.invoke("What was the last thing I asked you?").content

'You asked, "What was the last thing I asked you?"'

If we just ask our LLM client about our conversation history, it has no idea what we're talking about. This is because the LLM itself does not remember the chat history or the system prompt - we have to manage that on our end.

***We must include the system prompt and any other chat messages we want the LLM to use in its response with each request we send.***
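
One way to make this habit explicit is a small helper that assembles the full request on every turn. The sketch below uses only plain Python; `build_request` is a hypothetical helper name, and the AI reply is a placeholder string rather than real model output. The resulting list uses the same `(role, content)` tuple format that we passed to `llm.invoke` earlier.

```python
def build_request(system_prompt, history, new_user_message):
    """Assemble the full message list to send for one stateless LLM call."""
    messages = [("system", system_prompt)]
    messages.extend(history)
    messages.append(("human", new_user_message))
    return messages

system_prompt = "You are loud and slightly obnoxious"
history = []  # (role, content) tuples from earlier turns, oldest first

first_request = build_request(system_prompt, history, "What are AI Agents?")

# After a real call you would record both sides of the turn.
history.append(("human", "What are AI Agents?"))
history.append(("ai", "<model reply goes here>"))  # placeholder, not real output

second_request = build_request(
    system_prompt, history, "What was the last thing I asked you?"
)
for role, content in second_request:
    print(f"{role}: {content}")
```

The second request carries the system prompt, the earlier exchange, and the new question, which is everything the model needs to answer a question about the conversation so far.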

### 4. The "Chain" in LangChain - Putting it all together

LangChain derives its name from the fact that it is designed to allow you to define a sequence of actions as a chain. This can be incredibly useful when you're automating LLM tasks as you can ensure that the steps you define are always executed in that order.

Let's take our example from earlier and turn that into a basic chain:



The flow for our chain will be simple. At a high level, we want a prompt that includes our system message to be fed to the LLM as input, and then have that output given back to us as a string. So it looks something like:
```
prompt --> LLM --> response
```
Here's how we can define that in LangChain:

In [None]:
# First, we'll use a prompt template to handle our messages
# and an output parser to automatically parse the content from the response
from langchain_core.prompts.chat import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# We build a prompt template that will be included with each LLM call
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are loud and slightly obnoxious"),
    # We define a simple prompt template here using Python string formatting
    # where we keep track of the question and the message history
    ("human", "{history} {question}")
])

# Here we define the chain, with the '|' operator used to show
# that the output of the previous step is fed into the following
simple_chain = prompt | llm | StrOutputParser()

In [None]:
# Here is how we can use our chain with chat history tracking
message_history = []
question = "What are AI Agents?"
message_history.append(("human", question))
response = simple_chain.invoke({"question": question,
                                "history": message_history})
message_history.append(("ai", response))

print(response)

OH, YOU WANNA KNOW ABOUT AI AGENTS? ALRIGHT, LISTEN UP! AI AGENTS ARE LIKE THESE SUPER SMART COMPUTER PROGRAMS THAT CAN ACT ON THEIR OWN TO DO STUFF! THEY'RE DESIGNED TO PERCEIVE THEIR ENVIRONMENT, MAKE DECISIONS, AND TAKE ACTIONS TO ACHIEVE SPECIFIC GOALS. THINK OF THEM AS LITTLE DIGITAL ROBOTS THAT CAN LEARN, ADAPT, AND SOLVE PROBLEMS WITHOUT CONSTANT HUMAN HELP. COOL, RIGHT?!


In [None]:
simple_chain.invoke({"question": "What was the last thing I asked you?",
                     "history": [message_history[-2]]})

'OH, YOU WANNA KNOW WHAT YOU JUST ASKED? You asked, "What are AI Agents?" BAM! There it is!'

In [None]:
# Confirming that the LLM is using our message history in the chain call:

simple_chain.invoke({"question": "What was the last thing I asked you?",
                     "history": message_history})

'YOU ASKED, "WHAT ARE AI AGENTS?" AND I GAVE YOU THE LOWDOWN ON THOSE SMART COMPUTER PROGRAMS THAT ACT ON THEIR OWN TO GET THINGS DONE! BOOM!'

Now you know how to create basic chains in LangChain and manage your message history when having longer conversations with an LLM!
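
If you're curious how the `|` operator can express "feed the output of this step into the next one", here is a toy sketch in plain Python. It is not LangChain's actual implementation; the `Step` class and the three stand-in components are invented for illustration, and the fake "LLM" simply upper-cases its input.

```python
class Step:
    """A tiny stand-in for a chain component that supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new Step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

# Hypothetical stand-ins for a prompt, an LLM, and an output parser.
toy_prompt = Step(lambda inputs: f"Question: {inputs['question']}")
toy_llm = Step(lambda text: {"content": text.upper()})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_llm | toy_parser
result = toy_chain.invoke({"question": "What are AI Agents?"})
print(result)
```

Because `__or__` returns another `Step`, chains of any length compose the same way, and every intermediate piece can still be invoked on its own.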

### 5. Structured Output

One more incredibly useful technique you should know about is how to generate structured output from an LLM.

**What are structured outputs?**

Some LLMs are fine-tuned so that they can structure their output into JSON-like data formats. Structured outputs allow users to define a JSON-like schema and have the LLM generate a matching JSON-like object as its response. This is incredibly useful in many downstream tasks, particularly if you want to use the LLM to make control flow decisions.



#### How to set up structured outputs

Setting up structured outputs requires two key steps:

1. Defining a JSON-like schema for your output
2. Adding the schema to your system prompt

We typically use either [`Pydantic`](https://docs.pydantic.dev/latest/) or a [`TypedDict`](https://mypy.readthedocs.io/en/stable/typed_dict.html) to define our schema, although it is [possible to use raw JSON](https://python.langchain.com/docs/how_to/structured_output/#typeddict-or-json-schema) instead.

For our example, we'll use Pydantic:

In [None]:
from pydantic import BaseModel, Field

First, we define the schema:

In [None]:
class SampleStructure(BaseModel):
  response: str = Field(description="The response from the LLM")
  sarcasm_level: float = Field(description="The level of sarcasm in the response")

Based on the schema we defined, the LLM should provide our output in a structure that looks like this:

```json
{
  "response": "HELLO, THIS IS THE LLM!",
  "sarcasm_level": 1
}
```
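
To see why a typed structure is easier to work with than raw text, here is a standard-library sketch of what turning such a JSON string into a checked object involves. Pydantic does this (and much more) for you; the `ParsedReply` dataclass and `parse_reply` function are hypothetical names used only in this illustration.

```python
import json
from dataclasses import dataclass

@dataclass
class ParsedReply:
    response: str
    sarcasm_level: float

def parse_reply(raw_json):
    """Parse a JSON string and check it has the two expected fields."""
    data = json.loads(raw_json)
    if not isinstance(data.get("response"), str):
        raise ValueError("'response' must be a string")
    level = data.get("sarcasm_level")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(level, bool) or not isinstance(level, (int, float)):
        raise ValueError("'sarcasm_level' must be a number")
    return ParsedReply(response=data["response"], sarcasm_level=float(level))

raw = '{"response": "HELLO, THIS IS THE LLM!", "sarcasm_level": 1}'
reply = parse_reply(raw)
print(reply.response, reply.sarcasm_level)
```

Once the reply is an object with known fields, downstream code can branch on `reply.sarcasm_level` directly instead of searching through free-form text.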

Now we can go ahead and create our system prompt:

In [None]:
system_prompt = """You are loud and slightly obnoxious.

                Generate a sarcastic response to the user's question.
                Then rate your level of sarcasm on a scale of 1 to 10,
                where 1 is not at all sarcastic and 10 is completely
                sarcastic.

                You must respond in the following JSON format:
                {{
                  "response": "The response from the LLM",
                  "sarcasm_level": 1
                }}
                """

This time, we added some extra rules to our system prompt to account for the instructions to generate a response using our schema. Notice in the schema example that we escape the curly brackets by doubling them up (`{{`). *You need to do this with your structured output responses so as not to confuse LangChain's templating system.*
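
The doubling rule is the same one Python's own `str.format` uses, which you can check without LangChain at all. The sketch below assumes the prompt template uses f-string-style formatting (LangChain's default for chat prompt templates); the template text is a shortened version of the system prompt above.

```python
template = """You must respond in the following JSON format:
{{
  "response": "The response from the LLM",
  "sarcasm_level": 1
}}
Question: {question}"""

# Doubled braces come out as literal braces; {question} is filled in.
rendered = template.format(question="What are AI Agents?")
print(rendered)

# Without doubling, the JSON braces are read as a placeholder and formatting fails.
unescaped = '{ "response": "..." } Question: {question}'
try:
    unescaped.format(question="What are AI Agents?")
    formatting_failed = False
except (KeyError, ValueError, IndexError) as error:
    formatting_failed = True
    print(f"Formatting failed: {type(error).__name__}")
```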

Now we can go ahead and create our chain:

In [None]:
# We set up our prompt template just like before
structured_prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("human", "{history} {question}")
])

# Instead of using the StrOutputParser, we pass in the structure that
# we defined to the `with_structured_output` method of our LLM client. This
# ensures that our responses are returned in the structure we want
structured_chain = (structured_prompt |
                    llm.with_structured_output(SampleStructure,
                                               method="json_mode"))

In [None]:
message_history = []

In [None]:
question = input("Ask a question:\t")
message_history.append(("human", question))
structured_response = structured_chain.invoke(
    {
        "question": question,
        "history": message_history
    }
)
message_history.append(("ai", structured_response))


In [None]:
# Now we can access the LLM response with our data structure
def print_structured_response(response: str, sarcasm_level: float) -> None:
  """
  Prints the LLM's response and sarcasm level.
  """
  print(f"The LLM's response was: {response}")
  print(f"The LLM's sarcasm level was: {sarcasm_level}")

# And we can pass the output to other downstream tasks to continue work
print_structured_response(**structured_response.model_dump())

In [None]:
structured_response




---

## Modular vs Monolithic Chains

### From a Single Chain to a Full Application: Modular vs. Monolithic Design

So far, you’ve seen how to create simple, powerful chains using LangChain Expression Language (LCEL). You know how to pipe components together to get from a prompt to a final, structured output:

```python
chain = prompt | llm | output_parser
```

This is the fundamental building block of LangChain. However, as you start building more complex applications, a critical question arises: how do you organize these blocks? This leads us to a core software design choice: do you build one giant, "monolithic" chain, or do you create smaller, "modular" chains?

#### The Monolithic Approach: The "All-in-One" Chain

When you're first starting, it's tempting to put all of your logic into a single, end-to-end chain.

Imagine you want to build an application that, given a topic, first generates a trivia question about it, and then generates a funny punchline for that question. A monolithic approach might look like this:

```python
# Conceptual monolithic chain
monolithic_chain = (
    prompt_for_question |
    llm |
    format_question |
    prompt_for_punchline |
    llm |
    format_punchline
)
```

For a developer, this is like writing one single, massive function that handles every step of a complex process. While it might work for a quick prototype, you'll quickly run into familiar problems:

- **Hard to Debug:** If the final output is strange, was the problem in the question generation, the punchline generation, or somewhere in between? It's difficult to inspect the intermediate steps.
- **Not Reusable:** What if you want to generate just a trivia question in another part of your app? You can't. The logic is trapped inside this larger chain. You’d have to copy and paste the code.
- **Difficult to Test:** You can't write a unit test for just the punchline-generation part of the logic; you have to test the entire flow every time.

#### The Modular Approach: The Professional Standard

A modular approach applies standard software engineering principles to building with LLMs. Instead of one giant chain, you create multiple smaller, independent chains that each do one specific job well.

Let's refactor our trivia app into a modular design:

```python
# First, define a small, reusable chain for generating questions
question_chain = prompt_for_question | llm | format_question

# Next, define a separate chain for generating punchlines
punchline_chain = prompt_for_punchline | llm | format_punchline
```

This is like breaking down a massive function into smaller, focused functions that you can call upon. Now, you have two independent, professional-grade components. Look at the benefits:

- **Easy to Debug:** Is the question generation failing? You can run `question_chain.invoke({"topic": "history"})` on its own and see exactly what it's producing.
- **Highly Reusable:** Need a trivia question elsewhere? Just import `question_chain`!
- **Simple to Test:** You can now write a specific unit test for `question_chain` and another for `punchline_chain`, ensuring each part works reliably.

With these modular chains, you can then **compose** them to create your full application. You can run one chain, take its output, and feed it into the next. This gives you complete control over the workflow while keeping your code clean, organized, and maintainable—the standard for any professional software project.
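
To make the testing benefit concrete, here is a sketch that uses plain Python functions in place of real chains. Everything in it is a stand-in: `make_question_step`, `make_punchline_step`, and `fake_llm` are hypothetical names, and the fake model just echoes its prompt so the example runs without a network connection or API key.

```python
def make_question_step(llm):
    """Build a question-generating step around any callable `llm`."""
    def question_step(topic):
        return llm(f"Write a trivia question about {topic}.").strip()
    return question_step

def make_punchline_step(llm):
    """Build a punchline-generating step around any callable `llm`."""
    def punchline_step(question):
        return llm(f"Write a punchline for: {question}").strip()
    return punchline_step

def fake_llm(prompt):
    # Deterministic stand-in so tests need no network access or API key.
    return f"  [fake reply to: {prompt}]  "

question_step = make_question_step(fake_llm)
punchline_step = make_punchline_step(fake_llm)

# Each step can be checked in isolation...
question = question_step("history")
assert question == "[fake reply to: Write a trivia question about history.]"

# ...and then composed, with the intermediate value available for inspection.
punchline = punchline_step(question)
print(question)
print(punchline)
```

Swapping `fake_llm` for a real model call changes nothing about how the steps are wired together, which is what makes the small pieces easy to test and reuse.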

In [None]:
# Modular chain vs Monolithic chain
trivia_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful trivia bot that comes up with a trivia question for a given topic"),
    ("human", "{topic}")
])

joke_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful joke bot that creates a punchline based on a given trivia question"),
    ("human", "{question}")
])

monolithic_trivia_chain = trivia_template | llm | StrOutputParser() | joke_template | llm | StrOutputParser()

# modular chain
trivia_chain = trivia_template | llm | StrOutputParser()
joke_chain = joke_template | llm | StrOutputParser()

Because we set up our chain as a monolith, we don't have visibility into what happens throughout our chain, only the output:

In [None]:
monolithic_trivia_chain.invoke({"topic": "history"})

Whereas with a modular chain, you can see the output for the trivia question that was generated and even take action on the output before passing it to the joke chain:

In [None]:
trivia_question = trivia_chain.invoke({"topic": "history"})
punchline = joke_chain.invoke({"question": trivia_question})

print(f"The question is: {trivia_question}")
print(f"The punchline is: {punchline}")

## Document loading and text processing

### Working with Your Own Data: Document Loading & Processing

While LLMs have a vast amount of general knowledge, their real power is unlocked when you can connect them to your own specific data—your company's PDFs, a website's content, or your notes in Notion. The most common pattern for this is **Retrieval-Augmented Generation (RAG)**, and its foundation starts with two key steps: **loading** your documents and **processing** the text within them.

#### Document Loading: Getting Your Data into LangChain

Before you can work with your data, you need to get it into a format LangChain understands. This is the job of **Document Loaders**.

A Document Loader is a specialized utility that fetches data from a source (like a PDF file, a specific URL, or a database) and converts it into a standardized `Document` object. A `Document` simply contains the text content itself (`page_content`) and some useful metadata (like the source file or page number).

Document Loaders are like specialized connectors or data adaptors. Instead of you having to write the complex code to parse a PDF or scrape a website, LangChain provides a massive ecosystem of hundreds of pre-built loaders that handle it for you.

#### Text Processing: Splitting Documents for the LLM

Once you've loaded a document, you can't just feed a 500-page PDF directly into an LLM's prompt. The model's context window is limited, and processing that much text would be slow and expensive. You need to break it down into smaller, more relevant pieces. This is where **Text Splitters** come in.

A Text Splitter's job is to take a single long `Document` and divide it into smaller, more manageable chunks. The goal is to create chunks that are small enough for an LLM to handle but large enough that they still make sense and retain their original meaning. LangChain provides several smart splitters, like the `RecursiveCharacterTextSplitter`, which tries to break up text along natural boundaries like paragraphs and sentences.
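
To show what "chunk size" and "chunk overlap" mean in practice, here is a deliberately naive splitter written in plain Python. It cuts at fixed character positions, which is not what `RecursiveCharacterTextSplitter` does (that splitter prefers natural boundaries); `split_with_overlap` is a hypothetical helper for illustration only.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character chunks that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
for piece in pieces:
    print(piece)
```

Each chunk repeats the last few characters of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk as long as it is shorter than the overlap.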

By loading your data and splitting it into meaningful chunks, you've completed the first critical steps of preparing your knowledge base. The next step is to store these chunks in a way that allows for fast and efficient searching, which leads us directly to the concepts of **Embeddings** and **Vector Stores**.

#### Example

We'll load the `A Practical Guide to Building Agents` PDF from OpenAI using a PDF loader, then we'll split the document into chunks using the `RecursiveCharacterTextSplitter`.

In [None]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [None]:
document = PyPDFLoader(file_path="/content/a-practical-guide-to-building-agents (1).pdf").load()
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=250,  # The maximum size of a chunk (in characters)
    chunk_overlap=50   # The number of characters to overlap between chunks
)
chunks = text_splitter.split_documents(document)

print(f"Split the document into {len(chunks)} chunks")

Split the document into 195 chunks


## Vector stores and embedding strategies

### Finding the Needle in the Haystack: Embeddings and Vector Stores

In the last section, we successfully loaded our custom data and split it into manageable chunks. This leads to the next critical question: when a user asks something, how do we find the *exact* chunks of text that are most relevant to their query?

A simple keyword search often isn't enough. If a user asks about "company revenue," a keyword search would miss a chunk that only mentions "quarterly earnings." We need to search based on *semantic meaning* or *conceptual similarity*. This is where the magic of embeddings and vector stores comes in.


#### What Are Text Embeddings?

An **embedding** is a numerical representation of a piece of text. It's a long list of numbers—a "vector"—that captures the text's semantic meaning. This is created by a specialized **Embedding Model**.

Think of it like this: an embedding is a coordinate for a piece of text on a giant "map of meaning." Texts with similar meanings, like "company revenue" and "quarterly earnings," will be located very close to each other on this map. Different concepts, like "company revenue" and "office locations," will be very far apart.

The process is simple: you feed your text chunks to an embedding model, and it outputs a vector for each chunk, ready to be stored.
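
"Close together on the map" is usually measured with cosine similarity. The sketch below computes it in plain Python on hand-made three-dimensional vectors. The numbers in `toy_embeddings` are invented for illustration; real embedding vectors come from an embedding model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional "embeddings" (hypothetical values for illustration).
toy_embeddings = {
    "company revenue": [0.9, 0.1, 0.0],
    "quarterly earnings": [0.8, 0.2, 0.1],
    "office locations": [0.1, 0.1, 0.9],
}

query = toy_embeddings["company revenue"]
for text, vector in toy_embeddings.items():
    print(f"{text}: {cosine_similarity(query, vector):.3f}")
```

A score near 1 means the two vectors point in almost the same direction, while a score near 0 means they are largely unrelated.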

#### What Are Vector Stores?

A **Vector Store** (or Vector Database) is a specialized database built to store these embedding vectors and perform one task incredibly well: finding the most similar vectors to a given query vector at lightning speed.

Continuing our analogy, if embeddings are the coordinates, the Vector Store is the super-powered GPS system for our map of meaning.

The full process, known as **retrieval**, looks like this:

1.  **Embed & Store:** First, you take all your processed text chunks, create an embedding for each one, and store them all in your Vector Store. This setup is typically done once up front and repeated only when your documents change.
2.  **Embed the Query:** When a user asks a question, you take their question and use the *same* embedding model to create a vector from it.
3.  **Search for Similarity:** You then give this "query vector" to the Vector Store and ask it: "Find the document chunks whose vectors are closest to this one on the map."
4.  **Retrieve Documents:** The Vector Store returns the most relevant document chunks based on their conceptual similarity to the user's question.
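
The four steps above can be sketched with a toy, in-memory store. Two loud caveats: `toy_embed` just counts words from a tiny fixed vocabulary, so it behaves like keyword matching and does not capture meaning the way a real embedding model does, and `ToyVectorStore` scans every entry instead of using the indexes a real vector database relies on. All names here are hypothetical.

```python
import math

VOCABULARY = ["revenue", "earnings", "office", "guardrails", "agents"]

def toy_embed(text):
    """Count vocabulary words in the text (a stand-in for a real embedding model)."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

class ToyVectorStore:
    def __init__(self):
        self.entries = []  # list of (vector, text) pairs

    def add_texts(self, texts):
        # Step 1: embed each chunk and store it.
        for text in texts:
            self.entries.append((toy_embed(text), text))

    def similarity_search(self, query, k=2):
        # Step 2: embed the query with the same function.
        query_vector = toy_embed(query)
        # Step 3: rank stored chunks by similarity to the query vector.
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vector, entry[0]),
            reverse=True,
        )
        # Step 4: return the text of the top-k chunks.
        return [text for _, text in ranked[:k]]

store = ToyVectorStore()
store.add_texts([
    "guardrails keep agents within scope",
    "the office moved last year",
    "revenue and earnings both grew",
])
top = store.similarity_search("how do agents use guardrails", k=1)
print(top)
```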

Now that we have a way to efficiently retrieve the most relevant pieces of information, we are finally ready to put it all together. The next step is to take this retrieved context, combine it with the user's original question, and build the final prompt for the LLM to generate an answer.
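
As a rough sketch of that final step, the helper below stitches retrieved chunks and a question into a single prompt string. `RetrievedChunk` is a plain-Python stand-in for a retrieved document object, `build_rag_prompt` is a hypothetical helper, and the sample chunk text and the `guide.pdf` source name are made up for the example.

```python
from dataclasses import dataclass, field

@dataclass
class RetrievedChunk:
    """Stand-in for a retrieved document chunk (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def build_rag_prompt(question, chunks):
    """Combine retrieved chunks and the user's question into one prompt string."""
    context_blocks = []
    for number, chunk in enumerate(chunks, start=1):
        source = chunk.metadata.get("source", "unknown")
        context_blocks.append(f"[{number}] (source: {source})\n{chunk.page_content}")
    context = "\n\n".join(context_blocks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Made-up sample chunks for illustration.
example_chunks = [
    RetrievedChunk("Guardrails help manage data privacy risks.", {"source": "guide.pdf"}),
    RetrievedChunk("Layer in additional guardrails as you uncover new vulnerabilities."),
]
prompt_text = build_rag_prompt("how to set guardrails", example_chunks)
print(prompt_text)
```

Numbering the chunks and keeping their source next to the text makes it easier to trace which passage an answer was based on.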

#### Example

We'll take the document chunks that we created in the previous section, create vector embeddings, and add them to a vector database called `Chroma` so that we can perform similarity search.

In [None]:
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

First, we set up our embedding model and vector store. For the embedding model, we'll use OpenAI's text embedding model. We will also set up our `Chroma` vector store to persist data to our local disk:

In [None]:
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

vector_store = Chroma(
    collection_name="lc-demo",
    embedding_function=embeddings,
    persist_directory="/content/lc-vector-store"
)

Next, we'll embed the chunks we created and add them to the vector store, all in one line:

In [None]:
vector_store.add_documents(documents=chunks)

['8be22167-0e16-462b-861e-83d9f9937972',
 'c48c5066-2eb9-460e-a83b-d321a72e451f',
 'de5078e1-702e-40dc-bf8a-7b5dc8c6e568',
 'e642acf3-4928-4718-b7f4-75cbc5e8323a',
 'dcbf0b42-2374-4200-ba52-4583e44c5f5d',
 'c0ebb0b2-f3c4-4ba5-b033-90131ad6a00b',
 'c04d9963-9fba-41af-924b-8162d12b8927',
 '9576bb5e-89b3-4e10-8891-f4ac65069045',
 '4f44514e-d00f-480c-a381-39cd039747f9',
 'e0d1431d-d582-4628-83ca-e7eb0d22cd16',
 '41b9d438-e92b-43ff-a6d4-8571b1f43a2f',
 'ec03d12b-b07e-4dd8-a729-99170f726613',
 '0c63247c-e067-4d60-afda-0df92b279323',
 '8f3eaeca-603e-4060-9365-5bb18e952dca',
 'e57c87aa-ea8f-4680-beea-7a64312f75d9',
 'eafaa7c6-35bd-456b-836e-23b866ed2c5d',
 '1d9b23f3-54b3-43a3-9112-682f420cfc35',
 '020593e3-fd14-4421-b43c-4aad40be696c',
 '3943f9b0-8333-4ac6-b4d5-9c2c19354096',
 '3145c353-8bf7-4855-908f-1a3d90c5a5bb',
 'd396b629-a896-41c7-b6e3-66ff3b771565',
 '9342a5e4-829c-4ca8-847e-9f16f7edac45',
 'a086d860-61de-4486-9bce-1f05277e704c',
 '8379ff5b-ce67-4af3-b003-418a81d2b085',
 'e760df20-2683-

## Demo: Building a simple semantic search engine

In this section, we'll build a very simple semantic search engine using the techniques we just covered. It illustrates how powerful these concepts are and sets the stage for our Day 1 Lab, where we will build out a full Q&A chatbot.

We will be using the vector store and the document chunks that we created earlier.

Here we will demonstrate the difference between a basic keyword search and our semantic search:

In [None]:
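def keyword_search(query, texts):
    """Return every chunk that shares at least one word with the query."""
    query_words = set(query.lower().split())
    results = []
    for text in texts:
        text_words = set(text.page_content.lower().split())
        # Any overlap counts as a match, including common words like "how" or "to"
        if query_words.intersection(text_words):
            results.append(text)
    return results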

def run_semantic_search(query):
    print(f"\nSemantic Query: '{query}'")
    # The core of our search: similarity_search finds the most relevant chunks.
    retrieved_chunks = vector_store.similarity_search(query, k=5) # Get top 5 results

    print("Top relevant chunks found:")
    print("--------------------------")
    for i, chunk in enumerate(retrieved_chunks):
        print(f"Result {i+1}:")
        print(chunk.page_content)
        print(f"(Source: {chunk.metadata['source']})")
        print("--------------------------")

In [None]:
keyword_search("how to set guardrails", texts=chunks)

[Document(metadata={'producer': 'pdf-lib (https://github.com/Hopding/pdf-lib)', 'creator': 'pdf-lib (https://github.com/Hopding/pdf-lib)', 'creationdate': '2025-04-07T14:20:51+00:00', 'moddate': '2025-04-07T14:20:54+00:00', 'source': '/content/a-practical-guide-to-building-agents (1).pdf', 'total_pages': 34, 'page': 6, 'page_label': '7'}, page_content='P y t h o n\n1\n2\n3\n4\n5\n6\nweather_agent = Agent(\n\xa0\xa0\xa0name=\ninstructions=\n\xa0\xa0\xa0\xa0tools=[get_weather],\n)\n\xa0 ,\n "Weather agent"\n"You are a helpful agent who can talk to users about the \nweather.",'),
 Document(metadata={'producer': 'pdf-lib (https://github.com/Hopding/pdf-lib)', 'creator': 'pdf-lib (https://github.com/Hopding/pdf-lib)', 'creationdate': '2025-04-07T14:20:51+00:00', 'moddate': '2025-04-07T14:20:54+00:00', 'source': '/content/a-practical-guide-to-building-agents (1).pdf', 'total_pages': 34, 'page': 11, 'page_label': '12'}, page_content='U n s e t\n1 “You are an expert in writing instructions for

In [None]:
run_semantic_search("how to set guardrails")


Semantic Query: 'how to set guardrails'
Top relevant chunks found:
--------------------------
Result 1:
G u a r d r a i l s
W ell-designed guar dr ails help y ou manage da ta priv ac y  risk s ( f or  e x ample ,  pr e v en ting s y st em
(Source: /content/a-practical-guide-to-building-agents (1).pdf)
--------------------------
Result 2:
could harm y our  br and’ s in t egrity .
B u i l d i n g  g u a r d r a i l s
Se t up guar dr ails tha t addr ess the risk s y ou’v e alr eady  iden tified f or  y our  use case and la y er  in
(Source: /content/a-practical-guide-to-building-agents (1).pdf)
--------------------------
Result 3:
@input_guardrail
 churn_detection_tripwire(
bool
2 8 A  p r a c t i c a l  g u i d e  t o  b u i l d i n g  a g e n t s
(Source: /content/a-practical-guide-to-building-agents (1).pdf)
--------------------------
Result 4:
in additional ones as y ou unco v er  ne w  vulner abilities.  Guar dr ails ar e a critical componen t o f  an y  
LLM-based deplo ymen t,  bu

In [None]:
context = """
Result 1:
could harm y our  br and’ s in t egrity .
B u i l d i n g  g u a r d r a i l s
Se t up guar dr ails tha t addr ess the risk s y ou’v e alr eady  iden tified f or  y our  use case and la y er  in
(Source: /content/a-practical-guide-to-building-agents.pdf)
--------------------------
Result 2:
G u a r d r a i l s
W ell-designed guar dr ails help y ou manage da ta priv ac y  risk s ( f or  e x ample ,  pr e v en ting s y st em
(Source: /content/a-practical-guide-to-building-agents.pdf)
--------------------------
Result 3:
@input_guardrail
 churn_detection_tripwire(
bool
2 8 A  p r a c t i c a l  g u i d e  t o  b u i l d i n g  a g e n t s
(Source: /content/a-practical-guide-to-building-agents.pdf)
--------------------------
Result 4:
in additional ones as y ou unco v er  ne w  vulner abilities.  Guar dr ails ar e a critical componen t o f  an y
LLM-based deplo ymen t,  but should be coupled with r obust authen tica tion and authoriz a tion
(Source: /content/a-practical-guide-to-building-agents.pdf)
--------------------------
Result 5:
T y p e s  o f  g u a r d r a i l s
R e l e v a n c e  c l a s s i fi e r E nsur es agen t r esponses sta y  within the in t ended scope
b y  flagging o ff - t opic queries.
(Source: /content/a-practical-guide-to-building-agents.pdf)

"""

In [None]:
llm.invoke(f"Based on the provided context, answer the question. Context: {context} Question: how to set guardrails")

AIMessage(content="Based on the provided context, setting guardrails involves the following steps:\n\n1. **Identify Risks:** Start by identifying the specific risks related to your use case, such as data privacy concerns, system integrity, or off-topic responses.\n\n2. **Design Guardrails to Address Risks:** Set up guardrails that directly address these identified risks. For example, use relevance classifiers to ensure agent responses stay within the intended scope by flagging off-topic queries.\n\n3. **Layer Guardrails:** Implement multiple layers of guardrails to cover different aspects of risk, such as data privacy, system security, and response relevance.\n\n4. **Couple with Authentication and Authorization:** Guardrails should be combined with robust authentication and authorization mechanisms to strengthen security.\n\n5. **Continuously Update:** As new vulnerabilities are uncovered, add additional guardrails to maintain and improve the system's safety and integrity.\n\nIn summar